aPaperADay
2021 / 07
40 ELECTRA
07-31-2021
39 GAUSSIAN ERROR LINEAR UNITS (GELUS)
07-27-2021
38 Are Sixteen Heads Really Better than One?
07-23-2021
37 Attention in Natural Language Processing
07-22-2021
36 SimCSE: Simple Contrastive Learning of Sentence Embeddings
07-20-2021
35 AN IMAGE IS WORTH 16X16 WORDS
07-15-2021
34 Combiner: Full Attention Transformer with Sparse Computation Cost
07-14-2021
33 MoCo v3: An Empirical Study of Training Self-Supervised Vision Transformers
07-13-2021
32 DINO: Emerging Properties in Self-Supervised Vision Transformers
07-12-2021
31 BertGeneration
07-10-2021