aPaperADay
47 CANINE, Pre-training an Efficient Tokenization-Free Encoder for Language Representation
Another proposal for a tokenizer-free model, this one operating directly on Unicode code points (sketch below).
2021-08-13
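As a rough sketch of what "operates on Unicode" means in practice, the snippet below maps text straight to code points and hashes them into a small number of embedding buckets; the hash family and bucket count are illustrative assumptions, not CANINE's exact configuration.

```python
# Sketch of CANINE-style tokenizer-free input: the model consumes raw Unicode
# code points, so no trained vocabulary is needed. The multi-hash bucketing
# below is illustrative, not CANINE's exact scheme.
def text_to_codepoints(text: str) -> list[int]:
    """Map a string to its Unicode code points (the model's raw input)."""
    return [ord(ch) for ch in text]

def hashed_buckets(codepoint: int, num_hashes: int = 8, num_buckets: int = 16384) -> list[int]:
    """Reduce the huge code-point space to a few embedding buckets via hashing."""
    # Illustrative hash family; CANINE uses its own hashing parameters.
    return [(codepoint * (2 * h + 1)) % num_buckets for h in range(num_hashes)]

print(text_to_codepoints("héllo"))  # [104, 233, 108, 108, 111]
```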
46 ByT5, Towards a token-free future with pre-trained byte-to-byte models
ByT5 is a byte-level, tokenizer-free model that matches the performance of token-based models while being more robust to noise and misspellings (sketch below).
2021-08-10
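A minimal sketch of byte-level input, assuming a small ID offset reserved for special tokens (illustrative, not necessarily ByT5's exact ID mapping):

```python
# Byte-level, tokenizer-free encoding: every string becomes a sequence of
# UTF-8 bytes, so a misspelling only perturbs a few positions.
SPECIAL_TOKENS = 3  # e.g. pad / eos / unk reserved at the bottom of the ID space (assumed)

def encode(text: str) -> list[int]:
    return [b + SPECIAL_TOKENS for b in text.encode("utf-8")]

def decode(ids: list[int]) -> str:
    return bytes(i - SPECIAL_TOKENS for i in ids).decode("utf-8")

assert decode(encode("tokenizer-free")) == "tokenizer-free"
```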
45 Space-Time Correspondence as a Contrastive Random Walk
This paper proposes a technique for learning from raw video by applying a contrastive loss to an affinity graph constructed from video frames (sketch below).
2021-08-09
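A minimal sketch of the contrastive random walk, assuming per-frame patch features of shape (N, D); the temperature and shapes are stand-ins, not the paper's full pipeline:

```python
# Nodes are patch features per frame, edges are softmax-normalized
# similarities, and the loss asks a walk that goes forward through time and
# back again to return to its starting node.
import torch
import torch.nn.functional as F

def affinity(a: torch.Tensor, b: torch.Tensor, temperature: float = 0.07) -> torch.Tensor:
    """Transition matrix between patches of two frames: (N, D) x (M, D) -> (N, M)."""
    a = F.normalize(a, dim=-1)
    b = F.normalize(b, dim=-1)
    return F.softmax(a @ b.t() / temperature, dim=-1)

def palindrome_walk_loss(frames: list[torch.Tensor]) -> torch.Tensor:
    """Cross-entropy that the forward-then-backward walk returns to the start."""
    path = frames + frames[-2::-1]               # t0 ... tK ... t0 (palindrome)
    walk = affinity(path[0], path[1])
    for a, b in zip(path[1:], path[2:]):
        walk = walk @ affinity(a, b)             # compose transition matrices
    targets = torch.arange(walk.size(0))         # each node should return to itself
    return F.nll_loss(torch.log(walk + 1e-8), targets)

# Usage: three frames, 49 patches each, 128-dim features (random stand-ins).
loss = palindrome_walk_loss([torch.randn(49, 128) for _ in range(3)])
```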
44 🤗 Transformers
The Hugging Face Transformers library has made NLP easier and more open (example below).
2021-08-06
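A small example of why the library lowered the barrier to entry, using the real `pipeline` API (the printed output is illustrative):

```python
# A pretrained model, its tokenizer, and inference in three lines.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")  # downloads a default pretrained model
print(classifier("A paper a day keeps the doctor away."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```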
43 The Evolved Transformer
A transformer architecture discovered via Neural Architecture Search with evolutionary training (sketch below).
2021-08-05
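A generic sketch of tournament-selection evolution of the kind used in such searches; the bit-string encoding and fitness here are toy stand-ins, not the paper's search space:

```python
# Tournament selection: sample a tournament, let the fittest reproduce via
# mutation, and replace a weak individual with the child.
import random

def evolve(population, fitness, mutate, steps=1000, tournament=16):
    for _ in range(steps):
        parent = max(random.sample(population, tournament), key=fitness)
        child = mutate(parent)
        worst = min(random.sample(population, tournament), key=fitness)
        population[population.index(worst)] = child
    return max(population, key=fitness)

# Toy usage: maximize the number of 1s in a bit-string "architecture".
pop = [[random.randint(0, 1) for _ in range(12)] for _ in range(32)]
best = evolve(pop, fitness=sum,
              mutate=lambda p: [b ^ (random.random() < 0.1) for b in p])
```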
42 BlenderBot, Recipes for building an open-domain chatbot
This paper examines many of the complexities of building an open-domain chatbot and various recipes for improving performance.
2021-08-04
41 Big Bird, Transformers for Longer Sequences
BigBird proposes a sparse attention mechanism, reframing attention as a graph problem in order to leverage well-known graph techniques (sketch below).
2021-08-03
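A minimal sketch of a BigBird-style sparse attention mask combining a sliding window, a few global tokens, and a few random edges; window size and counts are illustrative assumptions:

```python
# Each query attends to a local window, a handful of global tokens, and a few
# random tokens, keeping the attention graph sparse but well connected.
import numpy as np

def bigbird_mask(seq_len: int, window: int = 3, n_global: int = 2,
                 n_random: int = 2, seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    mask = np.zeros((seq_len, seq_len), dtype=bool)
    for i in range(seq_len):
        lo, hi = max(0, i - window), min(seq_len, i + window + 1)
        mask[i, lo:hi] = True                          # sliding window
        mask[i, rng.choice(seq_len, n_random)] = True  # random edges
    mask[:, :n_global] = True                          # everyone sees global tokens
    mask[:n_global, :] = True                          # global tokens see everyone
    return mask

print(bigbird_mask(8).astype(int))
```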
40 ELECTRA
A new pre-training task, replaced token detection, is proposed, providing a training signal at every input token rather than only the masked positions (sketch below).
2021-07-31
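A minimal sketch of the replaced-token-detection discriminator loss, assuming corrupted inputs have already been produced by a small generator (tensor shapes are illustrative):

```python
# The discriminator labels *every* token as original vs. replaced, so no
# position is wasted, unlike masked language modeling's ~15%.
import torch
import torch.nn.functional as F

def discriminator_loss(disc_logits: torch.Tensor,
                       input_ids: torch.Tensor,
                       corrupted_ids: torch.Tensor) -> torch.Tensor:
    """disc_logits: (B, T) per-token scores; label is 1 where the generator
    changed the token and 0 where the original survived."""
    labels = (corrupted_ids != input_ids).float()
    return F.binary_cross_entropy_with_logits(disc_logits, labels)

# Toy usage: batch of 2, length 5; one position per sequence was replaced.
orig = torch.tensor([[5, 8, 2, 9, 4], [7, 7, 3, 1, 6]])
corr = torch.tensor([[5, 8, 2, 0, 4], [7, 7, 3, 1, 2]])
loss = discriminator_loss(torch.randn(2, 5), orig, corr)
```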
39 Gaussian Error Linear Units (GELUs)
This paper introduces a new activation function, the GELU (sketch below).
2021-07-27
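The GELU weights its input by the Gaussian CDF, GELU(x) = x·Φ(x); below are the exact form and the tanh approximation given in the paper:

```python
import math

def gelu_exact(x: float) -> float:
    # GELU(x) = x * Φ(x), with Φ the standard normal CDF
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x: float) -> float:
    # Tanh approximation from the paper
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))

print(gelu_exact(1.0), gelu_tanh(1.0))  # ~0.8413 for both
```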
38 Are Sixteen Heads Really Better than One?
This paper investigates the tradeoffs of multi-headed attention, finding that many heads can be pruned with little loss in performance (sketch below).
2021-07-23
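A minimal sketch of the head-ablation idea behind such experiments: gate each head's output and observe how the representation (or a task metric) changes. Shapes and the gating hook are illustrative assumptions, not the paper's exact procedure:

```python
import torch

def mask_heads(attn_output: torch.Tensor, head_gates: torch.Tensor) -> torch.Tensor:
    """attn_output: (B, H, T, D_head); head_gates: (H,) of 0/1 floats."""
    return attn_output * head_gates.view(1, -1, 1, 1)

# Ablate head 3 of 8 and measure how much the representation moves.
out = torch.randn(2, 8, 10, 64)
gates = torch.ones(8)
gates[3] = 0.0
delta = (mask_heads(out, gates) - out).abs().mean()
```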