aPaperADay
47 CANINE, Pre-training an Efficient Tokenization-Free Encoder for Language Representation
Another proposal for a tokenizer-free model, this one operating directly on Unicode code points (sketch below).
2021-08-13
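As a rough sketch of what "operates on Unicode" means in practice, the snippet below maps text straight to code points and hashes them into a small number of embedding buckets; the hash family and bucket count are illustrative assumptions, not CANINE's exact configuration.

```python
# Sketch of CANINE-style tokenizer-free input: the model consumes raw Unicode
# code points, so no trained vocabulary is needed. The multi-hash bucketing
# below is illustrative, not CANINE's exact scheme.
def text_to_codepoints(text: str) -> list[int]:
    """Map a string to its Unicode code points (the model's raw input)."""
    return [ord(ch) for ch in text]

def hashed_buckets(codepoint: int, num_hashes: int = 8, num_buckets: int = 16384) -> list[int]:
    """Reduce the huge code-point space to a few embedding buckets via hashing."""
    # Illustrative hash family; CANINE uses its own hashing parameters.
    return [(codepoint * (2 * h + 1)) % num_buckets for h in range(num_hashes)]

print(text_to_codepoints("héllo"))  # [104, 233, 108, 108, 111]
```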
46 ByT5, Towards a token-free future with pre-trained byte-to-byte models
ByT5 is a byte-level, tokenizer-free model that matches the performance of token-based models while being more robust to noise and misspellings (sketch below).
2021-08-10
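A minimal sketch of byte-level input, assuming a small ID offset reserved for special tokens (illustrative, not necessarily ByT5's exact ID mapping):

```python
# Byte-level, tokenizer-free encoding: every string becomes a sequence of
# UTF-8 bytes, so a misspelling only perturbs a few positions.
SPECIAL_TOKENS = 3  # e.g. pad / eos / unk reserved at the bottom of the ID space (assumed)

def encode(text: str) -> list[int]:
    return [b + SPECIAL_TOKENS for b in text.encode("utf-8")]

def decode(ids: list[int]) -> str:
    return bytes(i - SPECIAL_TOKENS for i in ids).decode("utf-8")

assert decode(encode("tokenizer-free")) == "tokenizer-free"
```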
45 Space-Time Correspondence as a Contrastive Random Walk
This paper proposes a technique for learning from raw video by applying a contrastive loss to an affinity graph constructed from video frames (sketch below).
2021-08-09
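A minimal sketch of the contrastive random walk, assuming per-frame patch features of shape (N, D); the temperature and shapes are stand-ins, not the paper's full pipeline:

```python
# Nodes are patch features per frame, edges are softmax-normalized
# similarities, and the loss asks a walk that goes forward through time and
# back again to return to its starting node.
import torch
import torch.nn.functional as F

def affinity(a: torch.Tensor, b: torch.Tensor, temperature: float = 0.07) -> torch.Tensor:
    """Transition matrix between patches of two frames: (N, D) x (M, D) -> (N, M)."""
    a = F.normalize(a, dim=-1)
    b = F.normalize(b, dim=-1)
    return F.softmax(a @ b.t() / temperature, dim=-1)

def palindrome_walk_loss(frames: list[torch.Tensor]) -> torch.Tensor:
    """Cross-entropy that the forward-then-backward walk returns to the start."""
    path = frames + frames[-2::-1]               # t0 ... tK ... t0 (palindrome)
    walk = affinity(path[0], path[1])
    for a, b in zip(path[1:], path[2:]):
        walk = walk @ affinity(a, b)             # compose transition matrices
    targets = torch.arange(walk.size(0))         # each node should return to itself
    return F.nll_loss(torch.log(walk + 1e-8), targets)

# Usage: three frames, 49 patches each, 128-dim features (random stand-ins).
loss = palindrome_walk_loss([torch.randn(49, 128) for _ in range(3)])
```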
44 🤗 Transformers
The Hugging Face Transformers library has made NLP easier and more open (example below).
2021-08-06
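A small example of why the library lowered the barrier to entry, using the real `pipeline` API (the printed output is illustrative):

```python
# A pretrained model, its tokenizer, and inference in three lines.
from transformers import pipeline

classifier = pipeline("sentiment-analysis")  # downloads a default pretrained model
print(classifier("A paper a day keeps the doctor away."))
# e.g. [{'label': 'POSITIVE', 'score': 0.99...}]
```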
43 The Evolved Transformer
A transformer architecture discovered via Neural Architecture Search with evolutionary training (sketch below).
2021-08-05
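A generic sketch of tournament-selection evolution of the kind used in such searches; the bit-string encoding and fitness here are toy stand-ins, not the paper's search space:

```python
# Tournament selection: sample a tournament, let the fittest reproduce via
# mutation, and replace a weak individual with the child.
import random

def evolve(population, fitness, mutate, steps=1000, tournament=16):
    for _ in range(steps):
        parent = max(random.sample(population, tournament), key=fitness)
        child = mutate(parent)
        worst = min(random.sample(population, tournament), key=fitness)
        population[population.index(worst)] = child
    return max(population, key=fitness)

# Toy usage: maximize the number of 1s in a bit-string "architecture".
pop = [[random.randint(0, 1) for _ in range(12)] for _ in range(32)]
best = evolve(pop, fitness=sum,
              mutate=lambda p: [b ^ (random.random() < 0.1) for b in p])
```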
42 BlenderBot, Recipes for building an open-domain chatbot
This paper examines many of the complexities of building an open-domain chatbot and various recipes for improving performance.
2021-08-04
41 Big Bird, Transformers for Longer Sequences
BigBird proposes a sparse attention mechanism, reframing attention as a graph problem in order to leverage well-known graph techniques (sketch below).
2021-08-03
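A minimal sketch of a BigBird-style sparse attention mask combining a sliding window, a few global tokens, and a few random edges; window size and counts are illustrative assumptions:

```python
# Each query attends to a local window, a handful of global tokens, and a few
# random tokens, keeping the attention graph sparse but well connected.
import numpy as np

def bigbird_mask(seq_len: int, window: int = 3, n_global: int = 2,
                 n_random: int = 2, seed: int = 0) -> np.ndarray:
    rng = np.random.default_rng(seed)
    mask = np.zeros((seq_len, seq_len), dtype=bool)
    for i in range(seq_len):
        lo, hi = max(0, i - window), min(seq_len, i + window + 1)
        mask[i, lo:hi] = True                          # sliding window
        mask[i, rng.choice(seq_len, n_random)] = True  # random edges
    mask[:, :n_global] = True                          # everyone sees global tokens
    mask[:n_global, :] = True                          # global tokens see everyone
    return mask

print(bigbird_mask(8).astype(int))
```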
40 ELECTRA
A new pre-training task, replaced token detection, is proposed, providing a training signal at every input token rather than only the masked positions (sketch below).
2021-07-31
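A minimal sketch of the replaced-token-detection discriminator loss, assuming corrupted inputs have already been produced by a small generator (tensor shapes are illustrative):

```python
# The discriminator labels *every* token as original vs. replaced, so no
# position is wasted, unlike masked language modeling's ~15%.
import torch
import torch.nn.functional as F

def discriminator_loss(disc_logits: torch.Tensor,
                       input_ids: torch.Tensor,
                       corrupted_ids: torch.Tensor) -> torch.Tensor:
    """disc_logits: (B, T) per-token scores; label is 1 where the generator
    changed the token and 0 where the original survived."""
    labels = (corrupted_ids != input_ids).float()
    return F.binary_cross_entropy_with_logits(disc_logits, labels)

# Toy usage: batch of 2, length 5; one position per sequence was replaced.
orig = torch.tensor([[5, 8, 2, 9, 4], [7, 7, 3, 1, 6]])
corr = torch.tensor([[5, 8, 2, 0, 4], [7, 7, 3, 1, 2]])
loss = discriminator_loss(torch.randn(2, 5), orig, corr)
```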
39 Gaussian Error Linear Units (GELUs)
This paper introduces a new activation function, the GELU (sketch below).
2021-07-27
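The GELU weights its input by the Gaussian CDF, GELU(x) = x·Φ(x); below are the exact form and the tanh approximation given in the paper:

```python
import math

def gelu_exact(x: float) -> float:
    # GELU(x) = x * Φ(x), with Φ the standard normal CDF
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def gelu_tanh(x: float) -> float:
    # Tanh approximation from the paper
    return 0.5 * x * (1.0 + math.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)))

print(gelu_exact(1.0), gelu_tanh(1.0))  # ~0.8413 for both
```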
38 Are Sixteen Heads Really Better than One?
This paper investigates the tradeoffs of multi-headed attention, finding that many heads can be pruned with little loss in performance (sketch below).
2021-07-23
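A minimal sketch of the head-ablation idea behind such experiments: gate each head's output and observe how the representation (or a task metric) changes. Shapes and the gating hook are illustrative assumptions, not the paper's exact procedure:

```python
import torch

def mask_heads(attn_output: torch.Tensor, head_gates: torch.Tensor) -> torch.Tensor:
    """attn_output: (B, H, T, D_head); head_gates: (H,) of 0/1 floats."""
    return attn_output * head_gates.view(1, -1, 1, 1)

# Ablate head 3 of 8 and measure how much the representation moves.
out = torch.randn(2, 8, 10, 64)
gates = torch.ones(8)
gates[3] = 0.0
delta = (mask_heads(out, gates) - out).abs().mean()
```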