The tokenizer's special tokens [CLS], [SEP], [PAD], and [MASK] each have a specific role

[CLS] marks the start of the input, [SEP] separates segments and closes the sequence, [PAD] fills shorter sequences out to a common length, and [MASK] hides a token so the model can be trained to predict it
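
The four roles can be illustrated with a small, library-free sketch. It assumes a BERT-style layout and input that is already split into tokens; `build_input` and `mask_token` are hypothetical helper names invented for this example, not part of any real tokenizer API.

```python
# Toy sketch of BERT-style special tokens (assumed layout, not a real tokenizer).
CLS, SEP, PAD, MASK = "[CLS]", "[SEP]", "[PAD]", "[MASK]"


def build_input(first, second=None, max_len=12):
    """Wrap pre-split tokens in special tokens and pad to max_len.

    Returns the padded token list and an attention mask
    (1 = real token, 0 = padding).
    """
    tokens = [CLS] + first + [SEP]      # [CLS] opens, [SEP] closes segment 1
    if second is not None:
        tokens += second + [SEP]        # a second [SEP] closes segment 2
    if len(tokens) > max_len:
        raise ValueError("sequence longer than max_len")
    n_pad = max_len - len(tokens)
    attention_mask = [1] * len(tokens) + [0] * n_pad
    return tokens + [PAD] * n_pad, attention_mask


def mask_token(tokens, position):
    """Return a copy with one ordinary token replaced by [MASK], plus the hidden token."""
    if tokens[position] in (CLS, SEP, PAD):
        raise ValueError("special tokens are not masked")
    masked = list(tokens)
    target = masked[position]
    masked[position] = MASK
    return masked, target


tokens, attention_mask = build_input(["the", "cat", "sat"], ["it", "purred"], max_len=10)
masked, target = mask_token(tokens, 2)

print(tokens)
print(attention_mask)
print(masked, "-> hidden token:", target)
```

The attention mask is what lets a model ignore the [PAD] positions, and the hidden token returned by `mask_token` is the label a masked-language-model objective would try to recover.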
