Bidirectional Encoder Representations from Transformers (BERT)

https://d2l.ai/chapter_natural-language-processing-pretraining/bert.html