Improving Language Understanding by Generative Pre-Training
Abstract
This paper introduced the first Generative Pre-trained Transformer (GPT-1), demonstrating that generative (unsupervised) pre-training of a Transformer language model on a large unlabeled corpus, followed by supervised fine-tuning on each target task, achieves strong performance on a range of natural language understanding tasks such as textual entailment, question answering, and semantic similarity, improving the state of the art on 9 of the 12 tasks studied. This work established the foundation for the GPT series.
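At a high level, the two stages correspond to the objectives defined in the paper: an autoregressive language-modeling loss over unlabeled token sequences, then a task-specific supervised loss during fine-tuning, optionally combined with the language-modeling loss as an auxiliary term (notation follows the paper; $k$ is the context window, $\Theta$ the model parameters, and $\lambda$ the auxiliary weight):

$$L_1(\mathcal{U}) = \sum_i \log P(u_i \mid u_{i-k}, \ldots, u_{i-1}; \Theta)$$

$$L_2(\mathcal{C}) = \sum_{(x, y)} \log P(y \mid x^1, \ldots, x^m)$$

$$L_3(\mathcal{C}) = L_2(\mathcal{C}) + \lambda \cdot L_1(\mathcal{C})$$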