Improving Language Understanding by Generative Pre-Training
Abstract
This paper introduced the first Generative Pre-trained Transformer (GPT-1), demonstrating that generative (unsupervised) pre-training of a Transformer language model on a large unlabeled corpus, followed by supervised fine-tuning on each target task, achieves strong performance on a range of natural language understanding tasks such as textual entailment, question answering, and semantic similarity, improving the state of the art on 9 of the 12 tasks studied. This work established the foundation for the GPT series.
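At a high level, the two stages correspond to the objectives defined in the paper: an autoregressive language-modeling loss over unlabeled token sequences, then a task-specific supervised loss during fine-tuning, optionally combined with the language-modeling loss as an auxiliary term (notation follows the paper; $k$ is the context window, $\Theta$ the model parameters, and $\lambda$ the auxiliary weight):

$$L_1(\mathcal{U}) = \sum_i \log P(u_i \mid u_{i-k}, \ldots, u_{i-1}; \Theta)$$

$$L_2(\mathcal{C}) = \sum_{(x, y)} \log P(y \mid x^1, \ldots, x^m)$$

$$L_3(\mathcal{C}) = L_2(\mathcal{C}) + \lambda \cdot L_1(\mathcal{C})$$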