Abstract

Transfer learning, where a model is first pre-trained on a data-rich task before being fine-tuned on a downstream task, has emerged as a powerful technique in natural language processing (NLP). The effectiveness of transfer learning has given rise to a diversity of approaches, methodology, and practice. In this paper, we explore the landscape of transfer learning techniques for NLP by introducing a unified framework that converts all text-based language problems into a text-to-text format. Our systematic study compares pre-training objectives, architectures, unlabeled data sets, transfer approaches, and other factors on dozens of language understanding tasks. By combining the insights from our exploration with scale and our new "Colossal Clean Crawled Corpus", we achieve state-of-the-art results on many benchmarks covering summarization, question answering, text classification, and more. To facilitate future work on transfer learning for NLP, we release our data set, pre-trained models, and code.
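
The text-to-text framing described in the abstract can be illustrated with a short sketch. The example below assumes the released T5 checkpoints as repackaged in the Hugging Face transformers library (not part of the paper's own codebase); it shows how unrelated tasks reduce to the same text-in, text-out call, distinguished only by a task prefix such as "translate English to German:" or "summarize:".

```python
# Minimal sketch of the text-to-text formulation: every task is posed as a
# text prompt and the answer is read back as text.
# Assumes the public T5 checkpoints via Hugging Face
# (pip install transformers sentencepiece torch).
from transformers import T5TokenizerFast, T5ForConditionalGeneration

tokenizer = T5TokenizerFast.from_pretrained("t5-small")
model = T5ForConditionalGeneration.from_pretrained("t5-small")

# Two different tasks, same interface: only the task prefix changes.
prompts = [
    "translate English to German: The house is wonderful.",
    "summarize: Transfer learning, where a model is first pre-trained on a "
    "data-rich task before being fine-tuned on a downstream task, has emerged "
    "as a powerful technique in natural language processing.",
]

for prompt in prompts:
    inputs = tokenizer(prompt, return_tensors="pt")
    output_ids = model.generate(**inputs, max_new_tokens=40)
    print(tokenizer.decode(output_ids[0], skip_special_tokens=True))
```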

Keywords

Automatic summarization, Computer science, Transfer of learning, Natural language processing, Artificial intelligence, Transformer, Question answering, Task (project management), Language model, Natural language understanding, Set (abstract data type), Training set, Natural language, Engineering

Publication Info

Year: 2019
Type: preprint
Citations: 8299 (OpenAlex)
Access: Closed

Cite This

Colin Raffel, Noam Shazeer, Adam Roberts et al. (2019). Exploring the Limits of Transfer Learning with a Unified Text-to-Text Transformer. arXiv (Cornell University). https://doi.org/10.48550/arxiv.1910.10683

Identifiers

DOI: 10.48550/arxiv.1910.10683
