Parameter-Efficient Transfer Learning for NLP




Authors Neil Houlsby, Andrei Giurgiu, Stanislaw Jastrzebski, Bruna Morrone, Quentin de Laroussilhe, Andrea Gesmundo, Mona Attariyan, Sylvain Gelly
Journal/Conference Name 36th International Conference on Machine Learning, ICML 2019
Paper Abstract Fine-tuning large pre-trained models is an effective transfer mechanism in NLP. However, in the presence of many downstream tasks, fine-tuning is parameter inefficient: an entire new model is required for every task. As an alternative, we propose transfer with adapter modules. Adapter modules yield a compact and extensible model; they add only a few trainable parameters per task, and new tasks can be added without revisiting previous ones. The parameters of the original network remain fixed, yielding a high degree of parameter sharing. To demonstrate the adapter's effectiveness, we transfer the recently proposed BERT Transformer model to 26 diverse text classification tasks, including the GLUE benchmark. Adapters attain near state-of-the-art performance, whilst adding only a few parameters per task. On GLUE, we attain within 0.4% of the performance of full fine-tuning, adding only 3.6% parameters per task. By contrast, fine-tuning trains 100% of the parameters per task.
Date of publication 2019
Code Programming Language Multiple

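The adapter approach described in the abstract can be illustrated with a small bottleneck module. The following is only a minimal sketch, assuming a PyTorch-style implementation with hypothetical names (Adapter, bottleneck_size); the authors' released code may be organized differently.

# Minimal sketch of a bottleneck adapter layer (illustrative assumption, not
# the authors' implementation). The hidden state is projected down to a small
# bottleneck, passed through a nonlinearity, projected back up, and added to
# the input via a residual connection. Only these small projections are
# trained per task; the pre-trained network's weights stay frozen.
import torch
import torch.nn as nn

class Adapter(nn.Module):
    def __init__(self, hidden_size: int, bottleneck_size: int = 64):
        super().__init__()
        self.down_proj = nn.Linear(hidden_size, bottleneck_size)
        self.up_proj = nn.Linear(bottleneck_size, hidden_size)
        self.activation = nn.GELU()
        # Near-identity initialization so the adapted model initially behaves
        # like the original pre-trained model.
        nn.init.normal_(self.down_proj.weight, std=1e-3)
        nn.init.zeros_(self.down_proj.bias)
        nn.init.normal_(self.up_proj.weight, std=1e-3)
        nn.init.zeros_(self.up_proj.bias)

    def forward(self, hidden_states: torch.Tensor) -> torch.Tensor:
        # Residual connection preserves the original representation.
        return hidden_states + self.up_proj(self.activation(self.down_proj(hidden_states)))

In use, such modules would be inserted into each Transformer layer and the pre-trained parameters frozen (requires_grad = False), so that per-task training touches only the adapter weights and a task-specific classification head.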