Title: Efficient and Adaptive Machine Learning for Natural Language Processing
Date: Tuesday, June 4, 2024
Time: 11:00am - 1:30pm EDT
Location (Zoom): https://gatech.zoom.us/j/93873694265?pwd=RUNaNWJLclhxMzBQM3BGNmJobVZrZz09
PhD Candidate
School of Interactive Computing
College of Computing
Georgia Institute of Technology
Committee:
Dr. Diyi Yang (advisor), Computer Science Department, Stanford University
Dr. Mark Riedl (co-advisor), School of Interactive Computing, Georgia Tech
Dr. Alan Ritter, School of Interactive Computing, Georgia Tech
Dr. Zsolt Kira, School of Interactive Computing, Georgia Tech
Dr. Colin Raffel, Department of Computer Science, University of Toronto
Abstract:
Natural language processing (NLP) has recently undergone a transformative shift toward the development and application of Large Language Models (LLMs), which represent a significant leap in performance. However, the extensive computational resources and vast text corpora on which LLMs rely pose key challenges in resource-constrained scenarios, where data, compute, memory, or specialized expertise is limited. I argue that developing machine learning methods that can adapt efficiently under such constraints is particularly vital in the era of LLMs, to make the benefits of the technology sustainable, accessible, and generalizable.
In this thesis, I advocate for efficient and adaptive machine learning for NLP, aiming to make NLP models benefit real-world applications through three parts. Part I tackles the essential ingredient of NLP learning: data. I propose to improve the data efficiency of NLP systems by augmenting data through hidden-space manipulation, linguistically informed perturbation, generation based on human learning strategies, and reasoning graphs. Part II addresses information beyond the text in language: structures. I explore which structures can be explicitly incorporated to enhance NLP models, and how. Part III focuses on the process of adapting NLP models: training. I investigate how we can efficiently fine-tune, update, and unlearn NLP models.
From improving data efficiency and incorporating structures to improving training efficiency, this thesis develops efficient and adaptive machine learning for NLP, with the goal of democratizing NLP systems for settings with scarce resources, specialized domains, and emerging applications.