Title: Efficient and Adaptive Machine Learning for Natural Language Processing
Date: Tuesday, June 4, 2024
Time: 11:00am - 1:30pm EDT
Location (Zoom): https://gatech.zoom.us/j/93873694265?pwd=RUNaNWJLclhxMzBQM3BGNmJobVZrZz09
PhD Candidate
School of Interactive Computing
College of Computing
Georgia Institute of Technology
Committee:
Dr. Diyi Yang (advisor), Computer Science Department, Stanford University
Dr. Mark Riedl (co-advisor), School of Interactive Computing, Georgia Tech
Dr. Alan Ritter, School of Interactive Computing, Georgia Tech
Dr. Zsolt Kira, School of Interactive Computing, Georgia Tech
Dr. Colin Raffel, Department of Computer Science, University of Toronto
Abstract:
Natural language processing (NLP) has recently undergone a transformative shift toward the development and application of Large Language Models (LLMs), which represent a significant leap in performance. However, the extensive computational resources and vast text corpora on which LLMs rely pose key challenges in resource-constrained scenarios, where data, compute, memory, or specialized expertise is limited. I argue that developing machine learning methods that can adapt efficiently under such constraints is particularly vital in the era of LLMs, to make the benefits of the technology sustainable, accessible, and generalizable.
In this thesis, I advocate for efficient and adaptive machine learning for NLP, aiming to make NLP models benefit real-world applications through three parts. Part I tackles the essential ingredient of NLP learning: data. I propose to improve the data efficiency of NLP systems by augmenting data through hidden-space manipulation, linguistically informed perturbation, generation based on human learning strategies, and reasoning graphs. Part II addresses information beyond the text in language: structures. I explore which structures can be explicitly incorporated to enhance NLP models, and how. Part III focuses on the process of adapting NLP models: training. I investigate how we can efficiently fine-tune, update, and unlearn NLP models.
From improving data efficiency and incorporating structures to improving training efficiency, this thesis develops efficient and adaptive machine learning for NLP, with the goal of democratizing NLP systems for settings with scarce resources, specialized domains, and emerging applications.