Spring 2026
LING 4431: LLMs for Computational Linguistics [ Syllabus GU360 Site ]
Course Description: Large language models (LLMs) are the foundational technology behind today’s most advanced artificial intelligence systems. They have revolutionized the field of natural language processing and challenged traditional ideas about how language is represented in the human mind. This course offers an introduction to LLMs from the perspective of computational linguistics. Through lectures and hands-on demonstrations, we will explore three interrelated questions: How do LLMs work at a technical level? How can they be used to process natural language data? And how can they be used to model human linguistic cognition? The first half of the course will cover technical foundations of LLMs, including the transformer architecture, tokenization, interpretability techniques, and scaling. The second half of the course will focus on LLMs’ applications in natural language processing, linguistics, and cognitive science. Topics will include security and privacy, ethical issues, multilingualism, and LMs as models of human language processing and acquisition. This course is appropriate for advanced undergraduates and graduate students. Students will gain experience training (small) language models, implementing basic interpretability techniques, and reading recent research papers in the area. Prerequisites: knowledge of Python programming and of basic math and statistics.

LING 4400/4020: Computational Language Processing [ Syllabus GU360 Site ]
Course Description: This course will introduce students to the basics of Natural Language Processing (NLP), a field that combines linguistics and computer science to produce applications, such as generative AI, that are profoundly impacting our society. We will cover a range of topics that form the basis of these exciting technological advances and will provide students with a platform for future study and research in this area. We will learn to implement simple representations such as finite-state techniques, n-gram models and basic parsing in the Python programming language. Previous knowledge of Python is not required, but students should be prepared to invest the necessary time and effort to become proficient over the course of the semester. Students who take this course will gain a thorough understanding of the fundamental methods used in natural language processing, along with an ability to assess the strengths and weaknesses of natural language technologies based on these methods.
Fall 2025
LING 4480: Computational Linguistics Research Methods [ Draft Syllabus GU360 Site ]
Course Description: Computational Linguistics is a fast-growing and fast-moving field. This course is intended to give advanced undergraduate and graduate students practice conducting original research in computational linguistics and to enhance their research and communication skills. It will serve as a platform for students to pursue an independent research project with guidance and oversight from faculty and peers. Students will be expected to bring their own pre-existing research topics/questions to the class. Over the course of the semester, they will select and present key research papers pertinent to their topic, and develop the project with the goal of writing an ACL-style conference proceedings paper. In addition to hands-on research, this class will provide a venue for students to learn CL-related skills that often fall through the cracks of other, content-focused courses. Possible workshop topics include data annotation, LaTeX, and working with pretrained language models, as well as communication skills such as poster and slide design. As a hands-on course whose content changes based on the instructor and students, this course can be repeated for credit.
Spring 2025
LING 4400: Introduction to Natural Language Processing [ Syllabus GU360 Site ]
Course Description: See LING 4400/4020 (Computational Language Processing) above
Fall 2024
LING 4400: Introduction to Natural Language Processing [ Syllabus GU360 Site ]
Course Description: See LING 4400/4020 (Computational Language Processing) above

LING 8430: Information, Structure, and Language [ Syllabus GU360 Site ]
Course Description: This seminar brings together two divergent perspectives on human language. On one hand, linguistics research seeks to describe the structures that underlie human communication systems, often using formal tools such as grammars and logics. On the other hand, research in computer science, in particular information theory, seeks to discover the optimal way to package and transmit information over a channel. This seminar will focus on the intersection between these two programs: To what extent are human languages optimized for efficient communication? Can structural features of human language, or human linguistic behaviors, be analyzed using the toolkit developed for efficient information exchange? Topics covered will include the structure of the lexicon, the relationship between syntactic and statistical dependencies, pragmatic inferences, and various language processing phenomena. Students will gain experience reading and presenting research papers in this area, and implementing concepts from information theory in code. Students should be proficient in at least one programming language (Python or R), and familiar with basic concepts of probability theory and/or machine learning.
Experience as a TA
• Quantitative Methods in Linguistics (Harvard, Spring 2020)
• Computational Psycholinguistics (MIT, Spring 2020)
• Introduction to Linguistics (Harvard, Fall 2020)