Course description

Natural Language Processing (NLP) is the engineering art and science of teaching computers to understand human language. NLP is a type of artificial intelligence technology, and it’s now ubiquitous – NLP lets us talk to our phones, use the web to answer questions, map out discussions in books and social media, and even translate between human languages. Since language is rich, subtle, ambiguous, and very difficult for computers to understand, these systems can sometimes seem like magic – but these are engineering problems we can tackle with data, math, machine learning, and insights from linguistics.

This course will introduce NLP methods and applications including probabilistic language models, machine translation, and parsing algorithms for syntax and the deeper meaning of text. During the course, students will (1) learn and derive mathematical models and algorithms for NLP; (2) become familiar with key facts about human language that motivate these models and help practitioners know which problems are possible to solve; and (3) complete a series of hands-on projects to implement, experiment with, and improve NLP models, gaining practical skills for natural language systems engineering.

This course is intended for upper-level CS undergraduates and graduate students, as well as linguistics students with an appropriate background.

Prerequisites: experience in programming and probability. Undergraduates must have completed ((CS220 or CS230) and CS240) or Ling492B.

Override waitlist

If you need to sign up on the override waitlist, make sure to do it on the CS department’s form at: www.cs.umass.edu/overrides. The course has been full ever since signups opened, and even after expanding it to a new room there was still a waitlist. Make sure to carefully describe your reason for taking the course. Since demand is high, we will not be able to accommodate everyone who wants to enroll.

Bird et al., NLP with Python, a.k.a. the NLTK book. Free and very introductory. NLTK’s prepackaged NLP tools are usually not state-of-the-art, but it has easy-to-use interfaces to data and resources, and the book is a good, gentle introduction to NLP with a CL (computational linguistics) emphasis.