CS6120: Natural Language Processing

Spring 2020

This is a graduate course on natural language processing. This term, there is a separate undergraduate section (CS4120).

Instructor: David Smith, Associate Professor in Khoury College of Computer Sciences (Office Hours: Fridays, 11–1, or by appointment; WVH 356)

Teaching assistants:

Class meeting: Tuesdays and Fridays, 1:35–3:15, West Village F 020

Other information, lecture notes, and questions are available on the Piazza discussion board. When you ask questions on Piazza, other students will be able to learn from the discussion.


Course Texts

We will be following, for the most part, the material in Jacob Eisenstein, Introduction to Natural Language Processing, MIT Press, 2019. You may buy it from the publisher or your favorite online bookseller, or find the author's manuscript online. Readings will be keyed to section numbers that should be consistent across print, online, and PDF versions.

A helpful resource for the programming assignments in PyTorch is Delip Rao and Brian McMahan, Natural Language Processing with PyTorch, O'Reilly, 2019. Besides the usual methods of purchase, you should be able to read it online via the Northeastern Library's subscription using a northeastern.edu email address.

We will also suggest supplemental readings in the draft third edition of Jurafsky and Martin's Speech and Language Processing (JM3).

Syllabus

This schedule is subject to change. Check back as the class progresses or consult the lecture notes on Piazza.

  1. Why NLP? (January 7)
  2. Text classification (January 10–31)
  3. Language modeling
  4. Sequence labeling
  5. Midterm, March 13
  6. Syntactic and semantic trees and graphs
  7. Compositional semantics as logical inference
  8. Distributional semantics and embeddings
  9. Information extraction: entities and relations
  10. Machine translation
  11. Final Exam, April 17

Course Policies

Examinations

The course was planned with a midterm and a final examination. The midterm will be administered in class (13 March), will require one class period of 100 minutes, and will constitute 20% of the course grade. The final was to be administered on Friday, 17 April, and to constitute 20% of the course grade; it has since been canceled, and grades will be smoothed as if everyone had done well on it.

Quizzes

To help consolidate your understanding of the material, there will be in-class quizzes approximately every other week. They will be worth only a small number of points each, mostly achievable by simply attempting the questions, and will make up 10% of the course grade.

Homework

There will be four assignments, each making up 12.5% of the course grade. They will consist of implementing practical solutions for NLP tasks such as text classification and sequence tagging.

Late policy: Assignments are due at the announced due date and time, usually 11:59 p.m. You will be granted one homework extension of four calendar days, to be used at your discretion, without having to ask. This single extension is meant to smooth over unforeseen crunches in your schedule, and you cannot simply distribute the four late days among four assignments. After the first late assignment, unexcused late assignments will be penalized 20% per 24-hour period late. We normally will not accept assignments after the date on which the following assignment is due or after the solutions have been handed out, whichever comes first. If you know in advance of circumstances that would cause you to turn in an assignment late, please contact the instructor before the assignment is due to ask if an extension is possible.

Academic Integrity

All work submitted for credit must be your own.

You may discuss the homework assignments with your classmates, the TAs, and the instructor. You must acknowledge the people with whom you discussed your work, and you must write up your own code and solutions.

Any written sources used (apart from the text) must also be acknowledged; however, you may not consult any solutions from previous years' assignments, whether they are student- or faculty-generated.

Accommodations for Students with Disabilities

If you have a disability-related need for reasonable academic accommodations in this course and have not yet met with a Disability Specialist, please visit www.northeastern.edu/drc and follow the outlined procedure to request services.

If the Disability Resource Center has formally approved you for an academic accommodation in this class, please present the instructor with your “Professor Notification Letter” during the first two weeks of the semester, so that we can address your specific needs as early as possible. You should also feel free to drop by the instructor's office hours to discuss your concerns about the course.