
This schedule is subject to change. Check back as the class progresses.

- n-gram models, naive Bayes classifiers, probability, estimation
- We also played the “Shannon game”, guessing the next letter from the previous *n* letters. For background, see Shannon's paper from 1950.
**Reading for Jan. 17:** Bo Pang, Lillian Lee, and Shivakumar Vaithyanathan. Thumbs up? Sentiment Classification using Machine Learning Techniques. EMNLP, 2002.
**Background:** Jurafsky & Martin, chapter 4
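The Shannon game can be simulated with a character *n*-gram model: count which letter follows each context, then guess the most frequent continuation. A minimal sketch (the corpus and function names are my own, not from the course materials):

```python
from collections import Counter, defaultdict

def train_ngrams(text, n):
    """Count which letter follows each context of n letters."""
    counts = defaultdict(Counter)
    for i in range(len(text) - n):
        counts[text[i:i + n]][text[i + n]] += 1
    return counts

def guess_next(counts, context):
    """Most frequent continuation of the context, or None if unseen."""
    if context not in counts:
        return None
    return counts[context].most_common(1)[0][0]

corpus = "the cat sat on the mat and the cat ran"
model = train_ngrams(corpus, 3)
print(guess_next(model, "the"))  # a space: "the" is always followed by " " here
```

A real model would smooth the counts and back off to shorter contexts when the full context is unseen.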

- history of NLP research, the Chomsky hierarchy, regular expressions, (weighted) finite-state automata and transducers
**Reading for Jan. 24:** Karttunen, Chanod, Grefenstette, and Schiller. Regular expressions for language engineering. Journal of Natural Language Engineering, 1997.
**Background:** Jurafsky & Martin, chapter 2
**Reading for Jan. 31:** Okan Kolak, William Byrne, and Philip Resnik. A Generative Probabilistic OCR Model for NLP Applications. In HLT-NAACL, 2003.
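As a toy illustration of the regular-expression/automaton connection (my own example, not from the readings): a hand-coded deterministic finite-state automaton equivalent to the regular expression `[ab]*ab`, cross-checked against Python's `re` module:

```python
import re

# Transition table: (state, symbol) -> next state.
DELTA = {
    (0, "a"): 1, (0, "b"): 0,
    (1, "a"): 1, (1, "b"): 2,
    (2, "a"): 1, (2, "b"): 0,
}
ACCEPT = {2}  # state 2 means "just saw ...ab"

def accepts(s):
    """Run the DFA over s and report whether it ends in an accepting state."""
    state = 0
    for ch in s:
        state = DELTA[(state, ch)]
    return state in ACCEPT

# The DFA and the regex define the same language.
for s in ["ab", "baab", "aba", "bb"]:
    assert accepts(s) == bool(re.fullmatch(r"[ab]*ab", s))
```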

Noisy Channel and Hidden Markov Models

- noisy channel models with finite-state transducers; part-of-speech tagging; hidden Markov models as noisy channel models; the Viterbi and Forward-Backward algorithms; parameter estimation with supervised maximum likelihood and with expectation maximization
**Background:** Jurafsky & Martin, chapter 5
**Reading for Feb. 14:** Bikel, Schwartz, and Weischedel. An Algorithm that Learns What's in a Name. Machine Learning, 34(1–3), 1999.
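The Viterbi algorithm finds the highest-probability tag sequence by dynamic programming over the HMM trellis. A minimal sketch on a made-up two-tag model (all probabilities are invented for illustration):

```python
import math

def viterbi(obs, states, start_p, trans_p, emit_p):
    """Best state sequence for obs under an HMM, computed in log space."""
    # chart[t][s] = (best log-prob of a path ending in state s at time t, that path)
    chart = [{s: (math.log(start_p[s] * emit_p[s][obs[0]]), [s]) for s in states}]
    for t in range(1, len(obs)):
        chart.append({})
        for s in states:
            chart[t][s] = max(
                (chart[t - 1][prev][0]
                 + math.log(trans_p[prev][s] * emit_p[s][obs[t]]),
                 chart[t - 1][prev][1] + [s])
                for prev in states)
    return max(chart[-1].values())[1]

# Invented toy model: two tags, three words.
states = ["N", "V"]
start = {"N": 0.8, "V": 0.2}
trans = {"N": {"N": 0.3, "V": 0.7}, "V": {"N": 0.8, "V": 0.2}}
emit = {"N": {"they": 0.6, "can": 0.1, "fish": 0.3},
        "V": {"they": 0.05, "can": 0.6, "fish": 0.35}}
print(viterbi(["they", "can", "fish"], states, start, trans, emit))
```

Forward-Backward has the same trellis structure but replaces max with sum, yielding per-position posteriors instead of a single best path.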

Context-Free Grammars and Parsers

**Reading for Feb. 28:** Dan Klein and Christopher D. Manning. Accurate Unlexicalized Parsing. ACL, 2003.
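The Klein & Manning parser is far more elaborate, but the underlying chart-parsing idea can be shown with a tiny CKY recognizer over a grammar in Chomsky normal form (the grammar and sentence are my own toy example):

```python
def cky_recognize(words, lexicon, binary_rules, start="S"):
    """CKY recognition: chart[i][j] holds the nonterminals spanning words[i:j]."""
    n = len(words)
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, w in enumerate(words):
        chart[i][i + 1] = {A for A, word in lexicon if word == w}
    for span in range(2, n + 1):
        for i in range(n - span + 1):
            j = i + span
            for k in range(i + 1, j):  # try every split point
                for A, B, C in binary_rules:  # rule A -> B C
                    if B in chart[i][k] and C in chart[k][j]:
                        chart[i][j].add(A)
    return start in chart[0][n]

# Toy CNF grammar (illustrative only).
lexicon = [("NP", "she"), ("V", "eats"), ("NP", "fish")]
rules = [("S", "NP", "VP"), ("VP", "V", "NP")]
print(cky_recognize(["she", "eats", "fish"], lexicon, rules))  # True
```

A probabilistic version stores the best score for each nonterminal in each cell instead of a bare set.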

- also known as logistic regression or maximum entropy (maxent) models; directly modeling the conditional probability of the output given the input, rather than the joint probability of input and output (and then applying Bayes' rule)
**Background:** Jurafsky & Martin, chapter 6
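In this view, p(y | x) comes directly from a weighted sum of features pushed through a normalizer. A minimal binary logistic regression sketch (the feature names and weights are made up):

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def p_positive(features, weights):
    """Conditional probability p(y = 1 | x) under a log-linear model
    with binary features: sigmoid of the summed feature weights."""
    return sigmoid(sum(weights.get(f, 0.0) for f in features))

# Hypothetical sentiment features and weights.
weights = {"contains_great": 2.0, "contains_awful": -3.0, "bias": 0.5}
print(p_positive(["contains_great", "bias"], weights))  # sigmoid(2.5), about 0.92
```

With more than two classes, the sigmoid generalizes to a softmax over one weight vector per class.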

Log-Linear Models with Structured Outputs

- models that decide among combinatorially many outputs, e.g. sequences of tags or dependency links; locally normalized (action-based) models such as Maximum Entropy Markov Models (MEMMs); globally normalized models such as linear-chain Conditional Random Fields (CRFs)
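The local/global contrast can be made concrete: a linear-chain CRF scores whole tag sequences and divides by a sum over every possible sequence. A brute-force sketch (feature scores are invented; a real CRF computes the normalizer Z with the forward algorithm rather than by enumeration):

```python
from itertools import product
import math

def seq_score(tags, words, emit, trans):
    """Unnormalized log-linear score of an entire tag sequence."""
    score = sum(emit.get((t, w), 0.0) for t, w in zip(tags, words))
    score += sum(trans.get((a, b), 0.0) for a, b in zip(tags, tags[1:]))
    return score

def crf_prob(tags, words, tagset, emit, trans):
    """Globally normalized p(tags | words): exp(score) over the sum
    of exp(score) for all |tagset|^len(words) sequences."""
    z = sum(math.exp(seq_score(seq, words, emit, trans))
            for seq in product(tagset, repeat=len(words)))
    return math.exp(seq_score(tags, words, emit, trans)) / z

# Invented toy feature scores.
emit = {("N", "fish"): 1.0, ("V", "fish"): 0.5, ("N", "they"): 1.5}
trans = {("N", "V"): 0.8}
p = crf_prob(("N", "V"), ("they", "fish"), ("N", "V"), emit, trans)
```

An MEMM instead normalizes at each step over the next tag only, which is cheaper but introduces the label bias problem.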

- logical form: lambda expressions, event semantics, quantifiers, intensional semantics; computational semantics: semantic role labeling, combinatory categorial grammar (CCG), tree adjoining grammar (TAG); lexical semantics: vector space representations, greedy agglomerative clustering, k-means and EM clustering; learning hyper(o)nym relations for nouns and verbs
**Background:** Jurafsky & Martin, chapters 18–20; see also NLTK book, chapter 10
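k-means clustering of word vectors can be sketched in a few lines; here on 2-D points standing in for embeddings (the data and random seed are arbitrary):

```python
import random

def dist2(p, q):
    """Squared Euclidean distance."""
    return sum((a - b) ** 2 for a, b in zip(p, q))

def mean(pts):
    """Componentwise mean of a non-empty list of points."""
    return tuple(sum(cs) / len(pts) for cs in zip(*pts))

def kmeans(points, k, iters=20, seed=0):
    """Plain k-means: assign each point to its nearest center, then recenter."""
    centers = random.Random(seed).sample(points, k)
    clusters = [[] for _ in range(k)]
    for _ in range(iters):
        clusters = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: dist2(p, centers[c]))
            clusters[nearest].append(p)
        # Empty clusters keep their old center.
        centers = [mean(cl) if cl else centers[i] for i, cl in enumerate(clusters)]
    return clusters

# Two well-separated groups standing in for word vectors.
points = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
clusters = kmeans(points, 2)
```

EM clustering follows the same assign/recenter rhythm but with soft (probabilistic) assignments; greedy agglomerative clustering instead starts from singleton clusters and repeatedly merges the closest pair.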