DS 4400: Machine Learning and Data Mining I

Fall 2020

 

Instructors:

  • Instructor: Alina Oprea (alinao)
  • TAs: Matthew Jagielski; Alex Wang

Class Schedule:

·      Tuesday 11:45am-1:25pm; Thursday 2:50-4:30pm EST

·      Location: Remote on Zoom (See Canvas for link)

Office Hours:

·      Alina:

o   Tuesday, 4:00-5:30pm EST, Remote on Zoom

o   Thursday, 4:30-5:30pm EST, Remote on Zoom

·      Alex: Wednesday 5:00-7:00pm EST

·      Matthew: Monday 3:00-4:00pm, Friday 9:00-10:00am EST

Class forum:  Piazza (See Canvas for link)

Class policies: The academic integrity policy is strictly enforced.

 

Class description:

Machine learning is a fast-paced and exciting field achieving human-level performance in tasks such as image classification, speech recognition, machine translation, precision medicine, and self-driving cars. Machine learning has already greatly impacted our daily lives and has the potential to transform the world even more in the near future. This course will provide a broad introduction to machine learning and cover the fundamental algorithms for supervised learning. We will cover topics related to regression, linear classification, non-linear classification, ensemble models, and deep learning. The class will also provide an introduction to the ethics and fairness concerns of machine learning, as well as to adversarial machine learning, an emerging area that studies the fundamental security issues of machine learning.

 

Pre-requisites:

·      Probability

·      Statistics

·      Linear algebra

 

Textbook

[ISL] Gareth James, Daniela Witten, Trevor Hastie, and Robert Tibshirani. An Introduction to Statistical Learning with Applications in R.

Grading

The grade will be based on:

 

-       Assignments – 25%

-       Final project report and presentation – 35%

-       Exam – 35%

-       Class participation – 5%

 

Calendar (Tentative)

 

Unit

Week

Date

Topic

Readings

Introduction

1

Thu

09/10

Course outline (syllabus, grading, policies)

[Slides]

[ISL] Chapters 1 and 2.1

Review

 

Tue

09/15

Classification and regression

Bias-variance tradeoff

[Slides] [Annotations]

[ISL] Chapters 2.2.1 and 2.2.2

Probability review from Stanford

Linear regression

2

Thu

09/17

Probability and linear algebra review.

[Slides] [Annotations]

Linear algebra review from Stanford

HW 1 out Sept 17

Tue

09/22

Simple linear regression

Closed form solution. Correlation

[Slides] [Annotations] [Notes]

[ISL] Chapter 3.1

 

3

Thu

09/24

Multiple linear regression

Closed form solution

[Slides] [Annotations] [Notes]

[ISL] Chapter 3.2

Tue

09/29

Multiple linear regression

Gradient descent for linear regression

[Slides] [Annotations]

Linear regression notes from Stanford, Part I.1 (LMS Algorithm)

HW 1 due Sept. 28

HW 2 out Sept. 29

 

Regularization and cross-validation

4

Thu

10/01

Regularization.

Lasso and ridge regression

k-Nearest Neighbors (kNN).

[Slides] [Annotations]

[ISL] Chapter 6.2

 

Tue

10/06

Cross-validation

Linear classification. Perceptron.

[Slides] [Annotations]

[ISL] Chapter 5.1

[ISL] Chapter 4.1, 4.2, and 4.3

 

 

Linear Classification

5

Thu

10/08

Logistic regression

Maximum likelihood estimation

[Slides] [Annotations]

 

Logistic regression notes from Stanford, Part II

Tue

10/13

Logistic regression

Evaluation of ML, metrics

[Slides] [Annotations] [Notes]

[ISL]

HW 2 due Oct. 13

 

6

Thu

10/15

Project discussion

Evaluation of ML

ROC curves

[Slides] [Annotations]

 

Tue

10/20

LDA

Naïve Bayes

[Slides] [Annotations]

[ISL] Chapter 4.4 for LDA

Lecture notes from Cornell for Naïve Bayes

Generative Models

7

Thu

10/22

Naïve Bayes

Decision trees

[Slides] [Annotations]

[ISL] Chapter 8.1.2

 

Tue

10/27

Decision trees

Ensemble learning

[Slides] [Annotations]

Tree and Ensemble Classification

8

Thu

10/29

Bagging, Random forest

[Slides] [Annotations]

[ISL] Chapter 8.2

HW3 due Oct 29

 

 

9

Tue

11/03

Boosting, AdaBoost

[Slides] [Annotations]

Project proposal due Nov 2

HW4 out

 

 

Thu

11/05

Neural networks and deep learning.

Feed-Forward Networks.

[Slides] [Annotations]

Deep learning notes from Stanford, Part 2

Optional: Chapter 4 from Dive into Deep Learning

Deep learning

 

 

10

Tue

11/10

Feed-Forward Networks

Multi-class classification.

Keras tutorial.

[Slides]

Optional: Chapter 6 from Dive into Deep Learning

 

Thu

11/12

Convolutional neural networks.

Review and exam preparation

[Slides] [Annotations]

Review and Exam

11

Tue

11/17

Backpropagation

[Slides] [Annotations]

HW4 due on Nov. 21

 

 

Thu

11/19

Exam

 

 

12

Tue

11/24

Regularization in Neural Networks

Stochastic Gradient Descent

Transfer Learning

[Slides] [Annotations]

Project milestone due Nov. 25

Ethics of ML and Adversarial ML

Thu

11/26

Thanksgiving break

13

Tue

12/01

Ethics of ML/AI

[Slides]

 

Thu

12/03

Adversarial machine learning

[Slides]

 

 

14

Tue

12/08

Project presentations 1

 

 

 

Wed

12/09

12-2pm

Project presentations 2

Project report due Dec. 15

 

 

Additional reading

 

 

 

Other resources

 

Books: