Brief Course Description

This course will introduce the student to reinforcement learning. The bulk of what we will cover comes straight from the second edition of Sutton and Barto's book, Reinforcement Learning: An Introduction. However, we will also cover additional material drawn from the latest deep RL literature. The following list of topics is subject to revision.

  1. Problem formulation
    • The reinforcement learning problem, anatomy of an RL agent
    • Bandit problems
    • Markov decision processes
  2. Value function methods
    • Monte Carlo
    • Dynamic Programming
    • Temporal Difference Learning
    • Combining planning and learning
    • Batch learning
  3. Function Approximation
    • Linear function approximation and LSTD
    • Deep neural networks and DQN
    • When does function approximation succeed and fail?
  4. Policy Gradient Methods
    • Policy gradient formulation
    • REINFORCE algorithm
    • Actor/critic methods
    • DDPG, TRPO

The course schedule is subject to change. See the schedule tab above.


Reinforcement Learning: An Introduction, 2nd Ed., Sutton and Barto


  1. Python 2.7, OpenAI Gym, TensorFlow: You must be able to write code in Python 2.7 and to install OpenAI Gym and TensorFlow. Ideally, you will do this on a Linux installation such as Ubuntu, but you may use other operating systems at your own risk.
  2. Probability: You should be comfortable with all of the concepts described here.
  3. Algorithms: You need to have taken or be taking a course in algorithms such as CS 5800 or CS 7800. If you do not have this, but would still like to take the course, please contact the instructor.
  4. Linear Algebra: You should be comfortable with all of the concepts described here.
  5. GPU: It would help to have access to a CUDA-enabled GPU (with the drivers installed), but it is not strictly necessary for the course.

Academic Integrity

Cheating and other acts of academic dishonesty will be referred to OSCCR (Office of Student Conduct and Conflict Resolution) and the College of Computer Science. See this link.

Lateness Policy

Late assignments will be penalized by 10% for each day late. For example, if you turned in a perfect programming assignment two days late, you would receive an 80% instead of 100%.
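The arithmetic of the lateness policy can be sketched in a few lines of Python. This is only an illustration, not an official grading script: the function name `late_score` and its parameters are hypothetical, and it assumes the penalty is subtracted in percentage points of the full score and that a score never drops below zero, neither of which is stated above.

```python
def late_score(raw_score, days_late, penalty_per_day=10.0):
    """Return the score (0-100) after applying the lateness penalty.

    Assumptions (not stated in the syllabus): the penalty is subtracted
    in percentage points, and the result is floored at zero.
    """
    if days_late < 0:
        raise ValueError("days_late must be non-negative")
    return max(0.0, raw_score - penalty_per_day * days_late)


# The example from the policy: a perfect assignment turned in two days late.
print(late_score(100.0, 2))  # prints 80.0
```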

Instruction Staff

Primary Instructor: Robert Platt ( r [dot] platt [at] neu [dot] edu )
Office hours: Thursday 3--4pm, 526 ISEC, or by appointment.

TA: Colin Kohler
Office hours: Weds 3--4pm, Fri 11am--noon, 505 ISEC


Our Piazza page is here.

Work Load

Required course work is:

  • Approximately weekly problem sets and/or programming assignments (55% of your grade)
  • 1 Final project (40% of your grade)
  • Frequent quizzes on Blackboard (5% of your grade, each quiz graded pass/fail)
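As a rough illustration of how these weights combine, the following Python sketch computes an overall grade. The names `WEIGHTS` and `course_grade` are hypothetical, and the sketch assumes that assignments and the project are each scored from 0 to 100 and that the quiz component is the fraction of quizzes passed. The actual aggregation used in the course may differ.

```python
# Weights taken from the list above; they sum to 1.0.
WEIGHTS = {"assignments": 0.55, "project": 0.40, "quizzes": 0.05}


def course_grade(assignments, project, quizzes_passed, quizzes_total):
    """Combine component scores (0-100) into an overall grade (0-100).

    Assumption: the quiz component is the percentage of quizzes passed,
    since each quiz is graded pass/fail.
    """
    if quizzes_total <= 0:
        raise ValueError("quizzes_total must be positive")
    quiz_score = 100.0 * quizzes_passed / quizzes_total
    return (WEIGHTS["assignments"] * assignments
            + WEIGHTS["project"] * project
            + WEIGHTS["quizzes"] * quiz_score)


# Hypothetical student: 90 on assignments, 85 on the project, 8 of 10 quizzes passed.
print(round(course_grade(90.0, 85.0, 8, 10), 1))
```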

Programming assignments

Many of the weekly assignments will involve programming in Python. You must be proficient with Python and able to install TensorFlow on your computer.

Problem sets

We will assign approximately one problem set each Thursday that will be due on the following Tuesday.

Final project

The final project assignment can be found here. The final project can be on any topic related to RL. Students may work alone or in groups of two. Many people choose to work on a project applying a method studied in the class to some practical problem. The amount of project work should be equivalent to approximately three programming assignments.

How to turn in the programming assignments

We're using Git. You should follow the instructions outlined here.