Time and Place: Monday, Thursday 11:45am - 1:25pm, Snell Library 033

College of Computer and Information Science

Instructor: Robert Platt

This course will introduce students to reinforcement learning. The bulk of what we will cover comes straight from the second edition of Sutton and Barto's book, *Reinforcement Learning: An Introduction*. However, we will also cover additional material drawn from the latest deep RL literature. The following list of topics is subject to revision.

- Problem formulation
  - The reinforcement learning problem, anatomy of an RL agent
  - Bandit problems
  - Markov decision processes
- Value function methods
  - Monte Carlo
  - Dynamic programming
  - Temporal difference learning
  - Combining planning and learning
  - Batch learning
- Function approximation
  - Linear function approximation and LSTD
  - Deep neural networks and DQN
  - When does function approximation succeed, and when does it fail?
- Policy gradient methods
  - Policy gradient formulation
  - The REINFORCE algorithm
  - Actor-critic methods
  - DDPG and TRPO
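As a small taste of the bandit-problems topic above, here is a minimal epsilon-greedy sketch using incremental sample-average value estimates (an illustrative example only, not course-provided code; the arm means and parameters are made up):

```python
import random

def epsilon_greedy_bandit(true_means, steps=5000, epsilon=0.1, seed=0):
    """Run an epsilon-greedy agent on a k-armed Gaussian bandit.

    true_means: list of true mean rewards, one per arm (unknown to the agent).
    Returns the agent's value estimates and pull counts per arm.
    """
    rng = random.Random(seed)
    k = len(true_means)
    estimates = [0.0] * k   # sample-average estimate of each arm's value
    counts = [0] * k        # number of times each arm has been pulled
    for _ in range(steps):
        if rng.random() < epsilon:
            a = rng.randrange(k)                            # explore
        else:
            a = max(range(k), key=lambda i: estimates[i])   # exploit
        reward = rng.gauss(true_means[a], 1.0)              # noisy reward
        counts[a] += 1
        # incremental update of the sample mean for arm a
        estimates[a] += (reward - estimates[a]) / counts[a]
    return estimates, counts

est, counts = epsilon_greedy_bandit([0.2, 0.5, 1.0])
```

With enough steps, the agent pulls the best arm (true mean 1.0) far more often than the others and its estimate for that arm approaches the true mean.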

Textbook: *Reinforcement Learning: An Introduction*, 2nd ed., Sutton and Barto

**Python 2.7, OpenAI Gym, TensorFlow:** You must be able to write code in Python 2.7 and to install OpenAI Gym and TensorFlow. Ideally, you will do this on a Linux installation such as Ubuntu, but you may use other OSes at your own risk.

**Probability:** You should be comfortable with all of the concepts described here.

**Algorithms:** You need to have taken, or be taking, a course in algorithms such as CS 5800 or CS 7800. If you do not have this background but would still like to take the course, please contact the instructor.

**Linear Algebra:** You should be comfortable with all of the concepts described here.

**GPU:** Access to a CUDA-enabled GPU (with drivers installed) would help, but it is not strictly necessary for the course.

Cheating and other acts of academic dishonesty will be referred to OSCCR (the Office of Student Conduct and Conflict Resolution) and the College of Computer and Information Science. See this link.

Late assignments will be penalized 10% for each day late. For example, a perfect programming assignment turned in two days late would receive 80% instead of 100%.

Primary Instructor: Robert Platt ( r [dot] platt [at] neu [dot] edu )

Office hours: Thursday 3--4pm, 526 ISEC, or by appointment.

TA: Colin Kohler, kohler.c@husky.neu.edu

Office hours: Weds 3--4pm, Fri 11am--noon, 505 ISEC

Our Piazza page is here.

Required course work is:

- Approximately weekly problem sets and/or programming assignments (55% of your grade)
- 1 Final project (40% of your grade)
- Frequent quizzes on Blackboard (5% of your grade; each quiz is graded pass/fail)

Many of the weekly assignments will involve programming in Python. You must be proficient in Python and able to install TensorFlow on your computer.

We will assign approximately one problem set each Thursday that will be due on the following Tuesday.

The final project assignment can be found here. The final project can be on any topic related to RL. Students may work alone or in groups of two. Many students choose to apply a method studied in the class to some practical problem. The amount of project work should be equivalent to approximately three programming assignments.

We're using git. You should follow the instructions outlined here.