Date | Topic/Notes | Reading | Assignment due |
9/6 | Introduction to RL | SB 1.1--1.5 | Self Assessment (Solutions). |
9/10 | Bandit Problems | SB 2.1--2.9 | Bandits quiz on Blackboard DUE. |
9/13 | CLASS CANCELLED! (Rob at BoT mtg) | | |
9/17 | Bandit Problems | | |
9/20 | MDPs | SB 3.1--3.8 | MDPs quiz on Blackboard DUE. |
9/24 | MDPs | | Bandits assignment DUE |
9/27 | MDPs | | |
10/1 | Monte Carlo | SB 5.1--5.7 | Monte Carlo quiz on Blackboard DUE. |
10/4 | Off-Policy Monte Carlo | | Dynamic Programming quiz on Blackboard DUE. |
10/8 | COLUMBUS DAY | | MDPs assignment DUE; Monte Carlo assignment OUT (ipynb). |
10/11 | Dynamic Programming | SB 4.1--4.8 | |
10/15 | Dynamic Programming | | Monte Carlo assignment DUE (ipynb) |
10/18 | Dynamic Programming | | Dynamic Programming assignment OUT (ipynb) |
10/22 | Temporal Difference Learning | SB 6.1--6.8 | TD Learning quiz on Blackboard DUE. |
10/25 | Temporal Difference Learning | | |
10/26 | | | Dynamic Programming assignment DUE (ipynb), Project Proposal DUE |
10/29 | Temporal Difference Learning, Deep Learning Overview | GBC, 6.1--6.4, 9.1--9.3 | |
11/1 | DQN and extensions | Mnih, 2014 (DQN), Hasselt, 2015 (Double DQN), Schaul, 2016 (Prioritized Replay), Wang, 2015 (Dueling), Mnih, 2016 (A3C) | Neural Networks quiz DUE, TD Learning assignment OUT (ipynb) |
11/5 | NO CLASS! ROB AT CONFERENCE! | | |
11/8 | DQN and extensions | Mnih, 2014 (DQN), Hasselt, 2015 (Double DQN), Schaul, 2016 (Prioritized Replay), Wang, 2015 (Dueling), Mnih, 2016 (A3C) | |
11/12 | VETERANS DAY | | TD Learning assignment DUE (ipynb) |
11/15 | DQN and extensions | Mnih, 2014 (DQN), Hasselt, 2015 (Double DQN), Schaul, 2016 (Prioritized Replay), Wang, 2015 (Dueling), Mnih, 2016 (A3C) | |
11/19 | DQN and extensions, Linear function approximation | | DQN assignment OUT (no PDF here, it's just the ipynb notebook) |
11/22 | THANKSGIVING | | |
11/26 | Linear function approximation | SB 9.1--9.5, 9.8 | |
11/29 | Model-based RL | SB 8.1--8.6 | |
11/30 | | | DQN assignment DUE (no PDF here, it's just the ipynb notebook). SOLUTIONS |
12/3 | Policy gradient and actor-critic | SB 13.1--13.7, Silver, 2014 (DPG), Lillicrap, 2016 (DDPG), Mnih, 2016 (A3C) | |
12/6 | Policy gradient and actor-critic, course wrap-up | SB 13.1--13.7, Silver, 2014 (DPG), Lillicrap, 2016 (DDPG), Mnih, 2016 (A3C) | |
12/11 | | | Final Project DUE |