CS 980-01: Planning under uncertainty for single and multi-robot systems

Who: Professor Amato
When: Tuesday, Thursday 9:40 am - 11:00 am
Where: Kingsbury N233 (No longer in S145!!!)


In this class, students will learn about the state of the art in autonomously generating solutions to planning problems under uncertainty (e.g., optimally controlling robots or solving games). We will concentrate on methods where there is partial information (e.g., noisy sensors or other limited information) and discuss the different methods that can be used when there is one decision-maker (or "agent") versus a set of competing or cooperative agents. The models considered will include different types of Markov decision processes (MDPs) as well as other methods for multi-agent coordination (e.g., auctions and rule-based systems). We will read recent research papers that describe and compare the various methods, as well as a number of application papers. Although we will focus on robotics, we will also read about other application domains such as video games and networking.

Reading responses: For each paper, students should write a short (<1 page) response. See "how to read a paper" for guidelines and how to structure your responses.

Projects: Students will write research papers on a topic of their choosing. Topics can be related to any of the material we cover in class and be theoretical or experimental. More details on projects will be discussed soon.

Students leading the discussion: Later in the semester, students will each choose a paper to present and will lead the discussion for that class. Papers must be relevant to the material covered in this class and cleared by the date below.

Prerequisites: Previous exposure to AI planning is helpful, but not necessary.


Date Assignment Extra material
1/20 Overview
1/22 Chap. 3 of Reinforcement Learning: An Introduction by Sutton and Barto
No response needed!
1/27 No class! (send in answers to exercises 17.1-4 & 17.9)
17.1-17.3 of Artificial Intelligence: A Modern Approach, 3rd edition
1/29 No class! (send in answers to exercises 17.13-15) 17.4 of Artificial Intelligence: A Modern Approach, 3rd edition
2/3 No class!
2/5 High-level Reinforcement Learning in Strategy Games by Amato and Shani (extra material: Robot skill learning video)
We will also go over answers to exercises
2/10 Planning and Acting in Partially Observable Stochastic Domains by Kaelbling, Littman and Cassandra (only up to 4.4!) (extra material: POMDPs for dummies)
2/12 (1) Point-based value iteration: An anytime algorithm for POMDPs by Pineau, Gordon and Thrun
(2) SARSOP: Efficient Point-Based POMDP Planning by Approximating Optimally Reachable Belief Spaces by Kurniawati, Hsu and Lee

2/17 (1) Collision Avoidance for Unmanned Aircraft using Markov Decision Processes by Temizer, Kochenderfer, Lozano-Perez and Kaelbling
(2) Unmanned Aircraft Collision Avoidance using Continuous-State POMDPs by Bai, Hsu, Kochenderfer and Lee

2/19 Monte-Carlo Planning in Large POMDPs by Silver and Veness (extra material: POMCP pac-man video)
2/24 (1) Experiences with a Mobile Robotic Guide for the Elderly by Montemerlo, Pineau, Roy, Thrun and Verma
(2) Spoken Dialogue Management Using Probabilistic Reasoning by Roy, Pineau and Thrun

2/26 Relatively Robust Grasping by Hsiao, Lozano-Perez and Kaelbling

3/3 The Belief Roadmap: Efficient Planning in Linear POMDPs by Factoring the Covariance by Prentice and Roy

3/5 Planning How to Learn by Bai, Hsu and Lee

3/10 Monte Carlo Value Iteration with Macro-Actions by Lim, Hsu and Lee

Choose papers to present
3/12 DESPOT: Online POMDP Planning with Regularization by Somani, Lim, Hsu and Lee
3/17 Spring Break!
3/19 Spring Break!
3/24 Efficient planning in non-Gaussian belief spaces and its application to robot grasping by Platt, Kaelbling, Lozano-Perez and Tedrake

Project topics due
3/26 Monte Carlo Bayesian Reinforcement Learning by Wang, Won, Hsu and Lee (Luke Jablonski)
3/31 Using POMDPs to Control an Accuracy-Processing Time Tradeoff in Video Surveillance by Kapoor, Amato, Srivastava and Schrater (Mark Kelley)
4/2 A POMDP Approach to Optimizing P300 Speller BCI Paradigm by Park and Kim (Dan Shea)

4/7 Supporting Search and Rescue Operations with UAVs by Waharte and Trigoni (Eliza Hunt-Hawkins)

4/9 Inverse Reinforcement Learning in Partially Observable Environments by Choi and Kim
4/14 Planning for Decentralized Control of Multiple Robots Under Uncertainty by Amato, Konidaris, Cruz, Maynor, How and Kaelbling
Project status reports
4/16 Point-Based POMDP Solving with Factored Value Function Approximation by Veiga, Spaan and Lima
4/21 Decentralized cognitive MAC for opportunistic spectrum access in ad hoc networks: A POMDP framework by Zhao, Tong, Swami and Chen

4/23 An MDP-Based Recommender System by Shani, Brafman and Heckerman
4/28 Project presentations (Mark and Dan)

4/30 Project presentations (Luke and Eliza)

AAAI format
Final paper due