CS 6240: Parallel Data Processing in MapReduce

This course covers techniques for analyzing very large data sets. We introduce the MapReduce programming model and the core technologies it relies on in practice, such as a distributed file system. We also introduce related approaches and technologies from distributed databases and cloud computing. The course emphasizes practical examples and hands-on programming experience. We will use both plain MapReduce and database-inspired advanced programming models that run on top of a MapReduce infrastructure.


News

We will use Piazza for online course discussions. Please sign up at piazza.com/northeastern/fall2016/cs6240.


This course is managed through Blackboard (nuonline.neu.edu). If you are registered, go there to see all course materials. If you are not registered but want to get an idea of the course, please take a look at the syllabus.