Teaching and Advising Experience
Courses Taught
Northeastern U. CS 6240: Parallel Data Processing in MapReduce
- F11, F12, S13
- Graduate course. This course covers techniques for
analyzing very large data sets. We introduce the MapReduce programming model
and the core technologies it relies on in practice, such as a distributed
file system. Related approaches and technologies from distributed databases
and cloud computing are also introduced. Particular emphasis is placed
on practical examples and hands-on programming experience. Both plain
MapReduce and database-inspired advanced programming models running on top
of a MapReduce infrastructure are used.
Northeastern U. CS 6220: Data Mining Techniques
- F09, S10, S11, S12
- Graduate course. This course covers various aspects of data mining, including data
preprocessing, classification, ensemble methods, association rule mining, sequence
mining, and cluster analysis. The class project involves hands-on practice
of mining useful knowledge from a large database.
Northeastern U. CS 3200: Database Design
- F10, S12, F12
- Upper-division undergraduate course. This course studies the design of relational databases, including the
entity-relationship model, normalization, relational algebra, SQL, triggers,
stored procedures, indexing, elementary query optimization, and fundamentals
of concurrency and recovery. The class project involves working with a
commercial relational database management system and accessing it from an
application.
Northeastern U. CSG 339: Scalable Techniques for Massive Data (Spring 2009)
- Graduate course. We discuss influential and cutting-edge research papers from academia
and industry research groups. The course also requires a project, for which
students choose a research topic related to large-scale data
analysis.
Cornell U. CS/INFO 330: Data-Driven Web Applications (Fall 2007)
- Upper-division undergraduate course. CS/INFO 330 is taken by third- and fourth-year undergraduate students. It
is offered jointly by the Computer Science department and the Information
Science Program. This course introduces students to modern database systems
and three-tier application development with a focus on building web-based
applications using database systems. Concepts covered include the relational
model and query languages, data modeling, normalization, three-tier
architectures, Internet data formats and query languages, server- and
client-side technologies, and an introduction to web services. Students also
build a database-backed website with Java EE technology.
Graduate Student Advising
Current Students
- Yue Huang (Ph.D. student)
- Alper Okcan (Ph.D. student)
- Bahar Qarabaqi (Ph.D. student)
- Baturalp Torun (MS student)
Previous Students (co-advised with faculty at Cornell)
- Biswanath Panda (Ph.D. 2009, first employment: Google)
- Mingsheng Hong (Ph.D. 2008, first employment: Vertica)
- Daria Sorokina (Ph.D. 2008, first employment: PostDoc at CMU)
- Abhinandan Das (Ph.D. 2005, first employment: Google)
- Tulika Chandra (MEng 2004, first employment: Siemens)