Mirek Riedewald

Associate Professor
Northeastern University

College of Computer and Information Science, 202 West Village H
360 Huntington Avenue
Boston, MA 02115

Phone: +1-617-373-4766, fax (dept.): +1-617-373-5121

2002 Ph.D. (UC Santa Barbara)
2002-2008 Research Associate (Cornell University)
Since 2009 Associate Professor (Northeastern University)


Research

Big Data and database systems, with an emphasis on large-scale distributed data analysis, and on data management and data mining for the sciences.

Publications

Research Group Meetings

Current Projects

Scolopax: Making Analysis of Scientific Data Fast and Easy

I have been collaborating with scientists from various disciplines since 1999. While specific challenges vary, there is always a common theme: scientists are collecting and generating rapidly growing amounts of data. In this new world of data-driven science, groundbreaking discoveries depend on the ability to efficiently analyze and process these massive amounts of data. To let scientists do science, rather than forcing them to become experts on parallel algorithms, data mining, and databases, we are developing Scolopax. Scolopax is a tool for scientific discovery. It will support a user-friendly interface for declaratively specifying discovery goals. All data processing will then be optimized automatically for fast and efficient execution on multiple processors, relying on novel data management techniques.

Merlin: Interactive Category Identification

Consider a citizen scientist or casual observer who spots an interesting bird. Later at home, she wants to know the species of this bird. Despite the availability of excellent bird guides, this often becomes a tedious process. Traditional classification techniques are not effective here, in part because they must cope with wrong and uncertain user inputs. Similar problems occur in many other contexts. We are developing novel interactive category identification techniques whose goal is to minimize user effort. Merlin is part of a major inter-institutional collaboration led by the Cornell Lab of Ornithology. The overall goal is to build a social networking site that connects citizen scientists, bird experts, and ecology researchers. Users can contribute data, explore birds, interact with others to learn more about ecology, and play online "games with a purpose". This system will broaden interest in (citizen) science and contribute to science education. (Recently started. More information coming soon.)

Selected Past Projects

Cayuga: A Scalable System for Data Stream Processing

Additive Groves Prediction Technique and Automatic Interaction Detection


Teaching and Advising


Selected Professional Activities

Recent Talks

Interactive Search Queries for Online Communities (UC Berkeley, April 2012)
Scolopax: Supporting Exploratory Analysis of Scientific Data (University of Wisconsin, Madison, March 2012)
Scolopax: Supporting Exploratory Analysis of Scientific Data (MIT, February 2012)
Scolopax: Supporting Exploratory Analysis of Scientific Data (Brown University, February 2012)
Scolopax: Supporting Exploratory Analysis of Scientific Data (UPenn, Philadelphia, November 2011)
Near-Optimal Parallel Join Processing in MapReduce (Yahoo! Research; Google; IBM Almaden Research Lab; May 2011)

Professional Service (most recent)

Program Committee area vice chair for data warehousing, statistics, aggregate processing for the 2014 IEEE Int. Conf. on Data Engineering (ICDE)
Program Co-Chair for the 2014 International Workshop on Exploratory Search in Databases and the Web (ExploreDB)
Program Committee Chair for the 2013 New England Database Summit (NEDSummit)
Co-Chair of the program committee of the demo track for the 2012 IEEE Int. Conf. on Data Engineering (ICDE)

Member of the Editorial Advisory Board, Information Systems, Elsevier

2014 ACM SIGMOD Int. Conf. on Management of Data, Program Committee
2013 ACM SIGMOD Int. Conf. on Management of Data, Demo track, Program Committee
2013 Int. Conf. on Extending Database Technology (EDBT)
2012 ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining
2012 ACM SIGMOD Int. Conf. on Management of Data, Demo track, Program Committee
2012 Int. Conf. on Extending Database Technology (EDBT), Program Committee
2012 Int. Conf. on Extending Database Technology (EDBT), Data Analytics in the Cloud Workshop, Program Committee
2012 IEEE Int. Conf. on Distributed Computing Systems (ICDCS)

2012 ACM Symposium on Applied Computing (SAC), "Mobile Computing and Applications" track, Program Committee
2012 Int. Conf. on Data Warehousing and Knowledge Discovery (DaWaK), Program Committee
2012 ACM Int. Workshop on Data Warehousing and OLAP (DOLAP)
2012 Int. Symp. on Methodologies for Intelligent Systems (ISMIS), Warehousing and OLAPing Complex, Spatial and Spatio-Temporal Data track

Program committee membership before 2012:

ACM SIGMOD Int. Conf. on Management of Data: 2004, 2009, 2010, 2011 (demo)
Int. Conf. on Very Large Data Bases (VLDB): 2007
IEEE Int. Conf. on Data Engineering (ICDE): 2006, 2007, 2008, 2009, 2010, 2011
ACM SIGKDD Int. Conf. on Knowledge Discovery and Data Mining: 2004
Int. Conf. on Machine Learning (ICML): 2003

ACM Conf. on Information and Knowledge Management (CIKM): 2005, 2006, 2008
Int. Symp. on Temporal Representation and Reasoning (TIME): 2008
Summit of the New England Database Society (NEDSummit): 2011
IEEE Int. Conf. on Intelligence and Security Informatics (ISI) (formerly NSF/NIJ Symp. on Intelligence and Security Informatics (ISI)): 2003, 2004, 2005, 2006, 2007, 2008, 2009, 2010, 2011

ACM Symposium on Applied Computing (SAC): 2011 (Mobile Computing and Applications)
Int. Conf. on Data Warehousing and Knowledge Discovery (DaWaK): 2005, 2006, 2007, 2008, 2011

ACM Int. Workshop on Data Warehousing and OLAP (DOLAP): 2005, 2006, 2007, 2008, 2011
Int. Conf. on Complex, Intelligent and Software Intensive Systems (CISIS): 2010, 2011
East-European Conf. on Advances in Database and Information Systems (ADBIS): 2010

IEEE Int. Conf. on Computational Science and Engineering: 2008
AAAI Nectar (New sCientific and Technical Advances in Research): 2007
Int. Workshop on Mining Multimedia Streams in Large-Scale Distributed Environments (MMSDE): 2008
Int. Conf. on Geosensor Networks (GSN): 2006, 2009
SIGMOD Ph.D. Workshop on Innovative Database Research (IDAR): 2008
Int. Workshop on Scalable Stream Processing Systems (SSPS): 2007, 2008
Int. Conf. on Asian Digital Libraries (ICADL): 2003

Reviewer for leading research journals: ACM Transactions on Database Systems (TODS), ACM Transactions on Information Systems (TOIS), VLDB Journal, IEEE Transactions on Knowledge and Data Engineering (TKDE), IEEE Transactions on Multimedia, IEEE Computer, Data and Knowledge Engineering (DKE), International Journal of Business Intelligence and Data Mining (IJBIDM), Information Systems, Information Processing Letters (IPL), and others