Seminar 
Sep 21, 2016 12:30PM 
Daniel Wichs and Jack Doerner 366 WVH 
Sep 28, 2016 12:30PM 
Yaron Singer, Harvard 366 WVH (Abstract)
In this talk we will consider the problem of maximizing a monotone submodular function under noise. There has been a great deal of work on optimizing submodular functions under various constraints, with many algorithms that provide desirable approximation guarantees. In many applications, however, we do not have access to the function we aim to optimize, but rather to some erroneous or noisy version of it. This raises the question of whether provable guarantees are obtainable in the presence of error and noise. We provide initial answers by focusing on the question of maximizing a monotone submodular function under a cardinality constraint, given access only to a noisy oracle for the function.
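The classical noise-free baseline here is the greedy algorithm of Nemhauser, Wolsey, and Fisher, which achieves a (1 - 1/e)-approximation under a cardinality constraint. A minimal sketch (the coverage function and element names below are illustrative, not from the talk):

```python
def greedy_submodular(universe, f, k):
    """Classical greedy for maximizing a monotone submodular f under a
    cardinality constraint |S| <= k (exact, noise-free oracle access).
    Achieves a (1 - 1/e)-approximation."""
    S = set()
    for _ in range(k):
        # add the element with the largest marginal gain f(S + e) - f(S)
        best = max((e for e in universe if e not in S),
                   key=lambda e: f(S | {e}) - f(S))
        S.add(best)
    return S

# toy example: a coverage function, which is monotone submodular
sets = {1: {"a", "b"}, 2: {"b", "c"}, 3: {"d"}}
f = lambda S: len(set().union(*(sets[i] for i in S))) if S else 0
chosen = greedy_submodular(set(sets), f, 2)
```

The noisy-oracle setting of the talk is precisely the one where this greedy argument breaks down, since a single erroneous marginal-gain comparison can derail the whole run.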

Oct 05, 2016 12:15PM 
Huy Nguyen, Northeastern 366 WVH (Abstract) In the turnstile L_p heavy hitters problem with parameter eps, one must maintain a high-dimensional vector x in R^n subject to updates of the form update(i, Delta), which increases x_i by Delta, where i is in [n] and Delta is an integer. Upon receiving a query, the goal is to report every ``heavy hitter'' i in [n] with x_i >= eps*||x||_p as part of a list L, a subset of [n] of size O(1/eps^p), i.e. proportional to the maximum possible number of heavy hitters. In fact we solve the stronger tail version, in which L should include every i such that x_i >= eps*||x_{tail(1/eps^p)}||_p and x_i > 0, where x_{tail(k)} denotes the vector obtained by zeroing out the largest k entries of x in magnitude. We provide a new algorithm which, in the most general turnstile model, achieves optimal O(eps^{-p} log n) space, O(log n) update time, and fast O(eps^{-p} poly(log n)) query time, with correctness holding with high probability. Our main innovation is an efficient reduction from the heavy hitters problem to a clustering problem, in which each heavy hitter is encoded as some form of noisy spectral cluster in a much bigger graph and the goal is to identify every cluster. Since every heavy hitter must be found, correctness requires that every cluster be found. We thus need a ``cluster-preserving clustering'' algorithm that partitions the graph into clusters with the promise of not destroying any original cluster. To do this, we first apply standard spectral graph partitioning, and then use some novel combinatorial techniques to modify the cuts obtained so as to make sure that the original clusters are sufficiently preserved. Our cluster-preserving clustering may be of broader interest well beyond heavy hitters. Joint work with Kasper Green Larsen, Jelani Nelson, and Mikkel Thorup.
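For intuition only, here is a much simpler and weaker turnstile sketch, the Count-Min sketch of Cormode and Muthukrishnan, which supports update(i, Delta) and recovers L1 heavy hitters; it is not the optimal L_p algorithm of the talk, just an illustration of the sketching model (parameters are arbitrary):

```python
import random

class CountMin:
    """Simplified Count-Min sketch for turnstile updates. For non-negative
    frequencies, each row overestimates x_i, so taking the min over rows
    gives an estimate within (total mass)/width with constant probability
    per row, amplified by the depth."""
    def __init__(self, width, depth, seed=0):
        rng = random.Random(seed)
        self.width, self.depth = width, depth
        self.salts = [rng.randrange(1 << 30) for _ in range(depth)]
        self.table = [[0] * width for _ in range(depth)]

    def _bucket(self, row, i):
        # deterministic per-row hash of item i (int keys are not randomized)
        return hash((self.salts[row], i)) % self.width

    def update(self, i, delta):
        for r in range(self.depth):
            self.table[r][self._bucket(r, i)] += delta

    def estimate(self, i):
        return min(self.table[r][self._bucket(r, i)] for r in range(self.depth))
```

A query for heavy hitters would scan candidate items and keep those whose estimate exceeds the threshold; the point of the talk's algorithm is to avoid exactly that kind of slow scan while also handling general L_p.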
Oct 12, 2016 12:30PM 
Mohsen Ghaffari, MIT/ETH Zurich 366 WVH (Abstract) How can the computers in a network interact and communicate to solve the network's graph problems efficiently? This is the subject of the area of (local) distributed graph algorithms, which dates back to the 1980s. The recent years have witnessed significant developments in this classical area. In this talk, we survey some of these developments. Noting the gap between randomized and deterministic distributed graph algorithms, the talk has three (closely related) parts: 1) On the randomized side, we present the first nearly-optimal randomized distributed algorithm for maximal independent set. Using well-known reductions, this provides a unified efficient algorithm for all the classical problems of the area. 2) On the deterministic side, we present an improved deterministic distributed algorithm for edge coloring, which almost settles a quarter-century-old open problem. 3) We conclude with mentioning a surprising result that pinpoints the gap between randomized and deterministic distributed algorithms: we present a rudimentary-looking rounding problem, which admits a trivial 0-round randomized algorithm, and prove that if one can solve it deterministically in polylog(n) rounds, we would get polylog(n)-round deterministic distributed algorithms for all the local problems. The latter remains the most well-known open question of the area.
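The classical point of comparison for randomized MIS is Luby's algorithm, which terminates in O(log n) rounds with high probability. A minimal centralized simulation of its round structure (this is the textbook scheme, not the nearly-optimal algorithm of the talk):

```python
import random

def luby_mis(adj, seed=1):
    """Luby-style randomized MIS on a graph given as {vertex: set-of-neighbors}.
    Each round, every live vertex draws a random priority; vertices that are
    local minima among live neighbors join the MIS, and they and their
    neighbors are removed. Runs O(log n) rounds with high probability."""
    rng = random.Random(seed)
    live = set(adj)
    mis = set()
    while live:
        prio = {v: rng.random() for v in live}
        winners = {v for v in live
                   if all(prio[v] < prio[u] for u in adj[v] if u in live)}
        mis |= winners
        removed = set(winners)
        for v in winners:
            removed |= {u for u in adj[v] if u in live}
        live -= removed
    return mis
```

In the distributed setting each round is implemented with one exchange of priorities between neighbors, which is why the round count is the cost measure.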
Oct 19, 2016 12:30PM 
Mor Weiss, Northeastern 366 WVH (Abstract) In the past few decades, probabilistic checking techniques were shown to yield dramatic efficiency improvements in verifying proofs and approximating the distance from error-correcting codes. We show connections between cryptography and "zero-knowledge" variants of these techniques, such as Probabilistically Checkable Proofs (PCPs) and PCPs of Proximity (PCPPs), and use them to improve the complexity of cryptographic protocols. In this talk we will discuss two constructions: 1. PCPs for NP with zero-knowledge guarantees that can be verified non-adaptively, namely by making a single round of queries to the proof. Our construction is based on a novel connection to leakage-resilient circuits. 2. Zero-knowledge PCPPs for NP, a generalization of zero-knowledge PCPs, which we construct by combining standard PCPPs with secure multi-party computation protocols. Based on joint works with Yuval Ishai, Amit Sahai, Michael Viderman and Guang Yang.
Jun 16, 2016 3:30PM 
Ryan Rogers, University of Pennsylvania 366 WVH (Abstract)
In this work, we initiate a principled study of how the generalization properties of approximate differential privacy can be used to perform adaptive hypothesis testing while giving statistically valid p-value corrections. We do this by observing that the guarantees of algorithms with bounded approximate max-information are sufficient to correct the p-values of adaptively chosen hypotheses, and then by proving that algorithms that satisfy (approximate) (epsilon, delta)-differential privacy have bounded approximate max-information when their inputs are drawn from a product distribution. This substantially extends the known connection between differential privacy and max-information, which previously was only known to hold for (pure) (epsilon, 0)-differential privacy. It also extends our understanding of max-information as a partially unifying measure controlling the generalization properties of adaptive data analyses. We also show a lower bound, proving that (despite the strong composition properties of max-information), when data is drawn from a product distribution, approximately differentially private algorithms can come first in a composition with other algorithms satisfying max-information bounds, but not necessarily second if the composition is required to itself satisfy a non-trivial max-information bound. This, in particular, implies that the connection between approximate differential privacy and max-information holds only for inputs drawn from product distributions, unlike the connection between pure differential privacy and max-information.

Apr 14, 2016 2:50PM 
Gabor Lippner, Northeastern University Dept. of Mathematics 366 WVH (Abstract)
A local algorithm aims to compute some global structure on a graph using only simultaneous, local decisions at the nodes. I will review recent developments on the possibilities and the limitations of such algorithms, and explain their connection to the theory of graph limits.

Apr 7, 2016 1:30PM 
Huy Nguyen, TTI Chicago 366 WVH (Abstract)
Challenges abound in modern large-scale data analysis, ranging from the sheer volume of the data to its complex structure and the need for timely responses. A promising common approach centers on capturing key information from the data using concise representations that can be constructed in a distributed manner or from an evolving stream of data. In this talk, we will illustrate both the power and limitations of this approach using three research vignettes. In the first example, we describe an algorithmic framework for fundamental linear algebra problems based on short summaries. In the second example, we design distributed algorithms for a family of non-convex problems arising in learning applications. In the third example, we show that even basic statistical estimation tasks require large summaries.

Apr 4, 2016 10:30AM 
Matt Weinberg, Princeton 366 WVH (Abstract) When real people interact with algorithms (e.g. in auctions, crowdsourcing, Bitcoin, etc.), they impose additional desiderata beyond simply that the algorithm is correct, fast, and uses little storage. People strategize during these interactions, so algorithms deployed in these settings must be robust against strategic manipulation. Additionally, people prefer transparent interactions, so these algorithms must also be as simple as possible. My research addresses these and other novel challenges that arise when algorithms interact with strategic agents. In this talk, I will focus on robustness against strategic manipulation, and present a new algorithmic framework for these settings. Specifically, I will present a black-box reduction from solving any optimization problem in strategic settings, where the input is held by selfish agents with interests of their own, to solving a perturbed version of that same optimization problem in traditional settings, where the input is directly given. I will further apply this framework to resolve two long-standing open problems: extending Myerson's celebrated characterization of optimal single-item auctions to multiple items (Myerson 1981), and designing truthful mechanisms for job scheduling on unrelated machines (Nisan and Ronen 1999). Finally, I will briefly show how strategic considerations motivate nice questions in "traditional" areas of algorithm design as well, and present some of my work in convex optimization, parallel algorithms, and prophet inequalities.
Mar 23, 2016 10:30AM 
Sofya Raskhodnikova, Penn State 366 WVH (Abstract) Massive datasets are becoming increasingly common. What useful computations can be performed on a dataset when reading all of it is prohibitively expensive? This question, fundamental to several fields, is at the heart of the research area called sublinear-time algorithms, which has provided important insights into fast approximate computation. In this talk, we will consider types of computational tasks central to sublinear-time algorithms: approximation, testing, and learning. We will see examples of sublinear-time algorithms in several domains. The algorithms themselves are typically simple and efficient, but their analysis requires insights into basic combinatorial, algebraic, and geometric questions. We will also discuss new directions in sublinear-time algorithms, including new computational tasks, new measures for accuracy guarantees, and new models for data access. These directions enable applications of sublinear-time algorithms in privacy, analysis of real-valued data, and situations where the data is noisy or incomplete.
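A toy example of the sublinear-time flavor of approximation: estimating the fraction of 1s in a huge bit array by reading only O(1/err^2) random positions instead of all n (the array and sample size below are illustrative, not from the talk):

```python
import random

def estimate_fraction(read, n, samples, seed=0):
    """Estimate the fraction of 1s in a length-n bit array by querying
    `samples` uniformly random positions via read(i). By a Chernoff bound
    the additive error is O(1/sqrt(samples)) with high probability, so the
    query count is independent of n."""
    rng = random.Random(seed)
    hits = sum(read(rng.randrange(n)) for _ in range(samples))
    return hits / samples

data = [1] * 300 + [0] * 700          # true fraction is 0.3
est = estimate_fraction(lambda i: data[i], len(data), 2000)
```

The point is that the estimator never looks at most of the input, which is the defining feature of the sublinear-time model.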
Mar 16, 2016 10:30AM 
Mohit Singh, Microsoft Research Redmond 366 WVH (Abstract) In this talk I will discuss the problem of computing maximum entropy distributions over a discrete set of objects subject to observed marginals. While there has been a tremendous amount of interest in such distributions due to their applicability in areas such as statistical physics, economics, information theory, machine learning, combinatorics and algorithms, a rigorous and systematic study of how to compute such distributions had been lacking. In this talk, I will show that the problem of computing maximum entropy distributions, in a fairly general setting, is equivalent to a counting problem on the underlying set. One of the consequences of this equivalence is that, for several combinatorial settings, there is an efficient algorithm to compute max-entropy distributions. I will also talk about applications of maximum entropy distributions to the design of approximation algorithms for discrete optimization problems, including the Traveling Salesman Problem (TSP). These results include the first improvements over the classical Christofides algorithm from the 1970s for TSP on graph metrics.
Mar 15, 2016 4:00PM 
Nils Olberg, Aachen 366 WVH (Abstract) In the concurrent multicast problem, we are given a graph representing a communication network together with a set of sender-receiver pairs. The goal is to deliver the message of each sender to the corresponding receiver. The concurrent multicast problem and its variants are motivated by information dissemination problems arising in diverse network scenarios. In this talk, we consider two of the most prominent communication models, namely the telephone and the radio model. For each of these models, our aim is to develop algorithms with good approximation guarantees, since the underlying problems are known to be NP-hard. Using sparse partition covers we provide a first non-trivial bound for radio concurrent multicast. We further demonstrate the connection of telephone concurrent multicast to other network design problems, which motivates the development of approximation algorithms for the latter. Finally, we show that our methods can be extended to yield new results for minimum-degree graph spanners.
Mar 14, 2016 10:30AM 
Vipul Goyal, Microsoft Research India 366 WVH (Abstract) A central challenge in the design of secure systems is to defend against man-in-the-middle attacks, where an adversary can arbitrarily tamper with the messages exchanged by two parties over a communication channel. Starting with the early nineties, an important research goal in cryptography has been to build ``non-malleable'' cryptographic protocols that are resilient to such attacks. In this talk, I will describe my work that culminates this two-decade-long research quest by constructing round-optimal non-malleable protocols based on almost minimal cryptographic assumptions. I will also discuss how the techniques developed in these works have transcended cryptography and found applications in randomness extraction, coding theory, and complexity theory. I will also briefly talk about my work on applied cryptography and its impact.
Feb 29, 2016 10:30AM 
Timothy Chan, University of Waterloo 366 WVH (Abstract) Although data structures in computational geometry have been studied since the 1970s, recent years have seen exciting new developments that deepen our theoretical understanding of some of the most fundamental problems in the area. In this talk, I will survey work on a number of these core problems, including (static and dynamic) 2D point location, 2D nearest neighbor searching, and orthogonal and non-orthogonal range searching. If time permits, I will also mention some unexpected applications of geometric data structures, to text indexing and all-pairs shortest paths.
Feb 9, 2016 10:30AM 
Mary Wootters, Carnegie Mellon University 366 WVH (Abstract) Error-correcting codes are an important tool in communication, and the family of Reed-Solomon (RS) codes is an especially important example. I'll introduce error-correcting codes and Reed-Solomon codes, and I'll talk about three very different ways that RS codes have come up in my research. These three ways (none of which immediately has to do with communication) run the gamut from theory to practice. The first way, on the very theoretical end, is a combinatorial problem called list decoding, which has motivations in complexity theory. The second way, a bit more down-to-earth, has to do with designing schemes for distributed storage: how do we store a file on several servers robustly? The third way, on the practical side of things, is combinatorial group testing: in our work we collaborated with computational biologists to design better algorithms for high-throughput genetic screening and de novo gene sequencing.
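A minimal illustration of the textbook Reed-Solomon construction (the field size and evaluation points are chosen for simplicity; none of the three applications above is shown):

```python
def rs_encode(msg, n, p=257):
    """Encode a length-k message as a Reed-Solomon codeword of length n over
    GF(p): treat msg as the coefficients of a degree-<k polynomial and
    evaluate it at the points 0..n-1. Any k codeword symbols determine the
    polynomial, so up to n - k erasures can be corrected, which is the
    property exploited in distributed storage."""
    assert n <= p and len(msg) <= n
    def poly_eval(x):
        acc = 0
        for c in reversed(msg):        # Horner's rule mod p
            acc = (acc * x + c) % p
        return acc
    return [poly_eval(x) for x in range(n)]
```

For example, the message [1, 2] (the polynomial 1 + 2x) encoded to length 4 gives evaluations at x = 0, 1, 2, 3.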
Jan 21, 2016 3:30PM 
Zhiwei Steven Wu, University of Pennsylvania 366 WVH (Abstract) We initiate the study of a quantity that we call coordination complexity. In a distributed optimization problem, the information defining a problem instance is distributed among n parties, who each need to choose an action, which jointly will form a solution to the optimization problem. The coordination complexity represents the minimal amount of information that a centralized coordinator, who has full knowledge of the problem instance, needs to broadcast in order to coordinate the n parties to play a nearly optimal solution. We show that upper bounds on the coordination complexity of a problem imply the existence of good jointly differentially private algorithms for solving that problem, which in turn are known to upper bound the price of anarchy in certain games with dynamically changing populations. We show several results. We fully characterize the coordination complexity of computing a many-to-one matching in a bipartite graph by giving almost matching lower and upper bounds. Our upper bound in fact extends much more generally, to the problem of solving a linearly separable convex program. We also give a different upper bound technique, which we use to bound the coordination complexity of coordinating a Nash equilibrium in a routing game, and of computing a stable matching.
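For concreteness, the underlying matching problem being coordinated can be solved centrally by augmenting paths; a minimal sketch of the one-to-one special case (vertex names are illustrative, and the coordination mechanism itself is not shown):

```python
def max_bipartite_matching(adj):
    """Maximum bipartite matching via augmenting paths (the Hungarian-style
    primitive behind the many-to-one matchings in the talk). adj maps each
    left vertex to the list of right vertices it can be matched to.
    Returns a dict: right vertex -> matched left vertex."""
    match = {}

    def try_assign(u, seen):
        # try to match u, possibly re-routing previously matched vertices
        for v in adj[u]:
            if v in seen:
                continue
            seen.add(v)
            if v not in match or try_assign(match[v], seen):
                match[v] = u
                return True
        return False

    for u in adj:
        try_assign(u, set())
    return match
```

The coordination-complexity question is how few broadcast bits let the n left vertices reconstruct (approximately) such a matching on their own.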
Jan 6, 2016 12:00PM 
Chien-Chung Huang, Chalmers University of Technology 366 WVH (Abstract) We propose new exact and approximation algorithms for the weighted matroid intersection problem. Our exact algorithm is faster than previous algorithms when the largest weight is relatively small. Our approximation algorithm delivers a $(1-\epsilon)$-approximate solution with a running time significantly faster than most known exact algorithms. The core of our algorithms is a decomposition technique: we decompose an instance of the weighted matroid intersection problem into a set of instances of the unweighted matroid intersection problem. The computational advantage of this approach is that we can make use of fast unweighted matroid intersection algorithms as a black box for designing algorithms. Precisely speaking, we prove that we can solve the weighted matroid intersection problem via solving $W$ instances of the unweighted matroid intersection problem, where $W$ is the largest given weight. Furthermore, we can find a $(1-\epsilon)$-approximate solution via solving $O(\epsilon^{-1} \log r)$ instances of the unweighted matroid intersection problem, where $r$ is the smaller rank of the two given matroids.
Dec 8, 2015 3:30PM 
Elliot Anshelevich, RPI 366 WVH (Abstract)
We examine the quality of social choice mechanisms using a utilitarian view, in which all of the agents have costs for each of the possible alternatives. While these underlying costs determine what the optimal alternative is, they may be unknown to the social choice mechanism; instead the mechanism must decide on a good alternative based only on the ordinal preferences of the agents which are induced by the underlying costs. Due to its limited information, such a social choice mechanism cannot simply select the alternative that minimizes the total social cost (or minimizes some other objective function). Thus, we seek to bound the distortion: the worst-case ratio between the social cost of the alternative selected and the optimal alternative. Distortion measures how good a mechanism is at approximating the alternative with minimum social cost, while using only ordinal preference information. The underlying costs can be arbitrary, implicit, and unknown; our only assumption is that the agent costs form a metric space, which is a natural assumption in many settings. We quantify the distortion of many well-known social choice mechanisms. We show that for both total social cost and median agent cost, many positional scoring rules have large distortion, while on the other hand Copeland and similar mechanisms perform optimally or near-optimally, always obtaining a distortion of at most 5. We also give lower bounds on the distortion that could be obtained by *any* deterministic social choice mechanism, and extend our results on median agent cost to more general objective functions.
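A minimal implementation of the Copeland rule on ordinal ballots (the preference profile below is a made-up example, not data from the paper):

```python
from itertools import combinations

def copeland_winner(prefs):
    """Copeland rule: a candidate's score is (pairwise majority wins) minus
    (pairwise majority losses) over all head-to-head contests; the highest
    score wins. prefs is a list of ballots, each a ranking of all
    candidates, best first. Ties in score break arbitrarily."""
    cands = prefs[0]
    score = {c: 0 for c in cands}
    for a, b in combinations(cands, 2):
        a_wins = sum(r.index(a) < r.index(b) for r in prefs)
        if a_wins * 2 > len(prefs):        # a beats b head-to-head
            score[a] += 1; score[b] -= 1
        elif a_wins * 2 < len(prefs):      # b beats a head-to-head
            score[b] += 1; score[a] -= 1
    return max(cands, key=score.get)
```

Note that the rule consumes only the rankings, never the underlying metric costs, which is exactly the informational restriction the distortion bound of 5 is about.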

Nov 12, 2015 3:30PM 
Cris Moore, Santa Fe Institute 177 Huntington Ave, 11th Floor (Abstract) Lecture IV of a four-part series.
October 22, 3:30-5:00 PM, Lecture I: Computational complexity and landscapes
o Two puzzles: Euler vs. Hamilton
o P vs. NP and NP-completeness: building computers
o When greed works: Minimum Spanning Tree and Max Flow
o Bumpy energy landscapes and local optima
October 29, 3:30-5:00 PM, Lecture II: Phase transitions in random graphs
o The emergence and size of the giant component
o Branching processes and differential equations
o Power laws at criticality
o The k-core and discontinuous transitions
November 5, 3:30-5:00 PM, Lecture III: Phase transitions in random k-SAT
o Early history and phenomenology
o First and second moment bounds
o Algorithms, clustering, and frozen variables
o Why we believe in a regime where solutions exist but are hard to find
November 12, 3:30-5:00 PM, Lecture IV: Community detection in networks
o The stochastic block model
o The analogy between statistical physics and statistical inference
o Belief propagation and variational methods
o The detectability transition
o (If there's time) Spectral clustering
Nov 10, 2015 12:00PM 
Bobby Kleinberg 366 WVH (Abstract) How much information can be learnt about an individual by observing the outcome of a computation? If the computation is differentially private, the answer is: "not much more than if the individual's data had been excluded from the input." A stronger notion of privacy, originally propounded by Dalenius in the 1970s, requires instead that it should not be possible to learn much more about an individual than if the outcome of the computation had never been revealed. Simple examples, as well as a general impossibility theorem of Dwork and Naor, preclude the possibility of supplying this stronger "inferential" guarantee against attackers with arbitrary auxiliary information, assuming the computation is at least minimally useful. In this talk we revisit the notion of inferential privacy and ask: under what limitations on the adversary's side information can we deduce an inferential guarantee from a differential one? We model the adversary's side information as a prior distribution over datasets (or, more generally, a set of possible priors) and prove two main results. The first result pertains to distributions that satisfy a natural positive-affiliation condition, and gives an upper bound on the inferential privacy guarantee for any differentially private mechanism. This upper bound is matched by a simple mechanism that adds Laplace noise to the sum of the data. The second result pertains to distributions that have weak correlations, defined in terms of a suitable "influence matrix". The result provides an upper bound for inferential privacy in terms of the differential privacy parameter and the spectral norm of this matrix.
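The "simple mechanism that adds Laplace noise to the sum of the data" can be sketched as follows (inverse-CDF sampling; the assumption that the data are bits, giving the sum sensitivity 1, is ours for illustration):

```python
import math
import random

def laplace_sum(data, epsilon, seed=42):
    """Laplace mechanism for a sum query: release sum(data) plus noise drawn
    from Laplace(sensitivity / epsilon). For data in {0, 1}, changing one
    record changes the sum by at most 1, so this is epsilon-differentially
    private with scale b = 1/epsilon."""
    rng = random.Random(seed)
    b = 1.0 / epsilon                      # scale = sensitivity / epsilon
    u = rng.random() - 0.5                 # Uniform(-1/2, 1/2)
    # invert the Laplace CDF: X = -b * sign(u) * ln(1 - 2|u|)
    noise = -b * math.copysign(1.0, u) * math.log(1.0 - 2.0 * abs(u))
    return sum(data) + noise
```

A larger epsilon means less noise and a weaker privacy guarantee; the talk's first result says this mechanism is also tight for the inferential guarantee under positive affiliation.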
Nov 5, 2015 3:30PM 
Cris Moore, Santa Fe Institute 177 Huntington Ave, 11th Floor (Abstract) Lecture III: Phase transitions in random k-SAT, part of the four-lecture series; the full series outline appears under the Nov 12, 2015 entry above.
Oct 29, 2015 3:30PM 
Cris Moore, Santa Fe Institute 177 Huntington Ave, 11th Floor (Abstract) Lecture II: Phase transitions in random graphs, part of the four-lecture series; the full series outline appears under the Nov 12, 2015 entry above.
Oct 22, 2015 3:30PM 
Cris Moore, Santa Fe Institute 177 Huntington Ave, 11th Floor (Abstract) Lecture I: Computational complexity and landscapes, part of the four-lecture series; the full series outline appears under the Nov 12, 2015 entry above.