Our lives are increasingly influenced by algorithms, from online recommendations to banking decisions and even to predictive policing. While these systems are useful, most of them are "black boxes" with little visibility into how decisions are made, raising concerns of inadvertent discrimination, censorship, and disparate impact.
We are developing techniques to externally audit such systems, allowing policy makers and end users to better understand how input data is used and how it influences the output of these systems. We have successfully applied our techniques to study personalization in Google Search and on e-commerce web sites, the surge pricing algorithm of Uber, and algorithmic pricing on the Amazon Marketplace. Our work has received significant press coverage and is funded by the NSF and the Knight Foundation.
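As a rough illustration of what an external audit can measure, the sketch below compares the result lists returned to two accounts for the same query using the Jaccard index. This is a minimal sketch under stated assumptions, not a description of the exact method used in the studies above: the function name, the example URLs, and the choice of Jaccard overlap as the metric are all illustrative.

```python
def jaccard(results_a, results_b):
    """Jaccard index of two result lists, treated as sets of URLs.

    Returns 1.0 for identical sets and 0.0 for disjoint sets.
    """
    set_a, set_b = set(results_a), set(results_b)
    if not set_a and not set_b:
        return 1.0  # two empty result pages are trivially identical
    return len(set_a & set_b) / len(set_a | set_b)


# Hypothetical result lists for the same query from two accounts.
control = ["a.example", "b.example", "c.example", "d.example"]
treatment = ["a.example", "c.example", "e.example", "d.example"]

overlap = jaccard(control, treatment)
print(f"Jaccard overlap: {overlap:.2f}")
```

In practice, a difference between two accounts only suggests personalization if it exceeds the difference seen between two identically configured control accounts, since results can also vary because of noise such as load balancing or index updates.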
Cloud computing has evolved to meet user demands, from arbitrary VMs offered by IaaS to the narrow application interfaces of PaaS. Unfortunately, there exists an intermediate point that is not well met by today's offerings: users who wish to run arbitrary, already available binaries yet expect their applications to be long-lived but mostly idle.
We are exploring an alternative approach for cloud computation based on a process-like abstraction rather than a virtual machine abstraction, thereby gaining the scalability and efficiency of PaaS along with the generality of IaaS. We get the best of both worlds by enabling fast swapping of applications to and from cloud storage (since, by definition, applications are largely idle, we expect them to spend the majority of their time swapped out). Our work is in collaboration with Duke and UMD, and is generously funded by the NSF.
Every second, the thoughts and feelings of millions of people around the world are recorded on Twitter. Since Twitter's inception in 2006, we have been collecting a large sample of tweets; this data set is now over 65 billion tweets and counting.
We have used our large-scale Twitter data to uncover a number of interesting findings, including a study of the evolution of user behavior, the relationship between weather and mood, and how users behave across OSNs.
We have also used other OSN data to better understand the OSN ad market as well as how users are valued by OSN providers. Our work is funded by the NSF, the ARO, and Narus.
SSL and the PKI secure Internet transactions such as banking, e-mail, and e-commerce by providing trusted identities and private communication. Unfortunately, the PKI that is in use today is surprisingly brittle, and there have been numerous incidents where SSL certificates have been compromised or the PKI has been mismanaged.
We are trying to better understand the weaknesses of the current PKI, with the goal of developing tools and techniques to improve security for Internet users. We have explored how sites responded (or, often, failed to respond) to the Heartbleed bug and how browsers rarely bother to check the revocation status of certificates. Our work is in collaboration with researchers from Duke, UMD, and Stanford.
Multiple identity (Sybil) attacks pose a fundamental problem in distributed systems, affecting sites ranging from online social networks to content rating sites. These attacks allow malicious users to obtain more privileges than they would otherwise have, leading to problems such as follower fraud on Twitter and manipulated votes on Yelp.
We have been developing approaches to better understand and prevent Sybil attacks. We developed a technique for designing systems that sidesteps Sybil attacks, and demonstrated how to design communication systems, content rating systems, online marketplaces, and social networks in this style. We have also developed techniques that can automatically identify suspicious users and suspicious groups of users without needing to know their attack strategy. Our work has been funded by the NSF and Google.
Middleboxes are commonly deployed by ISPs to implement traffic policies such as shaping, proxying, and transcoding. While middleboxes may be used for network management purposes, they may also be applied opaquely to limit access to (or degrade) services which compete with those offered by the network provider.
We are developing tools and techniques to better understand when such middleboxes are deployed, what traffic they are affecting, and what the ISP policies are. We have presented an approach to detect traffic differentiation by mobile networks, and have released the Differentiation Detector app to allow users to test their own ISPs.
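One simple way to reason about differentiation is to compare throughput samples from a transfer that carries recognizable application traffic against samples from a control transfer that a middlebox should not be able to classify. The sketch below computes the two-sample Kolmogorov-Smirnov statistic between the two sets of samples. It is an illustrative assumption, not necessarily the test used by the app: the function name, the sample values, and the choice of statistic are hypothetical.

```python
import bisect


def ks_statistic(samples_a, samples_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the largest gap between the empirical CDFs of the two
    samples: 0.0 for identical distributions, 1.0 for fully separated ones.
    """
    if not samples_a or not samples_b:
        raise ValueError("both samples must be non-empty")
    a, b = sorted(samples_a), sorted(samples_b)
    largest_gap = 0.0
    # The gap between two step functions can only change at observed values.
    for point in sorted(set(a) | set(b)):
        cdf_a = bisect.bisect_right(a, point) / len(a)
        cdf_b = bisect.bisect_right(b, point) / len(b)
        largest_gap = max(largest_gap, abs(cdf_a - cdf_b))
    return largest_gap


# Hypothetical throughput samples in Mbit/s.
exposed = [1.1, 1.0, 1.2, 0.9]  # recognizable application traffic
control = [4.8, 5.1, 5.0, 4.9]  # same payload sizes, unclassifiable content

print(f"KS statistic: {ks_statistic(exposed, control):.2f}")
```

A large statistic alone is not proof of a policy. Repeating both transfers several times helps separate deliberate shaping from ordinary variation in network conditions.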