Who’s Got Your Digital Dossier?

When users’ personal information is shared with a website or mobile application, that data enters something of a gray market with many unknowns about where that information will end up or how it will be used. What’s more, individuals have few options for controlling what is shared with third parties, says Dave Choffnes, a mobile systems expert and assistant professor of computer and information science at Northeastern University. Choffnes is developing software that aims to provide data to researchers hoping to tackle this issue and provide some level of control to users. We asked him about the extent and implications of this third-party data sharing.

How is users’ data currently being shared online, and how much of that sharing are users aware of and able to control?

The short answer is that we don’t know how much sharing is going on. We know that information is collected by businesses, and we know that this information is being monetized through advertising. For example, when I search a travel site for a plane ticket to a destination, I see ads related to that destination on other websites. But surely this kind of sharing is only the tip of the iceberg.

In general, I think users—average or otherwise—are aware of frighteningly little. There are two problems: we lack transparency into what is being shared and have almost no control over how it is shared. Several tools attempt to improve transparency. One, Collusion, allows you to track the trackers for desktop Web browsing. Another, called Meddle, which my team developed, can do the same for mobile app traffic.

We don’t have great tools for controlling what is being shared. There are a number of initiatives to allow consumers to opt out of advertising (e.g., Do Not Track and NAI Opt Out). However, there is no enforcement of such preferences and advertiser participation is voluntary. The problem is one of incentives: the advertisers have little reason to stop tracking if there is no downside for them. Governmental policy changes can help, but we also need better tools to allow users to take control of how they are tracked. With Meddle, we are trying to do the latter.

What are the advantages and disadvantages of having our data shared in this manner?

Data sharing and advertising can generate revenue that allows popular apps and services to be “free” for users—a large fraction of services we enjoy today would not exist without such revenue streams. Of course, this is free as in dollars, but not free as in freedom, since the users relinquish control of potentially sensitive data. The disadvantage is that once data is shared, it is nearly impossible to “unshare” it and we do not yet understand the risks. For example, once shared, this private data may be breached by malicious parties. We have seen negative implications of private data being made public. For example, see this story in Wired.

What needs to be done to improve the state of privacy in this arena, and how much is the user’s responsibility to make that happen?

It seems clear that the current state of affairs is not optimal. As a user, I would like to see more control put into our own hands and I would like it to be free to do so. Realistically, I think users, companies, and policymakers will need to share the responsibility of protecting privacy. This entails revisiting current policies for transparency, control, and pricing when it comes to decisions regarding collecting and using data gathered from users.

I think a key issue is that no one will act to improve privacy until they are made aware of the potential (and actual) problems that stem from today’s data gathering. To address this, we need to find ways to make such information easily accessible to average users and policymakers. Collusion took a huge leap forward in this direction, and we are trying to build upon this success in Meddle.