Iolaus

Securing online content rating systems against Sybil attacks and rate-buying

Motivation

Content sharing sites allow users to find and share content.

Examples: news articles (Digg), videos (YouTube), URLs.

The system can be gamed to manipulate content ratings:

    Sybil attacks: vote multiple times with multiple accounts.

    Buying votes: obtain votes from legitimate users by offering compensation.

Accounts are often unverified and free to create; signing up usually requires only an email address and a CAPTCHA. Multiple identities controlled by one user are referred to as Sybils [IPTPS’02].

Attackers can make fraudulent content appear highly rated, or make legitimate content appear poorly rated.

Sybil attacks observed on real-world sites

Digg [NSDI’09]

TripAdvisor [NYT, 08/20/2011]

Labor markets [USENIX SECURITY’11]

Goal and assumptions

Goal: Create a system that bounds the effect of creating Sybils or buying ratings

Key idea: Leverage the underlying social network and use relative ratings

A social network often already exists on these sites

Design

Our approach is to assign weights to votes and use relative ratings; not all votes are counted equally!


Goal: Assign weights so that a user's aggregate weight does not depend on the number of identities they possess.


Naturally mitigates the effect of Sybils.

The challenge is choosing a weight assignment algorithm.


We use flow over the social network to assign weights

  1.     Every voter is a source; each link has unit capacity

  2.     User asking for rating is the sink (collector)

 


Model the problem as a multi-commodity max flow problem

  1. Users “compete” to push flow to the vote collector; the flow each user pushes determines their vote weight.

  2. User influence depends only on the number of real links; Sybils don’t help.
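The effect of unit-capacity links can be illustrated with a small sketch. This is a simplification, not the multi-commodity formulation described above: it attaches all voters to one hypothetical super source and computes an ordinary single-commodity max flow to the collector. The function names, the `SUPER_SOURCE` node, and the example graph are all assumptions made for illustration.

```python
from collections import defaultdict, deque

SUPER_SOURCE = "_super_source"  # hypothetical helper node, not a real user


def max_flow(arcs, source, sink):
    """Edmonds-Karp max flow over a list of (u, v, capacity) directed arcs."""
    cap = defaultdict(lambda: defaultdict(int))
    for u, v, c in arcs:
        cap[u][v] += c
        cap[v][u] += 0  # make sure the residual (reverse) arc exists
    total = 0
    while True:
        # Breadth-first search for a shortest augmenting path.
        parent = {source: None}
        queue = deque([source])
        while queue and sink not in parent:
            u = queue.popleft()
            for v, c in cap[u].items():
                if c > 0 and v not in parent:
                    parent[v] = u
                    queue.append(v)
        if sink not in parent:
            return total  # no augmenting path left
        # Find the bottleneck capacity along the path, then push flow.
        bottleneck = float("inf")
        v = sink
        while parent[v] is not None:
            bottleneck = min(bottleneck, cap[parent[v]][v])
            v = parent[v]
        v = sink
        while parent[v] is not None:
            cap[parent[v]][v] -= bottleneck
            cap[v][parent[v]] += bottleneck
            v = parent[v]
        total += bottleneck


def total_vote_weight(links, voters, collector):
    """Total flow the given voters can push to the collector.

    Each undirected social link becomes two unit-capacity arcs, and each
    voter may inject at most one unit (one vote).
    """
    arcs = []
    for a, b in links:
        arcs.append((a, b, 1))
        arcs.append((b, a, 1))
    for voter in voters:
        arcs.append((SUPER_SOURCE, voter, 1))
    return max_flow(arcs, SUPER_SOURCE, collector)


# Honest users a, b, c; attacker m has ONE real link (to a) and three Sybils.
links = [("c", "a"), ("c", "b"), ("a", "b"), ("m", "a"),
         ("m", "s1"), ("m", "s2"), ("m", "s3"), ("s1", "s2"), ("s2", "s3")]

honest = total_vote_weight(links, ["a", "b"], "c")
attacker_alone = total_vote_weight(links, ["m"], "c")
attacker_with_sybils = total_vote_weight(links, ["m", "s1", "s2", "s3"], "c")
print(honest, attacker_alone, attacker_with_sybils)
```

In this toy graph the two honest voters push 2 units in total, while the attacker pushes 1 unit whether voting alone or together with three Sybils: every Sybil vote must cross the single real link between m and a, which has unit capacity.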

Relative rating

Defend against vote-buying attacks

Otherwise-legitimate users are paid a small compensation in exchange for a rating.


Convert raw rating to relative rating

Measure each item against all other items the user rated; the relative rating ranges between 0 and 1.


Most users provide few ratings

For these users, relative ratings don’t provide much benefit.
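One plausible way to turn raw ratings into relative ratings is a mid-rank percentile over each user's own ratings. The sketch below is an assumption made for illustration, not necessarily the exact transformation used by Iolaus; the function name `relative_ratings` and the sample data are hypothetical.

```python
def relative_ratings(raw):
    """Map one user's {item: raw rating} to {item: relative rating}.

    Assumed definition (mid-rank percentile): the share of the user's
    ratings strictly below this one, plus half the share that are equal
    to it (counting the item itself). Results lie strictly between 0 and 1.
    """
    values = list(raw.values())
    n = len(values)
    result = {}
    for item, rating in raw.items():
        lower = sum(1 for x in values if x < rating)
        equal = sum(1 for x in values if x == rating)  # includes the item itself
        result[item] = (lower + 0.5 * equal) / n
    return result


# A user who gives almost everything 5 stars: each 5 carries less weight.
generous = relative_ratings({"A": 5, "B": 5, "C": 5, "D": 2})

# A user with a single rating: the relative rating carries no information.
single = relative_ratings({"X": 5})

print(generous, single)
```

Under this definition, a user who rates nearly everything 5 stars gets 0.625 for each 5-star item rather than the maximum, so a purchased top rating counts for less. A user with only one rating always gets 0.5, whatever the raw value was, which matches the observation that relative ratings add little for users who rate few items.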

Related work

DSybil [Oakland’09] finds trusted guides (users who have a similar voting history)

  1.     Assumes all users provide enough feedback to find guides

  2.     Many users don't vote or give feedback in practice


SumUp [NSDI’09] uses the social network; it inspired our design

  1.     Defines a trusted "envelope", where all votes are counted

  2.     Nodes outside must compete

All content sharing sites have these basic mechanisms:

  1. Creating accounts

  2. Declaring friendships

  3. Uploading and rating content (or voting)

  4. Locating content via aggregated votes

Assumptions:

  1.     Social links to honest users take effort to form and maintain

  2.     Malicious user cannot obtain arbitrary links to honest users

  3.     Most users provide few ratings

Arash Molavi Kakhki, Chloe Kliman-Silver, and Alan Mislove

To appear in Proceedings of the 22nd International World Wide Web Conference (WWW'13), Rio de Janeiro, Brazil, May 2013.

[PDF] [BibTex]


Data Sets

Data from our WWW 2013 paper is available in anonymized form to other researchers.

Email me for a link to the data.


This project was funded in part by a Google Faculty Research Award.