Leveraging Social Networks to Defend against Sybil attacks
Krishna Gummadi
Networked Systems Research Group
Max Planck Institute for Software Systems
Germany
Sybil attack
• Fundamental problem in distributed systems
• Attacker creates many fake identities (Sybils)
– Used to manipulate the system
• Many online services vulnerable
– Webmail, social networks, p2p
• Several observed instances of Sybil attacks
– Ex. Content voting tampered on YouTube, Digg
Sybil defense approaches
• Tie identities to resources that are hard to forge or obtain
• RESOURCE 1: Certification from trusted authorities
– Ex. Passport, social security numbers
– Users tend to resist such techniques
• RESOURCE 2: Resource challenges (e.g., crypto-puzzles)
– Vulnerable to attackers with significant resources
– Ex. Botnets, renting cloud computing resources
• RESOURCE 3: Links in a social network?
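A minimal sketch of the resource-challenge idea (RESOURCE 2 above), assuming a hashcash-style puzzle: a new identity must find a nonce whose SHA-256 hash falls below a target. The function names, the 8-byte nonce encoding, and the difficulty value are illustrative assumptions, not taken from any scheme in this talk.

```python
import hashlib
from itertools import count


def _meets_target(challenge: bytes, nonce: int, difficulty_bits: int) -> bool:
    # The hash, read as a 256-bit integer, must have `difficulty_bits` leading zero bits.
    digest = hashlib.sha256(challenge + nonce.to_bytes(8, "big")).digest()
    return int.from_bytes(digest, "big") < (1 << (256 - difficulty_bits))


def solve_puzzle(challenge: bytes, difficulty_bits: int) -> int:
    """Brute-force search for a nonce; expected work is about 2**difficulty_bits hashes."""
    for nonce in count():
        if _meets_target(challenge, nonce, difficulty_bits):
            return nonce


def verify_puzzle(challenge: bytes, nonce: int, difficulty_bits: int) -> bool:
    """Verification costs a single hash, so the service can check cheaply."""
    return _meets_target(challenge, nonce, difficulty_bits)


# Hypothetical usage: one puzzle per requested identity.
nonce = solve_puzzle(b"new-identity-request", 12)
print(verify_puzzle(b"new-identity-request", nonce, 12))
```

Each extra difficulty bit doubles the expected work per identity, but an attacker with a botnet or rented cloud capacity can pay that cost many times over, which is the weakness noted above.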
Using social networks to detect Sybils
• Assumption: Links to good users hard to form and maintain
– Users mostly link to others they recognize
• Attacker can only create limited links to non-Sybil users
Leverage the topological feature introduced by sparse set of links
Social network-based Sybil detection
• Very active area of research
– Many schemes proposed over past five years
• Examples:
– SybilGuard [SIGCOMM’06]
– SybilLimit [Oakland S&P ’08]
– SybilInfer [NDSS’08]
– SumUp [NSDI’09]
– Whanau [NSDI’10]
– MOBID [INFOCOM’10]
But, many unanswered questions
• All schemes make two common assumptions
– Honest nodes: they are fast mixing
– Sybils: they do not mix quickly with honest nodes
• But, each uses a different graph analysis algorithm
– Unclear relationship between schemes
• Is there a common insight across the schemes?
– Is there a common structural property these schemes rely on?
• Such an insight is necessary to understand
– How well would these schemes work in practice?
– Are there any fundamental limitations of Sybil detection?
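The fast-mixing assumption can be made concrete with a small sketch: run a random walk from a start node and measure how far its distribution is from the stationary distribution (proportional to node degree) after a few steps. The toy graphs and function names below are illustrative assumptions.

```python
def complete_graph(nodes):
    """Adjacency lists for a clique over `nodes`."""
    nodes = list(nodes)
    return {u: [v for v in nodes if v != u] for u in nodes}


def two_cliques_with_one_link():
    """Two 4-node cliques joined by the single link 3-4."""
    adj = complete_graph(range(4))
    adj.update(complete_graph(range(4, 8)))
    adj[3].append(4)
    adj[4].append(3)
    return adj


def walk_distribution(adj, start, steps):
    """Exact distribution of a simple random walk after `steps` steps."""
    dist = {u: 0.0 for u in adj}
    dist[start] = 1.0
    for _ in range(steps):
        nxt = {u: 0.0 for u in adj}
        for u, p in dist.items():
            for v in adj[u]:
                nxt[v] += p / len(adj[u])
        dist = nxt
    return dist


def variation_distance(adj, start, steps):
    """Total variation distance between the walk and the stationary distribution."""
    total_degree = sum(len(nbrs) for nbrs in adj.values())
    dist = walk_distribution(adj, start, steps)
    return 0.5 * sum(abs(dist[u] - len(adj[u]) / total_degree) for u in adj)


print(variation_distance(complete_graph(range(8)), 0, 5))    # well-connected graph
print(variation_distance(two_cliques_with_one_link(), 0, 5))  # sparse cut slows mixing
```

A walk on the single clique is close to stationary after a handful of steps, while the walk that must cross the single link stays concentrated on its own side. That gap is the property the schemes rely on.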
Common insight across schemes
• All schemes find local communities around trusted nodes
– Roughly, set of nodes more tightly knit than surrounding graph
• Accept service from those within the community
– Block service from the rest of the nodes
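One simple way to find a local community around a trusted node is to grow a set greedily and keep the prefix with the lowest conductance (cut edges divided by the smaller side's total degree). The sketch below is an illustration under that assumption; it is not the algorithm of any of the schemes listed, and the toy graph is hypothetical.

```python
def conductance(adj, community):
    """Fraction of the community's link endpoints that leave the community."""
    cut = sum(1 for u in community for v in adj[u] if v not in community)
    vol_in = sum(len(adj[u]) for u in community)
    vol_out = sum(len(nbrs) for nbrs in adj.values()) - vol_in
    denom = min(vol_in, vol_out)
    return cut / denom if denom else 1.0


def local_community(adj, trusted):
    """Greedily add the neighbor that minimizes conductance; return the best set seen."""
    members = {trusted}
    best, best_score = set(members), conductance(adj, members)
    while True:
        frontier = {v for u in members for v in adj[u]} - members
        if not frontier:
            break
        nxt = min(sorted(frontier), key=lambda v: conductance(adj, members | {v}))
        members.add(nxt)
        score = conductance(adj, members)
        if score < best_score:
            best, best_score = set(members), score
    return best


# Toy graph: honest clique {0,1,2,3}, Sybil clique {4,5,6,7}, one attack link 3-4.
adj = {u: [v for v in range(4) if v != u] for u in range(4)}
adj.update({u: [v for v in range(4, 8) if v != u] for u in range(4, 8)})
adj[3].append(4)
adj[4].append(3)

accepted = local_community(adj, trusted=0)
print(sorted(accepted))
```

Nodes inside the returned set would be accepted and the rest blocked. The cut is clean here only because the honest nodes form a single tight community around the trusted node.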
Are certain network structures more vulnerable?
• When honest nodes divide themselves into multiple communities
– Cannot tell apart Sybils & non-Sybils in a distant community
• How often do social networks exhibit such community structures?
How often do non-Sybils form one cohesive community?
• Not often!
• Many real-world social networks have high modularity
– They exhibit multiple well-defined community structures
Facebook RICE undergraduates’ network
• Exhibits densely connected user communities within the graph
• Other social networks have even higher modularity
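Modularity can be computed directly from a graph and a partition. The sketch below assumes the standard Newman definition, Q = sum over communities of (internal links / m) - (community degree / 2m)^2, where m is the number of links; the toy graph is hypothetical and is not one of the networks analyzed in this talk.

```python
def modularity(adj, communities):
    """Newman modularity of a partition of an undirected graph given as adjacency lists."""
    m = sum(len(nbrs) for nbrs in adj.values()) / 2  # number of links
    q = 0.0
    for community in communities:
        members = set(community)
        # Each internal link is seen from both endpoints, hence the division by 2.
        internal = sum(1 for u in members for v in adj[u] if v in members) / 2
        degree = sum(len(adj[u]) for u in members)
        q += internal / m - (degree / (2 * m)) ** 2
    return q


# Toy graph: two 4-node cliques joined by a single link.
adj = {u: [v for v in range(4) if v != u] for u in range(4)}
adj.update({u: [v for v in range(4, 8) if v != u] for u in range(4, 8)})
adj[3].append(4)
adj[4].append(3)

print(modularity(adj, [range(4), range(4, 8)]))  # two well-separated communities
print(modularity(adj, [range(8)]))               # everything in one community
```

A partition that matches dense groups scores well above zero, while lumping all nodes together scores zero. High modularity therefore signals that honest users do not form one cohesive community.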
How often do non-Sybils form one cohesive community?
• Traditional methodology:
– Analyze several real-world social network graphs
– Generalize the results to the universe of social networks
• A more scientific method:
– Leverage insights from sociological theories on communities
– Test if their predictions hold in online social networks
– And then generalize the findings
Group attachment theory
• Explains how humans join and relate to groups
• Common-identity based groups
– Membership based on self interest or ideology
– E.g., NRA, Greenpeace, and PETA
– Tend to be loosely-knit and less cohesive
• Common-bond based groups
– Membership based on inter-personal ties, e.g., family or kinship
– Tend to form tightly-knit communities within the network
Dunbar’s theory
• Limits the number of stable social relationships a user can have
– To fewer than a couple of hundred
– Linked to size of neo-cortex region of the brain
• Observed throughout history since hunter-gatherer societies
• Also observed repeatedly in studies of OSN user activity
– Users might have a large number of contacts
– But, regularly interact with fewer than a couple of hundred of them
• Limits the size of cohesive common-bond based groups
Prediction and implication
• Strongly cohesive communities in real-world social networks will be necessarily small
– No larger than a few hundred nodes!
• If true, it imposes a limit on the number of non-Sybils we can detect with high accuracy
– Will be problematic as social networks grow large
Verifying the prediction
Real-world data sets analyzed
Implications
• Fundamental limits on social network-based Sybil detection
• Can reliably identify only a limited number of honest nodes
• In large networks, limits interactions to a small subset of honest nodes
– Might still be useful in certain scenarios, e.g., whitelisting email from friends
• But, what to do with nodes not in the honest node subset?
One way forward: Sybil tolerance
• Rather than detect bad nodes, let's limit bad behavior
• Sybil detection: Use network to find Sybil nodes
– Accept / receive unlimited service from non-Sybils
– Refuse to interact with Sybils
• Sybil tolerance: Use network to limit nodes’ privileges
– Interact with all nodes, but monitor their behavior
– Limit bad behavior from any node, Sybil or non-Sybil
[Figure: Source and Destination nodes]
Illustrative example: Applying Sybil tolerance to email spam
• Key idea: Link privileges to credit on network links
– Once the credit is exhausted, the node stops receiving service
– Does not matter if the node is a Sybil or not
Illustrative example: Applying Sybil tolerance to email spam
• Creating multiple node identities does not help
– So long as they cannot create links to arbitrary honest nodes
• No assumption about connectivity between non-Sybils
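The credit idea in this example can be sketched as follows. This is a simplified illustration, assuming a fixed credit per link direction that is spent by each message along its path and never replenished; the class, its methods, and the toy topology are hypothetical and are not the actual Ostra or SumUp design.

```python
from collections import deque


class CreditNetwork:
    def __init__(self, edges, credit_per_link):
        self.neighbors = {}
        self.credit = {}  # remaining credit per directed link (u, v)
        for u, v in edges:
            self.neighbors.setdefault(u, []).append(v)
            self.neighbors.setdefault(v, []).append(u)
            self.credit[(u, v)] = credit_per_link
            self.credit[(v, u)] = credit_per_link

    def _find_path(self, src, dst):
        """Breadth-first search over links that still have credit."""
        parent = {src: None}
        queue = deque([src])
        while queue:
            u = queue.popleft()
            if u == dst:
                path = []
                while parent[u] is not None:
                    path.append((parent[u], u))
                    u = parent[u]
                return path[::-1]
            for v in self.neighbors.get(u, []):
                if v not in parent and self.credit[(u, v)] > 0:
                    parent[v] = u
                    queue.append(v)
        return None

    def send(self, src, dst):
        """Deliver one message if a path with credit exists, spending one credit per hop."""
        path = self._find_path(src, dst)
        if path is None:
            return False
        for hop in path:
            self.credit[hop] -= 1
        return True


# Attacker "m" has one link to honest node "a" and three Sybil identities behind it.
net = CreditNetwork(
    [("a", "b"), ("m", "a"), ("s1", "m"), ("s2", "m"), ("s3", "m")],
    credit_per_link=2,
)
delivered = sum(net.send(s, "b") for s in ["s1", "s2", "s3"] * 5)
print(delivered)
```

However many identities sit behind the attacker, every message must cross the single link into the honest region, so the total delivered is bounded by that link's credit. A deployed system would also need a way to restore credit for well-behaved senders, which this sketch leaves out.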
Such Sybil tolerant systems already exist
• Ostra [NSDI’08]: Limiting unwanted communication
• SumUp [NSDI’09]: Sybil-resilient voting
• Their properties were not well understood before
Sybil detection versus tolerance
• Sybil detection
– Assumes network of honest nodes is fast mixing
– Does not require anything beyond network topology
• Sybil tolerance
– No assumption about connectivity between honest nodes
– Requires user behavior to be monitored and labeled
Summary: A comprehensive approach to social network-based Sybil defense
Thank you!
Questions?