Privacy management schemes for social networking sites
Thesis submitted in the partial fulfillment
of the requirements for the degree of
Master of Science (By Research)
in
Computer Science and Engineering
By
NAGARAJA KAUSHIK GAMPA
200707017
Center for Security, Theory & Algorithmic Research (C-STAR), International Institute of Information Technology,
Hyderabad, India 500032.
International Institute of Information Technology
Hyderabad, India
CERTIFICATE
It is certified that the work contained in this thesis, titled “Privacy management
schemes for social networking sites” by Nagaraja Kaushik Gampa (200707017)
submitted in the partial fulfillment for the award of the degree of Master of Science (By
Research) in Computer Science and Engineering, has been carried out under my
supervision and it is not submitted elsewhere for a degree.
____________________ _______________________________
Date Advisor: Dr. Kannan Srinathan
Acknowledgments
I owe my deepest gratitude to Dr. Kannan Srinathan, Assistant Professor, CSTAR,
International Institute of Information Technology, who has been my advisor during the
course. He provided me with many helpful suggestions, important advice and constant
encouragement during the course of this work.
I would also like to thank all my friends and CSTAR colleagues who have been
encouraging and supportive throughout my journey in IIIT-Hyderabad.
My special appreciation goes to my parents, Arvind Kumar Gampa and Meera Bai
Gampa, for their unconditional love which has given me the strength to try and achieve
more and to be a better person.
ABSTRACT
Security experts often say that users are the weakest link in a security system. Users misunderstand
how to use security mechanisms, do not realize the need for protection, and are happy to circumvent
security measures that impede their primary tasks. Attackers, on the other hand, are experts in
usability: they exploit users' lack of understanding and their tendency not to comply with security
protocols and policies by developing simple yet effective social engineering attacks. This thesis
addresses two important problems that users face on social networking sites. The first is
distinguishing friends from strangers, and the second is admitting to a community only the people
who are interested in it.
Current social networking sites protect user data by making it available only to a restricted set
of people, often friends. However, the concept of 'friend' in social networks is illusory. Adding a
person to the friends list without verifying her identity can lead to serious consequences such as
identity theft and privacy loss. We propose a novel verification paradigm to ensure that the
person who sends you a friend request is actually your friend and not someone who is faking her
identity. Our solution is based on what a person might know and can verify about the other
person. We work on the premise that a friend can describe her friend's preferences better than a
stranger can. The preferences include the interests of a particular person and the Big Five
personality traits (see footnote 1). To verify our premise, we conducted a two-stage user study.
The results of the user study are quite encouraging.
The lifeline of a social networking site is its community relationship model. A community is a
group of people having similar tastes, interests and lifestyle. Communities can take the form of an
online forum or discussion group in which the active participation of users is required. At present,
social networking sites have no provision for measuring the interest of the members of a
particular community. Including uninterested users in a community negatively affects the quality
of the community and its activities. Attackers disguised as normal users often try to join all
communities, even those in which they have no interest. These users are a constant threat to the
online community in the form of spammers, data thieves and malware injectors. We propose a novel
solution which filters users based on their interest in the community. Our scheme is a
challenge-response scheme that works on the premise that a user who is interested in a community
will have knowledge about it and can answer questions about it. We verified our premise with a
user study. The results were encouraging, clearly separating the interested users from the
uninterested ones.
1 Oliver P. John and S. Srivastava. The Big-Five Trait Taxonomy: History, Measurement, and Theoretical Perspectives.
In Handbook of Personality: Theory and Research (1999), pp. 102-138. University of California at Berkeley.
Table of Contents
Chapter 1: Introduction
1.1 Definition
1.2 History
1.3 Thesis Statement
1.4 Let only the right one in
1.5 Sammelan
1.6 Overview of the Thesis
Chapter 2: Let Only the Right One In: Privacy management scheme for social networks
2.1 Background
2.2 Related work
2.3 Motivation
2.4 Challenge response schemes
2.5 Naive Approach: Verify about the sender
2.6 Our Approach: Verify About the Receiver
2.6.1 User verification using Preferences
2.6.1.2 Advantages
2.7 User study
2.8 Second phase user study
2.9 Results
Chapter 3: Sammelan: Secure Communities of Shared Interests
3.1 Related Work
3.2 Motivation
3.3 Procedure
3.3.1 Traditional Approach
3.4 Challenge Response Scheme
3.5 Naive Approach 1
3.6 Naive Approach 2
3.7 Sammelan
3.8 Method
3.8.1 Moderator's side
3.8.2 User's side
3.8.3 System's side
3.9 Security Analysis
3.10 User Study
3.11 Results
3.12 System Usability Test
Chapter 4: Conclusion
References
List of Figures
Figure 1: (a) Face to face communication (b) Communication by sending postcards (c) Communication by telephone
Figure 2: (a) Email sending (b) Use of instant messenger
Figure 3: Steps followed while adding friends in a social network
Figure 4: Naive Approach of Verifying the Sender
Figure 5: Better approach of verifying the receiver
Figure 6: Liking for the items belonging to (a) Sports category (b) Interest category
Figure 7: Prediction difference by participants for a friend and a stranger
Figure 8: Traditional approach of joining an online community
Figure 9: Naive approach 1 for adding a user into the community
Figure 10: Naive approach 2 for adding a user into the community
Figure 11: Our Approach for a User to join the community
Figure 12: Moderator creating the community and entering keywords
Figure 13: User choosing the keywords corresponding to the community
Figure 14: Graph showing difference in average scores of interested and not interested users
Figure 15: Graph showing the differences in the users' scores in knowledge test
Figure 16: Bar graph showing opinion of users on 3 point scale in System Usability Test
List of Tables
Table 1: Most liked and disliked items for each category
Table 2: Sample questionnaire for second stage user study
Table 3: Guesses of participants about their friends and strangers
Table 4: Details of sessions
Table 5: Scores of the knowledge test
Chapter 1
Introduction
Humans are an intelligent species that has invented and discovered many things, and one of the
most important of these is communication. A communication between two or more people is called a
conversation. It is a social skill that most people possess and do not find difficult.
Conversations are an ideal form of communication in some respects, since they allow people with
different views on a topic to learn from each other. A successful conversation builds on mutual
interests or on things that the speakers know, so the people engaging in a conversation must find
a topic to which both can relate. The conversation can be on any topic.
A conversation can be carried out in different ways. Communication is a process that involves the
exchange of information, thoughts, ideas and emotions. A sender encodes and sends a message, which
is carried via a communication channel to the receiver; the receiver decodes the message, processes
the information and sends an appropriate reply via the same channel. Depending on the channels
used and the style of communication, there are various types of communication. They can be
divided into:
Verbal communication:
Verbal communication is further divided into oral communication and written communication.
Oral communication refers to the spoken words in the communication process. It can be
face-to-face communication, a conversation over the phone, or voice chat over the Internet.
Written communication can be via snail mail or email. The effectiveness of written
communication depends on the style of writing, vocabulary used, grammar, clarity and
precision of language.
Figure 1: (a) face to face communication (b) Communication by sending postcards (c)
Communication by telephone
Non-verbal communication:
Non-verbal communication includes the overall body language of the person who is speaking:
body posture, hand gestures and overall body movements. Facial expressions also play a major
role in communication, since the expressions on a person's face say a lot about his or her
mood. Gestures like a handshake, a smile or a hug can independently convey emotions.
Non-verbal communication can also take the form of pictorial representations, signboards, or
even photographs, sketches and paintings.
The written form of verbal communication has been extended from postcards to electronic mail.
Electronic mail, most commonly abbreviated email or e-mail, is a method of exchanging digital
messages. The foundation for today's global Internet e-mail service was created in the early
ARPANET, and standards for encoding of messages were proposed as early as 1973. At first e-mails
contained only text; gradually it became possible to add attachments such as multimedia files or
documents. After e-mail, communication gradually shifted to peer-to-peer messages between two
people, known as instant messaging. Instant messaging (IM) is a form of real-time direct
text-based communication between two or more people using personal computers or other devices,
along with shared software clients. The user's text is conveyed over a network, such as the
Internet. Instant messaging is often called online chat. Online chat and instant messaging differ
from other technologies such as e-mail in the synchronicity perceived by the users: online chat
happens in real time. Online chat was popularized by services such as AOL Instant Messenger. Some
systems permit messages to be sent to people who are not currently logged on, generally called
offline messages, thus removing some of the differences between IM and e-mail. Some IM services
even include voice and video chat. They also have chat rooms where people interact with different
people and make new friends if they like each other. There are chat rooms dedicated to a
particular topic, where participants discuss the details of that topic and clarify each other's
doubts.
Figure 2: (a) email sending (b) use of instant messenger
Online chat services from providers such as AOL, Yahoo and Google remain all-time favorites for
communicating quickly. They were followed by blogs, websites maintained by individuals. Many blogs
provide commentary or news on a particular subject; others function as more personal online
diaries. A typical blog combines text, images, and links to other blogs, Web pages, and other
media related to its topic. The ability of readers to leave comments in an interactive format is
an important part of many blogs. Most blogs are primarily textual, although some focus on art,
photographs, videos, music, and audio. Micro blogging is another type of blogging, featuring very
short posts. Similarly, there are forums, which are mostly used for discussion. Most forums host
technical discussions: a topic is posted, members comment on it, and the discussion continues.
Blogs became personal websites where people post about their personal events and day-to-day
life. But what about the people who do not have the technical knowledge to create blogs? So,
after chat rooms and blogs came the new age of networking known as social networking.
1.1 Definition
But what is a social networking site?
A social networking site is a web-based site where people create a public or semi-public profile
within a bounded system, share their views with the people they choose, and make a list of
connections with others who are present within the bounded system. Profiles are unique pages
where one can "type oneself into being". Social networking sites contain not just profiles but
also photos, videos, communities and, most importantly, the friends list. We can also say that a
social networking site is a medium of communication between friends.
1.2 History
The first recognizable social networking site (SNS), SixDegrees.com, was launched in 1997. It
allowed users to create profiles, list their Friends and, beginning in 1998, surf the Friends
lists. The features of SNSs existed before in one form or another. For example, dating sites had
profiles but no friends list. Instant messengers had a friends list or buddy list, which was
visible only to the user and not to others. Classmates.com allowed people to affiliate with their
high school or college and surf the network for others who were also affiliated, but users could
not create profiles or list Friends until years later. SixDegrees.com was the first to combine
these features [6].
After joining a social networking site, an individual is asked to fill out forms containing a
series of questions. The profile is generated using the answers to these questions, which
typically include descriptors such as name, age, sex, location, interests, and an "about me"
section. The person is also encouraged to upload a photo if he or she wishes. Users are then
prompted to identify others in the system with whom they wish to have a relationship. The label
for these relationships differs depending on the site: popular terms include "Friends",
"Relatives", "Contacts" and "Fans". Most social networking sites let users leave messages for
their friends. For example, in Orkut such a message is called a scrap, in Facebook users post a
message on a friend's wall, and in Twitter users tweet about themselves and comment on their
friends' tweets. But social networking sites are not limited to messaging, creating profiles and
adding friends. Users can also upload photos and videos to share with their friends.
The visibility of a profile varies from site to site. For example, profiles on Friendster were
visible to anyone, whether or not the viewer was a member of the SNS. LinkedIn is a social
networking site that is mostly used for maintaining business or professional contacts. A LinkedIn
profile can be viewed only by people who have an account on that SNS. MySpace, which was first
created as an SNS within a college, slowly developed into a commercial site for everyone. It also
allows access only to users who have an account on the SNS.
The public display of connections is a crucial component of SNSs. The Friends list contains links
to each Friend's profile, enabling viewers to traverse the network graph by clicking through the
Friends lists. On most sites, the Friends list is visible to anyone who is permitted to view the
profile, although there are exceptions. For instance, some MySpace users have hacked their
profiles to hide the Friends display, and LinkedIn allows users to opt out of displaying their
network. The default visibility in most social networking sites is set to everyone or to friends
of friends. Since most users of these sites are teenagers, they care more about popularity than
security, so they neglect the privacy settings that the sites provide. They believe that the more
friends they have, the more popular they are, so they add more users without considering whether
a given user is malicious. Some malicious users can be dangerous and misuse private data. As more
and more people are added to the friends list, a profile becomes more and more public.
1.3 Thesis Statement
The major goal of this thesis is to provide privacy management schemes for users of social
networking sites. To this end we focus on two major problems in present-day social networking
sites:
The first problem is to make sure that a user of a social networking site knows whether the
person being added to the friends list is a friend or a stranger. Adding a stranger to the
friends list can be dangerous, since private data may fall into the hands of a malicious user
who misuses it. This personal data includes name, date of birth, address, phone number,
interests, etc.
The second problem is to find out whether a person who wants to join an online community on
a social networking site really knows about the community and is interested in it, and to
admit the person only if so. Doing so keeps out people who are not interested in the
community and reduces unwanted mail (which may be spam) from uninterested people.
A brief introduction to these problems is given in Section 1.4 and Section 1.5.
1.4 Let only the right one in
Security experts often say that users are the weakest link in a security system [26]. Users
misunderstand how to use security mechanisms and do not realize the need for such protection.
User behavior is essentially goal driven, and security is usually a supporting task. Users are
happy to circumvent security measures that impede their primary tasks. Attackers, on the other
hand, are experts in usability: they exploit users' lack of understanding and their tendency not
to comply with security protocols and policies by developing simple yet effective social
engineering attacks. This problem is inherent in social networks. Social networking sites are
highly popular, especially among teenagers. There are many dedicated social networks for
different domains. For example, LinkedIn and Live Journal are business-related social networking
sites, while sites like Flickr offer easy and public photo sharing. However, in most cases,
social networking sites are meant for teenagers to stay in touch with their friends.
Friendship-oriented sites, like Facebook and MySpace, are among the most popular websites, with
more than 150 million users each and a growth rate of 3% per week [10]. This growth is bound to
attract many malicious people. Social networking sites are a rich source of sensitive private
data about millions of users. If such data gets into the hands of malicious people, there could
be serious consequences such as identity theft and privacy loss.
However, the need to socialize and the benefits that social networking websites offer are so
great that users often ignore the associated risks. Current social networking sites like Facebook
protect user data by making it semi-public [6]. A semi-public profile is visible only to a
restricted set of people, often friends. However, the definition of friend is rather illusory in
a social networking environment. Each person can have different kinds of friends. The most
notable categories are direct or close friends, acquaintances, and Internet friends. The third
category, Internet friends, is the most troublesome. It includes friends whom we have met only on
the Internet and never in person. The only available form of communication with these friends is
therefore the Internet. It also means that there is no easy way to verify their true identity, so
they are no different from strangers. Innocent users never realize that many of these friends
could be malicious attackers trying to steal their important data, or sexual predators, lawyers,
principals, etc. intruding on their social privacy. Adding such people (strangers) as friends
without verification is therefore not advisable. However, teens (who make up the bulk of the
population on social networking sites) do not share this perception of risk. Most of them think
of friendship as loyal and harmless. Moreover, it is often harder to say 'no' than 'yes' when it
comes to accepting a friend request. It is also a common belief among teenagers that the more
friends they have, the more popular they become [13]. Thus the default choice is to accept most
friend requests. As a result, many users have thousands of friends, with most of whom they hardly
speak. Users do not realize that by confirming someone as a friend, they give that person the
power to secretly view all the contents of their profile, without being aware of when and what
the person views [29].
We propose a novel verification paradigm to ensure that the person who sends you a friend
request is actually your friend and not someone who is faking her identity. Our solution is based
on what a person might know and can verify about the other person. We work on the premise that a
friend can describe her friend's preferences better than a stranger can. To verify our premise,
we conducted a two-stage user study. The results of the user study are quite encouraging.
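The premise above can be illustrated with a minimal sketch in Python. It is not the implementation evaluated in this thesis: the function names, the like/dislike encoding of preferences, the sample items and the 0.7 threshold are all assumptions made for illustration. The sketch only shows how the guesses of a requester about the receiver's preferences could be scored against the receiver's own answers.

```python
from typing import Dict


def preference_match_score(actual: Dict[str, bool], guessed: Dict[str, bool]) -> float:
    """Return the fraction of the receiver's preferences that the requester guessed correctly.

    `actual` maps an item to True (liked) or False (disliked) as stated by the receiver.
    `guessed` holds the requester's guesses; a missing guess counts as wrong.
    """
    if not actual:
        raise ValueError("receiver has no recorded preferences")
    correct = sum(1 for item, liked in actual.items() if guessed.get(item) == liked)
    return correct / len(actual)


def looks_like_friend(actual: Dict[str, bool], guessed: Dict[str, bool],
                      threshold: float = 0.7) -> bool:
    """Hypothetical decision rule: treat the requester as a friend above the threshold."""
    return preference_match_score(actual, guessed) >= threshold


# Illustrative data only; the items are not taken from the user study.
receiver = {"cricket": True, "chess": False, "rock music": True, "cooking": False}
friend_guess = {"cricket": True, "chess": False, "rock music": True, "cooking": True}
stranger_guess = {"cricket": False, "chess": True, "rock music": True, "cooking": True}

print(preference_match_score(receiver, friend_guess))
print(preference_match_score(receiver, stranger_guess))
```

In practice the threshold would have to be chosen from data such as the user study results, and the final decision to accept a request would presumably remain with the receiver.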
1.5 Sammelan
Humans are social animals. Society is an integral part of human life. Humans have been solving
the complicated problems they face in life with the help of fellow humans in society. In a
society, people try to create their own comfort zones, called communities, where they can share
their interests and problems with other members of the community. Today the Internet has
transformed itself into a powerful tool with limitless applications and benefits. It has become a
virtual world in itself, with people making lifelike profiles of themselves. Online communities
have become very important in the world of the Internet. These online communities provide the
same benefits as real-world communities with much more ease, flexibility and scalability. The
main difference between online communities and real-world communities is that in online
communities the accountability of members for their deeds is very low or sometimes nil. If we
assume that all members of the community are honest, then there is no need for accountability.
But this assumption does not hold in real life. With the increasing popularity of online
communities and people's growing dependence on them, there is an immediate need to determine
whether users are interested in the community or not.
Social networking sites such as Facebook, MySpace, and Orkut are becoming popular amongst
teenagers [13]. According to a recent study conducted by USC Annenberg's Center for the Digital
Future, nearly 60 percent of internet users over 50 years of age log in to an online community
every day, or even several times a day. The corresponding figure is just 47 percent for members
of social networking sites who are under 20 years old [28]. Social networking sites are mainly
used to interact with friends, both old and new. Facebook alone has about 400 million active
users [10], and this number is increasing day by day. The uniqueness of a social networking site
lies in what it gives its users. It not only allows strangers to communicate with each other but
also enables them to articulate and visit their social networks. This generally results in
connections between individuals who share some common interests. Most users of social networking
sites are not looking to meet new people; they are trying to communicate with people who are
already part of a certain group [6].
Social networking sites are all about connecting with like-minded people. The sense of
community makes social networking sites popular. Interaction in a community increases much more
if the community is a closed one, that is, if it has its own privacy restrictions for its
members. But along with the benefits of these communities some dangers lurk, just as in the real
world [28]. For example, a person may fill his entire profile with false data. The possible
dangers range from stating a false age and stealing someone's personal information to cybercrimes
like identity theft, masquerading and the use of public information for wrongful purposes.
With the advent of Web 2.0 services, the importance of communities in the lives of internet users
has increased dramatically. The full potential of communities is being explored with online
services like blogs and online writing. Social networking sites have become our second homes.
People have embraced the idea of being in an online family called a community. This new era of
online communities has brought great changes in the lifestyles of many. Alongside the advantages
users get out of these communities, the potential for abuse is also high.
An online community has to cater to the needs of its members to sustain itself over time [2].
The willingness of the members to participate actively in discussions and their commitment are
important factors for the success of any online community. This property of online communities
is very similar to that of real-world communities. Online communities must provide the benefits
that their users seek in order to stay alive in the cyber world [8]. Members who post to an
online community for the first time are more likely to contribute again when other members
respond to their posts: for these new members, receiving a response increases the probability of
posting a second time [15].
Community Hacking [8] discusses the threats faced by online communities. Community hacking is
defined as the art of breaking into online communities and bending their function in unintended
ways. Community hacking is mostly motivated by personal gain for the hacker. This form of abuse
has to be taken seriously, as it can hurt the quality of the content and the interactions of the
community.
Communities are great for organizing on a personal level and for smaller scale interaction around
a cause. Communities are set up for personal interactions. Communities are directly connected to
their users, implying that the activities of the group may reflect on the user's personality.
Online communities are generally considered an extension of the user's profile. All the posts of a
user in a community are attached to his personal profile. Creating a community is quite simple,
with only the type and the name of the group as required fields. Permission settings make it
possible for community moderators to restrict access to the group. The administrator of the group
has to set the access permissions for the group. Viewing access to a community can be open to all
or restricted to its members. The join permissions of the community can be set to public, closed
or secret. Administrators have the task of managing the community, sending invitations and
approving applications. Because of its inherent structure, a community offers more control over
who gets to participate than other means of mass communication. Updating information in a
community is simple. An email to the community appears in the inbox of all the members of the
community.
The problem with online communities is that they offer the moderator little control over posts.
Even though they have granular moderation features, these are not sufficient. If a post is
considered to be spam, it has to be deleted manually. Everybody needs privacy in their life, and
so do the users of social networking sites. Present day social networking sites face problems in
privacy. Attention to privacy should not be confined to the user profile but should extend to the
groups that the user has joined. As of today, importance is given to the profile alone and the
privacy problems in communities are ignored.
A more serious problem faced by social networking site users is that of groups created under
false pretenses [9]. These groups are used for the financial gain of the owner of the group.
Authenticating the owner of the group becomes quite difficult. This can only be done based on the
moderator's knowledge about the group or community. The problem lies not only in the genuineness
of the community but also in the fact that control of the community rests with an inappropriate
administrator who could misuse the members' information.
Attackers consider online communities to be information harvesting grounds. Even though the
privacy settings of the user's profile keep naive attackers at bay, these settings are not
sufficient to stop smart attackers. There are many issues in online communities, such as privacy
and spamming, which need to be taken care of. Once the attacker is part of the online community,
he has access to all the members who have an interest in a particular topic. Maintaining high
quality activity by closely inspecting the posts of the community members is a tedious job. The
same result can be achieved by the simpler task of judging the users who join the community. The
classification can be based on the knowledge the user has about the community, his interest and
the amount of effort that he puts in to get into the community.
Our scheme works on the premise that a user who is interested in the community possesses some
knowledge about the community and can answer basic questions about that community. We
verified our premise with the help of a user study. On average, we found a large difference
between the scores of the interested and the non-interested users of the community.
1.6 Overview of the Thesis
The remainder of the thesis is arranged as follows. The thesis is divided into two parts that
correspond respectively to the two major problems discussed in Section 1.3. The first problem is
"Let Only the Right One In" and the second problem is "Sammelan: Secure Communities of Shared
Interests".
Chapter 2 discusses the first problem, "Let Only the Right One In".
The first section of Chapter 2 gives the background of the first problem and the next section
discusses the related work done to date on that problem. Section 2.3 describes what motivated our
approach. Before presenting our method we present a naive approach in Section 2.5, which we
modify for better usability in Section 2.6. The final three sections of Chapter 2 (Section 2.7,
Section 2.8 and Section 2.9) describe the user study that we conducted to validate our method and
give its detailed results.
The second half of the thesis, Chapter 3, explains the second problem, "Sammelan".
Section 3.1 discusses the related work done to date on the second problem. In Section 3.2 we
describe what motivated our approach. Before proceeding to our approach we discuss two different
methods and their drawbacks, and we finally arrive at a method that can solve the community
membership problem in Section 3.6, Section 3.7 and Section 3.8. In Section 3.9, Section 3.10 and
Section 3.11 we present the security analysis of the method, covering the different ways in which
an attacker can attack, and give the detailed results of the user study that we conducted to
verify the method.
Chapter 4 concludes the thesis and outlines future work that can extend our work.
Chapter 2
Let Only the Right One In: Privacy
Management Scheme for Social
Networks
2.1 Background
We have been using computers for socializing for about a decade. It started with e-mail, then
moved to chat rooms, and now it is the era of social networking sites. The primary reason behind
the success of social networking sites is the range of socially compelling services that they
offer. The three most important of those services [12] are: Identity, Relationships and
Communities.
Identity: Social networking sites allow users to create a public or semi-public profile
inside a bounded system. The profile pages include fields for personal details, favorite
forms of media and things that users want to tell others about themselves. The biggest
advantage of profile pages is that they let users say who they are, in the manner they
want.
Relationships: Social networking sites let users make new friends, rediscover old friends
with whom they had lost touch and stay in touch with current ones. An average user
on a social network has around 120 friends [10]. The act of adding a friend is often
bidirectional and requires mutual approval. After the friend request is accepted, the user
is able to traverse the accepted friend's connections. By browsing through them, the user
can meet or find new friends. In this way, the network of friends grows and is often
measured in millions [10].
Communities: The third compelling factor of social networking sites is the integration of
communities into social networks. Like-minded people come together and start
communities on a subject of common interest. The aim of a community is to share
thoughts, help each other and learn about the common community theme. Communities
range from public fan clubs of favorite celebrities to active forums discussing technical
subjects like C++ and social discussions on subjects like poverty. Most communities are
kept public for anybody to join and learn about the subject. Users can achieve social
respect and importance within the bounded community through their sharing and thoughts.
According to Maslow's hierarchy of human needs [21], the urge to socialize is a highly motivating
force. The socially compelling services described above often lead users to neglect the
associated risks. A fully filled-out profile is a rich source of personally identifiable
information. Such sensitive data can be misused once it gets into the wrong hands.
We discuss below the possible threats of adding a stranger as a friend.
Loss of privacy: Most users do not read the privacy policy, and those who read it do
not understand it. Users often assume that if there is a privacy policy then the site is
safe [24]. Users share their embarrassing moments, the mistakes they made, their love life
and other sensitive personal details on social networks. These details are visible to
friends. However, users do not recognize that the person they have added as a friend may
not be the one they thought. He could be faking somebody's identity. He could be a
school principal or a future employer monitoring the profile to judge the user's character,
or he could be a prankster from college. In this way there is a huge loss of privacy
once the data is public.
Identity theft: Most financial institutions rely on security questions as fallback
authentication in case the user has forgotten her password. These security questions are
based on the user's personal life history, family background, etc. However, in social
networks this kind of sensitive information is freely available [25] for an attacker to
launch a successful identity theft attack. To view this data, either the profile must
be public or the attacker must be in the restricted list of people (friends) that has
access to this information. A successful attack strategy is described in [4], where an
attacker fakes the identity of a person and infiltrates her friends' networks. Since her
friends cannot easily detect the impersonation, adding such a fake profile leaks their
sensitive data for misuse, often for identity theft. Thus, once an attacker has
successfully penetrated a network, she can access all the profiles and other data and use
them for malicious purposes.
Criminal Activities: Social networking sites are often used for criminal activities such as
sexual predation, kidnapping for ransom, etc. To launch such an attack, a malicious person
creates a fake profile and sends flattering friend requests to many teenagers in the local
neighborhood. Most teenagers reply positively to this type of request. Once the malicious
person has been added to a teenager's friends list, he tries to influence the victim and
become closer to him or her. Once he has extracted enough personal data about the victim,
such as permanent address, contact numbers, etc., he might reveal his true identity. The
extracted data can further be used for severe criminal activities.
The above attacks demonstrate the severe risks associated with posting personal data on social
networks. So far, the best defense against these attacks is not to reveal any sensitive and
personal information on a social network. However, this defeats the purpose of social networks,
which is the healthy exchange of ideas across connections. Below we describe related
countermeasures to protect privacy.
2.2 Related work
Social networking sites are used for connecting friends. This connectivity between friends
is based on the trust they have in one another. However, the growth of social networks has also
attracted malicious adversaries. In recent times multiple techniques have been implemented to
protect data on social networking sites. The first solution is based on identifying honest
nodes [35, 34]. Honest nodes are the ones with good connectivity to the rest of the social
network. However, as described in the previous section, a single infiltration is enough to damage
the entire network of trust. Results show that users blindly accept a request from a forged
identity that has already been confirmed by their friends [4]. Another solution is based on data
perturbation [31], which modifies the user data so that it no longer represents real individuals.
Lucas et al. [19] proposed an encryption scheme that presents the user data in encrypted form.
However, the solution requires the distribution of encryption keys, which is an overhead. Another
similar solution secures the social networking API through proxies [11]. A related solution to
the privacy problem uses the knowledge shared between persons [30]. However, this solution
requires active involvement from both parties. A person must define separate challenge questions
for every new person and for the data to be protected. Another technique in use is CAPTCHA [1].
A CAPTCHA is a program that detects automated bot attacks by differentiating between a computer
and a human. A CAPTCHA based technique is proposed in [33]. This solution requires users to
identify the content (a person's name) of an image. However, a user's appearance changes from
time to time; therefore, it might be difficult even for a legitimate user to identify the person
in the image. To summarize, we briefly described the related attacks on privacy and how related
solutions try to prevent them. Before describing our actual design, we explain in Section 2.3
what motivated the approach presented in this thesis. After the motivation, the chapter presents
a naive approach that can solve the problem and then the main method, which solves the problem
better than the naive approach.
2.3 Motivation
We first describe how friends are formed (or linked together) in a social network. We follow
the Facebook model, which involves three basic steps, as shown in Figure 3. For clarity, we name
the persons involved in the interaction Alice and Bob. The malicious user who is faking Bob's
identity is called Mallory.
Figure 3: Steps followed while adding friends in a social network
Bob first sends a friend request to Alice. A request contains a profile summary of the sender
(Bob), which includes the name of the sender, his profile photo, a short summary and the names of
mutual friends (if any). Before accepting the request, Alice can also see and verify Bob's
profile. Alice can also talk with Bob by sending messages to which Bob can reply.
However, Alice rarely checks and verifies Bob's profile. Alice normally bases her judgment
on the profile summary that comes with the request. The default tendency is to accept if Bob and
Alice share some mutual friends [4]. However, all of this data is easy to fake, as it can be
mined from the web and public records. Thus, an innocent, careless Alice can easily be tricked
into accepting a request from a fake profile. To mitigate such attacks, the identity of the
sender must be verified. A possible way of doing this is with challenge response schemes.
2.4 Challenge response schemes
Challenge response schemes are a popular means of fallback authentication in cases where users
have forgotten their passwords [16]. In a challenge response scheme, the system verifies whether
the user knows (remembers) the answers to questions that are mostly about their personal life,
e.g. mother's maiden name, their pet's name, date of birth, etc. It has been believed that the
answers to these questions are not available in public records and that only the legitimate
person knows the correct answers. However, this assumption has recently been shown to be faulty
[25, 23]. It has been observed that challenge response schemes are particularly vulnerable to
insider attacks, i.e. family members, friends, ex-girlfriends and acquaintances seem to know the
answers to many such questions. Therefore, we ask the question:
"If family members and friends know the answers to many personal challenge response questions,
then why can't we use these questions to authenticate them instead of the user?"
We can thus verify the person who sends the friend request (the sender) using challenge response
schemes in the following two ways:
1) Ask the sender questions related to his/her own life.
2) Ask the sender questions related to the life of the person (receiver) to whom the friend
request has been sent.
In the next two sections we discuss both approaches and argue why the second approach is
better.
2.5 Naive Approach: Verify about the sender
Alice can ask Bob to prove his identity by asking him challenge questions whose answers only
Alice and Bob know. If Alice is satisfied with the answers, she accepts Bob's request; otherwise
she declines it. The steps followed are summarized in Figure 4.
Figure 4: Naive Approach of Verifying the Sender
However, this solution will work only if the following two conditions are satisfied.
1) Question forming should be automated and should require minimal effort from Alice.
2) Answers to these questions should not be publicly available.
However, the naive approach has problems with both conditions. First, finding appropriate
challenge questions automatically for every other user (Bob in this case) is difficult. Alice
must constantly be involved in the process of question forming and then again in the
verification. This is certainly a big overhead for an innocent user like Alice. A good solution
should expect most of the work from the malicious Bob (actually Mallory, who is impersonating
Bob) rather than from Alice. Secondly, since the identity of Bob is already forged (Mallory has
successfully mined Bob's information and profile details), the chance that Mallory might be able
to answer some of the questions correctly from the mined data cannot be ignored.
2.6 Our Approach: Verify About the Receiver
Alice can instead ask Bob to prove what he knows about her by asking him challenge questions
about her life. If Alice is satisfied with the answers, she accepts Bob's request; otherwise she
declines it. The steps followed are summarized in Figure 5.
Figure 5: Better approach of verifying the receiver
There are two distinct advantages to this approach. First, it requires minimal effort from
Alice's side. Alice can prepare a set of challenge questions concerning herself and ask a subset
of them for any friend request that arrives. Thus there is no need to prepare questions
separately for each new friend request and verify the answers thereafter. Secondly, question
forming can be automated to a certain degree. However, the approach is useful only if the answers
to these questions cannot be mined by searching public records. We desire a mechanism, or a set
of questions, whose answers are not available online. We therefore looked into preference based
authentication schemes [14].
2.6.1 User verification using Preferences
We therefore design our scheme around user preferences. Each person has a unique set of likes and
dislikes for a range of items. We strongly believe that a friend can tell more about her friend's
preferences than a stranger can [3]. Our proposed scheme works in three simple steps. For
clarity, we again use the entities Alice and Bob, where Bob wants to become a friend of Alice.
2.6.1.1 Steps:
1. Building preferences: Alice builds a list of preferences for a number of different
categories. In this prototype we chose the following categories: Sports, Movies, TV
Shows, Hobbies, Music, Video Games, Food, etc. A special category about
personality is also added using the Big Five personality factors [23]. The personality
traits of a person are generally known to family members and others who share a long
term relationship with that person. We ask Alice about her preferences (likes and
dislikes) for a number of different items belonging to each category. We save all these
preferences in a secure database. This step generally happens at the time Alice registers
on a social network.
2. Verification Test: When Bob tries to send Alice a friend request, we pick a random
subset of items from Alice's preferences database and ask Bob to identify Alice's
preferences for these items. Bob then tries to answer as many of those questions as he can.
3. Result: The result of Bob's test is shown to Alice along with his profile
history. It is then up to Alice to accept or reject Bob's request.
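The three steps above can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions, not the prototype used in the thesis: the function names, the in-memory dictionary standing in for the secure database, the example items and the plain fraction-correct score are all hypothetical.

```python
import random

def build_preferences(likes_by_item):
    """Step 1: store Alice's answer for each item (True = like, False = dislike).
    A plain dict stands in for the secure database (assumption)."""
    return dict(likes_by_item)

def make_test(preferences, num_questions, rng=random):
    """Step 2: pick a random subset of items to ask the sender about."""
    return rng.sample(sorted(preferences), num_questions)

def score_answers(preferences, answers):
    """Step 3: fraction of asked items where the sender's guess matches Alice."""
    correct = sum(1 for item, guess in answers.items()
                  if preferences[item] == guess)
    return correct / len(answers)

# Hypothetical preference list for Alice.
alice = build_preferences({
    "Chess": True, "Boxing": False, "Rap": True,
    "Italian": False, "Dancing": True, "Arcade": False,
})
questions = make_test(alice, 4, random.Random(0))
# A sender who knows Alice well answers every question as she would.
bob_answers = {item: alice[item] for item in questions}
print(score_answers(alice, bob_answers))  # prints 1.0
```

The score is only reported; as in step 3, the decision to accept or reject the request stays with Alice.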
2.6.1.2 Advantages
We list below the distinct advantages of the proposed design.
User preferences are generally not available online [love]. They are therefore probably safe
against data mining of public records.
The scheme places minimal overhead on the person who receives the friend request (Alice
in this case).
The scheme can be easily automated once the preference database is formed.
The scheme is simple and easy for users to understand.
To test our approach we conducted a user study, which is described in Section 2.7.
2.7 User study
To test the viability of our approach we conducted a two-phase user study. In the first phase, a
pilot study was conducted on a group of 75 student volunteers aged 19 to 28. Of the 75
participants, 49 were male and 26 were female. Monetary incentives were provided to avoid a cold
start. The questionnaire was prepared with eight categories, namely: Sports, Video Games, Music,
Hobbies and interests, Food, Movies, TV shows and Academic subjects of interest. Each category
comprises 12 to 14 items. Participants were asked to respond with their likes and dislikes for
each item within the given set of categories. In total, taste preferences for 134 items were
reported.
The aim of the pilot study was to gather gender-wise taste preferences of the participants. In
particular, we were interested in two things:
1) Commonly liked and disliked items and
2) Correlations among the liked and disliked items.
The questionnaire was prepared specifically with college students in mind and is
demographic-specific in nature. Some of its items can therefore be changed to better suit the
intended users. The results on taste preferences were then used to prepare the questionnaire for
the main (second stage) user study.
Table 1: Most liked and disliked items for each category
Category    | Most Liked                | Most Disliked
Sports      | Cricket, Billiards        | Swimming, Boxing
Video games | Age of Empires            | Arcade
Interest    | Books, Computers, Politics | Fashion designing, Jewelry, Religion
Music       | Rap                       | Heavy metal
Food        | Fruit juices, Home food   | Italian
Table 1 shows the most liked and most disliked items in each category. These items are easily
predictable, since they are liked or disliked by most of the participants. We therefore
eliminated them from the second stage questionnaire. For example, in the sports category 87.09%
of the participants liked cricket, which is unsurprising since the user study took place in
India, while 85.62% of them disliked boxing, which is also a reasonable estimate for the Indian
public.
Figure 6 shows a graph split into two parts: the left part shows the liking for each item in the
sports category and the right part shows the liking for the items in the interest category. We
can observe that items like volleyball and basketball are liked equally by the participants.
Similarly, items like sleeping and travelling received a similar number of liked votes.
Figure 6: Liking for the items belonging to a) Sports category b) Interest category
To summarize, after the first stage we were able to eliminate items that can be easily guessed.
The academic subjects of interest category was removed, since it was easy to guess based on the
participants' background. We also eliminated 12 items from other categories. As a result, 42
items in total were removed from the first stage questionnaire, leaving 92 items.
We then combined items that received nearly equal liking or disliking responses. Our motivation
for doing so is to improve usability as well as security. To illustrate, take the items chess and
carom from the sports category. We found that 50.94% of participants liked chess while 48.16% of
participants liked carom. If we combine these two into a single question and ask the user which
one of them, or both, the person likes, then the user does not feel the burden of answering two
questions, and security is also improved in the sense that an attacker cannot easily identify
whether the legitimate user likes one, both or neither of these two items (a question now has
four possible answers instead of two).
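The thesis does not state how items with similar responses were matched, so the following is only one possible sketch: a greedy pairing of items whose like-percentages differ by at most a tolerance. The function name `pair_similar_items` and the 5-point tolerance are assumptions made for illustration; the chess, carom and cricket percentages are the ones reported in the text, and cricket is included only to show an item that stays unpaired.

```python
def pair_similar_items(like_pct, tolerance=5.0):
    """Greedily pair items whose like-percentages differ by at most `tolerance`.

    `like_pct` maps item name -> percentage of participants who liked it.
    The tolerance value is a hypothetical choice, not taken from the study.
    """
    ranked = sorted(like_pct.items(), key=lambda kv: kv[1], reverse=True)
    pairs = []
    i = 0
    while i < len(ranked) - 1:
        (first, pct_first), (second, pct_second) = ranked[i], ranked[i + 1]
        if pct_first - pct_second <= tolerance:
            pairs.append((first, second))  # ask about both in one question
            i += 2
        else:
            i += 1                         # no close neighbour; leave unpaired
    return pairs

sports = {"Chess": 50.94, "Carom": 48.16, "Cricket": 87.09}
print(pair_similar_items(sports))  # [('Chess', 'Carom')]
```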
2.8 Second phase user study
At the end of the first stage of the user study we were left with 92 items which, after
combining, yielded 48 questions. We then introduced a new category, 'Personality', loosely based
on the Big Five personality factors [23] of a person, into the second phase.
This phase was conducted on a different set of 32 volunteers, of whom 20 were male and 12 were
female. The 32 volunteers were chosen in such a way that for every participant the set contained
a friend of the participant and a complete stranger to the participant. The user study was paper
based. We asked each user to fill in three questionnaires: one for herself, one for her friend
and one for a stranger whom she does not know personally. A sample of the questionnaire given to
the participants is shown in Table 2.
Table 2: Sample questionnaire for the second stage user study
Category | Items
Interest | Dancing or Singing
         | Travelling or Sleeping
         | Reading or Writing
Participants were free to choose either or both items as likes. Items that are not selected are
considered as dislikes.
2.9 Results
We collected the forms from all the participants and cross-checked the entries written for the
friend and for the stranger against their original answers. The analysis was carried out in the
following manner. If the participant guessed correctly whether both items are liked or only one
of the items is liked, we awarded 2 points. If the participant guessed only part of the answer
correctly, i.e. the friend or stranger liked both items in a single row and the participant
guessed only one, we awarded 1 point. If neither of the items was guessed correctly, we awarded
no points. Our results show that a participant guesses correctly about her friend 45.86% of the
time, whereas she guesses correctly about a stranger only 30.69% of the time. There is a distinct
gap of 15.17 percentage points that distinguishes a stranger from a friend. The results are shown
in Table 3.
Table 3: Guesses of participants about their friends and strangers
Total no. of participants | Guess about Friends | Guess about Strangers
32 (20 male + 12 female)  | 45.86%              | 30.69%
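The scoring rule of Section 2.9 can be sketched as follows. This is an interpretation rather than the exact procedure used in the study: it assumes that each row is a two-item question, that an exact match of the liked items earns 2 points, that any partial overlap earns 1 point, and that the reported percentage is the points earned divided by the maximum possible. The names `score_row` and `accuracy` and the sample rows are hypothetical.

```python
def score_row(truth, guess):
    """Score one two-item question; `truth` and `guess` are sets of liked items."""
    if guess == truth:
        return 2  # guessed exactly which item(s) are liked
    if guess & truth:
        return 1  # partly right: at least one liked item identified
    return 0      # nothing guessed correctly

def accuracy(rows):
    """Points earned as a percentage of the maximum (2 points per row)."""
    total = sum(score_row(truth, guess) for truth, guess in rows)
    return 100.0 * total / (2 * len(rows))

rows = [
    ({"Dancing", "Singing"}, {"Dancing", "Singing"}),  # exact match: 2 points
    ({"Travelling", "Sleeping"}, {"Sleeping"}),        # one of two: 1 point
    ({"Reading"}, {"Writing"}),                        # wrong: 0 points
]
print(accuracy(rows))  # 50.0
```

Under this reading, the friend and stranger figures in Table 3 would be the average of such percentages over all participants.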
After obtaining the overall result on how well a person can distinguish between a friend and a
stranger, the next aim was to find out in which categories the guesses for friends and strangers
differed the most. Figure 7 shows the percentage difference between friends and strangers in each
category.
Figure 7: Prediction difference by participants for friend and a stranger
As the graph shows, the personality category has the maximum difference, 21.43%, between a friend
and a stranger. This is unsurprising, since the personality of a person is better known to
friends than to a stranger. The next category showing a big difference in guesses is interests,
with a 17.43% difference between a friend and a stranger.
Conversely, the two categories in which the guesses about friends and strangers nearly coincide
are music and movies. We can thus say that these two categories are not good for differentiating
friends from strangers.
We may conclude that our approach of distinguishing friends from strangers using like and
dislike preferences can work, given that there exists a distinct gap of 15% or more between the
guesses for friends and the guesses for strangers.
Chapter 3
Sammelan: Secure Communities of
Shared Interests
3.1 Related Work
To the best of our knowledge there is no prior work that discusses the community membership
problem. The community membership problem is to decide whether a user really belongs to the
community, or whether he may become a good member of the community. To date, research has
focused on privacy issues concerning the personal data held on social networking sites.
Challenge questions have become an important secondary mechanism of authentication [17]. This
is a form of knowledge response scheme which is being used for fallback authentication in case
the user forgets his/her password. The assumption in this scheme is that querying for known
information will be more useful than querying for memorized information. This paper shows that
alternative authentication schemes can be used in place of conventional authentication schemes.
Following this paper we can infer that knowledge testing mechanisms can be used for
membership problems.
Cliff et al. [18] proposed a scheme to create a neighborhood for the user. In this scheme
profile elements are used as signals. The authors show empirically that certain profile
fields are more useful than others for predicting friendship or relationship. These profile
fields were termed signals. Based on these signals a neighborhood is defined for a user. This
same neighborhood may be used to define, in loose terms, a group or a community for the user.
This kind of community is personal or customized to the user. The paper does not discuss the
application of this scheme to the group/community membership problem.
Recommendation systems and social navigation systems can be of great help to the user by
narrowing down the wide variety of choices he has and suggesting the navigation path that
matches his profile. However, the same help is not available to community/group owners.
Philip et al. [5] proposed a scheme to improve online SNS recommendation systems. In this
paper, profile similarities such as age, gender, profession and hobbies were used to cluster
together users having the same interests and tastes. Rating overlaps were used as the training
model for prediction. Feedback in the form of a questionnaire was taken from the user to correct
the prediction model. This work may be extended to recommend groups to the user, but the authors
did not explore this possibility.
Michael et al. [30] discuss a shared knowledge scheme that allows a user to be part of
a certain group. Their observation is that social cliques have similar regions of
knowledge. The paper explores the possibility of using this observation to create ad-hoc groups
or communities, restricting resources to a specific section of users. Guard questions were
designed for each resource, acting like a lock on that resource. All the users who have the
answer form a virtual group or community with access privileges to that resource. In this
method the owner has to put in the effort of designing questions for each resource. It is
difficult to deploy this method on a large scale because the effort and complexity of designing
the guard questions also scale with the system.
The user's trust in other community members and the community's information sharing norms
have a negative impact on community specific privacy. Oden and Sunil [22] discuss the
effect of privacy on the participation of the user in online communities. The paper says that the
user's privacy concern is inversely proportional to the amount of contribution that he makes, and
that this concern makes the user more restrictive.
Spamming has taken new forms in recent times. Online community users are becoming the
new victims of this trend. Social networking spam is directed at users of internet social
networking services such as MySpace, Facebook or LinkedIn [27]. Messages with embedded
links to commercial sites or other SNS are an increasing nuisance. Spammers
use the tools of the SNS, or the users of certain groups, to post their spam messages. All this is
done by spammers who pose as legitimate users. With some social
networking sites giving permission to deploy user developed applications on the site, spammers
are using this feature to build applications which collect information from users’ profiles
and spam their inboxes.
There is an alarming rise in attacks on users of social networking sites by spammers and malware
writers. Reports show that the focus of cyber criminals has shifted to social networking
users in the form of spam and malware propagation [20]. There has been a rise of 70 percent in
users reporting spam through their social networking sites and a 69 percent rise in users
complaining about malware attacks from social networking sites.
The likelihood of infecting a computer using social networking sites as the medium is much higher
than with conventional methods. Compared to other methods, the social networking way of
spamming was 10 percent more effective [32]. The same article says that exploiting social
networking sites is not new. Reportedly 350,000 spam mails claiming to be from
Facebook flooded inboxes in recent times. These mails had malware attachments which tried to
compromise the host systems.
3.2 Motivation
Social networking sites contain a wealth of information. Attackers seriously consider online
communities as information harvesting grounds. Even though the privacy settings of the user’s
profile keep naïve attackers at bay, these settings are not sufficient to stop smart attackers who
exploit the freedom given in an online community. Profile information leakage from the
community is not considered a serious threat by the users. There are many issues, such as
privacy and spamming, in online communities which need to be taken care of. Once the attacker
is a part of the online community, he has access to all the members who have an interest in a
particular topic, which he can use for financial gain. Through discussions and
questions the attacker can learn more about the users of the community and their interests,
fine-tuning his attack. These risks and threats grow linearly with the size of the
community. The quality of the interactions in the community is determined by the quality of the
members in the community. Activity in an online community can be kept high, and kept
within the interest ranges of the users, by closely inspecting the posts of the members. This is a
tedious job. It can be made easier by shifting the inspection from checking the
posts to judging the users that join the online community. All this discussion boils down to a
single problem, the community membership problem: deciding whether a user really belongs to
a community or not. The problem can be extended to users who are willing to
be a part of the community but are not yet members. The classification can be done
based on the knowledge the user has about the community, or based on his interest and the amount of
effort that he puts in to get into the community. Our focus is primarily on communities which
are closed. The definition of the term ‘closed communities’ is given in the next section.
3.3 Procedure
In this section we will explain various ways to handle the community membership related
problem.
3.3.1 Traditional Approach
This section describes the general procedure to create a community in social networking
sites. We follow the Facebook model in this case. Only a member of the social networking site
can create a community. In Facebook a community is termed a “Group”. The user is given
an option to create the group in the Groups section. He is prompted for the group name and
the category to which the Group belongs. The moderator should give a small paragraph
describing the Group. This is useful for new users who wish to join the group. For example, if
a user is a fan of Michael Jackson and wants to create a Group, he can give the group name as
‘Michael Jackson Fans Club’ and has to give a brief description of the group. Under the
Group type he has to choose ‘Entertainment’. The sub-category for this Group can be chosen as
‘celebrity’. Further, the user can give any links that belong to the community. The owner has to
create a public face (profile) for the Group by giving his personal details. A public face is a
stripped down version of his actual profile. He could be the moderator himself or make any other
user the moderator of the Group. In the privacy settings of the group the moderator can choose
the group to be ‘open’, ‘closed’ or ‘secret’. The definitions of open, closed and secret groups
are as follows:
Open: Anyone can join and invite others to join. Group info and content can be viewed
by anyone and may be indexed by search engines.
Closed: Moderator must approve requests for new members to join. Anyone can see the
group description, but only members can see the Wall, discussion board, and photos.
Secret: The group will not appear in search results or in the profiles of its members.
Membership is by invitation only, and only members can see the group information and
content.
The traditional approach for a user of a social networking site to join a community is very
simple and is explained in steps in Figure 8.
A user of the social networking site who wants to join the Group has to send a
request to the moderator of the Group.
In this case, if Alice wants to join the Group then she has to click “join the community”,
which sends a request from Alice to the moderator of the community.
The moderator can either accept the request or reject it. The decision depends solely
upon the moderator. There is no procedure involved in the decision; the request is simply
accepted or not.
Figure 8: Traditional approach of joining an online community
It is very easy for an attacker to get into the community and misuse its resources
when the traditional approach is employed. In this case Alice, who is a user of the social
networking site, may be a malicious person trying to exploit the community. To minimize the
threats to the community, the traditional approach has to be modified. Changes are required
in the processes of creating the Group and joining the Group. The aim of the
modification is to make it difficult for the attacker to enter the Group.
The difficulty for the moderator is to decide whether the user should be accepted or rejected,
and how to judge the user. If Alice sends a request, it is very difficult for the
moderator of the group to be sure whether he is adding the right person to the community.
One way to solve this problem is to use a challenge response scheme.
3.4 Challenge Response Scheme
Challenge response schemes are used for fallback authentication in cases where
users forget their passwords [16]. In a challenge response scheme, the system verifies whether the
user remembers the answers to the questions which were asked at the time of registration on
social networking sites like Facebook or Orkut. The questions are mostly about the
personal life of the user, e.g. mother’s maiden name, pet’s name, date of birth, etc. It has
been believed that answers to these questions are not available in public records and that only the
legitimate person knows the correct answers. This method is the basis for our approach. If a
person is interested in or belongs to a Group then he should be able to clear the Challenge given to
him. The scope of this challenge is confined to the Group.
Challenge response verification of the user can be done in the following
ways:
1. Ask the user questions related to the community he/she wants to join.
2. Present the user with a set of ‘keywords’, ask which of them belong to the community, and
verify his answers.
We describe both of the above approaches and discuss which one is better.
3.5 Naive Approach 1
In this approach the moderator creates a Group as in the traditional approach. After
creating the Group, however, the moderator has to design a set of questions which are
related to the Group. If a user, in this case Alice, wants to join the Group, then she has to
follow the steps below, which are shown in Figure 9.
Alice first sends the request to “join the Group”.
Alice receives a set of questions that the moderator prepared at the time of
creating the community.
The questionnaire is a random set of questions picked from the
full set of questions prepared by the moderator. Let’s say 15 questions are chosen
from a set of 50 questions.
Alice replies to the questions given.
The moderator has to validate the answers to the questions posed.
The moderator then replies to the user with either ACCEPT or REJECT based on the
answers to the questions.
Figure 9: Naive approach 1 for adding a user into the community
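The random draw of the questionnaire in this approach can be sketched as follows. This is a minimal illustration and not the thesis implementation: the function name draw_questionnaire and the placeholder question texts are assumptions, and only the pool size of 50 and the questionnaire size of 15 come from the example in the steps.

```python
import random


def draw_questionnaire(question_pool, size=15, rng=None):
    """Pick `size` distinct questions at random from the moderator's pool."""
    rng = rng or random.Random()
    if size > len(question_pool):
        raise ValueError("questionnaire cannot be larger than the question pool")
    # random.sample draws without replacement, so no question is repeated.
    return rng.sample(question_pool, size)


# Placeholder pool of 50 questions (hypothetical texts).
pool = [f"Question {i}" for i in range(1, 51)]
questionnaire = draw_questionnaire(pool, size=15, rng=random.Random(0))
print(len(questionnaire))  # 15
```

Drawing a fresh subset for every request means that two applicants are unlikely to see the same questionnaire, but the grading of the replies still falls to the moderator.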
One of the problems in this approach is that the moderator has to grade the answers of each and
every user. The complete burden of validating the answers is on the moderator. This task may
seem practical and simple at first glance, but as the number of user requests increases it
turns out to be an extremely tedious task.
For example, let us take a Group on ‘Mahatma Gandhi’. If Alice wants to join the Group then she
would get questions such as
Q1) What is the name of Mahatma Gandhi’s mother?
A1) ______________
Q2) Mahatma Gandhi was born on
A2) ______________
Q3) In India, Mahatma Gandhi is popularly known as
A3) ______________
Alice replies to the questions and the moderator validates the answers. If the moderator feels that
Alice has fared well enough in the challenge, with a good number of correct answers, and
thinks that Alice is eligible to join the Group, he sends an ACCEPT message to
her. Otherwise he replies to Alice with a REJECT message.
3.6 Naive Approach 2
To overcome the problem of grading the answers every time, we used the approach of
checking the answers automatically. The moderator gives a set of questions and their
respective answers at the time of creation of the Group. The steps involved are as follows
and are shown in Figure 10.
Alice first sends the request to “join the Group”.
Alice receives a set of questions that the moderator prepared at the time of
creating the community.
The questionnaire is a random set of questions picked from the
full set of questions prepared by the moderator.
Alice replies to the questions given.
The system validates Alice’s responses. A response is considered
correct only if it matches the moderator’s answer exactly.
If the number of correct replies from Alice exceeds a predefined threshold T, the system
replies to her with an ACCEPT message; otherwise a REJECT message is sent to her.
Figure 10: Naive approach 2 for adding a user into the community
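The automatic check in this approach can be sketched as below. The function name grade_exact and the placeholder questions and answers are assumptions made for illustration. "Exceeds" is read here as strictly greater than T, which is also an assumption.

```python
def grade_exact(responses, answer_key, threshold):
    """Return (decision, n_correct) using exact string comparison."""
    n_correct = sum(
        1
        for question, expected in answer_key.items()
        if responses.get(question) == expected
    )
    # "Exceeds" is read strictly: n_correct must be greater than the threshold T.
    decision = "ACCEPT" if n_correct > threshold else "REJECT"
    return decision, n_correct


# Hypothetical answer key and replies.
answer_key = {"Q1": "answer one", "Q2": "answer two", "Q3": "answer three"}
alice = {"Q1": "answer one", "Q2": "answer two", "Q3": "something else"}
decision, n_correct = grade_exact(alice, answer_key, threshold=1)
print(decision, n_correct)  # ACCEPT 2
```

Because the comparison is an exact string match, the moderator no longer grades anything by hand, but every reply has to be typed exactly as the moderator typed it.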
Let us take a Group, ‘Cricket’. The following questions designed by the moderator can be taken
as an example:
Q1) Who has taken the highest number of wickets in Test Cricket?
A1) _____________________
Q2) Who is the only batsman having the ‘highest average’ in both Test and ODI forms of
cricket?
A2) _____________________
Alice replies with the answers. Let us assume that Alice replied with the following answers:
A1) “Muttiah Muralitharan” and A2) “Sir Donald Bradman”.
If the number of correct replies exceeds the threshold T, Alice
passes the test and is added to the Group with an ACCEPT message. The structure of the
answers given by the users may, however, differ considerably from the structure of the
moderator’s answer, even when the difference is only in the syntax. In the above example, a
reply of “Muralitharan”, “Muttiah” or “Mularidaran” is semantically a valid answer. But
any reply other than “Muttiah Muralitharan” is treated as invalid by the system
because of its syntactic difference. The system checks syntactic similarity instead of
semantic similarity, and this is the source of the difficulty.
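The mismatch can be made concrete with a short sketch. The exact comparison below mirrors the behaviour described above. The tolerant comparison is only a hypothetical alternative, built on Python's difflib with an arbitrarily chosen cutoff of 0.6; it is not part of the thesis, which moves to multiple choice answers instead.

```python
from difflib import SequenceMatcher


def exact_match(response, expected):
    """The strict check: the reply must equal the stored answer."""
    return response == expected


def tolerant_match(response, expected, cutoff=0.6):
    """Accept a reply if any of its words closely resembles a word of the expected answer."""
    response_words = response.lower().split()
    expected_words = expected.lower().split()
    return any(
        SequenceMatcher(None, r, e).ratio() >= cutoff
        for r in response_words
        for e in expected_words
    )


expected = "Muttiah Muralitharan"
replies = ["Muttiah Muralitharan", "Muralitharan", "Muttiah", "Mularidaran", "Shane Warne"]
for reply in replies:
    print(reply, exact_match(reply, expected), tolerant_match(reply, expected))
```

Only the first reply passes the exact check, while the tolerant check also accepts the three variants and still rejects the unrelated name. A tolerant check of this kind trades one problem for another, since a loose cutoff can start accepting wrong answers.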
To eliminate this problem we designed the set of questions with multiple choice answers. For
each question we gave 4 options, exactly one of which is the correct choice. This method
greatly reduced the problem caused by syntactic differences in the answers, but the effort of the
moderator/Group’s owner increased considerably. Not only does he have to design the questions,
he also has to design equally good answer choices for them. The problem with bad
answer choices is that the user may find the correct answer by elimination. So the answer
choices should be equally plausible to a user who doesn’t know the answer, while a user who
knows the answer should be able to pick the correct one easily.
To reduce the complexity involved in the multiple choice format we improved
the method further. The improved method is discussed in the next section.
3.7 Sammelan
In our approach the social networking site user who wants to create a community is asked for
the same details as in the traditional approach: name, category, subcategory and description.
Instead of designing questions and answers for testing knowledge, the user has to give a
set of KEYWORDS at the time of creation of the community. This is the basic difference between
this approach and the other approaches. A keyword for the community is a word which
characterizes the community or is closely related to it. Keywords
can vary from a short word to a sentence. In general a keyword is one or two words
which describe the community.
For example, let us take a community “C++ queries”. This community comes under the category
‘education’ or ‘computer and internet’. The category under which a moderator places the
community is entirely up to him. After placing the community under a certain category he is
asked to give the keywords that describe the community in the best
possible way. These keywords should look like random words to a user who does not know C++.
The keywords would be like “OOPS”, “PURE”, “STROUSTRUP”, “MALLOC”, “NEW”, etc.
These keywords would be known only to a person who knows C++.
This approach is explained in Figure 11. Suppose Alice wants to enter a community, let us say
the community “C++ queries”.
She sends the request to join the community.
Alice is presented with a random set of keywords containing both keywords
that belong to the community and keywords that do not belong to the community.
About 30 keywords are shown to Alice, and out of these 30 she has to pick the
correct keywords.
Alice submits the chosen keywords to the system and the system automatically
checks for the correct keywords.
If the number of correct keywords from Alice exceeds a predefined threshold T, the
system replies to her with an ACCEPT message; otherwise a REJECT message is sent to her.
Figure 11: Our Approach for a User to join into the community
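The final check in these steps can be sketched as follows. The function name evaluate_selection is an assumption, as is the reading of "exceeds" as strictly greater than T. The keywords OOPS, PURE, MALLOC and NEW are taken from the example above; every other keyword is hypothetical.

```python
def evaluate_selection(selected, community_keywords, threshold):
    """Count the selected keywords that belong to the community and decide."""
    n_correct = len(set(selected) & set(community_keywords))
    decision = "ACCEPT" if n_correct > threshold else "REJECT"
    return decision, n_correct


# OOPS, PURE, MALLOC and NEW come from the example; the rest are hypothetical.
community = {"OOPS", "PURE", "MALLOC", "NEW", "TEMPLATE", "VIRTUAL", "NAMESPACE"}
alice_picks = ["OOPS", "NEW", "TEMPLATE", "VIRTUAL", "SERVLET", "APPLET", "PURE"]
decision, n_correct = evaluate_selection(alice_picks, community, threshold=3)
print(decision, n_correct)  # ACCEPT 5
```

Since the user only picks from words shown on screen, the syntactic mismatch of typed answers cannot occur, and the check reduces to a set intersection.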
3.8 Method
In this section we elaborate on the method that we used for our improved final approach. A
method to test the knowledge of the new user and his willingness to join the community is
described.
Social networking sites like Facebook, MySpace, LinkedIn, Orkut etc. have many communities.
Some of them are private, a few are secret and the rest are public. The privacy concerns of
the moderators of private and secret communities are high. The moderators of a public
community publish the posts of their community openly. To maintain integrity among the
members and the quality of the posts in a private community, the moderator has to
take extra care over the members who join his community.
Many times users join a community just because they are attracted by its exciting name.
Such users are neither interested in the activities of the community nor do
they know about the community. Our basic idea is to filter out the users who are not interested in
the community. This filtering is done at the time when the user tries to join the community.
The following are the two types of users who would send a request to join a community.
A user who is interested in the community.
A user who is not interested in the community.
The underlying assumption for this classification is that there is a considerable
difference in their levels of knowledge about the community. Our model works as a knowledge
based authentication scheme. The model of the knowledge based group is as follows.
3.8.1 Moderator’s side
The moderator creates the community by giving it a name, category and a brief description.
The moderator should create at least 25 keywords/tags that belong to the group.
The greater the number of keywords, the more difficult it is for an attacker to guess them.
He has to take care that the keywords are semantically disjoint.
Figure 12 is a screenshot of the steps described above.
Figure 12: Moderator creating the community and entering keywords
3.8.2 User’s side
The user who wants to join the community is given a grid of 30 keywords.
Out of these keywords only 7 belong to the community. The rest are random
keywords which belong to other communities of the same category.
The user has to choose the correct keywords.
Figure 13 is a screenshot of the steps described above.
Figure 13: User choosing the keywords corresponding to the community
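Assembling the grid shown to the user can be sketched as below. The grid size of 30 and the 7 community keywords come from the description above. The function name build_keyword_grid, the placeholder keyword names and the way decoys are pooled are assumptions made for illustration.

```python
import random


def build_keyword_grid(community_keywords, decoy_pool, grid_size=30, n_correct=7, rng=None):
    """Return (grid, correct): a shuffled grid and the set of keywords that belong."""
    rng = rng or random.Random()
    own = sorted(set(community_keywords))
    # Decoys must not coincide with any keyword of the target community.
    decoys = sorted(set(decoy_pool) - set(own))
    if len(own) < n_correct or len(decoys) < grid_size - n_correct:
        raise ValueError("not enough keywords to fill the grid")
    correct = rng.sample(own, n_correct)
    grid = correct + rng.sample(decoys, grid_size - n_correct)
    rng.shuffle(grid)  # hide the position of the correct keywords
    return grid, set(correct)


# Hypothetical data: 25 community keywords and 40 decoys from the same category.
community_keywords = [f"cpp_kw_{i}" for i in range(25)]
decoy_pool = [f"other_kw_{i}" for i in range(40)]
grid, correct = build_keyword_grid(community_keywords, decoy_pool, rng=random.Random(1))
print(len(grid), len(correct))  # 30 7
```

Sampling the 7 correct keywords from a larger moderator list means that different applicants can be shown different grids for the same community.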
3.8.3 System’s side
The threshold T of correct answers required for any community depends on
the degree of flexibility that the moderator desires.
We took the range between (TA-5)% and (TA+5)% as the secure zone for the threshold T,
where TA is the average score of all the users in the user study.
For a user who is not interested in the group it would be very difficult to enter the community
by clearing the knowledge based test. If the user is really interested in joining the community
then he has to learn the basics about the community and pass the test. Because the threshold T is
taken near the average score for a particular community, it adapts to changes in the scores
of the members of the community.
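The secure zone for T can be computed as in the sketch below. The function names, the inclusive treatment of the zone boundaries and the sample scores are assumptions; only the margin of 5 percentage points around the average score TA comes from the text.

```python
from statistics import mean


def threshold_zone(scores, margin=5.0):
    """Return (TA - margin, TA + margin), where TA is the mean score in percent."""
    ta = mean(scores)
    return ta - margin, ta + margin


def in_secure_zone(threshold, scores, margin=5.0):
    """Check whether a candidate threshold lies inside the zone (bounds included)."""
    low, high = threshold_zone(scores, margin)
    return low <= threshold <= high


# Hypothetical percentage scores, not data from the user study.
scores = [60.0, 70.0, 80.0]
low, high = threshold_zone(scores)
print(low, high)  # 65.0 75.0
```

Recomputing the zone as new scores arrive is what makes the threshold follow the community's average over time.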
3.9 Security Analysis
Communities are vulnerable to attacks by malicious users. Recently attackers have turned
to the communities of social networking sites as they are easy to get into and many of the
users are unconcerned about attacks in the communities. The majority of social networking
site users are unaware of the threats in the communities.
In social networking sites, a malicious user of the community is a user who tries to exploit the
resources of the community or disturbs the activities of the community. The disturbance can be
in the form of spam messages, propagation of malware, interruptions and random posts in the
discussions, sending false data etc. For example, all the community members might receive a mail
from the attacker saying “Hi, you have won a lottery of $1000. Please click the below link to
complete the procedure.”
Attackers can be broadly classified as:
1. Passive Attacker: A passive attacker is a stealthy attacker who joins the community and
steals the user’s data if the user’s profile is kept public. He can study the trends in the
communities by checking the kinds of discussions that are going on. It is difficult to
detect a passive attacker.
2. Active Attacker: An active attacker, after joining the community, disturbs the
community regularly by sending spam messages, posting malicious links and
uploading malicious applications.
For any kind of attacker the first thing that he needs is membership of the community.
Whenever an attacker wants to join the community, our approach requires him to choose the
correct keywords. The attacker can try to compromise the knowledge test of the community which is
built on the basis of our approach. Explained below are the different ways the attacker may try to
bypass the test:
The attacker may guess the correct keywords by elimination. He may not know the keywords of
the community he is joining but may know that the other randomly given keywords do not belong
to it.
This attack depends on the breadth of the attacker’s knowledge. The randomly given keywords are
limited to the same genre as the community being joined, which minimizes this attack considerably.
The purpose of introducing random keywords from the same genre is to confuse an
attacker who is trying to guess the correct keywords by elimination.
Another technique by which the attacker can try to crack the knowledge test of the
community is by selecting all the keywords given to him.
In our implementation we do not give the attacker a chance to choose all the keywords.
All users are asked to choose no more than the required number of keywords.
Another kind of attack is where the attacker tries to guess the keywords given the limitation
of choosing only a certain number of keywords.
This guessing attack is similar to a brute force attack. The probability that the attacker will
succeed in breaking the knowledge test is very low.
From the above discussion we can infer that bypassing the community knowledge test is difficult
and the communities based on our approach are safe from the attackers.
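As a rough illustration of why blind guessing rarely succeeds, the sketch below computes the chance that an attacker with no knowledge, picking 7 of the 30 keywords uniformly at random, hits at least k of the 7 correct ones. The uniform guessing model and the function name p_at_least are assumptions; the thesis does not quantify the attacker's effort.

```python
from math import comb


def p_at_least(k, grid_size=30, n_correct=7, picks=7):
    """Probability that a uniformly random choice of `picks` keywords contains at least k correct ones."""
    total = comb(grid_size, picks)
    favourable = sum(
        comb(n_correct, i) * comb(grid_size - n_correct, picks - i)
        for i in range(k, min(n_correct, picks) + 1)
    )
    return favourable / total


for k in (4, 5, 7):
    print(k, p_at_least(k))
```

Under this model there are 2,035,800 possible selections; about 3.3% of them contain 4 or more correct keywords, about 0.27% contain 5 or more, and exactly one contains all 7. An attacker who can eliminate some decoys does better than this, which is why the decoys are drawn from the same genre.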
3.10 User Study
We conducted a user study to validate our proposed method for communities.
For the user study we took 54 volunteers, of whom 20 were female and 34 male.
The educational and professional backgrounds of the participants were diverse. The ages
of these volunteers ranged from 19 years to 31 years. Monetary incentives were provided
for the pilot study to avoid the cold start. The aim of the user study was to statistically measure
the strength of our hypothesis. As a part of the user study we took feedback from the users about
their opinion of the system’s usability.
The procedure of the user study involved three sessions which each participant completed in a
single day. The gap between the sessions was one hour. The details of the sessions are
shown in Table 4.
Table 4: Details of the sessions
Session 1 Training and community setup
Session 2 Test
Session 3 Questionnaire
We first mailed the web based prototype to the participants. Detailed instructions on the usage of
the system were given in the mail. In the first session, a few randomly selected users (30
participants) created communities by adding a name, a brief description and a list of keywords.
Keywords were added according to each user’s own judgment. However, we filtered the keywords
to ensure that all of them were valid and made sense for the particular community.
In the second session, each participant took the knowledge test for two communities:
one in which he is interested and another in which he is not particularly interested.
Figure 14: Graph showing difference in average scores of Interested and not interested
users
The keywords of the community were presented to the user in the format of a quiz, so that the user
would also be interested to know how much he knows about a particular community. The
user could select only 7 keywords out of the given 30 keywords. Figure 14 shows the difference
between the interested and not interested users.
3.11 Results
All the participants successfully completed the above mentioned sessions. On average, users
interested in the community answered the knowledge test with 73.42% accuracy. The average
accuracy of the users who were not interested was 36.74%. There was a clear gap of 36.68
percentage points between the scores of the interested and not interested users. These results
are shown in Table 5.
Table 5: Scores of the knowledge test
User interested in community?    Knowledge test score
Yes                              73.42%
No                               36.74%
In addition, existing loyal users of the community were asked to enter more related keywords.
Responding to this request was optional. The reputation of a user was judged by the quality of his
posts and his contribution to the community.
If the user gave keywords, they were combined with the existing
keywords and added to the keyword database. In this way we also ask the user to
share knowledge about the group which others in the group might not have.
Figure 15: Graph showing the differences in the user's scores in knowledge test
By sharing knowledge and interacting regularly, the group stays active.
Every person who wants to join the group gets a random set of keywords, and the variety of these
sets grows as the number of keywords in the database increases.
3.12 System Usability Test
We performed a usability study based on the System Usability Scale. The System Usability Scale
(SUS) is a simple, ten-item scale giving a global view of subjective assessments of usability.
According to ISO 9241-11, the measures of usability should cover the
following:
Effectiveness.
Efficiency.
User Satisfaction.
The System Usability Scale is a Likert scale. A statement is made and the user is asked to
score the statement on a 5 point scale from 1 to 5, representing the
degree of agreement or disagreement of the user, from strongly disagree to strongly agree.
The scores used for the opinion of the
user are given below:
1. Strongly Disagree.
2. Disagree.
3. Neutral.
4. Agree.
5. Strongly Agree.
A set of 10 standard questions were asked to the users to evaluate the usability of the system in
the form of feedback. The usability score (US) of the system is calculated using the formula
below:
Where
The study was done on 54 users. We obtained a System Usability Score of 67.52 out of 100. We
consider this score good enough for first time users. Figure 16 shows a 3-point scale
plot of the opinions of the system given by the users.
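For reference, the standard SUS scoring rule can be sketched as follows. It is an assumption that the usability score above was computed with exactly this rule. In the standard rule, odd-numbered statements contribute (score - 1), even-numbered statements contribute (5 - score), and the sum is multiplied by 2.5 to give a value between 0 and 100. The function name sus_score is hypothetical.

```python
def sus_score(responses):
    """Standard SUS score (0 to 100) from ten responses, each an integer from 1 to 5."""
    if len(responses) != 10 or any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("expected ten responses, each an integer from 1 to 5")
    total = 0
    for item, r in enumerate(responses, start=1):
        # Odd items are positively worded, even items negatively worded.
        total += (r - 1) if item % 2 == 1 else (5 - r)
    return total * 2.5


print(sus_score([3] * 10))    # 50.0
print(sus_score([5, 1] * 5))  # 100.0
```

A per-user score is computed in this way and the scores are then averaged over all participants.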
Figure 16: Bar graph showing opinion of users on 3 point scale in System Usability Test
Figure 16 shows the answers to some of the questions that the users answered: how many
users agreed that Sammelan is easy to use, and how many are confident
using Sammelan, meaning that they feel more secure after using Sammelan.
Chapter 4
Conclusion
The popularity of social networking sites is increasing rapidly, and so are the privacy
concerns in social networking sites. The main aim of the thesis is to work on a
few of the many privacy issues that are present in social networking sites. In the thesis we
have addressed two main privacy problems that social networking sites currently face.
The first is eliminating strangers from the friends list; in other words,
only one’s close friends should be added to one’s friends list.
The second problem concerns communities and how community membership is given
to a user: how only interested people can be added
to the community and others kept out, since users who are not interested
would be inactive in the community.
The first problem is preventing strangers from adding themselves to the friends list, or
differentiating between strangers and friends. The most important part of a social
networking site is the friends we choose to add to our friends list. The privacy of our data
depends on what type of friends we choose. If we choose friends who are well known to
us then our personal data is safe, but if a stranger comes into the friends list then our personal data
can be misused. For this we investigated a novel idea of distinguishing friends from strangers using
challenge response schemes. In addition, we used the big five personality traits to make it
more difficult for a stranger to identify the user. We have shown the consequences of adding a
stranger without verification and the applicable countermeasures. Results of the user study
show that our proposed approach provides a viable option for privacy management in social
networks.
The second problem concerns the type of people who join communities in social
networking sites. Communities play a crucial part in the activities of social networking sites.
Since communities reflect the user’s personality and characterize him, security and
privacy in online communities are of prime importance. To solve the community membership
problem, we use a knowledge based test. We have shown that people who do not belong to the
community can be filtered effectively. Our results show that the knowledge test scores of
interested and non-interested users are far enough apart to enable us to differentiate them. In
future one may try to modify the approach so as to further reduce the effort of the moderator in
creating the communities.
One can extend our present work on the first problem by differentiating between
types of friends, for example school friends, college
friends, colony friends, colleagues etc.
For the second problem a possible extension would be to automate the process of keyword
generation in the community creation process. The effort required of the attacker to bypass the
test is not quantified in our work. It would be worthwhile to quantify it.
References
[1] von Ahn, L., Maurer, B., McMillen, C., Abraham, D., and Blum, M. reCAPTCHA: Human-Based
Character Recognition via Web Security Measures. Science, September 2008.
[2] Arguello, J., Butler, B. S., Joyce, E., Kraut, R., Ling, K. S., Rosé, C., and Wang, X. 2006. Talk to me:
foundations for successful individual-group interactions in online communities. In Proceedings of the
SIGCHI Conference on Human Factors in Computing Systems (Montréal, Québec, Canada, April 22 -
27, 2006). R. Grinter, T. Rodden, P. Aoki, E. Cutrell, R. Jeffries, and G. Olson, Eds. CHI '06. ACM,
New York, NY, 959-968. DOI= http://doi.acm.org/10.1145/1124772.1124916.
[3] Baron, R. A. and Byrne, D. Social Psychology. Eighth edition. Prentice-Hall India.
[4] Bilge, L., Strufe, T., Balzarotti, D., and Kirda, E. 2009. All your contacts are belong to us: automated
identity theft attacks on social networks. In Proceedings of the 18th international Conference on World
Wide Web (Madrid, Spain, April 20 - 24, 2009). WWW '09. ACM, New York, NY, 551-560. DOI=
http://doi.acm.org/10.1145/1526709.1526784.
[5] Bonhard, P., Harries, C., McCarthy, J., and Sasse, M. A. 2006. Accounting for taste: using profile
similarity to improve recommender systems. In Proceedings of the SIGCHI Conference on Human
Factors in Computing Systems (Montréal, Québec, Canada, April 22 - 27, 2006). R. Grinter, T.
Rodden, P. Aoki, E. Cutrell, R. Jeffries, and G. Olson, Eds. CHI '06. ACM, New York, NY, 1057-
1066. DOI= http://doi.acm.org/10.1145/1124772.1124930.
[6] Boyd, D. M. and Ellison, N. B. 2007. Social Network Sites: Definition, History, and Scholarship.
Journal of Computer-Mediated Communication 13, 1 (Oct. 2007), article 11.
[7] Butler, B. S. Membership Size, Communication Activity, and Sustainability: A Resource-Based
Model of Online Social Structures. Information Systems Research, Vol. 12, No. 4,
December 2001, pp. 346-362. DOI: 10.1287/isre.12.4.346.9703.
[8] Community Hacking - A New Threat to the Internet.
http://hubpages.com/hub/Community-Hacking-social-bookmarking-sites-online-communities-social-networks.
[9] Facebook's New Privacy Problem: Groups Created Under False Pretenses.
http://blog.searchenginewatch.com/081222-080553.
[10] Facebook Statistics. http://www.facebook.com/press/info.php?statistics.
[11] Felt, A., and Evans, D. Privacy Protection for Social Networking Platforms. In Proceedings of Web 2.0
Security and Privacy. Oakland, CA. 22 May 2008.
[12] Grimmelmann, J. T. Facebook and the social dynamics of privacy. Iowa Law Review 95(4), May
2009, to appear. http://ssrn.com/abstract=1262822 (last accessed 17 October 2008).
[13] Gross, R., Acquisti, A., and Heinz, H. J. 2005. Information revelation and privacy in online social
networks. In Proceedings of the 2005 ACM Workshop on Privacy in the Electronic Society
(Alexandria, VA, USA, November 07 - 07, 2005). WPES '05. ACM, New York, NY, 71-80. DOI=
http://doi.acm.org/10.1145/1102199.1102214.
[14] Jakobsson, M., Stolterman, E., Wetzel, S., and Yang, L. 2008. Love and authentication. In Proceeding
of the Twenty-Sixth Annual SIGCHI Conference on Human Factors in Computing Systems (Florence,
Italy, April 05 - 10, 2008). CHI '08. ACM, New York, NY, 197-200. DOI=
http://doi.acm.org/10.1145/1357054.1357087.
[15] Joyce, E. and Kraut, R. Predicting Continued Participation in Newsgroups. Journal of Computer-
Mediated Communication, 11(3):723-747, 2006.
[16] Just, M. 2004. On the Design of Challenge Question Systems. IEEE Security and Privacy 2, 5 (Sep.
2004), 32-39. DOI= http://dx.doi.org/10.1109/MSP.2004.80.
[17] Just, M. and Aspinall, D. 2009. Personal choice and challenge questions: a security and usability
assessment. In Proceedings of the 5th Symposium on Usable Privacy and Security (Mountain View,
California, July 15 - 17, 2009). SOUPS '09. ACM, New York, NY, 1-11. DOI=
http://doi.acm.org/10.1145/1572532.1572543.
[18] Lampe, C. A., Ellison, N., and Steinfield, C. 2007. A familiar face(book): profile elements as signals in
an online social network. In Proceedings of the SIGCHI Conference on Human Factors in Computing
Systems (San Jose, California, USA, April 28 - May 03, 2007). CHI '07. ACM, New York, NY, 435-
444. DOI= http://doi.acm.org/10.1145/1240624.1240695.
[19] Lucas, M. M. and Borisov, N. 2008. FlyByNight: mitigating the privacy risks of social networking. In
Proceedings of the 7th ACM Workshop on Privacy in the Electronic Society (Alexandria, Virginia,
USA, October 27 - 27, 2008). WPES '08. ACM, New York, NY, 1-8. DOI=
http://doi.acm.org/10.1145/1456403.1456405.
[20] Malware and Spam rise 70% on social networks.
http://www.sophos.com/pressoffice/news/articles/2010/02/security-report-2010.html.
[21] Maslow, A.H. A Theory of Human Motivation, Psychological Review 50(4) (1943):370-96.
[22] Nov, O. and Wattal, S. 2009. Social computing privacy concerns: antecedents and effects. In
Proceedings of the 27th international Conference on Human Factors in Computing Systems (Boston,
MA, USA, April 04 - 09, 2009). CHI '09. ACM, New York, NY, 333-336. DOI=
http://doi.acm.org/10.1145/1518701.1518754.
[23] John, O. P. and Srivastava, S. The Big-Five Trait Taxonomy: History, Measurement, and
Theoretical Perspectives. In Handbook of Personality: Theory and Research (1999), pp. 102-138.
University of California at Berkeley.
[24] People Don't Read Privacy Policies… But Want Them To Be Clearer.
http://www.techdirt.com/articles/20090216/1803373786.shtml.
[25] Rabkin, A. 2008. Personal knowledge questions for fallback authentication: security questions in the
era of Facebook. In Proceedings of the 4th Symposium on Usable Privacy and Security (Pittsburgh,
Pennsylvania, July 23 - 25, 2008). SOUPS '08, vol. 337. ACM, New York, NY, 13-23. DOI=
http://doi.acm.org/10.1145/1408664.1408667.
[26] Sasse, M. A., Brostoff, S., and Weirich, D. 2001. Transforming the 'Weakest Link' — a
Human/Computer Interaction Approach to Usable and Effective Security. BT Technology Journal 19,
3 (Jul. 2001), 122-131. DOI= http://dx.doi.org/10.1023/A:1011902718709.
[27] Social Networking "SPAM". http://en.wikipedia.org/wiki/Social_networking_spam.
[28] Social Networking Sites for Grown-ups.
http://www.symantec.com/norton/products/library/article.jsp?aid=social_networking_sites_for_grown_ups.
[29] The psychology of Facebook.
http://www.charlatan.ca/index.php?option=com_content&task=view&id=20014&Itemid=151.
[30] Toomim, M., Zhang, X., Fogarty, J., and Landay, J. A. 2008. Access control by testing for shared
knowledge. In Proceeding of the Twenty-Sixth Annual SIGCHI Conference on Human Factors in
Computing Systems (Florence, Italy, April 05 - 10, 2008). CHI '08. ACM, New York, NY, 193-196.
DOI= http://doi.acm.org/10.1145/1357054.1357086.
[31] Vaidya, J. and Clifton, C. 2004. Privacy-Preserving Data Mining: Why, How, and When. IEEE
Security and Privacy 2, 6 (Nov. 2004), 19-27. DOI= http://dx.doi.org/10.1109/MSP.2004.108.
[32] Why social networking spam reaps more rewards than e-mail.
http://www.allspammedup.com/2009/11/why-social-networking-spam-reaps-more-rewards-than-email/.
[33] Yardi, S., Feamster, N., and Bruckman, A. 2008. Photo-based authentication using social networks. In
Proceedings of the First Workshop on online Social Networks (Seattle, WA, USA, August 18 - 18,
2008). WOSN '08. ACM, New York, NY, 55-60. DOI= http://doi.acm.org/10.1145/1397735.1397748.
[34] Yu, H., Gibbons, P. B., Kaminsky, M., and Xiao, F. 2008. SybilLimit: A Near-Optimal Social Network
Defense against Sybil Attacks. In Proceedings of the 2008 IEEE Symposium on Security and Privacy
(May 18 - 21, 2008). SP. IEEE Computer Society, Washington, DC, 3-17. DOI=
http://dx.doi.org/10.1109/SP.2008.13.
[35] Yu, H., Kaminsky, M., Gibbons, P. B., and Flaxman, A. 2006. SybilGuard: defending against sybil
attacks via social networks. SIGCOMM Comput. Commun. Rev. 36, 4 (Aug. 2006), 267-278. DOI=
http://doi.acm.org/10.1145/1151659.1159945.