Summer 2019
Web Science & Social Networks
Planetary-Scale Views on a Large Instant-Messaging Network
Professor: Prof. Dr. Wolfgang Nejdl
Mentor: Philipp Kemkes
Student: Masih Ghaderi
Questions:
Is there a specific pattern in human communication?
Which factors influence the way humans communicate?
Are there any rules in social interaction?
and
How can existing knowledge in Web science help us find the answers?
Web Science
Introduction to social networks
To model the structure of human social connections:
Node: an individual
Edge: a connection between two individuals
Examples:
Nodes: 1, Edges: 0
Nodes: 3, Edges: 2
Nodes: 16, Edges: 15 (nodes - 1)
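The node and edge counts above can be reproduced with a small adjacency-set graph. This is only a minimal sketch: the class name SocialGraph and the chain-shaped example are assumptions made for illustration, and the slide does not say which shape its example networks have. Any connected network without cycles has exactly nodes - 1 edges.

```python
class SocialGraph:
    """Undirected graph stored as adjacency sets (hypothetical helper)."""

    def __init__(self):
        self.neighbors = {}

    def add_node(self, node):
        self.neighbors.setdefault(node, set())

    def add_edge(self, a, b):
        # Assumes a != b; self-loops are not handled in this sketch.
        self.add_node(a)
        self.add_node(b)
        self.neighbors[a].add(b)
        self.neighbors[b].add(a)

    def node_count(self):
        return len(self.neighbors)

    def edge_count(self):
        # Each undirected edge appears in two adjacency sets.
        return sum(len(n) for n in self.neighbors.values()) // 2


def chain(n):
    """Build a chain of n individuals, each linked to the next one."""
    graph = SocialGraph()
    graph.add_node(0)
    for i in range(1, n):
        graph.add_edge(i - 1, i)
    return graph


for size in (1, 3, 16):
    g = chain(size)
    print(g.node_count(), g.edge_count())
```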
Approach
A statistical method that examines the characteristics and patterns of large numbers of people
The actions and characteristics of individuals are not considered
Field of experiment
A communication graph
One month of high-level communication activity on the Microsoft Messenger instant-messaging (IM) network
180 million nodes
1.3 billion undirected edges
Structure and communication are analyzed by user demographic attributes, such as gender, age, language, and location
Dataset structure
Each user is represented by a node
An edge is placed between users who had at least one conversation during the
month of observation
A dataset of 30 billion conversations
240 million distinct users over one month
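The edge rule above (one undirected edge per pair of users with at least one conversation) can be sketched as follows. The input format, a list of (user_a, user_b) pairs with one pair per conversation, and the function name are assumptions for illustration; the actual log format is not given in the slides.

```python
from collections import defaultdict


def build_communication_graph(conversations):
    """Return (nodes, edges, counts) from (user_a, user_b) conversation pairs.

    An undirected edge exists for every pair of distinct users with at
    least one conversation; counts holds the conversations per edge.
    """
    counts = defaultdict(int)
    for a, b in conversations:
        if a == b:
            continue  # ignore degenerate records in this sketch
        edge = tuple(sorted((a, b)))  # (a, b) and (b, a) are the same edge
        counts[edge] += 1
    edges = set(counts)
    nodes = {user for edge in edges for user in edge}
    return nodes, edges, dict(counts)


# Hypothetical sample: three conversations among three users.
sample = [("ann", "bob"), ("bob", "ann"), ("bob", "cat")]
nodes, edges, counts = build_communication_graph(sample)
print(len(nodes), len(edges), counts)
```

Note that only users who appear in at least one conversation become nodes, which is consistent with the graph having fewer nodes than the number of users who logged in.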
Users’ relationships
Strong homophily among users: people have more conversations, and conversations of longer duration, with people who are similar to themselves.
Data structure
Split into three parts:
Presence data: login, logout, first ever login, add, remove and block a buddy, invite new user, change of status
Communication data: session id, user id, time joined the session, time left the session, messages sent/received
Demographic information: age, gender, location (country, ZIP), language, and IP address
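One way to picture the three parts is as three record types. The sketch below is an assumption about layout, not the dataset's real schema: all class and field names are hypothetical, and the user id and timestamp on presence events are added only so the record is usable.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PresenceEvent:
    user_id: str      # assumed field, not listed on the slide
    event: str        # e.g. "login", "logout", "add_buddy", "status_change"
    timestamp: float  # assumed field, seconds since some epoch


@dataclass
class SessionParticipation:
    session_id: str
    user_id: str
    time_joined: float
    time_left: float
    messages_sent: int
    messages_received: int

    def duration(self) -> float:
        """Time the user spent in the session."""
        return self.time_left - self.time_joined


@dataclass
class UserProfile:
    user_id: str
    age: Optional[int] = None        # self-reported, may be missing
    gender: Optional[str] = None     # self-reported, may be missing
    country: Optional[str] = None
    zip_code: Optional[str] = None
    language: Optional[str] = None
    ip_address: Optional[str] = None


part = SessionParticipation("s1", "u1", 10.0, 70.0, 3, 2)
print(part.duration())
```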
Usage & population statistics
Levels of activity
• 242,720,596 users logged into Messenger
• 179,792,538 of them engaged in conversations
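As a quick check of the two activity figures, the share of logged-in users who actually conversed can be computed directly. The variable names are illustrative only.

```python
logged_in = 242_720_596   # users who logged into Messenger
conversed = 179_792_538   # users who engaged in conversations

share = conversed / logged_in
inactive = logged_in - conversed

print(f"{share:.1%} of logged-in users conversed; {inactive:,} did not")
```

Roughly three out of four users who logged in during the month took part in at least one conversation.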
Demographic characteristics
• Users with reported ages in the 15–35 range are strongly overrepresented
Communication demographics
Geography, location, age, and gender influence observed communication
patterns
Communication by age
• Most conversations occur between people aged 10 to 20
• People tend to talk to people of similar age (especially in the age groups between 10 and 30 years)
Older people tend to have
longer conversations (b)
Older people exchange
more messages (c)
Younger people have
faster-paced dialogs (d)
Communication by gender
The Messenger population consists of 100 million males and 80 million females, by self-report
Conversations occur:
50% between a male and a female,
40% between users of the same gender
Average conversation length: male–male 4 minutes, female–female 4.5 minutes, female–male 5 minutes
World geography and communication
Messenger usage is very dense in North America, Europe, and Japan, as well as in coastal regions around the world
Communication among countries
• Historical and ethnic factors (language, culture, etc.) play a significant role
• The United States and Spain appear to serve as hubs
Communication and geographical distance
• The number of conversations decreases with distance
• People tend to communicate with others within their local context and environment
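A relationship between conversation count and distance can be examined by binning conversations by the great-circle distance between the two users. The sketch below uses the standard haversine formula; the input format (one pair of latitude/longitude points per conversation), the function names, and the 1000 km bin width are assumptions, and the slides do not say how the study measured distance.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlambda = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlambda / 2) ** 2)
    # Clamp to guard against rounding slightly above 1.
    return 2 * EARTH_RADIUS_KM * math.asin(min(1.0, math.sqrt(a)))


def conversations_by_distance(pairs, bin_km=1000):
    """Count conversations per distance bin.

    pairs: iterable of ((lat, lon), (lat, lon)), one entry per conversation.
    Returns {bin_start_km: count}, ordered by distance.
    """
    histogram = {}
    for (lat1, lon1), (lat2, lon2) in pairs:
        distance = haversine_km(lat1, lon1, lat2, lon2)
        bin_start = int(distance // bin_km) * bin_km
        histogram[bin_start] = histogram.get(bin_start, 0) + 1
    return dict(sorted(histogram.items()))
```

If the observation on the slide holds, the counts in such a histogram would fall as the bin distance grows.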
Strength of the ties
Removing a few high degree nodes can have a dramatic influence on the
connectivity of a network
Removing users with long conversations breaks the connectivity of the network more effectively than random node deletion
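The effect of node removal on connectivity can be measured as the size of the largest connected component that remains. The following is a minimal sketch with hypothetical function names; it ranks nodes by degree only, whereas the study also ranked users by conversation length, which would need per-user duration data.

```python
from collections import deque


def largest_component_size(adjacency, removed=frozenset()):
    """Size of the largest connected component once `removed` nodes are deleted."""
    seen = set(removed)  # treat removed nodes as already visited
    best = 0
    for start in adjacency:
        if start in seen:
            continue
        seen.add(start)
        queue = deque([start])
        size = 0
        while queue:
            node = queue.popleft()
            size += 1
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    queue.append(neighbor)
        best = max(best, size)
    return best


def highest_degree_nodes(adjacency, k):
    """The k nodes with the most neighbors."""
    return sorted(adjacency, key=lambda n: len(adjacency[n]), reverse=True)[:k]


# Hypothetical star network: user 0 is connected to five others.
star = {0: {1, 2, 3, 4, 5}, 1: {0}, 2: {0}, 3: {0}, 4: {0}, 5: {0}}
hub = highest_degree_nodes(star, 1)
print(largest_component_size(star))               # intact network
print(largest_component_size(star, set(hub)))     # hub removed
print(largest_component_size(star, {1}))          # one low-degree node removed
```

In this toy network, deleting the single hub isolates every remaining user, while deleting a low-degree node barely changes connectivity.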
6 degrees of separation
The planetary-scale social network validates the well-known “6 degrees of separation”
Earlier study
• A sample of 64 people
• The number of hops for a letter to travel
from Nebraska to Boston
• The average number was 6.2
New study
• Randomly sampled 1000 nodes
• Calculated for each sampled node the shortest paths to all other nodes
• The average path length is 6.6
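The sampling procedure described for the new study can be sketched with breadth-first search, which yields shortest path lengths in an unweighted graph. The function names, the fixed random seed, and the toy graph are assumptions; this is an illustration of the idea, not the study's implementation, which had to run on 180 million nodes.

```python
import random
from collections import deque


def bfs_distances(adjacency, source):
    """Shortest hop counts from source to every reachable node."""
    distance = {source: 0}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for neighbor in adjacency[node]:
            if neighbor not in distance:
                distance[neighbor] = distance[node] + 1
                queue.append(neighbor)
    return distance


def sampled_average_path_length(adjacency, sample_size, seed=0):
    """Average shortest path length from randomly sampled source nodes.

    Unreachable nodes are skipped, so the average covers connected pairs only.
    """
    rng = random.Random(seed)
    nodes = list(adjacency)
    sources = rng.sample(nodes, min(sample_size, len(nodes)))
    total = 0
    pairs = 0
    for source in sources:
        for target, hops in bfs_distances(adjacency, source).items():
            if target != source:
                total += hops
                pairs += 1
    return total / pairs if pairs else 0.0


# Hypothetical four-person chain: 0 - 1 - 2 - 3
path = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
print(sampled_average_path_length(path, sample_size=4))
```

Sampling source nodes keeps the cost at one breadth-first search per sampled node instead of one per node in the whole network.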
Conclusion
The largest scientific study in this field to date (important scientifically rather than technically)
It covers the communication patterns of all people using a popular IM system
Messenger data gives us a unique opportunity to study distances in the social
network
The core dataset contains more than 30 billion conversations among 240 million
people
The planetary-scale network allowed us to explore dependencies among user
demographics, communication characteristics, and network structure
Cross-gender conversations are both more frequent and of longer duration than
conversations with users of the same reported gender
Validated the earlier research that found “6 degrees of separation” among people
Thank you