micro-interactions and macro-observations
Web Science & Technologies
University of Koblenz ▪ Landau, Germany
Micro-interactions and Macro-observations
Klaas Dellschaft
Klaas [email protected]
Introduction to Web Science
WeST
Example: Naming Game (I)
Micro-interactions … Mother talking to her child
http://www.youtube.com/watch?v=kiGduwJK6SQ
Macro-observations … Child learns to speak
Example: Naming Game (II)
[Figure: User 1, User 2 and User 3 name objects with the words "Kuh" and "Cow"; "???" marks a hearer who does not know the spoken word]
User roles: Speaker / Hearer
• Speaker: Speaks a word
• Hearer: Tries to guess which object was meant
Successful round: Hearer makes a correct guess
Objective: Maximize the number of successful rounds
http://talking-heads.csl.sony.fr
Example: Naming Game (III)
Micro-level interactions … Speaker / hearer
Round successful?
• Yes: Reinforce the used word
• No: Learn new word
Macro-level observations … Stable vocabulary emerges over time
For each object / attribute, only one word survives
Naming game explains how languages may emerge
Why are there many different languages in the world?
Naming game ignores geographic distribution of agents
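The micro-level rules of the naming game can be sketched as a small simulation. This is a minimal sketch for a single object, not the setup used in the lecture: the function name `naming_game`, the number of agents, and the rule that a successful round makes both agents drop all competing words are assumptions made for illustration.

```python
import random

def naming_game(num_agents=10, rounds=20000, seed=0):
    """Minimal naming game for one object (illustrative assumptions throughout)."""
    rng = random.Random(seed)
    inventories = [set() for _ in range(num_agents)]  # words known by each agent
    next_word = 0   # counter used to invent fresh words
    successes = 0
    for _ in range(rounds):
        speaker, hearer = rng.sample(range(num_agents), 2)
        if not inventories[speaker]:
            # Assumption: a speaker without any word invents a new one.
            inventories[speaker].add(f"w{next_word}")
            next_word += 1
        word = rng.choice(sorted(inventories[speaker]))
        if word in inventories[hearer]:
            # Successful round: reinforce the used word.
            # Assumption: reinforcing means both agents forget competing words.
            inventories[speaker] = {word}
            inventories[hearer] = {word}
            successes += 1
        else:
            # Failed round: the hearer learns the new word.
            inventories[hearer].add(word)
    return inventories, successes

inventories, successes = naming_game()
surviving = set().union(*inventories)
print("surviving words:", surviving, "successful rounds:", successes)
```

With these assumptions, a state in which all agents share one word is stable, which corresponds to the macro-level observation that only one word survives.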
Model-based research
Modeling micro-interactions
Define rules for interactions between agents
Use rules for simulating the dynamics in a system
Objective: Explain the emergence of macro-observations
Use cases:
Biology: Spreading of diseases in a population
Sociology: Emergence of different cultural habits
Web Science:
• Spreading of memes / hashtags in Twitter
• Emergence of a collaborative vocabulary in tagging systems
• …
Basic Models (I)
Preferential Attachment (Polya Urn Model)
There are n balls with different colors in an urn
In each step:
• Randomly draw a ball
• Put it back together with a second ball of the same color
Fixed number of colors
Colors are distributed according to a power law
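The urn process described here can be sketched in a few lines. The function name `polya_urn`, the initial colors, and the number of steps are assumptions chosen for illustration.

```python
import random
from collections import Counter

def polya_urn(initial_colors, steps, seed=0):
    """Simulate a Polya urn and return the number of balls per color."""
    rng = random.Random(seed)
    urn = list(initial_colors)
    for _ in range(steps):
        ball = rng.choice(urn)  # randomly draw a ball
        urn.append(ball)        # put it back together with a second ball of the same color
    return Counter(urn)

counts = polya_urn(["red", "green", "blue"], steps=1000)
print(counts.most_common())
```

Because frequent colors are drawn more often, they grow faster; running the sketch with different seeds shows that the final proportions vary from run to run.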
Basic Models (II)
Linear Preferential Attachment (Simon Model)
Like the Polya Urn Model. Additionally in each step:
• Instead of drawing a ball, insert with low probability p a ball with a new color
Linearly increasing number of colors
Colors are distributed according to a power law
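A minimal sketch of the Simon model extends the urn simulation by one branch. The function name `simon_model`, the default p = 0.05, and the choice to start with a single ball are assumptions for illustration.

```python
import random
from collections import Counter

def simon_model(steps, p=0.05, seed=0):
    """Simulate the Simon model and return the number of balls per color."""
    rng = random.Random(seed)
    urn = [0]        # assumption: start with one ball of color 0
    next_color = 1
    for _ in range(steps):
        if rng.random() < p:
            # With probability p, insert a ball with a new color.
            urn.append(next_color)
            next_color += 1
        else:
            # Otherwise behave like the Polya urn: copy a randomly drawn ball.
            urn.append(rng.choice(urn))
    return Counter(urn)

counts = simon_model(steps=5000)
print(len(counts), "colors; most frequent:", counts.most_common(3))
```

On average p new colors are added per step, so the expected number of colors grows linearly with the number of steps.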
Basic Models (III)
Information Cascades
Users decide rationally between alternatives
• Example: Accept (A) / Reject (R)
Each user gets private information
• When the correct decision is to accept, the user is more likely to get the information to accept (i.e. P(A) > 0.5)
Each user sees the decisions of the previous users
Rational choice:
• Adopt the choice supported by the majority of the previous users' decisions and the private information
The choice relies only on the decisions of previous users if the difference in votes between A and R increases beyond 2
All subsequent users then adopt the same choice (cascade)
The cascaded decision is not necessarily the correct one!
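The cascade mechanism can be illustrated with a simple vote-counting sketch. It assumes that each user counts the previous decisions plus the own private signal as votes and breaks ties in favor of the own signal; the names `decide` and `simulate` and the signal accuracy q = 0.7 are hypothetical. Under this particular tie-breaking assumption, the private signal can no longer change the outcome once one alternative leads by two previous decisions.

```python
import random

def decide(previous, signal):
    """Majority vote over previous decisions plus the private signal."""
    votes_a = previous.count("A") + (signal == "A")
    votes_r = previous.count("R") + (signal == "R")
    if votes_a == votes_r:
        return signal  # assumption: ties are broken by the private signal
    return "A" if votes_a > votes_r else "R"

def simulate(num_users, correct="A", q=0.7, seed=0):
    """Each user receives the correct signal with probability q > 0.5."""
    rng = random.Random(seed)
    wrong = "R" if correct == "A" else "A"
    decisions = []
    for _ in range(num_users):
        signal = correct if rng.random() < q else wrong
        decisions.append(decide(decisions, signal))
    return decisions

decisions = simulate(20)
print("".join(decisions))
```

If the first two users happen to receive the wrong signal, every later user follows them, which shows how an incorrect decision can be cascaded.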
Method of Model-based Research
[Diagram: micro-interactions lead to macro-observations]
Model: Stochastic Model (assumed rules of interaction) → Simulated Properties
Reality: Unknown Model (unknown rules of interaction) → Observed Properties
Compare: Simulated Properties with Observed Properties
Use Case: Spreading of Memes in Twitter (I)
Meme: Topic / idea that is discussed in Twitter
Observables:
• Lifetime of tweets in Twitter (in hours)
• Number of people contributing to a meme (per day)
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3315179/
Use Case: Spreading of Memes (II)
Assumed rules of interaction:
Each user can see memes posted by his friends
Each user remembers his own previously tweeted memes
When tweeting, a user either …
• … invents a new meme, or …
• … randomly selects a meme posted by his friends, or …
• … randomly picks up one of his previously tweeted memes
Users only remember the last n tweets of their friends and/or of their own
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3315179/
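The assumed rules of interaction listed above can be sketched as a small simulation. This is an illustrative sketch, not the model from the cited article: the function name `simulate_memes`, the probabilities `p_new` and `p_friend`, the uniform choice of the tweeting user, and the example friendship network are all assumptions.

```python
import random
from collections import Counter, deque

def simulate_memes(friends, steps, memory=5, p_new=0.1, p_friend=0.6, seed=0):
    """friends maps each user to the list of users whose tweets he can see."""
    rng = random.Random(seed)
    own = {u: deque(maxlen=memory) for u in friends}   # last n own memes
    seen = {u: deque(maxlen=memory) for u in friends}  # last n memes posted by friends
    followers = {u: [v for v in friends if u in friends[v]] for u in friends}
    users = sorted(friends)
    tweets = Counter()  # number of tweets per meme
    next_meme = 0
    for _ in range(steps):
        user = rng.choice(users)
        r = rng.random()
        if r < p_new or (not own[user] and not seen[user]):
            meme = next_meme                      # invent a new meme
            next_meme += 1
        elif (r < p_new + p_friend and seen[user]) or not own[user]:
            meme = rng.choice(list(seen[user]))   # select a meme posted by a friend
        else:
            meme = rng.choice(list(own[user]))    # pick up a previously tweeted meme
        tweets[meme] += 1
        own[user].append(meme)
        for follower in followers[user]:
            seen[follower].append(meme)
    return tweets

network = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}  # hypothetical friendship network
tweets = simulate_memes(network, steps=200)
print(tweets.most_common(5))
```

The resulting tweet counts per meme are one example of a simulated property that could be compared with the observed one.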
Use Case: Spreading of Memes (III)
Comparing simulation and reality:
Empirical observations are better reproduced when assuming a social network between users
Structure of the friendship network influences meme spreading
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3315179/
Details of Model-based Research
How to represent observables?
• Distribution functions
How to compare simulation and reality?
• Analytical evaluation
• Visual comparison
• Goodness-of-fit tests
How to decide between competing models?
Method of Model-based Research
[Diagram repeated: the simulated properties of a stochastic model with assumed rules of interaction are compared with the observed properties of reality, whose rules of interaction are unknown]
Use Case: Dynamics in Tagging Systems
Do the users agree on how to describe a resource?
How do users influence each other in tagging systems?
Folksonomies
Vertices: Users, tags, resources
Hyperedges: Tag assignments (user × tag × resource)
Postings:
• Tag assignments of a user to a single resource
• Can be ordered according to their time-stamp
Co-occurrence Streams
Co-occurrence Streams:
• All tags co-occurring with a given tag in a posting
• Ordered by posting time

Tag    |Y|         |U|       |T|       |R|
ajax   2,949,614   88,526    41,898    71,525
blog   6,098,471   158,578   186,043   557,017
xml    974,866     44,326    31,998    61,843

Example tag assignments for ‘ajax’:
{mackz, r1, {ajax, javascript}, 13:25}
{klaasd, r2, {ajax, rss, web2.0}, 13:26}
{mackz, r2, {ajax, php, javascript}, 13:27}
Resulting co-occurrence stream (ordered by time):
javascript | rss web2.0 | php javascript
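The construction of the co-occurrence stream for ‘ajax’ can be reproduced with a short sketch. The tuple layout of a posting and the function name `cooccurrence_stream` are assumptions; the data is the example from this slide.

```python
# Each posting: (user, resource, tags, time), taken from the example above.
postings = [
    ("mackz", "r1", ["ajax", "javascript"], "13:25"),
    ("klaasd", "r2", ["ajax", "rss", "web2.0"], "13:26"),
    ("mackz", "r2", ["ajax", "php", "javascript"], "13:27"),
]

def cooccurrence_stream(postings, tag):
    """Collect all tags co-occurring with `tag`, ordered by posting time."""
    stream = []
    for _user, _resource, tags, _time in sorted(postings, key=lambda p: p[3]):
        if tag in tags:
            stream.extend(t for t in tags if t != tag)
    return stream

stream = cooccurrence_stream(postings, "ajax")
print(stream)  # ['javascript', 'rss', 'web2.0', 'php', 'javascript']
```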
Co-occurrence Streams – Tag Frequencies
Zipf Plot of the tag frequencies
Probability Distributions
Measuring the probability of a certain event
Examples:
• Rolling a die – How often do we get the 1, 2, 3, …?
• Questionnaires – How often do people check the 1, 2, … on a scale from 1 to 10?
• Tagging – How often is the tag ‘ajax’ used?
• Tagging – How many of the used tags are used 1 time, 2 times, …?
Different types of measurement scales
Probability Distributions – Measurement Scales (I)
Nominal scale
Ordinal scale
Interval scale
Ratio scale
Source: http://de.wikipedia.org/wiki/Skalenniveau
Probability Distributions – Measurement Scales (II)
[Figure: two bar charts. Nominal scale: probabilities (y-axis from 0 to 0.35) of the tags blog, health, food, nutrition, eating, cooking. Ordinal scale / interval scale: probability of tags with frequency x (y-axis from 0 to 0.6) over the tag frequencies 1 to 8 (x-axis)]
Probability Distributions – Representations (I)
Probability Distribution Function (PDF):
• P(X = x): Probability of observing an event x
Cumulative Distribution Function (CDF):
• P(X ≤ x): Probability of observing an event whose value is ≤ x
• Requires at least ordinal measurement scale
Example: Normal distribution
[Figure: CDF of the normal distribution]
Source: http://en.wikipedia.org/wiki/Normal_distribution
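Both representations can be estimated from observed samples. The following sketch computes an empirical PDF and CDF for a list of discrete observations; the function names and the sample values are hypothetical.

```python
from collections import Counter

def empirical_pdf(samples):
    """Relative frequency of each observed value, i.e. an estimate of P(X = x)."""
    n = len(samples)
    return {x: c / n for x, c in sorted(Counter(samples).items())}

def empirical_cdf(samples):
    """Running sum of the empirical PDF, i.e. an estimate of P(X <= x)."""
    total = 0.0
    cdf = {}
    for x, p in empirical_pdf(samples).items():
        total += p
        cdf[x] = total
    return cdf

rolls = [1, 2, 2, 3, 3, 3, 6, 6]  # hypothetical observations
pdf = empirical_pdf(rolls)
cdf = empirical_cdf(rolls)
print(pdf)
print(cdf)
```

The CDF needs the values to be sorted, which is why it requires at least an ordinal measurement scale.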
Probability Distributions – Representations (II)
Zipf plot
Representation for distributions with nominal scale
Assign ranks to the different categories
• Rank 1: Most often occurring category
x-axis: Categories ordered by their ranks
y-axis: Probability of category with rank x
Often used for representing word frequencies in texts
Zipf's law:
• Describes the relation between the rank k and the frequency f(k) of a word in natural language texts
• $f(k; s) \propto k^{-s}, \quad s > 0$
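The data behind a Zipf plot, and a rough estimate of the exponent s, can be sketched as follows. The function names and the tag counts are hypothetical; the counts were constructed to follow a power law with s = 2. Fitting a line by least squares in log-log space is only a simple heuristic chosen for this sketch, not a rigorous estimator.

```python
import math
from collections import Counter

def zipf_points(counts):
    """Return (rank, probability) pairs; rank 1 is the most frequent category."""
    total = sum(counts.values())
    ordered = sorted(counts.values(), reverse=True)
    return [(rank, c / total) for rank, c in enumerate(ordered, start=1)]

def loglog_slope(points):
    """Least-squares slope of log(probability) against log(rank)."""
    xs = [math.log(rank) for rank, _ in points]
    ys = [math.log(prob) for _, prob in points]
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    num = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    den = sum((x - mean_x) ** 2 for x in xs)
    return num / den

# Hypothetical counts with count(k) = 3600 / k**2 for the ranks k = 1..6.
counts = Counter({"t1": 3600, "t2": 900, "t3": 400, "t4": 225, "t5": 144, "t6": 100})
points = zipf_points(counts)
slope = loglog_slope(points)
print("estimated s:", -slope)
```

A power law appears as a straight line with slope -s when both axes are scaled logarithmically.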
Co-occurrence Streams – Tag Frequencies
Tag frequencies approximately follow Zipf’s law (straight line in a Zipf plot with logarithmically scaled axes)
Method of Model-based Research
[Diagram repeated: the simulated properties of a stochastic model with assumed rules of interaction are compared with the observed properties of reality, whose rules of interaction are unknown]
Comparing Reality and Model (I)
Visual comparison:
• Plot the real observables and the simulated results
• The closer together the plots, the better the model
Advantage: Easy to understand and to implement
Disadvantage: Highly subjective (i.e. not a scientific method)
Comparing Model and Reality (II)
Analytical evaluation:
Use mathematical methods for analyzing the model
Prove that the simulation results have certain properties
Example: Preferential attachment
• Frequency distribution of colors is a power law
• Color frequencies tend to a random limit
Advantages:
• Very deep understanding of the mechanisms
• Mathematical dependencies between model parameters and properties of the simulation results
Disadvantages:
• Analyzed models have to be “mathematically tractable”
• Does not show that simulated properties can also be observed in reality
Comparing Model and Reality (III)
Goodness-of-fit tests:
First step:
• Define an objective measure of distance between simulated and observed property
• Relative measure of goodness-of-fit
• Applicable for any property
Second step:
• Compute whether simulated and observed property are statistically indistinguishable
• Absolute measure of goodness-of-fit
• Only applicable for properties that can be represented as probability distributions
Kolmogorov-Smirnov Test (Example)
Goodness-of-fit test for distributions with at least ordinal measurement scale
Maximal distance between simulation and observation:
$D = \max_x |S_1(x) - S_2(x)|$
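The distance D can be computed directly from two samples. The sketch below assumes that S_1 and S_2 are the empirical cumulative distribution functions of the simulated and the observed sample; the function name `ks_distance` and the sample values are hypothetical. It covers only the distance (the first step of a goodness-of-fit test), not the significance computation.

```python
from bisect import bisect_right

def ks_distance(sample1, sample2):
    """Maximal distance between the empirical CDFs of two samples."""
    s1, s2 = sorted(sample1), sorted(sample2)

    def ecdf(sorted_sample, x):
        # Fraction of values that are <= x.
        return bisect_right(sorted_sample, x) / len(sorted_sample)

    # The empirical CDFs only change at observed values,
    # so it is enough to evaluate them there.
    return max(abs(ecdf(s1, x) - ecdf(s2, x)) for x in s1 + s2)

simulated = [1, 2, 3, 4]  # hypothetical simulated values
observed = [3, 4, 5, 6]   # hypothetical observed values
d = ks_distance(simulated, observed)
print(d)  # 0.5
```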
Details of Model-based Research
How to represent observables?
• Distribution functions
How to compare simulation and reality?
• Analytical evaluation
• Visual comparison
• Goodness-of-fit tests
How to decide between competing models?
• Friday!