how to analyse social network? : part 2 social networks can be represented by complex networks

Post on 17-Jan-2016


How to Analyse Social Network? : Part 2

Social networks can be represented by complex networks.

Reviews

A social network is a social structure made up of individuals (or organizations) called “nodes”, which are connected by one or more types of relationships, represented by “links”. Examples of such relationships include friendship, kinship, and common interest.

Graph-based structures are very complex.

2

Source: http://followingfactory.com/

Introduction

Various natural and social systems can be described as complex networks, for example social systems, biological systems, and communication systems.

3

When a network is presented as a graph, vertices (nodes) represent individuals or organizations, and edges (links) represent the interactions among them.

Source: http://www.fmsasg.com/SocialNetworkAnalysis

Introduction

Why is network anatomy so important to characterize? Because structure always affects function.

For instance, the topology of social networks affects the spread of information.

4

Introduction

Network Models

Regular networks: chains, grids, lattices and fully connected graphs

Random network model by Erdős and Rényi: ER model

Small-world phenomenon by Watts and Strogatz: WS model

Scale-free network model by Barabási and Albert: BA model

The evolution mechanisms of network structures are of great interest to many researchers, in the physics community as well as in engineering.

5

Types of Network Models

Regular Networks

1. Ring of ten nodes connected to their nearest neighbours.

2. Fully connected network of ten nodes

6

Types of Network Models

Random Networks

A random network is built by placing n nodes on a plane and joining pairs of them together at random until m links are used.

Nodes may be chosen more than once, or not at all.

7

Types of Network Models

Random Networks

Erdős and Rényi studied how the expected topology of this random graph changes as a function of m.

When m is small, the graph is likely to be fragmented into many small clusters of nodes, called components.

As m increases, the components grow, at first by linking to isolated nodes and later by coalescing with other components.

8

Types of Network Models

Random Networks

A phase transition occurs at m = n/2, where many clusters crosslink spontaneously to form a single giant component.

For m > n/2, this giant component contains on the order of n nodes (its size scales linearly with n), while its closest rival contains only about log n nodes.

All nodes in the giant component are connected to each other by short paths: the maximum number of 'degrees of separation' between any two nodes grows slowly, like log n
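The growth of the giant component can be checked numerically. The following is a minimal sketch, not part of the original slides: it joins m random pairs among n nodes and tracks components with a union-find structure. The function name `largest_component_size` and the chosen values of n and m are illustrative assumptions.

```python
import random


def largest_component_size(n, m, seed=0):
    """Join m random pairs among n nodes and return the largest component size."""
    rng = random.Random(seed)
    parent = list(range(n))  # union-find forest: parent[x] == x marks a root
    size = [1] * n           # component size, valid at roots only

    def find(x):
        # Follow parent links to the root, halving the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for _ in range(m):
        a, b = rng.sample(range(n), 2)  # two distinct nodes chosen at random
        ra, rb = find(a), find(b)
        if ra != rb:
            if size[ra] < size[rb]:
                ra, rb = rb, ra
            parent[rb] = ra             # merge the smaller component into the larger
            size[ra] += size[rb]
    return max(size[find(v)] for v in range(n))


if __name__ == "__main__":
    n = 1000
    for m in (100, 400, 500, 600, 1000, 2000):
        print(m, largest_component_size(n, m))
```

Running the loop for values of m on either side of n/2 should show the largest component staying small below the threshold and covering most of the nodes well above it; the exact sizes depend on the random seed.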

9

Types of Network Models

Random Networks Gene networks

Ecosystems

Spread of infectious diseases

Computer viruses

10

Types of Network Models

Small-World Networks

Watts and Strogatz studied a simple model that can be tuned through the middle ground between regular and random networks: a regular lattice where the original links are replaced by random ones with some probability 0 < p < 1.

The slightest bit of rewiring transforms the network into a 'small world', with short paths between any two nodes, just as in the giant component of a random graph.

11

Types of Network Models

Small-World Networks

The network is much more highly clustered than a random graph:

if A is linked to B and B is linked to C, there is a greatly increased probability that A will also be linked to C.

These two properties, short paths and high clustering, hold for many natural and technological networks.

12

Types of Network Models

Small-World Networks

Start with a ring of n nodes, each connected by undirected links to its nearest and next-nearest neighbours out to some range k.

Shortcut links are then added, rather than rewired, between randomly selected pairs of nodes, with probability p per link on the underlying lattice; thus there are typically nkp shortcuts in the graph.

How many steps are required to go from one node to another along the shortest route?
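The question about the number of steps can be explored with a short experiment. This is a sketch under stated assumptions, not the authors' code: the ring has nk lattice links, each lattice link triggers one shortcut with probability p, and the mean shortest-path length is measured by breadth-first search. The names `ring_with_shortcuts` and `mean_shortest_path` are hypothetical.

```python
import random
from collections import deque


def ring_with_shortcuts(n, k, p, seed=0):
    """Ring lattice of range k (requires n > 2k) plus random shortcut links."""
    rng = random.Random(seed)
    adj = {v: set() for v in range(n)}
    for v in range(n):
        for step in range(1, k + 1):
            w = (v + step) % n
            adj[v].add(w)
            adj[w].add(v)
            # One shortcut trial per lattice link, so about n*k*p shortcuts overall.
            if rng.random() < p:
                a, b = rng.sample(range(n), 2)
                adj[a].add(b)
                adj[b].add(a)
    return adj


def mean_shortest_path(adj):
    """Average breadth-first-search distance over all ordered pairs of reachable nodes."""
    total, pairs = 0, 0
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs


if __name__ == "__main__":
    for p in (0.0, 0.01, 0.1):
        print(p, mean_shortest_path(ring_with_shortcuts(500, 2, p)))
```

Because shortcuts are only ever added on top of the lattice, the mean path length with p > 0 can never exceed the value for the plain ring.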

13

Types of Network Models

Small-World Networks

A related question is how to actually find a short chain of acquaintances linking yourself to a random target person.

These are search problems.

14

Types of Network Models

Scale-Free Networks

Some nodes are more highly connected than others are.

To quantify this effect, let p_k denote the fraction of nodes that have k links.

k is called the degree and p_k is the degree distribution.

In a scale-free network, the probability P(k) that a node is connected to k other nodes follows a power-law degree distribution,

$$P(k) \sim k^{-\gamma}$$

where k is the degree of a node and γ is a scalar exponent.

15

Types of Network Models

Scale-Free Networks

The probability of attachment is proportional to the degree of the target node; thus richly connected nodes tend to get richer, leading to the formation of hubs and a skewed degree distribution with a heavy tail.

[Figure: Red, k=33 links; blue, k=12; green, k=11. Here n=200 nodes, m=199 links.]
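A growth process of this kind can be sketched in a few lines. The version below is an illustration under assumptions, not the code behind the figure: each new node attaches one link to an existing node chosen with probability proportional to its degree, so n nodes give n - 1 links (200 nodes and 199 links, as in the caption). The function names are hypothetical.

```python
import random
from collections import Counter


def preferential_attachment_tree(n, seed=0):
    """Grow a network one node at a time, one degree-proportional link per new node."""
    rng = random.Random(seed)
    edges = [(0, 1)]    # start from two connected nodes
    endpoints = [0, 1]  # every node appears here once per link it has
    for new in range(2, n):
        # Choosing uniformly from the endpoint list picks a node
        # with probability proportional to its current degree.
        target = rng.choice(endpoints)
        edges.append((new, target))
        endpoints.extend((new, target))
    return edges


def degrees(edges):
    """Count how many links touch each node."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


if __name__ == "__main__":
    deg = degrees(preferential_attachment_tree(200))
    print("largest degrees:", sorted(deg.values(), reverse=True)[:3])
```

Printing the three largest degrees gives a quick impression of how strongly a few hubs dominate; the exact numbers vary with the seed.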

16

Types of Network Models

Scale-Free Networks

Resistant to random failures because a few hubs dominate their topology

17

Types of Network Models

Most large networks have been shown to have scale-free features consistent with the BA network properties.

There are two features of realistic networks that neither the ER nor the WS model captures.

The first is that a network grows. Both models start with a fixed number of nodes (the size of the network) and never modify it, so the size of the network is constant.

18

Types of Network Models

Most real networks grow continuously: new nodes are added to the system at any time. For example, the World-Wide-Web grows as new documents are added.

The second is the connection probability. In a random network, two nodes are connected by random selection. Most real networks instead exhibit preferential connection.

For example, new documents in the World-Wide-Web tend to link to popular documents that already have high connectivity.

19

Types of Network Models

The BA model addresses both of these features of realistic networks:

The network expands continuously, and its degree distribution follows a power law.

New nodes are added and connected to existing nodes in the network.

New nodes are connected to existing ones by preferential attachment: a node that already has a large number of connections has a higher probability of receiving a new one.

20

Types of Network Models

The network of co-authorship relationships in SEG's journal Geophysics is scale-free 

21

Source: http://www.agilegeoscience.com/journal/tag/networks

Graph Representation of Networks Simple Graphs

DEF: A simple graph G = (V,E ) consists of a non-empty set V of vertices (or nodes) and a set E (possibly empty) of edges where each edge is a subset of V with cardinality 2 (an unordered pair).

22


Graph Representation of Networks

Multigraphs: allow multiple edges, but still no self-loops.

Pseudographs: self-loops are also allowed.


Undirected Graphs Terminology

Vertices are adjacent if they are the endpoints of the same edge.

Q: Which vertices are adjacent to 1? How about adjacent to 2, 3, and 4?

[Figure: undirected graph on vertices 1, 2, 3, 4 with edges e1 to e6]

Undirected Graphs Terminology

A: 1 is adjacent to 2 and 3

2 is adjacent to 1 and 3

3 is adjacent to 1 and 2

4 is not adjacent to any vertex


Undirected Graphs Terminology

A vertex is incident with an edge (and the edge is incident with the vertex) if it is the endpoint of the edge.

Q: Which edges are incident to 1? How about incident to 2, 3, and 4?


Undirected Graphs Terminology

A: 1 is incident with e1, e2, e3, e6

2 is incident with e1, e2, e4, e5, e6

3 is incident with e3, e4, e5

4 is not incident with any edge
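Adjacency and incidence can both be read off an edge list. In the sketch below, the endpoints of e1 to e6 are inferred from the incidence answers above (the figure itself is not available, so this reading is an assumption), and the helper names `incident_edges` and `neighbours` are hypothetical.

```python
# Endpoints inferred from the incidence answers; treat them as an assumption.
EDGES = {
    "e1": (1, 2),
    "e2": (1, 2),
    "e3": (1, 3),
    "e4": (2, 3),
    "e5": (2, 3),
    "e6": (1, 2),
}


def incident_edges(edges, v):
    """Names of the edges that have v as an endpoint."""
    return sorted(name for name, ends in edges.items() if v in ends)


def neighbours(edges, v):
    """Vertices that share at least one edge with v."""
    return sorted({u for ends in edges.values() if v in ends for u in ends if u != v})


if __name__ == "__main__":
    for v in (1, 2, 3, 4):
        print(v, incident_edges(EDGES, v), neighbours(EDGES, v))
```

Vertex 4 appears in no edge, so both helpers return an empty list for it, matching the answers on the slides.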


Digraphs

Last time we introduced digraphs as a way of representing relations:

Q: What type of pair should each edge be (multiple edges not allowed)?

[Figure: digraph on vertices 1, 2, 3, 4]

Digraphs

A: Each edge is directed so an ordered pair (or tuple) rather than unordered pair.

Thus the set of edges E is just the represented relation on V.

[Figure: digraph on vertices 1, 2, 3, 4 with edges (1,1), (1,2), (1,3), (2,2), (2,3), (2,4), (3,3), (3,4), (4,4)]

Digraphs

DEF: A directed graph (or digraph) G = (V,E ) consists of a non-empty set V of vertices (or nodes) and a set E of edges with E ⊆ V × V.

The edge (a,b) is also denoted by a → b; a is called the source of the edge and b is called the target of the edge.

Degree: The degree of a vertex counts the number of edges that are incident with it.

Oriented Degree when Edges Directed: The in-degree of a vertex (deg-) counts the number of edges that stick in to the vertex. The out-degree (deg+) counts the number sticking out.

31

Network Analysis

Handshaking Theorem

THM: In an undirected graph

$$|E| = \frac{1}{2} \sum_{v \in V} \deg(v)$$

In a directed graph

$$|E| = \sum_{v \in V} \deg^{+}(v) = \sum_{v \in V} \deg^{-}(v)$$
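The handshaking theorem is easy to verify on a small example. The sketch below is illustrative only: the undirected triangle is a made-up example, the arcs are the ordered pairs listed for the digraph example earlier, and the function names are hypothetical.

```python
from collections import Counter


def undirected_degrees(edges):
    """Degree of each vertex in an undirected graph without self-loops."""
    deg = Counter()
    for a, b in edges:
        deg[a] += 1
        deg[b] += 1
    return deg


def in_out_degrees(arcs):
    """In-degree and out-degree of each vertex in a directed graph."""
    indeg, outdeg = Counter(), Counter()
    for source, target in arcs:
        outdeg[source] += 1
        indeg[target] += 1
    return indeg, outdeg


# Hypothetical undirected example: a triangle on vertices 1, 2, 3.
TRIANGLE = [(1, 2), (1, 3), (2, 3)]

# Ordered pairs of the digraph example (self-loops included).
ARCS = [(1, 1), (1, 2), (1, 3), (2, 2), (2, 3), (2, 4), (3, 3), (3, 4), (4, 4)]

if __name__ == "__main__":
    print(sum(undirected_degrees(TRIANGLE).values()), "= 2 *", len(TRIANGLE))
    indeg, outdeg = in_out_degrees(ARCS)
    print(sum(indeg.values()), sum(outdeg.values()), len(ARCS))
```

Every undirected edge contributes to the degree of two vertices, while every arc contributes exactly one unit of out-degree and one unit of in-degree, which is why the factor 1/2 appears only in the undirected case.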

For a directed graph G = (V,E ) define the matrix A_G by:

Rows, Columns: one for each vertex in V

Value at the i-th row and j-th column is

1 if the i-th vertex connects to the j-th vertex (i → j)

0 otherwise

For a directed multigraph G = (V,E ) define the matrix A_G by:

Rows, Columns: one for each vertex in V

Value at the i-th row and j-th column is

The number of edges with source the i-th vertex and target the j-th vertex
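Both definitions can be covered by one small function that counts arcs per (source, target) pair: for a digraph without multiple edges the counts are 0 or 1, and for a multigraph they are the edge multiplicities. This is a sketch; the name `adjacency_matrix` and the multigraph example are assumptions, and the first arc list is the one from the digraph example.

```python
def adjacency_matrix(vertices, arcs):
    """Return A_G as a list of rows; entry [i][j] counts arcs from vertex i to vertex j."""
    index = {v: i for i, v in enumerate(vertices)}
    matrix = [[0] * len(vertices) for _ in vertices]
    for source, target in arcs:
        matrix[index[source]][index[target]] += 1
    return matrix


# Ordered pairs of the digraph example.
ARCS = [(1, 1), (1, 2), (1, 3), (2, 2), (2, 3), (2, 4), (3, 3), (3, 4), (4, 4)]

# Hypothetical multigraph with two parallel arcs from 1 to 2.
MULTI_ARCS = [(1, 2), (1, 2), (2, 1)]

if __name__ == "__main__":
    for row in adjacency_matrix([1, 2, 3, 4], ARCS):
        print(row)
    print(adjacency_matrix([1, 2], MULTI_ARCS))
```

The row sums of the matrix give the out-degrees and the column sums give the in-degrees.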

33

Adjacency Matrix

Complete Graphs – Kn

A simple graph is complete if every pair of distinct vertices share an edge.

Cycle Graphs – Cn

The cycle graph Cn is a circular graph.

Wheel Graphs – Wn

The wheel graph Wn is just a cycle graph with an extra vertex in the middle, joined to every vertex of the cycle.

Bipartite Graphs

A simple graph is bipartite if V can be partitioned into V = V1 ∪ V2 so that any two adjacent vertices are in different parts of the partition. No two vertices of the same part are adjacent.

34

Other Types of Graphs

There are various measures of the centrality of a vertex within a graph that determine the relative importance of the vertex, for example:

how important a person is within a social network

who is the most well-known author in the citation network

35

Centrality Measures

Degree centrality

Degree centrality is defined as the number of links incident upon a node (i.e., the number of ties that a node has).

Degree is often interpreted in terms of the immediate risk of a node catching whatever is flowing through the network, such as a virus or some information.

If the network is directed (meaning that ties have direction), then we usually define two separate measures of degree centrality, namely indegree and outdegree.

36

Centrality Measures

Degree centrality

Indegree is a count of the number of ties directed to the node.

Outdegree is the number of ties that the node directs to others.

For positive relations such as friendship or advice, we normally interpret indegree as a form of popularity, and outdegree as gregariousness.
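Indegree and outdegree centrality can be computed directly from a list of directed ties. The sketch below is illustrative: the people and ties are made up, and dividing by n - 1 is a common normalisation that the slides do not specify, so treat it as an assumption.

```python
from collections import Counter


def degree_centrality(ties):
    """Map each person to (indegree, outdegree), both divided by n - 1."""
    indeg, outdeg = Counter(), Counter()
    people = set()
    for source, target in ties:
        outdeg[source] += 1   # ties the person directs to others
        indeg[target] += 1    # ties directed to the person
        people.update((source, target))
    scale = len(people) - 1   # assumed normalisation; needs at least two people
    return {p: (indeg[p] / scale, outdeg[p] / scale) for p in people}


# Hypothetical "names a friend" ties.
TIES = [("ann", "bob"), ("cy", "bob"), ("bob", "ann")]

if __name__ == "__main__":
    for person, (popularity, gregariousness) in sorted(degree_centrality(TIES).items()):
        print(person, popularity, gregariousness)
```

In this toy example bob is named by everyone else, so his normalised indegree (popularity) is the largest possible.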

37

Centrality Measures

Degree centrality

An entity with high degree centrality:

Is generally an active player in the network.

Is often a connector or hub in the network.

Is not necessarily the most connected entity in the network (an entity may have a large number of relationships, the majority of which point to low-level entities).

May be in an advantaged position in the network.

May have alternative avenues to satisfy organizational needs, and consequently may be less dependent on other individuals.

Can often be identified as third parties or deal makers.

38

Centrality Measures

Degree centrality An entity with high degree centrality:

Alice has the highest degree centrality, which means that she is quite active in the network. However, she is not necessarily the most powerful person because she is only directly connected within one degree to people in her clique—she has to go through Rafael to get to other cliques.

39

Centrality Measures

Source: http://www.fmsasg.com/SocialNetworkAnalysis/

Degree centrality

40

Centrality Measures

Betweenness Centrality

Betweenness is a centrality measure of a vertex within a graph.

Vertices that occur on many shortest paths between other vertices have higher betweenness than those that do not.
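Betweenness can be computed by counting, for every pair of vertices, the fraction of shortest paths that pass through each other vertex. The sketch below uses Brandes' accumulation scheme, which is a choice made here rather than something the slides prescribe; the graph (two triangles joined through a bridge vertex `r`) and all names are hypothetical.

```python
from collections import deque


def betweenness(adj):
    """Unnormalised betweenness for an undirected, unweighted graph."""
    score = {v: 0.0 for v in adj}
    for s in adj:
        order = []                      # vertices in order of non-decreasing distance
        preds = {v: [] for v in adj}    # predecessors on shortest paths from s
        sigma = {v: 0 for v in adj}     # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}   # dependency of s on each vertex
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                score[w] += delta[w]
    # Each unordered pair was counted from both ends.
    return {v: value / 2 for v, value in score.items()}


# Two triangles {a1, a2, a3} and {b1, b2, b3} joined through the bridge vertex r.
GRAPH = {
    "a1": ["a2", "a3", "r"], "a2": ["a1", "a3"], "a3": ["a1", "a2"],
    "r": ["a1", "b1"],
    "b1": ["b2", "b3", "r"], "b2": ["b1", "b3"], "b3": ["b1", "b2"],
}

if __name__ == "__main__":
    for vertex, value in sorted(betweenness(GRAPH).items()):
        print(vertex, value)
```

The bridge vertex has only two links, yet every shortest path between the two triangles runs through it, so it receives the highest betweenness.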

41

Centrality Measures

Betweenness Centrality

An entity with a high betweenness centrality generally:

Holds a favored or powerful position in the network.

Represents a single point of failure: take the single betweenness spanner out of a network and you sever ties between cliques.

Has a greater amount of influence over what happens in a network.

42

Centrality Measures

Betweenness Centrality An entity with a high betweenness centrality

generally:

Rafael has the highest betweenness because he is between Alice and Aldo, who are between other entities. Alice and Aldo have a slightly lower betweenness because they are essentially only between their own cliques. Therefore, although Alice has a higher degree centrality, Rafael has more importance in the network in certain respects.

43

Centrality Measures

Source: http://www.fmsasg.com/SocialNetworkAnalysis/

Betweenness centrality

44

Centrality Measures

Closeness Centrality

Closeness is one of the basic concepts in a topological space. We say two sets are close if they are arbitrarily near to each other. The concept can be defined naturally in a metric space where a notion of distance between elements of the space is defined, but it can be generalized to topological spaces where we have no concrete way to measure distances.

45

Centrality Measures

Closeness Centrality Closeness is a centrality measure of a vertex within a graph.

Vertices that are 'shallow' to other vertices (that is, those that tend to have short geodesic distances to other vertices within the graph) have higher closeness.

Closeness is preferred in network analysis to mean shortest-path length, as it gives higher values to more central vertices, and so is usually positively associated with other measures such as degree.

Closeness centrality measures how quickly an entity can access more entities in a network
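One common way to turn this idea into a number, assumed here because the slides give no formula, is the number of other reachable vertices divided by the sum of geodesic distances to them. The sketch below computes it with breadth-first search on a hypothetical star graph.

```python
from collections import deque


def closeness(adj):
    """Closeness as (reachable vertices) / (sum of geodesic distances to them)."""
    scores = {}
    for s in adj:
        dist = {s: 0}
        queue = deque([s])
        while queue:
            v = queue.popleft()
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
        total = sum(dist.values())
        scores[s] = (len(dist) - 1) / total if total else 0.0
    return scores


# Hypothetical star: a centre connected to three leaves.
STAR = {"centre": ["x", "y", "z"], "x": ["centre"], "y": ["centre"], "z": ["centre"]}

if __name__ == "__main__":
    for vertex, value in sorted(closeness(STAR).items()):
        print(vertex, value)
```

The centre reaches every other vertex in one step, so it gets the maximum value of 1, while each leaf needs two steps to reach the other leaves and scores lower.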

46

Centrality Measures

Closeness Centrality

An entity with a high closeness centrality generally:

Has quick access to other entities in a network.

Has a short path to other entities.

Is close to other entities.

Has high visibility as to what is happening in the network.

47

Centrality Measures

Closeness Centrality

Rafael has the highest closeness centrality because he can reach more entities through shorter paths. As such, Rafael's placement allows him to connect to entities in his own clique, and to entities that span cliques.

48

Centrality Measures

Source: http://www.fmsasg.com/SocialNetworkAnalysis/

Hub and Authority (for directed graph)

If an entity has a high number of relationships pointing to it, it has a high authority value, and generally:

Is a knowledge or organizational authority within a domain.

Acts as definitive source of information.

Hubs are entities that point to a relatively large number of authorities. They are essentially the mutually reinforcing analogues to authorities. High authorities are pointed to by high hubs. Hubs point to high authorities. You cannot have one without the other.
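The mutual reinforcement between hubs and authorities can be sketched as an alternating iteration in the style of the HITS algorithm; the slides do not name an algorithm, so this choice, the function name `hits`, and the toy graph are assumptions.

```python
import math


def hits(arcs, iterations=50):
    """Alternate authority and hub updates; arcs must contain at least one pair."""
    nodes = {v for arc in arcs for v in arc}
    hub = dict.fromkeys(nodes, 1.0)
    auth = dict.fromkeys(nodes, 1.0)
    for _ in range(iterations):
        # Authority: sum of the hub scores of the nodes pointing to it.
        auth = dict.fromkeys(nodes, 0.0)
        for source, target in arcs:
            auth[target] += hub[source]
        norm = math.sqrt(sum(x * x for x in auth.values()))
        auth = {v: x / norm for v, x in auth.items()}
        # Hub: sum of the authority scores of the nodes it points to.
        hub = dict.fromkeys(nodes, 0.0)
        for source, target in arcs:
            hub[source] += auth[target]
        norm = math.sqrt(sum(x * x for x in hub.values()))
        hub = {v: x / norm for v, x in hub.items()}
    return hub, auth


# Hypothetical graph: h1 points to both a1 and a2, h2 points to a1 only.
ARCS = [("h1", "a1"), ("h1", "a2"), ("h2", "a1")]

if __name__ == "__main__":
    hub, auth = hits(ARCS)
    print("hubs:", hub)
    print("authorities:", auth)
```

In the toy graph a1 is pointed to by both hubs and so becomes the stronger authority, while h1 points to both authorities and becomes the stronger hub.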

49

Centrality Measures

Source: http://www.fmsasg.com/SocialNetworkAnalysis/

Eigenvector Centrality

Eigenvector centrality is a measure of the importance of a node in a network. It assigns relative scores to all nodes in the network based on the principle that connections to high-scoring nodes contribute more to the score of the node in question than equal connections to low-scoring nodes.

Google's PageRank is a variant of the Eigenvector centrality measure.
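The scoring principle can be sketched with power iteration: every node repeatedly receives the scores of its neighbours and the vector is renormalised. The sketch below is an assumption-laden illustration for a connected undirected graph; adding a node's own score in each step (iterating with A + I) is a choice made here to keep the iteration from oscillating, and the graph is made up.

```python
import math


def eigenvector_centrality(adj, iterations=200):
    """Power iteration with the matrix A + I on an undirected, connected graph."""
    score = dict.fromkeys(adj, 1.0)
    for _ in range(iterations):
        # Each node keeps its own score and adds the scores of its neighbours.
        new = {v: score[v] + sum(score[w] for w in adj[v]) for v in adj}
        norm = math.sqrt(sum(x * x for x in new.values()))
        score = {v: x / norm for v, x in new.items()}
    return score


# Hypothetical graph: a triangle 0-1-2 with a pendant node 3 attached to node 0.
GRAPH = {0: [1, 2, 3], 1: [0, 2], 2: [0, 1], 3: [0]}

if __name__ == "__main__":
    for node, value in sorted(eigenvector_centrality(GRAPH).items()):
        print(node, round(value, 4))
```

Node 3 has a single link, but that link goes to the best-connected node; its score is still the lowest here because nodes 1 and 2 each have two well-connected neighbours.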

50

Centrality Measures


Centrality Measures

PageRank

Only Structure Consideration

Knowledge of Global Network Structure

Broken Link Problems

KONECT: the Koblenz Network Collection contains 168 network datasets, for instance:

Animal networks are networks of contacts between animals.

Authorship networks are unweighted bipartite networks consisting of links between authors and their works.

Citation networks consist of documents that reference each other.

Coauthorship networks are unipartite networks connecting authors who have written works together.

Communication networks contain edges that represent individual messages between persons.

KONECT also consists of Matlab code to generate statistics and plots about them.

55

Social Network Analysis Software

Source: konect.uni-koblenz.de/networks

“Pajek”: Large Network Analysis Software


Introduction to Slovenian Spider: Pajek

http://vlado.fmf.uni-lj.si/pub/networks/pajek/

Free software

Windows 32 bit

Pajek 2.05

“Whom would you choose as a friend ?”


Introduction

Its applications:

Communication networks: links among pages or servers on the Internet, usage of phone calls

Transportation networks

Flow graphs of programs

Bibliographies, citation networks

59

Data Structures

Six data structures:

Network (*.net) – main object (vertices and lines: arcs, edges);

Partition (*.clu) – nominal property of vertices (gender);

Vector (*.vec) – numerical property of vertices;

Permutation (*.per) – reordering of vertices;

Cluster (*.cls) – subset of vertices (e.g. a cluster from partition);

Hierarchy (*.hie) – hierarchically ordered clusters and vertices.

60

Introduction

Pajek 2.05

61

Network Definitions

Graph Theory

Graphs represent the structure of networks

Directed and undirected graphs

Lists of vertices, arcs and edges, where each arc and edge has a value.

To view the network data files: NotePad, EditPlus
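Because a network file is plain text, it can also be generated by a script. The sketch below writes the commonly documented layout of a Pajek *.net file (a `*Vertices` section followed by optional `*Arcs` and `*Edges` sections with one "source target value" line each); this layout is stated here as an assumption and should be checked against the Pajek manual. The function name `to_pajek_net` and the example data are hypothetical.

```python
def to_pajek_net(labels, arcs=(), edges=()):
    """Return the text of a Pajek-style network file.

    labels: vertex labels; vertices are numbered from 1 in this order.
    arcs:   directed lines as (source, target, value) triples.
    edges:  undirected lines as (vertex, vertex, value) triples.
    """
    lines = [f"*Vertices {len(labels)}"]
    for number, label in enumerate(labels, start=1):
        lines.append(f'{number} "{label}"')
    if arcs:
        lines.append("*Arcs")
        lines.extend(f"{s} {t} {value}" for s, t, value in arcs)
    if edges:
        lines.append("*Edges")
        lines.extend(f"{a} {b} {value}" for a, b, value in edges)
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    text = to_pajek_net(["Ann", "Bob", "Cy"], arcs=[(1, 2, 1)], edges=[(2, 3, 2)])
    print(text)
```

The resulting text can be saved with a .net extension and inspected in any text editor.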

62

Network Data File

62

Open Network Data File (*.net)

Number of Vertices

63

Transform

Transform

64

Report Information

65

Visualization

Energy – Idea: the network is represented like a physical system, and we are searching for the state with minimal energy. Two algorithms are included:

Layout/Energy/Kamada-Kawai – slower

Layout/Energy/Fruchterman-Reingold – faster, drawing in a plane or space (2D or 3D), and selecting the repulsion factor

66

Network Creation

66

67

Partitions

File name: *.clu

68

Degree

References

Social Network Analysis: Theory and Applications

Graphs (ppt), Zeph Grunschlag, 2001-2002.

KONECT: http://konect.uni-koblenz.de/networks

Pajek: http://pajek.imfm.si/doku.php?id=download

http://www.fmsasg.com/SocialNetworkAnalysis/
