
MINFS544: Business Network Data Analytics and Applications

Feb 24th, 2016

Daning Hu, Ph.D., Department of Informatics, University of Zurich

F. Schweitzer et al., Science 2009

Stop Contagious Failures in Banking Systems

During the 2008 financial tsunami, which bank(s) should receive capital injections first to stop contagious failures in bank networks?


Utilize Peer Influence in Online Social Networks

Intelligent Advertising, Product Recommendation

Who are the most influential people?

What are the patterns of product diffusion?


Develop Strategies to Attack Terrorist Networks

A Global Salafi Jihad Terrorist Network (Hu et al., JHSEM 2009)

How to effectively break down a terrorist network?

Network-based Business Intelligence

Network-based (Modeling and Analysis)

Modeling and analyzing various real-world social and organizational networks to understand:

the cognitive and economic behaviors of the network actors; and

the dynamic processes behind the network evolution

Based on the above…

Business Intelligence (BI)

Design network-based BI algorithms and information systems to provide decision support in various application domains

Financial Risk Management, Security Informatics, Knowledge Management, etc.

Network Analysis, Simulation of Network Evolution, Data Mining, etc.


Summary

• Lecturer: Dr. Daning Hu; Teaching Assistant: David Xiao Li

• Email: [email protected] [email protected]

• Credits: 3 ECTS credits

• Course web page:

http://www.ifi.uzh.ch/bi/teaching/Spring2016/Lecture1.html

• Language: English

• Audience: Master and doctoral students

• Office Hours: Tue 13:00–14:00, Room 2.A.12. Please email to make an appointment.

• Grading: Course report (term paper) 80% and interactions 20%

Grading

• 1. A full research paper (80%). The format of this paper can be found at:

• http://icis2016.aisnet.org/call-dates/submission-guidelines/

• * If possible, get it published in ICIS 2015 and get it cited.

• This paper should include answers to the following questions:

– What is the research problem?

– Why is it interesting and important?

– Why is it hard? Why have previous approaches failed?

– What are the key components of your approach?

– What 1) models, 2) data sets and 3) metrics will be used to validate the approach?

A Brief History of Network Science

1736: Mathematical foundation (Graph Theory)

1930: Social Network Analysis and Theories

Sociogram: Network visualization

Six degrees of separation

Structural hole: Source of innovation

1990: (Physicists) Complex Network Topologies

Small-world model (e.g., WWW)

Scale-free model (“Rich get richer”)

2000: Network Science

Economic networks (Agent modeling & simulation)

Dynamic network analysis

BI applications: product diffusion in social media, recommendation systems

2012: ?

Outline

Introduction

Dynamic Analysis of Dark Networks

A Global Salafi Jihad (GSJ) Terrorist Network

A Narcotic Criminal Network

A Network Approach to Managing Bank Systemic Risk

Ongoing Work

Conclusion

Dynamic Network Analysis (DNA)

What, Why, How

Model the changes in network evolution

Temporal changes in network topological measures

Dynamic network recovery on longitudinal data

Studying dynamic link formation processes behind network evolution.

Nodes forming links → Network evolution

Statistical analysis of determinants behind link formation

Homophily

Preferential attachment

Shared affiliations

Simulate the evolution of networks

Agent-based Modeling and Simulation

Examine network robustness

Research Testbed: A Global Terrorist Network

The Global Salafi Jihad (GSJ) network data was compiled by Dr. Marc Sageman, a former CIA operations officer, and covers 366 terrorists:

Links: friendship, kinship, same religious leader, operational interactions, etc.

Attributes: geographical origins, socio-economic status, education, etc.

Time information: when they join and leave GSJ

The goals of dynamic analysis:

gain insights about the evolution of the GSJ network

develop effective attack strategies to break down the GSJ network

Sample data of GSJ terrorists


Dynamic Network Analysis


What, Why, How

Model the changes in network evolution

Temporal changes in network topological measures

Dynamic network recovery on longitudinal data

Studying dynamic processes (i.e., link formation) behind network evolution.

Nodes’ behaviors → Network evolution

Statistical analysis of determinants behind link formation

Homophily

Preferential attachment

Shared affiliations

Simulate the evolution of networks

Agent-based Modeling and Simulation

Examine network robustness


Temporal Changes in Network-level Measures

Degree = number of links a node has

Fig. 1. Temporal changes in (a) the average degree <k>, 1989 to 2003, and (b), (c) the degree distribution (probability of degree). Panel (b): 1990, 1991, 1993, with a Poisson reference curve. Panel (c): 1995, 1997, 1999.

Findings

The evolution of the GSJ network has three stages:

1989 - 1993 The emerging stage:

The network grows in size

Accelerated growth: the number of edges increases faster than the number of nodes

Random network topology (Poisson degree distribution)

1994 - 2000 The mature stage:

The size of the network reached its peak in 2000

Scale-free topology (power-law degree distribution)

2001 - 2003 The disintegration stage:

The network falls apart into small disconnected components after 9/11
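The stage findings rest on two quantities per yearly snapshot: the average degree <k> and how closely the degree distribution follows a Poisson curve. A minimal Python sketch of that comparison, assuming a snapshot is available as an undirected edge list. The toy edges and the names `degree_stats` and `poisson_pmf` are illustrative assumptions, not GSJ data or the authors' code.

```python
import math
from collections import Counter

def degree_stats(edges):
    """Return (average degree, empirical degree distribution) for an undirected edge list.

    Nodes that never appear in an edge (isolates) are not counted.
    """
    degree = Counter()
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    n = len(degree)
    avg_k = sum(degree.values()) / n
    counts = Counter(degree.values())
    dist = {k: c / n for k, c in sorted(counts.items())}
    return avg_k, dist

def poisson_pmf(k, lam):
    """Probability of degree k in a random graph with mean degree lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

# Hypothetical toy snapshot, NOT taken from the GSJ data.
snapshot_edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d"), ("d", "e")]

avg_k, dist = degree_stats(snapshot_edges)
print("average degree <k>:", avg_k)
for k, p in dist.items():
    # Side-by-side view: empirical share of nodes vs. Poisson expectation.
    print(k, round(p, 3), round(poisson_pmf(k, avg_k), 3))
```

Running the same function on each year's edge list gives the <k> time series of panel (a); a heavy tail that the Poisson column cannot reproduce is the informal signal of the scale-free stage.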

Temporal Changes in Node Centrality Measures

Fig. 2. Temporal changes in the degree and betweenness centrality of Osama Bin Laden, 1989 to 2002 (two panels: Degree; Betweenness)

Degree: number of links a node has

Betweenness of a node i: number of shortest paths from all nodes to all others that pass through node i

It measures i’s influence on the traffic (information, resources) flowing through it
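The two centrality definitions can be sketched in plain Python. This is an illustrative implementation that uses Brandes' algorithm for unweighted, undirected graphs; the slides do not say which algorithm or tool was used. The toy graph (two four-node cliques joined through a single node `x`) and all function names are assumptions made for the example.

```python
from collections import deque

def build_adjacency(edges):
    """Undirected adjacency sets from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def betweenness(adj):
    """Unnormalized betweenness: shortest paths between other node pairs that pass through each node."""
    bc = {v: 0.0 for v in adj}
    for s in adj:
        sigma = {v: 0 for v in adj}   # number of shortest paths from s
        sigma[s] = 1
        dist = {s: 0}
        preds = {v: [] for v in adj}  # predecessors on shortest paths
        order = []
        queue = deque([s])
        while queue:                  # breadth-first search from s
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if w not in dist:
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}
        for w in reversed(order):     # accumulate dependencies back toward s
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1 + delta[w])
            if w != s:
                bc[w] += delta[w]
    # Each unordered pair was counted once from each endpoint.
    return {v: b / 2 for v, b in bc.items()}

# Hypothetical toy graph: clique {a,b,c,d}, clique {e,f,g,h}, joined via x.
edges = [("a", "b"), ("a", "c"), ("a", "d"), ("b", "c"), ("b", "d"), ("c", "d"),
         ("e", "f"), ("e", "g"), ("e", "h"), ("f", "g"), ("f", "h"), ("g", "h"),
         ("a", "x"), ("x", "e")]
adj = build_adjacency(edges)
degree = {v: len(nbrs) for v, nbrs in adj.items()}
bc = betweenness(adj)
print(degree["x"], bc["x"])  # lowest degree, highest betweenness
print(degree["a"], bc["a"])
```

In this toy graph the connecting node `x` has the smallest degree but the largest betweenness, which is the hub versus bridge distinction the later attack slides rely on.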

Findings and Possible Explanations

1994 – 1996: A sharp decrease in Bin Laden’s betweenness

1994: Saudi Arabia revoked his citizenship and expelled him

1995: He went to Sudan and was expelled again under U.S. pressure

1996: He went to Afghanistan and established camps there

1998 – 1999: Another sharp decrease in his betweenness

After the 1998 bombings of U.S. embassies, Bill Clinton ordered a freeze on assets linked to bin Laden (top 10 most wanted)

August 1998: A failed U.S. assassination attempt on him

1999: The UN imposed sanctions against Afghanistan to force the Taliban to extradite him


Research Testbed: A Narcotic Criminal Network

The COPLINK dataset contains 3 million police incident reports from the Tucson Police Department (1990 to 2006).

3 million incident reports and 1.44 million individuals

Their personal and sociological information (age, ethnicity, etc.)

Time information: when two individuals co-offend

AZ Inmate affiliation data: when and where an inmate was housed

A Narcotic Criminal Network

19,608 individuals involved in organized narcotic crimes

29,704 co-offending pairs (links)

| | COPLINK Narcotic Data | Arizona Inmate Data | Overlapped (identified by first name, last name and DOB) |
|---|---|---|---|
| Number of People | 36,548 | 165,540 | 19,608 |
| Time Span | 1990 - 2006 | 1985 - 2006 | 17 years |

Table 1. Summary of the COPLINK dataset and the Arizona inmate dataset

Statistical Analysis of Determinants for Link Formation

Proportional hazards model (Cox Regression Analysis)

$h(t, x_1, x_2, x_3, \ldots) = h_0(t)\exp(b_1 x_1 + b_2 x_2 + b_3 x_3 + \cdots)$

Homophily in age (group) and race

Shared affiliations:

Mutual acquaintances (through crimes)

Vehicle affiliation (the same vehicle used by two individuals in different crimes)

Fig. 3. Results of multivariate survival (Cox regression) analysis of triadic closure (link formation).
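A short worked step may help when reading the coefficients. It is a standard property of proportional hazards models, not a result taken from the slides: the baseline hazard $h_0(t)$ cancels when two pairs of individuals with covariates $x$ and $x'$ are compared.

$$
\frac{h(t, x_1, x_2, \ldots)}{h(t, x'_1, x'_2, \ldots)}
= \frac{h_0(t)\exp(b_1 x_1 + b_2 x_2 + \cdots)}{h_0(t)\exp(b_1 x'_1 + b_2 x'_2 + \cdots)}
= \exp\big(b_1 (x_1 - x'_1) + b_2 (x_2 - x'_2) + \cdots\big)
$$

If two pairs differ only by one unit in $x_1$ (for example, one additional mutual acquaintance), their hazard ratio is $\exp(b_1)$, so a positive $b_1$ means a higher rate of link formation.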

BI Application: Co-offending Prediction in COPLINK

IBM’s COPLINK is an intelligent police information system that aims to help speed up the crime detection process.

COPLINK calculates a co-offending likelihood score based on the proportional hazards model.

It produces a ranked list of individuals based on their predicted likelihood of co-offending with the suspect under investigation.

Fig. 4. Screenshots of the COPLINK system
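How such a ranked list could be derived from a proportional hazards score is sketched below. This is only an illustration of the idea, not COPLINK's actual implementation: the coefficient values, covariate names, and candidate records are all invented for the example, and the score used is the relative hazard exp(b1x1 + b2x2 + ...), which leaves out the baseline h0(t) because it is the same for every candidate.

```python
import math

# Hypothetical coefficients for illustration only; NOT the fitted values from Fig. 3.
COEFFS = {
    "same_age_group": 0.4,
    "same_race": 0.3,
    "mutual_acquaintances": 0.8,
    "shared_vehicle": 1.1,
}

def relative_hazard(covariates, coeffs=COEFFS):
    """exp(b1*x1 + b2*x2 + ...): hazard relative to the baseline h0(t)."""
    return math.exp(sum(coeffs[name] * value for name, value in covariates.items()))

def rank_candidates(candidates, coeffs=COEFFS):
    """Candidate names sorted from highest to lowest relative hazard."""
    return sorted(candidates,
                  key=lambda name: relative_hazard(candidates[name], coeffs),
                  reverse=True)

# Hypothetical covariates of each candidate with respect to one suspect.
candidates = {
    "person_A": {"same_age_group": 1, "same_race": 1,
                 "mutual_acquaintances": 0, "shared_vehicle": 0},
    "person_B": {"same_age_group": 0, "same_race": 0,
                 "mutual_acquaintances": 2, "shared_vehicle": 1},
    "person_C": {"same_age_group": 0, "same_race": 0,
                 "mutual_acquaintances": 0, "shared_vehicle": 0},
}

ranking = rank_candidates(candidates)
for name in ranking:
    print(name, round(relative_hazard(candidates[name]), 2))
```

Ranking by relative hazard is enough for ordering candidates; turning the score into a probability over a time window would additionally need an estimate of the baseline hazard.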

Simulate Attacks on Dark Networks

Three attack (i.e., node removal) strategies:

Attack on hubs (highest degrees)

Attack on bridges (highest betweenness)

Real-world attack (attack order based on real-world data)

Simulate two types of attacks to examine the robustness of the Dark networks

Simultaneous attacks (the degree/betweenness of nodes is NOT updated after each removal): static

Progressive attacks (the degree/betweenness of nodes is updated after each removal): dynamic
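The difference between simultaneous and progressive attacks can be sketched for the hub strategy as follows. This is a toy simulation under stated assumptions, not the authors' simulator: the graph is invented, robustness is measured as S, taken here to mean the fraction of original nodes in the largest surviving connected component, and a bridge attack would follow the same loop with betweenness in place of degree.

```python
def build_adjacency(edges):
    """Undirected adjacency sets from an edge list."""
    adj = {}
    for u, v in edges:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj

def largest_component_fraction(adj, removed):
    """S: share of the original nodes that sit in the largest surviving component."""
    seen = set(removed)
    best = 0
    for start in adj:
        if start in seen:
            continue
        seen.add(start)
        stack, size = [start], 0
        while stack:  # depth-first search over surviving nodes
            v = stack.pop()
            size += 1
            for w in adj[v]:
                if w not in seen:
                    seen.add(w)
                    stack.append(w)
        best = max(best, size)
    return best / len(adj)

def hub_attack(adj, n_remove, progressive):
    """Remove n_remove hubs; return S after each removal."""
    removed = set()

    def live_degree(v):
        return sum(1 for w in adj[v] if w not in removed)

    # Simultaneous (static): ranking fixed on the intact network.
    static_order = sorted(adj, key=lambda v: len(adj[v]), reverse=True)
    curve = []
    for step in range(n_remove):
        if progressive:
            # Progressive (dynamic): degrees recomputed after every removal.
            target = max((v for v in adj if v not in removed), key=live_degree)
        else:
            target = static_order[step]
        removed.add(target)
        curve.append(largest_component_fraction(adj, removed))
    return curve

# Hypothetical toy network with 15 nodes, NOT taken from the GSJ or COPLINK data.
edges = [("h1", "h2"), ("h1", "a"), ("h1", "l1"), ("h1", "l2"), ("h1", "l3"), ("h1", "l4"),
         ("h2", "a"), ("h2", "l5"), ("h2", "l6"), ("h2", "l7"),
         ("a", "p"), ("a", "q"),
         ("g", "p"), ("g", "x"), ("g", "y")]
adj = build_adjacency(edges)
static_curve = hub_attack(adj, 3, progressive=False)
progressive_curve = hub_attack(adj, 3, progressive=True)
print("simultaneous:", [round(s, 3) for s in static_curve])
print("progressive: ", [round(s, 3) for s in progressive_curve])
```

In this toy case the two variants agree for the first two removals and then diverge: the static ranking still targets a node whose neighbours are already gone, while the updated ranking picks the node that matters in the damaged network.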

Hub vs. Bridge Attacks

Both hub and bridge attacks are far more effective than real-world arrests. Policy implications?

Both Dark networks are more vulnerable to Bridge attacks than Hub attacks.

Bridge (highest betweenness): field lieutenants, operational leaders, etc.

Hub (highest degree): e.g., Bin Laden

[Figure: GSJ network. x-axis: fraction of nodes removed; y-axis: S and <s>; series: S (Hub attacks), S (Bridge attacks)]

Summary and Contributions

We developed a set of Dynamic Network Analysis (DNA) methods that are effective in

Linking network topological changes to analytical insights

Systematically capturing the link formation processes

Examining the determinants of link formation

Dark networks are robust against real-world attacks but vulnerable to targeted bridge attacks

COPLINK provides real-time decision support for fighting crime.

Research Readings and Resources

• 1. Networks Overview:

• * Statistical mechanics of complex networks, Section III, VI

– http://rmp.aps.org/abstract/RMP/v74/i1/p47_1

• * Networks, Crowds, and Markets:

– http://www.cs.cornell.edu/home/kleinber/networks-book/

• 2. Networks in Finance:

• * Financial Networks blog and research databases:

– WRDS database

– http://www.financialnetworkanalysis.com/research-database/

– http://www.stern.nyu.edu/networks/electron.html

– * Company Board Social Networks

Research Readings and Resources (cont.)

• 3. Networks in Marketing:

– * Sinan Aral’s research in networks and marketing

– Peer influence

– http://web.mit.edu/sinana/www/

• * Social Media based Marketing:

– http://searchengineland.com/guide/what-is-social-media-marketing

• 4. Recommender Systems:

– http://www-cs-students.stanford.edu/~adityagp/recom.html

• 5. Word-of-Mouth Effects in Social Networks:

– http://papers.ssrn.com/sol3/papers.cfm?abstract_id=393042&