Scalable Network Analysis
Inderjit S. Dhillon
University of Texas at Austin
Dec 20, 2013
COMAD, Ahmedabad, India
Outline
• Unstructured Data - Scale & Diversity
• Evolving Networks
• Machine Learning Problems arising in Networks
  • Recommender Systems
  • Link Prediction
  • Sign Prediction
• Formulation as Missing Value Estimation
• Scalable Algorithms
  • NOMAD: Distributed matrix completion algorithm
• Results on Applications
• Conclusions
Structured Data
• Data organized into fields: Relational databases, spreadsheets, XML
• Highly optimized for storage & retrieval (e.g. using SQL)
• Focus is on data format, efficient storage & search
• Less or no uncertainty in semantics: e.g. businesses know the fields of the data
Unstructured Data
Modern data is unstructured and diverse:
• Networks, Graphs
• Text
• Images, Videos
Unstructured data is growing at a much greater rate than structured data
• Dynamic aspects of unstructured data:
• Constantly evolving
• Uncertainties abound: What should I ask of the data?
• Seek insights
• Heterogeneity renders traditional database models inadequate
• Buzzwords - “Big Data” & “Data Science”
• Machine Learning: Predictive models for data
• Engineering perspective: Scale matters
Network Graphs
• Social networks (Friendship)
• Bipartite networks (Membership, Ratings, etc.)
• Gene networks (Functional interaction)
[Figure: example networks; the gene network shows nodes labeled APQ6, MIP, AQP2, AQP1, AQP5, GK]
Graph Evolution
• Social networks are highly dynamic
• Constantly grow, change quickly over time
• Users arrive/leave, relationships form/dissolve
• Understanding graph evolution is important
Graphs meet Machine Learning
• Network analysis: Understanding structure & evolution of networks
• Formulate predictive problems on the adjacency matrix of the graph
• Confluence of graph theory & machine learning
Facebook growth
Link Prediction
Recommender Systems
Netflix problem: 100M ratings, 0.5M users, 20K movies
A Toy Problem In Comparison!
[Figure: Users × Movies ratings matrix]
Link Prediction in Social Networks
• Problem: Infer missing relationships from a given snapshot of the network
[Figure: the network at time T and at time T + 1, with the links to be predicted marked "?"]
Predicting gene-disease links
[Figure: gene-disease network with an unknown link marked "?"]
Signed Social Networks
• Sign Prediction Problem: Given a snapshot of the signed social network, predict the signs of missing edges
[Figure: signed network with edges labeled Like or Dislike; the sign of a missing edge (Like / Dislike?) is to be predicted]
Formulation as Missing Value Estimation
Low-rank Matrix Completion
[Figure: Users × Movies ratings matrix with missing entries; one missing entry is marked "?"]
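$$\min_{W \in \mathbb{R}^{m \times k},\; H \in \mathbb{R}^{n \times k}} \sum_{(i,j) \in \Omega} \left(A_{ij} - W_i H_j^T\right)^2 + \lambda \left(\|W\|_F^2 + \|H\|_F^2\right)$$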
[Figure: the entry marked "?" is filled in with the value predicted by the low-rank model, 3.44 in this example]
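The objective above can be evaluated directly on the observed entries. The following is a minimal Python sketch, assuming the observed set Ω is stored as a dictionary from (i, j) to A_ij and that W and H are lists of length-k rows; the function name `mc_objective` and the data layout are illustrative and not taken from the talk.

```python
def mc_objective(observed, W, H, lam):
    """Regularized squared error over the observed entries only.

    observed: dict mapping (i, j) -> A_ij for (i, j) in Omega (assumed layout)
    W, H:     lists of length-k rows (W_i and H_j in the slide's notation)
    lam:      regularization weight lambda
    """
    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    fit = sum((a - dot(W[i], H[j])) ** 2 for (i, j), a in observed.items())
    frob = sum(dot(w, w) for w in W) + sum(dot(h, h) for h in H)
    return fit + lam * frob


# Toy example: 2 users, 2 movies, rank k = 1, three observed ratings.
observed = {(0, 0): 4.0, (0, 1): 2.0, (1, 0): 2.0}
W = [[2.0], [1.0]]
H = [[2.0], [1.0]]
print(mc_objective(observed, W, H, lam=0.1))

# The missing entry (1, 1) is predicted by the inner product of the factors.
print(W[1][0] * H[1][0])
```

In this toy case the factors fit all three observed ratings exactly, so only the regularization term contributes to the objective.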
Link Prediction
• Can be posed as matrix completion problem
• Issue: Only positive relationships are observed
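Fitting only the observed entries admits a trivial all-ones rank-1 solution, so every test link gets the same score:
$$A(t) = \begin{bmatrix} \cdot & 1 & 1 & \cdot & \cdot \\ 1 & \cdot & 1 & 1 & 1 \\ 1 & 1 & \cdot & \cdot & 1 \\ \cdot & 1 & \cdot & \cdot & \cdot \\ \cdot & 1 & 1 & \cdot & \cdot \end{bmatrix} \approx \begin{bmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{bmatrix} \begin{bmatrix} 1 & 1 & 1 & 1 & 1 \end{bmatrix}$$
Test Link   Score
(4,5)       1
(1,4)       1
(3,4)       1
(1,5)       1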
Link Prediction
• Formulate Biased Matrix Completion Problem:
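$$\min_{W \in \mathbb{R}^{n \times k},\; H \in \mathbb{R}^{n \times k}} \sum_{(i,j) \in \Omega} \left(A_{ij} - W_i H_j^T\right)^2 + \alpha \sum_{(i,j) \notin \Omega} \left(A_{ij} - W_i H_j^T\right)^2 + \lambda \left(\|W\|_F^2 + \|H\|_F^2\right)$$
$$A(t) = \begin{bmatrix} \cdot & 1 & 1 & \cdot & \cdot \\ 1 & \cdot & 1 & 1 & 1 \\ 1 & 1 & \cdot & \cdot & 1 \\ \cdot & 1 & \cdot & \cdot & \cdot \\ \cdot & 1 & 1 & \cdot & \cdot \end{bmatrix} \approx \begin{bmatrix} 0.85 \\ 1.09 \\ 0.97 \\ 0.69 \\ 0.85 \end{bmatrix} \begin{bmatrix} 0.85 & 1.09 & 0.97 & 0.69 & 0.85 \end{bmatrix}$$
Test Link   Score
(4,5)       0.59
(1,4)       0.59
(3,4)       0.67
(1,5)       0.72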
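The test-link scores on the two Link Prediction slides are products of the corresponding entries of the rank-1 factor. A small Python sketch that recomputes them from the factor values shown on the slides (the helper name `link_scores` is illustrative, not from the talk):

```python
def link_scores(factor, test_links):
    """Score link (i, j) as factor[i] * factor[j]; indices are 1-based as on the slide."""
    return {(i, j): factor[i - 1] * factor[j - 1] for i, j in test_links}


test_links = [(4, 5), (1, 4), (3, 4), (1, 5)]

# Fitting only the observed (positive) entries: all-ones factor, every score ties.
unbiased = link_scores([1, 1, 1, 1, 1], test_links)

# Factor shown on the slide for the biased formulation.
biased = link_scores([0.85, 1.09, 0.97, 0.69, 0.85], test_links)

for link in test_links:
    print(link, unbiased[link], round(biased[link], 2))
```

With the all-ones factor the four candidate links cannot be ranked, while the biased factor separates them.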
Signed Social Networks
• Social Balance [Harary, 1953]:
  • In real-world signed networks, triangles tend to be balanced
  • Friend of a friend is a friend
  • Enemy of an enemy is a friend
[Figure: examples of balanced and not balanced triangles]
Theorem: All triangles in a network are balanced if and only if there exist two antagonistic groups.
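One direction of the theorem is easy to check on a small example. The sketch below assumes a complete signed network stored as a dictionary from node pairs (a, b) with a < b to a sign of +1 or -1, and uses the standard criterion that a triangle is balanced when the product of its edge signs is positive; the function names and the example grouping are illustrative.

```python
from itertools import combinations


def triangle_is_balanced(s_ab, s_bc, s_ac):
    """A triangle is balanced when the product of its three edge signs is positive."""
    return s_ab * s_bc * s_ac > 0


def two_group_signs(group):
    """Complete signed network: +1 inside a group, -1 between the two groups."""
    n = len(group)
    return {(a, b): 1 if group[a] == group[b] else -1
            for a, b in combinations(range(n), 2)}


def all_triangles_balanced(n, sign):
    return all(triangle_is_balanced(sign[a, b], sign[b, c], sign[a, c])
               for a, b, c in combinations(range(n), 3))


group = [0, 0, 1, 1, 1]            # two antagonistic groups (illustrative)
sign = two_group_signs(group)
print(all_triangles_balanced(len(group), sign))   # True

sign[0, 1] = -1                    # turn one within-group friendship into a dislike
print(all_triangles_balanced(len(group), sign))   # False
```

Flipping a single within-group edge breaks the two-group structure and immediately creates an unbalanced triangle, here the one on nodes 0, 1 and 2.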
• Relaxation: Weak balance
• Allow triangles with all negative edges
[Figure: examples of weakly balanced and not balanced triangles]
Theorem: All triangles in a network are weakly balanced if and only if there exist k antagonistic groups.
Sign Prediction
• Sign inference can be posed as low-rank matrix completion
Theorem: A k-weakly balanced signed network has rank at most k.
Theorem: If there are no “small” groups, the underlying network can be exactly recovered under certain conditions.
K. Chiang et al. Prediction and Clustering in Signed Networks: A Local to Global Perspective. To appear in JMLR.
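The rank bound can be made concrete with an explicit factorization. The sketch below assumes a complete network with k groups, sign +1 within a group and -1 across groups, and a diagonal set to +1 (a convention chosen here for illustration). It builds n × k factors W and H whose product reproduces the sign matrix, which shows the rank is at most k; the helper name `group_factors` is hypothetical.

```python
def group_factors(group, k):
    """Rank-k factors with W[i] . H[j] = +1 if i and j share a group, else -1."""
    n = len(group)
    # W[i] is the one-hot indicator of node i's group.
    W = [[1 if group[i] == g else 0 for g in range(k)] for i in range(n)]
    # H[j] holds +1 in the position of node j's group and -1 elsewhere.
    H = [[1 if group[j] == g else -1 for g in range(k)] for j in range(n)]
    return W, H


group = [0, 0, 1, 2, 2, 1]   # k = 3 antagonistic groups (illustrative)
k = 3
n = len(group)
W, H = group_factors(group, k)

A = [[sum(w * h for w, h in zip(W[i], H[j])) for j in range(n)] for i in range(n)]
expected = [[1 if group[i] == group[j] else -1 for j in range(n)] for i in range(n)]
print(A == expected)   # True
```

Because the inner dimension of the product is k, any such sign matrix has rank at most k, which is what makes low-rank matrix completion applicable to sign inference.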
Scalable Algorithms
Stochastic Gradient
• Sample random index (i,j) and update corresponding factors:
$$w_i \leftarrow w_i - \eta\left((w_i^T h_j - A_{ij})\,h_j + \lambda w_i\right)$$
$$h_j \leftarrow h_j - \eta\left((w_i^T h_j - A_{ij})\,w_i + \lambda h_j\right)$$
• Time per update: O(k)
• Effective for very large-scale problems
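A single stochastic gradient step can be sketched in a few lines of Python. The version below takes a descent step on the per-entry loss (A_ij − w_i^T h_j)^2 + λ(‖w_i‖^2 + ‖h_j‖^2), with the constant factor 2 absorbed into the step size η; the function name and the toy numbers are illustrative assumptions.

```python
def sgd_update(w_i, h_j, a_ij, eta, lam):
    """One SGD step on a single observed entry; returns the new (w_i, h_j).

    Both updates use the factors from before the step, and each costs O(k).
    """
    err = sum(w * h for w, h in zip(w_i, h_j)) - a_ij   # w_i^T h_j - A_ij
    new_w = [w - eta * (err * h + lam * w) for w, h in zip(w_i, h_j)]
    new_h = [h - eta * (err * w + lam * h) for w, h in zip(w_i, h_j)]
    return new_w, new_h


w, h = [0.1, 0.2], [0.3, 0.4]
before = sum(x * y for x, y in zip(w, h))
w, h = sgd_update(w, h, a_ij=1.0, eta=0.1, lam=0.05)
after = sum(x * y for x, y in zip(w, h))
print(before, after)   # the prediction moves toward the observed value 1.0
```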
Distributed Stochastic Gradient Descent (DSGD) [Gemulla et al. KDD 2011]
• Decoupled updates
• Easy to parallelize
• But communication & computation are interleaved
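The decoupling behind DSGD can be illustrated with a block schedule. The sketch below assumes the rating matrix is cut into a p × p grid of blocks and that, in sub-epoch s, worker q processes block (q, (q + s) mod p); the cyclic shift is one simple choice of schedule made here for illustration, and the function name is hypothetical.

```python
def dsgd_schedule(p):
    """For each sub-epoch s, the (row_block, col_block) assigned to each worker q."""
    return [[(q, (q + s) % p) for q in range(p)] for s in range(p)]


for s, stratum in enumerate(dsgd_schedule(3)):
    # Blocks within one sub-epoch share no rows and no columns, so the
    # workers update disjoint w_i and h_j and need no locking.
    print("sub-epoch", s, stratum)
```

Between sub-epochs the workers must synchronize and exchange column blocks before computation resumes, which is where communication and computation take turns instead of overlapping.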
DSGD
Curse of the last reducer: at every synchronization point, all workers wait for the slowest one
NOMAD
Non-locking stOchastic Multi-machine algorithm for Asynchronous & Decentralized matrix factorization
• Goal: Keep CPU & network simultaneously busy.
• Asynchronous distributed solution.
[Figure: each worker holds its native variables (w1, ..., w4) and a queue of nomadic variables (h1, ..., h5); a worker that has finished with a nomadic variable, here h4, pushes it to another worker's queue]
NOMAD Algorithm
Initialize: Randomly assign columns h_j to worker queues
Parallel Foreach q in {1,2,...,p}
    If queue[q] not empty then
        (j, h_j) ← queue[q].pop()
        for (i, j) in Ω_j^(q) do
            Do SGD updates
        end for
        Sample q' uniformly from {1,2,...,p}
        queue[q'].push((j, h_j))
    end if
Parallel End
• queue[q] is a concurrent object
• Distributed setting: the push to queue[q'] is a write over the network
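The control flow of the algorithm can be mimicked in a single process. The sketch below is not the authors' implementation: it replaces the parallel workers with a round-robin loop, uses ordinary `collections.deque` objects in place of concurrent queues, assigns row i to worker i mod p, and runs on a made-up rank-1 toy matrix. All function names, parameter defaults, and data are assumptions for illustration.

```python
import random
from collections import deque


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def rmse(ratings, W, H):
    se = sum((a - dot(W[i], H[j])) ** 2 for (i, j), a in ratings.items())
    return (se / len(ratings)) ** 0.5


def nomad_simulation(ratings, m, n, k=2, p=2, eta=0.02, lam=0.01,
                     sweeps=2000, seed=0):
    rng = random.Random(seed)
    W = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(m)]
    H = [[rng.uniform(0.0, 0.5) for _ in range(k)] for _ in range(n)]

    # Native variables: row i (and w_i) stays on worker i % p for the whole run.
    # local[q][j] lists the ratings (i, A_ij) of column j whose row lives on worker q.
    local = [dict() for _ in range(p)]
    for (i, j), a in ratings.items():
        local[i % p].setdefault(j, []).append((i, a))

    # Nomadic variables: each column index j starts in a random worker's queue.
    queues = [deque() for _ in range(p)]
    for j in range(n):
        queues[rng.randrange(p)].append(j)

    for _ in range(sweeps):
        for q in range(p):  # round-robin stands in for truly parallel workers
            if not queues[q]:
                continue
            j = queues[q].popleft()
            for i, a in local[q].get(j, []):
                err = dot(W[i], H[j]) - a
                w_old = W[i]
                W[i] = [w - eta * (err * h + lam * w) for w, h in zip(w_old, H[j])]
                H[j] = [h - eta * (err * w + lam * h) for w, h in zip(w_old, H[j])]
            queues[rng.randrange(p)].append(j)  # hand h_j to a random worker
    return W, H


# Toy rank-1 ratings matrix with every entry observed.
u, v = [1.0, 1.5, 2.0, 0.5], [1.0, 2.0, 0.5]
ratings = {(i, j): u[i] * v[j] for i in range(4) for j in range(3)}
W0, H0 = nomad_simulation(ratings, m=4, n=3, sweeps=0)     # initial factors
W1, H1 = nomad_simulation(ratings, m=4, n=3, sweeps=2000)  # after training
print(rmse(ratings, W0, H0), rmse(ratings, W1, H1))
```

Because a column lives in exactly one queue at any moment and rows never move, no two workers ever touch the same w_i or h_j, which is why the real algorithm needs no locks.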
Algorithm Complexity
• Average space required per worker: $O(mk/p + nk/p + |\Omega|/p)$
• Average time for one sweep: $O(|\Omega| k/p)$
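As a quick sanity check of these bounds, the expressions can be evaluated for concrete sizes. The helper below ignores constant factors and uses made-up toy values; its name and arguments are illustrative.

```python
def per_worker_costs(m, n, k, nnz, p):
    """Return (space, time_per_sweep) per worker, up to constant factors.

    m, n: number of rows and columns; k: rank; nnz: |Omega|; p: number of workers.
    """
    space = (m * k + n * k + nnz) / p   # share of W, share of H, share of ratings
    time = nnz * k / p                  # O(k) work per observed entry
    return space, time


print(per_worker_costs(m=100, n=50, k=10, nnz=1000, p=5))   # (500.0, 2000.0)
```

Both quantities shrink linearly in p, so doubling the number of workers halves the average memory and the average time per sweep under these estimates.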
Results on Applications
Recommender Systems
Netflix dataset: 2,649,429 users, 17,770 movies, ~100M ratings
[Figure: two panels, Multicore and Distributed]
Synthetic dataset: ~85M users, 17,770 items, ~8.5B observations
Link Prediction
• Flickr dataset: 1.9M users & 42M links.
• Test set: sampled 5K users.
Predicting Gene-Disease Links
[Figure: two panels, "Singleton gene rank" and "Test pair rank"; x-axis: Rank (0 to 100); y-axis: Pr(Rank <= x); methods compared: CATAPULT (Blom et al., 2013), Katz, Biased MF]
Sign Prediction
• Epinions dataset (+ve & -ve reviews):
  • 131K nodes, 840K edges, 15% edges negative.
• MF-ALS is faster and achieves higher accuracy.
• MF-ALS takes 455 secs on network with 1.1M nodes & 120M edges
Conclusions
• Rapid growth of unstructured data demands scalable machine learning solutions for analysis
• Machine learning problems arising in network analysis can be cast in the matrix completion framework
• Our proposed asynchronous distributed algorithm NOMAD outperforms state-of-the-art matrix completion solvers
• Beyond Matrix Completion: Asynchronous distributed framework for solving machine learning problems