CSE 701: Fast Algorithms for Graph Analytics · sariyuce.com/sem/firstclass.pdf
CSE 701: Fast Algorithms for Graph Analytics
A. Erdem Sariyuce
Who am I?
• My name is Erdem
• Office: 323 Davis Hall
• Office hours: Tuesday 10-12
• Research on graph (network) mining & management
  • Practical algorithms
    • Streaming, distributed, parallel
  • Leverage the characteristics of real-world data
  • Fast graph analytics
Heard About Big-Data?
• Yes, I do that
  • For graphs, mostly
• Not only big, but also
  • Dynamic
  • Incomplete
  • Noisy
  • Distributed
Graphs are everywhere
[Figure: example graphs labeled Social, Information, Protein-interaction, Routers]
What’s this class about?
• Mining graphs to get hidden insights
  • By finding patterns in complex structure
  • New models and algorithms
• On large data that cannot be examined manually
  • Computationally challenging
  • Fast algorithms needed
• On dynamic, incomplete, noisy data
What’s this class about?
• We will cover a range of topics about
  • Structure of real-world networks (graphs)
    • Small-world
    • Community structure
    • …
  • Practical algorithms for fast graph analytics
    • Centrality computation
    • Community detection
    • Graph partitioning
    • …
You?
• MS or PhD? What year?
• Any particular objective in this course?
• Any research interests?
Course Structure
• Presentations
• Questions before class
• Discussion in class
• Literature survey (or one additional presentation)
  • If taking 3 credits
Presentations
• Each week two papers
  • Paper list is available at the course website
    • http://sariyuce.com/seminar.html
• Each paper will be presented by a single student
  • No groups
• Each student will present one paper (more on this later)
Before class
• Questions on Piazza before each class (except the presenter)
  • Everyone will read the papers!
  • I’ll post some guides on how to read a paper
• Questions are due Tuesday night, 11:59 pm
  • Open-ended
  • Thought-provoking
  • Unique
  • ‘What does Fig. 4 tell us?’ is NOT a question
In class: Presentation
• For each paper:
  • At least a 1-hour presentation
    • It will be highly interactive, with input from me and others
    • You can find slides online; ask the authors if needed
      • But don’t rely on those too much!
      • Conference presentations are only 15-20 mins
  • Citation analysis for at least 10 mins
    • References: Which papers are cited in this paper?
    • Cited by: Which papers have cited this paper?
      • Google Scholar, Microsoft Academic Search
• You can get feedback on slides/talk before the class!
  • Ask in good time
In class: Citation Analysis
• References: Which papers are cited in this paper?
  • Briefly explain 5 references that form the basis for the paper
• Cited by: Which papers have cited this paper?
  • Google Scholar
    • https://scholar.google.com/
  • Microsoft Academic Search
    • https://academic.microsoft.com/
  • Check the ones at top venues
    • SIGKDD, WWW, WSDM, VLDB, SIGMOD, Nature, Science …
  • Check the ones that got the most citations
    • Microsoft Academic Search has that
  • Briefly explain 5 of those; what’s new there?
In Class: Discussion
• Presenter will read the posted questions
  • From Piazza
• And initiate discussion
  • Give his/her opinion, others should chip in as well
• Class participation points will be earned here
  • I might force you to be a volunteer :)
• Each class is 150 mins, so we have plenty of time
Grading is S/U
• This is a seminar class! 75% is needed for an S
• 1 or 2 credits:
  • Paper presentation: 40%
  • Piazza questions: 30%
  • Class participation: 30%
• 3 credits:
  • Paper presentation: 40%
  • Piazza questions: 20%
  • Class participation: 20%
  • Literature survey (individual): 20%
    • Or one additional presentation (we have 26 papers in total)
    • 16 students in total; 8 with 3 credits
    • Students can do an extra presentation to waive the literature survey
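As a rough illustration of the grading arithmetic on this slide, the Python sketch below computes a weighted total and checks it against the 75% threshold. The weights come from the slide. The function names, the 0-100 scale for each component, the inclusive threshold, and scoring an extra presentation in the survey slot are assumptions made only for this example.

```python
# Weights are taken from the grading slide; all names here are illustrative.
WEIGHTS = {
    "1-2 credits": {"presentation": 0.40, "piazza": 0.30, "participation": 0.30},
    "3 credits": {
        "presentation": 0.40,
        "piazza": 0.20,
        "participation": 0.20,
        "survey": 0.20,  # assumption: an extra presentation is scored in this slot
    },
}
PASS_THRESHOLD = 75.0  # percent needed for an S (assumed inclusive)


def overall_score(scores, plan):
    """Weighted total in percent; `scores` maps component -> score in [0, 100]."""
    weights = WEIGHTS[plan]
    return sum(weights[part] * scores[part] for part in weights)


def grade(scores, plan):
    """Return 'S' if the weighted total reaches the threshold, else 'U'."""
    return "S" if overall_score(scores, plan) >= PASS_THRESHOLD else "U"
```

For instance, scores of 80, 70, and 70 on a 1-2 credit plan give 0.4·80 + 0.3·70 + 0.3·70 = 74, which falls just short of an S under these assumptions.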
Literature Survey
• On a particular subject
  • Find, read, and summarize/categorize the previous work
• Talk to me about the topic
• Report is required by the end of the semester
  • Progress update in the 6th or 7th week; I will let you know
• If done well:
  • We can go for a paper!
  • Survey papers are cited most, can be quite impactful
Paper List and Schedule
• We will decide on Piazza
  • I’ll post it at the end of the class
• First come, first served; be quick!
Papers
• Sep 11: Four Degrees of Separation
  • Web Science Conference, 2012
  • Cited by +500
  • Small-world phenomena
  • Analysis of the entire Facebook network!
• Sep 11: Graph structure in the Web
  • Computer Networks, 2000
  • Cited by +3900
  • Characterizes the web
  • Bow-tie structure
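The small-world idea in the first paper is usually quantified by the average hop distance between nodes. The following is a minimal Python sketch of that measurement using exact breadth-first search on a toy graph. Exact all-pairs BFS is only practical on small inputs and is not the method used in the paper; the function names and the toy graph are illustrative assumptions.

```python
from collections import deque


def bfs_distances(adj, source):
    """Hop distances from `source` to every reachable node (unweighted BFS).

    `adj` maps each node to an iterable of neighbors; every node must be a key.
    """
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def average_distance(adj):
    """Mean hop distance over ordered pairs (s, t), s != t, with t reachable from s."""
    total = pairs = 0
    for s in adj:
        for t, d in bfs_distances(adj, s).items():
            if t != s:
                total += d
                pairs += 1
    return total / pairs if pairs else 0.0


# Toy undirected graph: the 4-cycle a-b-c-d-a.
toy = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "a"]}
```

On the 4-cycle, every node has two neighbors at distance 1 and one node at distance 2, so `average_distance(toy)` evaluates to 4/3.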
Papers
• Sep 18: Statistical properties of community structure in large social and information networks
  • World Wide Web Conference (WWW), 2008
  • Cited by +860
  • Detailed analysis of real communities in a variety of domains
  • Interesting conclusions on community size
• Sep 18: Uncovering the overlapping community structure of complex networks in nature and society
  • Nature, 2005
  • Cited by +5000
  • Overlapping communities
  • Clique based formulation
Papers
• Sep 25: Authoritative Sources in a Hyperlinked Environment
  • Journal of the ACM, 1999
  • Cited by +9400
  • Hubs and authorities
  • One of the most influential works
• Sep 25: Graphs over time: densification laws, shrinking diameters and possible explanations
  • SIGKDD, 2005
  • Cited by +2200
  • Best paper in ’05, Test-of-time award in ’16
  • First work on graph evolution
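The hubs-and-authorities idea from the first paper on this slide can be sketched as a short power iteration: a page's authority score sums the hub scores of pages linking to it, and its hub score sums the authority scores of pages it links to. The Python below is a minimal sketch under assumptions, not the paper's full procedure; the fixed iteration count, the L2 normalization, and the toy link graph are illustrative choices.

```python
def hits(out_links, iterations=50):
    """Return (hub, authority) score dicts for a directed graph.

    `out_links` maps a page to the pages it links to. The iteration count
    is an arbitrary illustrative default.
    """
    nodes = set(out_links) | {v for targets in out_links.values() for v in targets}
    hub = {n: 1.0 for n in nodes}
    auth = {n: 1.0 for n in nodes}
    for _ in range(iterations):
        # Authority update: sum the hub scores of pages that link to the page.
        auth = {n: 0.0 for n in nodes}
        for u, targets in out_links.items():
            for v in targets:
                auth[v] += hub[u]
        # Hub update: sum the authority scores of the pages the page links to.
        hub = {n: sum(auth[v] for v in out_links.get(n, ())) for n in nodes}
        # Rescale both vectors to unit L2 norm so scores stay bounded.
        for scores in (auth, hub):
            norm = sum(x * x for x in scores.values()) ** 0.5 or 1.0
            for n in scores:
                scores[n] /= norm
    return hub, auth


# Toy web: three pages link to "a"; only "p2" also links to "b".
toy_links = {"p1": ["a"], "p2": ["a", "b"], "p3": ["a"]}
hub_scores, auth_scores = hits(toy_links)
```

In this toy graph, "a" ends up as the strongest authority and "p2" as the strongest hub, since "p2" points to both linked-to pages.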
Papers
• Oct 2: The Link Prediction Problem for Social Networks
  • JASIST, 2007
  • Cited by +9400
  • Predicting future links from network proximity measures
  • One of the most influential works
• Oct 2: Simplicial closure and higher-order link prediction
  • PNAS, 2018
  • Beyond pair-wise
  • Higher-order relationships
  • Adapting triangle closure
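Pairwise link prediction is commonly framed as scoring every non-adjacent node pair and ranking the pairs by score. The sketch below shows two standard neighborhood-based scores, common neighbors and Adamic/Adar, on a toy undirected graph. The function names, the toy data, and the exhaustive candidate enumeration are assumptions for illustration; the higher-order (simplicial) setting of the second paper is not covered.

```python
from itertools import combinations
from math import log


def common_neighbors(adj, u, v):
    """Number of neighbors shared by u and v."""
    return len(adj[u] & adj[v])


def adamic_adar(adj, u, v):
    """Shared neighbors weighted by 1 / log(degree).

    A shared neighbor of two distinct nodes has degree >= 2, so log(...) > 0.
    """
    return sum(1.0 / log(len(adj[w])) for w in adj[u] & adj[v])


def rank_candidate_links(adj, score):
    """Non-adjacent pairs sorted by `score`, highest first."""
    candidates = [(u, v) for u, v in combinations(sorted(adj), 2) if v not in adj[u]]
    return sorted(candidates, key=lambda pair: score(adj, *pair), reverse=True)


# Toy undirected graph stored as neighbor sets.
toy = {
    "a": {"b", "c"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"b", "c", "e"},
    "e": {"d"},
}
ranking = rank_candidate_links(toy, common_neighbors)
```

Here the pair ("a", "d") ranks first because it shares two neighbors, "b" and "c", while every other missing pair shares at most one.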
Papers
• Oct 9: A Fast and High Quality Multilevel Scheme for Partitioning Irregular Graphs
  • SIAM Journal on Scientific Computing (SISC), 1998
  • Cited by +4800, widely used
  • Ground-breaking work on graph partitioning
  • Efficient multi-level heuristics
• Oct 9: Experimental Analysis of Streaming Algorithms for Graph Partitioning
  • SIGMOD, 2019
  • Streaming graph partitioning
  • Comparative survey
  • Very important for production-level deployments
Papers
• Oct 16: Incremental k-core decomposition: algorithms and evaluation
  • Very Large Data Bases Journal (VLDBJ), 2016
  • Maintaining graph analytics for streaming graphs
  • k-core decomposition is a fundamental operation
  • Density pointers
• Oct 16: A Fast Order-Based Approach for Core Maintenance
  • ICDE, 2017
  • Better performance than the paper above
  • Additional indexing
  • Runtime vs. space trade-off
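For readers new to k-core decomposition, the static version can be sketched with the classic peeling procedure: repeatedly remove a minimum-degree node and record the largest minimum degree seen so far as its core number. The Python below is a deliberately simple sketch of that static computation, not the incremental maintenance algorithms of the two papers. It picks the minimum by linear scan, so it is quadratic rather than the linear-time bucket-based variant; names and toy data are illustrative.

```python
def core_numbers(adj):
    """Core number of every node in an undirected graph, by peeling.

    `adj` maps each node to a set of neighbors. The linear scan for the
    minimum-degree node is chosen for clarity, not speed.
    """
    degree = {v: len(adj[v]) for v in adj}
    remaining = set(adj)
    core = {}
    k = 0
    while remaining:
        v = min(remaining, key=degree.get)
        k = max(k, degree[v])  # core numbers never decrease along the peeling order
        core[v] = k
        remaining.remove(v)
        for w in adj[v]:
            if w in remaining:
                degree[w] -= 1
    return core


# Toy graph: triangle a-b-c with a pendant node d attached to c.
toy = {"a": {"b", "c"}, "b": {"a", "c"}, "c": {"a", "b", "d"}, "d": {"c"}}
```

On the toy graph, the pendant node "d" gets core number 1 and the triangle nodes get core number 2.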
Papers
• Oct 23: Maximizing the Spread of Influence through a Social Network
  • SIGKDD, 2003
  • Cited by +6300
  • Formalizes viral marketing
  • Very influential
• Oct 23: Influence Maximization on Social Graphs: A Survey
  • TKDE, 2018
  • Comprehensive survey
  • All follow-ups since the paper above
Papers
• Oct 30: Signed Networks in Social Media
  • CHI, 2010
  • Cited by +940
  • What happens if we have +/- labels on edges?
  • Structural balance theory verified
• Oct 30: A Survey of Signed Network Mining in Social Media
  • ACM CSUR, 2016
  • Another comprehensive survey
  • Covers all graph mining works on signed networks
Papers
• Nov 6: Finding a Maximum Density Subgraph
  • Berkeley TR, 1984
  • Beautiful theory paper
  • Finding a subgraph with largest average degree
  • Influenced many works
• Nov 6: Denser than the Densest Subgraph: Extracting Optimal Quasi-Cliques with Quality Guarantees
  • SIGKDD, 2013
  • How to generalize the paper above for triangles?
  • Novel quasi-clique formulation
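To make the "largest average degree" objective concrete, the sketch below uses greedy peeling: remove a minimum-degree node at each step and remember the densest intermediate subgraph, where density is edges divided by nodes (half the average degree). This is a well-known 2-approximation heuristic, swapped in here for brevity; it is not the exact flow-based method of the 1984 technical report. Function names and the toy graph are illustrative assumptions.

```python
def greedy_densest_subgraph(adj):
    """Approximate densest subgraph of an undirected graph by greedy peeling.

    `adj` maps each node to a set of neighbors. Returns (nodes, density),
    with density = |E| / |V| of the best subgraph seen during peeling.
    """
    adj = {v: set(ns) for v, ns in adj.items()}  # work on a copy
    edges = sum(len(ns) for ns in adj.values()) // 2
    best_nodes = set(adj)
    best_density = edges / len(adj) if adj else 0.0
    while len(adj) > 1:
        v = min(adj, key=lambda x: len(adj[x]))  # a minimum-degree node
        edges -= len(adj[v])
        for w in adj.pop(v):
            adj[w].discard(v)
        density = edges / len(adj)
        if density > best_density:
            best_nodes, best_density = set(adj), density
    return best_nodes, best_density


# Toy graph: a 4-clique on a, b, c, d with a tail d-e-f.
toy = {
    "a": {"b", "c", "d"},
    "b": {"a", "c", "d"},
    "c": {"a", "b", "d"},
    "d": {"a", "b", "c", "e"},
    "e": {"d", "f"},
    "f": {"e"},
}
nodes, density = greedy_densest_subgraph(toy)
```

On the toy graph the tail is peeled away first, leaving the 4-clique with 6 edges on 4 nodes, a density of 1.5.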
Papers
• Nov 13: Trusses: Cohesive Subgraphs for Social Network Analysis
  • NSA TR, 2008
  • Generalization of k-core model to triangles
  • Influenced many works
• Nov 13: Finding the Hierarchy of Dense Subgraphs using Nucleus Decompositions
  • WWW, 2015
  • Unification of core/truss models for higher orders
  • Hierarchical dense subgraph discovery
Papers
• Nov 20: Network Motifs: Simple Building Blocks of Complex Networks
  • Science, 2002
  • Cited by +6100
  • Small induced subgraphs
  • Fundamental units of complex networks
• Nov 20: Uncovering Biological Network Function via Graphlet Degree Signatures
  • Cancer Informatics, 2008
  • Cited by +270
  • Extends degree concept to motifs
  • Very simple local statistics to capture the node function
Papers
• Nov 27: A Faster Algorithm for Betweenness Centrality
  • Journal of Mathematical Sociology, 2001
  • Cited by +3100
  • Finding nodes that are central in the graph
  • Reduces complexity from O(V^3) to O(V·E) on unweighted graphs
• Nov 27: Centrality and network flow
  • Social Networks, 2004
  • Cited by +2700
  • Considers flows on edges
  • Not true that all paths are equally useful
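The O(V·E) bound for unweighted graphs comes from running one BFS per source and then accumulating pair dependencies in reverse BFS order, instead of comparing all pairs of shortest paths explicitly. The Python below is a minimal sketch of that idea for unweighted graphs. Variable names and the toy graphs are illustrative, and the scores count ordered source-target pairs, so values for an undirected graph should be halved.

```python
from collections import deque


def betweenness_centrality(adj):
    """Betweenness scores via one BFS plus one back-propagation pass per source.

    `adj` maps each node to an iterable of neighbors; every node must be a key.
    Scores are summed over ordered (source, target) pairs.
    """
    bc = {v: 0.0 for v in adj}
    for s in adj:
        order = []                       # nodes in non-decreasing distance from s
        preds = {v: [] for v in adj}     # shortest-path predecessors
        sigma = {v: 0 for v in adj}      # number of shortest paths from s
        dist = {v: -1 for v in adj}
        sigma[s], dist[s] = 1, 0
        queue = deque([s])
        while queue:
            v = queue.popleft()
            order.append(v)
            for w in adj[v]:
                if dist[w] < 0:          # first time w is reached
                    dist[w] = dist[v] + 1
                    queue.append(w)
                if dist[w] == dist[v] + 1:  # v lies on a shortest path to w
                    sigma[w] += sigma[v]
                    preds[w].append(v)
        delta = {v: 0.0 for v in adj}    # dependency of s on each node
        for w in reversed(order):
            for v in preds[w]:
                delta[v] += sigma[v] / sigma[w] * (1.0 + delta[w])
            if w != s:
                bc[w] += delta[w]
    return bc


path = {"a": ["b"], "b": ["a", "c"], "c": ["b"]}
cycle = {"a": ["b", "d"], "b": ["a", "c"], "c": ["b", "d"], "d": ["c", "a"]}
```

On the path a-b-c, only "b" lies between other nodes and receives a score of 2.0 (one per ordered pair); on the 4-cycle, each node carries half of the shortest paths between its two neighbors and scores 1.0.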
Papers
• Dec 4: Higher-order organization of complex networks
  • Science, 2016
  • Cited by +150
  • Analyzing the higher-order structures (Not pair-wise relations)
  • How triangles and other small motifs impact the structure
• Dec 4: Representing higher-order dependencies in networks
  • Science Advances, 2016
  • A different approach to model higher-order structures
  • Non-Markovian property