Confidential 1
Can you connect the dots, like the graph below, without lifting up
your pencil?
Can you connect each dot on the top to each dot on the bottom, like
above, without crossing lines?
There are pens at the sign-in table if you want to try these out
Silicon Valley | Silicon Harbor
Can You Trust The Internet? An intro to graph invariants, computational
complexity, and graph algorithms (with a little bit of cryptography)
Denise K. Gosnell, Ph.D. Data Scientist, PokitDok
Graph Theory?
G = (V,E)
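A minimal sketch of what G = (V,E) can look like in code. The example graph and the names `V`, `E`, and `adjacency` are illustrative assumptions, not taken from the slides.

```python
# Hypothetical example graph: a vertex set V and an edge set E.
V = {"a", "b", "c", "d"}
E = {("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")}

def adjacency(vertices, edges):
    """Build an undirected adjacency list from (V, E)."""
    adj = {v: set() for v in vertices}
    for u, w in edges:
        adj[u].add(w)
        adj[w].add(u)
    return adj

adj = adjacency(V, E)
# The degree of a vertex is the number of edges touching it.
degrees = {v: len(adj[v]) for v in V}
```

An adjacency list like this is one common representation; an edge list or an adjacency matrix would carry the same information.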
The Plan
1. Graph Properties and Reddit
2. Computational Complexity
   P: Bipartite Graph Matching
   NP: Graph Coloring
3. A little bit of RSA
Why should you care?
1. Everything is graph theoretic in nature.
2. Much easier to code.
3. or… you just want to be here to look cool.
The Seven Bridges of Königsberg
"Konigsberg bridges" by Bogdan Giuşcă - Public domain (PD) http://commons.wikimedia.org/wiki/File:Konigsberg_bridges.png#/media/File:Konigsberg_bridges.png
The Seven Bridges of Königsberg: Solved by Euler in 1735
Thank you Wikipedia: http://en.wikipedia.org/wiki/Seven_Bridges_of_Konigsberg
Eulerian Paths
[Figure: two graphs compared ("vs."), with vertex degrees labeled 3, 3, 5, 3 and 2, 4, 4, 3, 3]
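The degree labels on this slide (3, 3, 5, 3 and 2, 4, 4, 3, 3) are what decide the pencil puzzle. A rough sketch of Euler's degree test, assuming each graph is connected: a path that uses every edge exactly once exists only when zero or two vertices have odd degree. The function name is hypothetical, and pairing each degree list with a specific drawing is an inference from the numbers (the first list matches the Königsberg graph).

```python
def has_eulerian_path(degrees):
    """Degree test for a connected graph: 0 or 2 odd-degree vertices."""
    odd = sum(1 for d in degrees if d % 2 == 1)
    return odd in (0, 2)

konigsberg_degrees = [3, 3, 5, 3]   # four odd-degree vertices
second_graph_degrees = [2, 4, 4, 3, 3]  # two odd-degree vertices

konigsberg_ok = has_eulerian_path(konigsberg_degrees)
second_ok = has_eulerian_path(second_graph_degrees)
```

With four odd vertices the bridges cannot be walked in one pass, while the second degree list passes the test.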
screen shot from vizit: http://redditstuff.github.io/sna/vizit/
P
P The set of all decision problems that can be solved by a deterministic Turing machine using a polynomial amount of computational time
Classes of Algorithm Complexity
O(1)
• Accessing an element of an array
• Determining if a number is even or odd
O(log(n))
• Binary search [e.g., intelligently searching through a dictionary]
O(n)
• Linear search: finding a min or max in an unsorted list
• Graph Search
O(n^2)
• Looking for a word in a word search
• Bad sorting algorithms like bubble sort, insertion sort
…
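To make the gap between O(log(n)) and O(n) concrete, here is a small sketch that counts comparisons for linear and binary search over the same sorted list. The function names and the list of 1024 integers are assumptions chosen for illustration.

```python
def linear_search_steps(items, target):
    """Scan left to right; return how many comparisons were made."""
    steps = 0
    for x in items:
        steps += 1
        if x == target:
            break
    return steps

def binary_search_steps(sorted_items, target):
    """Halve the search range each step; return how many probes were made."""
    lo, hi, steps = 0, len(sorted_items) - 1, 0
    while lo <= hi:
        steps += 1
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            break
        if sorted_items[mid] < target:
            lo = mid + 1
        else:
            hi = mid - 1
    return steps

data = list(range(1024))
linear_steps = linear_search_steps(data, 1023)   # worst case: the last element
binary_steps = binary_search_steps(data, 1023)
```

Looking for the last element costs 1024 comparisons linearly, but only about log2(1024) + 1 probes with binary search.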
Graph Search: O(|V| + |E|)
Depth vs. Breadth
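A minimal sketch of the two searches, assuming the graph is given as an adjacency list (a dict of neighbor lists, which is not something the slides specify). Each vertex is handled once and each edge is examined a constant number of times, which is where O(|V| + |E|) comes from.

```python
from collections import deque

def bfs(adj, start):
    """Breadth-first: visit vertices in order of distance from start."""
    order, seen, queue = [], {start}, deque([start])
    while queue:
        v = queue.popleft()
        order.append(v)
        for w in adj[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return order

def dfs(adj, start):
    """Depth-first: follow one branch as far as possible before backing up."""
    order, seen, stack = [], set(), [start]
    while stack:
        v = stack.pop()
        if v in seen:
            continue
        seen.add(v)
        order.append(v)
        # Push neighbors in reverse so the first neighbor is explored first.
        for w in reversed(adj[v]):
            if w not in seen:
                stack.append(w)
    return order

# Hypothetical example graph.
adj = {1: [2, 3], 2: [1, 4], 3: [1, 4], 4: [2, 3, 5], 5: [4]}
bfs_order = bfs(adj, 1)
dfs_order = dfs(adj, 1)
```

The only structural difference is the container: a queue gives breadth-first order, a stack gives depth-first order.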
Graph Matching Given a bipartite graph G = (V,E), a matching M in G is a set of edges in which no two edges share a common vertex
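The definition translates almost directly into a check. A small sketch, assuming edges are given as (candidate, job) pairs with hypothetical labels:

```python
def is_matching(edges):
    """True if no two edges in the collection share a vertex."""
    used = set()
    for u, w in edges:
        if u in used or w in used:
            return False
        used.update((u, w))
    return True

valid = is_matching([("c1", "j1"), ("c2", "j2")])
invalid = is_matching([("c1", "j1"), ("c1", "j2")])  # c1 appears twice
```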
The User Preferences Problem
Candidates → Jobs
The User Preferences Problem Goal: assign candidates to jobs to fill as many jobs as possible
The User Preferences Problem Greedy Algorithm:
Keep adding edges until no more edges can be added
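A sketch of the greedy idea, assuming the candidate-to-job preferences arrive as a list of (candidate, job) edges. The labels are hypothetical. The example is chosen so that greedy gets stuck with one assignment even though two are possible.

```python
def greedy_matching(edges):
    """Take each edge in order if neither endpoint is already matched."""
    matching, used = [], set()
    for u, w in edges:
        if u not in used and w not in used:
            matching.append((u, w))
            used.update((u, w))
    return matching

# c1 can do j1 or j2; c2 can only do j1.
edges = [("c1", "j1"), ("c1", "j2"), ("c2", "j1")]
greedy = greedy_matching(edges)
```

Greedy assigns c1 to j1 first, which blocks c2. Assigning c1 to j2 and c2 to j1 would have filled both jobs.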
The User Preferences Problem Greedy Approach
The User Preferences Problem A better solution: via augmenting paths
Augmenting Paths
• A path P is M-alternating if the edges of the path alternate between being in the matching M and not in the matching M.
• A path P is M-augmenting if it is M-alternating and its first and last vertices are unmatched, so the first and last edges are not in the matching M.
Augmenting Paths
You can improve the matching M by:
1. Remove from the matching M the edges of the path P that are in the matching M
2. Add to the matching M the edges of the path P that are not in the matching M
3. The resulting matching has one more edge than M
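Steps 1 and 2 together are a symmetric difference of two edge sets. A minimal sketch, assuming undirected edges stored as frozensets and hypothetical candidate/job labels:

```python
def augment(matching, path_edges):
    """Return the symmetric difference of the matching and the path's edges."""
    m = {frozenset(e) for e in matching}
    p = {frozenset(e) for e in path_edges}
    return m ^ p  # drops path edges already in M, adds path edges not in M

M = [("c1", "j1")]
# Augmenting path c2 - j1 - c1 - j2: unmatched, matched, unmatched.
P = [("c2", "j1"), ("j1", "c1"), ("c1", "j2")]
new_M = augment(M, P)
```

Because the path starts and ends with edges outside M, it contains one more unmatched edge than matched edges, so the result grows by exactly one.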
The User Preferences Problem Augmenting Path Example:
The User Preferences Problem Augmenting Path Example:
Berge’s Theorem:
A matching in a graph is maximum if and only if there does not exist an augmenting path in the graph
Maximum Matching by Constructing the Auxiliary Graph
Maximum Matching by Constructing the Auxiliary Graph:
Theorem: G has an augmenting path if and only if the auxiliary graph has a directed path from the source to the sink
Maximum Matching by Constructing the Auxiliary Graph:
Maximum Matching Final Solution:
EdmondsAlgorithm(G):
    M = empty matching
    while there is an augmenting path P for M:
        M = M ⊕ P    (symmetric difference)
    output M

AugmentingPath(G, M):
    G' = auxiliary graph for G, M
    P = path from source to sink (via BFS)
    if P is null:
        return false
    else:
        delete s and t from P and return P
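One way this pseudocode might look as runnable Python, for the bipartite case only. This is a sketch under assumptions: the graph is a dict from each left vertex (candidate) to its list of right vertices (jobs), the function names are hypothetical, and the auxiliary graph is kept implicit instead of being built. Free candidates stand in for the source's neighbors and free jobs for the sink's neighbors.

```python
from collections import deque

def find_augmenting_path(adj, match_left, match_right):
    """BFS for an augmenting path; returns its unmatched edges, or None.

    Unmatched edges are followed left -> right and matched edges
    right -> left, as in the directed auxiliary graph.
    """
    parent = {}  # right vertex -> left vertex it was reached from
    queue = deque(u for u in adj if u not in match_left)
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v in parent:
                continue
            parent[v] = u
            if v not in match_right:  # free job reached: walk the path back
                path = []
                while True:
                    u = parent[v]
                    path.append((u, v))
                    if u not in match_left:
                        return path
                    v = match_left[u]  # step back over the matched edge
            queue.append(match_right[v])
    return None

def maximum_matching(adj):
    """Repeat augmentation until no augmenting path remains (Berge)."""
    match_left, match_right = {}, {}
    while True:
        path = find_augmenting_path(adj, match_left, match_right)
        if path is None:
            return match_left
        for u, v in path:  # flipping the path: M = M symmetric-difference P
            match_left[u] = v
            match_right[v] = u

# Hypothetical preferences: all keys are candidates, values are jobs.
adj = {"c1": ["j1", "j2"], "c2": ["j1"], "c3": ["j2", "j3"]}
result = maximum_matching(adj)
```

On this example the first two augmentations are greedy picks (c1 to j1, c3 to j2). The third finds the path c2, j1, c1, j2, c3, j3 and reassigns everyone so that all three jobs are filled.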
Graph Algorithms in P
• Maximum (minimum) degree
• Finding connected components [BFS, DFS]
• Pairwise shortest path algorithms [Dijkstra’s, Bellman-Ford, Floyd-Warshall]
• Diameter
• Girth [shortest cycle]
• Edge Covering Number
• …
NP
NP Nondeterministic Polynomial: the set of all decision problems where a “yes” instance can be verified by a deterministic Turing machine in polynomial time (equivalently, solved by a non-deterministic Turing machine in polynomial time)
P → solvable in polynomial time
NP → verifiable in polynomial time
Classic NP Problems:
Integer (prime?) Factorization
Graph Coloring (this isn’t what you think)
The Knapsack Problem
The Traveling Salesman
Integer (prime?) Factorization
Integer (prime?) Factorization
Factoring a 232 digit number took over two years and utilized hundreds of machines. Paper: “Factorization of a 768-bit RSA modulus”. Kleinjung, et al. 2010.
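The asymmetry behind that two-year effort can be sketched on a toy number: checking a claimed factorization is one multiplication, while finding the factors by brute force means trying divisors one by one. Trial division here is only a stand-in for illustration. The record factorization used far more sophisticated methods, and the function names and the modulus 3233 are assumptions.

```python
def verify_factorization(n, p, q):
    """Checking a claimed answer: a single multiplication."""
    return p > 1 and q > 1 and p * q == n

def trial_division(n):
    """Finding an answer by brute force: try every divisor up to sqrt(n)."""
    d = 2
    while d * d <= n:
        if n % d == 0:
            return d, n // d
        d += 1
    return None  # no nontrivial factor found, so n is prime (or too small)

n = 3233  # toy modulus, 61 * 53; real RSA moduli have hundreds of digits
factors = trial_division(n)
checked = verify_factorization(n, 61, 53)
```

The loop in `trial_division` grows with the square root of n, which is exponential in the number of digits, while the check stays cheap.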
Graph Coloring Minimum number of colors required to color the vertices of G such that no two adjacent vertices are the same color
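A small sketch of why coloring sits where it does: verifying a proposed coloring takes one pass over the edges, but the straightforward way to find the minimum number of colors tries every assignment. The brute-force search and the example graph are illustrative assumptions and only practical for a handful of vertices.

```python
from itertools import product

def is_proper_coloring(edges, coloring):
    """Verification: no edge joins two vertices of the same color."""
    return all(coloring[u] != coloring[w] for u, w in edges)

def chromatic_number(vertices, edges):
    """Brute force: try k = 1, 2, ... colors; exponential in |V|."""
    vertices = list(vertices)
    for k in range(1, len(vertices) + 1):
        for assignment in product(range(k), repeat=len(vertices)):
            if is_proper_coloring(edges, dict(zip(vertices, assignment))):
                return k
    return 0  # graph with no vertices

# Hypothetical graph: a triangle a-b-c with a tail c-d.
edges = [("a", "b"), ("b", "c"), ("c", "a"), ("c", "d")]
k = chromatic_number("abcd", edges)
```

The triangle forces three colors, and the tail vertex can reuse one of them.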
Graph Coloring: Greedy approach
Graph Coloring: Greedy approach
Greedy Graph Coloring via a BFS:
ColorGraph(G, v):
    colors = []
    let Q be a queue
    Q.enqueue(v)
    label v as discovered
    v.color = new color, update colors
    while Q is not empty:
        v ← Q.dequeue()
        for all edges from v to w:
            if w is not labeled as discovered:
                Q.enqueue(w)
                label w as discovered
                neighbor_colors = set of color assignments of the neighbors of w
                if neighbor_colors == colors:
                    w.color = new color, update colors
                else:
                    w.color = a color from colors - neighbor_colors
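A possible Python rendering of this greedy coloring, as a sketch. It assumes an adjacency-list dict, numbers the colors 0, 1, 2, ..., and always picks the smallest color not used by an already-colored neighbor, which is one specific way of choosing "a color from colors - neighbor_colors". The function name and example graph are hypothetical.

```python
from collections import deque

def color_graph(adj, start):
    """Greedy BFS coloring of the component containing start."""
    color = {start: 0}  # a vertex is "discovered" once it has a color
    queue = deque([start])
    while queue:
        v = queue.popleft()
        for w in adj[v]:
            if w not in color:
                # Colors already taken by w's colored neighbors.
                used = {color[x] for x in adj[w] if x in color}
                c = 0
                while c in used:  # smallest free color; may be a new one
                    c += 1
                color[w] = c
                queue.append(w)
    return color

# Hypothetical graph: a triangle a-b-c with a tail c-d.
adj = {"a": ["b", "c"], "b": ["a", "c"], "c": ["a", "b", "d"], "d": ["c"]}
coloring = color_graph(adj, "a")
num_colors = max(coloring.values()) + 1
```

Greedy coloring always yields a proper coloring, but the number of colors it uses depends on the visiting order and is not guaranteed to be the minimum.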
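The Four Color Theorem
Every planar graph is four colorable.
1997: Thomas (simplified proof by Robertson, Sanders, Seymour, and Thomas; the theorem was first proved by Appel and Haken in 1976)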
Non-Planar Graphs
Can you connect each dot on the top to each dot on the bottom (like
above), without crossing lines?
Finding non-planar graphs
K5 K3,3
Graph Coloring The chromatic number of a graph has a constrained optimization version that is impossible to approximate within any constant factor unless P = NP. -1996: Zuckerman
About this P =? NP.
The P versus NP Problem: Essentially: can every problem whose solution can be checked by a computer in polynomial time also be solved by a computer in polynomial time? Formal Conjecture: 1971 by Stephen Cook
Why should you care?
Integer (prime?) Factorization
Graph Coloring (this isn’t what you think)
The Knapsack Problem
The Traveling Salesman
RSA:
1. Pick two extremely large prime numbers p and q
2. Public key: (e, n) where:
   n = p · q
   e in [3, (p - 1)(q - 1)], with gcd(e, (p - 1)(q - 1)) = 1
3. Private key: (d, n) where:
   n = p · q
   e · d ≡ 1 mod ((p - 1)(q - 1))
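A toy walk-through of these three steps, as a sketch only. The primes 61 and 53, the exponent 17, and the message 65 are illustrative assumptions, far too small to be secure, and real RSA also applies padding that is omitted here. The modular inverse via three-argument `pow` needs Python 3.8 or later.

```python
from math import gcd

# Step 1: two (tiny, insecure) primes.
p, q = 61, 53

# Step 2: public key (e, n).
n = p * q
phi = (p - 1) * (q - 1)
e = 17
assert gcd(e, phi) == 1  # e must be invertible mod (p - 1)(q - 1)

# Step 3: private key (d, n), with e * d = 1 mod ((p - 1)(q - 1)).
d = pow(e, -1, phi)

# Encrypt with the public key, decrypt with the private key.
message = 65
cipher = pow(message, e, n)
plain = pow(cipher, d, n)
```

Here n = 3233 and d = 2753; the message 65 encrypts to 2790 and decrypts back to 65. Anyone who can factor n into p and q can recompute d, which is why the difficulty of factoring matters.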
RSA: The security of RSA rests on the assumption that, given a large composite number, determining its prime factors is a hard problem. An NP problem, in fact.
What just happened?
1. Graph Properties and Reddit
2. Computational Complexity
   P: Bipartite Graph Matching
   NP: Graph Coloring
3. A little bit of RSA
Graph Theory Resources:
• Introduction to Graph Theory 2nd Ed (West)
• Introduction to Graph Theory (Chartrand)
• Social Network Analysis (Wasserman)
• Introduction to Algorithms 3rd Edition (Cormen, …, Stein)
… the Wikipedia pages aren’t too shabby.
Links in the Presentation Notes:
• Reddit Viz: http://redditstuff.github.io/sna/vizit/#
• YouTube Lecture on Graph Matching: https://www.youtube.com/watch?v=NlQqmEXuiC8
• Graph Matching Code: http://www.geeksforgeeks.org/maximum-bipartite-matching/
• RSA Detailed Example: http://doctrina.org/How-RSA-Works-With-Examples.html
• Factorization of a 768-bit RSA modulus: http://eprint.iacr.org/2010/006.pdf
Graph Tech Stack:
Databases:
• Titan
• Neo4J
• OrientDB
Visualization:
• Gephi
• sigma.js
• GraphViz
Graph Tech Stack:
Algorithm Libraries:
• Spark
• Gremlin (Gremthon)
• Boost (C++)
• JGraphT (Java)
• NetworkX
• Python-Graph
… there are plenty more to dive into. This is just a start.
Can You Trust The Internet? An intro to graph invariants, computational
complexity, and graph algorithms (with a little bit of cryptography)
Denise K. Gosnell, Ph.D. Data Scientist, PokitDok
T: @DeniseKGosnell
Can you connect the dots, like the graph below, without lifting up
your pencil?
yes.
Can you connect each dot on the top to each dot on the bottom (like
above), without crossing lines?
no.