Loom and Graphs in Clojure
DESCRIPTION
Graph algorithms are cool and fascinating. We'll look at Loom, a graph algorithms and visualization library written in Clojure. We will discuss the graph API, look at the implementation of the algorithms, and learn about the integration of Loom with Titanium, which allows us to run the algorithms on, and visualize, data in graph databases.
TRANSCRIPT
Loom and Graphs in Clojure
github.com/aysylu/loom
Aysylu Greenberg @aysylu22; http://aysy.lu
LispNYC, August 13th 2013
Overview
• Loom's Graph API
• Graph Algorithms in Loom
• Titanium Loom
• Static Single Assignment (SSA) Loom
Loom's Graph API
• Graph, Digraph, Weighted Graph
• FlyGraph
  o read-only, ad-hoc
  o edges from nodes + successors
  o nodes and edges from successors + start
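The last FlyGraph bullet says that nodes and edges can be worked out from nothing more than a successors function and a start node. A minimal sketch of that idea in plain Clojure follows. It is not Loom's implementation, and `reachable-nodes`, `reachable-edges` and `binary-tree-successors` are hypothetical names used only for illustration.

```clojure
;; Hypothetical helpers, not part of Loom: derive nodes and edges
;; from a successors function and a start node.
(defn reachable-nodes
  "Returns the set of nodes reachable from start by following successors."
  [successors start]
  (loop [seen #{start}
         frontier [start]]
    (if (empty? frontier)
      seen
      (let [next-nodes (->> frontier
                            (mapcat successors)
                            (remove seen)
                            distinct)]
        (recur (into seen next-nodes) (vec next-nodes))))))

(defn reachable-edges
  "Returns the set of [n1 n2] edges among nodes reachable from start."
  [successors start]
  (set (for [n1 (reachable-nodes successors start)
             n2 (successors n1)]
         [n1 n2])))

;; An implicit binary tree: n points to 2n and 2n+1,
;; limited to values up to 7 so the traversal terminates.
(defn binary-tree-successors [n]
  (filter #(<= % 7) [(* 2 n) (inc (* 2 n))]))

(reachable-nodes binary-tree-successors 1)
;; => #{1 2 3 4 5 6 7}
```

Because the graph is never materialised up front, the same two functions work for any successors function, including a plain map such as `{:a [:b] :b [:a]}`.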
Loom's Graph API
• Uses Clojure protocols (clojure.org/protocols)
  o specification only, no implementation
  o single type can implement multiple protocols
  o interfaces: design-time choice of the type author; protocols: can be added to a type at runtime
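The contrast in the last bullet is easiest to see with a type the protocol author does not own. Below is a minimal sketch in plain Clojure. The `Neighbors` protocol, its `neighbors-of` method and the `EdgeList` record are made-up names for illustration and are not part of Loom.

```clojure
;; Hypothetical protocol, for illustration only.
(defprotocol Neighbors
  (neighbors-of [g node] "Return the nodes adjacent to node in g"))

;; Maps existed long before this protocol did, yet the protocol
;; can still be extended to them after the fact.
(extend-protocol Neighbors
  clojure.lang.IPersistentMap
  (neighbors-of [g node] (get g node #{})))

;; A brand-new type can implement the same protocol where it is defined.
(defrecord EdgeList [edges]
  Neighbors
  (neighbors-of [_ node]
    (set (for [[n1 n2] edges :when (= n1 node)] n2))))

(neighbors-of {:a #{:b :c}} :a)                   ;; => #{:b :c}
(neighbors-of (->EdgeList [[:a :b] [:a :c]]) :a)  ;; => #{:b :c}
```

With a Java interface, the map case would need a wrapper class, because the interface list of an existing class is fixed when that class is written.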
Loom's Graph API
(defprotocol Graph
  (add-nodes* [g nodes] "Add nodes to graph g. See add-nodes")
  (add-edges* [g edges] "Add edges to graph g. See add-edges")
  (remove-nodes* [g nodes] "Remove nodes from graph g. See remove-nodes")
  (remove-edges* [g edges] "Removes edges from graph g. See remove-edges")
  (remove-all [g] "Removes all nodes and edges from graph g")
  (nodes [g] "Return a collection of the nodes in graph g")
  (edges [g] "Edges in g. May return each edge twice in an undirected graph")
  (has-node? [g node] "Return true when node is in g")
  (has-edge? [g n1 n2] "Return true when edge [n1 n2] is in g")
  (successors [g] [g node]
    "Return direct successors of node, or (partial successors g)")
  (out-degree [g node] "Return the number of direct successors of node"))
Loom's Graph API
(defprotocol Digraph
  (predecessors [g] [g node]
    "Return direct predecessors of node, or (partial predecessors g)")
  (in-degree [g node] "Return the number of direct predecessors of node")
  (transpose [g] "Return a graph with all edges reversed"))
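To make the relationship between successors, predecessors and transpose concrete, here is a small sketch on a plain adjacency map rather than a Loom graph. `transpose-adjacency` is a hypothetical helper, not Loom's implementation, and the `{node #{successor ...}}` representation is an assumption made for this example.

```clojure
;; Hypothetical helper: reverse every edge of a digraph stored as
;; {node #{successor ...}}.
(defn transpose-adjacency
  [adj]
  (reduce (fn [result [n1 n2]]
            (update result n2 (fnil conj #{}) n1))
          ;; start with every node present, so nodes that end up
          ;; with no outgoing edges are not lost
          (zipmap (keys adj) (repeat #{}))
          (for [[n1 succs] adj, n2 succs] [n1 n2])))

(def g {:a #{:b :c}, :b #{:c}, :c #{}})

(transpose-adjacency g)
;; => {:a #{}, :b #{:a}, :c #{:a :b}}
```

The successors of a node in the transposed map are its predecessors in `g`, so the in-degree of `:c` in `g` is the size of `(get (transpose-adjacency g) :c)`.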
Loom's Graph API
(defprotocol WeightedGraph
  (weight [g] [g n1 n2] "Return weight of edge [n1 n2] or (partial weight g)"))
Overview
• Loom's Graph API
• Graph Algorithms in Loom
• Titanium Loom
• SSA Loom
Graph Algorithms in Loom
• DFS/BFS (+ bidirectional)
• Topological Sort
• Single Source Shortest Path (Dijkstra, Bellman-Ford)
• Strongly Connected Components (Kosaraju)
• Density (edges/nodes)
• Loner nodes
• 2 coloring
• Max-Flow (Edmonds-Karp)
• alg-generic requires only successors + start (where appropriate)
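The last bullet means the generic algorithms never need a full graph object. As a rough illustration, the sketch below finds a fewest-edges path by breadth-first search using only a successors function and a start node. `bfs-path` is a hypothetical name and this is not the code in Loom's alg-generic namespace. It assumes nodes are never `nil` or `false`.

```clojure
;; Hypothetical sketch, not Loom's alg-generic code: breadth-first
;; search for a path that needs only a successors function.
(defn bfs-path
  "Returns a path with the fewest edges from start to end as a vector,
  or nil if end is not reachable."
  [successors start end]
  (loop [queue (conj clojure.lang.PersistentQueue/EMPTY start)
         parents {start nil}]
    (when-let [node (peek queue)]
      (if (= node end)
        ;; walk the parent pointers back to start
        (vec (reverse (take-while some? (iterate parents end))))
        (let [unseen (->> (successors node)
                          (remove #(contains? parents %))
                          distinct)]
          (recur (into (pop queue) unseen)
                 (into parents (map (fn [n] [n node]) unseen))))))))

;; Works on an explicit adjacency map...
(bfs-path {:a [:b :c], :b [:d], :c [:d], :d [:e], :e []} :a :e)
;; => [:a :b :d :e]

;; ...and on an implicit, unbounded graph over the integers.
(bfs-path (fn [n] [(inc n) (* 2 n)]) 1 10)
;; => [1 2 4 5 10]
```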
Graph Algorithms: Bellman-Ford
[Figure: example weighted directed graph with nodes A, B, C, D, E and edge weights 3, 4, 5, 2, -8]
Graph Algorithms: Bellman-Ford
CLRS Introduction to Algorithms
Graph Algorithms: Bellman-Ford
(defn- init-estimates
  "Initializes path cost estimates and paths from source to all vertices, for
  Bellman-Ford algorithm"
  [graph start]
  (let [nodes (disj (nodes graph) start)
        path-costs {start 0}
        paths {start nil}
        infinities (repeat Double/POSITIVE_INFINITY)
        nils (repeat nil)
        init-costs (interleave nodes infinities)
        init-paths (interleave nodes nils)]
    [(apply assoc path-costs init-costs)
     (apply assoc paths init-paths)]))
Graph Algorithms: Bellman-Ford
(defn- can-relax-edge?
  "Test for whether we can improve the shortest path to v found so far by
  going through u."
  [[u v :as edge] weight costs]
  (let [vd (get costs v)
        ud (get costs u)
        sum (+ ud weight)]
    (> vd sum)))
Graph Algorithms: Bellman-Ford
(defn- relax-edge
  "If there's a shorter path from s to v via u, update our map of
  estimated path costs and map of paths from source to vertex v"
  [[u v :as edge] weight [costs paths :as estimates]]
  (let [ud (get costs u)
        sum (+ ud weight)]
    (if (can-relax-edge? edge weight costs)
      [(assoc costs v sum) (assoc paths v u)]
      estimates)))
Graph Algorithms: Bellman-Ford
(defn- relax-edges
  "Performs edge relaxation on all edges in weighted directed graph"
  [g start estimates]
  (->> (edges g)
       (reduce (fn [estimates [u v :as edge]]
                 (relax-edge edge (wt g u v) estimates))
               estimates)))
Graph Algorithms: Bellman-Ford
(defn bellman-ford
  "Given a weighted, directed graph G = (V, E) with source start,
  the Bellman-Ford algorithm produces map of single source shortest paths and
  their costs if no negative-weight cycle that is reachable from the source
  exists, and false otherwise, indicating that no solution exists."
  [g start]
  (let [initial-estimates (init-estimates g start)
        ;relax-edges is calculated for all edges V-1 times
        [costs paths] (reduce (fn [estimates _]
                                (relax-edges g start estimates))
                              initial-estimates
                              (-> g nodes count dec range))
        edges (edges g)]
    (if (some (fn [[u v :as edge]]
                (can-relax-edge? edge (wt g u v) costs))
              edges)
      false
      [costs
       (->> (keys paths)
            ;remove vertices that are unreachable from source
            (remove #(= Double/POSITIVE_INFINITY (get costs %)))
            (reduce
              (fn [final-paths v]
                (assoc final-paths v
                       ; follows the parent pointers
                       ; to construct path from source to node v
                       (loop [node v
                              path ()]
                         (if node
                           (recur (get paths node) (cons node path))
                           path))))
              {}))])))
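The Loom code above depends on the graph protocols, so it cannot be pasted into a bare REPL. The sketch below shows the same three phases (initialise, relax every edge V-1 times, check once more for a negative cycle) on plain data. It is an illustration, not Loom's code: `bellman-ford-costs` is a hypothetical name, the graph is assumed to be a map from `[u v]` edges to weights, only costs are returned (no paths), and the example edge layout is made up rather than taken from the slide's figure.

```clojure
;; A self-contained sketch on plain data, not Loom's implementation.
(defn bellman-ford-costs
  "Returns a map of node -> cost of the shortest path from start,
  or false if a negative-weight cycle is reachable from start."
  [edge-weights start]
  (let [nodes (conj (set (apply concat (keys edge-weights))) start)
        init (assoc (zipmap nodes (repeat Double/POSITIVE_INFINITY))
                    start 0)
        ;; relax one edge: take the route through u if it is cheaper
        relax (fn [costs [[u v] w]]
                (let [candidate (+ (costs u) w)]
                  (if (< candidate (costs v))
                    (assoc costs v candidate)
                    costs)))
        relax-all (fn [costs] (reduce relax costs edge-weights))
        ;; |V| - 1 rounds of relaxing every edge
        costs (nth (iterate relax-all init) (dec (count nodes)))]
    ;; one more round: any further improvement means a negative cycle
    (if (= costs (relax-all costs))
      costs
      false)))

;; Made-up example graph with one negative edge and no cycles.
(def example-graph
  {[:a :b] 3, [:b :c] 4, [:a :d] 5, [:d :e] 2, [:e :b] -8})

(bellman-ford-costs example-graph :a)
;; => {:a 0, :b -1, :c 3, :d 5, :e 7}

(bellman-ford-costs {[:a :b] 1, [:b :a] -2} :a)
;; => false
```

Unreachable nodes keep the cost `Double/POSITIVE_INFINITY`, which matches how the Loom version recognises them before building paths.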
Overview
• Loom's Graph API
• Graph Algorithms
• Titanium Loom
• SSA Loom
Titanium Loom
• Titanium by Clojurewerkz (titanium.clojurewerkz.org)
• Clojure graph library built on top of Aurelius Titan (thinkaurelius.github.com/titan)
• Various storage backends: Cassandra, HBase, BerkeleyDB Java Edition
• No graph visualization
Titanium Loom
(let [in-mem-graph (tg/open {"storage.backend" "inmemory"})]
  (tg/transact!
    (let [a (nodes/create! {:name "Node A"})
          b (nodes/create! {:name "Node B"})
          c (nodes/create! {:name "Node C"})
          e1 (edges/connect! a "edge A->B" b)
          e2 (edges/connect! b "edge B->C" c)
          e3 (edges/connect! c "edge C->A" a)
          graph (titanium->loom in-mem-graph)]
      (view graph))))
Titanium Loom
(defn titanium->loom
  "Converts titanium graph into Loom representation"
  ([titanium-graph & {:keys [node-fn edge-fn weight-fn]
                      :or {node-fn (nodes/get-all-vertices)
                           edge-fn (map (juxt edges/tail-vertex
                                              edges/head-vertex)
                                        (edges/get-all-edges))
                           weight-fn (constantly 1)}}]
   (let [nodes-set (set node-fn)
         edges-set (set edge-fn)]
     (reify
       Graph
       (nodes [_] nodes-set)
       (edges [_] edges-set)
       (has-node? [g node] (contains? (nodes g) node))
       (has-edge? [g n1 n2] (contains? (edges g) [n1 n2]))
       (successors [g] (partial successors g))
       (successors [g node]
         (filter (nodes g)
                 (seq (nodes/connected-out-vertices node))))
       (out-degree [g node] (count (successors g node)))
       Digraph
       (predecessors [g] (partial predecessors g))
       (predecessors [g node]
         (filter (nodes g)
                 (seq (nodes/connected-in-vertices node))))
       (in-degree [g node] (count (predecessors g node)))
       WeightedGraph
       (weight [g] (partial weight g))
       (weight [g n1 n2] (weight-fn n1 n2))))))
Overview
• Loom's Graph API
• Graph Algorithms
• Titanium Loom
• SSA Loom
SSA Loom
• Static Single Assignment (SSA) form produced by core.async
• Generated by parse-to-state-machine function
SSA Loom
(parse-to-state-machine
  '[(if (> (+ x 1 2 y) 0)
      (+ x 1)
      (+ x 2))])
[inst_4938
 {:current-block 76,
  :start-block 73,
  :block-catches {76 nil, 75 nil, 74 nil, 73 nil},
  :blocks
  {76 [{:value :clojure.core.async.impl.ioc-macros/value, :id inst_4937}
       {:value inst_4937, :id inst_4938}],
   75 [{:refs [clojure.core/+ x 2], :id inst_4935}
       {:value inst_4935, :block 76, :id inst_4936}],
   74 [{:refs [clojure.core/+ x 1], :id inst_4933}
       {:value inst_4933, :block 76, :id inst_4934}],
   73 [{:refs [clojure.core/+ x 1 2 y], :id inst_4930}
       {:refs [clojure.core/> inst_4930 0], :id inst_4931}
       {:test inst_4931, :then-block 74, :else-block 75, :id inst_4932}]}}]
SSA Loom
(def ssa
  (->> (parse-to-state-machine
         '[(if (> (+ x 1 2 y) 0)
             (+ x 1)
             (+ x 2))])
       second
       :blocks))
SSA Loom
{76 [{:value :clojure.core.async.impl.ioc-macros/value, :id inst_4937}
     {:value inst_4937, :id inst_4938}],
 75 [{:refs [clojure.core/+ x 2], :id inst_4935}
     {:value inst_4935, :block 76, :id inst_4936}],
 74 [{:refs [clojure.core/+ x 1], :id inst_4933}
     {:value inst_4933, :block 76, :id inst_4934}],
 73 [{:refs [clojure.core/+ x 1 2 y], :id inst_4930}
     {:refs [clojure.core/> inst_4930 0], :id inst_4931}
     {:test inst_4931, :then-block 74, :else-block 75, :id inst_4932}]}
SSA Loom (view (ssa->loom ssa ssa-nodes-fn ssa-edges-fn))
SSA Loom
(defn ssa->loom
  "Converts the SSA form generated by core.async into Loom representation."
  ([ssa node-fn edge-fn]
   (let [nodes (delay (node-fn ssa))
         edges (delay (edge-fn ssa))]
     {:graph (reify
               Graph
               (nodes [g] @nodes)
               (edges [g] @edges)
               (has-node? [g node] (contains? @nodes node))
               (has-edge? [g n1 n2] (contains? @edges [n1 n2]))
               (successors [g] (partial successors g))
               (successors [g node]
                 (->> @edges
                      (filter (fn [[n1 n2]] (= n1 node)))
                      (map second)))
               (out-degree [g node] (count (successors g node)))
               Digraph
               (predecessors [g] (partial predecessors g))
               (predecessors [g node]
                 (->> @edges
                      (filter (fn [[n1 n2]] (= n2 node)))
                      (map first)))
               (in-degree [g node] (count (predecessors g node))))
      :data ssa})))
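The slides do not show `ssa-nodes-fn` or `ssa-edges-fn`. One plausible way to write them is sketched below, assuming the block map has the shape printed earlier: block id to a vector of instruction maps, with jumps recorded under `:block`, `:then-block` and `:else-block`. The `example-*` names are hypothetical and the real functions may well differ.

```clojure
;; The :blocks map from the slide, quoted so it can be evaluated
;; without core.async on the classpath.
(def example-blocks
  '{76 [{:value :clojure.core.async.impl.ioc-macros/value, :id inst_4937}
        {:value inst_4937, :id inst_4938}]
    75 [{:refs [clojure.core/+ x 2], :id inst_4935}
        {:value inst_4935, :block 76, :id inst_4936}]
    74 [{:refs [clojure.core/+ x 1], :id inst_4933}
        {:value inst_4933, :block 76, :id inst_4934}]
    73 [{:refs [clojure.core/+ x 1 2 y], :id inst_4930}
        {:refs [clojure.core/> inst_4930 0], :id inst_4931}
        {:test inst_4931, :then-block 74, :else-block 75, :id inst_4932}]})

;; Hypothetical stand-in for ssa-nodes-fn.
(defn example-nodes-fn
  "Every basic block id is a node."
  [blocks]
  (set (keys blocks)))

;; Hypothetical stand-in for ssa-edges-fn.
(defn example-edges-fn
  "An edge [from to] for every jump target mentioned in a block."
  [blocks]
  (set (for [[from instrs] blocks
             instr instrs
             k [:block :then-block :else-block]
             :let [to (get instr k)]
             :when to]
         [from to])))

(example-nodes-fn example-blocks)
;; => #{73 74 75 76}

(example-edges-fn example-blocks)
;; => #{[73 74] [73 75] [74 76] [75 76]}
```

Both functions return sets because `ssa->loom` calls `contains?` on their results, which only tests membership correctly on sets and maps.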
SSA Loom: Dataflow Analysis
• For each basic block, solve system of equations until reaching fixed point:
• Use worklist approach
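The equations from this slide did not survive in the transcript. Judging from the `dataflow-analysis` function on the following slides, which is parameterised by a `join` and a `transfer` function, they presumably have the usual forward-dataflow shape. The reconstruction below is an assumption based on that code, not a copy of the slide.

```latex
\begin{aligned}
\mathrm{in}[n]  &= \operatorname{join}\bigl(\{\, \mathrm{out}[p] \mid p \in \operatorname{predecessors}(n) \,\}\bigr) \\
\mathrm{out}[n] &= \operatorname{transfer}\bigl(n,\ \mathrm{in}[n]\bigr)
\end{aligned}
```

A fixed point is reached when re-evaluating both equations for every node changes no out-value.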
SSA Loom: Dataflow Analysis
(defn dataflow-analysis
  "Performs dataflow analysis. Nodes have value nil initially."
  [& {:keys [start graph join transfer]}]
  (let [start (cond
                (set? start) start
                (coll? start) (set start)
                :else #{start})]
    (loop [out-values {}
           [node & worklist] (into clojure.lang.PersistentQueue/EMPTY start)]
      (let [in-value (join (mapv out-values (predecessors graph node)))
            out (transfer node in-value)
            update? (not= out (get out-values node))
            out-values (if update?
                         (assoc out-values node out)
                         out-values)
            worklist (if update?
                       (into worklist (successors graph node))
                       worklist)]
        (if (seq worklist)
          (recur out-values worklist)
          out-values)))))
SSA Loom: Global Availability
(defn global-cse
  [ssa]
  (let [{graph :graph node-data :data} (ssa->loom (:blocks ssa)
                                                  ssa-nodes-fn
                                                  ssa-edges-fn)
        start (:start-block ssa)]
    (letfn [(pure? [instr] (contains? instr :refs))
            (global-cse-join [values]
              (if (seq values)
                (apply set/intersection values)
                #{}))
            (global-cse-transfer [node in-value]
              (into in-value
                    (map :refs (filter pure? (node-data node)))))]
      (dataflow-analysis :start start
                         :graph graph
                         :join global-cse-join
                         :transfer global-cse-transfer))))
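To try the worklist idea without Loom or core.async, here is a runnable miniature on plain maps. It is a sketch under stated assumptions rather than the talk's code: `worklist-fixed-point`, `succs`, `preds` and `computed` are hypothetical names, the control-flow graph is hard-coded to the diamond shape of the `if` example, and predecessors that have not been visited yet are skipped, which amounts to treating them as "everything is available".

```clojure
(require '[clojure.set :as set])

;; Control-flow graph of the `if` example: block 73 branches to 74
;; and 75, which both jump to 76.
(def succs {73 [74 75], 74 [76], 75 [76], 76 []})
(def preds {73 [], 74 [73], 75 [73], 76 [74 75]})

;; Expressions computed in each block (a simplified stand-in for
;; the :refs entries).
(def computed
  '{73 #{(+ x 1 2 y) (> inst_4930 0)}
    74 #{(+ x 1)}
    75 #{(+ x 2)}
    76 #{}})

(defn worklist-fixed-point
  "Hypothetical helper: re-evaluates nodes until no out-value changes."
  [start join transfer]
  (loop [out-values {}
         worklist (conj clojure.lang.PersistentQueue/EMPTY start)]
    (if-let [node (peek worklist)]
      (let [;; only predecessors that already have a value take part
            in-value (join (keep out-values (preds node)))
            out (transfer node in-value)]
        (if (= out (get out-values node ::none))
          (recur out-values (pop worklist))
          (recur (assoc out-values node out)
                 (into (pop worklist) (succs node)))))
      out-values)))

;; Available expressions: join is set intersection over the
;; predecessors, transfer adds what the block itself computes.
(def available
  (worklist-fixed-point
    73
    (fn [values] (if (seq values) (apply set/intersection values) #{}))
    (fn [node in-value] (set/union in-value (computed node)))))

(available 76)
;; => #{(+ x 1 2 y) (> inst_4930 0)}
```

Only the two expressions from block 73 are available on leaving block 76, because `(+ x 1)` and `(+ x 2)` are each computed on just one of the two branches.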
• Reaching definitions
• Liveness analysis (dead code elimination)
• Available expressions
• Constant propagation
• Other Applications:
  o Erdős number
  o Spread of information in systems (e.g. taint)
My Experience
• Intuitive way to implement algorithms functionally
• Some mental overhead of transforming data structures
Open Questions
• How general should a graph API be?
• How feature-rich should a graph API be?