graph neural networks - chrome.ws.dei.polimi.it

56
Graph neural networks February 2018 Visin Francesco

Upload: others

Post on 01-Oct-2021

9 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Graph neural networks - chrome.ws.dei.polimi.it

Graph neural networksFebruary 2018

Visin Francesco

Page 2: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Who I am

Page 3: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Outline

● Motivation and examples● Graph nets

○ (Semi)-formal definition○ Interaction network○ Relation network○ Gated graph sequence neural network○ Attention is all you need

● Implementation example● Conclusions

Page 4: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Motivation● The modern machine learning toolkit is

well-suited to data that is:○ Fixed-length (MLPs)○ Sequential (RNNs) ○ Spatial (CNNs)

Page 5: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Motivation● The modern machine learning toolkit is

well-suited to data that is:○ Fixed-length (MLPs)○ Sequential (RNNs) ○ Spatial (ConvNets)

● But there is not much to support graph structured data.

Image Credit - Diane Harris Cline, Vossman

Page 6: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Motivation● The world is complex, yet very

structured

Page 7: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Motivation● The world is complex, yet very

structured● Many things and problems are

fundamentally relational

Page 8: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Motivation● The world is complex, yet very

structured● Many things and problems are

fundamentally relational● Many problems can be naturally

represented as graphs.

Page 9: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Decoding structure

Reasoning about structure

Inferring structure from unstructured data

Tasks on graphs categorization

Inferring structure

Reasoning

Decoding

Page 10: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Image Credit - Leibe et Al.

Categorization examples

Parsing language Visual scene recognition

DecodingInferring Reasoning

Page 11: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Physical state prediction

Image Credit - Battaglia et al.

Categorization examplesDecodingInferring Reasoning

Molecule structure generation

Image Credit - Li et al.

Page 12: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Categorization examplesDecodingInferring Reasoning

Drug toxicity prediction Finding the treasure

Image Credit - Andrew Doane and lastspark, from Noun ProjectImage Credit - Greg Rodgers

Page 13: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Graphs & Computer Science/Machine Learning● Graph algorithms

○ Dijkstra○ Bellman-Ford

Image Credit - Artyom Kalinin

Page 14: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Graphs & Computer Science/Machine Learning● Graph algorithms

○ Dijkstra○ Bellman-Ford

● Hand-crafted features, graph kernels, etc ..○ Measure similarity between

graphs

Image Credit -Ghosh et al. Image Credit - Vishwanathan et al.

Page 15: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Graphs & Computer Science/Machine Learning● Graph algorithms

○ Dijkstra○ Bellman-Ford

● Hand-crafted features, graph kernels, etc ..○ Measure similarity between

graphs● Graphical models

○ Restricted Boltzmann Machines○ Deep Belief Networks○ Deep Boltzmann Machines

Image Credit - Geoffrey Hinton

Page 16: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Graphs & Computer Science/Machine Learning● Graph algorithms

○ Dijkstra○ Bellman-Ford

● Hand-crafted features, graph kernels, etc ..

● Graphical models○ Restricted Boltzmann Machines○ Deep Belief Networks○ Deep Boltzmann Machines

● Graph neural nets!

Image Credit - Back to the future!

Page 17: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Right! ….but what is a Graph net??

Page 18: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

● Growing interest:○ Graph Neural Networks (Scarselli et al 2007; 2008)○ Pointer Networks (Vinyals et al 2015)○ Graph Convolutional Networks (Bruna et al 2013; Duvenaud et al 2015; Henaff et

al 2015; Kipf & Welling 2016; Schlichtkrull et al 2017; Defferrard et al 2017)○ Gated Graph Neural Networks (Li et al 2015)○ Interaction Networks (Battaglia et al 2016; Watters et al 2017;Raposo et al 2017; )○ Message Passing Networks (Gilmer et al. 2017)○ Deep Generative Models of Graphs (Li et al. 2018)

Graph netsLiterature (partial)

Page 19: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

● Graph nets (GNs) are a class of models that: ○ Use graphs as

■ inputs and/or ■ outputs and/or ■ latent representation

Graph netsHigh level

Page 20: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

● Graph nets (GNs) are a class of models that: ○ Use graphs as

■ inputs and/or ■ outputs and/or ■ latent representation

○ Manipulate graph-structured representations

Graph netsHigh level

Page 21: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

● Graph nets (GNs) are a class of models that: ○ Use graphs as

■ inputs and/or ■ outputs and/or ■ latent representation

○ Manipulate graph-structured representations

○ Share model components across entities and relations

Graph netsHigh level

Page 22: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Graph netsCore idea

Task: Is there a golden object in the scene?

Page 23: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Graph netsCore idea

Task: Is there a golden object in the scene?

MLP: Inefficient learning; needs to see object of interest in all possible positions

Page 24: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Graph netsCore idea

Task: Is there a golden object in the scene?

MLP: Inefficient learning; needs to see object of interest in all possible positions

ConvNet: efficient; applies same function at each position (e.g. is the golden object at position x?) and then pools over the outcomes

Page 25: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Graph netsCore idea

Task: Are there two red objects close to each other?

Page 26: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Graph netsCore idea

Task: Are there two red objects close to each other?

ConvNet: Use a kernel large enough (e.g. 2x2 for this) and check if one has 2 red object in its receptive field

Observed patches by the 2x2 kernel are:

Note: within the patch, convnet needs to experience all permutations

Page 27: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Graph netsCore idea

Edges

Node states

Graph Nets: Think of your observation as a graph

Page 28: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Graph netsCore idea

The number of edges grows quadratically !!!

Graph Nets: Think of your observation as a graph

Page 29: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Graph netsCore idea

Represent the node sparsely, by detecting the

objects of interest

Still O(n2) edges!

Graph Nets: Think of your observation as a graph

Page 30: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Graph netsCore idea Use a sparse representation

for the edges, if you know which objects are allowed to

interact

Graph Nets: Think of your observation as a graph

Page 31: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Graph netsCore idea

Reason in terms of pairs !

Page 32: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Graph netsCore idea

● Same function re-used for every edge to compute its effect

● Same function re-used for every node to compute its new state given sum of all incoming effects and its state

● Invariant to the position of objects !

● Invariant to the distance between objects in topological spaceReason in terms of pairs !

Page 33: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

● Edge effects:○ edge type○ <related node states>○ global state

Graph netsStates update

Page 34: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

● Edge effects: ○ edge type○ <related node states>○ global state

● Node update:○ node state○ summed effects on incoming edges ○ global state

Graph netsStates update

Sum is commutative and associative!

Page 35: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

● Edge effects: ○ edge type○ <related node states>○ global state

● Node update:○ node state○ summed effects on incoming edges ○ global state

● Global state update:○ <node states>○ global state

Graph netsGlobal state update

Page 36: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Graph netsComponents

● G = <O, R> Graph

● O = {o1, o2, .., om} Collection of objects

○ o_i = {oi1, oi

2, …, oin} Object i with n features

● R = {r1, r2, .., rk} Collection of relations between objects

○ rk = <oi, oj, ak> Relation k between oi and oj with attribute ak

● E = {ek} Collection of effects of relations

● X = {xl} Collection of external effects

Page 37: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Graph netsComponents

out = A(g( O( (X, R(m(G))))))Graph

Marshalling functionRelation model: predict the effectsExternal effects

Aggregation function

Object model

Aggregation function

Abstraction model

Page 38: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Interaction networks

out = A(g( O( (X, R(m(G))))))Graph

Marshalling functionRelation model: predict the effectsExternal effects

Objects

Aggregation function

Object model

Aggregation function

Abstraction model

MLP

Fixed! Matrix prod + concat

MLP columnwise

Fixed! Matrix prod

MLP columnwise

Fixed! Matrix prod

Page 39: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

● Reason about interactions between objects● Physical reasoning

○ n-body domain: galaxy of objects that interact○ bouncing balls: balls in a box bounce on walls and can collide○ string comprised of masses connected by springs

● Learnable physics engine simulation!● Works/transfers to variable number of objects!● Graph to graph

out = A(g( O( (X, R(m(G))))))

Interaction networks

Page 40: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

out = A(g( O( (X, R(m(G))))))

Interaction networks

Image Credit - Battaglia et al.

Page 41: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Interaction networks

Page 42: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Interaction networks

Page 43: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

out = A(g( O( (X, R(m(G))))))

Relation networks

Implicit (fixed: tuples)

Not modeled

Page 44: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

out = A(g( O( (X, R(m(G))))))

Relation networks

MLP columnwise

Any aggregation function commutative and associative

MLP

Page 45: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

● Reason about objects and their relations○ Classification of scenes from graph representation○ Classification of scenes from raw input○ Exploit the relations to perform one-shot relation learning

on a new scene● Graph to vectors

out = A(g( O( (X, R(m(G))))))

Relation networks

Page 46: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

out = A(g( O( (X, R(m(G))))))

Relation networks on CLEVR

Page 47: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

out = A(g( O( (X, R(m(G))))))

Gated graph sequence neural networks

Connectivity function

LinearSum over nodes + soft self attention

GRU

Page 48: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

● Reason about verification of computer programs● Attempt to prove code properties, such as memory safety● BABI task (language comprehension)● Per-node predictions● Gated (GRU style) updates to the nodes● Internal propagation steps while generating outputs sequentially● First follow up of Scarselli 2008

out = A(g( O( (X, R(m(G))))))

Gated graph sequence neural networks

Page 49: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

out = A(g( O( (X, R(m(G))))))

Attention is all you need

Different input tokens (nodes)

Masking (optional)

Linear (resulting in key, query & value)

Attention (including accumulation); head Concat

Page 50: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

● Consider each time-step (input or hidden state) as a node● Reason about the inner relations in its hidden state (over time)

○ Exploit self attention as a form of recursive memory○ Multiple attention heads in parallel (eq. to multiple edges per same nodes in IN)

● Not explicitly introduced as a graph network approach● Equivalent to Graph Convolutional Net, or Graph net with a fully connected

graph but with attention on the edges● Graph to vector

out = A(g( O( (X, R(m(G))))))

Attention is all you need

Page 51: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Cool! Where can I get one??

Page 52: Graph neural networks - chrome.ws.dei.polimi.it

Graph Propagation Core# node states and edgesstates = tf.placeholder(tf.float32)edges = tf.placeholder(tf.int32)

# use sonnet or your favorite toolkit to build# feedforward networkseffect_net = snt.Sequential(...)node_net = snt.Sequential(...)

# compute effects / messages along each edge, edge# features can be fed into the model too if availablestates_from = tf.gather(states, edges[:, 0])states_to = tf.gather(states, edges[:, 1])concat_states = tf.concat([states_from, states_to], axis=-1)effects = effect_net(concat_states)

# aggregate incoming effects by a sum, or averageaggregated_effects = tf.unsorted_segment_sum(effects, edges[:, 1], tf.shape(states)[0])

# update the node statesstates = node_net(tf.concat([aggregated_effects, states], axis=-1))

0

1

2

3

states0 1

1 2

3 1

edges

0123

(0, 1)(1, 2)(3, 1)

effects sum effects

0123

new states

from to

Page 53: Graph neural networks - chrome.ws.dei.polimi.it

Output Modules# graph-level outputgraph_vec = tf.reduce_sum(states, axis=0)graph_output = graph_level_network(graph_vec)... # feed into your favorite loss

# node-level outputnode_outputs = node_level_net(states)... # feed into your favorite loss

# edge-level outputstates_from = tf.gather(states, edges[:, 0])states_to = tf.gather(states, edges[:, 1])concat_states = tf.concat([states_from, states_to], axis=-1)edge_outputs = edge_level_net(concat_states)... # feed into your favorite loss

Graph-level output

Node-level output

Edge-level output

Page 54: Graph neural networks - chrome.ws.dei.polimi.it

Output Modules# relational network style graph-level output

# concat_states <- paired states for all (i,j) pairseffects = effect_net(concat_states)graph_vec = tf.reduce_sum(effects, axis=0)graph_output = graph_level_network(graph_vec)... # feed into your favorite loss

Relation network style graph-level output

Page 55: Graph neural networks - chrome.ws.dei.polimi.it

Graph Nets — Francesco Visin

Conclusions

● Graph nets are a powerful model to reason on graph related structures● There are three main categories of tasks in this domain:

○ vector to graph○ graph to graph○ graph to vector

● Several variants of graph nets have been proposed in the literature● They are easy to implement!● Sky is the limit: surprise us!!!

Page 56: Graph neural networks - chrome.ws.dei.polimi.it

THANK YOUCredits

Razvan Pascanu, Victor Bapst