Generalizing Convolutional Neural Networks to Graph-structured Data
Ben Walker
Department of Mathematics, UNC-Chapel Hill
5/4/2018
Overview
• Relational structure in data and how to approach it
• Defferrard, Bresson, Vandergheynst 2016: Fast spectral filter method
• Kipf, Welling 2017: A first-order simplification for improved performance
• Discussion
Unstructured Data
• The order is irrelevant to processing - there is no prescribed relationship between the variables
• Use a fully-connected network to learn the relationships
Original order:
Name     Alice  Bob
Age      14     65
Gender   F      M
Smokes?  N      Y

Same data, reordered:
Gender   M      F
Smokes?  Y      N
Age      65     14
Name     Bob    Alice
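To make the order-irrelevance concrete, here is a minimal sketch, assuming a single dense layer with random weights (the sizes and variable names are illustrative, not from the slides). Permuting the input variables and permuting the weight columns the same way leaves the output unchanged, so a fully-connected network has no preferred ordering of its inputs.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 4 input variables, 3 hidden units.
x = rng.normal(size=4)          # one record with 4 numeric features
W = rng.normal(size=(3, 4))     # dense layer weights (illustrative)

perm = np.array([2, 3, 1, 0])   # an arbitrary reordering of the variables

y_original = W @ x
y_permuted = W[:, perm] @ x[perm]   # reorder inputs and weight columns together

# True: the layer output does not depend on the variable order.
same_output = np.allclose(y_original, y_permuted)
```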
Grid-structured Data
• Reordered kitten picture is unintelligible
• Use a convolutional neural network to reduce parameters
[Figure panels: A kitten; Google Vision results; Same kitten, different order]
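As a rough illustration of the parameter reduction (the 28x28 image size and 3x3 kernel are assumptions chosen for the arithmetic, not values from the slides), compare a dense layer that maps an image to an output of the same size with a single shared convolution kernel:

```python
# Illustrative sizes only: a 28x28 grayscale image and one 3x3 kernel.
height, width = 28, 28
kernel_size = 3

n_pixels = height * width

# Dense layer from every input pixel to every output pixel (biases ignored).
dense_params = n_pixels * n_pixels

# One convolutional filter reuses the same weights at every position.
conv_params = kernel_size * kernel_size

print(dense_params, conv_params)  # 614656 9
```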
Graph-structured Data
• The relationships between data points are specified separately for each input rather than known a priori
• What can you use here?
[Figure: Graph Convolutional Network (Kipf and Welling 2017)]
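One common way to encode such input-specific relationships is an adjacency matrix. The sketch below assumes a small hypothetical undirected graph with four nodes and unit edge weights; the edge list, feature values, and variable names are illustrative only.

```python
import numpy as np

# Hypothetical undirected graph on 4 nodes: a path 0-1-2-3 plus the edge 1-3.
edges = [(0, 1), (1, 2), (2, 3), (1, 3)]
n_nodes = 4

# Symmetric adjacency (weight) matrix W: W[i, j] = 1 if i and j are connected.
W = np.zeros((n_nodes, n_nodes))
for i, j in edges:
    W[i, j] = 1.0
    W[j, i] = 1.0

# Degree of each node = number of incident edges (row sums of W).
degrees = W.sum(axis=1)

# One feature value per node; the graph says which values are related.
x = np.array([14.0, 65.0, 30.0, 42.0])
```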
Defferrard et al 2016
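• The spectral method allows a filter to be applied robustly to the “neighborhood” of a node.
• This “filtering” that maps x to y is the equivalent of the convolution step in a standard convolutional network, with K parameters to learn.

Graph Laplacian, with D the diagonal degree matrix and W the weighted adjacency matrix:
$$L = D - W$$
Polynomial filter:
$$y = \sum_{k=0}^{K-1} \theta_k L^k x$$
Chebyshev form, using the rescaled Laplacian:
$$y = \sum_{k=0}^{K-1} \theta_k T_k(\tilde{L})\, x, \qquad \tilde{L} = \frac{2}{\lambda_{\max}} L - I_n$$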
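A minimal NumPy sketch of these definitions, assuming a small hypothetical weighted graph and arbitrary coefficients theta (none of these values come from the paper). It builds the Laplacian L = D - W, applies the polynomial filter, and forms the rescaled Laplacian, whose eigenvalues lie in [-1, 1], the interval on which Chebyshev polynomials are usually defined.

```python
import numpy as np

# Hypothetical symmetric weight matrix W for a 4-node graph.
W = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 1.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
])
D = np.diag(W.sum(axis=1))      # diagonal degree matrix
L = D - W                       # combinatorial graph Laplacian

def polynomial_filter(L, x, theta):
    """Compute y = sum_k theta[k] * L^k x without forming L^k explicitly."""
    y = np.zeros_like(x)
    L_power_x = x.copy()        # holds L^k x, starting with k = 0
    for theta_k in theta:
        y = y + theta_k * L_power_x
        L_power_x = L @ L_power_x
    return y

x = np.array([1.0, 0.0, 0.0, 0.0])      # signal: an impulse at node 0
theta = np.array([0.5, -0.2, 0.1])      # K = 3 illustrative coefficients
y = polynomial_filter(L, x, theta)

# Rescale so the spectrum of L_tilde lies in [-1, 1].
lambda_max = np.linalg.eigvalsh(L).max()
L_tilde = (2.0 / lambda_max) * L - np.eye(L.shape[0])
```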
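• Localized: the kth term in the sum includes contributions from up to k hops from the node
• Recursive definition, allowing for efficient computation
• The filter coefficients θ_k are learnable parameters, so machine learning techniques can be applied to this filter
$$y = \sum_{k=0}^{K-1} \theta_k T_k(\tilde{L})\, x$$
$$T_{k+1}(\tilde{L})\, x = 2 \tilde{L}\, T_k(\tilde{L})\, x - T_{k-1}(\tilde{L})\, x$$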
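The recurrence can be sketched as follows, assuming the standard Chebyshev starting values T_0 = I and T_1 equal to the rescaled Laplacian (not shown on the slide) and a hypothetical 5-node path graph. The name `chebyshev_filter` is illustrative, not the authors' code. Each step needs only one matrix-vector product, which is where the efficiency comes from.

```python
import numpy as np

def chebyshev_filter(L_tilde, x, theta):
    """Compute y = sum_{k=0}^{K-1} theta[k] * T_k(L_tilde) x via the recurrence."""
    t_prev = x                      # T_0(L_tilde) x = x
    y = theta[0] * t_prev
    if len(theta) > 1:
        t_curr = L_tilde @ x        # T_1(L_tilde) x = L_tilde x
        y = y + theta[1] * t_curr
        for theta_k in theta[2:]:
            # T_{k+1} x = 2 L_tilde T_k x - T_{k-1} x
            t_next = 2.0 * (L_tilde @ t_curr) - t_prev
            y = y + theta_k * t_next
            t_prev, t_curr = t_curr, t_next
    return y

# Hypothetical path graph 0-1-2-3-4 with unit weights.
n = 5
W = np.zeros((n, n))
for i in range(n - 1):
    W[i, i + 1] = W[i + 1, i] = 1.0
L = np.diag(W.sum(axis=1)) - W
lambda_max = np.linalg.eigvalsh(L).max()
L_tilde = (2.0 / lambda_max) * L - np.eye(n)

x = np.zeros(n)
x[0] = 1.0                              # impulse at node 0
theta = np.array([0.3, 0.5, -0.4])      # K = 3 illustrative coefficients
y = chebyshev_filter(L_tilde, x, theta)

# With K = 3 the filter is a degree-2 polynomial in L, so the impulse
# reaches at most 2 hops: the entries for nodes 3 and 4 stay zero.
```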
Validation
• Chebyshev filter Graph CNN tested on MNIST
• Graph created to represent grid structure
• Comparable performance to classical CNN
• Also validated on 20NEWS text categorization dataset.
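One way to build a graph that represents grid structure, sketched here under the assumption of a simple 4-neighbor grid (the paper's actual graph construction may differ, and `grid_adjacency` is a hypothetical helper), is to connect each pixel to the pixels directly above, below, left, and right of it:

```python
import numpy as np

def grid_adjacency(height, width):
    """Adjacency matrix of a 4-neighbor grid; pixel (r, c) has index r * width + c."""
    n = height * width
    A = np.zeros((n, n))
    for r in range(height):
        for c in range(width):
            i = r * width + c
            if c + 1 < width:           # right neighbor
                A[i, i + 1] = A[i + 1, i] = 1.0
            if r + 1 < height:          # neighbor below
                A[i, i + width] = A[i + width, i] = 1.0
    return A

# MNIST images are 28x28, giving a graph with 784 nodes.
A = grid_adjacency(28, 28)
n_edges = int(A.sum() / 2)      # each undirected edge is counted twice in A
```

Interior pixels end up with four neighbors and corner pixels with two, so the graph carries the same locality information that a standard CNN gets from the image grid.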
Kipf, Welling 2017
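• Aim to improve the approach from Defferrard et al.
• Linearize the previous filter equation
• Simplify and renormalize for improved numerical stability, and generalize to multiple feature maps to get the layer equation

Linearized filter:
$$y = \theta'_0 x - \theta'_1 D^{-1/2} A D^{-1/2} x$$
Renormalized form for multiple feature maps, where $\tilde{A} = A + I$ adds self-loops and $\tilde{D}$ is the degree matrix of $\tilde{A}$:
$$Z = \tilde{D}^{-1/2} \tilde{A} \tilde{D}^{-1/2} X \Theta$$
Layer-wise propagation:
$$X_{k+1} = \sigma(M X_k \Theta_k)$$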
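A minimal sketch of one propagation step, assuming Kipf and Welling's renormalization (self-loops added to A before symmetric normalization), assuming that M on the slide denotes this renormalized adjacency matrix, and assuming σ is ReLU. The graph, feature sizes, and random weights are illustrative only.

```python
import numpy as np

def normalized_adjacency(A):
    """Return D~^{-1/2} (A + I) D~^{-1/2}, the renormalized adjacency matrix."""
    A_tilde = A + np.eye(A.shape[0])            # add self-loops
    d_tilde = A_tilde.sum(axis=1)               # degrees including self-loops
    d_inv_sqrt = 1.0 / np.sqrt(d_tilde)
    return d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]

def gcn_layer(M, X, Theta):
    """One layer: X_next = relu(M X Theta)."""
    return np.maximum(M @ X @ Theta, 0.0)

# Hypothetical 4-node graph, 3 input features per node, 2 output features.
A = np.array([
    [0.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 1.0, 1.0],
    [0.0, 1.0, 0.0, 1.0],
    [0.0, 1.0, 1.0, 0.0],
])
rng = np.random.default_rng(0)
X = rng.normal(size=(4, 3))
Theta = rng.normal(size=(3, 2))     # the learnable parameters of the layer

M = normalized_adjacency(A)
X_next = gcn_layer(M, X, Theta)     # shape (4, 2): one row per node
```

Because M depends only on the graph, it can be computed once and reused in every layer; only Theta changes during training.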
Validation
• Validation datasets
  • Citeseer, Cora, and Pubmed citation networks
  • NELL knowledge graph
[Table: Comparison of classification accuracy (in percent) of different methods (Kipf and Welling 2017)]
Discussion
• Graph-structured data is an interesting new frontier for machine-learning methods
• The Kipf and Welling GCN is very similar to standard neural network formulations
• Because of the linearization, each layer is localized at a distance of 1 (a node's immediate neighbors)
$$X_{k+1} = \sigma(M X_k \Theta_k)$$
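The distance-1 locality can be checked numerically. The sketch below assumes a hypothetical 5-node path graph, takes M to be the renormalized adjacency matrix with self-loops, and uses identity weights with no nonlinearity so that only the graph structure matters. One application of M spreads an impulse to immediate neighbors, and stacking two layers reaches two hops.

```python
import numpy as np

# Hypothetical path graph 0-1-2-3-4.
n = 5
A = np.zeros((n, n))
for i in range(n - 1):
    A[i, i + 1] = A[i + 1, i] = 1.0

# Renormalized adjacency with self-loops (the assumed meaning of M).
A_tilde = A + np.eye(n)
d_inv_sqrt = 1.0 / np.sqrt(A_tilde.sum(axis=1))
M = d_inv_sqrt[:, None] * A_tilde * d_inv_sqrt[None, :]

x = np.zeros((n, 1))
x[0, 0] = 1.0                   # impulse feature at node 0

one_layer = M @ x               # weights set to identity, no nonlinearity
two_layers = M @ one_layer

reach_one = np.flatnonzero(one_layer[:, 0])     # nodes touched after 1 layer
reach_two = np.flatnonzero(two_layers[:, 0])    # nodes touched after 2 layers
```

Here `reach_one` contains nodes 0 and 1, and `reach_two` contains nodes 0, 1, and 2, so the receptive field grows by one hop per layer.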
References
Defferrard, Michaël, Xavier Bresson, and Pierre Vandergheynst. "Convolutional neural networks on graphs with fast localized spectral filtering." Advances in Neural Information Processing Systems. 2016.
Kipf, Thomas N., and Max Welling. "Semi-supervised classification with graph convolutional networks." arXiv preprint arXiv:1609.02907 (2016).