Download - Frameworks - UMIACS
![Page 1: Frameworks - UMIACS](https://reader034.vdocuments.site/reader034/viewer/2022050512/6271f48db27bf005c45f1d1b/html5/thumbnails/1.jpg)
Frameworks
Natural Language Processing: JordanBoyd-GraberUniversity of MarylandINTRODUCTION
Slides adapted from Chris Dyer, Yoav Goldberg, Graham Neubig
Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 1 / 1
![Page 2: Frameworks - UMIACS](https://reader034.vdocuments.site/reader034/viewer/2022050512/6271f48db27bf005c45f1d1b/html5/thumbnails/2.jpg)
Neural Nets and Language
Language
Discrete, structured (graphs, trees)
Neural-Nets
Continuous: poor native support forstructure
Big challenge: writing code that translates between the{discrete-structured, continuous} regimes
Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 2 / 1
![Page 3: Frameworks - UMIACS](https://reader034.vdocuments.site/reader034/viewer/2022050512/6271f48db27bf005c45f1d1b/html5/thumbnails/3.jpg)
Why not do it yourself?
� Hard to compare with exting models
� Obscures difference between model and optimization
� Debugging has to be custom-built
� Hard to tweak model
Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 3 / 1
![Page 4: Frameworks - UMIACS](https://reader034.vdocuments.site/reader034/viewer/2022050512/6271f48db27bf005c45f1d1b/html5/thumbnails/4.jpg)
Outline
� Computation graphs (general)
� Neural Nets in PyTorch
� Full example
Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 4 / 1
![Page 5: Frameworks - UMIACS](https://reader034.vdocuments.site/reader034/viewer/2022050512/6271f48db27bf005c45f1d1b/html5/thumbnails/5.jpg)
Computation Graphs
Expression
~x
Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 5 / 1
![Page 6: Frameworks - UMIACS](https://reader034.vdocuments.site/reader034/viewer/2022050512/6271f48db27bf005c45f1d1b/html5/thumbnails/6.jpg)
Computation Graphs
Expression
~x>
� Edge: function argument / data dependency
� A node with an incoming edge is a function F ≡ f (u ) edge’s tail node
� A node computes its value and the value of its derivative w.r.t eachargument (edge) times a derivative ∂ f
∂ u
Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 5 / 1
![Page 7: Frameworks - UMIACS](https://reader034.vdocuments.site/reader034/viewer/2022050512/6271f48db27bf005c45f1d1b/html5/thumbnails/7.jpg)
Computation Graphs
Expression
~x>A
Functions can be nullary, unary, binary, . . . n-ary. Often they are unary orbinary.
Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 5 / 1
![Page 8: Frameworks - UMIACS](https://reader034.vdocuments.site/reader034/viewer/2022050512/6271f48db27bf005c45f1d1b/html5/thumbnails/8.jpg)
Computation Graphs
Expression
~x>Ax
Computation graphs are (usually) directed and acyclic
Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 5 / 1
![Page 9: Frameworks - UMIACS](https://reader034.vdocuments.site/reader034/viewer/2022050512/6271f48db27bf005c45f1d1b/html5/thumbnails/9.jpg)
Computation Graphs
Expression
~x>Ax
Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 5 / 1
![Page 10: Frameworks - UMIACS](https://reader034.vdocuments.site/reader034/viewer/2022050512/6271f48db27bf005c45f1d1b/html5/thumbnails/10.jpg)
Computation Graphs
Expression
~x>Ax + b · ~x + c
Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 5 / 1
![Page 11: Frameworks - UMIACS](https://reader034.vdocuments.site/reader034/viewer/2022050512/6271f48db27bf005c45f1d1b/html5/thumbnails/11.jpg)
Computation Graphs
Expression
y = ~x>Ax + b · ~x + c
Variable names label nodesNatural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 5 / 1
![Page 12: Frameworks - UMIACS](https://reader034.vdocuments.site/reader034/viewer/2022050512/6271f48db27bf005c45f1d1b/html5/thumbnails/12.jpg)
Algorithms
� Graph construction� Forward propagation� Loop over nodes in topological order� Compute the value of the node given its inputs� Given my inputs, make a prediction (i.e. “error” vs. “target output”)
� Backward propagation� Loop over the nodes in reverse topological order, starting with goal node� Compute derivatives of final goal node value wrt each edge’s tail node� How does the output change with small change to inputs?
Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 6 / 1
![Page 13: Frameworks - UMIACS](https://reader034.vdocuments.site/reader034/viewer/2022050512/6271f48db27bf005c45f1d1b/html5/thumbnails/13.jpg)
Forward Propagation
Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 7 / 1
![Page 14: Frameworks - UMIACS](https://reader034.vdocuments.site/reader034/viewer/2022050512/6271f48db27bf005c45f1d1b/html5/thumbnails/14.jpg)
Forward Propagation
Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 7 / 1
![Page 15: Frameworks - UMIACS](https://reader034.vdocuments.site/reader034/viewer/2022050512/6271f48db27bf005c45f1d1b/html5/thumbnails/15.jpg)
Forward Propagation
Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 7 / 1
![Page 16: Frameworks - UMIACS](https://reader034.vdocuments.site/reader034/viewer/2022050512/6271f48db27bf005c45f1d1b/html5/thumbnails/16.jpg)
Forward Propagation
Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 7 / 1
![Page 17: Frameworks - UMIACS](https://reader034.vdocuments.site/reader034/viewer/2022050512/6271f48db27bf005c45f1d1b/html5/thumbnails/17.jpg)
Forward Propagation
Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 7 / 1
![Page 18: Frameworks - UMIACS](https://reader034.vdocuments.site/reader034/viewer/2022050512/6271f48db27bf005c45f1d1b/html5/thumbnails/18.jpg)
Forward Propagation
Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 7 / 1
![Page 19: Frameworks - UMIACS](https://reader034.vdocuments.site/reader034/viewer/2022050512/6271f48db27bf005c45f1d1b/html5/thumbnails/19.jpg)
Forward Propagation
Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 7 / 1
![Page 20: Frameworks - UMIACS](https://reader034.vdocuments.site/reader034/viewer/2022050512/6271f48db27bf005c45f1d1b/html5/thumbnails/20.jpg)
Forward Propagation
Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 7 / 1
![Page 21: Frameworks - UMIACS](https://reader034.vdocuments.site/reader034/viewer/2022050512/6271f48db27bf005c45f1d1b/html5/thumbnails/21.jpg)
Constructing Graphs
Static declaration
� Define architecture, run datathrough
� PROS: Optimization, hardwaresupport
� CONS: Structured data ugly,graph language
Theano, Tensorflow
Dynamic declaration
� Graph implicit with data
� PROS: Native language,interleave construction/evaluation
� CONS: Slower, computation canbe wasted
Chainer, Dynet, PyTorch
Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 8 / 1
![Page 22: Frameworks - UMIACS](https://reader034.vdocuments.site/reader034/viewer/2022050512/6271f48db27bf005c45f1d1b/html5/thumbnails/22.jpg)
Constructing Graphs
Static declaration
� Define architecture, run datathrough
� PROS: Optimization, hardwaresupport
� CONS: Structured data ugly,graph language
Theano, Tensorflow
Dynamic declaration
� Graph implicit with data
� PROS: Native language,interleave construction/evaluation
� CONS: Slower, computation canbe wasted
Chainer, Dynet, PyTorch
Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 8 / 1
![Page 23: Frameworks - UMIACS](https://reader034.vdocuments.site/reader034/viewer/2022050512/6271f48db27bf005c45f1d1b/html5/thumbnails/23.jpg)
Language is Hierarchical
![Page 24: Frameworks - UMIACS](https://reader034.vdocuments.site/reader034/viewer/2022050512/6271f48db27bf005c45f1d1b/html5/thumbnails/24.jpg)
Dynamic Hierarchy in Language
� Language is hierarchical� Graph should reflect this reality� Traditional flow-control best for processing
� Combinatorial algorithms (e.g., dynamic programming)
� Exploit independencies to compute over a large space of operationstractably
Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 10 / 1
![Page 25: Frameworks - UMIACS](https://reader034.vdocuments.site/reader034/viewer/2022050512/6271f48db27bf005c45f1d1b/html5/thumbnails/25.jpg)
PyTorch
� Torch: Facebook’s deep learning framework
� Nice, but written in Lua (C backend)
� Optimized to run computations on GPU
� Mature, industry-supported framework
Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 11 / 1
![Page 26: Frameworks - UMIACS](https://reader034.vdocuments.site/reader034/viewer/2022050512/6271f48db27bf005c45f1d1b/html5/thumbnails/26.jpg)
Why GPU?
Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 12 / 1
![Page 27: Frameworks - UMIACS](https://reader034.vdocuments.site/reader034/viewer/2022050512/6271f48db27bf005c45f1d1b/html5/thumbnails/27.jpg)
Why GPU?
CPU
GPUHDD
Natural Language Processing: Jordan Boyd-Graber | UMD Frameworks | 12 / 1