Introduction to word embeddings with Python
TRANSCRIPT
Introduction to word embeddings
Pavel Kalaidin, @facultyofwonder
Moscow Data Fest, September 12th, 2015
distributional hypothesis
лойс
годно, лойс ("good stuff, лойс")
лойс за песню ("лойс for the song")
из принципа не поставлю лойс ("I won't give a лойс, on principle")
взаимные лойсы ("mutual лойсы")
лойс, если согласен ("лойс if you agree")
What is the meaning of лойс?
кек
кек, что ли? ("кек, or what?")
кек)))))))
ну ты кек ("you're such a кек")
What is the meaning of кек?
vector representations of words
simple and flexible platform for
understanding text and probably not messing up
one-hot encoding?
1 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0
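A minimal sketch of what one-hot encoding looks like in Python; the vocabulary and words are invented for illustration:

```python
import numpy as np

def one_hot(word, vocab):
    """Return a |V|-dimensional vector with a single 1 at the word's index."""
    vec = np.zeros(len(vocab))
    vec[vocab.index(word)] = 1.0
    return vec

vocab = ["cat", "dog", "fish"]
one_hot("dog", vocab)  # array([0., 1., 0.])
```

Every word is equally far from every other word here, which is exactly the problem: the encoding carries no notion of similarity.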
co-occurrence matrix
recall: word-document co-occurrence matrix for LSA
credits: [x]
from the entire document to a context window (length 5-10)
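A sketch of counting windowed co-occurrences, assuming a pre-tokenized corpus; the sentence and window size are toy choices:

```python
from collections import defaultdict

def cooccurrence_counts(tokens, window=5):
    """Count how often each ordered word pair appears within `window` tokens."""
    counts = defaultdict(int)
    for i, word in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if i != j:
                counts[(word, tokens[j])] += 1
    return counts

tokens = "the cat sat on the mat".split()
counts = cooccurrence_counts(tokens, window=2)
counts[("cat", "sat")]  # 1
```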
still seems suboptimal: big, sparse, etc.
lower dimensions: we want dense vectors
(say, 25-1000)
How?
matrix factorization?
SVD of co-occurrence matrix
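A toy illustration of the idea with NumPy's full SVD, truncated to k dimensions; the matrix values are made up, and real pipelines would use sparse, truncated solvers instead:

```python
import numpy as np

# toy word-word co-occurrence matrix (rows/columns = words)
X = np.array([[0, 2, 1],
              [2, 0, 3],
              [1, 3, 0]], dtype=float)

# SVD, then keep only the top-k singular directions as dense word vectors
U, s, Vt = np.linalg.svd(X)
k = 2
word_vectors = U[:, :k] * s[:k]  # each row is a dense k-dimensional embedding
word_vectors.shape  # (3, 2)
```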
lots of memory?
idea: directly learn low-dimensional vectors
here comes word2vec
Distributed Representations of Words and Phrases and their Compositionality, Mikolov et al: [paper]
idea: instead of capturing co-occurrence counts
predict surrounding words
Two models:
CBOW: predicting the word given its context
skip-gram: predicting the context given a word
Explained in great detail here, so we’ll skip it for now. Also see: word2vec Parameter Learning Explained, Rong, paper
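As a rough illustration of the skip-gram side, a sketch that extracts the (center word, context word) training pairs from a window; the helper name and toy input are mine:

```python
def skipgram_pairs(tokens, window=2):
    """Generate (center, context) training pairs as used by skip-gram."""
    pairs = []
    for i, center in enumerate(tokens):
        for j in range(max(0, i - window), min(len(tokens), i + window + 1)):
            if i != j:
                pairs.append((center, tokens[j]))
    return pairs

skipgram_pairs("the cat sat".split(), window=1)
# [('the', 'cat'), ('cat', 'the'), ('cat', 'sat'), ('sat', 'cat')]
```

The model is then trained to predict each context word from its center word.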
CBOW: several times faster than skip-gram, slightly better accuracy for frequent words.
Skip-gram: works well with a small amount of data, and represents rare words and phrases well.
Examples?
W_woman − W_man = W_queen − W_king
the classic example
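A toy sketch of this analogy arithmetic with cosine similarity; the 2-d vectors are fabricated so the "gender" offset is exactly consistent, a property real embeddings only approximate:

```python
import numpy as np

def cosine(a, b):
    return a @ b / (np.linalg.norm(a) * np.linalg.norm(b))

# fabricated 2-d embeddings with a consistent offset between the pairs
W = {
    "man":   np.array([1.0, 0.0]),
    "woman": np.array([1.0, 1.0]),
    "king":  np.array([3.0, 0.0]),
    "queen": np.array([3.0, 1.0]),
    "apple": np.array([0.0, 3.0]),
}

target = W["woman"] - W["man"] + W["king"]  # should land near W["queen"]
best = max((w for w in W if w not in ("woman", "man", "king")),
           key=lambda w: cosine(W[w], target))
best  # 'queen'
```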
<censored example>
word2vec Explained: Deriving Mikolov et al.’s Negative-Sampling Word-Embedding Method, Goldberg et al, 2014 [arxiv]
all done with gensim: github.com/piskvorky/gensim/
...failing to take advantage of the vast amount of repetition
in the data
so back to co-occurrences
GloVe, for Global Vectors. Pennington et al., 2014: nlp.stanford.edu/pubs/glove.pdf
Ratios seem to cancel noise
The gist: model ratios with vectors
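Written out, following the paper's notation: with X_ij the number of times word j occurs in the context of word i, and X_i the total count for word i, the model looks for a function F of word vectors that matches ratios of co-occurrence probabilities:

```latex
P_{ij} = \frac{X_{ij}}{X_i}, \qquad
F(w_i,\, w_j,\, \tilde{w}_k) = \frac{P_{ik}}{P_{jk}}
```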
The model
Preserving linearity
Preventing mixing dimensions
Restoring symmetry, part 1
recall:
Restoring symmetry, part 2
now it's a least-squares problem
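For reference, the weighted least-squares objective from the GloVe paper: a sum over co-occurring pairs, with a weighting function f that damps very frequent pairs:

```latex
J = \sum_{i,j=1}^{|V|} f(X_{ij})
    \left( w_i^{\top} \tilde{w}_j + b_i + \tilde{b}_j - \log X_{ij} \right)^2,
\qquad
f(x) = \begin{cases}
  (x / x_{\max})^{\alpha} & x < x_{\max} \\
  1 & \text{otherwise}
\end{cases}
```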
SGD → AdaGrad
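AdaGrad's appeal here is that coordinates with large accumulated gradients (frequent words) automatically get smaller steps. A self-contained sketch of one per-coordinate update; the function and variable names are mine, not GloVe's:

```python
import numpy as np

def adagrad_step(w, grad, hist, lr=0.05, eps=1e-8):
    """One AdaGrad update: per-coordinate learning rates shrink
    with the accumulated squared gradients stored in `hist`."""
    hist += grad ** 2
    w -= lr * grad / (np.sqrt(hist) + eps)
    return w, hist

w = np.zeros(3)
hist = np.zeros(3)
w, hist = adagrad_step(w, np.array([1.0, 0.0, -2.0]), hist)
```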
ok, Python code
glove-python: github.com/maciejkula/glove-python
two sets of vectors: input and context, plus biases
combine them by averaging, summing, or dropping one set
complexity: |V|²
complexity: |C|^0.8
Evaluation: it works
#spb #gatchina #msk #kyiv #minsk #helsinki
Compared to word2vec
#spb #gatchina #msk #kyiv #minsk #helsinki
t-SNE: github.com/oreillymedia/t-SNE-tutorial
seaborn: stanford.edu/~mwaskom/software/seaborn/
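For plots like these, a minimal sketch of projecting vectors to 2-d with scikit-learn's t-SNE; random vectors stand in for real embeddings, and `perplexity` must stay below the number of points:

```python
import numpy as np
from sklearn.manifold import TSNE

# pretend word vectors: 10 words in 50 dimensions
rng = np.random.RandomState(0)
vectors = rng.randn(10, 50)

# project to 2-d for plotting
coords = TSNE(n_components=2, perplexity=5, random_state=0).fit_transform(vectors)
coords.shape  # (10, 2)
```

The 2-d `coords` can then go straight into a seaborn or matplotlib scatter plot.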
Abusing models
music playlists: github.com/mattdennewitz/playlist-to-vec
user interests. Paragraph vectors: cs.stanford.edu/~quocle/paragraph_vector.pdf
predicting hashtags. Interesting read: #TAGSPACE: Semantic Embeddings from Hashtags [link]
RusVectōrēs: distributional semantic models for Russian: ling.go.mail.ru/dsm/en/
corpus matters
![Page 73: Introduction to word embeddings with Python](https://reader031.vdocuments.site/reader031/viewer/2022021423/5886a8bc1a28ab0c1d8b7947/html5/thumbnails/73.jpg)
building block for bigger models ╰(*´︶`*)╯