text mining lab (summer 2017) - word vector representation
Post on 21-Jan-2018
Summer 2017
Elvis Saravia
PhD, Information Systems and Applications
ellfae@gmail.com
Github username: omarsar
Questions: sli.do (#Z217)
● Knowledge Discovery (KDD) Process
ConceptNet
Motel = [0 0 0 0 0 0 0 0 1 0 0 0 0 0 0 0]
Hotel = [0 0 0 0 0 1 0 0 0 0 0 0 0 0 0 0]
One-hot representation
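A minimal sketch of why one-hot vectors carry no notion of similarity: any two distinct words have a dot product of 0, so "hotel" is exactly as far from "motel" as from any other word. The five-word vocabulary and the helper names `one_hot` and `dot` are assumptions made for illustration; the slide itself uses 16-dimensional vectors.

```python
def one_hot(word, vocab):
    """Return a vector with a single 1 at the word's index in vocab."""
    vec = [0] * len(vocab)
    vec[vocab.index(word)] = 1
    return vec


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


# Hypothetical toy vocabulary (not from the slides).
vocab = ["cat", "dog", "hotel", "motel", "walk"]

hotel = one_hot("hotel", vocab)
motel = one_hot("motel", vocab)

print("hotel =", hotel)
print("motel =", motel)
print("hotel . motel =", dot(hotel, motel))  # distinct words never overlap
print("hotel . hotel =", dot(hotel, hotel))  # a word only matches itself
```

The vector length grows with the vocabulary size, which is the other drawback the distributed representation addresses.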
hotel = [0.728 0.234 -0.23 0.223]
Distributed representation (low-dimension vector)
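With dense vectors, similarity between words becomes measurable. The sketch below compares words by cosine similarity. Only the `hotel` vector comes from the slide; the `motel` and `banana` vectors are made-up values chosen for illustration, not output from a trained model.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors, in [-1, 1]."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


hotel = [0.728, 0.234, -0.23, 0.223]   # vector shown on the slide
motel = [0.70, 0.25, -0.20, 0.20]      # hypothetical: close to hotel
banana = [-0.31, 0.62, 0.45, -0.10]    # hypothetical: unrelated word

print("cos(hotel, motel)  =", round(cosine(hotel, motel), 3))
print("cos(hotel, banana) =", round(cosine(hotel, banana), 3))
```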
Paper source: https://arxiv.org/pdf/1301.3781.pdf
Feedforward Neural Net Language Model (NNLM)
variables to optimize
denotes window range
P(the|over) P(fox|over) P(jumped|over) P(the|over) P(lazy|over) P(dog|over)
P(VOUT | VIN)
How to define this probability distribution?
Determines similarity in [-1,1]
Get a probability in [0,1] out of a similarity in [-1,1]
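One common way to turn similarities into probabilities is a softmax over the vocabulary. The sketch below is an illustration under stated assumptions, not the slides' implementation: the sentence "the fox jumped over the lazy dog" and the window size of 3 are inferred from the probabilities listed above, the input and output vectors are random rather than trained, and the helper names are hypothetical. It scores with cosine similarity to match the [-1,1] range on the slide; word2vec implementations typically score with the unnormalized dot product instead.

```python
import math
import random


def context_words(tokens, center_index, window):
    """Words within `window` positions of the center word, excluding it."""
    lo = max(0, center_index - window)
    hi = min(len(tokens), center_index + window + 1)
    return [tokens[i] for i in range(lo, hi) if i != center_index]


def cosine(u, v):
    """Cosine similarity in [-1, 1]."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v)))


def softmax(scores):
    """Map arbitrary scores to probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


tokens = "the fox jumped over the lazy dog".split()  # assumed example sentence
context = context_words(tokens, tokens.index("over"), window=3)

# Random, untrained input (center) and output (context) vectors.
rng = random.Random(0)
vocab = sorted(set(tokens))
dim = 4
v_in = {w: [rng.uniform(-1, 1) for _ in range(dim)] for w in vocab}
v_out = {w: [rng.uniform(-1, 1) for _ in range(dim)] for w in vocab}

# P(w | over) for every w in the vocabulary.
sims = [cosine(v_out[w], v_in["over"]) for w in vocab]
probs = dict(zip(vocab, softmax(sims)))

for w in context:
    print(f"P({w}|over) = {probs[w]:.3f}")
```

Training then adjusts the vectors so that the probabilities of the observed context words increase.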
https://www.healthvault.com/en-us/health-bot/
● https://goo.gl/ppHX65
● Gensim guide for word2vec: https://goo.gl/i2UrdH
● https://goo.gl/7b72S9
● https://goo.gl/uNJDrs
● https://goo.gl/KYacjz
● https://goo.gl/JezgYg
a. Build API: (Flask/Django recommended)
b. Pretrained models: (Guide: https://goo.gl/5qt2Ki)
c. Visualization: D3.js / Plotly / TensorBoard

a. LSTM - (Guide: http://colah.github.io/posts/2015-08-Understanding-LSTMs/)
b. CNN - (Guide: https://goo.gl/PgLUs7)
c. RNN - (Guide: https://goo.gl/5L9kci)

a. Starting point: https://rare-technologies.com/word2vec-tutorial#app