DEEP LEARNING FOR NLP
IVAN VERGILIEV
• @IvanVergiliev
• IvanVergiliev.github.io
WHAT IS DEEP LEARNING?
IT’S BASICALLY A NEURAL NETWORK…
BUT WITH LOTS OF HIDDEN LAYERS
ISN’T THAT FROM THE ’80S?
IT’S ACTUALLY FROM THE ’40S
AS WELL AS SOME NEW IDEAS
e.g. Convolutional Neural Nets
WHY DO WE NEED NLP?
NATURAL LANGUAGES ARE HARD
LOOK AT JAVASCRIPT
Developed by some of the most innovative developers
BULGARIAN
~1300 years
> 7 million just today
ESPERANTO IS PROBABLY EASIER
WORD FREQUENCIES
• wow, a word cloud
COUNT WORD OCCURRENCES
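Counting occurrences takes only a few lines. This is a minimal sketch: the tokenisation rule (lowercase, letters and apostrophes only) and the sample sentence are assumptions made for illustration, not something the slides specify.

```python
import re
from collections import Counter

def word_counts(text):
    """Lowercase the text, split it into words, and count each word."""
    words = re.findall(r"[a-z']+", text.lower())
    return Counter(words)

# Made-up sample text for this sketch.
counts = word_counts("I like cake. I love pie. I like pie.")
top_two = counts.most_common(2)  # the two most frequent words with their counts
```

A word cloud is essentially a picture of these counts, with more frequent words drawn larger.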
APPLICATIONS
• Spelling correction
• Optical Character Recognition
• Speech Recognition
TEXT GENERATORS
not very useful, but fun
HACKERNEWS TITLES
• How Facebook is killing Linux on the desktop
• Facebook claims it can read your e-mail without a data plan
• Only a few countries are teaching children how to drive customers away
http://what-would-i-say.com/
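One common way to build a generator like this is a Markov chain over word pairs: remember which words followed each word, then take a random walk. The slides do not say how the titles above were produced, so treat this as a sketch of one possible approach. The training titles, the `<s>` and `</s>` markers, and the function names are all made up for the example.

```python
import random
from collections import defaultdict

def build_chain(titles):
    """Map each word to the list of words that followed it in the training titles."""
    chain = defaultdict(list)
    for title in titles:
        words = ["<s>"] + title.split() + ["</s>"]
        for current, following in zip(words, words[1:]):
            chain[current].append(following)
    return chain

def generate(chain, rng, max_words=20):
    """Walk the chain from the start marker until the end marker or max_words."""
    word, output = "<s>", []
    while len(output) < max_words:
        word = rng.choice(chain[word])
        if word == "</s>":
            break
        output.append(word)
    return " ".join(output)

# Hypothetical training titles, invented for this sketch.
titles = [
    "How Facebook is changing the web",
    "How Linux is winning on the server",
    "Facebook is hiring",
]
chain = build_chain(titles)
title = generate(chain, random.Random(0))
```

Because each step looks only at the previous word, the output is locally plausible and globally nonsensical, which is where the fun comes from.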
WORD CLASSES
• 2014 was a good year.
• <YEAR> was a good year.
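Replacing a concrete year with a class token can be sketched with a regular expression. The pattern below treats any four-digit number from 1000 to 2099 as a year; that range is an assumption for the example, not a rule from the slides.

```python
import re

# Assumption: a "year" is a standalone four-digit number between 1000 and 2099.
YEAR = re.compile(r"\b(1[0-9]{3}|20[0-9]{2})\b")

def replace_years(sentence):
    """Replace every year-like number with the class token <YEAR>."""
    return YEAR.sub("<YEAR>", sentence)

normalised = replace_years("2014 was a good year.")
```

After this step, statistics gathered for one year apply to every other year as well.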
NOT GOOD ENOUGH
• I like cake
• I love pie
PART OF SPEECH TAGGING
The lecturer criticised the person.
NEED A REPRESENTATION OF MEANING
DISTRIBUTED REPRESENTATION
• city = [-0.5, 0.3, …, 0.7]
• town = [-0.52, 0.35, …, 0.8]
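With vectors, "similar meaning" becomes "similar direction", usually measured by cosine similarity. The sketch below uses toy 3-dimensional vectors: `city` and `town` keep only the components visible above (the elided ones are simply dropped), and `cake` is a made-up vector added for contrast.

```python
import math

def cosine(a, b):
    """Cosine similarity: dot product divided by the product of the lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Toy vectors: real embeddings have many more dimensions.
city = [-0.5, 0.3, 0.7]
town = [-0.52, 0.35, 0.8]
cake = [0.6, -0.4, 0.1]   # hypothetical unrelated word

similar = cosine(city, town)    # close to 1
unrelated = cosine(city, cake)  # much lower
```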
TRAIN A NEURAL NETWORK
THE VECTORS ACTUALLY MAKE SENSE
MORE SENSE THAN EXPECTED
W("WOMAN") − W("MAN") ≃ W("AUNT") − W("UNCLE")
W("WOMAN") − W("MAN") ≃ W("QUEEN") − W("KING")
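The same relation can be used to answer analogy questions: take W(b) − W(a) + W(c) and look for the nearest word. In the sketch below the vectors are built by hand so that the relation holds exactly (first axis: gender, second axis: role). Learned embeddings only satisfy it approximately, and the function name `analogy` is made up for the example.

```python
import math

# Hand-built toy vectors, NOT learned embeddings.
W = {
    "man":   [0.0, 0.0], "woman": [1.0, 0.0],
    "uncle": [0.0, 1.0], "aunt":  [1.0, 1.0],
    "king":  [0.0, 2.0], "queen": [1.0, 2.0],
}

def analogy(a, b, c):
    """Return the word whose vector is nearest to W[b] - W[a] + W[c]."""
    target = [wb - wa + wc for wa, wb, wc in zip(W[a], W[b], W[c])]
    candidates = [w for w in W if w not in (a, b, c)]
    return min(candidates, key=lambda w: math.dist(W[w], target))

royal = analogy("man", "woman", "king")    # man is to woman as king is to ...
family = analogy("man", "woman", "uncle")  # man is to woman as uncle is to ...
```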
NEURAL LANGUAGE MODELS
FEEDFORWARD NEURAL NETWORK BASED LANGUAGE MODEL
It even sounds fancy
LOCAL CONTEXT ONLY
Yesterday - the third day of the month - I went out.
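A feedforward language model predicts the next word from a fixed number of preceding words. The sketch below shows what such a model would see when predicting "went" in the sentence above. The window size of 4 and the whitespace tokenisation are assumptions for the example.

```python
def context_window(tokens, position, size):
    """Return the `size` tokens immediately before tokens[position]."""
    return tokens[max(0, position - size):position]

tokens = "Yesterday - the third day of the month - I went out .".split()
target = tokens.index("went")
context = context_window(tokens, target, size=4)
sees_yesterday = "Yesterday" in context
```

"Yesterday" is what signals the past tense, but it falls outside the window, so the model cannot use it.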
RECURRENT NEURAL NETWORK
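A recurrent network replaces the fixed window with a hidden state that is updated after every word, so early inputs can still influence later predictions. Below is a minimal sketch of an Elman-style update in plain Python. The weights are fixed by hand rather than trained, and the sizes and names are assumptions for illustration.

```python
import math

def rnn_step(x, h_prev, W_xh, W_hh):
    """One update: h = tanh(W_xh @ x + W_hh @ h_prev)."""
    h = []
    for i in range(len(h_prev)):
        total = sum(W_xh[i][j] * x[j] for j in range(len(x)))
        total += sum(W_hh[i][k] * h_prev[k] for k in range(len(h_prev)))
        h.append(math.tanh(total))
    return h

def run(inputs, W_xh, W_hh, hidden_size):
    """Feed the inputs one by one, carrying the hidden state along."""
    h = [0.0] * hidden_size
    for x in inputs:
        h = rnn_step(x, h, W_xh, W_hh)
    return h

# Hypothetical weights, chosen by hand for this sketch.
W_xh = [[1.0, 0.0], [0.0, 1.0]]
W_hh = [[0.9, 0.0], [0.0, 0.9]]

# Two sequences that differ only in their very first input.
seq_a = [[1.0, 0.0]] + [[0.0, 0.0]] * 10
seq_b = [[0.0, 0.0]] + [[0.0, 0.0]] * 10
h_a = run(seq_a, W_xh, W_hh, 2)
h_b = run(seq_b, W_xh, W_hh, 2)
```

Ten steps later the two hidden states still differ, so the first input has not been forgotten, although its trace fades with every step.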
WHERE DO WE GO NOW?
PARAGRAPH VECTORS
WHY DEFINE WORDS AT ALL?
Can’t we learn from raw data like the image nets do?
TEXT UNDERSTANDING FROM SCRATCH
SHARED REPRESENTATIONS
CAN WE PUT OTHER THINGS IN THE SAME SPACE?
BUT WHY JUST TEXT?
AUTOMATED IMAGE CAPTIONING
“WHAT I LEARNED FROM COMPETING AGAINST A CONVNET ON IMAGENET”
“AWW, A CUTE DOG!”
HUMAN WON
A TOUGH RACE THOUGH
BREAKING NEURAL NETWORKS
NOT A PANDA ANYMORE
THANKS!
QUESTIONS?
REFERENCES
• http://colah.github.io/posts/2014-07-Conv-Nets-Modular/
• http://www.fit.vutbr.cz/~imikolov/rnnlm/thesis.pdf
• http://cs.stanford.edu/~quocle/paragraph_vector.pdf
• http://arxiv.org/pdf/1502.01710v2.pdf