Intel® QS Scaling
References
1. William Zeng and Bob Coecke, “Quantum Algorithms for Compositional Natural Language Processing”, EPTCS 221, 2016.
2. Joachim Lambek, “From word to sentence”, Polimetrica, Milan, 2008.
3. Nathan Wiebe, Ashish Kapoor, and Krysta M. Svore, “Quantum Algorithms for Nearest-Neighbor Methods for Supervised and Unsupervised Learning”, Quantum Information and Computation, 2014.
4. Mikhail Smelyanskiy, Nicolas P. D. Sawaya, Alán Aspuru-Guzik, “qHiPSTER: The Quantum High Performance Software Testing Environment”, arXiv:1601.07195, 2016.
Intel® Quantum Simulator
Intel® Quantum Simulator[4]
– Quantum High Performance Software Testing Environment.
– Distributed high-performance implementation of a quantum simulator on a classical computer.
– Out-of-the-box simulation of general single-qubit and two-qubit (controlled) gates.
• Rotation, Hadamard, Pauli, Square Root of Pauli, Toffoli, SWAP, Square Root of SWAP.
– Single- and double-precision for qubit registers.
Implementation
– Single-node and multi-node implementations of qubit gate operations.
– Multi-node implementation distributes the state vector across nodes to fit per-node memory capacity, with memory usage optimised for improved communication.
Optimisations
– Vectorisation, multi-threading, gate fusion.
Methodology
Quantum Natural Language Processing
Lee James O’Riordan¹, Myles Doyle¹, Venkatesh Kannan¹, Fabio Baruffa²
¹Irish Centre for High-End Computing, Ireland
²Intel Deutschland GmbH
Background
Natural Language Processing
– Sentiment analysis, relationship extraction, word sense disambiguation, automatic summary generation.
– Traditional “bag of words” approach
• Lacks information about grammatical rules of language.
• Increase in problem complexity reduces quality of results.
– “Distributed compositional semantics” approach[1]
• Grammatically informed algorithms compute sentence meanings.
• Implementation requires large classical computational resources.
Quantum Computing
– Potential to offer dramatic speedup to algorithms which can exploit quantum parallelism.
– Requires quantum versions of classical algorithms.
– Application development
• Limited by scale, reliability and coherence of quantum devices.
• Software quantum simulators allow proof-of-concept applications.
Project Objective
– Implement quantum versions of distributed compositional semantics algorithms to analyse sentence meaning.
– Develop and evaluate the solution on the Intel® Quantum Simulator[4] deployed on the Irish national supercomputer “Kay”.
Partnership
– Irish Centre for High-End Computing & Intel® Corporation.
– Co-funded by Enterprise Ireland & Intel® Ireland.
Project Execution
– January 2019 to March 2020.
Distributed Compositional Semantics
Distributional Model
– Algorithms based on “bag of words” approach.
– Meaning of words represented by frequencies of “nearby” words in a corpus.
Compositional Model[2]
– Algorithms derive meaning of sentences or phrases from known meanings of component words.
– Embeds types of words and grammatical structure.
Unified DisCo Model[1]
– Combines both approaches to introduce grammatical form to the composition of word meanings.
– Allows computing the meanings of two sentences and deciding whether their meanings match.
Contact Information
Irish Centre for High-End Computing (ICHEC)
– Dr. Lee James O’Riordan, [email protected]
– Mr. Myles Doyle, [email protected]
– Dr. Venkatesh Kannan, [email protected]
Intel Deutschland GmbH
– Dr. Fabio Baruffa, [email protected]
Quantum Advantage
Classical vs. Quantum Storage Requirements
             1 transitive verb    10K transitive verbs
Classical    1 GB                 10 TB
Quantum      33 qubits            47 qubits
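The storage figures above can be reproduced with a short calculation. This is a sketch under two assumptions that the poster does not state: each transitive verb is a rank-3 tensor with 2,000 entries per index, and the classical encoding spends one bit per tensor entry. The names `NOUN_DIM`, `classical_bytes` and `qubits_needed` are illustrative only.

```python
# Assumption (not stated on the poster): each of the three tensor indices
# of a transitive verb ranges over 2,000 basis words.
NOUN_DIM = 2000
ENTRIES_PER_VERB = NOUN_DIM ** 3


def classical_bytes(num_verbs: int) -> int:
    """Classical storage, assuming one bit per tensor entry."""
    return num_verbs * ENTRIES_PER_VERB // 8


def qubits_needed(num_entries: int) -> int:
    """Smallest n such that 2**n >= num_entries (one amplitude per entry)."""
    return (num_entries - 1).bit_length()


for verbs in (1, 10_000):
    entries = verbs * ENTRIES_PER_VERB
    print(f"{verbs} verb(s): {classical_bytes(verbs):.1e} bytes, "
          f"{qubits_needed(entries)} qubits")
```

Under these assumptions the qubit count grows only logarithmically with the number of tensor entries, which is the source of the storage advantage in the table.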
[Figure: DisCo model structure. Diagram labels: monoidal categories (type specification and reduction; meaning representation and composition); meaning vector spaces; pregroup for types; language; objects; meaning; axioms; type logic.]
Problem Mapping
1. Represent category theoretic data structures and NLP operations using Dirac notation.
2. Define quantum versions of DisCo algorithms from literature using Dirac notation.
3. Map elements from Dirac notation to quantum circuit notation (gates/registers/circuits).
4. Map elements from quantum circuit notation to the Intel® Quantum Simulator paradigm.
Results: Comparing Sentences
[Figure: Example of DisCo model sentence meaning for “Mary (subject) likes (verb) John (object)”, shown in category theoretic graphical notation, quantum Dirac representation, quantum circuit notation, and as mapped to the Intel® Quantum Simulator.]

$$|\text{likes}\rangle = \sum_{i,j,k} c_{i,j,k}\,|i\rangle|j\rangle|k\rangle$$

$$|\text{Mary}\rangle \otimes \left(\sum_{i,j,k} c_{i,j,k}\,|i\rangle|j\rangle|k\rangle\right) \otimes |\text{John}\rangle \;\to\; \sum_{i,j,k} c_{i,j,k}\,\langle\text{Mary}|i\rangle\,|j\rangle\,\langle\text{John}|k\rangle \;\to\; \sum_{j} d_j\,|j\rangle$$
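The Dirac-notation reduction of “Mary likes John” to a sentence vector with coefficients d_j can be illustrated numerically. The following is a minimal sketch, assuming real amplitudes (so ⟨Mary|i⟩ is simply the i-th component of the Mary vector) and a two-dimensional toy space. The vectors and the `contract` helper are hypothetical and are not taken from the poster.

```python
def contract(subject, verb, obj):
    """Return d_j = sum over i, k of subject[i] * verb[i][j][k] * obj[k]."""
    n_i, n_j, n_k = len(verb), len(verb[0]), len(verb[0][0])
    if len(subject) != n_i or len(obj) != n_k:
        raise ValueError("word vectors do not match the verb tensor shape")
    return [
        sum(subject[i] * verb[i][j][k] * obj[k]
            for i in range(n_i) for k in range(n_k))
        for j in range(n_j)
    ]


# Hypothetical toy data: two-dimensional noun and sentence spaces.
mary = [1.0, 0.0]
john = [0.0, 1.0]
likes = [
    [[0.0, 0.6], [0.0, 0.8]],  # c[0][j][k]
    [[0.0, 0.0], [0.0, 0.0]],  # c[1][j][k]
]

sentence = contract(mary, likes, john)
print(sentence)
```

The subject and object indices are summed away, leaving only the middle index j, which is why the result lives in the sentence space alone.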
Two-tier implementation
1. Intel-QS bindings and quantum encoding implemented in C++ layer (red).
2. Pre-processing, analysis and plotting implemented in Python (blue).
3. External dependencies for respective layers indicated by colour gradients (green/red => C++, green/blue => Python).
Quantum computing applications can be written entirely in C++ or in Python. The Python layer allows for a single-ended application development environment, or standard cluster submission with respective Python scripts.
Intel® Quantum Simulator on Kay
Kay
– 336 node cluster.
– 13,440 CPU cores, 63 TB distributed memory.
– Dual-socket 20-core Intel® Xeon Gold (Skylake) 6148 at 2.4 GHz with 192 GB memory.
– 400 GB local SSD scratch.
– 100 Gbit/s Intel® Omni-Path network.
– Additional partitions
• Dual NVIDIA Tesla V100.
• Intel Xeon Phi (Knights Landing architecture).
• High-memory 1.5 TB RAM with 1 TB local SSD scratch.
Simulation on Kay
– Up to 33 qubits on single-node executions.
– Up to 41 qubits on Kay’s main partition.
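The qubit limits quoted for Kay are consistent with a simple memory estimate for a full state-vector simulation. This is a sketch, assuming 16 bytes per amplitude (double-precision complex), decimal GB/TB, and no allowance for communication buffers or other overhead. The function names are illustrative.

```python
BYTES_PER_AMPLITUDE = 16  # assumption: double-precision complex amplitudes


def state_vector_bytes(n_qubits: int) -> int:
    """Memory needed to hold all 2**n amplitudes of an n-qubit register."""
    return (2 ** n_qubits) * BYTES_PER_AMPLITUDE


def max_qubits(memory_bytes: int) -> int:
    """Largest n whose state vector fits into memory_bytes."""
    return (memory_bytes // BYTES_PER_AMPLITUDE).bit_length() - 1


NODE_MEMORY = 192 * 10**9      # 192 GB per node
CLUSTER_MEMORY = 63 * 10**12   # 63 TB distributed memory

print(max_qubits(NODE_MEMORY), max_qubits(CLUSTER_MEMORY))
```

Each additional qubit doubles the memory requirement, so going from a single node to the whole main partition adds only a handful of qubits.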
Dataset    Token      Binary
n_s        adult      00
n_s        child      01
n_s        smith      10
n_s        surgeon    11
v          stand      00
v          move       01
v          sit        10
v          sleep      11
n_o        inside     0
n_o        outside    1
Table 1: Basis data

Dataset    Token      State
n_s        John       $(|00\rangle + |10\rangle)/\sqrt{2}$
n_s        Mary       $(|01\rangle + |11\rangle)/\sqrt{2}$
v          walk       $(|00\rangle + |01\rangle)/\sqrt{2}$
v          rest       $(|10\rangle + |11\rangle)/\sqrt{2}$
n_o        inside     $|0\rangle$
n_o        outside    $|1\rangle$
Table 2: Sentence data encoding using basis from Table 1.
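The two tables above amount to a lookup from tokens to bit-strings, where a sentence token such as “John” expands to an equal superposition of basis tokens. The sketch below shows one way to enumerate the basis bit-strings a subject-verb-object triple expands to. The dictionaries mirror the tables, while the function names and data layout are assumptions made for illustration.

```python
from itertools import product

# Basis tokens and their binary patterns (Table 1).
BASIS = {
    "n_s": {"adult": "00", "child": "01", "smith": "10", "surgeon": "11"},
    "v": {"stand": "00", "move": "01", "sit": "10", "sleep": "11"},
    "n_o": {"inside": "0", "outside": "1"},
}

# Sentence tokens as equal superpositions of basis tokens (Table 2).
COMPOSITE = {
    "n_s": {"John": ["adult", "smith"], "Mary": ["child", "surgeon"]},
    "v": {"walk": ["stand", "move"], "rest": ["sit", "sleep"]},
    "n_o": {},
}


def token_bits(dataset, token):
    """Bit patterns for a basis token, or for each component of a composite."""
    if token in BASIS[dataset]:
        return [BASIS[dataset][token]]
    return [BASIS[dataset][t] for t in COMPOSITE[dataset][token]]


def encode(subject, verb, obj):
    """All 5-bit basis states in the equal superposition for the triple."""
    parts = (token_bits("n_s", subject), token_bits("v", verb),
             token_bits("n_o", obj))
    return ["".join(bits) for bits in product(*parts)]


print(encode("adult", "sit", "inside"))
print(encode("John", "walk", "inside"))
```

A triple built from k basis states corresponds to a superposition with amplitude 1/√k on each returned bit-string.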
[Figure (left): probability P(adult,stand,inside) for each encoded state: adult,sit,inside $|00100\rangle$; smith,sit,inside $|10100\rangle$; adult,sleep,inside $|00110\rangle$; smith,sleep,inside $|10110\rangle$; child,stand,outside $|01001\rangle$; surgeon,stand,outside $|11001\rangle$; child,move,outside $|01011\rangle$; surgeon,move,outside $|11011\rangle$. Vertical axis from 0.00 to 0.20. Legend: even superposition measurement; post-selection measurement. Annotation: $1/\sqrt{n}$.]
[Figure (right): probability P(smith,stand,outside) for the same encoded states, with the same axis range, legend and annotation.]
Basis and data
Data: “John rests inside, and Mary walks outside”
Data encoding and computation
• Sentence data is mapped onto the chosen basis, defining the quantum state for encoding.
• Quantum state representing meaning is encoded onto quantum simulator.
• Hamming distance calculated between test and encoded data.
• State amplitudes adjusted to reflect Hamming distance.
• Bit-string outcome is measured.
• Simulation is repeated to build state-distribution knowledge.

$$\text{“Mary walks outside”} \to \tfrac{1}{2}\left(|01001\rangle + |01011\rangle + |11001\rangle + |11011\rangle\right)$$

$$\text{“John rests inside”} \to \tfrac{1}{2}\left(|00100\rangle + |00110\rangle + |10100\rangle + |10110\rangle\right)$$
Preliminary results
• (Left) The closest encoded quantum-state vector to the test data “adult(s) stand inside”: the state with the smallest calculated Hamming distance is “adult(s) sit inside”.
• (Right) Similarly, the closest state for “smith(s) stand outside” is “surgeon(s) stand outside”.
The results of this computation are entirely context-dependent, and by choosing different corpora one may obtain different relative meanings between queried test data.
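The two preliminary results can be cross-checked classically by computing the Hamming distance between each test bit-string and the eight encoded states. This is only a sketch of the distance comparison: it does not model the quantum routine, which adjusts state amplitudes and samples measurement outcomes. The test bit-strings follow from the basis encoding (adult 00, smith 10, stand 00, inside 0, outside 1), and the helper names are illustrative.

```python
# Encoded states, labelled as on the result plots.
ENCODED = {
    "00100": "adult,sit,inside",
    "10100": "smith,sit,inside",
    "00110": "adult,sleep,inside",
    "10110": "smith,sleep,inside",
    "01001": "child,stand,outside",
    "11001": "surgeon,stand,outside",
    "01011": "child,move,outside",
    "11011": "surgeon,move,outside",
}


def hamming(a: str, b: str) -> int:
    """Number of positions at which two equal-length bit-strings differ."""
    if len(a) != len(b):
        raise ValueError("bit-strings must have equal length")
    return sum(x != y for x, y in zip(a, b))


def closest(test_bits: str) -> str:
    """Label of the encoded state with the smallest Hamming distance."""
    best = min(ENCODED, key=lambda state: hamming(state, test_bits))
    return ENCODED[best]


print(closest("00000"))  # test data: adult,stand,inside
print(closest("10001"))  # test data: smith,stand,outside
```

In both cases exactly one encoded state lies at distance 1 from the test data, so the nearest neighbour is unambiguous for this corpus.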
Strong Scaling
– Approximately linear decrease in runtime, which slows as the number of processes becomes large: the work per process decreases, so MPI communication overhead takes up a growing share of the runtime.
Weak Scaling
– Non-linear increase in runtime, caused by the increase in MPI communication as the workload increases.
Implementation