A Look into Lake Crest
Intel Lake Crest, Yutaka Yasuda, Kyoto Sangyo University, 2016/12/16
AI milestones:
• 2016.3 AlphaGo vs. an expert human Go player
• 2016.9 Google (AI)
• 2015 Google Photos
“Google's AlphaGo AI Continues to Wallop Expert Human Go Player”, Popular Mechanics, 2016/3/10 http://www.popularmechanics.com/technology/a19863/googles-alphago-ai-wins-second-game-go/
Deep Learning
2014 ImageNet Google 20
2012 Google
“Deep Visual-Semantic Alignments for Generating Image Descriptions”, Andrej Karpathy, Li Fei-Fei, Stanford University, CVPR 2015
A Neural Network is a network of artificial neurons.
https://en.wikipedia.org/wiki/Artificial_neural_network
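A single artificial neuron computes a weighted sum of its inputs and passes it through a nonlinearity; a minimal framework-free sketch of that idea (illustrative names and values, not taken from any of the cited sources):

```python
import math

def neuron(inputs, weights, bias):
    """One artificial neuron: weighted sum of inputs plus bias,
    squashed into (0, 1) by a sigmoid activation."""
    z = sum(x * w for x, w in zip(inputs, weights)) + bias
    return 1.0 / (1.0 + math.exp(-z))

def layer(inputs, weight_rows, biases):
    """A layer is just many neurons reading the same inputs."""
    return [neuron(inputs, w, b) for w, b in zip(weight_rows, biases)]

out = layer([0.5, -1.0], [[0.8, 0.2], [-0.4, 0.9]], [0.1, 0.0])
assert all(0.0 < y < 1.0 for y in out)
```

Stacking such layers, with learned weights, is all a deep network is; the hardware question in the rest of this talk is how to run those multiply-accumulate sums fast.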
“Introduction to multi gpu deep learning with DIGITS 2”, Mike Wang http://www.slideshare.net/papisdotio/introduction-to-multi-gpu-deep-learning-with-digits-2-mike-wang/6
https://www.youtube.com/watch?v=BMEffRAvnk4
Why nVIDIA?
Lake Crest
Intel Artificial Intelligence Day, 2016/11/17 12:30 PM PT, San Francisco
http://pc.watch.impress.co.jp/docs/column/ubiq/1030981.html
Intel Nervana Engine
https://www.nervanasys.com/technology/engine/
What is an ASIC?
An ASIC (Application-Specific Integrated Circuit) is a chip customized for one particular use, in contrast to general-purpose chips such as CPUs and GPUs.
(Wikipedia: “ASIC”)
The Nervana Engine, as presented on the Nervana website:
• 2.5D packaging
• “Blazingly fast data access via high-bandwidth memory (HBM)”
• Processing Cluster ×12 (3×4)
• ICL (Inter Chip Link) ×12
• 8GB HBM2 ×4
HBM?
An Introduction to HBM - High Bandwidth Memory - Stacked Memory and The Interposer http://www.guru3d.com/articles-pages/an-introduction-to-hbm-high-bandwidth-memory,2.html
• HBM stacks DRAM dies vertically
• The memory stack sits next to the GPU on a silicon interposer
• This side-by-side arrangement is called 2.5D integration
GDDR5 vs. HBM2:

             GDDR5                      HBM2
Bus width    32-bit                     1024-bit
Data rate    Up to 1750 MHz (7 Gbps)    2 Gbps
Bandwidth    Up to 28 GB/s per chip     256 GB/s (2 Tb/s) per unit
Voltage      1.5 V                      1.3 V
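The bandwidth figures follow directly from bus width × per-pin data rate ÷ 8; a quick sanity check of that arithmetic (my own, not from the slides — note that 1024 bits × 2 Gbps works out to about 2 Tb/s, i.e. 256 GB/s per stack):

```python
def bandwidth_gbs(bus_bits, gbps_per_pin):
    """Peak bandwidth in GB/s: bus width (bits) x per-pin rate (Gbit/s) / 8."""
    return bus_bits * gbps_per_pin / 8

# GDDR5: narrow 32-bit bus at a fast 7 Gbps per pin -> 28 GB/s per chip
assert bandwidth_gbs(32, 7) == 28.0
# HBM2: very wide 1024-bit bus at a modest 2 Gbps per pin -> 256 GB/s per stack
assert bandwidth_gbs(1024, 2) == 256.0
```

The design trade is visible in the numbers: HBM wins by going wide and slow, which is also why it needs the interposer's dense wiring.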
For scale, compare a CPU socket. LGA 2011 has 2011 contact pins (used by Xeon E5 1600/2600 v4, Broadwell-EP): roughly 2000 pins for the entire CPU, versus 1024 bits × 4 stacks for the HBM2 interfaces alone.
→ Wikipedia: LGA 2011
http://pc.watch.impress.co.jp/docs/column/ubiq/1030981.html
Tensor
https://www.tensorflow.org
“TensorFlow: Large-Scale Machine Learning on Heterogeneous Distributed Systems”, Abadi, et al., 2015,
https://arxiv.org/abs/1603.04467v2
https://www.tensorflow.org/tutorials/mnist/beginners/
https://en.wikipedia.org/wiki/Artificial_neural_network
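The MNIST “beginners” tutorial linked above centers on softmax regression, y = softmax(Wx + b); a framework-free sketch of that computation (my own illustration with made-up numbers, not the tutorial's TensorFlow code):

```python
import math

def softmax(zs):
    """Normalize raw scores into probabilities that sum to 1."""
    m = max(zs)                        # subtract max for numerical stability
    exps = [math.exp(z - m) for z in zs]
    s = sum(exps)
    return [e / s for e in exps]

def predict(x, W, b):
    """Softmax regression: scores z = Wx + b, then probabilities."""
    zs = [sum(w_i * x_i for w_i, x_i in zip(row, x)) + b_j
          for row, b_j in zip(W, b)]
    return softmax(zs)

p = predict([1.0, 2.0], [[0.1, 0.2], [0.3, 0.4]], [0.0, 0.0])
assert abs(sum(p) - 1.0) < 1e-9
```

Everything reduces to matrix-vector multiply-adds, which is exactly the workload the Nervana Engine, GPUs, and TPUs compete on.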
or CPU
The Nervana Engine is an ASIC for tensor operations.
• HBM2 ×4 units
• HBM's 1024-bit bus!
• 2.5D packaging
Nervana Engine
12 inter-chip links (ICL), 100 Gbit/s each
https://www.nervanasys.com/technology/engine/
100 Gbit/s × 12
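Taking those link counts at face value, the aggregate chip-to-chip bandwidth is simple arithmetic (my own, assuming all 12 links are active at once):

```python
links = 12
gbit_per_link = 100                 # each ICL runs at 100 Gbit/s
total_gbit = links * gbit_per_link  # aggregate bit rate
total_gbyte = total_gbit / 8        # in bytes: chip-to-chip bandwidth

assert total_gbit == 1200           # 1.2 Tbit/s of off-chip links
assert total_gbyte == 150.0         # i.e. 150 GB/s
```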
Why did Deep Learning land on the GPU?
GPUs are SIMD machines: “single instruction, multiple data”.
http://logmi.jp/45705
A GPU's SIMD units operate on 32-bit floating point values.
That data-parallel style fits AI workloads, which is why the GPU, and nVIDIA rather than the CPU vendors, became the default choice.
https://www.tensorflow.org/tutorials/mnist/beginners/
How does the Nervana Engine differ from a GPU?
Binary Neural Network
GPUs compute with 32-bit floats.
BNN - Binarized Neural Network: weights and activations constrained to -1 / +1.
Nervana: “Accelerating Neural Networks with Binary Arithmetic”
https://www.nervanasys.com/accelerating-neural-networks-binary-arithmetic/
“Accelerating Neural Networks with Binary Arithmetic” (blog post)
These 32 bit floating point multiplications, however, are very expensive.
In BNNs, floating point multiplications are supplanted with bitwise XNORs and left and right bit shifts.
This is extremely attractive from a hardware perspective:
binary operations can be implemented computationally efficiently at a low power cost.
Nervana website (blog post) https://www.nervanasys.com/accelerating-neural-networks-binary-arithmetic/
32-bit floating point multiplications are expensive.
In a BNN they are replaced by XNOR and bit shifts.
Unlike a GPU's 32-bit SIMD units, the Nervana Engine is an ASIC, so it can implement BNN-style binary arithmetic directly in hardware.
The XNOR trick: encode -1 as 0 and +1 as 1; then the product of two binarized values is just the XNOR of their bit encodings.
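Under that encoding (-1 → 0, +1 → 1), multiplying two binarized values is exactly a 1-bit XNOR, and a dot product becomes XNOR plus a popcount; a small sketch (illustrative function names, not from the Nervana post):

```python
def xnor(a: int, b: int) -> int:
    """1-bit XNOR: 1 when the bits are equal, 0 otherwise."""
    return 1 ^ (a ^ b)

def to_bit(x: int) -> int:
    """Encode a binarized value: -1 -> 0, +1 -> 1."""
    return 0 if x == -1 else 1

def binary_dot(xs, ws):
    """Dot product of two {-1,+1} vectors via XNOR + popcount.

    xnor(bit(x), bit(w)) == 1 exactly when x*w == +1, so
    dot = (#matches) - (#mismatches) = 2*popcount - n.
    """
    n = len(xs)
    popcount = sum(xnor(to_bit(x), to_bit(w)) for x, w in zip(xs, ws))
    return 2 * popcount - n

xs = [+1, -1, +1, +1]
ws = [+1, +1, -1, +1]
assert binary_dot(xs, ws) == sum(x * w for x, w in zip(xs, ws))
```

In hardware, 64 of these 1-bit products fit in a single word-wide XNOR plus one popcount instruction, which is where the claimed efficiency over 32-bit float multipliers comes from.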
Tensor
GPU nVIDIA
Intel Xeon Phi http://www.4gamer.net/games/049/G004963/20161007061/
Intel Nervana Engine
https://software.intel.com/en-us/blogs/2013/avx-512-instructions
Approaches to Deep Learning acceleration:
• nVIDIA GPU: today's default for Deep Learning, moving toward FP16
• Nervana Engine: binarized arithmetic (XNOR / bit shift) plus HBM2
• Intel: AVX-512 SIMD
• Google TPU (Tensor Processing Unit): 8-bit arithmetic, and not a CPU!
• Plus chip-to-chip links at 100 Gbit/s rather than routing everything through the CPU
The common thread: ever-narrower data types on wide SIMD-style hardware.
'You've got to find what you love,' Jobs says
Steve Jobs, 2005, Stanford University
https://www.youtube.com/watch?v=UF8uR6Z6KLc
“Follow your heart”