© 2018 IBM Corporation
Frank Kraemer, IBM Systems Architect
mailto:[email protected]
GTC 2018 10/2018
IBM Storage Reference Architecture for AI applied to Autonomous Driving (AD)
Autonomous Driving = See + Think + Act
https://autoware.ai/
The Automotive Industry has to solve this highly complex problem.
Automotive Sensor Setup for AD
http://currencyobserver.com/2017/12/global-automotive-sensors-market-2017-2022/
Each data source: ~ 2 Gbit/s
Sensor sets: ~ 30 Gbit/s
Data collection volume: ~ 12-15 TB/h
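The step from ~30 Gbit/s per sensor set to ~12-15 TB/h is a unit conversion. A minimal sketch, assuming decimal units (1 Gbit = 10^9 bits, 1 TB = 10^12 bytes) and continuous recording at the full rate; the function name is illustrative and not part of the original slides:

```python
# Convert the aggregate sensor rate quoted on the slide into TB per hour.
# Assumption: decimal units and uninterrupted recording at the full rate.
BITS_PER_GBIT = 1e9
BYTES_PER_TB = 1e12
SECONDS_PER_HOUR = 3600


def gbit_per_s_to_tb_per_h(rate_gbit_s: float) -> float:
    """Return the data volume in TB recorded in one hour at rate_gbit_s."""
    bytes_per_second = rate_gbit_s * BITS_PER_GBIT / 8
    return bytes_per_second * SECONDS_PER_HOUR / BYTES_PER_TB


sensor_set_tb_h = gbit_per_s_to_tb_per_h(30)
print(f"{sensor_set_tb_h:.1f} TB/h")  # prints "13.5 TB/h"
```

The result falls inside the 12-15 TB/h range given on the slide.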
Automotive Industry generates large amounts of data
Sources: Images from
https://www.youtube.com/watch?v=4jW0fJ80VG8
https://www.youtube.com/watch?v=dhEgD6ZFlQE
https://www.youtube.com/watch?t=21&v=39QMYkx89j0
▪ Storage of data (sensor / video) is very costly.
▪ Handling of these data is difficult, e.g. due to the high bandwidth required.
▪ For testing purposes, sensor / video data are much more complex than discrete bus signals, electronic values, etc.
Sensor / video data must be captured, stored, modified and executed synchronously with other testing data such as CAN, FlexRay, Radar, LiDAR, HiSonic, etc. The most common formats are ADTF v2/3 (Digitalwerk), RTMaps (Intempora), MDF4 and ROS/rosbag.
Data Management for ADAS/AD development and test is challenging
Test Drives
50-70 TB / day / car
R&D Labs: tagging
R&D Labs: developing & testing & (re-)simulation & AI training
▪ >5 PB of data for each car project
▪ 300-500 PB data in total
> 200h / 1h driving
o Europe
o USA
o China
o Japan
o Asia
o Africa
Training Data as a Service (TDaaS)
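The volumes on this slide (50-70 TB per day per car, more than 5 PB per car project) can be related with simple arithmetic. A minimal sketch, assuming decimal units (1 PB = 1000 TB) and that a project's data comes only from test drives; the function name is hypothetical:

```python
# Relate the per-car daily volume to the per-project volume from the slide.
# Assumption: 1 PB = 1000 TB, and all project data originates from test drives.
TB_PER_PB = 1000


def car_days_to_fill(project_pb: float, tb_per_car_day: float) -> float:
    """Number of car-days of test driving needed to produce project_pb."""
    return project_pb * TB_PER_PB / tb_per_car_day


fewest = car_days_to_fill(5, 70)  # cars recording 70 TB per day
most = car_days_to_fill(5, 50)    # cars recording 50 TB per day
print(f"{fewest:.0f} to {most:.0f} car-days for 5 PB")  # prints "71 to 100 car-days for 5 PB"
```

Under these assumptions, a single 5 PB project corresponds to roughly ten cars driving for seven to ten days.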
The IBM AD Solution Approach
1. How to implement & operate an efficient storage, workflow and management system?
2. How to distribute data globally within an enterprise and partners?
3. How to preserve digital data for decades with optimized costs?
4. How to analyze sensor and video data with fast analytics and modern Big Data tools?
5. How to run Machine Learning (ML) and AI training with Nvidia GPU technology at scale?
6. How to do efficient IT workload and resource scheduling?
"The Data Foundation"
IBM Analytics HDFS: Hortonworks HDP, DSX, Spark, …
IBM AREMA
IBM High-Speed WAN File Transfer: IBM Aspera / Mass Data Migration / Cloud
IBM Spectrum Computing
IBM Object Storage (COS)
IBM 'Cold' Archiving: IBM Spectrum Protect / Cold / Low Cost / Tape
IBM Enterprise-Class AI: Power9 AC922, PowerAI, AI Vision
IBM Spectrum Discover (MetaOcean)
• Tiering from flash, to disk, to tape, to cloud.
• Cloud appears as external storage pool.
• Auto tiering & migration.
• High performance read/write operations.
• Public cloud-ready.
• Support of multi-cloud environments.
Diagram: Transparent Cloud Tiering
• Cloud targets: ICP, AWS S3, Azure, Private Cloud, IBM Cloud
• Data in the cloud tier: replicated, compressed, encrypted, integrity validated
• Use cases: backup, DR, tiering, archive, data sharing
The IBM storage architecture based on Spectrum Scale, COS and Tape
IBM Spectrum Scale (HOT)
• File based storage with Object & HDFS support
• High-end I/O performance
• Information Lifecycle Management (ILM)
• Sub-microsecond access time
IBM Cloud Object Storage (S3) (WARM)
• Site fault tolerant
• Geo-dispersed and worldwide scale
• Easy to deploy
• Millisecond access time
IBM Spectrum Archive & Tape (COLD)
• Lowest TCO
• Tape ILM target, especially for frozen archive
• Long-term retention and access time in minutes
• Access as files via LTFS
• Reduced floor space requirements and energy consumption
• Up to 260 PB native capacity in a single tape library
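The HOT / WARM / COLD split above can be read as an age-based placement rule. The following is a minimal sketch of that idea only: the thresholds and the function name are hypothetical, and the code is plain Python, not the policy language or API of any IBM product.

```python
# Illustrative age-based tier selection for the three tiers named on the slide.
# Assumption: the day thresholds below are made up for this example.
from enum import Enum


class Tier(Enum):
    HOT = "IBM Spectrum Scale"
    WARM = "IBM Cloud Object Storage"
    COLD = "IBM Spectrum Archive & Tape"


WARM_AFTER_DAYS = 30    # hypothetical threshold
COLD_AFTER_DAYS = 365   # hypothetical threshold


def choose_tier(days_since_last_access: int) -> Tier:
    """Pick a storage tier from the time since a file was last accessed."""
    if days_since_last_access < 0:
        raise ValueError("days_since_last_access must be non-negative")
    if days_since_last_access >= COLD_AFTER_DAYS:
        return Tier.COLD
    if days_since_last_access >= WARM_AFTER_DAYS:
        return Tier.WARM
    return Tier.HOT


for age in (1, 90, 2000):
    print(age, choose_tier(age).value)
```

A real deployment would express such rules in the storage system's own lifecycle management facility and would typically also consider file size, project, and pool occupancy.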
Building block "HOT": High Performance I/O File Storage
Diagram: global namespace, powered by IBM Spectrum Scale
• Clients: client workstations; users, containers and applications; HPC & HTC compute farm; traditional applications; new-gen applications; DGX / AC922
• File access: POSIX, NFS, SMB
• Block access: iSCSI
• Object access: Swift, S3
• Analytics: Transparent HDFS
• OpenStack: Cinder, Glance, Manila
• Storage: Flash, NVMe, Disk, Tape, Shared Nothing Cluster (FPO), JBOD/JBOF, ESS, Spectrum Scale RAID
• Automated data placement and data migration
• Data services: Encryption, File Audit Logging, Immutability, Compression
• Cloud: Transparent Cloud Tier (TCT), Cloud Data Sharing, S3 Data Cloud
• Worldwide File Data Distribution (AFM): Site A, Site B, Site C
• Disaster recovery: AFM-DR to DR Site
• Management: Management API, Advanced GUI, RESTful API
IBM Analytics & Hortonworks (HDP) / Hadoop
https://developer.ibm.com/dwblog/2017/ibm-hortonworks-expand-partnership-help-businesses-accelerate-data-driven-decision-making/
Automotive Customer Use Case:
➢ Major automotive OEM was experiencing significant difficulties and costs associated with storing and processing huge volumes of Video, Radar and Lidar files within a legacy Network Attached Storage (NAS) system.
➢ Data necessary for development of Autonomous Vehicle machine learning algorithms.
➢ Today, storing multiple Petabytes of video and binary data with HDP Data Lake, aiming to grow to the tens of Petabytes.
➢ Dramatically reduced data management costs and improved user productivity.
➢ Provided foundation for Autonomous Driving research.
➢ IBM Reference customer for Spectrum Scale and HDP.
2nd Generation IBM Elastic Storage Server (ESS) Family
Capacity models (ESS 5U84 storage enclosures):
• Model GL1S: 1 enclosure, 9U, 82 NL-SAS, 2 SSD, 6 GB/s
• Model GL2S: 2 enclosures, 12U, 166 NL-SAS, 2 SSD, 12 GB/s
• Model GL4S: 4 enclosures, 20U, 334 NL-SAS, 2 SSD, 24 GB/s
• Model GL6S: 6 enclosures, 28U, 502 NL-SAS, 2 SSD, 36 GB/s
Speed models (SSD):
• Model GS1S: 24 SSD, 14 GB/s
• Model GS2S: 48 SSD, 26 GB/s
• Model GS4S: 96 SSD, 40 GB/s
Mixed SSD / HDD models:
• Model GH14S: 1 2U24 SSD enclosure, 4 5U84 HDD enclosures, 334 NL-SAS, 24 SSD, 38 GB/s
• Model GH24S: 2 2U24 SSD enclosures, 4 5U84 HDD enclosures, 334 NL-SAS, 48 SSD, 40 GB/s
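To put the throughput figures in context, the sketch below estimates how long it would take to move one car's daily 70 TB (the upper figure from the test-drive slide) at the rates listed for the GL models. It assumes the 6 / 12 / 24 / 36 GB/s figures belong to GL1S / GL2S / GL4S / GL6S, treats them as sustained rates, and ignores network and protocol overhead; the function name is illustrative.

```python
# Estimate ingest time for 70 TB at the GL model throughput figures.
# Assumption: the mapping of GB/s figures to models and sustained peak rates.
GL_THROUGHPUT_GB_S = {"GL1S": 6, "GL2S": 12, "GL4S": 24, "GL6S": 36}


def ingest_minutes(volume_tb: float, throughput_gb_s: float) -> float:
    """Minutes needed to move volume_tb at throughput_gb_s (decimal units)."""
    seconds = volume_tb * 1000 / throughput_gb_s
    return seconds / 60


for model, rate in GL_THROUGHPUT_GB_S.items():
    print(f"{model}: {ingest_minutes(70, rate):.0f} min for 70 TB")
```

Real ingest times depend on the network, file sizes, and concurrent workloads, so these numbers are best read as lower bounds.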
Presentation at ATZ Live 04/2018 in Wiesbaden, Germany
"Artificial Intelligence is key to understand Sensor Data"
"Relevant data is needed to finalize the Software Development."
Dr. Michael Hafner, Head of Automated Driving and Active Safety at Mercedes-Benz, talks about sensors, safety, and the road map that developers are following.
https://www.daimler.com/innovation/autonomous-driving/expert-interview.html
Workload and data flow for AI is complex
Diagram: AI workflow
• Stages: Data Source, Data Preparation (Pre-Processing), Model Training, Inference
• Data sources: traditional business data, sensor data, data from collaboration partners, data from mobile app and social media, legacy data
• Timescales: years of data; hours and weeks of preparation; weeks and months of training; sub-seconds to results
• Datasets: Training Dataset, Testing Dataset, New Data
• Model training: AI Deep Learning Frameworks (TensorFlow, Caffe, …); Distributed & Elastic Deep Learning (Fabric); Parallel Hyper-Parameter Search & Optimization; Network Models; Hyper-Parameters; Monitor & Advise; Instrumentation; Iterate
• Inference: Trained Model; Deploy in Production using Trained Model
• Heavy I/O
https://public.dhe.ibm.com/common/ssi/ecm/75/en/75016775usen/systems-hardware-ibm-spectrum-computing-analyst-paper-or-report-75016775usen-20180618.pdf
IBM Reference Architecture for AI Infrastructure
Reference IBM Spectrum Scale ESS CORAL
▪ 2.5 TB/sec single stream IOR as requested from ORNL
▪ 1 TB/sec 1MB sequential read/write as stated in CORAL RFP
▪ Single Node 16 GB/sec sequential read/write as requested from ORNL
▪ 50K creates/sec per shared directory as stated in CORAL RFP
▪ 2.6 Million 32K file creates/sec as requested from ORNL
▪ Summit's 250-petabyte storage system is delivered by a cluster of 77 IBM ESS Storage Systems that will deliver 2.5 TB/s of data throughput.
▪ Summit will have a capacity of 30B files and 30B directories and will be able to create files at a rate of over 2.6 million I/O file operations per second.
https://www.ibm.com/blogs/systems/fastest-storage-fastest-system-summit/
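The Summit figures above can be broken down per ESS system. A minimal sketch, assuming decimal units and an even spread of capacity and throughput across all 77 systems (the slide does not state how they are actually distributed):

```python
# Per-system averages derived from the Summit figures on the slide.
# Assumption: capacity and throughput are spread evenly over all ESS systems.
ESS_SYSTEMS = 77
TOTAL_CAPACITY_PB = 250
TOTAL_THROUGHPUT_TB_S = 2.5

capacity_per_ess_pb = TOTAL_CAPACITY_PB / ESS_SYSTEMS
throughput_per_ess_gb_s = TOTAL_THROUGHPUT_TB_S * 1000 / ESS_SYSTEMS

print(f"{capacity_per_ess_pb:.2f} PB per ESS")        # prints "3.25 PB per ESS"
print(f"{throughput_per_ess_gb_s:.1f} GB/s per ESS")  # prints "32.5 GB/s per ESS"
```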
Global Data Distribution via IBM Aspera
Automotive company synchronizes petabytes of vehicle field test data & video from on-site locations to worldwide R&D teams at high-speed with IBM Aspera FASP.
IBM Aspera for Global Data Distribution
http://downloads.asperasoft.com/
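Why WAN throughput matters for field test data can be seen from a rough transfer-time estimate. The sketch below does not model FASP or any Aspera interface; the 10 Gbit/s link speed and the efficiency factor are hypothetical inputs chosen for illustration, and the 70 TB volume is the upper daily figure per car from the test-drive slide.

```python
# Rough WAN transfer-time estimate for one car's daily data volume.
# Assumption: link speed and efficiency are illustrative, not measured values.
def transfer_hours(volume_tb: float, link_gbit_s: float, efficiency: float = 1.0) -> float:
    """Hours to move volume_tb over a link that achieves `efficiency` of its nominal rate."""
    if not 0 < efficiency <= 1:
        raise ValueError("efficiency must be in (0, 1]")
    bits = volume_tb * 1e12 * 8
    seconds = bits / (link_gbit_s * 1e9 * efficiency)
    return seconds / 3600


print(f"{transfer_hours(70, 10):.1f} h at full link utilization")
print(f"{transfer_hours(70, 10, efficiency=0.5):.1f} h at 50% utilization")
```

Even on a fully utilized 10 Gbit/s link, one day of data from a single car takes more than half a day to transfer, so achieved link utilization directly limits how many cars a site can offload.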
IBM can help
1. Significantly increase development efficiency by reducing manual effort for video tagging, eliminating time wasted on data search and manual data copy/move processes, and automating workflows.
2. Significantly increase test throughput, allowing you to run more test cases in less time, thereby shortening time-to-market and improving the quality of your camera and ADAS products.
3. Increase the overall flexibility of your organization through the ability to move workload from one place to another.
4. Reduce IT costs for local storage hardware by globally centralizing data in a private cloud and object store, from which project- and demand-specific video data are downloaded to local test labs.
5. Guarantee long-term verifiability and recoverability of test data with a comparatively cheap tape storage solution for potential warranty cases.
Question to win a prize
How much data does a single test/dev car generate in an 8-hour shift per day?
a) 1-5 TB per day
b) 50-70 TB per day
c) 1-5 PB per day