Data Will Change Business, But Will It Really Change ICT?

0 Copyright 2014 FUJITSU Human Centric Innovation Fujitsu Forum 2014 ICM Munich 19th – 20th November

Upload: fujitsu-global

Post on 30-Jun-2015

DESCRIPTION

Speakers: Mr. Naoki Akaboshi (Fujitsu Laboratories)

TRANSCRIPT

Page 1: Data Will Change Business, But Will It Really Change ICT?

Copyright 2014 FUJITSU

Human Centric Innovation

Fujitsu Forum 2014

ICM Munich, 19th – 20th November

Page 2: Data Will Change Business, But Will It Really Change ICT?

Data Will Change Business, But Will It Really Change ICT?

Naoki Akaboshi, FUJITSU LABORATORIES LTD.

Page 3: Data Will Change Business, But Will It Really Change ICT?

2 Copyright 2014 FUJITSU LABORATORIES LTD.

Agenda

Data-Driven Innovation

Data-Centric ICT Platform

Research Topics

Scalable Storage for High-Throughput Data

Shingled Erasure Coding for Fast Recovery from Multiple Disk Failures

Workload-Aware Flash Storage

Summary

Page 4: Data Will Change Business, But Will It Really Change ICT?

Data-Driven Innovation

Page 5: Data Will Change Business, But Will It Really Change ICT?

Background: Data Explosion

Rapid Growth of Data in Digital Universe

Doubling in size every two years

Unstructured data explosion

Source: EMC Digital Universe with Research & Analysis by IDC, 2014

http://www.emc.com/leadership/digital-universe/2014iview/executive-summary.htm

2013: 4.4 ZB

2020: 44 ZB
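The two figures can be checked against the "doubling in size every two years" claim with a little arithmetic. The following is a minimal sketch: only the 4.4 ZB and 44 ZB values come from the slide, and the variable names are illustrative.

```python
import math

# Figures quoted on the slide (EMC/IDC Digital Universe, 2014).
size_2013_zb = 4.4
size_2020_zb = 44.0
years = 2020 - 2013

# Doubling time implied by growing from 4.4 ZB to 44 ZB in 7 years.
growth_factor = size_2020_zb / size_2013_zb
implied_doubling_years = years * math.log(2) / math.log(growth_factor)

# Size in 2020 if the data doubled exactly every two years.
projected_2020_zb = size_2013_zb * 2 ** (years / 2)

print(f"implied doubling time: {implied_doubling_years:.2f} years")
print(f"2020 size at a strict 2-year doubling: {projected_2020_zb:.1f} ZB")
```

A tenfold increase in seven years corresponds to a doubling time of roughly 2.1 years, so the two statements on the slide are consistent with each other.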

Page 6: Data Will Change Business, But Will It Really Change ICT?

Examples of Data in Digital Universe

Amazon sold 426 items per second during the 2013 Christmas season

340 million tweets are posted on Twitter per day

Google processes 38,000 search queries every second

100 hours of video are uploaded to YouTube every minute

Internet users send 204 million emails per minute

4.5 billion Facebook likes generated daily

Sources: Twitter, Google Zeitgeist 2012, YouTube, Facebook

Page 7: Data Will Change Business, But Will It Really Change ICT?

Data-Driven Innovation

Data-driven innovation is changing business

Diagram axes: Capacity, Latency

Diagram labels: Mission Critical, DWH, High-Performance DB, Big Data Platform, Cloud Storage

Diagram labels: Real-Time Recommendation, Big Data Analysis, Online Video / Photo Sharing, Cloud/Cold Storage, Real-Time Bidding, In-Memory

Velocity, Variety, Volume

Page 8: Data Will Change Business, But Will It Really Change ICT?

Examples of Innovation by Big Data

“Akisai” Cloud Service for Agriculture

Big data is used for cultivation scheduling

Fujitsu succeeded in harvesting low-potassium lettuce, which is known to be difficult to water

Automated Identification Technology

Airbus selected Fujitsu RFID technology to deliver dependable management and traceability of aircraft parts

Page 9: Data Will Change Business, But Will It Really Change ICT?

Changes in Data-Intensive Computing

Structured Data

Data Warehouse

Data Mining

Question: How are AI and neural networks different from those of the 1980s?

Answer:

Increased computing power and improved algorithms

Increased amount of data for learning

Unstructured Data

Deep Learning

Neural Network

Page 10: Data Will Change Business, But Will It Really Change ICT?

Computer Chess to Shogi (Japanese chess)

Chess: IBM Deep Blue

Deep Blue defeated Kasparov in 1997

Evaluated 200 million positions per second

Brute force searching was used

Complexity of Game

Game          Game-tree complexity
Tic-tac-toe   10^5
Checkers      10^31
Reversi       10^58
Chess         10^123
Shogi         10^226
Go            10^360

Image: IBM Deep Blue. Image Source: http://www-03.ibm.com/ibm/history/exhibits/vintage/vintage_4506VV1001.html

Image: Shogi (Japanese Chess). Image Source: http://en.wikipedia.org/wiki/Shogi
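To get a feel for these exponents, the sketch below estimates how long an exhaustive search would take at Deep Blue's quoted rate. It is a rough back-of-envelope calculation only: it assumes that game-tree complexity can be read as the number of positions to evaluate, and the function name is illustrative. Logarithms are used because 10^360 does not fit in a floating-point number.

```python
import math

POSITIONS_PER_SECOND = 200e6        # Deep Blue's rate, as quoted on the slide
SECONDS_PER_YEAR = 365 * 24 * 3600

# Game-tree complexity exponents (base 10) from the table on the slide.
GAME_TREE_EXPONENT = {
    "Tic-tac-toe": 5,
    "Checkers": 31,
    "Reversi": 58,
    "Chess": 123,
    "Shogi": 226,
    "Go": 360,
}

def log10_years(exponent):
    """Return log10 of the years needed to evaluate 10**exponent positions."""
    return exponent - math.log10(POSITIONS_PER_SECOND * SECONDS_PER_YEAR)

for game, exponent in GAME_TREE_EXPONENT.items():
    print(f"{game:12s} about 10^{log10_years(exponent):.1f} years")
```

Under these assumptions the shogi tree is 103 orders of magnitude larger than the chess tree, which illustrates why faster hardware alone does not close the gap.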

Page 11: Data Will Change Business, But Will It Really Change ICT?

Computer Chess to Shogi (continued)

Shogi has the distinctive feature of reusing captured pieces

Deep Blue-style approach is not enough to defeat human champion

The five strongest computer programs in the 2014 World Computer Shogi Championship used machine learning

Example : Nine Day Fever (4th in 2014 World Computer Shogi Championship)

30 billion games generated by computer versus computer

CPU processing time for machine learning is 3,000 times longer than total calculation of moves in one game

Learning Time: 1 month calculation on 12-node PC Cluster

Learning time is 3,000 times longer than total calculation of moves

Page 12: Data Will Change Business, But Will It Really Change ICT?

Applications of Data Processing

Importance of retrieving useful information from large and/or unstructured data

Location-aware recommendation

Real-time advertisement

Personalized video search

Machine learning and other knowledge computing

More and more computing power is needed for data processing

Page 13: Data Will Change Business, But Will It Really Change ICT?

Data-Centric ICT Platform

Page 14: Data Will Change Business, But Will It Really Change ICT?

User Requirements for Data Processing

Velocity, Variety, Volume

• Rapid Decision Making

• M2M

• IoT

• Voice and Video Data

• Social Media Data

• Linked Open Data

ICT system capable of processing various kinds of data

Page 15: Data Will Change Business, But Will It Really Change ICT?

Problems of Traditional ICT Platform

Goal: Flexible Data-Centric ICT Platform

• One system is designed for a specific application

• Data characteristics differ between applications, so it is difficult to apply one solution to another

  e.g. Video sharing: write once/read many

Diagram: separate stacks, each built from Middleware, Server, SAN, and Storage

Page 16: Data Will Change Business, But Will It Really Change ICT?

Data-Centric ICT Platform

ICT Platform that handles various types of data

Various types of data: Web, SNS, Lifelog; Sensor data; Open/Public data; Enterprise data

Above the platform: Application, Knowledge Solutions, Machine Learning, Big Data

Data-Centric ICT Platform:

• Virtual Data Integration

• Flexible Data Store Software (RDB, DWH, NoSQL, File)

• Hardware Acceleration

Page 17: Data Will Change Business, But Will It Really Change ICT?

Technologies for Data-Centric ICT

Flexible Data Store Software

For various kinds of data characteristics

• Low-Latency

• High-Throughput

• Reliability

• Faster Recovery time

Acceleration by Utilizing New Hardware

Flash-based storage

Many-core processor

Virtual Data Integration

Integration of different data stores

Page 18: Data Will Change Business, But Will It Really Change ICT?

Research Topics

Scalable Storage for High-Throughput Data

Shingled Erasure Coding for Fast Recovery from Multiple Disk Failures

Workload-Aware Flash Storage

Booth S1

Booth C22

Page 19: Data Will Change Business, But Will It Really Change ICT?

Research Topic 1: Scalable Storage for High-Throughput Data

Page 20: Data Will Change Business, But Will It Really Change ICT?

High-Throughput Data

Rapid traffic increase in SNS and Sensor Network

Challenges in capturing and storing speedy and voluminous streams of data

High-Throughput Network

If the data can be accumulated, useful business information can be extracted by analyzing it

Store

Analyze

Packet Data

Page 21: Data Will Change Business, But Will It Really Change ICT?

Technological Challenges

High-Throughput Data Capturing

- 40 Gbps

- Short packets

Scalable Storage

- Data volume depends on the traffic

  • High traffic: 10 Gbps × 1 day = 108 TB

- Storage system should scale out easily

Real-Time Search

- Search at high speed without affecting storage performance
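The 108 TB figure follows directly from the link rate. The sketch below reproduces it and applies the same calculation to the 40 Gbps capture rate; it assumes a fully utilized link and decimal terabytes, and the function name is illustrative.

```python
SECONDS_PER_DAY = 24 * 60 * 60
TB = 1e12  # decimal terabyte, in bytes

def captured_bytes(link_gbps, seconds):
    """Bytes accumulated when a link of link_gbps gigabits/s is fully used."""
    bits = link_gbps * 1e9 * seconds
    return bits / 8

for gbps in (10, 40):
    volume_tb = captured_bytes(gbps, SECONDS_PER_DAY) / TB
    print(f"{gbps} Gbps x 1 day = {volume_tb:.0f} TB")
```

At 40 Gbps the same calculation gives 432 TB per day, which is why the storage side has to scale out.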

Page 22: Data Will Change Business, But Will It Really Change ICT?

Technology Enabling High-Speed Search while Accumulating Data at 40-Gbps

Technology Features

Captures and archives all packets on a 40 Gbps network, which is four times faster than conventional products

Hardware-independent software technology

Cost-effective and easy-to-deploy

Customer Benefits

Helps to ensure data center security

Network forensics

• Intrusion detection

• Legal evidence

Stream-analysis unit

multiple commodity servers

Scalable accumulation unit

TAP

a single commodity server

Booth S1

Page 23: Data Will Change Business, But Will It Really Change ICT?

Press Release and Award

Press Release

April 14, 2014: Fujitsu Develops Technology Enabling High-Speed Search while Accumulating Data at 40-Gbps

http://www.fujitsu.com/global/about/resources/news/press-releases/2014/0414-02.html

Award

Interop Tokyo 2014

Best of Show Award: Grand Prix

Page 24: Data Will Change Business, But Will It Really Change ICT?

Research Topic 2: Shingled Erasure Coding for Fast Recovery from Multiple Disk Failures

Page 25: Data Will Change Business, But Will It Really Change ICT?

Importance of Protecting Contents Data

Digital media data, which plays a central role in web services, has been experiencing an explosive rate of growth

The crucial importance of data to these services leads to a number of methods to guard against digital media data loss

Copying data is one approach, but the cost of storage cannot be ignored

Page 26: Data Will Change Business, But Will It Really Change ICT?

Methods of Protecting Content Data

Triple-Redundant Copy

RAID Technology

Long-established method of protecting mission-critical data

Rather than storing copies of each piece of data, RAID introduces "parity" (redundant pieces of data that summarize other data as part of a protection system), which allows the same degree of protection as triple redundancy with significantly less actual redundancy

Diagram: Triple-redundant copying (each of six Data blocks is protected by two Copy blocks)

Diagram: RAID6 (six Data blocks are protected by two Parity blocks)
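The idea of parity as a summary of other data can be illustrated with a single XOR parity block. This is a minimal sketch, not the RAID6 algorithm itself: RAID6 keeps a second parity computed differently, which is omitted here. The block contents and names are made up for illustration; the six-data-block layout mirrors the diagram.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Six data blocks, as in the diagram (contents are arbitrary examples).
data = [b"AAAA", b"BBBB", b"CCCC", b"DDDD", b"EEEE", b"FFFF"]
parity = xor_blocks(data)

# Simulate losing one block and rebuilding it from the survivors plus parity.
lost_index = 2
survivors = [blk for i, blk in enumerate(data) if i != lost_index]
recovered = xor_blocks(survivors + [parity])

# Raw storage needed per unit of user data.
triple_copy_ratio = 3 * len(data) / len(data)   # original plus two copies
raid6_ratio = (len(data) + 2) / len(data)       # six data plus two parity

print(recovered == data[lost_index])
print(f"triple copy: {triple_copy_ratio:.2f}x, RAID6 (6+2): {raid6_ratio:.2f}x")
```

With six data blocks, two parity blocks cost about 1.33 times the user data, compared with 3 times for triple-redundant copying.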

Page 27: Data Will Change Business, But Will It Really Change ICT?

Our Shingled Erasure Code

Multilayered Parity-Protection

An erasure code with only local parity groups

When a disk fails, only the minimum combination of parity and data needed for recovery is used, which shortens recovery time

Diagram: Shingled Erasure Code (six Data blocks and four Parity blocks, each Parity protecting a local group of Data blocks)

Image Source: wikipedia
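The benefit of local parity groups can be sketched as follows. This is a simplified illustration, not Fujitsu's implementation: it uses plain XOR parity, handles only a single lost block, and the group layout (four overlapping windows of three data blocks, wrapping around) is an assumption chosen to match the six-data, four-parity diagram. All function names are hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length byte blocks together, column by column."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

def shingled_groups(k, m, width):
    """Return m overlapping windows of `width` consecutive data indices.

    Window i starts at i * k // m and wraps around at k (assumed layout).
    """
    return [[(i * k // m + j) % k for j in range(width)] for i in range(m)]

def recover_one(lost, data, parities, groups):
    """Rebuild one lost data block from the first local group that covers it.

    All groups have the same width here, so the first match is also a
    smallest one. Returns (recovered_block, number_of_blocks_read).
    """
    for group, parity in zip(groups, parities):
        if lost in group:
            peers = [data[i] for i in group if i != lost]
            return xor_blocks(peers + [parity]), len(peers) + 1
    raise ValueError("no parity group covers the lost block")

data = [bytes([65 + i]) * 4 for i in range(6)]   # b"AAAA" .. b"FFFF"
groups = shingled_groups(k=6, m=4, width=3)
parities = [xor_blocks([data[i] for i in g]) for g in groups]

block, reads = recover_one(3, data, parities, groups)
print(groups)
print(block == data[3], reads)
```

In this toy layout a lost block is rebuilt from three blocks (two data plus one parity), whereas a parity computed over all six data blocks would require reading six.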

Page 28: Data Will Change Business, But Will It Really Change ICT?

Features of Shingled Erasure Code

Multilayered parity-protection reduces data recovery time

The structure for the parity protection range can flexibly change with usage contexts

More than 20% faster data recovery compared to conventional RAID (4 TB, 48 HDDs)

Recovery procedure (diagram):

1. Replace failed disk

2. Collect data/parity

3. Recovery processing

4. Write

Diagram labels: Data, Lost Data, Parity, Protection; some Parity/Protection groups are marked "Unused for recovery"

Page 29: Data Will Change Business, But Will It Really Change ICT?

Press Release and USENIX HotDep 2014

Press Release

October 6, 2014: Fujitsu Laboratories Develops Fast Recovery Process for Multiple Disk Failures

http://www.fujitsu.com/global/about/resources/news/press-releases/2014/1006-01.html

USENIX HotDep

Erasure Code with Shingled Local Parity Groups for Efficient Recovery from Multiple Disk Failures. Takeshi Miyamae, Takanori Nakao, and Kensuke Shiozawa, Fujitsu Laboratories Ltd.

Ongoing discussion with the open source scalable storage Ceph community

http://ceph.com/

Page 30: Data Will Change Business, But Will It Really Change ICT?

Research Topic 3: Workload-Aware Flash Storage

Page 31: Data Will Change Business, But Will It Really Change ICT?

Workload-Aware Flash Storage

Conventional Flash Storage Problems

The legacy SAS/SCSI interface does not take advantage of flash storage features such as parallel reading

Our Workload-Aware Flash Storage

Moves flash controller into software layer

Total optimization including OS, driver, and software controller

New API takes advantage of flash features according to the workload

Customer Benefits

Faster storage performance for data-intensive computing

Diagram: conventional versus proposed storage stack

Conventional: Application, File System, Block Device Driver (software); Controller, NAND Flash Devices (hardware)

Proposed: Application, Optimized Software Layer including the Controller (software); NAND Flash Devices (hardware)
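Why parallel reading matters can be seen in a toy timing model. This sketch is not the proposed API: the latency value, page count, and device count are hypothetical, and the model ignores bus contention and software overhead. It only shows how read time falls when several NAND flash devices are read at once.

```python
import math

PAGE_LATENCY_US = 50  # hypothetical per-page read latency in microseconds

def read_time_us(num_pages, page_latency_us, parallel_devices):
    """Time to read num_pages when up to parallel_devices pages are read at once."""
    rounds = math.ceil(num_pages / parallel_devices)
    return rounds * page_latency_us

serial = read_time_us(80, PAGE_LATENCY_US, parallel_devices=1)
parallel = read_time_us(80, PAGE_LATENCY_US, parallel_devices=8)
speedup = serial / parallel

print(f"serial: {serial} us, parallel: {parallel} us, speed-up: {speedup:.1f}x")
```

In this idealized model the speed-up equals the number of devices read in parallel; real gains depend on the workload and on how the software layer schedules the reads.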

Page 32: Data Will Change Business, But Will It Really Change ICT?

Workload-Aware Flash Storage Prototype

Optimized for SAP in-memory database systems

Please drop by Booth C22

One example of a HANA micro-benchmark shows a 5 times speed-up by parallel reading

PCIe Flash Storage

Page 33: Data Will Change Business, But Will It Really Change ICT?

Summary

Page 34: Data Will Change Business, But Will It Really Change ICT?

Summary

Data-Driven Innovation

Data-Centric ICT Platform

Three Research Topics

Scalable Storage for High-Throughput Data

Shingled Erasure Coding for Fast Recovery from Multiple Disk Failures

Workload-Aware Flash Storage

Changes are needed towards a data-centric ICT platform

Page 35: Data Will Change Business, But Will It Really Change ICT?
