arm processor technology update and roadmap€¦ · cavium corporate overview multi-core mips, arm...

16
Arm Processor Technology Update and Roadmap

Upload: others

Post on 07-Oct-2020

5 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Arm Processor Technology Update and Roadmap€¦ · Cavium Corporate Overview Multi-Core MIPS, ARM Processors, Security, SDN Switch and ... §ARM = Choice & path to more optimized

Arm Processor Technology Update and Roadmap

Page 2: Arm Processor Technology Update and Roadmap€¦ · Cavium Corporate Overview Multi-Core MIPS, ARM Processors, Security, SDN Switch and ... §ARM = Choice & path to more optimized

©2016Cavium,Inc.– ConfidentialandProprietaryInformation

ARM Processor Technology Update and Roadmap

§ Cavium:GiriChukkapalliisaDistinguishedEngineerintheDataCenterGroup(DCG)

Introduction to ARM Architecture for HPC deployment and rationale for the design pointof ThunderX2 Core and SOC in the single thread performance vs throughput space ispresented in this talk. Focus is on the sustained performance per TCO and ease ofexploiting spectral parallelism of HPC applications. Preliminary experience of porting,running and performance analysis of HPC applications will be discussed.

Page 3: Arm Processor Technology Update and Roadmap€¦ · Cavium Corporate Overview Multi-Core MIPS, ARM Processors, Security, SDN Switch and ... §ARM = Choice & path to more optimized

ThunderX2 in HPC

Page 4: Arm Processor Technology Update and Roadmap€¦ · Cavium Corporate Overview Multi-Core MIPS, ARM Processors, Security, SDN Switch and ... §ARM = Choice & path to more optimized

©2017Cavium,Inc.– ConfidentialandProprietaryInformation4

Cavium Corporate Overview

Multi-Core MIPS, ARM Processors, Security, SDN Switch andServer/Storage Connectivity ~$10B TAM

Mobile InfrastructureEnterprise Data Center and Cloud Service Provider Cloud

Page 5: Arm Processor Technology Update and Roadmap€¦ · Cavium Corporate Overview Multi-Core MIPS, ARM Processors, Security, SDN Switch and ... §ARM = Choice & path to more optimized

©2017Cavium,Inc.– ConfidentialandProprietaryInformation

MostWidelyUsed

LicensingModel

ü Over90Bshippedin25yrsü Outshipsx86by20Xper

year

ü Anyonecanbuildü Innovate&Optimize

fortargetedapplications

ARM Servers & ARM for HPCARMforHPC

§ ARM=Choice&pathtomoreoptimizedsolutions

§ MarchtoExascale openingdoorfornewISA

• MassiveparallelismrequiresSWchanges

• ARMHPCprojectsactiveworldwide

• HPChaslargeopensourcecomponent

• ThrivingARMecosystemforHPC

Page 6: Arm Processor Technology Update and Roadmap€¦ · Cavium Corporate Overview Multi-Core MIPS, ARM Processors, Security, SDN Switch and ... §ARM = Choice & path to more optimized

©2017Cavium,Inc.– ConfidentialandProprietaryInformation

Cavium’s Proven Leadership in Silicon Design

Highestperformance,mostwidelysupported,dualsocketARMv8serversinproduction

2coreto48core,varietyofpricepoints,TDP

CommonSWarchitecture

THECPUcompanyforInfrastructure

MulticoreCPUExpert

Performance2SConfig

HighPerfCustomCores

CompletePortfolio

ARMv8architecturallicensee

Power,Perf,AreaOptimized

#1inSecurity&WirelessInfrastructure,#2inEmbedded

OPTIMIZINGARM64SERVERSFORHPC&CLOUDDATACENTER

Page 7: Arm Processor Technology Update and Roadmap€¦ · Cavium Corporate Overview Multi-Core MIPS, ARM Processors, Security, SDN Switch and ... §ARM = Choice & path to more optimized

©2017Cavium,Inc.– ConfidentialandProprietaryInformation

§ Multithreaded,fullyoutoforderhighperformanceARMv8customcores

§ Singleanddualsocketsupport§ Highestmemorybandwidth&capacity§ Serverclassvirtualization§ ServerclassRAS

§ Extensivepowermanagement§ RichIOconfigurations§ ExtensivePowermanagement§ CoreandSocketlevelperformancecompetitivewith

nextgenincumbentserverCPUs§ Comprehensivehardwareandsoftwareecosystem

ARM Leadership –ThunderX2 FIRSTS for ARM Processors

® World’s Highest Performance Xeon Class ARM Server – 2nd generation product from Cavium

7

Page 8: Arm Processor Technology Update and Roadmap€¦ · Cavium Corporate Overview Multi-Core MIPS, ARM Processors, Security, SDN Switch and ... §ARM = Choice & path to more optimized

©2017Cavium,Inc.– ConfidentialandProprietaryInformation

Differentiation

PCIeLanes

MemoryCapacity

MemoryBandwidth

TotalThreads

Cores Highercorecountdelivershigherthroughput

Higherthreadcount=largernumberofvCPUs

More memorybandwidthformemoryintensiveworkloads

Morememorycapacityforin-memoryworkloads

RichIOconnectivityoptionsDirectattachtoVMe devices

IncumbentServerCPUThunderX2

Page 9: Arm Processor Technology Update and Roadmap€¦ · Cavium Corporate Overview Multi-Core MIPS, ARM Processors, Security, SDN Switch and ... §ARM = Choice & path to more optimized

©2017Cavium,Inc.– ConfidentialandProprietaryInformation

:ThrivingHPCEcosystem

StandardsBased SysManagement&FW

IndustryLeadingOperatingSystems

Linux EnterpriseSLE12

OptimizedCompilers&DevEnvironments

Debuggers,Profilers&ClusterMgt

OpenSource&CommunityFocus

Page 10: Arm Processor Technology Update and Roadmap€¦ · Cavium Corporate Overview Multi-Core MIPS, ARM Processors, Security, SDN Switch and ... §ARM = Choice & path to more optimized

©2017Cavium,Inc.– ConfidentialandProprietaryInformation

ThunderX Momentum in HPC Continues to Grow…

Mem

ory

Band

width

Integer

Floatin

gPo

int

Vectors

1.0

2.2X2.5X

3X4X

2-4X better HPC performance

SignificantHPCEngagementsEarlyPressAnnouncements

ServerplatformsatWorld’spremierHPCLabs

Page 11: Arm Processor Technology Update and Roadmap€¦ · Cavium Corporate Overview Multi-Core MIPS, ARM Processors, Security, SDN Switch and ... §ARM = Choice & path to more optimized

Early Performance for HPC Applications

Page 12: Arm Processor Technology Update and Roadmap€¦ · Cavium Corporate Overview Multi-Core MIPS, ARM Processors, Security, SDN Switch and ... §ARM = Choice & path to more optimized

©2017Cavium,Inc.– ConfidentialandProprietaryInformation

ThunderX2 Delivers Compelling Memory Bandwidth

Details: ThunderX2CPULinuxkernel4.8.0-32-generic(4kpages)StreamcompiledwithGCCversion5.4.0-6Ubuntu16.04.4at-O3

0102030405060708090

100

%ofp

eakband

width

%ofcpuloadload

StreamScaling

copy scale add triad

HighestMemorybandwidthenablesmemoryboundapplicationstoscalebetter

Page 13: Arm Processor Technology Update and Roadmap€¦ · Cavium Corporate Overview Multi-Core MIPS, ARM Processors, Security, SDN Switch and ... §ARM = Choice & path to more optimized

©2017Cavium,Inc.– ConfidentialandProprietaryInformation13

OpenBLAS DGEMM

Details: ThunderX2CPULinuxkernel4.8.0-32-generic(4kpages)OpenBLAS compilewithGCCversion5.4.0-6ubuntu1~16.04.4at-O3

Efficientcorescapableofachievingclosetotheoreticalpeakperformance

Page 14: Arm Processor Technology Update and Roadmap€¦ · Cavium Corporate Overview Multi-Core MIPS, ARM Processors, Security, SDN Switch and ... §ARM = Choice & path to more optimized

©2017Cavium,Inc.– ConfidentialandProprietaryInformation14

OpenBLAS SGEMM

Details: ThunderX2CPULinuxkernel4.8.0-32-generic(4kpages)OpenBLAS compilewithGCCversion5.4.0-6ubuntu1~16.04.4at-O3

Efficientcorescapableofachievingclosetotheoreticalpeakperformance

Efficientcorescapableofachievingclosetotheoreticalpeakperformance

Page 15: Arm Processor Technology Update and Roadmap€¦ · Cavium Corporate Overview Multi-Core MIPS, ARM Processors, Security, SDN Switch and ... §ARM = Choice & path to more optimized

©2017Cavium,Inc.– ConfidentialandProprietaryInformation15

ThunderX2 Delivers Best-In-Class HPL Performance

Details: ThunderX2CPULinuxkernel4.8.0-32-generic(4kpages)HPLcompilewithGCCversion5.4.0-6ubuntu1~16.04.4(defaults)mpich v3.2,noopenmp,singlesockettest,processgridforeachtestcasebasedonnumberofcoresintest

Efficientcorescapableofachievingclosetotheoreticalpeakperformance

0

20

40

60

80

100

3 13 19 25 28 38 50 63 75 78 94 100

%ofp

eakpe

rform

ance

%ofsystemload

HPLscaling- %ofpeakGflops

Page 16: Arm Processor Technology Update and Roadmap€¦ · Cavium Corporate Overview Multi-Core MIPS, ARM Processors, Security, SDN Switch and ... §ARM = Choice & path to more optimized

©2017Cavium,Inc.– ConfidentialandProprietaryInformation16

ThunderX2 Performance Scaling on Real Applications

Highmemorythroughputbenefitssimplesimulations

Largehighperformancecorecountcombinedwithhighmemorythroughputbenefitscomplexsimulations