© ARM 2016
Mobile, The Driving Force for Tomorrow - Technologies for Greener, Faster, and Smarter
ARM Tech Forum 2016 – Taipei
July 1st 2016
Ivan H.P. Lin
Sr. Segment Marketing Manager, ARM
Smartphones: The World’s Primary Compute Device
3Bn ARM-powered smartphones in use today
The smartphone means so many things to so many people
Consumers Love Their Smartphone and Want More From Their Next Smartphone
Battery life
Connectivity
Multimedia
Build and Design
Camera
Trust and Security
Fluid user experience
Continuous Journey of Evolution and Revolution
Camera evolution, capturing what you see and want
Vulkan graphics API
Computer vision
True wireless device: never touches a wire
Cortex-A72 big.LITTLE in $200 smartphones
Security for your physical and digital life
Path to 5G for consistent connected experience
Multiscreen support – Desktop or TV
USB-C allowing you to connect to all your peripherals
Sustained compute for virtual reality and “AAA” gaming
The “Mobile Envelope”
= fundamental constraints of a smartphone form factor
Power Budget
Thermal Budget
Memory Budget
Mobile Envelope Factors
Mechanics: reduction in Z height, shrinking of the bezel, choice of materials, battery size
Camera: more sensors, more demanding sensors, complex multi-channel ISPs
Use cases: full 4K features, increased photography compute, CV
Memory: use cases driving B/W usage, bandwidth vs power
Modem: Nx carrier aggregation = N live radios, physical RF challenges, computational complexity increasing, data usage increasing
And more: increased sensing, PMIC, video
Evolution of The Premium Smartphone
Shrinking form factors
Increasing demand for more efficient compute
More challenging heat dissipation
New features
More processing
[Timeline: 2010, 2012, 2014, 2016. Devices shown: HTC Nexus One (11.5mm), Galaxy S3 (8.6mm), Lumia 1520 (8.4mm), Mate 8 (7.9mm)]
55% screen to body ratio
78% screen to body ratio
42% more screen to body since 2010
45% thinner since 2010
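The 42% figure is consistent with the relative growth between the two screen-to-body ratios shown on the slide. A minimal Python check, assuming 55% is the 2010 value and 78% is the 2016 value (this pairing is an assumption; the slide text does not state it explicitly):

```python
def relative_change_pct(old: float, new: float) -> float:
    """Relative change from old to new, in percent."""
    return (new - old) / old * 100.0

# Assumed pairing: 55% screen-to-body ratio in 2010, 78% in 2016.
screen_gain = relative_change_pct(55.0, 78.0)
print(f"Screen-to-body growth: {screen_gain:.1f}%")
```

Rounded to a whole number, the result matches the "42% more screen to body" callout.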
Evolving Challenges of Premium Smartphones
Innovation continues to drive performance
AR/VR, Ultra-HD visualization, Object-based audio processing
Thinner designs with longer battery life
Challenge: Thermal constraints and limited battery capacity
User experience is the ultimate metric
Standard benchmarks paint a partial picture
3D/360° Video
LTE Advanced
5G
AR/VR
>20MP Camera
4K/8K Display
Need for Sustained Performance with High Efficiency
How Did We Meet Consumers’ Expectations?
Today’s high-end phone compared to 2009:
24x higher screen resolution
5x increase in environmental sensors
100x higher compute performance
20x increase in connectivity
300x higher GPU performance
For smartphone innovation to continue, ARM needs to continue delivering “more for less”
ARM® Cortex®-A Portfolio
Performance tiers: High Performance, High Efficiency, Ultra High Efficiency
Key: big.LITTLE compatible
ARMv8-A
Cortex-A73 (64/32-bit): 2017 Premium Mobile, Consumer
Cortex-A72 (64/32-bit): 2016 Premium Mobile, Enterprise & Auto
Cortex-A57 (64/32-bit): Proven high-performance
Cortex-A53 (64/32-bit): Balanced performance and efficiency
Cortex-A35 (64/32-bit): Highest efficiency
Cortex-A32 (32-bit): Smallest and lowest power ARMv8-A
ARMv7-A
Cortex-A17: High-performance with lower power and smaller area relative to Cortex-A15
Cortex-A15: High-performance with infrastructure feature set
Cortex-A9: Well established mid-range processor used in many markets
Cortex-A8: First ARMv7-A processor
Cortex-A7: Most efficient ARMv7-A CPU, higher performance than Cortex-A5
Cortex-A5: Smallest and lowest power ARMv7-A CPU, optimized for single-core
Thermal Throttling and the Impact on Performance
[Chart: big CPU frequency and LITTLE CPU frequency (MHz) plotted with SoC temperature (°C), from Normal Operation to Thermally Saturated]
Annotated frequencies: 1550MHz, 1229MHz
Annotated temperatures: 62°C, 69°C
-20% frequency impact
-25% system performance impact
Long-run thermal saturation impacts overall system performance
Measured on Cortex-A57/A53 device, Geekbench v3 ST
Device temperature indicative only
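The frequency impact can be checked from the two frequencies annotated on the chart. A minimal Python sketch, assuming 1550MHz is the big CPU frequency in normal operation and 1229MHz is the frequency once thermally saturated (the function name is illustrative only):

```python
def throttle_impact_pct(normal_mhz: float, saturated_mhz: float) -> float:
    """Frequency change caused by throttling, in percent (negative means slower)."""
    return (saturated_mhz - normal_mhz) / normal_mhz * 100.0

# Assumed mapping of the two annotated values to the two operating regions.
big_freq_impact = throttle_impact_pct(1550.0, 1229.0)
print(f"big CPU frequency impact: {big_freq_impact:.1f}%")
```

Under that assumption the drop is a little over one fifth, in line with the "-20% frequency impact" label.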
More Performance, Less Power
[Charts: performance and power consumption relative to Cortex-A7 for Cortex-A15, Cortex-A57, Cortex-A72 and Cortex-A73, scale 0 to 6]
Accelerated growth in Performance/Cycle
Delivered performance
Continuous reduction in power consumption and continuous improvement in power efficiency, contributing to a longer battery life
Performance measured as single thread at-speed
Power consumption measured in mW/unit performance
Cortex-A73: Maximizes Performance
Cortex-A73 sustains peak performance within per-core power budget
[Chart: relative performance, peak vs. sustained; annotations: 1.3x, 2.1x, Sustained ≈ Peak]
2015: Cortex-A57, 20nm @ 1.9GHz
2016: Cortex-A72, 16nm @ 2.5GHz
2017: Cortex-A73, 10nm @ 2.8GHz
Sustained operation within 750mW per-core power budget
As measured on Spec2k
Cortex-A73: Power Efficiency Benefit
[Chart: power relative to Cortex-A72 (ISO-process, ISO-frequency) for Integer, Floating Point, L2 Cache and Memory Copy workloads; series: Cortex-A72, Cortex-A73; scale 0.0x to 1.0x]
>20% Power Saved
Giving additional thermal headroom for the rest of the SoC
Cortex-A73: Boost your Mid-range Smartphone
More performance, same footprint
[Chart: octa-core Cortex-A53 (4L+4L) at 1.0x vs. hexa-core Cortex-A73+Cortex-A53 (2b+4L)]
≈ Same area footprint (1.0x)
+30% Multi-core Performance (1.3x)
+90% Single-thread Performance (1.9x)
Enhanced user experience
Best response time
Implemented on the same process technology
big.LITTLE Technology
Architecturally Identical Processors
High Performance ‘big’ and High Efficiency ‘LITTLE’ cores
big CPUs deliver higher single thread performance
LITTLE CPUs deliver higher energy efficiency
Seamless & Automatic Task Allocation
Assigns low intensity tasks to the LITTLE cores and high intensity tasks to the big cores
Thread migration is typically faster than DVFS
Heterogeneous Computing
Efficient handling of single and multi-threaded applications
Scalable to multiple clusters with Cache Coherent Interconnect
[Diagram: big cluster and LITTLE cluster, each with its own L2 cache, joined by a Cache Coherent Interconnect, with interrupt control]
[Chart: relative big.LITTLE power*, Cortex-A15 with Cortex-A7 vs. Cortex-A57 with Cortex-A53, showing further power reductions]
* Measured average power across high and low intensity workloads on big.LITTLE-based smartphones
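The task allocation idea described on this slide can be pictured with a very small model. The Python sketch below is illustrative only: the threshold value, the `Task` type and the task names are hypothetical, and it is not the scheduler that ships on any device. Real schedulers track load history and typically add hysteresis so tasks do not bounce between clusters.

```python
from dataclasses import dataclass

# Hypothetical cut-off: loads above this count as "high intensity".
HIGH_INTENSITY_THRESHOLD = 0.6

@dataclass
class Task:
    name: str
    load: float  # recent CPU demand, 0.0 (idle) to 1.0 (fully busy)

def pick_cluster(task: Task) -> str:
    """Send high intensity tasks to the big cores, the rest to the LITTLE cores."""
    return "big" if task.load > HIGH_INTENSITY_THRESHOLD else "LITTLE"

tasks = [
    Task("audio_playback", 0.10),
    Task("web_page_load", 0.85),
    Task("background_sync", 0.25),
]
placement = {t.name: pick_cluster(t) for t in tasks}
print(placement)
```

Only the page load lands on the big cluster here; the two light tasks stay on the LITTLE cores, which is where the energy saving comes from.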
big.LITTLE MP
Delivers higher power efficiency
Extends battery life
Improves user experience
[Charts: measured power and performance on big.LITTLE devices, big.LITTLE MP relative to Cluster Migration, scale 0% to 180%]
Power (lower is better): Web Browsing -29%, Intensive Gaming -38%
Performance (higher is better): Web Browsing +20%, Intensive Gaming +60%
Mobile Application Workloads
Applications require a mix of performance levels
Mobile users want a better user experience, but not at the cost of reduced battery life
[Chart: percentage of time spent in DVFS states (High, Mid, Low, WFI / Power Down)]
Category 1: Burst High Intensity Workloads. Example: Web Browsing
Category 2: Sustained Performance at Thermal Limit. Example: Castlemaster
Category 3: Long-use Low-Intensity Workloads. Example: Audio Playback
Measured on a Quad Cortex-A7 Symmetric Multiprocessing platform
ARM Intelligent Power Allocation (IPA)
IPA Elements:
Proactive temperature control
Real-time CPU & GPU performance requests
Power estimation
Dynamic power allocation
Dynamic Allocation by:
•Performance required
•Thermal headroom
[Diagram: die temperature (Tdie) from the SoC and skin temperature (Tskin) from the device feed IPA; performance requests from big, LITTLE and GPU go in, allocated performance for big, LITTLE and GPU comes out; power to heat]
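One way to picture the elements listed on this slide is a two-step loop: derive a power budget from the thermal headroom, then share that budget among the actors according to what each one requests. The Python sketch below is a simplification under stated assumptions: a plain proportional controller stands in for the temperature control, the budget is split in proportion to requested power, and every name and number is hypothetical rather than taken from ARM's implementation.

```python
from typing import Dict

def power_budget_mw(temp_c: float, target_c: float,
                    sustainable_mw: float, k_p_mw_per_c: float) -> float:
    """Proportional control: more headroom below the target means a larger budget."""
    headroom_c = target_c - temp_c
    return max(0.0, sustainable_mw + k_p_mw_per_c * headroom_c)

def allocate(requests_mw: Dict[str, float], budget_mw: float) -> Dict[str, float]:
    """Grant every request if it fits, otherwise scale all requests to the budget."""
    total = sum(requests_mw.values())
    if total <= budget_mw:
        return dict(requests_mw)
    scale = budget_mw / total
    return {actor: req * scale for actor, req in requests_mw.items()}

# Hypothetical requests from the three actors, in milliwatts.
requests = {"big": 1500.0, "LITTLE": 300.0, "GPU": 2200.0}

cool = allocate(requests, power_budget_mw(50.0, 75.0, 2500.0, 100.0))
hot = allocate(requests, power_budget_mw(80.0, 75.0, 2500.0, 100.0))
print("cool:", cool)
print("hot: ", hot)
```

With plenty of headroom every actor gets what it asked for. Once the temperature passes the target the budget shrinks and each grant is scaled down, with the GPU still receiving the largest share because it requested the most.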
Intelligent Power Allocation in Action
Device temperature is below threshold
There are no constraints on power / performance
Every actor runs at max required frequency
[Chart: running frequency over time for three consecutive runs of GLB TRex; max and running frequency for “big”, “LITTLE” and GPU; median filtered for clarity]
Intelligent Power Allocation in Action
High load on GPU & low load on CPU
GPU gets allocated most of the power
[Chart: running frequency over time for three consecutive runs of GLB TRex; max and running frequency for “big”, “LITTLE” and GPU; median filtered for clarity]
Intelligent Power Allocation in Action
High load on CPU & low load on GPU
CPU gets allocated most of the power
[Chart: running frequency over time for three consecutive runs of GLB TRex; max and running frequency for “big”, “LITTLE” and GPU; median filtered for clarity]
Intelligent Power Allocation in Action
Device temperature gets hotter
IPA reduces available power to actors
This maintains temperature control
[Chart: running frequency over time for three consecutive runs of GLB TRex; max and running frequency for “big”, “LITTLE” and GPU; median filtered for clarity]
Intelligent Power Allocation in Action
IPA vs. Traditional (Relative Performance), three consecutive runs of GLB TRex:
1st Run: 13% Improvement
2nd Run: 34% Improvement
3rd Run: 36% Improvement
Average: 28% Improvement
[Chart: running frequency over time; max and running frequency for “big”, “LITTLE” and GPU; median filtered for clarity]
EAS – Energy Aware Scheduler
[Diagram: compute capacity (performance) and capacity utilization for CPUs 0 to 3; CPU 0, 1 are the little cluster and CPU 2, 3 are the big cluster; markers show max capacity little, max capacity big and current capacity; a waking task needs a CPU]
Placing task on cpu1: P-state change for cpu0 and cpu1.
Placing task on cpu3: No P-state changes.
[Chart: power of little core (CPU 0, 1) vs. big core (CPU 2, 3)]
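The placement question on this slide can be illustrated with a toy energy model. The Python sketch below assumes each cluster shares one P-state, picks the lowest P-state whose capacity covers its busiest CPU, and estimates cluster energy as power multiplied by the summed busy fraction. All capacities, power figures and utilizations are hypothetical, and the model is a simplification in the spirit of energy-aware scheduling rather than the actual scheduler code.

```python
from typing import Dict, List, Tuple

# Hypothetical (capacity, power) pairs per P-state, lowest P-state first.
CLUSTERS = {
    "little": {"cpus": [0, 1], "pstates": [(200, 100), (400, 300)]},
    "big": {"cpus": [2, 3], "pstates": [(600, 600), (1000, 1500)]},
}

def cluster_energy(utils: List[int], pstates: List[Tuple[int, int]]) -> float:
    """Energy estimate at the lowest P-state that fits the busiest CPU."""
    peak = max(utils)
    for capacity, power in pstates:
        if capacity >= peak:
            return power * sum(utils) / capacity
    capacity, power = pstates[-1]  # saturated: stay at the top P-state
    return power * sum(utils) / capacity

def system_energy(util: Dict[int, int]) -> float:
    return sum(
        cluster_energy([util[c] for c in cl["cpus"]], cl["pstates"])
        for cl in CLUSTERS.values()
    )

def place(util: Dict[int, int], task_util: int, candidates: List[int]):
    """Return the candidate CPU with the smallest energy increase, plus all deltas."""
    base = system_energy(util)
    deltas = {}
    for cpu in candidates:
        trial = dict(util)
        trial[cpu] += task_util
        deltas[cpu] = system_energy(trial) - base
    return min(deltas, key=deltas.get), deltas

utilization = {0: 150, 1: 100, 2: 300, 3: 100}
best_cpu, deltas = place(utilization, 150, [1, 3])
print(best_cpu, deltas)
```

With these made-up numbers, placing the task on cpu1 pushes the whole little cluster to a higher P-state, which also raises the cost of the work already running on cpu0, so cpu3 ends up cheaper. Different power tables could reverse that outcome, which is why the decision is computed rather than assumed.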
LTE Global Market Through 2020
Source: GSMA
1.3 Billion LTE Connections by end of 2016
Strong WW growth of LTE expected, establishing it as the dominant access technology
FDD is prevalent, but TDD growing to 25% share by 2020
Huge backlog of new features in 3GPP, very strong LTE roadmap
The Demand for Faster Connectivity
5G Modem Development Timeline
Showing specification freeze to first deployment
5G Goals
Ultra reliable, low latency
Large scale IoT deployment
Enhanced mobile broadband
[Timeline, 2014 to 2024: development followed by deployment for LTE Advanced, LTE Advanced Pro, 5G wave 1 and 5G wave 2; annotated data rates: 600 Mbps, > 10 Gbps]
Modem design starts in 2016
3GPP 4G Evolution… the path to 5G
Higher levels of carrier aggregation leading to much higher aggregate throughputs
Complex MIMO and higher order modulation to increase spectral efficiencies
Timeline: 2008 to 2018
3GPP releases: Release 8, Release 9, Release 10, Release 11, Release 12, Release 13, Release 14
Generations: LTE, LTE-Advanced, LTE-A… the path to 5G
Data rates: 300 Mbps, 450 Mbps, 600 Mbps, 1Gbps+
Radio techniques: OFDMA / MIMO, 256 QAM, 5G mm Wave
Cells: Small Cell / Femto Cell / HetNet, Single Cell Point to Multi-Point
Carrier aggregation: 2CA; 21 new combinations up to 2CA; 114 new combinations up to 3CA; 52 new (190 total) combinations up to 4CA; up to 32CA in discussion
Multimedia Broadcast/Multicast Services (MBMS)
LTE-Unlicensed (Licensed Assisted Access) / Wi-Fi co-existence (LTE-H)
LTE-MTC, enhancements to LTE-MTC
Device to Device (D2D), D2D enhancements inc. push to talk
2D Antenna Array MIMO, interference cancellation
Indoor positioning, location services
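The link between carrier aggregation, MIMO, modulation order and throughput can be approximated with a first-order scaling model. The Python sketch below assumes a baseline of 150 Mbps for one 20 MHz carrier with 2 MIMO layers and 64QAM (6 bits per symbol), then scales linearly with carrier count, layer count and bits per symbol. It ignores coding and control overhead, so the outputs are rough upper bounds and not the figures quoted on the slide; the function and constant names are illustrative.

```python
# Assumed baseline: one 20 MHz carrier, 2 MIMO layers, 64QAM.
BASE_MBPS = 150
BASE_LAYERS = 2
BASE_BITS_PER_SYMBOL = 6  # 64QAM carries 6 bits per symbol, 256QAM carries 8

def aggregate_mbps(carriers: int, layers: int = 2, bits_per_symbol: int = 6) -> float:
    """First-order throughput estimate: linear in carriers, layers and bits/symbol."""
    return (BASE_MBPS * carriers * layers * bits_per_symbol
            / (BASE_LAYERS * BASE_BITS_PER_SYMBOL))

for n in (1, 2, 3, 4):
    print(f"{n}CA, 2 layers, 64QAM: {aggregate_mbps(n):.0f} Mbps")
print(f"3CA, 4 layers, 256QAM: {aggregate_mbps(3, layers=4, bits_per_symbol=8):.0f} Mbps")
```

Each added carrier contributes the same increment in this model, while moving from 64QAM to 256QAM adds a third more bits per symbol, which is why the gigabit-class rates combine all three techniques.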
‘LTE-Advanced Pro’ – The Stepping Stone To 5G
Carriers are aggregated for increased throughput
[Diagram: carriers aggregated at each stage. LTE Advanced: LTE carriers. LTE Advanced-Pro: LTE carriers, LTE Unlicensed, WiFi. 5G: LTE carriers, 5G carriers, LTE Unlicensed, WiFi]
From LTE to 5G: 50x throughput*
* ARM Estimate, LTE Rel 8 to 5G wave 2
Supporting All Connectivity Standards
ARM cores enable all major wireless connectivity standards, from Bluetooth-LE through LTE-A towards 5G
Cortex-R8 processor extends support to new performance points for Gbps LTE-A & 5G
802.11ah ‘HaLow’, LTE Cat-M and NB-IoT bring new ultra low power requirements to wide area networks
[Chart: ARM cores by connectivity standard and data rate (100Kbps to 10Gbps), marked Established or Emerging. Standards shown: 802.11a/b, 11g, 11n, 11ac, 11ad, 802.11ah, BT4.0, BT-LE, 3G, 3.5G, 3.9G, 4G, 4.5G & 5G, LTE-M & NB-IoT. Cores shown: ARM7TDMI®, ARM9™, ARM11™, Cortex-M, Cortex-M0, Cortex-M0+, Cortex-M3, Cortex-M7, Cortex-R4, Cortex-R5, Cortex-R7, Cortex-R8 (4.5G & 5G)]
Sensor Hubs... Everywhere
With the explosion of sensors in devices, coupled with always-on operation, we are seeing an explosion of sensor hub applications
The sensor hub is rapidly establishing itself as a platform for innovation and differentiation
Healthcare, Mobile, Tablet, Wearables, Fitness, TV
Sensor Hubs Enable...
Always-on sensor monitoring
‘Smart’ data fusion
Combined multi-data pattern recognition
New use cases such as contextual awareness
Always Sensing, Always On, Always Aware
[Diagram: sensors (Temperature, GPS, Magnetometer, Accelerometer, Gyroscope, Touch, Microphone, Luminance) feed a Cortex-M sensor hub providing smart analog and sensor fusion, which connects through an API to the application processor; label: Always Aware]
Sensor Hub Architecture
24/7 in mobile – sensor hub
[Diagram: application processor connected to GSM / CDMA transceiver, GPS receiver, touch screen controller, audio processor, PMIC, image processor, NFC transceiver, WiFi, Bluetooth and the sensor hub; sensors shown: Mic, Acc, Gyr, Mag, Cam, Bar, ALS, Touch, Prox; label: Wearable]
Benefits:
• Ultra Low Power Consumption
• Always On Capability
Advanced Context Awareness
Activity Monitoring
Pedestrian Dead Reckoning
Power Control, etc.
Distributed Sensor Hub Bus will increase sensor fusion innovation and the need for more intelligence at the sensor level
More Intelligence – The New Paradigm Shift
Speech recognition
Home security
Machine Translation
Face and Object Recognition
Smartphone: The Portal to Future Innovation
Machine Learning & Computer Vision
Consumer IoT / Smart home
Virtual and Augmented Reality
Always-on Digital Assistants
Summary
Premium mobile user experience demands
Thinner devices + Lower power budgets + More performance
ARM Cortex-A73 delivers sustained peak performance in the mobile power budget
30% more performance, 30% more power efficiency
Delivers increased performance range for premium mobile and large screen devices
Cortex-A73 with big.LITTLE delivers improved scalable performance
Boost for the mid-range smartphone and a broad range of consumer applications
LTE-A and LTE-A Pro Paving the Path Toward 5G
Cortex-R8 Extends ARM’s Real-time Leadership for Future Connectivity
Connected Intelligence: the Trend for Future Mobile
MEMS sensors and sensor fusion are the fundamental building blocks for contextual awareness
Cortex-M provides a scalable and first-in-class solution
ARM Provides Comprehensive System Solution for Future Mobiles