ibm power7
IBM POWER7 processor technology with a focus on IBM i.
Tom Presotto - October 2010
A long walk … together
S/38 August 79
IBM Vimercate
S/38 manufacturing plant.
February 79
Back to basics
1979 – Mainframe and “dumb” terminals
1990 – 2010 – Windows client-server model
2010 – Cloud Computing and browser
SOA & Cloud
POWER8
POWER4™ 180 nm
POWER5™ 130 nm
POWER6™ 65 nm
POWER7 45 nm
Processor | POWER5 | POWER5+™ | POWER6 | POWER7
Technology | 130 nm | 90 nm | 65 nm | 45 nm
Size | 389 mm² | 245 mm² | 341 mm² | 567 mm²
Transistors | 276 M | 276 M | 790 M | 1.2 B
Cores | 2 | 2 | 2 | 4 / 6 / 8
Frequencies | 1.65 GHz | 1.9 GHz | 3-5 GHz | 3-4 GHz
L2 Cache | 1.9 MB Shared | 1.9 MB Shared | 4 MB / Core | 256 KB / Core
L3 Cache | 36 MB | 36 MB | 32 MB | 4 MB / Core
Memory Cntrl | 1 | 1 | 2 / 1 | 2 / 1
LPAR | 10 / Core | 10 / Core | 10 / Core | 10 / Core
2001 2004 2007 2010
Performance per Watt
Processor | System | Frequency | rPerf | KWatts | rPerf per KWatt
POWER4™ | p670 | 1.1 GHz | 24.46 | 6.71 | 3.64
POWER4+™ | p670 | 1.5 GHz | 46.79 | 6.71 | 6.97
POWER5™ | p5-570 | 1.65 GHz | 68.4 | 5.2 | 13.15
POWER5+™ | p570 | 1.9 GHz | 85.20 | 5.2 | 16.38
POWER6™ | Power 570 | 4.7 GHz | 134.35 | 5.6 | 23.99
POWER6™ | Power 570 | 4.2 GHz | 193.25 | 5.6 | 34.56
POWER7™ | Power 780 | 3.8 GHz | 685.09 | 6.4 | 107.04
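The performance-per-watt figures on this slide appear to be rPerf divided by KWatts. The sketch below recomputes them from the slide's own numbers; the names `SYSTEMS`, `perf_per_kw` and `report` are illustrative and not from the original deck. Most results match the slide to within rounding, but the 4.2 GHz POWER6 row works out to roughly 34.51 rather than the 34.56 shown.

```python
# (processor, system, GHz, rPerf, KWatts) as listed on the slide
SYSTEMS = [
    ("POWER4", "p670", 1.1, 24.46, 6.71),
    ("POWER4+", "p670", 1.5, 46.79, 6.71),
    ("POWER5", "p5-570", 1.65, 68.4, 5.2),
    ("POWER5+", "p570", 1.9, 85.20, 5.2),
    ("POWER6", "Power 570", 4.7, 134.35, 5.6),
    ("POWER6", "Power 570", 4.2, 193.25, 5.6),
    ("POWER7", "Power 780", 3.8, 685.09, 6.4),
]


def perf_per_kw(rperf, kwatts):
    """Return rPerf delivered per kilowatt (assumed definition of the metric)."""
    if kwatts <= 0:
        raise ValueError("kwatts must be positive")
    return rperf / kwatts


def report(systems=SYSTEMS):
    """Return (processor, system, GHz, rPerf per KWatt rounded to 2 places)."""
    return [
        (name, system, ghz, round(perf_per_kw(rperf, kw), 2))
        for name, system, ghz, rperf, kw in systems
    ]


if __name__ == "__main__":
    for row in report():
        print(row)
```

Dividing the POWER7 result by the POWER4 result gives the generation-over-generation efficiency gain implied by the chart.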
POWER7 Core
Transistors: 1.2 B
Binary Compatibility with POWER6
• 4 Way SMT per core – up to 4 threads per core
• 32 Threads per chip
• L1: 32 KB I Cache / 32 KB D Cache
• L2: 256 KB per core
• L3: Shared 32MB on chip eDRAM
Eight processor cores
POWER7 Core
[Diagram: eight POWER7 cores, each with its own L2 cache, around the L3 cache and chip interconnect, with memory controllers MC0 and MC1]
12 Execution Units
• 2 Fixed Point Units
• 2 Load Store Units
• 4 Double Precision Floating Point Units
• 1 Branch
• 1 Condition Register
• 1 Vector Unit
• 1 Decimal Floating Point Unit
64-bit PowerPC architecture v2.06
Modes: POWER6, POWER6+ and POWER7
L3 Cache
eDRAM
Fast Local Regions
Shared L3 Cache
Intelligent cache management
On chip cache benefits
• 6-to-1 latency improvement
• 2 x bandwidth improvement
• No off-chip drivers and receivers
• 20% energy of SRAM
[Diagram: eight POWER7 cores with L2 caches around the shared L3 cache and chip interconnect, memory controllers MC0 and MC1, local SMP links, remote SMP & I/O links, with a fast L3 region highlighted]
[Diagram: POWER6 chip with two cores (each with AltiVec and 4 MB L2), L3 and L3 controllers, two memory controllers (Memory+), GX+ bridge, GX bus controller and fabric bus controller, next to a POWER7 chip with eight cores (each with L2), L3 cache, memory interface (Memory++), GX, SMP fabric and POWER bus]
POWER7 delivers up to 3 - 4X the performance with less energy than POWER6
Processor | Memory | Effective Bandwidth
POWER5 | DDR2 @ 553 MHz | 1.1 GB/s
POWER6 | DDR2 @ 553 / 667 MHz | 2.6 GB/s
POWER7 | DDR3 @ 1066 MHz | 6.4 GB/s
2 memory controllers
Up to 256 GB of memory
EDRAM Cell
Greater density
Less power requirements
Fewer soft errors
Better performance
Enables POWER7 to provide 32MB of internal L3 Cache
1/3 the space of conventional 6T SRAM implementation
1/5 the standby power
1.5 billion fewer transistors
POWER7 TurboCore™ Mode
[Diagram: 8-core POWER7 chip with 32 MB L3 cache; four cores run as TurboCores and four are unused]
TurboCore chips: 4 available cores
Aggregation of the L3 caches of unused cores.
TurboCore chips have 2X the L3 cache available per core
4 TurboCore chips: L3 = 32 MB
Provides up to 1.5X performance, core to core
Chips run at higher frequency
Power reduction of unused cores.
With a reboot, the system can be reconfigured to 8-core mode.
Power 780 TurboCore
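The L3 arithmetic behind TurboCore can be checked with a short sketch. It assumes the 32 MB shared on-chip L3 and eight cores per chip quoted in these slides; the constant and function names are illustrative only.

```python
L3_PER_CHIP_MB = 32   # shared on-chip eDRAM L3, as quoted on the slides
CORES_PER_CHIP = 8    # cores on a full POWER7 chip


def l3_per_active_core(active_cores, l3_mb=L3_PER_CHIP_MB):
    """Return the L3 capacity (MB) available to each active core."""
    if not 1 <= active_cores <= CORES_PER_CHIP:
        raise ValueError("active_cores must be between 1 and 8")
    return l3_mb / active_cores


if __name__ == "__main__":
    standard = l3_per_active_core(8)   # all eight cores active
    turbo = l3_per_active_core(4)      # TurboCore: four cores active
    print(standard, turbo, turbo / standard)
```

With all eight cores active each core sees 4 MB of L3, which matches the "4 MB / Core" entry in the processor comparison table; with four active cores the share doubles to 8 MB.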
POWER7 Core / Cache options
6-Core Chip: 24 MB L3 Cache (Power 750 / Power 770)
4-Core Chip: 16 MB L3 Cache (PS700)
[Diagrams: POWER7 chip layouts with cores, L2 caches, memory interface, GX, SMP fabric and POWER bus for each option]
Multi-threading Evolution
[Figure: use of the execution units FX0, FX1, FP0, FP1, LS0, LS1, BRX and CRL in four designs: single thread out of order, S80 hardware multi-thread, POWER5 2 way SMT, POWER7 4 way SMT. Legend: no thread executing, thread 0, thread 1, thread 2, thread 3 executing]
POWER7 4 way SMT
Standard Cache Option
All cores active
Requires POWER7 Mode
• POWER6 Mode supports SMT1 and SMT2
Operating System Support
• AIX 6.1 and AIX 7.1
• IBM i 6.1 and 7.1
• Linux
Dynamic Runtime SMT scheduling
• Spread work among cores to execute in appropriate threaded mode
• Can dynamically shift between modes as required: SMT1 / SMT2 / SMT4
[Chart: SMT1, SMT2 and SMT4 compared on a scale from 0 to 2]
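As a rough illustration of the SMT modes, the sketch below counts hardware threads per chip and rejects SMT4 in POWER6 mode, as the slide states. The table and function names are hypothetical, and POWER6+ mode is left out because the slides do not say which SMT modes it supports.

```python
# Threads per core for each SMT mode named on the slide
SMT_THREADS = {"SMT1": 1, "SMT2": 2, "SMT4": 4}

# SMT modes allowed in each processor compatibility mode (from the slide)
ALLOWED_SMT = {
    "POWER6": ("SMT1", "SMT2"),
    "POWER7": ("SMT1", "SMT2", "SMT4"),
}


def hardware_threads(cores, smt_mode, processor_mode="POWER7"):
    """Return the number of hardware threads for the given configuration."""
    if processor_mode not in ALLOWED_SMT:
        raise ValueError(f"unknown processor mode: {processor_mode}")
    if smt_mode not in ALLOWED_SMT[processor_mode]:
        raise ValueError(f"{smt_mode} is not available in {processor_mode} mode")
    return cores * SMT_THREADS[smt_mode]


if __name__ == "__main__":
    for mode in ("SMT1", "SMT2", "SMT4"):
        print(mode, hardware_threads(8, mode))
```

An 8-core chip in SMT4 gives the 32 threads per chip quoted on the POWER7 Core slide.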
Active Memory Expansion
Expand memory beyond physical limits
More effective server consolidation
• Run more application workload / users per partition
• Run more partitions and more workload per server
Effectively up to 100% more memory
[Diagram: true memory compared with expanded memory]
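A minimal sketch of the "up to 100% more memory" claim, assuming expansion can be modelled as a simple factor between 1.0 and 2.0 applied to true memory. The function name and the factor range are assumptions made for illustration; the achievable factor in practice depends on how well the workload's data compresses.

```python
def expanded_memory_gb(true_memory_gb, expansion_factor):
    """Return the effective memory seen by a partition.

    expansion_factor is a hypothetical parameter: 1.0 means no expansion,
    2.0 means 100% more memory, the upper bound quoted on the slide.
    """
    if true_memory_gb <= 0:
        raise ValueError("true_memory_gb must be positive")
    if not 1.0 <= expansion_factor <= 2.0:
        raise ValueError("expansion_factor must be between 1.0 and 2.0")
    return true_memory_gb * expansion_factor


if __name__ == "__main__":
    print(expanded_memory_gb(64, 1.5))
    print(expanded_memory_gb(64, 2.0))
```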
Active Memory Sharing
Moves memory from one partition to another
AIX, IBM i, and Linux partitions
[Charts: memory usage (GB, axis 0 to 15) over time in three scenarios: Around the World (Asia, Americas, Europe), Day and Night (Night, Day), Infrequent Use (partitions #1 to #10)]
EnergyScale™
IBM Systems Director is also required to manage Active Energy Manager (AEM) functions, which include:
• Power Trending
• Thermal Reporting
• Static Energy Saver Mode
• Dynamic Energy Saver Mode
• Energy Capping
• Soft Energy Capping
• Processor Nap
• Energy Optimized Fan Control
• Altitude Input
• Processor Folding
EnergyScale dynamically optimizes processor performance against processor power and system workload
TPMD: Thermal Power Management Device
TPMD card is part of the base hardware configuration.
TPMD function comprises a RISC processor and data acquisition.
TPMD monitors power usage and temperatures in real time.
It can adjust processor power and performance in real time.
If the temperature exceeds an upper (functional) threshold, TPMD actively reduces power consumption by reducing processor voltage and frequency or throttling memory as needed.
POWER7 “Over Clock” Uplift
[Chart: Nominal versus Over Clock, axis from 3 to 4.4]
If the temperature is below the upper (functional) threshold, TPMD allows POWER7 cores to “over clock” when workload demand is present.
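The two TPMD rules described on these slides (reduce power above the upper threshold, allow over-clocking below it when there is demand) can be written as a small decision function. This is only a sketch of the stated rules: the function name, the action labels and the example temperatures are hypothetical, and the real TPMD firmware logic is not described in the deck.

```python
def tpmd_action(temperature_c, upper_threshold_c, workload_demand):
    """Return the action implied by the slide's two rules.

    "reduce_power": temperature exceeds the upper (functional) threshold,
                    so voltage/frequency are reduced or memory is throttled.
    "over_clock":   temperature is below the threshold and demand is present.
    "nominal":      neither rule applies (an assumption for the remaining cases).
    """
    if temperature_c > upper_threshold_c:
        return "reduce_power"
    if temperature_c < upper_threshold_c and workload_demand:
        return "over_clock"
    return "nominal"


if __name__ == "__main__":
    # Example temperatures are made up for illustration.
    print(tpmd_action(90.0, 85.0, workload_demand=True))
    print(tpmd_action(60.0, 85.0, workload_demand=True))
    print(tpmd_action(60.0, 85.0, workload_demand=False))
```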
Offerings - April 2010
Power 750 Express
Power 755 (HPC)
Power 770
Power 780
PS700 Express
PS701 Express
PS702 Express
New models - August 2010
Power 710/730
Power 720/740
Power 795
Power 710 4-6-8 core
1 socket Processor module – pick ONE
4-core: 3 GHz 6-core: 3.7 GHz 8-core: 3.55 GHz
For 4-core (1 socket)
– Zero 12X I/O loops
– Max 1 4X IB Adapter
– Max 64 GB memory
– Disk-only drawers
– Fibre Channel cards ok
– IBM i P05 tier (users)
– AIX small tier
2U
For 6-/8-core (1 socket)
– Zero 12X I/O loops
– Max 1 4X IB Adapter
– Max 64 GB memory
– Disk-only drawers
– Fibre Channel cards ok
– IBM i P10 tier (users)
– AIX small tier
Power 730 8-12-16 core
2 sockets
Processor module – pick TWO of the same feature
4-core: 3 GHz 4-core: 3.7 GHz 6-core: 3.7 GHz 8-core: 3.55 GHz
For 8-/12-/16-core
– Zero 12X I/O loops
– Max 2 4X IB Adapters
– Max 128 GB memory
– Disk-only drawers ok
– Fibre Channel cards ok
– IBM i P20 tier (5250)
– AIX small tier
2U
Power 720 4-6-8 core
1 socket Processor module – pick ONE
4-core: 3 GHz 6-core: 3 GHz 8-core: 3 GHz
For 4-core
– Zero 12X I/O loops
– Max 64 GB memory
– Zero disk-only drawers
– Fibre Channel cards ok
– IBM i P05 tier (users)
– AIX small tier
For 6-/8-core
– Max 1 12X I/O loop
– Max 128 GB memory
– Disk-only drawers
– Fibre Channel cards ok
– IBM i P10 tier (users)
– AIX small tier
4U
Power 740 4-6-8-12-16 core
1 or 2 sockets
Pick processor modules
1 or 2: 4-core: 3.3 GHz
1 or 2: 4-core: 3.7 GHz
1 or 2: 6-core: 3.7 GHz
2: 8-core: 3.55 GHz
For 4-,6-core (1 socket)
– Max 1 12X I/O loop
– Max 128 GB memory
– IBM i P20 tier
5250 Entitlements
– AIX small tier
For 8-,12-,16-core (2 socket)
– Max 2 12X I/O loops
– Max 256 GB memory
– IBM i P20 tier
5250 Entitlements
– AIX small tier
4U
POWER7 delivers outstanding performance
Single core CPW
NB CPW measured in maximum system and I/O configuration
[Chart: CPW (axis 20,000 to 80,000) for 525 POWER5, 550 POWER5, 520 POWER6, 550 POWER6, 720 POWER7 and 740 POWER7; single-core CPW labels: 3800, 4700, 5950]
Power 750
4 Socket 4U
6 or 8 cores per socket
3.0 to 3.55 GHz
Energy-Star Qualified
Power 750 System Overview
8 SFF Bays (Disk or SSD)
Dual Power Supplies
Half-High Bay (tape or removable disk)
Up to 4 Processor / Memory Cards
3 PCIe & 2 PCI-X Slots
Fans
TPMD
DVD
Power 750 System
POWER7 Architecture: 6 Cores @ 3.3 GHz; 8 Cores @ 3.0, 3.3, 3.55 GHz; Max: 4 Sockets
DDR3 Memory: Up to 512 GB
System Unit SAS SFF Bays: Up to 8 Drives (HDD or SSD); 73 / 146 / 300 GB @ 15k (2.4 TB); Opt: cache & RAID-5/6
System Unit IO Expansion Slots: PCIe x8: 3 Slots (2 shared); PCI-X DDR: 2 Slots; 1 GX+ & Opt 1 GX++ 12X cards
Integrated SAS / SATA: Yes
System Unit Integrated Ports: 3 USB, 2 Serial, 2 HMC
Integrated Virtual Ethernet: Quad 10/100/1000; Optional: Dual 10 Gb
System Unit Media Bays: 1 Slim-line DVD & 1 Half Height
IO Drawers w/ PCI slots: PCIe = 4 Max; PCI-X = 8 Max
Cluster: 12X SDR / DDR (IB technology)
Redundant Power and Cooling: Yes (AC or DC Power); Single phase 240 VAC or -48 VDC
Certification (SoD): NEBS / ETSI for harsh environments
EnergyScale: Active Thermal Power Management; Dynamic Energy Save & Capping
Power 755
Power 755 and Power 750 hardware are very similar, but the Power 755 offering is customized for High Performance Computing environments.
Power 755
POWER7 Architecture: 4 Processor Sockets = 32 Cores; 8 Core @ 3.3 GHz
DDR3 Memory: 128 GB / 256 GB, 32 DIMM Slots
System Unit SAS SFF Bays: Up to 8 disk or SSD; 73 / 146 / 300 GB @ 15K (up to 2.4 TB)
System Unit Expansion: PCIe x8: 3 Slots (1 shared); PCI-X DDR: 2 Slots; GX++ Bus
Integrated Ports: 3 USB, 2 Serial, 2 HMC
Integrated Ethernet: Quad 1Gb Copper (Opt: Dual 10Gb Copper or Fiber)
System Unit Media Bay: 1 DVD-RAM (no supported tape bay)
Cluster: Up to 64 nodes; Ethernet or IB-DDR
Redundant Power: Yes (AC or DC Power); Single phase 240 VAC or -48 VDC
Certifications (SoD): NEBS / ETSI for harsh environments
EnergyScale: Active Thermal Power Management; Dynamic Energy Save & Capping
The highest performing 4-socket system on the planet
POWER7 continues to break the rules with more performance
[Chart: SPECint_rate for Itanium (HP rx6600), SPARC (Sun T5440), x86 (HP DL585) and POWER7 (Power 755 with PowerVM)]
The most energy efficient 4-socket system on the planet
Performance Per Watt
Most energy efficient systems
[Chart: performance per watt for Itanium (HP rx6600), SPARC (Sun T5440), x86 (HP DL585) and POWER7 (Power 755 with PowerVM)]
Power 770
12 or 16 core 4U Nodes
Up to 4 Nodes per system
3.1 and 3.5 GHz
Capacity on Demand
Enterprise RAS
Power 770
Processor Technology: 6 Cores @ 3.55 GHz; 8 Cores @ 3.1 GHz
L3 Cache: On Chip
Redundant Power & Cooling: Yes
Redundant Service Processor: Yes / Two Enclosure minimum
Redundant Clock: Yes / Two Enclosure minimum
Hot Add Support: Yes
Hot Service: Yes
System Unit | Single Enclosure | 4 Enclosures
Processors | Up to 2 Sockets | 8 Sockets
DDR3 Memory (Buffered) | Up to 512 GB | Up to 2 TB
SAS/SSD SFF Bays | 6 | 24
DVD-RAM Media Bays | 1 Slim-line | 4 Slim-line
SAS / SATA Controller | 2 / 1 | 8 / 4
PCIe bays | 6 PCIe | 24 PCIe
GX++ Slots (12X DDR) | 2 | 8
Integrated Ethernet | Std: Quad 1Gb; Opt: Dual 10Gb + Dual 1 Gb | Std: Four Quad 1Gb; Opt: Four x Dual 10Gb + Dual 1 Gb
USB | 3 | 12
12X I/O Drawers w/ PCI slots | Max: 4 PCIe, 8 PCI-X | Max: 16 PCIe, 32 PCI-X
4U x 32 inches Depth
Power 780
New Modular High-End
Up to 64 Cores
TurboCore
3.86 or 4.14 GHz
Capacity on Demand
Enterprise RAS
24x7 Warranty
PowerCare
Power 780
Processor Technology: 4 Cores @ 4.1 GHz TurboCore; 8 Cores @ 3.8 GHz
L3 Cache: On Chip
Redundant Power & Cooling: Yes
Redundant Service Processor: Yes / Two Enclosure minimum
Redundant Clock: Yes / Two Enclosure minimum
Hot Add Support: Yes
Hot Service: Yes
System Unit | Single Enclosure | 4 Enclosures
Processors | 2 Sockets | 8 Sockets
DDR3 Memory (Buffered) | Up to 512 GB | Up to 2 TB
SAS/SSD SFF Bays (CEC) | 6 | 24
DVD-RAM Media Bays | 1 Slim-line | 4 Slim-line
SAS / SATA Controller | 2 / 1 | 8 / 4
PCIe (CEC) | 6 PCIe | 24 PCIe
GX++ Slots (12X DDR) | 2 | 8
Integrated Ethernet | Std: Quad 1Gb; Opt: Dual 10Gb + Dual 1 Gb | Std: Four Quad 1Gb; Opt: Four x Dual 10Gb + Dual 1 Gb
USB | 3 | 12
12X I/O Drawers w/ PCI slots | Max: 4 PCIe, 8 PCI-X | Max: 16 PCIe, 32 PCI-X
Power 795
✓ New High-end
✓ 24 to 256 Cores
✓ 8 TB memory
✓ TurboCore
✓ 3.7, 4.0 or 4.25 GHz
✓ 1,000 VMs* with PowerVM
✓ Capacity on Demand
✓ Enterprise RAS
✓ 24x7 Warranty
✓ PowerCare
On October 7, IBM published a new SAP 2-tier Sales and Distribution benchmark result on the Power 795. The result is 70,032 users on a 128-core Power 795 running AIX and DB2. This is the highest result ever attained on this benchmark.
Power Systems Blades
✓ PS700 1 socket 4 core
✓ PS701 1 socket 8 core
✓ PS702 2 socket 16 core
✓ 3.0 GHz
POWER7 PS700 Blade 4 Cores
Architecture: 4 Core Single Socket
L2 & L3 Cache: On Chip
DDR3 Memory: Up to 64 GB
DASD / Bays: 0 - 2 SAS (300/600GB)
Daughter Card Options: CIOv & CFFh (PCIe Adapters)
Integrated Options: Dual Port Gb Ethernet, USB
Fiber Support: Yes (via BladeCenter chassis)
Media Bays: 1 (BladeCenter chassis)
Redundant Power: Yes (BladeCenter chassis)
Redundant Cooling: Yes (BladeCenter chassis)
Service Processor: Yes
Power & Thermal: POWER Save / Power Cap
POWER7 PS701 Blade 8 Cores
Architecture: 8 Core Single Socket
L2 & L3 Cache: On Chip
DDR3 Memory: Up to 128 GB
DASD / Bays: 0 - 1 SAS (300/600GB)
Daughter Card Options: CIOv & CFFh (PCIe Adapters)
Integrated Options: Dual Port Gb Ethernet, USB
Fiber Support: Yes (via BladeCenter chassis)
Media Bays: 1 (BladeCenter chassis)
Redundant Power: Yes (BladeCenter chassis)
Redundant Cooling: Yes (BladeCenter chassis)
Service Processor: Yes
Power & Thermal: POWER Save / Power Cap
POWER7 PS702 Blade 16 Cores
Architecture: 8 Cores/Socket, Two Socket
L2 & L3 Cache: On Chip
DDR3 Memory: Up to 256 GB
DASD / Bays: 0 - 2 SAS (300/600GB)
Daughter Card Options: CIOv & CFFh (PCIe Adapters)
Integrated Options: Quad Port Gb Ethernet, USB
Fiber Support: Yes (via BladeCenter chassis)
Media Bays: 1 (BladeCenter chassis)
Redundant Power: Yes (BladeCenter chassis)
Redundant Cooling: Yes (BladeCenter chassis)
Service Processor: Yes
Power & Thermal: POWER Save / Power Cap
i Edition Express for BladeCenter S
BladeCenter PS700 or JS12
IBM i
PowerVM Express
BladeCenter S
IBM i preloaded
The i Edition Express for BladeCenter S is the perfect alternative to a traditional rack or tower server with comparable starting prices and enables clients to run their i applications and consolidate x86 servers into a single BladeCenter S chassis that supports up to six blades and over 7 terabytes of disk storage
Power Systems Virtualization
• Hypervisor: support for multiple operating environments
• Dynamic Logical Partitioning: micro-partitioning, resource movement
• Multiple Shared Processor Pools: cap processor resources for a group of partitions
• Virtual I/O Server: virtualizes resources for client partitions
• Integrated Virtualization Manager: simplifies partition management for entry systems
• Lx86: supports x86 Linux applications
• Live Partition Mobility: move running AIX and Linux partitions
• Active Memory Sharing: share a memory pool among partitions
[Diagram: Power Hypervisor and VIOS]
IBM i 7.1 Highlights
DB2: Support for XML and column level encryption
PowerHA: Async Geographic Mirroring & LUN-level switching
Virtualization: IBM i 6.1 virtualization for i 7.1 partitions
Solid State Drives: Automatic movement of hot data to SSDs
Workload Capping: Limit # of cores used by middleware within a partition
Open Access for RPG: Extend application reach to pervasive devices
Zend Server Community Edition: PHP environment preloaded with IBM i
Systems Director: Richer management of IBM i via Systems Director
[Diagram: Power Systems with VIOS, IBM i 6.1 and IBM i 7.1 partitions]
Traditional IBM i Workload management
IBM i Workload Management
• Subsystems provide workload isolation
• Priorities are used to schedule work
• No way to cap a given application to a subset of the processor resources in a partition
All workloads can access the full number of Cores in the Partition
Application 3 = 8 Cores
Application 2 = 8 Cores
Application 1 = 8 Cores
IBM i System / Partition
IBM i Workload Capping
IBM i workload capping can control workloads by limiting the number of cores that can be used by an application
Application 3 = 8 cores
Application 2 = 6 Cores
Application 1 = 3 Cores
IBM i System / Partition
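The difference between the two slides can be shown with a short sketch: without a cap every application can use all cores in the partition, while a cap limits it to a subset. The function and the `CAPS` dictionary are hypothetical illustrations using the slide's numbers for an 8-core partition; they are not the IBM i interface for configuring workload capping.

```python
PARTITION_CORES = 8  # cores in the example partition on the slide

# Hypothetical cap table; None means the application is not capped.
CAPS = {"Application 1": 3, "Application 2": 6, "Application 3": None}


def usable_cores(partition_cores, cap=None):
    """Return how many cores an application may use in the partition."""
    if partition_cores < 1:
        raise ValueError("partition_cores must be at least 1")
    if cap is None:
        return partition_cores          # traditional behaviour: no cap
    if cap < 1:
        raise ValueError("cap must be at least 1")
    return min(partition_cores, cap)    # a cap cannot exceed the partition


if __name__ == "__main__":
    for app, cap in CAPS.items():
        print(app, usable_cores(PARTITION_CORES, cap))
```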
Back to basics
Back to the future
IBM i and Cloud Computing
• Best platform for “private cloud”
• Centralized model
• Server consolidation
• Bring back the complexity into the “computer room”
• No more “personal” workstations and company’s data stored on users’ disks
• Low-TCO terminals
• SOA approach to integrate third party “SaaS” solutions
IBM refreshes CloudBurst line with Power7 chips - 14 October 2010
A rack with a single Power 750 server and 32 processor cores can run up to 160 virtual machines, while the top end system, with 11 Power 750 servers in five racks, can run up to 2,960 virtual machines,