Processor Selection For Optimum Middleware Price / Performance
David A. Kra
dakra137 at gmail.com
Which processor is most cost effective to run expensive middleware for your application?
One Page Summary
Expensive middleware:
• is priced per core differently for different processors;
• its costs overshadow server & operating system costs.
1. Assess using Performance per Weighted Core (P/WC) rather than Performance per Core.
2. The observed best to worst ratio exceeds 7x in one case.
3. Results vary not only with middleware and application, but also with the scale.
4. For TPC-C on Oracle DBMS, the Hyper-threaded XEONs have the best:
  • P/WC rating;
  • Clock-independent architecture: P/WC per GHz.
5. For TPC-C on Oracle, similar results come from using IBM’s PVU rather than Oracle’s Core Weighting Factor.
See below for TPC-H and SAP results as well.
One Page Summary
Expensive middleware:
• is priced per core differently for different processors, servers, etc.;
• its costs overshadow server & operating system costs.
1. Assess using Performance per Weighted Core (P/WC) rather than Performance per Core.
2. The observed best to worst ratio exceeds 7x in one case.
3. Results vary not only with middleware and application, but also with the scale.
4. For TPC-C on Oracle DBMS, the Hyper-threaded XEONs have the best:
  • P/WC rating;
  • Clock rate-independent architecture: P/WC per GHz.
5. For TPC-C on Oracle, similar results come from using IBM’s PVU rather than Oracle’s Core Weighting Factor.
6. For TPC-H on Oracle:
  • Itanium
    • Top @ 10,000
    • Bottom @ 3,000 and 1,000
  • SPARC
    • Top (along with Power) @ 3,000
    • 2nd @ 10,000 and 1,000
  • Power
    • Top @ 3,000
  • Hyper-threaded Xeon
    • Top @ 1,000
7. SAP on Oracle
  • Hyper-threaded Xeon on top
8. SAP on DB2
  • Power7 on top
Acknowledgements
I wish to thank:
• The Transaction Processing Performance Council (TPC) and SAP for making it so easy to acquire benchmark results in readily usable form;
• Oracle and IBM for making their Core Weighting Factor and Processor Value Unit tables readily available; and especially,
• All the staff at all the companies who implemented, tuned, measured, and published the TPC and SAP benchmarks.
Key Question
How do you take into account “Core Weighting Factors,” “Processor Value Units,” etc. when looking to pick a processor to run EXPEN$IVE middleware?
Examples
Oracle Core Weighting Factor   Ratings Advantage   Example Processor
0.25                           4x                  Oracle SPARC T3
0.5                            2x                  Intel XEON 75xx
0.75                           1.33x               HP PA-RISC
1                              1x                  IBM POWER6
Answer
Use “Performance per Weighted Core” rather than “Performance per Core.”
Performance per Weighted Core
P/WC = P / (Cores x Weight)
[where P often is the maximum TPS rate that meets the response time requirements.]
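As a minimal sketch of the formula above, the following Python function computes P/WC. The function and parameter names are illustrative assumptions, not part of the original deck; the sample figures are the Xeon X5570 and POWER6 TPC-C entries from the results table later in this deck.

```python
def perf_per_weighted_core(performance, cores, weight):
    """Return P/WC = P / (cores x weight).

    performance: benchmark result (for example tpmC or QphH)
    cores:       total number of cores in the benchmarked system
    weight:      per-core licensing weight (for example Oracle's core weighting factor)
    """
    if cores <= 0 or weight <= 0:
        raise ValueError("cores and weight must be positive")
    return performance / (cores * weight)


# Intel Xeon X5570: 631,766 tpmC on 8 cores at a 0.5 core weighting factor.
xeon = perf_per_weighted_core(631766, 8, 0.5)
# IBM POWER6: 404,462 tpmC on 4 cores at a 1.0 core weighting factor.
power6 = perf_per_weighted_core(404462, 4, 1.0)

print(f"Xeon X5570 P/WC: {xeon:.1f}")
print(f"POWER6     P/WC: {power6:.1f}")
```

Dividing by the weighted core count rather than the raw core count is what lets a processor with a 0.5 weighting earn twice the rating of an equally fast processor weighted at 1.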
Question #2
So what is the most cost effective processor?
Answer #2a
That really does depend on:
• the middleware,
• the application, and
• its scale.
Answer #2b
This case study considers:
• Middleware: Oracle DBMS
• Applications:
  • TPC-C
  • TPC-H
• TPC-H scales:
  • 1,000
  • 3,000
  • 10,000
Key Points
Middleware $ >> everything else that varies, e.g. server HW & OS.
This case study avoided:
• Exact pricing, for the reason above
• Performing benchmarks
• Constraints on publishing results
TPC-C
Top 20 TPC-C Results (per ENTERPRISE weighted core)
Summary                               XEON   POWER   SPARC   Itanium
Top 5 TPC-C TPM per weighted core     4      1       0       0
Top 10 TPC-C TPM per weighted core    5      4       1       0
Top 20 TPC-C TPM per weighted core    10     6       1       3
Top 20 TPC-C Results (relative cost per EE weighted core) (as of Spring 2011)
Processor                                    Relative cost
Intel Xeon Processor X5570 2.93GHz           1.0   (Best !!)
Intel Quad-Core Xeon E5520 2.26GHz           1.3
Intel Xeon E5520 2.27 GHz                    1.4
IBM POWER6 - 4.7 GHz                         1.6
Intel Xeon X5650 6-core 2.66GHz              1.6
IBM POWER5+ - 2.2 GHz                        2.0
SPARC T3 1.65GHz                             2.3
Intel Xeon Quad-Core X5460 - 3.16 GHz        2.3
IBM POWER5 - 1.9 GHz                         2.3
IBM POWER5 - 1.9 GHz                         2.4
IBM POWER5 - 1.9 GHz                         2.4
IBM POWER5 - 1.9 GHz                         2.6
Intel Itanium2 Dual-Core - 1.6 GHz           2.7
Intel Xeon X7460 - 2.67 GHz                  3.0
Intel Xeon QC 5440 - 2.83 GHz                3.0
Intel Xeon X5355 - 2.66 GHz                  3.1
Intel Xeon X5355 - 2.66 GHz                  3.1
Intel Itanium2 Dual-Core - 1.6 GHz           3.1
Intel Xeon QC 5440 - 2.83 GHz                3.3
Intel Itanium2 Dual-Core - 1.6 GHz           3.5
Intel Itanium Dual-Core 9150M - 1.66 GHz     3.7
Intel Itanium2 - 1.6 GHz                     3.9
Highest costs: almost 4x the best !!
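The relative-cost figures in this chart appear consistent with dividing the best P/WC by each entry's P/WC. That reading is an assumption inferred from the published numbers, not a method stated in the deck. A minimal Python sketch under that assumption, with hypothetical function and variable names and a four-entry sample taken from the TPC-C results table:

```python
def relative_costs(pwc_by_processor):
    """Map each processor to (best P/WC) / (its own P/WC), so the best entry is 1.0.

    ASSUMPTION: relative middleware cost is inversely proportional to P/WC.
    """
    best = max(pwc_by_processor.values())
    return {name: best / pwc for name, pwc in pwc_by_processor.items()}


# TpmC per weighted core, taken from the Top 20 TPC-C table.
tpmc_per_weighted_core = {
    "Intel Xeon X5570 2.93GHz": 157942,
    "IBM POWER6 4.7GHz": 101116,
    "SPARC T3 1.65GHz": 70022,
    "Intel Itanium2 Dual-Core 1.6GHz": 44930,
}

costs = relative_costs(tpmc_per_weighted_core)
for name, cost in sorted(costs.items(), key=lambda item: item[1]):
    print(f"{name:35s} {cost:.1f}")
```

Read this way, a relative cost of 2.3 means roughly 2.3 times as many weighted cores, and therefore licenses, to reach the same throughput as the best entry.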
Key Point
Don’t be misled by the quantities of benchmarks!
The quantity of entries represents the willingness of vendors to perform benchmarks and publish results. All it shows is consistency even as other elements may vary, such as chipset, SAN attachment, storage subsystem, etc.
Top 20 TPC-C Results (per weighted core) (as of Spring 2011)
Processor                             Cores /     Cache / Proc   Clock   Cache / Core   TpmC /           TpmC       Cores   Weighted
                                      Processor   (MB)           (GHz)   (MB)           Weighted Core                       Core Factor
Intel Xeon X5570 2.93GHz              4           8              2.93    2              157942           631766     8       0.50
Intel QCore Xeon E5520 2.26GHz        4           8              2.26    2              119696           239392     4       0.50
Intel Xeon E5520 2.27 GHz             4           8              2.26    2              116001           232002     4       0.50
IBM POWER6 - 4.7 GHz                  2           8              4.7     4              101116           404462     4       1.00
Intel Xeon X5650 6-core 2.66GHz       6           12             2.66    2              96680            290040     6       0.50
IBM POWER5+ - 2.2 GHz                 -           -              -       -              78757            236271     4       0.75
SPARC T3 1.65GHz                      -           -              -       -              70022            30249688   1728    0.25
Intel Xeon QCore X5460 - 3.16GHz      4           12             3.16    3              68417            273666     8       0.50
IBM POWER5 - 1.9 GHz                  -           -              -       -              67813            203440     4       0.75
IBM POWER5 - 1.9 GHz                  -           -              -       -              66741            1601785    32      0.75
IBM POWER5 - 1.9 GHz                  -           -              -       -              64797            194391     4       0.75
IBM POWER5 - 1.9 GHz                  -           -              -       -              61841            371044     8       0.75
Intel Itanium2 Dual-Core - 1.6 GHz    -           -              -       -              57642            230569     4       1.00
Intel Xeon X7460 - 2.67 GHz           6           16             2.67    2.67           53271            639253     24      0.50
Intel Xeon QC 5440 - 2.83 GHz         4           12             2.83    3              52246            104492     4       0.50
Intel Xeon X5355 - 2.66 GHz           4           8              2.66    2              51227            102454     4       0.50
Intel Xeon X5355 - 2.66 GHz           4           8              2.66    2              50463            100926     4       0.50
Intel Itanium2 Dual-Core - 1.6 GHz    -           -              -       -              50207            200829     4       1.00
Intel Xeon QC 5440 - 2.83 GHz         4           12             2.83    3              48542            97083      4       0.50
Intel Itanium2 Dual-Core - 1.6 GHz    -           -              -       -              44930            359440     8       1.00
(- = not given)
Architecture Analysis
For TPC-C, the Hyper-threaded XEONs come out on top because of performance, an advantageous core weighting factor, and clock rate.
The POWER6s are next, due to sheer clock rate, in spite of the heaviest weighting.
The SPARC T3 would still be in the top 20 chart, ahead of some of the XEONs, even if it had the same weighting factor (0.5 rather than 0.25) as the XEONs.
The best Itanium result compares closely to the POWER5 results.
While the best XEON rating is 158K, the Itanium results vary from 58K down to 45K TpmC/WC rating units.
Source of Top Xeon’s Rating
Cache? No, the top three outperformed the Xeons with larger cache per core and larger total cache. 2MB per core was enough here.
Clock rate? No, the top three outperformed the Xeons with higher clock rates.
Hyper-threading? YES!
Core Weighting Factor? YES!
Separate Sections for Standard & Enterprise Editions’ Benchmarks & Weightings (as of Spring 2011)
Standard Edition (Standard Best !! = 1; highest 15.7x $)
Processor                                    Relative cost
Intel Xeon Processor X5570 2.93GHz           1
Intel Xeon X7460 - 2.67 GHz                  1.98
Intel Xeon X5650 6-core 2.66GHz              2.18
Intel Xeon Quad-Core X5460 - 3.16 GHz        2.31
Intel Quad-Core Xeon E5520 2.26GHz           2.64
Intel Xeon QC 5440 - 2.83 GHz                6.05
Intel Xeon X5355 - 2.66 GHz                  6.17
Intel Xeon QC 5440 - 2.83 GHz                6.51
Intel Itanium2 - 1.6 GHz                     15.67
Enterprise Edition (Enterprise Best !! = 1; highest 5.4x $)
Processor                                    Relative cost
IBM POWER6 - 4.7 GHz                         1
IBM POWER5+ - 2.2 GHz                        1.28
SPARC T3 1.65GHz                             1.44
IBM POWER5 - 1.9 GHz                         1.49
Intel Itanium2 Dual-Core - 1.6 GHz           1.75
Intel Itanium Dual-Core 9150M - 1.66 GHz     2.39
UltraSPARC T2 Plus 1.6GHz                    2.54
Intel Itanium2 Dual-Core 9050 - 1.6 GHz      2.6
IBM POWER4 - 1700 MHz                        3.16
Intel Itanium2 - 1.6 GHz                     3.16
Intel Itanium2 - 1.5 GHz                     5.31
Fujitsu SPARC64 - 1.3 GHz                    5.43
TPC-H
TPC-H Results Summary (per weighted core) (as of Spring 2011)
Rank   Scale 10,000              Scale 3,000                 Scale 1,000
1.     Itanium (9x40)            SPARC (64 VII) & POWER5     XEON (x5450)
2.     SPARC (UltraSPARC IV+)    Opteron (285)               SPARC 64 VI
3.     Itanium2                  Itanium (Itanium2 9050)     Itanium
TPC-H Results Summary (relative cost per weighted core) (as of Spring 2011)
[Bar chart: relative cost per weighted core at each scale, best result = 1, values in rank order]
Scale 10,000 (10,000 Best !! = 1; highest 2.4x $): 1, 1.22, 1.63, 2.12, 2.42
Scale 3,000 (3,000 Best !! = 1; highest 5.6x $): 1, 1.01, 1.21, 1.77, 1.95, 1.97, 2.15, 2.96, 2.99, 3.71, 3.8, 3.91, 5.61
Scale 1,000 (1,000 Best !! = 1; highest 7.4x $): 1, 1.85, 2.08, 2.37, 2.46, 2.69, 4.17, 4.23, 4.28, 4.84, 6.23, 7.33, 7.4
Analysis of TPC-H Results
TPC-H Scale   Range of Results
10,000        1:2.4
3,000         1:5.6
1,000         1:7.4
Selecting a suboptimal processor for TPC-H could increase costs by up to 7.4x.
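The ranges above can be reproduced from the best and worst Perf / Weighted Core values in the per-scale TPC-H tables that follow. A minimal Python sketch, assuming the range is simply the best P/WC divided by the worst; the function and variable names are hypothetical:

```python
def best_to_worst_ratio(pwc_values):
    """Return the ratio of the highest to the lowest P/WC in a set of results."""
    return max(pwc_values) / min(pwc_values)


# (best, worst) Perf / Weighted Core at each TPC-H scale, from the result tables.
extremes_by_scale = {
    10000: (1629, 674),
    3000: (2094, 373),
    1000: (4559, 616),
}

ratios = {scale: best_to_worst_ratio(values)
          for scale, values in extremes_by_scale.items()}
for scale, ratio in ratios.items():
    print(f"Scale {scale:>6,}: 1:{ratio:.1f}")
```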
Analysis of TPC-H Results
Scale really matters!
• Itanium
  • Top @ 10,000
  • Bottom @ 3,000 and 1,000
• SPARC
  • Top (along with Power) @ 3,000
  • 2nd @ 10,000 and 1,000
• Greater gaps between groups of results @ 10,000 and 1,000 than @ 3,000
TPC-H 10,000 Results
Processor                                   Perf / Weighted Core   QphH      Cores   Core Factor
Intel Itanium Dual-Core 9140 - 1.6 GHz      1629                   208458    128     1
Intel Dual Core Itanium 2 9040 - 1.6 GHz    1339                   171380    128     1
Sun UltraSparc IV+ - 1500 MHz               1001                   108100    144     0.75
Intel Itanium2 - 1.5 GHz                    767                    49105     64      1
Intel Itanium2 - 1.5 GHz                    674                    86283     128     1
TPC-H 1,000 Results
Processor                           Perf / Weighted Core   QphH      Cores   Core Factor
Intel QCore Xeon X5450 3.0GHz       4559                   1166976   512     0.5
Sun SPARC64 VI - 2400 MHz           2470                   118573    64      0.75
Intel Itanium QC 9350 - 1.73GHz     2190                   140181    64      1
Intel DC Itanium 9140 1.6GHz        1927                   123323    64      1
AMD DC Opteron 875HE 2.2GHz         1855                   59354     64      0.5
Intel Itanium2 DC 9050 - 1.6 GHz    1696                   27144     16      1
Intel Itanium2 - 1.6 GHz            1093                   34988     32      1
Fujitsu SPARC64 - 1.3 GHz           1078                   34493     64      0.5
Intel Itanium2 - 1.6 GHz            1064                   68101     64      1
Intel Itanium2 - 1.6 GHz            942                    15070     16      1
AMD Opteron 848 - 2.2 GHz           732                    35141     48      1
Intel Itanium2 - 1.3 GHz            622                    9951      16      1
Intel Itanium2 - 1500 MHz           616                    9853      16      1
TPC-H 3,000 Results
Processor                           Perf / Weighted Core   QphH      Cores   Core Factor
IBM POWER5 - 1.9 GHz                2094                   100512    64      0.75
SPARC64 VII 2880MHz                 2072                   198908    128     0.75
AMD Opteron DC 285 - 2.6 GHz        1728                   110577    128     0.5
Intel Itanium2 DC 9050 - 1.6 GHz    1182                   37814     32      1
Fujitsu SPARC64 - 1.3 GHz           1073                   34345     64      0.5
Sun UltraSPARC IV+ - 1800 MHz       1062                   114714    144     0.75
Sun UltraSparc IV+ - 1500 MHz       976                    105431    144     0.75
Intel Itanium2 - 1.5 GHz            707                    45248     64      1
Intel Xeon MP - 3.0 GHz             700                    22388     32      1
HP PA-RISC 8700 - 875 MHz           564                    27094     64      0.75
Sun UltraSPARC IV - 1200 MHz        550                    59436     144     0.75
Sun UltraSPARC III Cu - 1200 MHz    536                    28948     72      0.75
HP PA-RISC 8700 - 750 MHz           373                    17908     64      0.75
TPC-H Results Detail
Processor                                      Perf / Weighted Core   QphH      Cores   Core Factor
Scale 10,000
Intel Itanium Dual-Core 9140 - 1.6 GHz         1629                   208458    128     1
Intel Dual Core Itanium 2 9040 - 1.6 GHz       1339                   171380    128     1
Sun UltraSparc IV+ - 1500 MHz                  1001                   108100    144     0.75
Intel Itanium2 - 1.5 GHz                       767                    49105     64      1
Intel Itanium2 - 1.5 GHz                       674                    86283     128     1
Scale 3,000
IBM POWER5 - 1.9 GHz                           2094                   100512    64      0.75
SPARC64 VII 2880MHz                            2072                   198908    128     0.75
AMD Opteron Dual-Core Model 285 - 2.6 GHz      1728                   110577    128     0.5
Intel Itanium2 DC 9050 - 1.6 GHz               1182                   37814     32      1
Fujitsu SPARC64 - 1.3 GHz                      1073                   34345     64      0.5
Sun UltraSPARC IV+ - 1800 MHz                  1062                   114714    144     0.75
Sun UltraSparc IV+ - 1500 MHz                  976                    105431    144     0.75
Intel Itanium2 - 1.5 GHz                       707                    45248     64      1
Intel Xeon MP - 3.0 GHz                        700                    22388     32      1
HP PA-RISC 8700 - 875 MHz                      564                    27094     64      0.75
Sun UltraSPARC IV - 1200 MHz                   550                    59436     144     0.75
Sun UltraSPARC III Cu - 1200 MHz               536                    28948     72      0.75
HP PA-RISC 8700 - 750 MHz                      373                    17908     64      0.75
Scale 1,000
Intel Quad-Core Xeon X5450 3.0GHz              4559                   1166976   512     0.5
Sun SPARC64 VI - 2400 MHz                      2470                   118573    64      0.75
Intel Itanium Quad-core 9350 - 1.73GHz         2190                   140181    64      1
Intel Dual-Core Itanium 9140 1.6GHz            1927                   123323    64      1
AMD Dual Core Opteron Model 875HE - 2.2 GHz    1855                   59354     64      0.5
Intel Itanium2 DC 9050 - 1.6 GHz               1696                   27144     16      1
Intel Itanium2 - 1.6 GHz                       1093                   34988     32      1
Fujitsu SPARC64 - 1.3 GHz                      1078                   34493     64      0.5
Intel Itanium2 - 1.6 GHz                       1064                   68101     64      1
Intel Itanium2 - 1.6 GHz                       942                    15070     16      1
AMD Opteron Model 848 - 2.2 GHz                732                    35141     48      1
Intel Itanium2 - 1.3 GHz                       622                    9951      16      1
Intel Itanium2 - 1500 MHz                      616                    9853      16      1
Futures
•Clock rate will grow
•Core weightings will change
•Different architecture
Clock Rate Will Grow
Compare architectures.
Factor out clock rate, so you can estimate the performance of new, higher clock rate chips.
Use: Performance per Weighted Core per GHz
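A minimal Python sketch of this metric, with hypothetical function names. The projection helper assumes performance scales linearly with clock rate within one architecture, which is a simplifying assumption of this sketch rather than a claim made in the deck; the 3.2 GHz target is likewise a made-up example.

```python
def pwc_per_ghz(performance, cores, weight, clock_ghz):
    """Return Performance per Weighted Core per GHz."""
    return performance / (cores * weight) / clock_ghz


def projected_pwc(per_ghz_rating, new_clock_ghz):
    """Estimate P/WC at a new clock rate.

    ASSUMPTION: performance scales linearly with clock rate for the same
    architecture; real chips rarely scale this cleanly.
    """
    return per_ghz_rating * new_clock_ghz


# Figures from the TPC-C results table: tpmC, cores, core weighting factor, GHz.
xeon_x5570 = pwc_per_ghz(631766, 8, 0.5, 2.93)
sparc_t3 = pwc_per_ghz(30249688, 1728, 0.25, 1.65)

print(f"Xeon X5570: {xeon_x5570:.0f} tpmC / weighted core / GHz")
print(f"SPARC T3:   {sparc_t3:.0f} tpmC / weighted core / GHz")
print(f"SPARC T3 relative to Xeon X5570: {sparc_t3 / xeon_x5570:.2f}")

# Hypothetical 3.2 GHz part on the same architecture as the X5570.
print(f"Projected P/WC at 3.2 GHz: {projected_pwc(xeon_x5570, 3.2):.0f}")
```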
TPC-C Performance per Weighted Core per GHz
Processor                   Relative Weighted       TpmC / Weighted   Clock   Perf / Weighted   TpmC
                            Architecture &          Core / GHz        (GHz)   Core
                            Design Rating
Xeon X5570 2.93GHz          1.00                    53905             2.93    157942            631766
SPARC T3 1.65GHz            0.79                    42438             1.65    70022             30249688
Xeon X5650 6C 2.66GHz       0.67                    36346             2.66    96680             290040
Itanium2 DC 1.6 GHz         0.67                    36026             1.6     57642             230569
POWER5+ 2.2 GHz             0.66                    35799             2.2     78757             236271
POWER5 1.9 GHz              0.66                    35691             1.9     67813             203440
Itanium2 DC - 1.6 GHz       0.58                    31380             1.6     50207             200829
Xeon QC X5460 - 3.16 GHz    0.40                    21651             3.16    68417             273666
POWER6 4.7 GHz              0.40                    21514             4.7     101116            404462
Xeon X7460 - 2.67 GHz       0.37                    19952             2.67    53271             639253
Similar rows omitted
Top Result
For TPC-C, Hyper-threaded Xeons rate highest in Performance per Weighted Core per GHz.
Core Weightings Will Change
Hypothetically:
What would happen if Oracle adopted core weightings proportional to IBM’s Processor Value Units (PVUs)?
Answer
For TPC-C:
• Hyper-threaded Xeons stay on top
• SPARC drops 24 steps
• Power rises 1 step and drops 1, 3, 5, & 6 steps
• Itanium rises 4 & 5 steps
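The kind of ranking shift described above can be sketched in Python by ranking the same benchmark results under two weighting schemes. The names below are hypothetical, and the alternative weights are made-up placeholders chosen only to show the mechanics; they are NOT IBM's published PVU values. The tpmC, core counts, and Oracle factors come from the TPC-C results table.

```python
def rank_by_pwc(results, weights):
    """Rank processors by P/WC under a weighting scheme (1 = best).

    results: {name: (performance, cores)}
    weights: {name: per-core weight}
    """
    pwc = {name: perf / (cores * weights[name])
           for name, (perf, cores) in results.items()}
    ordered = sorted(pwc, key=pwc.get, reverse=True)
    return {name: position + 1 for position, name in enumerate(ordered)}


# tpmC and core counts from the TPC-C results table.
results = {
    "Xeon X5570": (631766, 8),
    "POWER6": (404462, 4),
    "SPARC T3": (30249688, 1728),
    "Itanium2 DC": (230569, 4),
}
oracle_weights = {"Xeon X5570": 0.5, "POWER6": 1.0, "SPARC T3": 0.25, "Itanium2 DC": 1.0}
# HYPOTHETICAL alternative weights, for illustration only (not real PVU figures).
alt_weights = {"Xeon X5570": 0.5, "POWER6": 1.0, "SPARC T3": 0.5, "Itanium2 DC": 1.0}

oracle_order = rank_by_pwc(results, oracle_weights)
alt_order = rank_by_pwc(results, alt_weights)
for name in results:
    # Positive difference = the processor rises under the alternative weights.
    difference = oracle_order[name] - alt_order[name]
    print(f"{name:12s} {oracle_order[name]:>2} {alt_order[name]:>2} {difference:+d}")
```

The sign convention matches the ranking table that follows: order under Oracle's factors minus order under the alternative weights, so a negative number is a drop.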
TPC-C Ranking Shifts Due to Alternative Weightings
Server CPU Type                           Order due to Oracle       Order due to IBM          Order
                                          Core Weighting Factors    Processor Value Units     Difference
Intel Xeon Processor X5570 2.93GHz        1                         1                         0
Intel Xeon E5520 2.27 GHz                 3                         4                         -1
IBM POWER6 - 4.7 GHz                      4                         3                         1
Intel Xeon X5650 6-core 2.66GHz           5                         5                         0
IBM POWER5+ - 2.2 GHz                     6                         7                         -1
SPARC T3 1.65GHz                          7                         31                        -24
Intel Xeon Quad-Core X5460 - 3.16 GHz     8                         6                         2
IBM POWER5 - 1.9 GHz                      9                         12                        -3
IBM POWER5 - 1.9 GHz                      10                        15                        -5
IBM POWER5 - 1.9 GHz                      12                        18                        -6
Intel Itanium2 Dual-Core - 1.6 GHz        13                        8                         5
Intel Xeon X7460 - 2.67 GHz               14                        9                         5
Intel Xeon X5355 - 2.66 GHz               17                        13                        4
Intel Itanium2 Dual-Core - 1.6 GHz        18                        14                        4
Intel Xeon QC 5440 - 2.83 GHz             19                        17                        2
Similar rows omitted
Different Architecture
Q: What will core weighting factors and PVUs be for the new AMD Bulldozer processors? (They are somewhat less than 2 cores per processor building block.)
A: I don’t know.