
Understanding Performance Benchmarks

BY SHARON HANSON, DIEGO ESTEVES, AND CLINT ESPINOZA

Benchmarks provide objective information that can be used to compare computer platforms, components, operating systems, and specific system configurations. This article discusses characteristics of credible benchmarks, guidelines for evaluating benchmark results, and some of the main benchmarks used at Dell for assessing the performance of server, workstation, and client systems.

At their best, performance benchmarks provide impartial information that can be used to evaluate and compare the performance of computer systems. Dell and the computer industry promote objective and credible benchmarking in various ways, including participation in standards bodies such as the Standard Performance Evaluation Corporation (SPEC), Business Applications Performance Corporation (BAPCo), Transaction Processing Performance Council (TPC), and Storage Performance Council. When properly run and documented, the benchmarks produced by these and other groups help provide objective information that can be used to compare computer platforms, components, operating systems, and specific system configurations.

Dell is committed to furthering industry practices that yield objective industry-standard benchmark results. Organizations can use these benchmarks to evaluate and compare Dell™ systems to competitors' systems. Dell also uses the benchmarks when developing new products and assessing new technologies.

The Dell benchmark philosophy is based on three tenets:

• Benchmark in a way that closely resembles how organizations use applications on Dell systems
• Ensure that anyone can reproduce results with a system shipped directly from Dell, using publicly available drivers
• Promote benchmark and run-rule changes that reflect this approach to benchmarking

This article discusses characteristics of credible benchmarks and presents high-level guidelines for evaluating benchmark results. It concludes with a list of the key benchmarks used at Dell to evaluate server, workstation, and client system performance.

Characteristics of credible performance benchmarks

A computer performance benchmark is a standard by which a computer system can be measured and judged. Many of the well-known benchmarks are developed and regulated by standards organizations such as SPEC and BAPCo. Just as common are unregulated benchmarks that measure system performance when running specific applications such as Adobe® Photoshop®, Microsoft® Exchange, Parametric® Pro/E®, or Id Software® Quake III® software. These benchmarks can help administrators evaluate system performance on a single, critical application such as Pro/E or Microsoft Exchange. Such benchmarks can be run, and their results reported, with varying degrees of flexibility.


In contrast, regulated benchmarks tend to have well-defined and documented methodologies, and their results are documented and reproducible. A good example is the SPEC® CPU2000 benchmark, which is produced by SPEC, a nonprofit corporation. According to SPEC, the organization's mission is to establish, maintain, and endorse a standardized set of relevant benchmarks. SPEC develops suites of benchmarks and also reviews and publishes submitted results from member organizations and other benchmark licensees.¹ The SPEC organization has industry-wide representation, and its benchmark suites are well accepted and credible.

The SPEC CPU2000 benchmark provides performance measurements that can be used to compare compute-intensive workloads (both integer and floating point) on different computer systems. These compute-intensive benchmarks measure the performance of a system's processor, memory architecture, and compiler. CPU2000 consists of a set of objective tests that must be compiled and run according to SPEC run rules. SPEC provides the benchmarks as source code so they can be compiled to run on a variety of platforms, including industry-standard Intel® architecture–based systems and SPARC® processor–based Sun™ systems.

In addition, SPEC provides guidelines for legitimately optimizing the performance of tested systems on the benchmark. These guidelines are designed to ensure that the hardware and software configurations of tested systems are suitable to run real-world applications. The organization also requires a full disclosure report, which provides benchmark results and configuration details sufficient to independently reproduce the results. SPEC encourages submission of reports for publication on the SPEC Web site (http://www.spec.org). These reports undergo a peer-review process before publication. Because of these rigorous requirements, CPU2000 benchmark results that are published on the SPEC Web site are widely used to compare the CPU, memory, and compiler performance of client and server systems.

BAPCo, TPC, and the Storage Performance Council are also nonprofit corporations that provide industry-standard benchmarks widely used to compare the performance of client, server, and storage systems. TPC was founded to define transaction processing and database benchmarks. The BAPCo charter is to develop and distribute a set of objective performance benchmarks based on popular computer applications and industry-standard operating systems. The goal of the Storage Performance Council is to define, promote, and enforce vendor-neutral benchmarks that characterize the performance of storage subsystems.

Guidelines for evaluating benchmark results

When using benchmark results to evaluate and compare systems, administrators should understand the benchmark, be aware of system optimizations, and ensure comparable system comparisons, as follows.

Understand the benchmark

It is essential to understand which aspects of system performance a benchmark is testing as well as what the system's workload will be. Those who are evaluating benchmarks should consider whether the benchmark workload is reasonably representative of the real-world applications that will be run on the system. For instance, if a client system will be used to run mainstream business productivity applications, the BAPCo SYSmark® or Ziff Davis® Business Winstone® benchmarks are good candidates.² On the other hand, if the test subject is a workstation system that will be used primarily to run Pro/E, the Pro/E application benchmark is suitable. If possible, those who are evaluating benchmarks should focus on regulated benchmarks from standards bodies such as SPEC and BAPCo or on benchmarks that are standard industry applications.

Application benchmarks can be run with a variety of inputs, each of which attempts to represent a different usage scenario. For example, Adobe Photoshop performance varies greatly depending on the size of the image and the operations performed on it. Moreover, some Photoshop operations may be better suited or optimized for a particular system architecture. Even within a particular operation (such as the Gaussian Blur filter), the end user may be able to modify how the filter is applied. Different code algorithms may be used, resulting in significantly different performance results. These variables make it relatively easy to create a suite of Photoshop benchmark operations that greatly favors a particular system architecture. For this reason, Dell recommends that organizations look beyond summary benchmark results to help ensure that the operations performed are representative of their specific usage models.
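
As a hypothetical illustration of how algorithm choice alone can swing results, consider two mathematically equivalent ways to implement a blur filter: a direct 2-D convolution that does O(r²) work per pixel for radius r, and a separable version that does O(r). The C sketch below uses a simple box blur as a stand-in for a Gaussian; the function names and data layout are assumptions for illustration, not Photoshop's actual implementation.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Direct 2-D blur: O(r*r) additions per pixel. Image is grayscale,
 * w x h, row-major; a box blur stands in for a Gaussian. */
void blur_direct(const float *src, float *dst, int w, int h, int r)
{
    for (int y = 0; y < h; y++)
        for (int x = 0; x < w; x++) {
            float sum = 0.0f;
            int n = 0;
            for (int dy = -r; dy <= r; dy++)
                for (int dx = -r; dx <= r; dx++) {
                    int yy = y + dy, xx = x + dx;
                    if (yy >= 0 && yy < h && xx >= 0 && xx < w) {
                        sum += src[yy * w + xx];
                        n++;
                    }
                }
            dst[y * w + x] = sum / n;
        }
}

/* Separable blur: one horizontal and one vertical 1-D pass, O(r) per
 * pixel. Matches blur_direct for interior pixels. */
void blur_separable(const float *src, float *dst, float *tmp,
                    int w, int h, int r)
{
    for (int y = 0; y < h; y++)           /* horizontal pass */
        for (int x = 0; x < w; x++) {
            float sum = 0.0f; int n = 0;
            for (int dx = -r; dx <= r; dx++) {
                int xx = x + dx;
                if (xx >= 0 && xx < w) { sum += src[y * w + xx]; n++; }
            }
            tmp[y * w + x] = sum / n;
        }
    for (int y = 0; y < h; y++)           /* vertical pass */
        for (int x = 0; x < w; x++) {
            float sum = 0.0f; int n = 0;
            for (int dy = -r; dy <= r; dy++) {
                int yy = y + dy;
                if (yy >= 0 && yy < h) { sum += tmp[yy * w + x]; n++; }
            }
            dst[y * w + x] = sum / n;
        }
}

int main(void)
{
    enum { W = 1024, H = 1024, R = 10 };
    float *src = calloc(W * H, sizeof *src);
    float *dst = calloc(W * H, sizeof *dst);
    float *tmp = calloc(W * H, sizeof *tmp);
    if (!src || !dst || !tmp) return 1;

    clock_t t0 = clock();
    blur_direct(src, dst, W, H, R);
    double direct = (double)(clock() - t0) / CLOCKS_PER_SEC;

    t0 = clock();
    blur_separable(src, dst, tmp, W, H, R);
    double separable = (double)(clock() - t0) / CLOCKS_PER_SEC;

    printf("direct: %.2f s  separable: %.2f s\n", direct, separable);
    return 0;
}
```

For a radius of 10, the direct version performs roughly 441 additions per pixel versus about 42 for the separable version, so the same benchmark operation can yield an order-of-magnitude difference depending on which path an application, or a benchmark author, chooses.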

Be aware of system optimizations

Some optimization of the tested system is expected and allowed on all benchmarks. SPEC outlines broad optimization guidelines in its run rules for each benchmark. The intent of these guidelines is to avoid optimizations that are so extreme as to render the system unsuitable for real-world applications. For example, when running SPEC benchmarks, Dell often uses publicly available compilers that support new CPU features. These features can improve system performance and better demonstrate the capability of Dell systems. This practice conforms to the spirit of the SPEC guidelines. The compilers are publicly available to software developers to use in building their own applications; therefore, the benchmark results are representative of possible real-world applications.
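
To make the compiler's role concrete, the following is a minimal sketch of the kind of loop whose benchmark score depends heavily on compiler flags. The build lines in the comment are typical gcc options assumed for illustration; an actual SPEC full disclosure report documents the exact compiler versions and flags used.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* A trivial compute kernel whose measured performance depends heavily
 * on the compiler and its flags. Hypothetical build lines:
 *
 *   gcc -O2 saxpy.c -o saxpy_base                     # conservative build
 *   gcc -O3 -march=pentium4 -msse2 saxpy.c -o saxpy   # allow SSE2 use
 */
void saxpy(int n, float a, const float *x, float *y)
{
    for (int i = 0; i < n; i++)
        y[i] = a * x[i] + y[i];          /* loop a compiler may vectorize */
}

int main(void)
{
    enum { N = 1 << 20, REPS = 100 };
    float *x = malloc(N * sizeof *x);
    float *y = malloc(N * sizeof *y);
    if (!x || !y) return 1;
    for (int i = 0; i < N; i++) { x[i] = 1.0f; y[i] = 2.0f; }

    clock_t t0 = clock();
    for (int r = 0; r < REPS; r++)
        saxpy(N, 0.5f, x, y);
    double secs = (double)(clock() - t0) / CLOCKS_PER_SEC;

    /* Print a result element so the work cannot be discarded as dead code. */
    printf("saxpy: %.3f s for %d runs (y[0] = %.1f)\n", secs, REPS, y[0]);
    free(x);
    free(y);
    return 0;
}
```

Because both binaries come from a publicly available compiler, either result represents performance that an application developer could reasonably obtain.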

In contrast, it is not uncommon for a benchmark to be run on a system that has been specially tuned to do well on the benchmark. Such tuning can be so extreme that the benchmark results are neither credible nor useful. Even regulated benchmarks can be misused in this way, so it is important that the benchmark results include complete configuration information for the tested system.

Ensure comparable system comparisons

When comparing the benchmark results of systems from multiple vendors, test engineers should ensure that the tested systems and their benchmark settings are comparable. This requires organizations conducting benchmark tests to supply adequate documentation for system and benchmark configurations.

Benchmarking at Dell

Dell uses benchmarks throughout the technology assessment and systems development process to help ensure that Dell server and client systems provide the appropriate balance of performance, features, cost, quality, and reliability. Dell supports industry efforts to standardize performance benchmarks and is an active participant in all the standards bodies discussed in this article. Figures 1 and 2 list key benchmarks that Dell uses to evaluate the performance of server and client systems.

When used appropriately, benchmarks can provide valuable information that can help administrators compare and evaluate computer systems. In addition to benchmarks, many other factors should weigh heavily in the evaluation process, including features, support, and price, as well as the ability to service, upgrade, and manage the system under consideration.

Sharon Hanson (sharon_[email protected]) is a technical writer in the office of the Dell CTO. She has written and produced Dell white papers and technical articles on industry technology trends for the past eight years. Sharon has a B.B.A. from The University of Texas at Austin.

Diego Esteves ([email protected]) is a systems engineer and consultant currently working on Dell Precision™ workstation SPEC performance and independent software vendor (ISV) application certifications. Diego has a B.S.B.A. from Xavier University in Cincinnati, Ohio. He currently represents Dell on the SPEC CPU subcommittee, the body responsible for the industry-standard SPEC CPU2000 benchmarks.

Clint Espinoza ([email protected]) is a storage performance engineer specializing in RAID adapter performance. Clint has a B.A. from Trinity University in San Antonio, Texas.


Figure 1. Typical server benchmarks

Database
• Online transaction processing (OLTP): TPC-C
• Decision support: TPC-H and TPC-R
• Java™: SPECjbb®

Messaging
• Microsoft Exchange: MAPI (Messaging Application Programming Interface) Messaging Benchmark 2 (MMB2) and MMB3
• Lotus Notes®: NotesBench®
• Simple Mail Transfer Protocol (SMTP)/Post Office Protocol 3 (POP3): SPECmail® 2001

Web services
• Hypertext Transfer Protocol (HTTP): SPECweb®99
• HTTP over SSL (HTTPS): SPECweb99_SSL

File and print services
• Ziff Davis NetBench®
• SPECsfs®

Storage
• SPC Benchmark 1™ (SPC-1)
• Iometer

CPU and high-performance computing
• SPEC CPU2000
• Linpack
• NASTRAN®
• STREAM
• Hierarchical INTegration (HINT)

Microbenchmarks*
• LMbench
• Netperf

*A microbenchmark measures one specific feature of a system isolated from other features (a minimal example appears after Figure 2).

Figure 2. Typical client system benchmarks

Business productivity
• SYSmark 2004
• Content Creation Winstone 2004
• Business Winstone 2004

Mainstream 3-D performance
• Futuremark® 3DMark®

CPU, memory subsystem, and compiler
• SPEC CPU2000
• Linpack

Gaming
• Quake III
• Epic Games® Unreal Tournament 2003
• Ubisoft™ Splinter Cell®

Portable computer battery life
• Business Winstone 2002 BatteryMark®
• BAPCo MobileMark® 2002

3-D graphics
• SPECviewperf® 7.1

Mechanical computer-aided design (MCAD)
• SPECapc for Pro/ENGINEER™
• NASTRAN

2-D graphics
• Photoshop
• Autodesk® AutoCAD®
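
To illustrate the microbenchmark category in Figure 1, the following is a minimal C sketch in the spirit of LMbench's memory-latency test: a pointer chase in which every load depends on the previous one, isolating memory latency from compute speed. The buffer size, stride, and iteration count are arbitrary assumptions; a real tool such as LMbench sweeps working-set sizes and reports latency at each cache level, which this toy does not attempt.

```c
#include <stdio.h>
#include <stdlib.h>
#include <time.h>

/* Minimal pointer-chasing sketch: each load depends on the previous
 * one, so loop time reflects load latency for the chosen working set,
 * isolated from arithmetic speed and other system features. */
#define WORKING_SET (8 * 1024 * 1024)   /* 8 MB: larger than typical caches */
#define STRIDE      64                  /* one typical cache line */
#define NODES       (WORKING_SET / STRIDE)
#define ITERS       10000000L

int main(void)
{
    char *buf = malloc(WORKING_SET);
    long *order = malloc(NODES * sizeof *order);
    if (!buf || !order) return 1;

    /* Link the nodes into one random cycle (Sattolo's algorithm) so the
     * chase touches every node in an order the prefetcher cannot guess. */
    for (long i = 0; i < NODES; i++) order[i] = i;
    srand(1);
    for (long i = NODES - 1; i > 0; i--) {
        long j = rand() % i, t = order[i];
        order[i] = order[j];
        order[j] = t;
    }
    for (long i = 0; i < NODES; i++)
        *(void **)(buf + i * STRIDE) = buf + order[i] * STRIDE;
    free(order);

    void **p = (void **)buf;
    clock_t t0 = clock();
    for (long i = 0; i < ITERS; i++)
        p = *p;                         /* serialized, dependent loads */
    double secs = (double)(clock() - t0) / CLOCKS_PER_SEC;

    /* Print p so the compiler cannot discard the chase as dead code. */
    printf("%.1f ns per load (%p)\n", secs * 1e9 / ITERS, (void *)p);
    free(buf);
    return 0;
}
```

Rerunning the sketch with progressively larger WORKING_SET values shows the time per load rising as the working set spills out of each cache level, which is exactly the kind of single, isolated system feature the Figure 1 footnote describes.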

NOTES

¹ For more information about SPEC, visit http://www.spec.org.
² For more information about the BAPCo SYSmark benchmark, visit www.bapco.com; for more information about the Ziff Davis Winstone benchmark, visit http://www.veritest.com/benchmarks/bwinstone/default.asp.

FOR MORE INFORMATION

SPEC: http://www.spec.org

BAPCo: http://www.bapco.com

TPC: http://www.tpc.org

Storage Performance Council: http://www.storageperformance.org
