Memory System Characterization of Commercial Workloads
L.A. Barroso, K. Gharachorloo and E. Bugnion
Western Research Laboratory, Digital Equipment Corporation
Presented by: Eric Carty-Fickes
Introduction
• commercial workloads > engineering
– but most still using scientific benchmarks (in 1998)
• difficult to create commercial benchmarks
– large, expensive, proprietary, changing
• paper uses commercial workloads to study current trends
Database Workloads
• first two run on Oracle DB server
• OLTP
– small r/w queries on part of DB
– models banking req’s in dedicated mode
– more kernel time; hides I/O
• DSS (decision support systems)
– long read-only queries on much of DB
– models wholesaler’s SQL queries
– fewer context switches
Database Workloads
• Web Index Search
– doesn’t require DB server
– multiple threads hide misses
– read-only requests and cached recent searches
Test Systems
• 4-processor AlphaServer 4100 and 8-processor 8400 for hardware testing
– IPROBE tool for event counting
– DCPI for profiling
– ATOM for studying Oracle
• SimOS for testing architectural changes
– models Alpha 21164
– simplified, but still with some detail
Aspects of Testing
• 3 issues: memory size, I/O bandwidth, runtime
– scale down DB
– change block buffer cache sizes
• OLTP and DSS: need to warm up SGA before testing; need to scale DB to be resident
• Web Index: no scaling – same system
Hardware Results
• OLTP
– higher CPI, maybe due to TPC-B
– long secondary cache latency
– lots of primary cache misses, esp. Icache
– dirty miss latency significant, lots of communication
• DSS
– lower CPI means this config works
– only suggestion is larger 1st level caches
• AltaVista
– use 8400 just like original
– good CPI, well written code
– 1st level caches important
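The CPI comparisons above come from hardware event counts. A minimal sketch of that arithmetic follows, assuming raw cycle, instruction, and miss counts are already available; the function names and all numbers are hypothetical and are not taken from the paper or from any real counter tool.

```python
def cpi(cycles: int, instructions: int) -> float:
    """Cycles per instruction from two raw event counts."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return cycles / instructions


def misses_per_kilo_instruction(misses: int, instructions: int) -> float:
    """Miss count normalized to 1000 retired instructions."""
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return 1000.0 * misses / instructions


def stall_cpi(misses: int, miss_latency_cycles: int, instructions: int) -> float:
    """Rough CPI contribution of one miss class.

    Assumes every miss stalls the processor for the full latency,
    with no overlap between misses (a simplification).
    """
    if instructions <= 0:
        raise ValueError("instructions must be positive")
    return misses * miss_latency_cycles / instructions


if __name__ == "__main__":
    # Hypothetical counts, for illustration only.
    instructions = 1_000_000
    cycles = 7_000_000
    secondary_cache_misses = 10_000
    print("CPI:", cpi(cycles, instructions))
    print("misses per 1000 instructions:",
          misses_per_kilo_instruction(secondary_cache_misses, instructions))
    print("stall CPI at 100 cycles per miss:",
          stall_cpi(secondary_cache_misses, 100, instructions))
```

Splitting CPI into per-miss-class stall terms is what makes a statement such as "long secondary cache latency" checkable: a miss class with few misses can still dominate if its latency is high.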
Simulator Results
• simulator like hardware, some cache and consistency differences = different timing
– close cycle counts, miss rates
• OLTP – test associativity and Bcache size
– idle time increases when servers can’t hide I/O
– lots of cache intricacies…
– bigger caches = fewer replacement and instruction misses; more important for OLTP than DSS
– bigger lines = more true sharing, fewer cold misses
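To make the associativity, cache size, and line size experiments concrete, here is a minimal sketch of a single-processor set-associative cache model with LRU replacement. It is not SimOS and models no coherence or timing; the class name, function name, parameters, and address traces are all hypothetical.

```python
from collections import OrderedDict


class SetAssociativeCache:
    """Toy LRU set-associative cache that only counts hits and misses."""

    def __init__(self, size_bytes: int, line_bytes: int, ways: int):
        if size_bytes % (line_bytes * ways) != 0:
            raise ValueError("size must be a multiple of line_bytes * ways")
        self.line_bytes = line_bytes
        self.ways = ways
        self.num_sets = size_bytes // (line_bytes * ways)
        # One OrderedDict per set; insertion order tracks recency (oldest first).
        self.sets = [OrderedDict() for _ in range(self.num_sets)]
        self.hits = 0
        self.misses = 0

    def access(self, address: int) -> bool:
        """Touch one byte address; return True on a hit."""
        line = address // self.line_bytes
        index = line % self.num_sets
        tag = line // self.num_sets
        entries = self.sets[index]
        if tag in entries:
            entries.move_to_end(tag)  # mark as most recently used
            self.hits += 1
            return True
        self.misses += 1
        if len(entries) >= self.ways:
            entries.popitem(last=False)  # evict the least recently used line
        entries[tag] = None
        return False


def count_misses(trace, size_bytes: int, line_bytes: int, ways: int) -> int:
    """Run an address trace through a fresh cache and return the miss count."""
    cache = SetAssociativeCache(size_bytes, line_bytes, ways)
    for address in trace:
        cache.access(address)
    return cache.misses


if __name__ == "__main__":
    # Two addresses that collide in a 1 KB direct-mapped cache.
    conflict_trace = [0, 1024] * 4
    print("direct-mapped:", count_misses(conflict_trace, 1024, 64, 1))
    print("2-way:", count_misses(conflict_trace, 1024, 64, 2))
    # Sequential accesses: longer lines turn cold misses into hits.
    sequential_trace = [0, 32, 64, 96]
    print("32-byte lines:", count_misses(sequential_trace, 1024, 32, 1))
    print("64-byte lines:", count_misses(sequential_trace, 1024, 64, 1))
```

A model this small captures only replacement and cold misses. The sharing effects noted in the slide need a multiprocessor model with a coherence protocol.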
Conclusions
• scaled OLTP and DSS give a decent estimate of real performance
• fairly narrow range of architectural issues explored
• more processes/processor = less I/O latency, fewer dirty misses
• simulators gloss over important details for ease of use (timing, OS, etc.)
Questions
• Can you get enough information by scaling down the DB and playing tricks with block buffer sizes?