Database Systems and Modern CPU Architecture
Prof. Dr. Torsten Grust
Summer term 2009
Eberhard Karls Universität Tübingen
© 2009 • Prof. Dr. Torsten Grust • Database Systems and Modern CPU Architecture
Hard Disk → RAM
Administrativa
• Lecture hours (@ Sand 6/7, large lecture hall):
Monday, 10:15 – 11:45
No lecture on June 29, 2009
• Tutorial/Lab (Jan Rittinger, @ Sand 6/7, small lecture hall):
Wednesday, 13:15 – 14:45
Administrativa
• Course homepage:
http://www-db.informatik.uni-tuebingen.de/teaching/ss09/dbcpu
• Contact: Torsten Grust [email protected], Jan Rittinger [email protected]
Rooms: B318, B312 (drop in if doors open)
Course Prerequisites
• These courses will be helpful in following this course but are not strictly (or even formally) required:
1. “Technische Informatik” (computer engineering): CPU architecture, assembly, memory hierarchy
2. “Datenbanken II” (Databases II): query processing, buffer management, index structures
Assembly Language
• Here and there we will analyze snippets of (mostly MIPS-style) assembly language programs.
LD   R1,0(R2)    ; Regs[R1] ← M[Regs[R2]+0]
DSUB R4,R1,R5    ; Regs[R4] ← Regs[R1]-Regs[R5]
AND  R6,R1,R7    ; Regs[R6] ← Regs[R1]&Regs[R7]
ORI  R8,R1,255   ; Regs[R8] ← Regs[R1]|255
Reading Material
• The CPU architecture and memory hierarchy aspects of this course are largely covered by
Computer Architecture: A Quantitative Approach, 3rd or 4th ed., John L. Hennessy, David A. Patterson, Morgan Kaufmann, 2003
(Chapters 1–5, Appendix A)
Reading Material
• Aspects of database technology are mainly discussed in a number of research papers. References will be given here; download the papers from the course homepage. (Reading them helps to appreciate the details but is not necessary to pass the exam.)
Tutorials & Assignments
• Tutorial sessions will try to be as “hands-on” as possible:
- MonetDB
- SPIM (MIPS CPU simulator)
- Mini programming exercises (language: C)
- CPU performance and event counting, etc.
• We will hand out weekly assignments. They will not be graded, but Jan will develop and discuss solutions.
Examination
• Examination (written or oral exam):
Monday, July 20, 2009, 10:15–11:45 @ Sand 6/7
No formal prerequisites to take the exam (although it is highly advisable to actively work on the assignments).
Hard Disk → RAM
• Today, it is conceivable to build database systems that primarily operate in main memory. In such systems, (disk) I/O management no longer plays a central role.
• Instead, the performance of main-memory database systems (MMDBMS) is determined by other system components: the CPU and the memory hierarchy.
A Database in Primary Memory?
• Commodity hardware typically comes with primary memory sizes beyond 2 GB.
• Since the principle of locality applies to programs and data (“90% of all database operations touch 10% of the data”), most database hot sets easily fit into RAM.
• Even further: The author of “A Database in the CPU Cache” might come to Tübingen and try to convince you that a DBMS needs only a tiny fraction of RAM.
The Principle of Locality
1. Temporal Locality: Recently accessed items are likely to be accessed again in the near future.
2. Spatial Locality: Items whose addresses are near one another tend to be referenced close together in time.
• Based on the recent past, we can predict with reasonable accuracy which data will be touched (read/written) in the near future.
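The effect of spatial locality can be made visible with a small C experiment. The sketch below is an illustration only: the matrix size N and the function names sum_row_major and sum_col_major are made up for this example. Both functions compute the same sum, but the first walks memory sequentially while the second jumps N elements between consecutive accesses.

```c
#include <stddef.h>

#define N 512  /* hypothetical matrix dimension, chosen for illustration */

/* Row-major traversal: consecutive accesses touch adjacent addresses,
   so every cache line brought in from memory is fully used. */
long sum_row_major(int m[N][N])
{
    long sum = 0;
    for (size_t i = 0; i < N; i++)
        for (size_t j = 0; j < N; j++)
            sum += m[i][j];
    return sum;
}

/* Column-major traversal of the same array: consecutive accesses are
   N * sizeof(int) bytes apart, so spatial locality is poor. */
long sum_col_major(int m[N][N])
{
    long sum = 0;
    for (size_t j = 0; j < N; j++)
        for (size_t i = 0; i < N; i++)
            sum += m[i][j];
    return sum;
}
```

Timing both functions (for example with clock() from <time.h>) on a matrix larger than the CPU cache typically shows the row-major version to be faster, although the exact factor depends on the machine.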
I/O Latency Dominates Everything
[Figure: hard disk drive rotating at 10,000 revolutions/min]
Lack of I/O Latency...
• ... promises fabulous performance figures for MMDBMS.
• MMDBMS, like MonetDB (CWI Amsterdam), indeed exhibit query performance improvements of two orders of magnitude (factor 100) over commercial disk-based DBMS.
• But! The MMDBMS internals need to be carefully engineered to realize this potential.
MonetDB: Binary Relations Only
• MonetDB was designed as a relational MMDBMS from the ground up, and many of its design decisions seem peculiar.
• All tables have exactly two columns (binary relations).
• These columns are named head (h) and tail (t). Most operators (e.g., select()) implicitly act on the head (tail) column of a table.
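To make the head/tail idea concrete, here is a minimal C sketch of a binary table. It is not MonetDB's actual implementation or API: the types row_id and binary_table and the function select_tail are hypothetical names invented for this illustration, and letting the selection act on the tail column is an assumption made for the example.

```c
#include <stddef.h>

typedef unsigned int row_id;  /* hypothetical identifier type for head values */

/* A binary table: two parallel columns of equal length. */
typedef struct {
    const row_id *head;       /* head column (h) */
    const int    *tail;       /* tail column (t) */
    size_t        count;      /* number of tuples */
} binary_table;

/* Selection that implicitly acts on the tail column: copies the head
   values of all tuples whose tail equals v into out (which must have
   room for b->count entries) and returns the number of matches. */
size_t select_tail(const binary_table *b, int v, row_id *out)
{
    size_t n = 0;
    for (size_t i = 0; i < b->count; i++)
        if (b->tail[i] == v)
            out[n++] = b->head[i];
    return n;
}
```

Because each column is a dense array of narrow values, a scan like this touches only the data it actually needs.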
MonetDB: Design Decisions
• Details of CPU and main-memory architecture drove the development of MonetDB:
1. The narrower the tuples, the more tuples will fit into a tiny fraction of RAM (e.g., the CPU cache).
2. Primitive operators spend fewer CPU cycles per tuple and behave in a predictable fashion.
CPU and Memory Performance Diverge
• Since 1986, CPU performance has improved by a factor of 1.55 per year (55% per year).
• DRAM (Dynamic RAM) access speed improves by only about 7% per year.
⇒ Modern CPUs spend larger and larger fractions of their time waiting for memory reads and writes to complete (memory latency).
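A back-of-the-envelope calculation shows how quickly the two growth rates drift apart. The C sketch below simply compounds the yearly factors quoted above (1.55 for CPUs, 1.07 for DRAM). The function name speed_gap is hypothetical, and the model ignores everything except these two rates.

```c
/* Relative CPU/DRAM speed gap after the given number of years,
   assuming both improve at the constant yearly rates from the text. */
double speed_gap(int years)
{
    const double cpu_factor  = 1.55;  /* +55% per year */
    const double dram_factor = 1.07;  /* +7% per year  */
    double gap = 1.0;
    for (int y = 0; y < years; y++)
        gap *= cpu_factor / dram_factor;
    return gap;
}
```

Under these assumptions the gap widens by a factor of about 1.45 every year, which compounds to roughly a factor of 40 after ten years.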
The CPU–Memory Speed Gap
Principle of Locality Comes to the Rescue
• Design a hierarchical memory system, based on memories of different speeds and sizes.
Memory Access: The New Bottleneck
• A memory access beyond the CPU cache easily costs as much as hundreds of CPU instructions; an access to disk-based memory costs about 1 million instructions.
• Current and future hardware trends make this worse.
⇒ If the DBMS needs to perform costly memory access,
1. make sure to use all data moved into the cache/CPU,
2. try to access memory in a predictable fashion (prefetching).
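Both guidelines can be illustrated with a small C sketch. The record layout and the names wide_tuple, sum_rows, and sum_column are assumptions made for this example, not taken from any particular DBMS. Summing one attribute over an array of wide records drags the unused payload through the cache, whereas summing a dense column uses every byte that is loaded and scans memory strictly sequentially, a pattern that hardware prefetchers typically handle well.

```c
#include <stddef.h>

/* Hypothetical wide record: only 'a' is needed by the query below. */
typedef struct {
    int  a;
    char payload[60];   /* other attributes, unused by the query */
} wide_tuple;

/* Row layout: the values of interest are sizeof(wide_tuple) bytes
   apart (typically 64), so most of every loaded cache line is unused. */
long sum_rows(const wide_tuple *r, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += r[i].a;
    return sum;
}

/* Column layout: the values are stored contiguously, so all loaded
   bytes are used and the access pattern is sequential. */
long sum_column(const int *a, size_t n)
{
    long sum = 0;
    for (size_t i = 0; i < n; i++)
        sum += a[i];
    return sum;
}
```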
Instruction-Level Parallelism
• Modern CPUs (e.g., Intel’s Itanium™ 2 or Pentium™ 4) feature execution pipelines which ideally can complete ≥ 1 instruction per cycle (IPC):
1. Itanium 2 – max. 6 instructions execute in a 7-stage pipeline: 6 × 7 = 42 instructions execute in parallel
2. Pentium 4 – max. 3 instructions execute in a 31-stage pipeline: 3 × 31 = 93 instructions execute in parallel
• Such a high degree of parallelism cannot always be found in (database) code.
Tracing MySQL
• In a simple SQL query like the following, MySQL will call a dedicated routine to perform the addition for each tuple individually:
SELECT A + B
FROM R
• The query engine first uses helper routines, like rec_get_nth_field(), to copy data in and out of MySQL’s internal record representation.
Slow Addition in MySQL
• An inherent problem of the MySQL query engine is its one-tuple-at-a-time approach:
- Each invocation experiences its data dependencies in isolation: no potential parallelism.
foreach r ∈ R {
  ⋮
  s := Item_func_plus_val(r.A,r.B);
  ⋮
}
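The cost structure of this loop can be mimicked in C. The sketch below is not MySQL code: add_one, add_tuple_at_a_time, and add_column_at_a_time are hypothetical names. The first variant pays for a call through a function pointer on every tuple, much like a per-tuple routine in an interpreted query plan; the second performs the same additions in one tight loop over whole columns.

```c
#include <stddef.h>

typedef int (*binary_op)(int, int);

/* Stand-in for a per-tuple addition routine. */
int add_one(int a, int b)
{
    return a + b;
}

/* Tuple-at-a-time: one indirect call per tuple. */
void add_tuple_at_a_time(binary_op op, const int *a, const int *b,
                         int *s, size_t n)
{
    for (size_t i = 0; i < n; i++)
        s[i] = op(a[i], b[i]);
}

/* Column-at-a-time: the addition is written directly into a single
   loop, so no call overhead is paid per tuple. */
void add_column_at_a_time(const int *a, const int *b, int *s, size_t n)
{
    for (size_t i = 0; i < n; i++)
        s[i] = a[i] + b[i];
}
```

Both variants produce identical results; they differ only in how much work surrounds each addition.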
Tracing MySQL
• The addition itself, performed by routine Item_func_plus::val(), is found to take ≈ 50 CPU cycles:
- Calling and returning from Item_func_plus::val() accounts for ≈ 30 CPU cycles.
- The addition consumes the remaining ≈ 20 CPU cycles.
Data Dependencies
• Trace was performed on a MIPS R12000 CPU:
- Can perform 3 ALU (arithmetic) operations and 1 load/store operation per cycle. Avg. instruction latency: 5 cycles.
LD  R1,<src1>   ; R1 ← <src1>
LD  R2,<src2>   ; R2 ← <src2>
ADD R3,R2,R1    ; R3 ← R1+R2
SD  R3,<dst>    ; <dst> ← R3
(Figure annotation: data dependency. ADD needs the results of both LD instructions, and SD needs the result of ADD.)
Loop Unrolling
• Unrolling the tuple-at-a-time loop and expanding the code for Item_func_plus::val() reveals that there is no data dependency between additions of different tuples:
⋮
s[n] := r[n].A + r[n].B;
s[n+1] := r[n+1].A + r[n+1].B;
s[n+2] := r[n+2].A + r[n+2].B;
⋮
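In C, the unrolled loop might look like the following sketch. The function name add_unrolled and the unroll factor of 3 (matching the three statements above) are choices made for this illustration; in practice an optimizing compiler often performs this transformation itself.

```c
#include <stddef.h>

/* Adds columns a and b into s, three tuples per iteration. Provided s
   does not overlap a or b, the three statements in the loop body do
   not depend on each other, so the CPU can overlap their loads,
   additions, and stores. */
void add_unrolled(const int *a, const int *b, int *s, size_t n)
{
    size_t i = 0;
    for (; i + 3 <= n; i += 3) {
        s[i]     = a[i]     + b[i];
        s[i + 1] = a[i + 1] + b[i + 1];
        s[i + 2] = a[i + 2] + b[i + 2];
    }
    for (; i < n; i++)          /* leftover tuples when n % 3 != 0 */
        s[i] = a[i] + b[i];
}
```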
Instruction Scheduling
• Let the CPU or the compiler schedule dependent instructions such that instruction latency is hidden:
LD  R1,<src1>
LD  R2,<src2>
NOP
ADD R3,R1,R2
LD  R1,<src3>
LD  R2,<src4>
SD  R3,<dst1>
ADD R3,R1,R2
LD  R1,<src5>
LD  R2,<src6>
SD  R3,<dst2>
⋮
One addition completes every 4 CPU cycles.
Course Syllabus (1)
• Chapter 0: Introduction and Motivation
• Chapter 1: CPU Architecture and Instruction Sets
- CPU performance, instruction set principles, RISC
• Chapter 2: Pipelining and Instruction-Level Parallelism (ILP)
- CPU pipelines, data and control hazards, parallelism, instruction scheduling, branch prediction, super-scalar CPUs
Course Syllabus (2)
• Chapter 3: Database Systems: Where Does Time Go? (Part I)
- CPU usage, stalls, and misprediction in DBMSs
• Chapter 4: How Database Systems Can Take Advantage of ILP
- Vectorized processing, SIMD instructions, predictable code, compression [MonetDB, X100]
Course Syllabus (3)
• Chapter 5: The Memory Hierarchy (Close to the CPU)
- Caches, (reducing) miss rate and penalty, loop reorganization, virtual memory, TLBs
• Chapter 6: Database Systems: Where Does Time Go? (Part II)
- Memory access behavior of database operators, impact of data layout
Course Syllabus (4)
• Chapter 7: How Database Systems Can Exploit the Memory Hierarchy
- Data placement, column storage, database operation buffering, prefetching, compiler techniques [MonetDB, X100]