self-tuning memory management of a database system

IBM T.J. Watson Research Center

Sigmetrics 2008 Tutorial: Introduction to Control Theory and Its Application to Computing Systems

Self-Tuning Memory Management of A Database System

Yixin Diao

[email protected]


© 2008 IBM Corporation

SIGMETRICS 2008: Introduction to Control Theory. Abdelzaher, Diao, Hellerstein, Lu, and Zhu.

DB2 Self-Tuning Memory Management

Technical problems

– Large systems with varying workloads and many configuration parameters

– Autonomic computing: systems self-management

[Figure: system diagram with the labels DB2 Clients, DB2 UDB Server, Agents, Memory pools, and Disks]

Challenges from systems aspects

– Heterogeneous memory pools

– Dissimilar usage characteristics

Challenges from control aspects

– Adaptation and self-design

– Reliability and robustness



Load Balancing for Database Memory

[Figure: load-balancing block diagram. A Load Balancer divides a shared Resource into Resource Allocations 1 to N for Resource Consumers 1 to N and receives Measured Outputs 1 to N]

Load Balancing

• Is fairness optimal?

• Is there a common measured output?

[Figure: benefit (sec/page) versus entry size (pages) for an OLTP workload; x-axis 0 to 5000 pages, y-axis 0 to 0.16 sec/page]

[Figure labels: Memory Pool Size ($u_i$), simPages, savedTime, Saved System Time ($x_i$), BenefitPerPage ($y_i$)]

$$x_i = p_i \left(1 - e^{-q_i u_i}\right)$$

$$y_i = \frac{dx_i}{du_i} = p_i q_i e^{-q_i u_i}$$
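As an illustration of the benefit model, the following minimal Python sketch assumes the saturating-exponential form x_i = p_i (1 - exp(-q_i u_i)) for saved system time, so that benefit per page is its derivative p_i q_i exp(-q_i u_i). The function names and the values of p and q are hypothetical and are not fitted to any DB2 measurement.

```python
import math

def saved_time(u, p, q):
    """Saved system time x for a pool of size u: x = p * (1 - exp(-q * u))."""
    return p * (1.0 - math.exp(-q * u))

def benefit_per_page(u, p, q):
    """Marginal benefit y = dx/du = p * q * exp(-q * u)."""
    return p * q * math.exp(-q * u)

# Hypothetical parameters for a single pool (assumption, for illustration only).
p, q = 100.0, 0.001

for u in (0, 1000, 2000, 4000):
    x = saved_time(u, p, q)
    y = benefit_per_page(u, p, q)
    print(f"u = {u:5d} pages   saved time = {x:7.2f}   benefit/page = {y:.4f}")
```

Under this model the benefit per page shrinks as the pool grows, which is what makes it usable as a common measured output across pools with different saved-time curves.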



Constrained Optimization and Regulatory Control

[Figure: saved disk time ($x_i$) curves for Mem pool 1 ($x_1$) and Mem pool 2 ($x_2$) against Mem size 1 ($u_1$) and Mem size 2 ($u_2$), plus the overall saved system time ($x_i$), with the optimal memory allocation and BenefitPerPage ($y_1$) marked]

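Constrained Optimization

$$J = f(u_1, u_2, \ldots, u_N)$$

$$g(u_1, u_2, \ldots, u_N) = \sum_{i=1}^{N} u_i - U = 0$$

$$h_i(u_1, u_2, \ldots, u_N) = u_i - b_i \ge 0$$

Karush-Kuhn-Tucker conditions

$$L = f(u_1, u_2, \ldots, u_N) + \lambda\, g(u_1, u_2, \ldots, u_N) + \sum_{i=1}^{N} \mu_i\, h_i(u_1, u_2, \ldots, u_N)$$

$$\frac{\partial L}{\partial u_i} = \frac{\partial f}{\partial u_i} + \lambda + \mu_i = 0$$

$$\mu_i = 0 \ \text{if}\ u_i > b_i; \qquad \mu_i \ge 0 \ \text{if}\ u_i = b_i$$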
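One way to see what the Karush-Kuhn-Tucker conditions imply is to solve a small instance numerically. The sketch below assumes that each pool follows a hypothetical exponential saved-time model x_i = p_i (1 - exp(-q_i u_i)), that the objective is the sum of the x_i, that total memory is fixed at U, and that each pool has a nonnegative lower bound b_i. Under those assumptions the conditions reduce to: pools above their bound share one marginal benefit, and pools held at their bound have a marginal benefit no higher than that. The function name `allocate` and all numbers are illustrative.

```python
import math

def allocate(p, q, b, U, iters=200):
    """Maximize sum_i p_i*(1 - exp(-q_i*u_i)) subject to sum(u) = U, u_i >= b_i.

    Bisects on the common marginal benefit nu shared by every pool that sits
    above its lower bound. Assumes b_i >= 0 and U >= sum(b).
    """
    if U < sum(b):
        raise ValueError("total memory is smaller than the sum of lower bounds")

    def sizes(nu):
        # Pool size at which the marginal benefit p*q*exp(-q*u) equals nu,
        # clipped at the lower bound.
        return [max(bi, math.log(pi * qi / nu) / qi)
                for pi, qi, bi in zip(p, q, b)]

    hi = max(pi * qi for pi, qi in zip(p, q))  # here every pool is at its bound
    lo = hi
    while sum(sizes(lo)) < U:                  # lower nu until the pools over-fill U
        lo /= 2.0
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if sum(sizes(mid)) > U:
            lo = mid
        else:
            hi = mid
    return sizes(hi)

# Hypothetical three-pool example; the third pool has a large lower bound.
p = [100.0, 50.0, 10.0]
q = [0.001, 0.002, 0.0005]
b = [100.0, 100.0, 500.0]
u = allocate(p, q, b, U=5000.0)
y = [pi * qi * math.exp(-qi * ui) for pi, qi, ui in zip(p, q, u)]
print("sizes:           ", [round(ui, 1) for ui in u])
print("benefit per page:", [round(yi, 5) for yi in y])
```

In this example the first two pools end up with equal benefit per page, while the third pool stays at its lower bound with a smaller benefit per page, which is the pattern the multiplier conditions describe.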

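[Figure: regulatory-control block diagram connecting the Load Balancer and Resource Consumers 1 to N. Signal labels: $u_1(k), \ldots, u_N(k)$; $y_1(k), \ldots, y_N(k)$; $e_1(k), \ldots, e_N(k)$; $d_1^I(k), \ldots, d_N^I(k)$; $d_1^O(k), \ldots, d_N^O(k)$; $w(k)$; $w_1(k), \ldots, w_N(k)$; averaging block $\frac{1}{N}\mathbf{1}_{N,N}$]

Regulatory Control

$$\frac{1}{N} \sum_{j=1}^{N} \frac{\partial f}{\partial u_j} - \frac{\partial f}{\partial u_i} = 0$$

$$J = \sum_{i=1}^{N} x_i$$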



Dynamic State Feedback Controller

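State space model

$$y(k+1) = A\,y(k) + B\left(u(k) + d^I(k)\right)$$

Control error

$$e(k) = \left(\frac{1}{N}\mathbf{1}_{N,N} - I\right)\left(y(k) + d^O(k)\right)$$

Integral control error

$$e_I(k+1) = e_I(k) + e(k)$$

Feedback control law

$$u(k) = K_P\,e(k) + K_I\,e_I(k)$$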
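The feedback structure can be tried out on a toy plant. The sketch below is a simplified variant under stated assumptions, not the tutorial's controller: the plant is a static, hypothetical benefit model y_i = p_i q_i exp(-q_i u_i) rather than an identified state space model, the gains are scalars chosen by hand for these parameters, the control law is applied around an equal initial split, and the integral state is updated with the current error before it is used. The error is the average benefit minus each pool's benefit, so the errors sum to zero and the total allocation stays constant. Lower bounds on pool size are not enforced here.

```python
import math

def simulate(p, q, U, k_i=-50000.0, k_p=0.0, steps=500):
    """Integral load balancing over N pools; returns final sizes and benefits.

    The gains are negative because adding memory lowers benefit per page.
    """
    n = len(p)
    u0 = [U / n] * n          # initial equal split (an assumption)
    u = list(u0)
    e_int = [0.0] * n         # integral control error e_I
    for _ in range(steps):
        # Measured output: benefit per page under the current allocation.
        y = [pi * qi * math.exp(-qi * ui) for pi, qi, ui in zip(p, q, u)]
        y_avg = sum(y) / n
        e = [y_avg - yi for yi in y]                     # e = ((1/N) 1 - I) y
        e_int = [si + ei for si, ei in zip(e_int, e)]    # accumulate the error
        # Control law, expressed as an offset from the initial allocation.
        u = [u0i + k_p * ei + k_i * si
             for u0i, ei, si in zip(u0, e, e_int)]
    y = [pi * qi * math.exp(-qi * ui) for pi, qi, ui in zip(p, q, u)]
    return u, y

# Hypothetical three-pool example.
p = [100.0, 50.0, 10.0]
q = [0.001, 0.002, 0.0005]
u, y = simulate(p, q, U=5000.0)
print("sizes:           ", [round(ui, 1) for ui in u])
print("benefit per page:", [round(yi, 6) for yi in y])
print("total memory:    ", round(sum(u), 3))
```

Driving every error to zero equalizes benefit per page across the pools, which is the same condition the constrained optimization reaches for pools that are not at a bound.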



Incorporating Cost of Control into Controller Design

[Figure: resizing two memory pools. Memory Pool A (before, after): write dirty pages to disk, remove these pages. Memory Pool B (before): allocate extra memory. Other labels: Disk, OS]

Major cost: write dirty pages, move memory, victimize hot pages

Linear quadratic regulation (LQR)

$$J = \begin{bmatrix} e^T(k) & e_I^T(k) \end{bmatrix} Q \begin{bmatrix} e^T(k) & e_I^T(k) \end{bmatrix}^T + u^T(k)\, R\, u(k)$$

Define Q and R with respect to performance

• Cost of transient load imbalances

• Cost of changing resource allocations
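The trade-off between the two costs can be illustrated with the smallest possible LQR problem. The sketch below is a scalar simplification, not the multivariable design with the stacked error and integral error: it assumes a hypothetical plant x(k+1) = a x(k) + b u(k), accumulates the quadratic stage cost q x(k)^2 + r u(k)^2 over time, and finds the feedback gain by iterating the discrete-time Riccati recursion to its fixed point. Here q stands in for the cost of transient imbalance and r for the cost of changing allocations; all numbers are illustrative.

```python
def lqr_gain(a, b, q, r, iters=1000):
    """Scalar discrete-time LQR for x(k+1) = a*x(k) + b*u(k).

    Minimizes the sum over k of q*x(k)^2 + r*u(k)^2 and returns (K, P),
    where the optimal control is u(k) = -K * x(k).
    """
    P = q
    for _ in range(iters):
        # Riccati recursion, iterated until it settles at its fixed point.
        P = q + a * a * P - (a * b * P) ** 2 / (r + b * b * P)
    K = a * b * P / (r + b * b * P)
    return K, P

# Raising r (costlier control action) should lower the gain.
for r in (0.1, 1.0, 10.0):
    K, _ = lqr_gain(a=1.0, b=1.0, q=1.0, r=r)
    print(f"r = {r:5.1f}   gain K = {K:.3f}")
```

A larger r yields a smaller gain, so the controller moves memory more cautiously and tolerates imbalance for longer; a larger q has the opposite effect.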

[Figure: pool size (entry size, MB) and benefit versus control interval for three runs labeled hc11-21, hc11-17, and h11-12b. Annotations: Ts=12449, Ts=15703, Ts=24827]



Adaptive Controller Design

Decentralized integral control

Local linear model

[Figure: adaptive controller architecture. Labels: DB2 Clients, DB2 Memory Pool, Memory Statistics Collector, Response Time Benefit, Entry Size, Model Builder, Accurate (Y/N), MIMO Control Algorithm, Fixed Step, Interval Tuner, Step Tuner, 4-Bit (Oscillation), Greedy (Constraint)]



Experimental Assessment

squid.torolab.ibm.com

Machine: IBM 7028-6C4

CPU: 4x 1453MHz

Memory: 16GB

Disk: 25x 9.1G

OLTP workload: multiple (20) buffer pools

[Figure: three panels versus control interval (0 to 200): response time benefits, memory sizes, and throughput]

Increase TP from ~100 to ~250

DSS workload: various query lengths

[Figure: entry size (MB) and benefit versus interval for two runs]

hc12-10: STMM tuning, Ts = 10680s

hc09-09: ConfigAdvisor settings, Ts = 26342s

> 2x improvement

DSS workload: index drop

Execution time for Query 21 (10 stream avg)

[Figure: time in seconds (0 to 7000) versus order of execution (1 to 34), with the point where some indexes are dropped marked. Annotations: avg= 959, avg= 2285, avg= 6206]

[Figure: entry size (MB) versus interval for run hc11-05]

Reduce 63%



Comparing Control and Optimization Techniques

Control-based approach

– Decentralized integral control

– Local linear model

Optimization-based approach

– Gradient method

– Projected gradient (quasi-Newton)

– Step length (modified Armijo rule)

– Constraint enforcement (projection method)

Similarity in a simplified scenario

Differences in design considerations

“Pure” average vs. convex sum

Pole location vs. Armijo rule

Steady-state gain vs. Hessian matrix

Control-based approach: less dependence on the model

Optimization-based approach: strictly applies constrained optimization
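To make the optimization-based side concrete, here is a minimal sketch of projected gradient ascent with a standard Armijo backtracking rule along the projection arc. It is a plain first-order method, not the quasi-Newton variant or the modified Armijo rule named above. It assumes a hypothetical exponential saved-time model, a fixed total U, and nonnegative lower bounds b_i with sum(b) <= U; all names and numbers are illustrative.

```python
import math

def saved_time(u, p, q):
    """Objective: total saved time under the assumed exponential model."""
    return sum(pi * (1.0 - math.exp(-qi * ui)) for pi, qi, ui in zip(p, q, u))

def gradient(u, p, q):
    """Benefit per page of each pool (partial derivatives of saved_time)."""
    return [pi * qi * math.exp(-qi * ui) for pi, qi, ui in zip(p, q, u)]

def project(v, b, U, iters=60):
    """Euclidean projection of v onto {u : sum(u) = U, u_i >= b_i}."""
    lo = min(v) - U                               # shift small enough that sum >= U
    hi = max(vi - bi for vi, bi in zip(v, b))     # shift at which all pools hit b_i
    for _ in range(iters):
        tau = 0.5 * (lo + hi)
        if sum(max(bi, vi - tau) for vi, bi in zip(v, b)) > U:
            lo = tau
        else:
            hi = tau
    return [max(bi, vi - hi) for vi, bi in zip(v, b)]

def projected_gradient(p, q, b, U, steps=200, alpha0=1e6, sigma=0.3, beta=0.5):
    n = len(p)
    u = project([U / n] * n, b, U)
    for _ in range(steps):
        g = gradient(u, p, q)
        f0 = saved_time(u, p, q)
        alpha = alpha0
        for _ in range(40):
            cand = project([ui + alpha * gi for ui, gi in zip(u, g)], b, U)
            gain = sum(gi * (ci - ui) for gi, ci, ui in zip(g, cand, u))
            # Armijo test: accept only if the increase is a fraction of the
            # first-order prediction; otherwise shrink the step length.
            if saved_time(cand, p, q) >= f0 + sigma * gain:
                break
            alpha *= beta
        u = cand
    return u

# Hypothetical three-pool example.
p = [100.0, 50.0, 10.0]
q = [0.001, 0.002, 0.0005]
b = [100.0, 100.0, 100.0]
u = projected_gradient(p, q, b, U=5000.0)
print("sizes:           ", [round(ui, 1) for ui in u])
print("benefit per page:", [round(gi, 6) for gi in gradient(u, p, q)])
```

The projection plays the role that averaging plays in the control-based design: both keep the total allocation fixed, while the step length rule takes the place of the controller gain.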



Simulation Study: Comparison with Optimization Approach

Control-based approach: more robust and better uncertainty management

Optimization-based approach: faster convergence, but more sensitive to noise

[Figure: memory size ($u$) and total saved time ($J$) versus control intervals ($k$, 0 to 200), shown without noise (single run) and with the effect of noise (multiple runs); a workload (WL) change is marked. Curve label: PI]



Summary

DB2 self-tuning memory management

– Interconnection, heterogeneity, adaptation and robustness, cost of control

Constrained optimization with a linear feedback controller

Experimental assessment for OLTP and DSS workloads
