

Task Scheduling for Highly Concurrent Analytical and Transactional Main-Memory Workloads

Iraklis Psaroudakis (EPFL), Tobias Scheuer (SAP AG), Norman May (SAP AG), Anastasia Ailamaki (EPFL)

Scheduling for high concurrency

• Queries >> H/W contexts
• Limited number of H/W contexts
• How should the DBMS use available CPU resources?

Scheduling for mixed workloads

• OLTP: short-lived, reads & updates, scan-light
• OLAP: long-running, read-only, scan-heavy

How to schedule highly concurrent mixed workloads?

[Figure: executing a query with a single thread vs. with parallelism; parallelism leads to contention in highly concurrent situations]

Scheduling tactics

• OS scheduler
  [Figure: time-slicing queries 1, 2, 3 causes context switches and cache thrashing]
• Admission control
  – Coarse granularity of control
  [Figure: # of threads vs. # of H/W contexts over time, with phases of underutilization and overutilization]

We need to avoid both underutilization and overutilization

Task scheduling

• A task can contain any code: run() { ... }
• One worker thread per core processing tasks
  [Figure: task queues distributed across Socket 1 and Socket 2]
• Distributed queues to minimize sync contention
• Task stealing to fix imbalance
• OLAP queries can parallelize w/o overutilization

Provides a solution to efficiently utilize CPU resources (a minimal sketch follows).
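To make the scheme concrete, here is a minimal sketch of such a task scheduler, assuming a Task base class with a virtual run(), one queue per socket, one worker thread per core, and stealing from remote queues when the local one is empty. The names (TaskQueue, Scheduler, workerLoop) are illustrative and this is not SAP HANA's actual implementation.

#include <atomic>
#include <deque>
#include <memory>
#include <mutex>
#include <thread>
#include <vector>

// A task can contain any code: subclasses override run().
struct Task {
    virtual void run() = 0;
    virtual ~Task() = default;
};

// One task queue per socket to minimize synchronization contention.
class TaskQueue {
    std::mutex m;
    std::deque<std::unique_ptr<Task>> q;
public:
    void push(std::unique_ptr<Task> t) {
        std::lock_guard<std::mutex> g(m);
        q.push_back(std::move(t));
    }
    std::unique_ptr<Task> pop() {
        std::lock_guard<std::mutex> g(m);
        if (q.empty()) return nullptr;
        auto t = std::move(q.front());
        q.pop_front();
        return t;
    }
};

// One worker thread per core; each worker drains its socket-local queue
// and steals from other sockets' queues to fix load imbalance.
class Scheduler {
    std::vector<TaskQueue> queues;      // one per socket
    std::vector<std::thread> workers;   // one per core
    std::atomic<bool> stop{false};
public:
    Scheduler(unsigned sockets, unsigned coresPerSocket) : queues(sockets) {
        for (unsigned s = 0; s < sockets; ++s)
            for (unsigned c = 0; c < coresPerSocket; ++c)
                workers.emplace_back([this, s] { workerLoop(s); });
    }
    void submit(unsigned socket, std::unique_ptr<Task> t) {
        queues[socket].push(std::move(t));
    }
    void shutdown() {
        stop = true;
        for (auto& w : workers) w.join();
    }
private:
    void workerLoop(unsigned home) {
        while (!stop) {
            auto t = queues[home].pop();                 // local queue first
            for (unsigned s = 0; !t && s < queues.size(); ++s)
                if (s != home) t = queues[s].pop();      // then steal
            if (t) t->run(); else std::this_thread::yield();
        }
    }
};

Keeping one queue per socket rather than a single global queue reduces synchronization contention, and stealing only when the local queue is empty preserves locality while fixing imbalance.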

Task scheduling problems for DBMS

• OLTP tasks can block
  – Problem: underutilization of CPU resources
  – Solution: flexible concurrency level
• OLAP queries can issue an excessive number of tasks in highly concurrent situations
  – Problem: unnecessary scheduling overhead
  – Solution: concurrency hint

Outline

• Introduction
• Flexible concurrency level
• Concurrency hint
• Experimental evaluation with SAP HANA
• Conclusions

Fixed concurrency level

• Typical task scheduling: fixed concurrency level
• Bypasses the OS scheduler
• OLTP tasks may block → underutilization

A fixed concurrency level is not suitable for a DBMS

Flexible concurrency level

• Issue additional workers when tasks block
• Cooperate with the OS scheduler: the OS schedules the threads

Concurrency level = # of worker threads
Active concurrency level = # of active worker threads

Target: active concurrency level = # of H/W contexts

Worker states

• Active workers
• Inactive workers: blocked in syscall, inactive by user, waiting for a task
• Parked workers
• Other threads

Watchdog:
– Monitoring, statistics, and takes actions
– Keeps active concurrency level ≈ # of H/W contexts

We dynamically re-adjust the scheduler's concurrency level (a minimal sketch of the watchdog loop follows).
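The following is a minimal sketch of the watchdog idea under the stated goal of keeping the active concurrency level close to the number of H/W contexts. The counter struct, the hook functions, the 10 ms polling interval, and all names are assumptions for illustration, not HANA's actual code; the hooks merely adjust counters so the sketch stays self-contained.

#include <algorithm>
#include <atomic>
#include <chrono>
#include <thread>

// Illustrative worker counters (assumed names, not from the slides).
struct WorkerCounts {
    std::atomic<int> active{0};   // workers currently executing tasks
    std::atomic<int> parked{0};   // parked workers that can be woken cheaply
};

// Hypothetical scheduler hooks; here they only adjust the counters to keep
// the sketch self-contained. A real scheduler would wake/create/park threads.
static void unparkWorkers(WorkerCounts& w, int n) { w.parked -= n; w.active += n; }
static void spawnWorkers (WorkerCounts& w, int n) { w.active += n; }
static void parkWorkers  (WorkerCounts& w, int n) { w.active -= n; w.parked += n; }

// Watchdog: monitoring, statistics, and corrective actions.
// Keeps the active concurrency level ~= # of H/W contexts.
void watchdogLoop(WorkerCounts& w, std::atomic<bool>& stop) {
    const int hwContexts = static_cast<int>(std::thread::hardware_concurrency());
    while (!stop) {
        int deficit = hwContexts - w.active.load();   // gap in active concurrency level
        if (deficit > 0) {
            // Some workers block in syscalls or wait: wake parked workers first,
            // then issue additional workers to avoid underutilization.
            int wake = std::min(deficit, w.parked.load());
            if (wake > 0) unparkWorkers(w, wake);
            if (deficit - wake > 0) spawnWorkers(w, deficit - wake);
        } else if (deficit < 0) {
            // Blocked workers resumed: park the surplus to avoid overutilization.
            parkWorkers(w, -deficit);
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(10));
    }
}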


Partitionable operations

• Can be split in a variable number of tasks
  [Figure: an aggregation Σ over Partition 1, Partition 2, and Partition 3, whose partial results are merged into the final result]
• Each operation calculates its task granularity: 1 ≤ # tasks ≤ # of H/W contexts
• Problem: the calculation is independent of the system's concurrency situation
  – High concurrency: excessive number of tasks → unnecessary scheduling overhead

We should restrict task granularity under high concurrency (a sketch of a partitionable aggregation follows).
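As an illustration, here is a minimal sketch of a partitionable aggregation, assuming the operation is a sum over a large vector; the function name parallelSum and the use of std::async stand in for the engine's actual task machinery. The number of tasks is clamped between 1 and the number of H/W contexts, each task aggregates one partition, and the partial results are merged into the final result.

#include <algorithm>
#include <cstdint>
#include <future>
#include <numeric>
#include <thread>
#include <vector>

// Partitionable aggregation: split a sum over `data` into `numTasks` tasks,
// where 1 <= numTasks <= # of H/W contexts. Each task aggregates one
// partition; the partial results are merged into the final result.
int64_t parallelSum(const std::vector<int64_t>& data, unsigned numTasks) {
    const unsigned hwContexts = std::max(1u, std::thread::hardware_concurrency());
    numTasks = std::clamp(numTasks, 1u, hwContexts);   // task granularity

    const size_t chunk = (data.size() + numTasks - 1) / numTasks;
    std::vector<std::future<int64_t>> partials;
    for (unsigned t = 0; t < numTasks; ++t) {
        size_t begin = std::min<size_t>(t * chunk, data.size());
        size_t end   = std::min<size_t>(begin + chunk, data.size());
        partials.push_back(std::async(std::launch::async, [&data, begin, end] {
            return std::accumulate(data.begin() + begin, data.begin() + end, int64_t{0});
        }));
    }
    int64_t result = 0;                                 // final result
    for (auto& p : partials) result += p.get();
    return result;
}

The problem described above is that numTasks is chosen without looking at how busy the system already is; the concurrency hint in the next section supplies that missing input.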

Restricting task granularity

• Existing frameworks for data parallelism
  – Not straightforward for a commercial DBMS
  – Simpler way?

free worker threads = max(0, # of H/W contexts − # of active worker threads)
concurrency hint = exponential moving average of free worker threads

The concurrency hint serves as an upper bound for # tasks.

Concurrency hint

• Low concurrency situations: concurrency hint ≈ # of H/W contexts → many tasks per operation → low latency
• High concurrency situations: concurrency hint ≈ 1 → few tasks per operation → high latency per query, but low scheduling overhead → higher throughput

A lightweight way to restrict task granularity under high concurrency (a minimal sketch of maintaining the hint follows).
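Here is a minimal sketch of how such a hint could be maintained and applied, following the two formulas above; the class name ConcurrencyHint, the smoothing factor alpha, and the update cadence are assumptions for illustration.

#include <algorithm>
#include <thread>

// Maintains the concurrency hint as an exponential moving average of the
// number of free worker threads. The smoothing factor alpha is assumed.
class ConcurrencyHint {
    double ema;
    static constexpr double alpha = 0.5;   // assumed smoothing factor
public:
    ConcurrencyHint() : ema(std::thread::hardware_concurrency()) {}

    // Called periodically (e.g., by the watchdog) with the current number
    // of active worker threads.
    void update(int activeWorkers) {
        int hwContexts = static_cast<int>(std::thread::hardware_concurrency());
        int freeWorkers = std::max(0, hwContexts - activeWorkers);   // free worker threads
        ema = alpha * freeWorkers + (1.0 - alpha) * ema;             // exponential moving average
    }

    // The hint serves as an upper bound for the number of tasks a
    // partitionable operation issues (at least 1).
    unsigned boundTasks(unsigned desiredTasks) const {
        unsigned hint = std::max(1u, static_cast<unsigned>(ema));
        return std::min(desiredTasks, hint);
    }
};

A partitionable operation would then cap its desired task count before splitting, e.g. parallelSum(data, hint.boundTasks(desired)) in terms of the earlier sketch.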


Experimental evaluation with SAP HANA

• Workloads: TPC-H SF=10, and TPC-H SF=10 + TPC-C WH=200
• Configuration:
  – 8x10 Intel Xeon E7-8870 @ 2.40 GHz with hyperthreading, 1 TB RAM, 64-bit SMP Linux (SuSE) 2.6.32 kernel
  – Several iterations; no caching; no thinking times
• We compare:
  – Fixed (fixed concurrency level)
  – Flexible (flexible concurrency level)
  – Hint (flexible concurrency level + concurrency hint)

TPC-H – Response time

[Figure: response time (sec) vs. number of concurrent queries (32–1024) for Fixed, Flexible, and Hint; annotated differences of 11.2% and 3.5%]

Task granularity can affect OLAP performance by 11%

TPC-H – Measurements

[Figure: instructions retired, # of tasks (x10000), and context switches (x10000) for Fixed, Flexible, and Hint]

Unnecessary overhead by too many tasks under high concurrency

TPC-H – Timelines

[Figure: timelines of active workers (vs. # of H/W contexts) and waiting tasks (x1000) over time (sec), for Fixed and Hint]

TPC-H and TPC-C

• Throughput experiment: variable TPC-H clients = 16–64; TPC-C clients = 200

[Figure: TPC-H throughput (q/min) and TPC-C throughput (tpmC) vs. number of concurrent TPC-H clients (16, 32, 64) for Fixed, Flexible, and Hint]

Conclusions

• Task scheduling for resource management
• For a DBMS:
  – Handle tasks that block
    • Solution: flexible concurrency level
  – Correlate the task granularity of analytical queries with concurrency to avoid unnecessary scheduling overhead
    • Solution: concurrency hint

Thank you! Questions?
