NetCDF4 Performance Benchmark

Upload: alban-blake

Post on 24-Dec-2015


Page 1: NetCDF4 Performance Benchmark

Page 2: Part I

Will the performance in netCDF4 be comparable with that in netCDF3?

Page 3: Configurations

Dataset: 40 MB (6 files); 1 MB (6 files)

Storage Layout: contiguous; chunked (HDF5 default cache size: 1 MB); chunked (HDF5 cache size: 64 MB)

System Cache: on; drop

Page 4: System Cache

On: use all caches and buffers provided by the kernel

Drop: use "drop_caches" to read data from disk and "fsync" to write data to disk
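The "drop" setting can be sketched roughly as follows. This is a minimal illustration and not the benchmark's actual harness: the function names and the 1 MB payload are hypothetical, and the cache-dropping helper assumes a Linux system where root can write "3" to /proc/sys/vm/drop_caches. It is defined here but never called.

```python
import os
import tempfile
import time


def timed_write_with_fsync(path, payload):
    """Write payload, force it to disk with fsync, and return the rate in MB/s."""
    start = time.perf_counter()
    with open(path, "wb") as f:
        f.write(payload)
        f.flush()              # push Python's buffer to the kernel
        os.fsync(f.fileno())   # ask the kernel to push the data to disk
    elapsed = max(time.perf_counter() - start, 1e-9)
    return len(payload) / (1024 * 1024) / elapsed


def drop_system_caches():
    """Try to drop the Linux page cache before a read test.

    Assumes Linux and root privileges; returns False instead of raising
    when the platform or permissions do not allow it.
    """
    try:
        os.sync()  # write dirty pages first so they can be dropped
        with open("/proc/sys/vm/drop_caches", "w") as f:
            f.write("3\n")
        return True
    except (OSError, AttributeError):
        return False


if __name__ == "__main__":
    payload = os.urandom(1024 * 1024)  # hypothetical 1 MB test payload
    with tempfile.TemporaryDirectory() as tmp:
        rate = timed_write_with_fsync(os.path.join(tmp, "sample.bin"), payload)
    print(f"write rate with fsync: {rate:.1f} MB/s")
```

Without the fsync call, a write timing mostly measures how fast data lands in kernel buffers, which is what the "on" setting reports.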

Page 5: 10 cases

Case  Dataset  Storage Layout         System Cache
1     40 MB    contiguous             on
2     40 MB    contiguous             drop
3     40 MB    chunked (64 MB cache)  on
4     40 MB    chunked (64 MB cache)  drop
5     40 MB    chunked (1 MB cache)   on
6     40 MB    chunked (1 MB cache)   drop
7     1 MB     contiguous             on
8     1 MB     contiguous             drop
9     1 MB     chunked (1 MB cache)   on
10    1 MB     chunked (1 MB cache)   drop
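The ten cases are the dataset sizes crossed with the storage layouts tested for each size and the two system cache settings. A small sketch of that enumeration is below. The names LAYOUTS_BY_DATASET, SYSTEM_CACHE and build_cases are made up for illustration and do not come from the benchmark code.

```python
from itertools import product

# Layouts tested for each dataset size, as listed in the case table.
LAYOUTS_BY_DATASET = {
    "40 MB": ["contiguous", "chunked (64 MB cache)", "chunked (1 MB cache)"],
    "1 MB": ["contiguous", "chunked (1 MB cache)"],
}
SYSTEM_CACHE = ["on", "drop"]


def build_cases():
    """Return (case number, dataset, layout, system cache) tuples in table order."""
    cases = []
    for dataset, layouts in LAYOUTS_BY_DATASET.items():
        for layout, cache in product(layouts, SYSTEM_CACHE):
            cases.append((len(cases) + 1, dataset, layout, cache))
    return cases


if __name__ == "__main__":
    for number, dataset, layout, cache in build_cases():
        print(f"{number:>2}  {dataset:<6} {layout:<22} {cache}")
```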

Page 6: Default Hyperslab

One big hyperslab is selected.

Page 7: 1. Contiguous layout with cache

Dataset: ≈ 40 MB; Storage Layout: contiguous; System Cache: on

[Figure: data read rate (MB/s) versus number of dimensions (1D to 6D); series: HDF5 contiguous, netCDF4 contiguous, netCDF3]

[Figure: data write rate (MB/s) versus number of dimensions (1D to 6D); series: HDF5 contiguous, netCDF4 contiguous, netCDF3]

Page 8: 2. Contiguous layout w/o cache

Dataset: ≈ 40 MB; Storage Layout: contiguous; System Cache: drop

[Figure: data read rate (MB/s) versus number of dimensions (1D to 6D); series: HDF5, netCDF4, netCDF3]

[Figure: data write rate (MB/s) versus number of dimensions (1D to 6D); series: HDF5, netCDF4, netCDF3]

Page 9: 3. Chunked layout with cache

Dataset: ≈ 40 MB; Storage Layout: chunked (HDF5 cache size: 64 MB); System Cache: on

[Figure: data read rate (MB/s) versus number of dimensions (1D to 6D); series: HDF5 chunked, netCDF4 chunked, netCDF3]

[Figure: data write rate (MB/s) versus number of dimensions (1D to 6D); series: HDF5 chunked, netCDF4 chunked, netCDF3]

Page 10: 4. Chunked layout w/o cache

Dataset: ≈ 40 MB; Storage Layout: chunked (HDF5 cache size: 64 MB); System Cache: drop

[Figure: data read rate (MB/s) versus number of dimensions (1D to 6D); series: HDF5, netCDF4, netCDF3]

[Figure: data write rate (MB/s) versus number of dimensions (1D to 6D); series: HDF5, netCDF4, netCDF3]

Page 11: 5. Chunked layout with cache

Dataset: ≈ 40 MB; Storage Layout: chunked (HDF5 default cache size: 1 MB); System Cache: on

[Figure: data read rate (MB/s) versus number of dimensions (1D to 6D); series: HDF5 chunked, netCDF4 chunked, netCDF3]

[Figure: data write rate (MB/s) versus number of dimensions (1D to 6D); series: HDF5 chunked, netCDF4 chunked, netCDF3]

Page 12: H5Pset_alloc_time(EARLY)

Dataset: ≈ 40 MB; Storage Layout: chunked (HDF5 default cache size: 1 MB); System Cache: on

[Figure: data write rate (MB/s) versus number of dimensions (1D to 6D); series: HDF5 chunked, netCDF4 chunked, netCDF3]

[Figure, labeled H5Pset_alloc_time(EARLY): data write rate (MB/s) versus number of dimensions (1D to 6D); series: HDF5 chunked, netCDF4 chunked, netCDF3 chunked]

Page 13: 6. Chunked layout w/o cache

Dataset: ≈ 40 MB; Storage Layout: chunked (HDF5 default cache size: 1 MB); System Cache: drop

[Figure: data read rate (MB/s) versus number of dimensions (1D to 6D); series: HDF5, netCDF4, netCDF3]

[Figure: data write rate (MB/s) versus number of dimensions (1D to 6D); series: HDF5, netCDF4, netCDF3]

Page 14: 7. Contiguous layout with cache

Dataset: ≈ 1 MB; Storage Layout: contiguous; System Cache: on

[Figure: data read rate (MB/s) versus number of dimensions (1D to 6D); series: HDF5 contiguous, netCDF4 contiguous, netCDF3]

[Figure: data write rate (MB/s) versus number of dimensions (1D to 6D); series: HDF5 contiguous, netCDF4 contiguous, netCDF3]

Page 15: 8. Contiguous layout w/o cache

Dataset: ≈ 1 MB; Storage Layout: contiguous; System Cache: drop

[Figure: data read rate (MB/s) versus number of dimensions (1D to 6D); series: HDF5, netCDF4, netCDF3]

[Figure: data write rate (MB/s) versus number of dimensions (1D to 6D); series: HDF5, netCDF4, netCDF3]

Page 16: 9. Chunked layout with cache

Dataset: ≈ 1 MB; Storage Layout: chunked (HDF5 default cache size: 1 MB); System Cache: on

[Figure: data read rate (MB/s) versus number of dimensions (1D to 6D); series: HDF5 chunked, netCDF4 chunked, netCDF3]

[Figure: data write rate (MB/s) versus number of dimensions (1D to 6D); series: HDF5 chunked, netCDF4 chunked, netCDF3]

Page 17: 10. Chunked layout w/o cache

Dataset: ≈ 1 MB; Storage Layout: chunked (HDF5 default cache size: 1 MB); System Cache: drop

[Figure: data read rate (MB/s) versus number of dimensions (1D to 6D); series: HDF5, netCDF4, netCDF3]

[Figure: data write rate (MB/s) versus number of dimensions (1D to 6D); series: HDF5, netCDF4, netCDF3]

Page 18: Part II

Can I get better performance with netCDF4? If yes, under what circumstances can I get better performance?

Page 19: Non-contiguous Access

Logical layout for 2-dimensional arrays

[Figure: logical layout of a 2-dimensional array; numeric labels in the figure: 256, 256, 16384, 16, 1, 240]

Page 20: Non-contiguous Access

Physical layout

16384 non-adjacent data points

Chunk size [16384][1]

Chunk size [8192][1]

Chunk size [4096][1]
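To see why the chunk shapes above suit this access pattern, the sketch below counts how many separate pieces one column selection touches under each layout. It assumes a row-major [16384][256] array of 4-byte values, which is consistent with the ≈ 16 MB dataset size but is not stated on the slides. The function names are hypothetical.

```python
import math

ROWS, COLS = 16384, 256   # assumed array shape (not stated on the slides)
ITEM_SIZE = 4             # assumed bytes per element


def column_byte_offsets(col, rows=ROWS, cols=COLS, item_size=ITEM_SIZE):
    """Byte offsets of one column's elements in a row-major contiguous array."""
    return [(r * cols + col) * item_size for r in range(rows)]


def chunks_touched(chunk_rows, rows=ROWS):
    """Number of [chunk_rows][1] chunks needed to hold one full column."""
    return math.ceil(rows / chunk_rows)


if __name__ == "__main__":
    offsets = column_byte_offsets(col=0)
    stride = offsets[1] - offsets[0]
    print(f"contiguous: {len(offsets)} separate elements, {stride} bytes apart")
    for chunk_rows in (16384, 8192, 4096):
        print(f"chunk [{chunk_rows}][1]: {chunks_touched(chunk_rows)} chunk(s) per column")
```

Under these assumptions a contiguous layout scatters the column over 16384 separate locations, while a [16384][1] chunk stores the same column as a single chunk.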

Page 21: 11. Non-contiguous Access

Dataset: ≈ 16 MB; Storage Layout: contiguous; chunked (default chunk cache); System Cache: drop

[Figure: wall clock time to read one non-contiguous hyperslab (ms) by storage layout; series: netCDF3 contiguous, netCDF4 contiguous, chunked [16384][1], chunked [8192][1], chunked [4096][1]]

[Figure: wall clock time to write non-contiguous hyperslabs (s) by storage layout; series: netCDF3 contiguous, netCDF4 contiguous, chunked [16384][1], chunked [8192][1], chunked [4096][1]]

Page 22: 12. Chunked layout with cache

Dataset: ≈ 40 MB; Storage Layout: chunked (chunk cache varies); System Cache: on

[Figure: data write rate (MB/s) versus cache size for 5D dataset (1, 4, 8, 16, 32, 64 MB); series: netCDF3, netCDF4]

Page 23: 13. Compression

Dataset: radar data; Storage Layout: chunked (default chunk cache); System Cache: drop

[Figure: wall clock time to read radar data (seconds) by dataset name (tile1, tile2, tile4); series: deflate compression level 1, without compression]

[Figure: wall clock time to write radar data (seconds) by dataset name (tile1, tile2, tile4); series: deflate compression level 1, without compression]
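"Deflate compression level 1" refers to the deflate algorithm at its fastest setting. The sketch below shows the size and time trade-off with Python's zlib module on a synthetic, mostly-zero buffer. It is an assumption-laden stand-in: it uses neither the radar tiles nor the netCDF4 library, and deflate_round_trip is a hypothetical helper name.

```python
import time
import zlib


def deflate_round_trip(raw, level=1):
    """Compress and decompress raw bytes; return sizes and timings."""
    t0 = time.perf_counter()
    packed = zlib.compress(raw, level)
    t1 = time.perf_counter()
    restored = zlib.decompress(packed)
    t2 = time.perf_counter()
    assert restored == raw  # deflate is lossless
    return {
        "raw_bytes": len(raw),
        "packed_bytes": len(packed),
        "compress_s": t1 - t0,
        "decompress_s": t2 - t1,
    }


if __name__ == "__main__":
    # Synthetic stand-in for sparse data: mostly zeros with short nonzero runs.
    raw = bytearray(4 * 1024 * 1024)
    for start in range(0, len(raw), 65536):
        raw[start:start + 256] = bytes(range(256))
    print(deflate_round_trip(bytes(raw)))
```

When the compressed data is much smaller, less has to move to or from disk, which can offset the CPU time spent in deflate, especially with the system cache dropped.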

Page 24: 13. Compression

Compression ratio

Dataset  Uncompressed  Compressed  Compression Ratio
Tile1    72,132,892    3,432,559   21
Tile2    72,132,892    5,129,482   14
Tile3    72,132,892    3,069,254   23
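The ratios in the table can be checked by dividing the uncompressed size by the compressed size. The listed values appear to be the whole-number part of that quotient. A short sketch, with SIZES and compression_ratio as illustrative names:

```python
# Sizes copied from the compression ratio table
# (units are not stated there; presumably bytes).
SIZES = {
    "Tile1": (72_132_892, 3_432_559),
    "Tile2": (72_132_892, 5_129_482),
    "Tile3": (72_132_892, 3_069_254),
}


def compression_ratio(uncompressed, compressed):
    """Uncompressed size divided by compressed size."""
    return uncompressed / compressed


if __name__ == "__main__":
    for name, (uncompressed, compressed) in SIZES.items():
        ratio = compression_ratio(uncompressed, compressed)
        print(f"{name}: {ratio:.2f} (whole part: {int(ratio)})")
```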

Page 25: Part III

Can netCDF4 performance be bad? How can I avoid the bad performance?

Page 26: 14. Chunk size

A chunk size that is too small is bad.

A chunk size slightly smaller than (number of elements) / N is bad.

Page 27: 14. Chunk size

[Figure: two chunk layouts for a dataset of edge length 3162: one with chunk edge 791 (chunk0 to chunk3 shown) and one with chunk edge 790 (chunk0 to chunk8 shown)]
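The "slightly smaller than (number of elements) / N" pitfall can be worked through with the edge lengths from the figure. The sketch below assumes a square 2-dimensional dataset and assumes that every chunk, including partially filled edge chunks, occupies a full chunk's worth of storage. chunk_grid is a hypothetical helper, and the chunk counts it prints are not meant to match the schematic figure.

```python
import math


def chunk_grid(dataset_edge, chunk_edge, ndims=2):
    """Chunks per dimension, total chunks, and allocated/actual element ratio."""
    per_dim = math.ceil(dataset_edge / chunk_edge)
    total = per_dim ** ndims
    allocated = (per_dim * chunk_edge) ** ndims
    overhead = allocated / dataset_edge ** ndims
    return per_dim, total, overhead


if __name__ == "__main__":
    for chunk_edge in (791, 790):
        per_dim, total, overhead = chunk_grid(3162, chunk_edge)
        print(f"chunk edge {chunk_edge}: {per_dim} per dimension, "
              f"{total} chunks, {overhead:.2f}x the dataset's elements")
```

With edge 791, four chunks cover 3164 elements per dimension, just over the 3162 needed. With edge 790, four chunks cover only 3160, so a fifth, nearly empty chunk is required along each dimension.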

Page 28: 14. Chunk size

Dataset: ≈ 64 MB; Storage Layout: chunked (default chunk cache); System Cache: drop

[Figure: file size (MB) versus number of elements for each dimension in a chunk (8, 16, 32, 50, 128, 200)]

[Figure: data write rate (MB/s) versus number of elements for each dimension in a chunk (8, 16, 32, 50, 128, 200)]

Page 29: 14. Chunk size (more)

Dataset: ≈ 64 MB; Storage Layout: chunked (default chunk cache); System Cache: drop

[Figure: file size (MB) versus number of elements for each dimension in a chunk (316, 527, 791, 1054, 1581, 2400, 3162)]

[Figure: data write rate (MB/s) versus number of elements for each dimension in a chunk (316, 527, 791, 1054, 1581, 2400, 3162); annotations: n, n + 1, n - 1]

Page 30: 15. Many Hyperslab selections

[Figure: labels H5Pcreate() and H5Dopen()]

Page 31: 15. Many Hyperslab selections

Page 32: Conclusion

The performance in netCDF4 is comparable with that in netCDF3.

Improvement: non-contiguous access pattern; adjusted cache size; compression

Pitfall: small chunk size; many small hyperslab selections