Storage Optimization Strategies
Techniques for configuring your Progress OpenEdge Database in order to minimize IO operations

Tom Bascom, White Star Software
[email protected]

Uploaded by maxine-dyer, 31-Dec-2015



TRANSCRIPT

Page 1: Storage Optimization Strategies

Storage Optimization Strategies
Techniques for configuring your Progress OpenEdge Database in order to minimize IO operations

Tom Bascom, White Star Software
[email protected]

Page 2: Storage Optimization Strategies

A Few Words about the Speaker

• Tom Bascom; Progress user & roaming DBA since 1987

• President, DBAppraise, LLC
  – Remote database management service for OpenEdge.
  – Simplifying the job of managing and monitoring the world’s best business applications.
  – [email protected]

• VP, White Star Software, LLC
  – Expert consulting services related to all aspects of Progress and OpenEdge.
  – [email protected]

Page 3: Storage Optimization Strategies

We Will NOT be Talking about:

• SANs
• Servers
• Operating systems
• RAID levels
• … and so forth.

Page 4: Storage Optimization Strategies

What Do We Mean by “Storage Optimization”?

• The trade press thinks it means BIG DISKS.
• Your CFO thinks it means BIG SAVINGS.
• Programmers think it means BIG DATABASES.
• SAN vendors think it means BIG COMMISSIONS.

• DBAs seek the best possible reliability and performance at a reasonable cost.

Page 5: Storage Optimization Strategies

The Foundation of OpenEdge Storage Optimization

Page 6: Storage Optimization Strategies

Type 2 Storage Areas

• Type 2 storage areas are the foundation for all advanced features of the OpenEdge database.

• Type 2 areas have cluster sizes of 8, 64 or 512 blocks.

• Cluster sizes of 0 or 1 denote Type 1 areas.

• Data blocks in Type 2 areas contain data from just one table.

# misc32 storage area: area number 11, 32 rows per block, cluster size 8
d "misc32_dat":11,32;8 .
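Structure-file lines like the one above follow the pattern `d "name":areanum,rpb;clustersize path`. A small illustrative sketch for generating them (the helper name is mine, not a Progress tool):

```python
def st_data_line(area, area_num, rpb, cluster, path="."):
    """Format a structure (.st) file data-extent line like the example
    above: d "misc32_dat":11,32;8 .
    A cluster size of 8, 64 or 512 makes the area Type 2."""
    return f'd "{area}":{area_num},{rpb};{cluster} {path}'

print(st_data_line("misc32_dat", 11, 32, 8))  # → d "misc32_dat":11,32;8 .
```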

Page 7: Storage Optimization Strategies

Only Read What You Need

• Because data blocks in Type 2 storage areas are “asocial”:
  – Locality of reference is leveraged more strongly.
  – Table-oriented utilities such as index rebuild, binary dump and so forth know exactly which blocks they need to read and which blocks they do not need to read.
  – DB features, such as the SQL-92 fast table scan and fast table drop, can operate much more effectively.

Page 8: Storage Optimization Strategies

MYTH

• Storage optimization is just for large tables.

• Type 2 storage areas are just for large tables.

Page 9: Storage Optimization Strategies

Truth

• Very small, yet active tables often dominate an application’s IO profile.

• And type 2 areas are a very powerful tool for addressing this.

Page 10: Storage Optimization Strategies

Case Study

• A system with 30,000 record reads/sec:
  – The bulk of the reads were from one 10,000 record table.
  – Coincidentally, –B was set to 10,000.
  – That table was in a Type 1 area, and its records were widely scattered.
  – Moving the table to a Type 2 area fixed the problem: only 2% of –B was now needed for this table!
  – Performance improved dramatically.

Page 11: Storage Optimization Strategies

Type 2 Storage Area Usage

• Always use Type 2 areas…

• … for areas that contain data, indexes or LOBs.

• The schema area is a Type 1 area.

Page 12: Storage Optimization Strategies

How to Define Your Storage Areas

Page 13: Storage Optimization Strategies

Use the Largest DB Block Size

• Large blocks reduce IO; fewer operations are needed to move the same amount of data.

• More data can be packed into the same space because there is proportionally less overhead.

• Because a large block can contain more data, it has improved odds of being a cache “hit.”

• Large blocks enable HW features to be leveraged: especially SAN HW.

Page 14: Storage Optimization Strategies

What about Windows?

• There are those who would say “except for Windows.”

• (Because Windows is a 4K-oriented OS.)

• I have had good success with Windows & 8k blocks.

• NTFS can be changed to use an 8k block…

Page 15: Storage Optimization Strategies

Use Many (Type 2) Storage Areas

• Do NOT assign tables to areas based on “function.”

• Instead, group objects by common “technical attributes.”

• Create distinct storage areas for:
  – Each very large table
  – Tables with common Rows Per Block settings
  – Indexes versus data

Page 16: Storage Optimization Strategies

Record Fragmentation

Page 17: Storage Optimization Strategies

Fragmentation and Scatter

• “Fragmentation” is splitting records into multiple pieces.

• “Scatter” is the distance between (logically) adjacent records.


Page 19: Storage Optimization Strategies

Fragmentation and Scatter

• “Fragmentation” is splitting records into multiple pieces.

• “Scatter” is the distance between (logically) adjacent records.

$ proutil dbname –C dbanalys > dbname.dba
…
RECORD BLOCK SUMMARY FOR AREA "APP_FLAGS_Dat" : 95
-------------------------------------------------------
                         -Record Size (B)-    -Fragments-   Scatter
Table          Records    Size   Min Max Mean    Count Factor  Factor
PUB.APP_FLAGS  1676180   47.9M    28  58   29  1676190    1.0     1.9
…
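Output like the dbanalys summary above lends itself to simple post-processing. A sketch (assuming the whitespace-separated table layout shown; the function name is mine) that pulls out the fragmentation and scatter figures:

```python
def parse_dbanalys_table_line(line):
    """Parse one table line from a dbanalys RECORD BLOCK SUMMARY section.

    Expected whitespace-separated fields:
    name records size min max mean fragment-count fragment-factor scatter-factor
    """
    parts = line.split()
    if len(parts) != 9:
        raise ValueError("unexpected dbanalys line: %r" % line)
    return {
        "table": parts[0],
        "records": int(parts[1]),
        "size": parts[2],
        "fragments": int(parts[6]),
        "fragment_factor": float(parts[7]),
        "scatter_factor": float(parts[8]),
    }

sample = "PUB.APP_FLAGS 1676180 47.9M 28 58 29 1676190 1.0 1.9"
stats = parse_dbanalys_table_line(sample)

# More fragments than records means some records are split into pieces.
extra_fragments = stats["fragments"] - stats["records"]
print(stats["table"], extra_fragments)  # → PUB.APP_FLAGS 10
```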

Page 20: Storage Optimization Strategies

Create Limit

• The minimum free space left in a block
• Provides room for routine record expansion

• OE 10.2B default is 150 (4k & 8k blocks)
• Must be smaller than the Toss Limit
• Only rarely worth adjusting

Page 21: Storage Optimization Strategies

Toss Limit

• The minimum free space required for a block to be on the “RM Chain”
• Avoids looking for space in blocks that don’t have much

• Must be set higher than the Create Limit
• Default is 300 (4k & 8k blocks)
• Ideally should be less than the average row size
• Only rarely worth adjusting

Page 22: Storage Optimization Strategies

Fragmentation, Create & Toss Summary

Page 23: Storage Optimization Strategies

Create and Toss Limit Usage

Symptom                                                   Action
-------------------------------------------------------   ------------------------------
Fragmentation occurs on updates to existing records;      Increase Create Limit
you anticipated one fragment, but two were created.         - or - decrease Rows Per Block

There is limited (or no) fragmentation, but database      Decrease Create Limit
block space is being used inefficiently, and records        - or - increase Rows Per Block
are not expected to grow beyond their original size.

You have many (thousands, not hundreds) of blocks on      Increase Toss Limit
the RM chain with insufficient space to create new
records.

* Create and Toss limits are per area for Type 1 areas and per table for Type 2 areas.
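For illustration only (this is not a Progress utility), the symptom/action table can be encoded as a tiny advisory helper, where each argument flags one observed symptom:

```python
def limit_advice(fragmentation_on_update, wasted_block_space, many_rm_chain_blocks):
    """Return tuning suggestions matching the symptom table above.

    Each argument is a boolean describing an observed symptom; the
    returned list contains the corresponding actions.
    """
    advice = []
    if fragmentation_on_update:
        advice.append("Increase Create Limit or decrease Rows Per Block")
    if wasted_block_space:
        advice.append("Decrease Create Limit or increase Rows Per Block")
    if many_rm_chain_blocks:
        advice.append("Increase Toss Limit")
    return advice

print(limit_advice(True, False, True))
```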

Page 24: Storage Optimization Strategies

Rows Per Block

Page 25: Storage Optimization Strategies

Why not “One Size Fits All”?

• A universal setting such as 128 rows per block seems simple.
• And for many situations it is adequate.
• But…
• Too large a value may lead to fragmentation, and too small a value to wasted space.
• It also makes advanced data analysis more difficult.
• And it really isn’t that hard to pick good values for RPB.

Page 26: Storage Optimization Strategies

Set Rows Per Block Optimally

• Use the largest Rows Per Block that:
  – Fills the block
  – But does not unnecessarily fragment it

• Rough guideline:
  – Next power of 2 after BlockSize / (AvgRecSize + 20)
  – Example: 8192 / (220 + 20) = 34; next power of 2 = 64

• Caveat: there are far more complex rules that can be used, and a great deal depends on the application’s record creation & update behavior.

# misc32 storage area
d "misc32_dat":11,32;8 .
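The rough guideline above can be sketched as a small calculation (the function name is mine; the 20-byte per-record overhead and the 256 cap on RPB settings are as stated in OpenEdge documentation):

```python
def suggested_rpb(block_size, avg_rec_size, overhead=20):
    """Rough RPB guideline from the slide above: the next power of 2
    after block_size / (avg_rec_size + overhead), capped at 256
    (the largest rows-per-block setting OpenEdge allows)."""
    fit = block_size // (avg_rec_size + overhead)
    rpb = 1
    while rpb <= fit:
        rpb *= 2
    return min(rpb, 256)

print(suggested_rpb(8192, 220))  # 8192 // 240 = 34 → next power of 2 = 64
```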

Page 27: Storage Optimization Strategies

RPB Example

Pages 28-30: Storage Optimization Strategies

Set Rows Per Block Optimally

BlkSz  RPB  Blocks  Disk (KB)  Waste/Blk  %Used  Actual RPB  IO/1,000 Recs
    1    4   3,015      3,015        124    86%           3            333
    4    4   2,500     10,000      2,965    23%           4            250
    4    8   1,250      5,000      2,075    46%           8            125
    4   16     627      2,508        295    92%          16             62
    4   32     596      2,384        112    97%          17             59
    8    4   2,500     20,000      7,060    11%           4            250
    8   16     625      5,000      4,383  44.76          16             62
    8   32     313      2,504        806    90%          32             31
    8   64     286      2,288        114    98%          35             29
    8  128     285      2,280        109    98%          35             29

Row annotations from the original slides: Original, Suggested, Oops!

Page 31: Storage Optimization Strategies

Rows Per Block Caveats

• Blocks have overhead, which varies by storage area type, block size, Progress version, and by tweaking the create and toss limits.

• Not all data behaves the same:
  – Records that are created small and that grow frequently may tend to fragment if RPB is too high.
  – Record size distribution is not always Gaussian.

• If you’re unsure – round up!

Page 32: Storage Optimization Strategies

Cluster Size

Page 33: Storage Optimization Strategies

Blocks Per Cluster

• When a Type 2 area expands, it does so a cluster at a time.

• Larger clusters are more efficient:
  – Expansion occurs less frequently.
  – Disk space is more likely to be contiguously arranged.

# misc32 storage area
d "misc32_dat":11,32;8 .

Page 34: Storage Optimization Strategies

Why not “One Size Fits All”?

• A universal setting such as 512 blocks per cluster seems simple…

Page 35: Storage Optimization Strategies

Set Cluster Size Optimally

• There is no advantage to having a cluster more than twice the size of the table.

• Except that you need a cluster size of at least 8 to be Type 2.

• Indexes are usually much smaller than data.
• There may be dramatic differences in the size of indexes, even on the same table.
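One way to apply these rules, given an object's approximate size in blocks (an illustrative sketch under the stated heuristics, not a Progress tool):

```python
def pick_cluster_size(table_blocks):
    """Pick a Type 2 cluster size (8, 64 or 512 blocks) using the rules
    above: prefer larger clusters, but a cluster more than twice the
    size of the object buys nothing, and 8 is the Type 2 minimum."""
    for cluster in (512, 64, 8):
        if cluster <= 2 * table_blocks:
            return cluster
    return 8  # tiny objects still need cluster size 8 to be Type 2

print(pick_cluster_size(10000))  # large table → 512
print(pick_cluster_size(40))     # small index → 64
print(pick_cluster_size(2))      # tiny table → 8
```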

Page 36: Storage Optimization Strategies

Different Index Sizes

$ proutil dbname –C dbanalys > dbname.dba
…
RECORD BLOCK SUMMARY FOR AREA "APP_FLAGS_Dat" : 95
-------------------------------------------------------
                         -Record Size (B)-    -Fragments-   Scatter
Table          Records    Size   Min Max Mean    Count Factor  Factor
PUB.APP_FLAGS  1676180   47.9M    28  58   29  1676190    1.0     1.9
…

INDEX BLOCK SUMMARY FOR AREA "APP_FLAGS_Idx" : 96
-------------------------------------------------------
Table          Index                 Flds  Lvls  Blks    Size  %Util  Factor
PUB.APP_FLAGS  AppNo            183     1     3  4764   37.1M   99.9     1.0
               FaxDateTime      184     2     2    45  259.8K   72.4     1.6
               FaxUserNotified  185     2     2    86  450.1K   65.6     1.7

Page 37: Storage Optimization Strategies

Logical Scatter

Page 38: Storage Optimization Strategies

Logical Scatter Case Study

• A process reading approximately 1,000,000 records.

• An initial run time of 2 hours.
  – 139 records/sec.

• Un-optimized database.

Pages 39-41: Storage Optimization Strategies

Perform IO in the Optimal Order

4k DB Block:

Table   Index     %Sequential  %Idx Used  Density
Table1  t1_idx1*           0%       100%     0.09
        t1_idx2            0%         0%     0.09
Table2  t2_idx1           69%        99%     0.51
        t2_idx2*          98%         1%     0.51
        t2_idx3           74%         0%     0.51

8k DB Block:

Table   Index     %Sequential  %Idx Used  Density
Table1  t1_idx1*          71%       100%     0.10
        t1_idx2           63%         0%     0.10
Table2  t2_idx1           85%        99%     1.00
        t2_idx2*         100%         1%     1.00
        t2_idx3           83%         0%     0.99

Row annotation from the original slides: Oops!

Page 42: Storage Optimization Strategies

Logical Scatter Case Study

Block  Hit    %Sequential  Block       IO Ops  Time
Size   Ratio               References
4k       95            69     319,719  19,208   120
4k       98            69     320,149   9,816    60
4k       99            69     320,350   6,416    40
8k       95            85     160,026   9,417    55
8k       98            85     159,805   4,746    30
8k       99            85     160,008   3,192    20

The process was improved from an initial runtime of roughly 2 hours (top line) to approximately 20 minutes (bottom line) by moving from 4k blocks and 69% sequential access at a hit ratio of approximately 95% to 8k blocks, 85% sequential access and a hit ratio of 99%.

Page 43: Storage Optimization Strategies

Avoid IO, But If You Must…

Page 44: Storage Optimization Strategies

… in Big B You Should Trust!

Layer                 Time  # of Recs  # of Ops  Cost per Op  Relative
Progress to –B        0.96    100,000   203,473     0.000005         1
–B to FS Cache       10.24    100,000    26,711     0.000383        75
FS Cache to SAN       5.93    100,000    26,711     0.000222        45
–B to SAN Cache*     11.17    100,000    26,711     0.000605       120
SAN Cache to Disk   200.35    100,000    26,711     0.007500      1500
–B to Disk          211.52    100,000    26,711     0.007919      1585

* Used concurrent IO to eliminate the FS cache

Page 45: Storage Optimization Strategies

New Feature!

• 10.2B supports a new feature called the “Alternate Buffer Pool.”

• This can be used to isolate specified database objects (tables and/or indexes).

• The alternate buffer pool has its own distinct –B.
• If the database objects are smaller than –B, there is no need for the LRU algorithm.
• This can result in major performance improvements for small, but very active, tables.
• proutil dbname –C enableB2 areaname
• Table- and index-level selection is for Type 2 areas only!

Page 46: Storage Optimization Strategies

Conclusion

• Always use Type 2 storage areas.
• Define your storage areas based on technical attributes of the data.
• Static analysis isn’t enough – you need to also monitor runtime behaviors.
• White Star Software has a great deal of experience in optimizing storage. We would be happy to engage with any customer that would like our help!

Page 47: Storage Optimization Strategies

Thank You!

Page 48: Storage Optimization Strategies

Questions?