
Post on 12-Jan-2016


Page 1: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Physical Database Design

Barry Floyd

BUS 498 Advanced Database Management Systems

Page 2: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Introduction

The Physical Database Design Process

Goal is to translate our conceptual designs into physical reality

Draw on requirements analysis and our conceptual data model

Page 3: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Agenda

Data Volume and Usage Analysis
Data Distribution Strategy
    discuss this later in the quarter
Indexes
Denormalization

Page 4: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Overview

Important step in the database design process (also the last step)

Decisions made here impact ...
    data accessibility
    response times
    usability

Page 5: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Vocabulary

Data volume - how many records
Data usage - how often and in what manner are the records used

Page 6: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Data Volume Analysis

Use volume analysis to
    select physical storage devices
    estimate costs of storage

Page 7: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Data Volume Analysis

[Diagram: entities with known record counts]
TREATMENT
PATIENT
PHYSICIAN   50      GIVEN
CHARGE
ITEM        500     GIVEN
LOCATION    100     GIVEN

Page 8: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Data Volume Analysis

[Diagram: entities with record counts; relationship annotations (10), (20)]
TREATMENT
PATIENT     1000    DERIVE
PHYSICIAN   50
CHARGE
ITEM        500
LOCATION    100

* Keep patient record active for 30 days
* Average length of stay for a patient is 3 days

100 X 30 / 3 => 1000

Page 9: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Data Volume Analysis

[Diagram: entities with record counts; relationship annotations (10), (20), (4)]
TREATMENT   4000    DERIVE
PATIENT     1000
PHYSICIAN   50
CHARGE
ITEM        500
LOCATION    100

* Each patient has 4 treatments on average.

1000 X 4 => 4000

Page 10: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Data Volume Analysis

[Diagram: entities with record counts; relationship annotations (20), (4), (20), (10)]
TREATMENT   4000
PATIENT     1000
PHYSICIAN   50
CHARGE      10,000  DERIVE
ITEM        500
LOCATION    100

* Each patient has 10 charges on average.

1000 X 10 => 10,000

Page 11: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Data Volume Analysis

[Diagram: entities with record counts; relationship annotations (10), (20), (4), (20), (10)]
TREATMENT   4000
PATIENT     1000
PHYSICIAN   50
CHARGE      10,000
ITEM        500
LOCATION    100

KNOW ... Number of records and relationships
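
The derivations on the preceding slides can be replayed as a short calculation. This is only a sketch: the constant names are made up for illustration, and reading the "100" in the PATIENT formula as the LOCATION count is an assumption, since the slides show the number without labelling it.

```python
# Given volumes from the slides.
LOCATION = 100    # assumption: this is the "100" in "100 X 30 / 3"
PHYSICIAN = 50
ITEM = 500

# Usage facts stated on the slides.
RETENTION_DAYS = 30          # keep patient record active for 30 days
AVG_STAY_DAYS = 3            # average length of stay for a patient
TREATMENTS_PER_PATIENT = 4   # each patient has 4 treatments on average
CHARGES_PER_PATIENT = 10     # each patient has 10 charges on average


def derive_volumes():
    """Return record counts for every entity, given plus derived."""
    patient = LOCATION * RETENTION_DAYS // AVG_STAY_DAYS
    return {
        "LOCATION": LOCATION,
        "PHYSICIAN": PHYSICIAN,
        "ITEM": ITEM,
        "PATIENT": patient,
        "TREATMENT": patient * TREATMENTS_PER_PATIENT,
        "CHARGE": patient * CHARGES_PER_PATIENT,
    }


volumes = derive_volumes()
for name, count in volumes.items():
    print(f"{name:<10} {count:>7,}")
```

Changing one usage fact (say, a longer retention period) and rerunning shows how far the change ripples through the dependent entities.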

Page 12: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Data Usage Analysis

Want to identify the major transactions and processes that hit the database

Analyze each transaction and process to determine access paths used and frequency of use

Create composite map from individual analyses

Page 13: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Transaction Analysis Form

TRANSACTION NUMBER: MVCH-4
TRANSACTION NAME: CREATE PATIENT BILL
TRANSACTION VOLUME: AVERAGE 2/HR, PEAK 10/HR

[Diagram: PATIENT (1000), CHARGE (10,000), ITEM (500), with access paths marked (1), (2), (3)]

NO.  NAME            ACCESS TYPE  TRAN REF  PERIOD REF
(1)  ENTRY-PATIENT   READ         1         10

Page 14: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Transaction Analysis Form

NO.  NAME            ACCESS TYPE  TRAN REF  PERIOD REF
(1)  ENTRY-PATIENT   READ         1         10
(2)  PATIENT-CHARGE  READ         10        100
(3)  CHARGE-ITEM     READ         10        100

[Diagram: PATIENT (1000), CHARGE (10,000), ITEM (500), with access paths marked (1), (2), (3)]
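
The PERIOD REF column on the form appears to be the per-transaction reference count multiplied by the peak hourly volume (10/HR for MVCH-4). The sketch below works under that assumption. The function names are hypothetical, and the `Counter` shows one way several such forms could be summed into a composite map.

```python
from collections import Counter

PEAK_PER_HOUR = 10  # MVCH-4 peak transaction volume from the form

# (access path, access type, references per transaction) from the form.
MVCH4_STEPS = [
    ("ENTRY-PATIENT", "READ", 1),
    ("PATIENT-CHARGE", "READ", 10),
    ("CHARGE-ITEM", "READ", 10),
]


def period_refs(steps, tx_per_period):
    """References per period = references per transaction x transactions per period."""
    return {name: refs * tx_per_period for name, _access_type, refs in steps}


def add_to_composite(composite, steps, tx_per_period):
    """Accumulate one transaction's period references into a running composite map."""
    composite.update(period_refs(steps, tx_per_period))  # Counter.update adds counts
    return composite


composite = add_to_composite(Counter(), MVCH4_STEPS, PEAK_PER_HOUR)
for path, refs in composite.items():
    print(f"{path:<15} {refs:>4} per hour at peak")
```

Calling `add_to_composite` once per analyzed transaction would accumulate the totals for each access path.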

Page 15: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Composite Usage Map

Determine how the data structures are accessed for each transaction and process
Include
    programs
    standard queries
        programmed
        ad hoc

Page 16: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Composite Usage Map

[Diagram: composite usage map, first build]
TREATMENT   4000
PATIENT     1000
PHYSICIAN   50
CHARGE      10,000
ITEM        500
LOCATION    100

Access frequencies shown on the map: (25), (50), (50), (50)
NUMBER IS PER HOUR AT PEAK VOLUME

Page 17: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Composite Usage Map

[Diagram: composite usage map, second build]
TREATMENT   4000
PATIENT     1000
PHYSICIAN   50
CHARGE      10,000
ITEM        500
LOCATION    100

Access frequencies shown on the map: (75), (25), (30), (200), (20), (50), (50), (100)

Page 18: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Composite Usage Map

[Diagram: composite usage map, complete]
TREATMENT   4000
PATIENT     1000
PHYSICIAN   50
CHARGE      10,000
ITEM        500
LOCATION    100

Access frequencies shown on the map: (75), (25), (30), (25), (200), (20), (50), (50), (50), (50), (50), (100)

Page 19: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Summary

Given volume and usage knowledge we can consider different physical implementation strategies, including ...
    INDEXES
    DENORMALIZATION
    CLUSTERING

Page 20: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Indexes

Purpose: To speed up access to a particular row or a group of rows in a table.

Also used to enforce uniqueness
Eliminates the necessity of re-sorting the table each time we need to create a sequenced list

Page 21: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Indexes

Index (name, row)       Table (row, name, ...)
Allen   3               1  Marvin  ...
Brian   6               2  John    ...
Carole  7               3  Allen   ...
John    2               4  Sue     ...
Karen   5               5  Karen   ...
Marvin  1               6  Brian   ...
Sharon  8               7  Carole  ...
Sue     4               8  Sharon  ...
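
The picture above can be mimicked in a few lines: the table stays in row order, the index is a separate list sorted by name, and a lookup does a binary search on the index instead of scanning the table. This is an illustrative sketch only. Real DBMS indexes are typically B-trees, and the names `TABLE`, `INDEX` and `lookup` are made up here.

```python
import bisect

# The table in physical (row-number) order, as on the slide.
TABLE = {
    1: "Marvin", 2: "John", 3: "Allen", 4: "Sue",
    5: "Karen", 6: "Brian", 7: "Carole", 8: "Sharon",
}

# The index: (name, row number) pairs kept sorted by name.
INDEX = sorted((name, row) for row, name in TABLE.items())


def lookup(name):
    """Binary-search the index for name; return its row number or None."""
    i = bisect.bisect_left(INDEX, (name,))
    if i < len(INDEX) and INDEX[i][0] == name:
        return INDEX[i][1]
    return None


print(lookup("Karen"))
# A sequenced list comes straight from the index, with no re-sort of the table.
print([name for name, _row in INDEX])
```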

Page 22: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Example

SELECT NAME, DEPT, RATING FROM EMP WHERE RATING = 10;

Indexing on RATING improves performance. Without an index, must do a full table scan.
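
One way to see the difference is to ask the database for its query plan before and after the index exists. The sketch below uses Python's built-in `sqlite3` module as a stand-in DBMS, which is an assumption since the slides do not name one here. The column types, the sample rows and the index name `EMP_RATING_IDX` are invented for illustration, and the plan wording varies by SQLite version.

```python
import sqlite3

QUERY = "SELECT NAME, DEPT, RATING FROM EMP WHERE RATING = 10"


def query_plan(conn):
    """Return SQLite's plan description for QUERY as one string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + QUERY).fetchall()
    return " | ".join(row[-1] for row in rows)  # last column holds the detail text


conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE EMP (EMP_NO INTEGER PRIMARY KEY, NAME TEXT, DEPT TEXT, RATING INTEGER)"
)
conn.executemany(
    "INSERT INTO EMP (NAME, DEPT, RATING) VALUES (?, ?, ?)",
    [(f"emp{i}", f"D{i % 5}", i % 11) for i in range(1000)],  # made-up rows
)

plan_before = query_plan(conn)   # no index on RATING yet: full table scan
conn.execute("CREATE INDEX EMP_RATING_IDX ON EMP (RATING)")
plan_after = query_plan(conn)    # the plan should now mention the index

print("before:", plan_before)
print("after: ", plan_after)
```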

Page 23: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Costs of an index?

Storage space
Maintenance
    Indexes must be changed for each add/delete or change in value on an indexed field.
    One benchmark ... insert into table w/o indexes, 0.11 seconds; w/ 8 indexes, 0.94 seconds.
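
A rough way to reproduce this kind of measurement is to time the same batch of inserts with and without indexes. The sketch below uses an in-memory SQLite database, not the system behind the benchmark quoted above. The table layout, row count and function name are arbitrary, so the absolute numbers will differ and only the direction of the effect is of interest.

```python
import sqlite3
import time


def timed_inserts(n_indexes, n_rows=20000):
    """Insert n_rows into an 8-column table carrying n_indexes single-column indexes.

    Returns (elapsed seconds, rows stored).
    """
    conn = sqlite3.connect(":memory:")
    cols = [f"c{i}" for i in range(8)]
    conn.execute(f"CREATE TABLE t ({', '.join(c + ' INTEGER' for c in cols)})")
    for c in cols[:n_indexes]:
        conn.execute(f"CREATE INDEX idx_{c} ON t ({c})")
    rows = [tuple((i * (k + 7)) % 9973 for k in range(8)) for i in range(n_rows)]

    start = time.perf_counter()
    conn.executemany("INSERT INTO t VALUES (?, ?, ?, ?, ?, ?, ?, ?)", rows)
    conn.commit()
    elapsed = time.perf_counter() - start

    count = conn.execute("SELECT COUNT(*) FROM t").fetchone()[0]
    conn.close()
    return elapsed, count


for n in (0, 8):
    elapsed, count = timed_inserts(n)
    print(f"{n} indexes: {count} rows in {elapsed:.3f} s")
```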

Page 24: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Access Indexes

Automatically created on primary key.

You must create other indexes as needed.

Note, creating a unique index on a foreign key turns the relationship into a 1 - 1 relationship rather than a 1 - m relationship.
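
The effect of a unique index on a foreign key can be shown with any relational engine. The sketch below uses SQLite through Python rather than Access, and the `dept`/`emp` tables are hypothetical. Before the unique index, two employees may share a department (1 - m). After it, a second employee in the same department is rejected (1 - 1).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE dept (dept_id INTEGER PRIMARY KEY)")
conn.execute(
    "CREATE TABLE emp (emp_id INTEGER PRIMARY KEY, "
    "dept_id INTEGER REFERENCES dept(dept_id))"
)
conn.execute("INSERT INTO dept VALUES (1)")

# Without a unique index, many children per parent are fine (1 - m).
conn.execute("INSERT INTO emp VALUES (100, 1)")
conn.execute("INSERT INTO emp VALUES (101, 1)")

# Remove the duplicate so the unique index can be built, then create it.
conn.execute("DELETE FROM emp WHERE emp_id = 101")
conn.execute("CREATE UNIQUE INDEX emp_dept_uq ON emp (dept_id)")


def try_second_child():
    """Return True if a second emp row for dept 1 is accepted, False if rejected."""
    try:
        conn.execute("INSERT INTO emp VALUES (102, 1)")
        return True
    except sqlite3.IntegrityError:
        return False


second_child_allowed = try_second_child()
print("second child allowed:", second_child_allowed)
```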

Let’s consider Oracle indexes and performance ...

Page 25: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Oracle Indexes

26,000 Rows, 7 Rows per Block

                  Seconds
% OF FILE READ    INDEX ONLY    INDEX + TABLE    FULL TABLE SCAN
8.5               0.66          12.03            35.70
15.5              1.04          16.21            35.70
25.2              1.54          25.45            35.70
50.7              2.80          33.89            35.70
100               5.72          87.23            35.70

INDEX ONLY:     SELECT COUNT(*) FROM EMP WHERE EMP_NO > 0
INDEX + TABLE:  SELECT EMP_NAME FROM EMP WHERE EMP_NO > 0

BREAK-EVEN (marked on the chart)

Page 26: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Oracle Indexes

26,000 Rows, 258 Rows per Block

                  Seconds
% OF FILE READ    INDEX ONLY    INDEX + TABLE    FULL TABLE SCAN
8.5               0.66          2.31             4.52
15.5              1.05          4.01             4.52
25.2              1.59          6.37             4.52
50.7              2.91          12.69            4.52
100               6.01          25.37            4.52

INDEX ONLY:     SELECT COUNT(*) FROM EMP WHERE EMP_NO > 0
INDEX + TABLE:  SELECT EMP_NAME FROM EMP WHERE EMP_NO > 0

BREAK-EVEN (marked on the chart)
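
The break-even point on each chart can be estimated from the timing figures. The sketch below reads the three timing columns as index only, index + table and full table scan, which is an interpretation of the chart labels, and it interpolates linearly between measured points, which is a simplification. The function name is hypothetical.

```python
# (percent of file read, index-only s, index+table s, full-table-scan s)
ROWS_7_PER_BLOCK = [
    (8.5, 0.66, 12.03, 35.70),
    (15.5, 1.04, 16.21, 35.70),
    (25.2, 1.54, 25.45, 35.70),
    (50.7, 2.80, 33.89, 35.70),
    (100.0, 5.72, 87.23, 35.70),
]
ROWS_258_PER_BLOCK = [
    (8.5, 0.66, 2.31, 4.52),
    (15.5, 1.05, 4.01, 4.52),
    (25.2, 1.59, 6.37, 4.52),
    (50.7, 2.91, 12.69, 4.52),
    (100.0, 6.01, 25.37, 4.52),
]


def break_even(rows):
    """Percent of file read at which index + table time reaches full-scan time.

    Linearly interpolates between the two measurements that bracket the
    crossing; returns None if the curves never cross in the measured range.
    """
    for (p0, _, t0, full), (p1, _, t1, _) in zip(rows, rows[1:]):
        if t0 <= full < t1:
            return p0 + (full - t0) / (t1 - t0) * (p1 - p0)
    return None


print(f"7 rows/block:   about {break_even(ROWS_7_PER_BLOCK):.0f}% of the file")
print(f"258 rows/block: about {break_even(ROWS_258_PER_BLOCK):.0f}% of the file")
```

Under these assumptions the crossing falls at roughly 52% of the file with 7 rows per block and roughly 18% with 258 rows per block. This suggests that the more rows fit in a block, the sooner a full table scan beats an index + table lookup.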

Page 27: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Rules of thumb

Use indexes generously for applications which are decision support/retrieval based.

Use indexes judiciously for transaction processing applications.

Page 28: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Places to use indexes

PRIMARY KEY
FOREIGN KEYS
Non Key attributes that are referred to in qualification, sorting, and grouping (WHERE, ORDER BY, GROUP BY)

Page 29: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Denormalization

Goal is to reduce the number of physical reads to the storage devices by reducing the number of joins.

Page 30: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Costs of Denormalization

Makes coding more complex
Often sacrifices flexibility
Will speed up retrieval but slow updates

Page 31: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Including children in the parent record

Multiple addresses in the personnel record
    Absolute number of children for a parent is known (e.g., 2 addresses)
    The number won’t change over time
    The number is not very large

Page 32: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Clusters in Oracle

Clustering stores records from two tables into the same physical storage space
    Only useful for EQUI-JOINS
    Improves performance by 2-3 times

Page 33: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Storing most recent child data in the parent record

Multiple children, but children have an ordering (e.g., date of order)
    For example, perhaps storing amount of last order.
    Amount of last dividend paid to a particular account

Page 34: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Store running totals / Create extract tables

Store summary data from a child record
    Year to date sales

Create a summary table which contains aggregate values over some period (say, one month)
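
Both ideas can be sketched against a small SQLite database. The `customer`, `sale` and `monthly_sales` tables, their columns and the `record_sale` helper are hypothetical names chosen for illustration. The running total is kept in step with the child rows by updating it in the same transaction as the insert, which is the extra coding work denormalization brings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer (
    cust_id   INTEGER PRIMARY KEY,
    ytd_sales REAL NOT NULL DEFAULT 0      -- running total stored in the parent
);
CREATE TABLE sale (
    sale_id    INTEGER PRIMARY KEY,
    cust_id    INTEGER REFERENCES customer(cust_id),
    sale_month TEXT,
    amount     REAL
);
""")


def record_sale(cust_id, sale_month, amount):
    """Insert the child row and bump the parent's running total atomically."""
    with conn:  # commits both statements together, or rolls both back
        conn.execute(
            "INSERT INTO sale (cust_id, sale_month, amount) VALUES (?, ?, ?)",
            (cust_id, sale_month, amount),
        )
        conn.execute(
            "UPDATE customer SET ytd_sales = ytd_sales + ? WHERE cust_id = ?",
            (amount, cust_id),
        )


conn.execute("INSERT INTO customer (cust_id) VALUES (1)")
record_sale(1, "2016-01", 100.0)
record_sale(1, "2016-01", 50.0)
record_sale(1, "2016-02", 25.0)

# Extract table: aggregate values per customer per month, built from the detail rows.
conn.execute("""
CREATE TABLE monthly_sales AS
SELECT cust_id, sale_month, SUM(amount) AS total
FROM sale GROUP BY cust_id, sale_month
""")

ytd = conn.execute("SELECT ytd_sales FROM customer WHERE cust_id = 1").fetchone()[0]
monthly = conn.execute(
    "SELECT sale_month, total FROM monthly_sales ORDER BY sale_month"
).fetchall()
print(ytd, monthly)
```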

Page 35: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Duplicating a key beyond an immediate child record

[Diagram: CLASS, PARTS and ORDERS with their keys]
CLASS    CLASS_ID
PARTS    PART_ID, CLASS_ID
ORDERS   ORDER_ID, PART_ID, CLASS_ID   (ADD THIS KEY: CLASS_ID)

Page 36: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Consider SQL statement for previous example

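Before (three-table join):

SELECT O.PART_NO, O.ORDER_NO, C.CLASS, C.CLASS_DESC
FROM CLASS C, PART P, ORDERS O
WHERE O.PART_NO = P.PART_NO
AND P.CLASS = C.CLASS;

After duplicating the class key in the order record (two-table join):

SELECT O.PART_NO, O.ORDER_NO, C.CLASS, C.CLASS_DESC
FROM CLASS C, ORDERS O
WHERE O.CLASS = C.CLASS;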
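
To check that the two statements return the same rows, both can be run against a small SQLite database. In this sketch the order table is named `ORDERS` because `ORDER` is a reserved word in SQL, the selected columns are qualified so they are unambiguous, and the sample rows and class descriptions are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE CLASS  (CLASS TEXT PRIMARY KEY, CLASS_DESC TEXT);
CREATE TABLE PART   (PART_NO INTEGER PRIMARY KEY, CLASS TEXT REFERENCES CLASS(CLASS));
-- ORDERS.CLASS is the duplicated key: it repeats PART.CLASS for the ordered part.
CREATE TABLE ORDERS (ORDER_NO INTEGER PRIMARY KEY,
                     PART_NO INTEGER REFERENCES PART(PART_NO),
                     CLASS TEXT);
INSERT INTO CLASS  VALUES ('A', 'Fasteners'), ('B', 'Bearings');
INSERT INTO PART   VALUES (1, 'A'), (2, 'B'), (3, 'A');
INSERT INTO ORDERS VALUES (10, 1, 'A'), (11, 2, 'B'), (12, 3, 'A');
""")

# Normalized access path: ORDERS -> PART -> CLASS.
THREE_TABLE = """
SELECT O.PART_NO, O.ORDER_NO, C.CLASS, C.CLASS_DESC
FROM CLASS C, PART P, ORDERS O
WHERE O.PART_NO = P.PART_NO AND P.CLASS = C.CLASS
ORDER BY O.ORDER_NO
"""

# Denormalized access path: ORDERS -> CLASS directly, skipping PART.
TWO_TABLE = """
SELECT O.PART_NO, O.ORDER_NO, C.CLASS, C.CLASS_DESC
FROM CLASS C, ORDERS O
WHERE O.CLASS = C.CLASS
ORDER BY O.ORDER_NO
"""

rows_three = conn.execute(THREE_TABLE).fetchall()
rows_two = conn.execute(TWO_TABLE).fetchall()
print(rows_three == rows_two, rows_two)
```

The results agree only while `ORDERS.CLASS` matches `PART.CLASS`. Keeping the copy in step when a part changes class is the update cost paid for the shorter join.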

Page 37: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Record Partitioning

Breaking up a record into two parts

A,B,C,D,E,F,G

A,B,C,D

E,F,G

Page 38: Physical Database Design Barry Floyd BUS 498 Advanced Database Management Systems

Summary

Logical design gives you information about the ‘how’ to build the system.

Good physical design takes into account the performance of the final design … to know how best to do this task, you must understand how the system is being used!