TM 7-1 Copyright © 1999 Addison Wesley Longman, Inc.

Physical Database Design

Introduction

• The purpose of physical database design is to translate the logical description of data into the technical specifications for storing and retrieving data.

• The goal is to create a design for storing data that will provide adequate performance and ensure database integrity, security, and recoverability.

Inputs to Physical Design

• Normalized relations.

• Volume estimates.

• Attribute definitions.

• Data usage: entered, retrieved, deleted, updated.

• Response time requirements.

• Requirements for security, backup, recovery, retention, integrity.

• DBMS characteristics.

Physical Design Decisions

• Specifying attribute data types.

• Modifying the logical design.

• Specifying the file organization.

• Choosing indexes.

Composite usage map (Pine Valley Furniture Company)

Designing Fields

• Choosing data type.

• Coding, compression, encryption.

• Controlling data integrity.
  – Default value.
  – Range control.
  – Null value control.
  – Referential integrity.

Pg 257
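
The four integrity controls listed above can be declared directly in a table definition. The following is a minimal sketch using Python's built-in sqlite3 module; the tables, columns, and the 1 to 100 quantity range are hypothetical and are not taken from the Pine Valley Furniture example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# Hypothetical tables; each constraint illustrates one control from the slide.
conn.execute("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    )
""")  # NOT NULL: null value control
conn.execute("""
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        quantity    INTEGER NOT NULL DEFAULT 1 CHECK (quantity BETWEEN 1 AND 100)
    )
""")  # REFERENCES: referential integrity, DEFAULT: default value, CHECK: range control

conn.execute("INSERT INTO customer (customer_id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders (order_id, customer_id) VALUES (10, 1)")
default_quantity = conn.execute(
    "SELECT quantity FROM orders WHERE order_id = 10"
).fetchone()[0]


def rejected(sql):
    """Return True if the DBMS refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
    except sqlite3.IntegrityError:
        return True
    return False


out_of_range = rejected(
    "INSERT INTO orders (order_id, customer_id, quantity) VALUES (11, 1, 500)"
)
missing_name = rejected("INSERT INTO customer (customer_id, name) VALUES (2, NULL)")
orphan_order = rejected("INSERT INTO orders (order_id, customer_id) VALUES (12, 99)")
```

Declaring the rules in the schema means every program that touches the table is subject to them, rather than each program re-implementing the checks.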

Example code-look-up table (Pine Valley Furniture Company)

Pg 259

Designing Fields

• Handling missing data.
  – Substitute an estimate of the missing value.
  – Trigger a report listing missing values.
  – In programs, ignore missing data unless the value is significant.

Pg 260
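
The first two options above can be sketched in a few lines of Python. The records, the field name `weight`, and the choice of the mean as the estimate are all assumptions made for illustration; the slide does not prescribe a particular estimator.

```python
from statistics import mean

# Hypothetical records; None marks a missing value.
readings = [
    {"id": 1, "weight": 12.0},
    {"id": 2, "weight": None},
    {"id": 3, "weight": 18.0},
]


def report_missing(rows, field):
    """List the ids of rows whose field is missing (the 'report' option)."""
    return [row["id"] for row in rows if row[field] is None]


def substitute_estimate(rows, field):
    """Return copies of the rows with missing values replaced by the mean
    of the known values (one possible estimate among many)."""
    known = [row[field] for row in rows if row[field] is not None]
    estimate = mean(known)
    return [
        {**row, field: estimate if row[field] is None else row[field]}
        for row in rows
    ]


missing_ids = report_missing(readings, "weight")
filled = substitute_estimate(readings, "weight")
```

The original list is left unchanged, so the report of missing values can still be produced after the estimates have been substituted.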

Physical Records

• Physical Record: A group of fields stored in adjacent memory locations and retrieved together as a unit.

• Page: The amount of data read or written in one I/O operation.

• Blocking Factor: The number of physical records per page.

Pg 260
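
As a worked illustration of the blocking factor, the sketch below assumes fixed-length records that never span a page boundary; the page and record sizes are hypothetical.

```python
def blocking_factor(page_size_bytes, record_size_bytes):
    """Number of whole physical records that fit in one page."""
    if record_size_bytes <= 0 or record_size_bytes > page_size_bytes:
        raise ValueError("record size must be positive and fit in a page")
    return page_size_bytes // record_size_bytes


def pages_needed(record_count, page_size_bytes, record_size_bytes):
    """Pages required to hold record_count records (ceiling division)."""
    per_page = blocking_factor(page_size_bytes, record_size_bytes)
    return -(-record_count // per_page)


# Hypothetical figures: 4096-byte pages and 300-byte records.
records_per_page = blocking_factor(4096, 300)
unused_bytes_per_page = 4096 - records_per_page * 300
total_pages = pages_needed(1000, 4096, 300)
```

With these figures, 13 records fit per page and 196 bytes of each page go unused, which is why record size relative to page size matters in physical design.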

Designing Physical Files

• Physical File: A file as stored on the disk.

• Constructs to link two pieces of data:
  – Sequential storage.
  – Pointers.

• File Organization: How the files are arranged on the disk.

• Access Method: How the data can be retrieved based on the file organization.

Pg 267

Denormalization

• One-to-one relationship. Fig. 7-3a.

• Many-to-many relationship. Fig. 7-3b.

• One-to-many relationship. Fig. 7-3c.

A possible denormalization situation: reference data

Partitioning

• Horizontal Partitioning: Distributing the rows of a table into several separate files.

• Vertical Partitioning: Distributing the columns of a table into several separate files.
  – The primary key must be repeated in each file.
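
The two kinds of partitioning can be illustrated with in-memory lists standing in for files. This is only a sketch; the customer rows and column names are hypothetical.

```python
def horizontal_partition(rows, column):
    """Distribute rows into separate 'files' (lists), one per column value."""
    partitions = {}
    for row in rows:
        partitions.setdefault(row[column], []).append(row)
    return partitions


def vertical_partition(rows, primary_key, column_groups):
    """Distribute columns into separate 'files'; the primary key is
    repeated in each one so the original rows can be rebuilt."""
    return [
        [{primary_key: row[primary_key], **{c: row[c] for c in group}} for row in rows]
        for group in column_groups
    ]


# Hypothetical table contents.
customers = [
    {"customer_id": 1, "region": "East", "name": "Ada", "credit_limit": 500},
    {"customer_id": 2, "region": "West", "name": "Grace", "credit_limit": 900},
    {"customer_id": 3, "region": "East", "name": "Edsger", "credit_limit": 700},
]

by_region = horizontal_partition(customers, "region")
contact_file, credit_file = vertical_partition(
    customers, "customer_id", [["region", "name"], ["credit_limit"]]
)
```

Here `by_region` holds one list of rows per region, while `contact_file` and `credit_file` each hold every customer but only some of the columns.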

Partitioning

• Advantages of Partitioning:
  – Records used together are grouped together.
  – Each partition can be optimized for performance.
  – Security and recovery can be handled separately for each partition.
  – Partitions stored on different disks reduce contention.
  – Partitions can take advantage of parallel processing capability.

• Disadvantages of Partitioning:
  – Slow retrievals across partitions.
  – Complexity.

Sequential File Organization

• Records of the file are stored in sequence by the primary key field values.

Pg 268-9

Comparisons of file organizations: (a) Sequential

Indexed File Organizations

• Index concept.

• Indexed-sequential file organization: The records are stored sequentially by primary key values and there is an index built over the primary key field.

• B-tree index.

• Bitmap index, Fig. 7-7.

Pg 269
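
One way to picture indexed-sequential organization is a file of pages sorted by primary key, with a small index holding the first key on each page. The sketch below makes that assumption (a single-level index, not a B-tree), and the product records and blocking factor of 2 are hypothetical.

```python
from bisect import bisect_right


def build_pages(sorted_records, blocking_factor):
    """Split records (already sorted by key) into fixed-size pages."""
    return [
        sorted_records[i:i + blocking_factor]
        for i in range(0, len(sorted_records), blocking_factor)
    ]


def build_index(pages):
    """Single-level index: the lowest primary key stored on each page."""
    return [page[0][0] for page in pages]


def lookup(pages, index, key):
    """Use the index to pick one page, then scan only that page."""
    page_no = bisect_right(index, key) - 1
    if page_no < 0:
        return None  # key is smaller than every key in the file
    for record_key, value in pages[page_no]:
        if record_key == key:
            return value
    return None


# Hypothetical (primary key, description) records in key order.
records = [(100, "Table"), (125, "Chair"), (150, "Desk"), (175, "Lamp"), (200, "Shelf")]
pages = build_pages(records, blocking_factor=2)
index = build_index(pages)

found = lookup(pages, index, 175)
not_found = lookup(pages, index, 160)
```

A random lookup reads the index and one page instead of scanning the whole file, while a full scan in key order still works by reading the pages in sequence.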

(b) Indexed

Hashed File Organization

• Hashing Algorithm: Converts a primary key value into a record address.

• Division-remainder method.
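
A minimal sketch of the division-remainder method follows: the primary key is divided by the number of available addresses and the remainder is used as the record address. The choice of 7 addresses and the key values are hypothetical, and keeping colliding keys in a list at the same address is an assumption made here for illustration, not something the slide specifies.

```python
def hash_address(primary_key, address_count):
    """Division-remainder hashing: the remainder is the record address."""
    return primary_key % address_count


ADDRESS_COUNT = 7  # hypothetical; a prime number is a common choice

# Each address holds a list so that keys hashing to the same address
# (collisions) can all be kept; this overflow handling is an assumption.
hashed_file = {address: [] for address in range(ADDRESS_COUNT)}
for key in [1001, 1003, 1008, 1012]:
    hashed_file[hash_address(key, ADDRESS_COUNT)].append(key)

address_of_1003 = hash_address(1003, ADDRESS_COUNT)
```

Keys 1001 and 1008 both leave remainder 0, so they land at the same address; every hashing scheme needs some rule for such collisions.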

(c) Hashed

Optimization Decisions

• Denormalization

• Partitioning

• Selection of File Organization

• Clustering Files

• Placement of Indexes

• Derived Columns

• Repeating Groups Across Columns

• Other

Selection of File Organization

• Sequential

• Indexed
  – Indexed-Sequential
  – Indexed (Non-Sequential)

• Hashed

Clustering Files

• In some relational DBMSs, related records from different tables can be stored together in the same disk area.

Rules for Using Indexes

1. Use on larger tables.

2. Index the primary key of each table.

3. Index search fields.

4. Index fields used in SQL ORDER BY and GROUP BY clauses.

5. Index a field when it has more than 100 distinct values, but not when it has fewer than 30.

Rules for Using Indexes

6. The DBMS may limit the number of indexes per table and the number of bytes per indexed field(s).

7. In many DBMSs, rows with a null value in the indexed field are not referenced from the index.

8. Use indexes heavily for non-volatile databases; limit the use of indexes for volatile databases.
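
Rule 3 (index search fields) can be tried out with Python's built-in sqlite3 module. The sketch below is specific to SQLite, and the table, the index name `idx_customer_city`, and the generated data are hypothetical. It builds an index on a search field with 200 distinct values and asks the optimizer how it would run a query on that field.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# The INTEGER PRIMARY KEY is already indexed by SQLite (rule 2).
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT, city TEXT)"
)
conn.executemany(
    "INSERT INTO customer (name, city) VALUES (?, ?)",
    [(f"Customer {i}", f"City {i % 200}") for i in range(1000)],
)

# Index a field that appears in WHERE clauses and has many distinct values.
conn.execute("CREATE INDEX idx_customer_city ON customer (city)")

# EXPLAIN QUERY PLAN reports the access path; the last column of each
# row is a human-readable description.
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT name FROM customer WHERE city = 'City 7'"
).fetchall()
plan_text = " ".join(row[-1] for row in plan)
uses_index = "idx_customer_city" in plan_text

matches = conn.execute(
    "SELECT COUNT(*) FROM customer WHERE city = 'City 7'"
).fetchone()[0]
```

Every index must also be maintained on each insert, update, and delete, which is the trade-off behind rule 8.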

Rules for Adding Derived Columns

• Use when aggregate values are regularly retrieved.

• Use when aggregate values are costly to calculate.

• Permit updating only of source data.

• Create triggers to cascade changes from source data.

• Do not put derived rows in same table.
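
The trigger rule above can be sketched with Python's built-in sqlite3 module. The tables, the derived column `order_total_cents`, and the trigger names are hypothetical; a production design would also need a trigger for updates to the source rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- order_total_cents is the derived column; order_line holds the source data.
    CREATE TABLE orders (
        order_id          INTEGER PRIMARY KEY,
        order_total_cents INTEGER NOT NULL DEFAULT 0
    );
    CREATE TABLE order_line (
        line_id      INTEGER PRIMARY KEY,
        order_id     INTEGER NOT NULL REFERENCES orders(order_id),
        amount_cents INTEGER NOT NULL
    );
    -- Cascade changes from the source data into the derived column.
    CREATE TRIGGER order_line_added AFTER INSERT ON order_line
    BEGIN
        UPDATE orders
           SET order_total_cents = order_total_cents + NEW.amount_cents
         WHERE order_id = NEW.order_id;
    END;
    CREATE TRIGGER order_line_removed AFTER DELETE ON order_line
    BEGIN
        UPDATE orders
           SET order_total_cents = order_total_cents - OLD.amount_cents
         WHERE order_id = OLD.order_id;
    END;
""")


def order_total(order_id):
    """Read the stored (derived) total without re-aggregating the lines."""
    return conn.execute(
        "SELECT order_total_cents FROM orders WHERE order_id = ?", (order_id,)
    ).fetchone()[0]


conn.execute("INSERT INTO orders (order_id) VALUES (1)")
conn.execute("INSERT INTO order_line (line_id, order_id, amount_cents) VALUES (1, 1, 2000)")
conn.execute("INSERT INTO order_line (line_id, order_id, amount_cents) VALUES (2, 1, 550)")
total_after_inserts = order_total(1)

conn.execute("DELETE FROM order_line WHERE line_id = 2")
total_after_delete = order_total(1)
```

Programs write only to `order_line`; the stored total is kept in step by the triggers, so reading it never requires summing the lines.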

Rules for Storing Repeating Groups Across Columns

• Consider storing repeating groups across columns rather than down rows when:
  – The repeating group has a fixed number of occurrences, each of which has a different meaning, or
  – The entire repeating group is normally accessed and updated as one unit.

Other Rules

• Consider contriving a shorter field or selecting another candidate key to substitute for a long, multi-field primary key (and all associated foreign keys).

• Consider replacing a foreign key with an alternate key when the alternate is the value you are looking for (to avoid a join).

RAID with four disks and striping
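
The figure itself is not reproduced in this transcript. As a rough sketch of striping across four disks, the code below assumes plain round-robin placement of one block per disk per stripe, with no parity or mirroring; the actual layout in the figure may differ.

```python
def locate_block(logical_block, disk_count=4):
    """Map a logical block number to (disk number, stripe number),
    assuming round-robin striping with one block per stripe unit."""
    if logical_block < 0 or disk_count < 1:
        raise ValueError("block number must be >= 0 and disk count >= 1")
    stripe, disk = divmod(logical_block, disk_count)
    return disk, stripe


# Show where the first eight logical blocks would be placed.
layout = {disk: [] for disk in range(4)}
for block in range(8):
    disk, _stripe = locate_block(block)
    layout[disk].append(block)
```

Consecutive blocks land on different disks, so a request for blocks 0 through 3 can be served by all four disks working in parallel.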

Database Architectures

• Fig. 7-6.

• Hierarchical

• Network

• Relational

• Object-oriented

• Multidimensional

Query Optimizer Factors

• Type of Query
  – Highly selective.
  – All or most of the records of a file.

• Unique fields

• Size of files

• Indexes

• Join Method
  – Nested-Loop
  – Merge-Scan (Both files must be ordered or indexed on the join columns.)
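
The two join methods named above can be compared with a small sketch. Records are represented as (join key, value) tuples, which is an assumption made for illustration, and the customer and order data are hypothetical. The merge-scan version relies on both inputs already being sorted on the join key, matching the condition stated in the slide.

```python
def nested_loop_join(outer, inner):
    """Compare every outer record with every inner record."""
    return [
        (outer_key, outer_value, inner_value)
        for outer_key, outer_value in outer
        for inner_key, inner_value in inner
        if outer_key == inner_key
    ]


def merge_scan_join(outer, inner):
    """Walk both inputs once; both must be sorted on the join key."""
    result = []
    i = j = 0
    while i < len(outer) and j < len(inner):
        if outer[i][0] < inner[j][0]:
            i += 1
        elif outer[i][0] > inner[j][0]:
            j += 1
        else:
            key = outer[i][0]
            group_start = j
            # Pair each outer record in this key group with each inner one.
            while i < len(outer) and outer[i][0] == key:
                j = group_start
                while j < len(inner) and inner[j][0] == key:
                    result.append((key, outer[i][1], inner[j][1]))
                    j += 1
                i += 1
    return result


# Hypothetical inputs, both ordered on the join key (customer number).
customers = [(1, "Ada"), (2, "Grace"), (3, "Edsger")]
orders = [(1, "Desk"), (1, "Lamp"), (3, "Chair")]

nested_result = nested_loop_join(customers, orders)
merge_result = merge_scan_join(customers, orders)
```

Both functions return the same rows here. The nested loop makes len(outer) × len(inner) comparisons, while the merge scan reads each input roughly once, which is one reason an optimizer weighs file sizes and existing orderings when choosing between them.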