1 systems analysis & design is 431: lecture 9 dn58412/is431/is431_sp15.htm csun information...

45
1 Systems Analysis & Design IS 431: Lecture 9 http://www.csun.edu/~dn58412/IS431/ IS431_SP15.htm CSUN Information Systems Database Design

Upload: cameron-bridges

Post on 26-Dec-2015

218 views

Category:

Documents


1 download

TRANSCRIPT

1

Systems Analysis & Design

IS 431: Lecture 9

http://www.csun.edu/~dn58412/IS431/IS431_SP15.htm

CSUN Information Systems

Database Design

IS 431 : Lecture 9 2

Database Design in SDLC

IS 431 : Lecture 9 3

Database Design

Elements of a DatabaseEvolution of Database SystemsDBMS ArchitectureRelational Data Model

and Relational Database Systems

IS 431 : Lecture 9 4

A field is the physical implementation of a data attribute. They are the smallest unit of meaningful data.

A key (identifier) is a field having unique values to identify one and only one record in a file.

A secondary key is an alternate identifier for a record.

A descriptive field is any other (non-key) field that stores business data.

Data Fields

IS 431 : Lecture 9 5

Records

A record is a collection of fields arranged in a predefined format.

– Fixed-length record structures

– Variable-length record structures

IS 431 : Lecture 9 6

Files

A file is the set of all occurrences of a given record structure.

– Types Master files Transaction files Document files Archival files Table lookup files Audit files

– File organization: index, sequential …

– File access: direct, sequential …

IS 431 : Lecture 9 7

Conventional Files vs. the Database

File – a collection of similar records.– Files are unrelated to each other except in the code of an

application program.

– Data storage is built around the applications that use the files.

Database – a collection of interrelated files– Records in one file (or table) are physically related to

records in another file (or table).

– Applications are built around the integrated database

IS 431 : Lecture 9 8

Files vs. Database

IS 431 : Lecture 9 9

Pros and Cons of Conventional Files

Pros: – Easy to design because of their single-application

focus

– Excellent performance due to optimized organization for a single application

Cons – Harder to adapt to sharing across applications

– Harder to adapt to new requirements

– Need to duplicate attributes in several files.

IS 431 : Lecture 9 10

Pros and Cons of Databases

Pros:– Data independence from applications increases adaptability

and flexibility– Superior scalability– Ability to share data across applications– Less, and/or controlled redundancy (total non-redundancy is

not achievable)Cons:

– More complex than file technology– Somewhat slower performance– Investment in DBMS and database experts– Need to adhere to design principles to realize benefits– Increased vulnerability due to consolidating data in a

centralized database

IS 431 : Lecture 9 11

Logical Structure of a Database

Inventory Sales Customer

CashReceipt

INVENTORY RECORD Item_No INT(5), non-null, index=itemx Description CHAR(20) Cost CURRENCY (6,2)

EXTERNAL LEVEL

CONCEPTUAL LEVEL

INTERNAL LEVEL

VIEW A VIEW B VIEW C

IS 431 : Lecture 9 12

Evolution of Database Systems

File Management (Flat File) SystemsHierarchical DatabasesNetwork DatabasesRelational DatabasesObject-Oriented DatabasesData Warehouse

IS 431 : Lecture 9 13

File Management Systems

EMPLOYEE UPDATE PROGRAM

FD EMPLOYEE

MASTER FILE

EMPLOYEE REPORT PROGRAM

FD

CHECK-WRITING PROGRAM FD

FD

TIMECARDFILE

IS 431 : Lecture 9 14

Hierarchical Databases

Car

Engine Body Chassis

LeftDoor

RightDoor

Hood Roof

Handle Window Lock

IS 431 : Lecture 9 15

Network Databases

Acme Mfg.

First Corp.

Size 4Widget

4DBolt

#11231 #11232 #11233 #11234 #11235

CUSTOMERS PRODUCTS

ORDERS

IS 431 : Lecture 9 16

Relational Databases

CUSTOMERS PRODUCTS

ORDERS

CUST ID

CUST ID

PRODUCT ID

PRODUCT ID

ORDER #

1 1

QUANTITY

M

M

IS 431 : Lecture 9 17

Object-Oriented Databases

CUSTOMERS

Add Customer

Drop Customer

Change Customer

PRODUCTSCUST ID

CUST NAME

ADDRESS

PRODUCT ID

PRICE

QTY-ON HAND

New Product

Buy Product

Sell Product

ORDERS

ORDER #

CUST ID

PRODUCT ID

QUANTITY

Take Order

Drop OrderMethod

Data

1 1

**

IS 431 : Lecture 9 18

Data Warehouse

IS 431 : Lecture 9 19

Difficulties of Non-Relational Tables

Update Anomaly: not changing all occurrence of a data item (in many places)

Insert Anomaly: add an invalid (null record) to the database

Delete Anomaly: not remove all info (in many places) about a deleted record

IS 431 : Lecture 9 20

Relational Data Model

Table (Relation)– Row: specific occurrence of an entity

– Column: characteristic of an entity, must be single-valued and same data type for all occurrences

Primary Key: unique identifier of an occurrence of an entity (entity integrity)

Foreign Key: to link tables together, must be null or corresponding to value of a primary key in another table (referential integrity)

IS 431 : Lecture 9 21

Requirements of Relational Data Model

Primary Key must be uniqueForeign Key must either be null or corresponding to

existing value of a primary key of another table (to link with this table)

Column describes a characteristic of an object identified by primary key

Column is single-value ( ex: not “$100, $250” )Values in rows of a specific column must be of the same

data type Column order or row order are unimportant

IS 431 : Lecture 9 22

Database Integrity

Entity integrity: error exists when identifiers (primary keys) are not unique to identify specific instances of the entity (2 customers have a same identification)

Referential integrity: error exists when a foreign key value in one table has no matching primary key value in the related table (Sell to non-existing customers)

Domain integrity:error exists when field value is outside the range (a pencil costs $100,000 !!!)

IS 431 : Lecture 9 23

Data Architecture

A business’s data architecture defines how that business will develop and use files and databases to store all of the organization’s data; the file and database technology to be used; and the administrative structure set up to manage the data resource.

Data is stored in some combination of:– Conventional files

– Operational databases (also called transactional databases)

– Data warehouses

– Personal databases

– Work group databases

IS 431 : Lecture 9 24

A legacyfile-based

informationsystem

(builtin-house)

File

File InformationSystem

(builtin-house)

InformationSystem

(builtin-house)

OperationalDatabase

File

File

Information System

(built in-house)

A legacyfile-based

informationsystem

(purchased)

File

File

File

InformationSystem

(purchased)

DataWarehouse

End-UserTools

End-UserApplications

Users andProgrammers

Users andProgrammers

Users andProgrammers

Users andProgrammers

Users

End-UserWork Group

Work-GroupDatabase

PersonalDB

OperationalDatabase

A Modern Data Architecture

IS 431 : Lecture 9 25

Database Architecture

Database architecture refers to the database technology including the database engine, database utilities, CASE tools, and database development tools.

A database management system (DBMS) is specialized software that is used to create, access, control, and manage the database. The core of the DBMS is a database engine.

– A data definition language (DDL) is that part of the engine used to physically define tables, fields, and structural relationships: (CREATE TABLE; ALTER TABLE …)

– A data manipulation language (DML) is that part of the engine used to create, read, update, and delete records in the database, and navigate between different files (tables) in the database: (INSERT; SELECT… FROM …WHERE…)

IS 431 : Lecture 9 26

Typical DBMS Architecture

IS 431 : Lecture 9 27

Goals of Database Design

A database should provide for efficient storage, update, and retrieval of data.

A database should be reliable—the stored data should have high integrity and promote user trust in that data.

A database should be adaptable and scalable to new and unforeseen requirements and applications.

IS 431 : Lecture 9 28

Normalization Revisited

First normal form (1NF) :– No repeating group of a same attribute– If not: create a new entity for this group.

Second normal form (2NF)– Attributes depend on the (composite) key, not part of it.– If not: create a new entity for these partial depended

attributesThird normal form (3NF)

– Attributes depend on the (primary) key only, not on each other

– If not: create new entity for these partial depended attributes

IS 431 : Lecture 9 29

Database Models

An Entity Relationship Diagram (ERD) is the logical model of the data requirements (data to be stored in the system and their relationships.

A Database Schema (physical data model) is the blueprint of the planned implementation of the logical model.

One can also use Relational Data Model as a blueprint of relational database design.

IS 431 : Lecture 9 30

Physical Data TypesLogical Data Typeto be stored in field)

Physical Data Type MS Access

Physical Data TypeMicrosoft SQL Server

Physical Data TypeOracle

Fixed length character data (use for fields with relatively fixed length character data)

TEXT CHAR (size) orcharacter (size)

CHAR (size)

Variable length character data (use for fields that require character data but for which size varies greatly--such as ADDRESS)

TEXT VARCHAR (max size) orcharacter varying (max size)

VARCHAR (max size)

Very long character data (use for long descriptions and notes--usually no more than one such field per record)

MEMO TEXT LONG VARCHAR orLONG VARCHAR2

Integer number NUMBER INT (size) orinteger orsmallinteger ortinuinteger

INTEGER (size) orNUMBER (size)

Decimal number NUMBER DECIMAL (size, decimal places) orNUMERIC (size, decimal places)

DECIMAL (size, decimal places) orNUMERIC (size, decimal places) orNUMBER

IS 431 : Lecture 9 31

Physical Data Types …

Logical Data Typeto be stored in field)

Physical Data Type MS Access

Physical Data TypeMicrosoft SQL Server

Physical Data TypeOracle

Financial Number CURRENCY MONEY see decimal number

Date (with time) DATE/TIME DATETIME orSMALLDATETIMEDepending on precision needed

DATE

Current time (use to store the data and time from the computer’s system clock)

not supported TIMESTAMP not supported

Yes or No; or True or False YES/NO BIT use CHAR(1) and set a yes or no domain

Image OLE OBJECT IMAGE LONGRAW

Hyperlink HYPERLINK VARBINARY RAW

Can designer define new data types?

NO YES YES

IS 431 : Lecture 9 32

Data Distribution

Data distribution analysis establishes which business locations need access to which logical data entities and attributes.

Centralization Horizontal distribution (Data partitioning: Rows or

Records) Vertical distribution (Data partitioning: Columns or

Attributes) Replication

IS 431 : Lecture 9 33

Data Distribution Options

Store all data on a single server (Centralization)Store specific tables on different servers.Store subsets of a specific table on different servers

– Subsets of columns (Vertical Partitioning)– Subsets of rows (Horizontal Partitioning)

Replicate (duplicate) specific tables or subsets on different servers for back up (Replication)

IS 431 : Lecture 9 34

Database Distribution Vertical partitioning

IS 431 : Lecture 9 35

Database Distribution Horizontal partitioning

IS 431 : Lecture 9 36

Database Distribution

IS 431 : Lecture 8 37

Database Distribution

IS 431 : Lecture 9 38

Relational Databases

Relational databases implement stored data in a series of two-dimensional tables that are “related” to one another via foreign keys.– The physical data model is called a schema.

– The DDL and DML for a relational database is called SQL (Structured Query Language).

– Triggers are programs embedded within a table that are automatically invoked by updates to another table.

– Stored procedures are programs embedded within a table that can be called from an application program.

IS 431 : Lecture 9 39

A Method for Database Design

1. Review the logical data model.

2. Create a table for each entity.

3. Create fields for each attribute.

4. Create an index for each primary and secondary key.

5. Create an index for each subsetting criterion.

6. Designate foreign keys for relationships.

7. Define data types, sizes, null settings, domains, and defaults for each attribute.

8. Create or combine tables to implement supertype/ subtype structures.

9. Evaluate and specify referential integrity constraints.

IS 431 : Lecture 9 40

From Logical Data Model

CustomerCust No

OrderOrder No

ProductProduct No

CUSTOMER (Cust No, ….)

ORDER (Order No, Cust No, ….)

PRODUCT (Product No,…)

ORDER-PRODUCT (OrderNo, ProductNo, …)

place contain(1,1) (1,M) (1,M) (1,N)

IS 431 : Lecture 9 41

to Physical Data Model (Relational Schema) …

IS 431 : Lecture 9 42

…to “MS Access” Implementation.

IS 431 : Lecture 9 43

Database Components

Database Engine

Database

Form Builder

ReportWriter

Interactive Query Tool

ApplicationProgram

DatabaseFront-end

DatabaseGatewayTo other

computer systems To other DBMS brands

IS 431 : Lecture 9 44

The Role of SQL

Database Engine

Database

Form Builder

ReportWriter

Interactive Query Tool

ApplicationProgram

DatabaseFront-end

DatabaseGatewayTo other

computer systems To other DBMS brands

SQL SQL SQL SQL SQL

SQL SQL

IS 431 : Lecture 9 45

Administrators

Data administrator is responsible for the data planning, definition, architecture, and management in the organization. (CIO)

Database administrators are responsible for the database technology, database design and construction, security, backup and recovery, and performance tuning. (Supervisors)