SIMS 202: Information Organization and Retrieval
Lecture 13: Intro to Database Design
Prof. Ray Larson & Prof. Marc Davis
UC Berkeley SIMS
IS 202, Fall 2002 (2002.10.14)
Tuesday and Thursday 10:30 am - 12:00 pm
http://www.sims.berkeley.edu/academics/courses/is202/f02/

Post on 19-Dec-2015




Lecture Overview

• Photo Project Feedback and Assignment 6 Discussion
• Review
  – Metadata And Markup
  – XML DTD Construction
  – XML For Protocols And Metadata Languages
• Databases and Database Design
• Database Life Cycle
• ER Diagrams
• Database Design


Photo Metadata Matters

"Unlike people's recollections, photographs don't change. They don't lie." — Bill Simon


Photo Project Feedback

IS202 PHOTO PROJECT STUDENT FEEDBACK
[Bar chart; horizontal axis from 0 to 30; responses grouped under Groups, Assignments, and Learned]
• Need shared folders for facet groups
• Frustrated with consolidation process
• Need smaller facet groups
• Liked reorg into facet groups
• Offer help about group process
• Show examples of other classification schemes
• Need more discussion of assignments in class
• Need more clarity about classification practices and principles
• Assignments were too time-pressed
• Assignments need to be clearer
• Need better overview of whole project at the beginning
• Learned a lot
• Group experience was useful
• Came to understand difficulty of classifying


Where We Are Headed

• 450+ photos annotated in our consolidated metadata classification

• Searchable from SIMS web site in the Flamenco Browser

• Hopefully, some project teams will implement their applications as well
  – If not in 202, then in future SIMS projects


Consolidated Photo Browser

http://fusion.sims.berkeley.edu/photo_project/photodatabase.cfm


Flamenco Image Search


Assignment 6 Discussion

• Procedure for requesting additions to the consolidated classification (Monday through Wednesday only)

• Procedure for facet groups to recommend additions to the consolidated classification (Thursday through Friday)

• Procedure for the facet oversight group to decide additions to the consolidated classification (Friday through Monday)


Photo Project Name Choices

• SIMS Snapshot
• Digital Shoebox
• Photo Pigeonhole
• Pigeonhole
• ImageKey
• Picture Yourself
• Pictures on the Wall
• Distant Camera
• Memory to Spare
• Memories to Spare


SGML/XML Structure

• An SGML document consists of three parts:
  – The SGML Declaration
  – The Document Type Definition (DTD)
  – The Document Instance

• An XML document REQUIRES only the document instance, but for effective processing a DTD is very important

• XML Schema provides an alternative to DTDs for XML applications


DTD Components

• The major components of a DTD are:
  – Entity Declarations
  – Element Declarations
  – Attribute Declarations
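
As a rough illustration of the three kinds of declaration, the sketch below embeds a small DTD in a document and parses it with Python's standard library. The element names, the `publisher` entity, and the `id` attribute are invented for this example. Note that `xml.etree.ElementTree` expands internal entities but does not validate against the DTD.

```python
import xml.etree.ElementTree as ET

# Hypothetical document: names and values are invented for illustration.
DOCUMENT = """<?xml version="1.0"?>
<!DOCTYPE book [
  <!-- Entity declaration: a named shorthand for replacement text -->
  <!ENTITY publisher "Addison">
  <!-- Element declarations: what each element may contain -->
  <!ELEMENT book (title, pub)>
  <!ELEMENT title (#PCDATA)>
  <!ELEMENT pub (#PCDATA)>
  <!-- Attribute declaration: which attributes an element carries -->
  <!ATTLIST book id CDATA #REQUIRED>
]>
<book id="2">
  <title>The history</title>
  <pub>&publisher;</pub>
</book>
"""

# The parser reads the DTD and expands &publisher; in the document instance.
root = ET.fromstring(DOCUMENT)
book_id = root.get("id")
publisher_text = root.find("pub").text
```

Here `publisher_text` holds the entity's replacement text rather than the `&publisher;` reference.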


What is a Database?


Files and Databases

• File: A collection of records or documents dealing with one organization, person, area or subject (Rowley)
  – Manual (paper) files
  – Computer files
• Database: A collection of similar records with relationships between the records (Rowley)
  – Bibliographic, statistical, business data, images, etc.


Database

• A Database is a collection of stored operational data used by the application systems of some particular enterprise (C.J. Date)
  – Paper “Databases”
    • Still contain a large portion of the world’s knowledge
  – File-Based Data Processing Systems
    • Early batch processing of (primarily) business data
  – Database Management Systems (DBMS)


Why DBMS?

• History
  – 50’s and 60’s: all applications were custom built for particular needs
  – File based
  – Many similar/duplicative applications dealing with collections of business data
  – Early DBMS were extensions of programming languages
  – 1970 - E.F. Codd and the Relational Model
  – 1979 - Ashton-Tate and first Microcomputer DBMS


File Based Systems

[Diagram with columns "Application" and "File". Labels: Naughty, Nice, Just what asked for, Coal, Estimation, Delivery List, Toys, Addresses, Toys]


From File Systems to DBMS

• Problems with file processing systems
  – Inconsistent data
  – Inflexibility
  – Limited data sharing
  – Poor enforcement of standards
  – Excessive program maintenance


DBMS Benefits

• Minimal data redundancy
• Consistency of data
• Integration of data
• Sharing of data
• Ease of application development
• Uniform security, privacy, and integrity controls
• Data accessibility and responsiveness
• Data independence
• Reduced program maintenance


Terms and Concepts

• Data independence
  – Physical representation and location of data and the use of that data are separated
    • The application doesn’t need to know how or where the database has stored the data, but just how to ask for it
    • Moving a database from one DBMS to another should not have a material effect on application programs
    • Recoding, adding fields, etc. in the database should not affect applications
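
A minimal sketch of data independence, assuming Python's built-in `sqlite3` module as a stand-in DBMS. The `employee` table and its columns are hypothetical. The application function asks for data by name only, so adding a field or an index behind it leaves its result unchanged.

```python
import sqlite3

def employee_names(conn):
    """Application code: asks for data by name, not by storage location."""
    rows = conn.execute("SELECT name FROM employee ORDER BY name")
    return [name for (name,) in rows]

# Hypothetical table for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (ssn TEXT PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO employee VALUES (?, ?)",
                 [("111", "Smith"), ("222", "Jones")])
before = employee_names(conn)

# Change the database: add a field and an index.
conn.execute("ALTER TABLE employee ADD COLUMN birthdate TEXT")
conn.execute("CREATE INDEX idx_employee_name ON employee (name)")

# The unchanged application function still returns the same answer.
after = employee_names(conn)
```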


Database Environment

[Diagram of the database environment. Components: CASE Tools, User Interface, Application Programs, DBMS, Repository, Database]


Database Components

[Diagram of database components]
• DBMS
  – Design tools
    • Table Creation
    • Form Creation
    • Query Creation
    • Report Creation
    • Procedural language compiler (4GL)
  – Run time
    • Form processor
    • Query processor
    • Report Writer
    • Language Run time
• User Interface
• Applications
• Application Programs
• Database contains:
  – User’s Data
  – Metadata
  – Indexes
  – Application Metadata


Types of Database Systems

• PC databases

• Centralized database

• Client/server databases

• Distributed databases

• Database models


PC Databases

E.g.: Access, FoxPro, dBase, etc.


Centralized Databases

Central Computer


Client Server Databases

[Diagram. Labels: Network, Client, Client, Client, Database Server]


Distributed Databases

[Diagram. Labels: three computers, Location A, Location B, Location C]

Homogeneous Databases


Distributed Databases

[Diagram. Labels: Local Network, Database Server, Client, Client, Comm Server, Remote Comp., Remote Comp.]

Heterogeneous or Federated Databases


Terms and Concepts

• Database application
  – An application program (or set of related programs) that is used to perform a series of database activities on behalf of database users:
    • Create
    • Read
    • Update
    • Delete


Terms and Concepts

• Database activities:
  – Create
    • Add new data to the database
  – Read
    • Read current data from the database
  – Update
    • Update or modify current database data
  – Delete
    • Remove current data from the database
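
The four activities map onto SQL statements. A minimal sketch, assuming Python's built-in `sqlite3` module and a hypothetical `photo` table:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table for illustration.
conn.execute("CREATE TABLE photo (id INTEGER PRIMARY KEY, caption TEXT)")

# Create: add new data to the database.
conn.execute("INSERT INTO photo (id, caption) VALUES (1, 'South Hall')")

# Read: read current data from the database.
caption_after_create = conn.execute(
    "SELECT caption FROM photo WHERE id = 1").fetchone()[0]

# Update: modify current database data.
conn.execute("UPDATE photo SET caption = 'South Hall at dusk' WHERE id = 1")
caption_after_update = conn.execute(
    "SELECT caption FROM photo WHERE id = 1").fetchone()[0]

# Delete: remove current data from the database.
conn.execute("DELETE FROM photo WHERE id = 1")
remaining = conn.execute("SELECT COUNT(*) FROM photo").fetchone()[0]
```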


Terms and Concepts

• Enterprise
  – Organization
• Entity
  – Person, Place, Thing, Event, Concept...
• Attributes
  – Data elements (facts) about some entity
  – Also sometimes called fields or items or domains
• Data values
  – Instances of a particular attribute for a particular entity


Terms and Concepts

• Records
  – The set of values for all attributes of a particular entity
  – AKA “tuples” or “rows” in relational DBMS
• File
  – Collection of records
  – AKA “Relation” or “Table” in relational DBMS


Terms and Concepts

• Key
  – An attribute or set of attributes used to identify or locate records in a file
• Primary Key
  – An attribute or set of attributes that uniquely identifies each record in a file
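
The difference between a key and a primary key can be sketched as follows, assuming `sqlite3` and a hypothetical `employee` table. `last_name` can locate records but may repeat, while the primary key `ssn` must be unique, so the DBMS rejects a second record with the same value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: ssn is the primary key, last_name is a non-unique key.
conn.execute("CREATE TABLE employee (ssn TEXT PRIMARY KEY, last_name TEXT)")
conn.execute("INSERT INTO employee VALUES ('111', 'Smith')")
conn.execute("INSERT INTO employee VALUES ('222', 'Smith')")

# A key used to locate records may match more than one record.
same_last_name = conn.execute(
    "SELECT COUNT(*) FROM employee WHERE last_name = 'Smith'").fetchone()[0]

# A primary key identifies exactly one record, so duplicates are refused.
try:
    conn.execute("INSERT INTO employee VALUES ('111', 'Jones')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```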


Terms and Concepts

• Models
  – (1) Levels or views of the Database
    • Conceptual, logical, physical
  – (2) DBMS types
    • Relational, Hierarchic, Network, Object-Oriented, Object-Relational


Models (1)

[Diagram relating Application 1 through Application 4, their Conceptual requirements, and their External Models to the Conceptual Model, Logical Model, and Internal Model]


Data Models(2): History

• Hierarchical Model (1960’s and 1970’s)
  – Similar to data structures in programming languages

[Diagram. Labels: Books (id, title), Publisher, Subjects, Authors (first, last)]


Data Models(2): History

• Network Model (1970’s)
  – Provides for single entries of data and navigational “links” through chains of data.

[Diagram. Labels: Subjects, Books, Authors, Publishers]


Data Models(2): History

• Relational Model (1980’s)
  – Provides a conceptually simple model for data as relations (typically considered “tables”) with all data visible

Book ID  Title          pubid  Author id
1        Introductio    2      1
2        The history    4      2
3        New stuff ab   3      3
4        Another title  2      4
5        And yet more   1      5

pubid  pubname
1      Harper
2      Addison
3      Oxford
4      Que

Authorid  Author name
1         Smith
2         Wynar
3         Jones
4         Duncan
5         Applegate

Subid  Subject
1      cataloging
2      history
3      stuff

Book ID  Subid
1        2
2        1
3        3
4        2
4        3
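
As a sketch of how these relations work together, the tables above can be loaded into SQLite and combined with joins. The SQL table and column names and the helper functions are assumptions made for this example. The titles are copied as they appear on the slide, including the truncated ones.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE publisher (pubid INTEGER PRIMARY KEY, pubname TEXT);
CREATE TABLE author (authorid INTEGER PRIMARY KEY, author_name TEXT);
CREATE TABLE subject (subid INTEGER PRIMARY KEY, subject TEXT);
CREATE TABLE book (book_id INTEGER PRIMARY KEY, title TEXT,
                   pubid INTEGER, authorid INTEGER);
CREATE TABLE book_subject (book_id INTEGER, subid INTEGER);
""")
conn.executemany("INSERT INTO publisher VALUES (?, ?)",
                 [(1, "Harper"), (2, "Addison"), (3, "Oxford"), (4, "Que")])
conn.executemany("INSERT INTO author VALUES (?, ?)",
                 [(1, "Smith"), (2, "Wynar"), (3, "Jones"),
                  (4, "Duncan"), (5, "Applegate")])
conn.executemany("INSERT INTO subject VALUES (?, ?)",
                 [(1, "cataloging"), (2, "history"), (3, "stuff")])
conn.executemany("INSERT INTO book VALUES (?, ?, ?, ?)",
                 [(1, "Introductio", 2, 1), (2, "The history", 4, 2),
                  (3, "New stuff ab", 3, 3), (4, "Another title", 2, 4),
                  (5, "And yet more", 1, 5)])
conn.executemany("INSERT INTO book_subject VALUES (?, ?)",
                 [(1, 2), (2, 1), (3, 3), (4, 2), (4, 3)])

def subjects_of(book_id):
    """Follow book_subject to subject to list a book's subjects."""
    rows = conn.execute(
        "SELECT s.subject FROM book_subject bs "
        "JOIN subject s ON s.subid = bs.subid "
        "WHERE bs.book_id = ? ORDER BY s.subject", (book_id,))
    return [subject for (subject,) in rows]

def publisher_of(title):
    """Join book to publisher on the shared pubid column."""
    row = conn.execute(
        "SELECT p.pubname FROM book b "
        "JOIN publisher p ON p.pubid = b.pubid "
        "WHERE b.title = ?", (title,)).fetchone()
    return row[0] if row else None
```

Book 4 appears twice in the last table, so `subjects_of(4)` returns two subjects. No pointers or links are stored. The relationships exist only as matching values in shared columns.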


Data Models(2): History

• Object Oriented Data Model (1990’s)
  – Encapsulates data and operations as “Objects”

[Diagram. Labels: Books (id, title), Publisher, Subjects, Authors (first, last)]


Data Models(2): History

• Object-Relational Model (1990’s)
  – Combines the well-known properties of the Relational Model with such OO features as:
    • User-defined datatypes
    • User-defined functions
    • Inheritance and sub-classing
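
To give a feel for one of these features, the sketch below registers a user-defined function and calls it from SQL. SQLite is not an object-relational DBMS. It is used here only because Python's `sqlite3` module lets an application add its own functions, and the `initials` function is invented for the example.

```python
import sqlite3

def initials(first, last):
    """Hypothetical user-defined function, callable from SQL once registered."""
    return f"{first[0]}.{last[0]}."

conn = sqlite3.connect(":memory:")
# Register the Python function under the SQL name "initials" with 2 arguments.
conn.create_function("initials", 2, initials)

result = conn.execute("SELECT initials('Ray', 'Larson')").fetchone()[0]
```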


Database System Life Cycle

[Diagram of the life cycle phases]
1. Design
2. Physical Creation
3. Conversion
4. Integration
5. Operations
6. Growth, Change, & Maintenance


Design

• Determination of the needs of the organization

• Development of the Conceptual Model of the database
  – Typically using Entity-Relationship diagramming techniques

• Construction of a Data Dictionary

• Development of the Logical Model


Physical Creation

• Development of the Physical Model of the Database
  – Data formats and types
  – Determination of indexes, etc.

• Load a prototype database and test

• Determine and implement security, privacy and access controls

• Determine and implement integrity constraints
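
A small sketch of what physical creation can look like, assuming SQLite and hypothetical `publisher` and `book` tables. It chooses data types, adds an index, declares integrity constraints, then loads a few prototype rows to test that the constraints hold. Security and access controls are not shown.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.executescript("""
CREATE TABLE publisher (
    pubid   INTEGER PRIMARY KEY,
    pubname TEXT NOT NULL
);
CREATE TABLE book (
    book_id INTEGER PRIMARY KEY,
    title   TEXT NOT NULL,
    pubid   INTEGER NOT NULL REFERENCES publisher (pubid)
);
CREATE INDEX idx_book_title ON book (title);
""")

def violates(sql, params):
    """Return True if the statement is rejected by an integrity constraint."""
    try:
        conn.execute(sql, params)
        return False
    except sqlite3.IntegrityError:
        return True

# Load prototype data and test the constraints.
conn.execute("INSERT INTO publisher VALUES (1, 'Harper')")
good_row = violates("INSERT INTO book VALUES (?, ?, ?)", (1, "And yet more", 1))
unknown_publisher = violates("INSERT INTO book VALUES (?, ?, ?)", (2, "Orphan", 99))
missing_title = violates("INSERT INTO book VALUES (?, ?, ?)", (3, None, 1))
```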


Conversion

• Convert existing data sets and applications to use the new database
  – May need programs, conversion utilities to convert old data to new formats
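
A conversion utility might look roughly like the sketch below. The old comma-separated file layout, its MM/DD/YYYY dates, and the target `employee` table are all assumptions invented for the example.

```python
import csv
import io
import sqlite3
from datetime import datetime

# Hypothetical legacy file contents with dates in MM/DD/YYYY form.
OLD_FILE = "ssn,name,hired\n111,Smith,10/14/2002\n222,Jones,01/05/1999\n"

def convert(old_text, conn):
    """Load old flat-file records into the new table, reformatting dates."""
    conn.execute(
        "CREATE TABLE employee (ssn TEXT PRIMARY KEY, name TEXT, hired TEXT)")
    for record in csv.DictReader(io.StringIO(old_text)):
        # Old format MM/DD/YYYY becomes ISO format YYYY-MM-DD.
        hired = datetime.strptime(record["hired"], "%m/%d/%Y").date().isoformat()
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)",
                     (record["ssn"], record["name"], hired))
    conn.commit()

conn = sqlite3.connect(":memory:")
convert(OLD_FILE, conn)
converted = conn.execute(
    "SELECT ssn, name, hired FROM employee ORDER BY ssn").fetchall()
```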


Integration

• Overlaps with Phase 3

• Integration of converted applications and new applications into the new database


Operations

• All applications run full-scale

• Privacy, security, access control must be in place

• Recovery and Backup procedures must be established and used


Growth, Change, and Maintenance

• Change is a way of life
  – Applications, data requirements, reports, etc. will all change as new needs and requirements are found
  – The Database and applications will need to be modified to meet the needs of changes


Another View of the Life Cycle

[Diagram: the same phases in a different arrangement. Labels: 1 Design, 2 Physical Creation, 3 Conversion, 4 Integration, 5 Operations, 6 Growth, Change]


Database Design Process

[Diagram relating Application 1 through Application 4, their Conceptual requirements, and their External Models to the Conceptual Model, Logical Model, and Internal Model]


Entity

• An Entity is an object in the real world (or even imaginary worlds) about which we want or need to maintain information
  – Persons (e.g.: customers in a business, employees, authors)
  – Things (e.g.: purchase orders, meetings, parts, companies)

[Diagram: the entity Employee]


Attributes

• Attributes are the significant properties or characteristics of an entity that help identify it and provide the information needed to interact with it or use it (this is the Metadata for the entities)

[Diagram: the entity Employee with attributes Name (First, Middle, Last), SSN, Birthdate, Age, and Projects]
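
One way to picture the Employee entity and its attributes is as a record type. The sketch below is an interpretation, not the slide's notation: it assumes Name is a composite of First, Middle, and Last, that Projects can hold several values, and that Age is derived from Birthdate rather than stored.

```python
from dataclasses import dataclass, field
from datetime import date

@dataclass
class Name:
    # Composite attribute: one attribute made of several parts.
    first: str
    middle: str
    last: str

@dataclass
class Employee:
    ssn: str                 # identifying attribute
    name: Name
    birthdate: date
    projects: list = field(default_factory=list)  # assumed multivalued

    def age_on(self, today: date) -> int:
        """Age treated as derived from birthdate (an assumption)."""
        had_birthday = (today.month, today.day) >= (
            self.birthdate.month, self.birthdate.day)
        return today.year - self.birthdate.year - (0 if had_birthday else 1)

# Hypothetical data values for one entity instance.
employee = Employee("111", Name("Pat", "Q", "Smith"), date(1970, 10, 15),
                    ["Photo Project"])
```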


Relationships

• Relationships are the associations between entities

• They can involve one or more entities and belong to particular relationship types


Relationships

[Diagram: relationship "Attends" between Student and Class]

[Diagram: relationship "Supplies project parts" involving Supplier, Part, and Project]


Types of Relationships

• Concerned only with cardinality of relationship

[Three diagrams in Chen ER notation, each an "Assigned" relationship: Employee and Truck, Employee and Project, Employee and Project. Cardinality labels: 1, 1, n, n, 1, m]
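
Cardinality shows up in a relational schema as constraints. The sketch below is one possible reading, assuming SQLite and hypothetical column names: a UNIQUE foreign key gives at most one truck per employee (one-to-one), and a plain foreign key lets many employees share one project (one-to-many). The direction of the one-to-many relationship is an assumption.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE project (proj_no INTEGER PRIMARY KEY);
CREATE TABLE employee (
    ssn     TEXT PRIMARY KEY,
    proj_no INTEGER REFERENCES project (proj_no)  -- many employees, one project
);
CREATE TABLE truck (
    truck_no INTEGER PRIMARY KEY,
    ssn      TEXT UNIQUE REFERENCES employee (ssn)  -- one truck per employee
);
""")
conn.execute("INSERT INTO project VALUES (1)")
conn.executemany("INSERT INTO employee VALUES (?, ?)",
                 [("111", 1), ("222", 1)])
conn.execute("INSERT INTO truck VALUES (10, '111')")

# One-to-one: a second truck for the same employee is refused.
try:
    conn.execute("INSERT INTO truck VALUES (11, '111')")
    second_truck_allowed = True
except sqlite3.IntegrityError:
    second_truck_allowed = False

# One-to-many: several employees may point at the same project.
employees_on_project_1 = conn.execute(
    "SELECT COUNT(*) FROM employee WHERE proj_no = 1").fetchone()[0]
```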


Other Notations

[The same three "Assigned" relationships (Employee and Truck, Employee and Project, Employee and Project) drawn in “Crow’s Foot” notation]


Other Notations

[The same three "Assigned" relationships (Employee and Truck, Employee and Project, Employee and Project) drawn in IDEF1X notation]


More Complex Relationships

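[Diagram: relationship "Evaluation" involving Employee, Project, and Manager, with cardinality labels 1/n/n, 1/1/1, n/n/1]

[Diagram: relationship "Assigned" between Employee and Project, with labels 4(2-10) and 1, and attribute labels SSN, Project, Date]

[Diagram: relationship "Manages" on Employee, with roles "Manages" and "Is Managed By" and cardinality labels 1 and n]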


Weak Entities

• Owe existence entirely to another entity

[Diagram: relationship "Contains" between Order and Order-line. Attribute labels: Invoice #, Part#, Rep#, Quantity, Invoice#]
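
A weak entity can be sketched as a table whose key includes its owner's key. The assignment of attributes below is an assumption read from the diagram labels: Order carries Invoice # and Rep#, and Order-line carries Invoice#, Part#, and Quantity. With a cascading foreign key in SQLite, deleting the order removes its order lines.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # needed for ON DELETE CASCADE
conn.executescript("""
CREATE TABLE orders (
    invoice_no INTEGER PRIMARY KEY,
    rep_no     INTEGER
);
CREATE TABLE order_line (
    invoice_no INTEGER NOT NULL
        REFERENCES orders (invoice_no) ON DELETE CASCADE,
    part_no    INTEGER NOT NULL,
    quantity   INTEGER NOT NULL,
    PRIMARY KEY (invoice_no, part_no)  -- owner's key plus a partial key
);
""")
conn.execute("INSERT INTO orders VALUES (1001, 7)")
conn.executemany("INSERT INTO order_line VALUES (?, ?, ?)",
                 [(1001, 1, 5), (1001, 2, 3)])

def line_count():
    return conn.execute("SELECT COUNT(*) FROM order_line").fetchone()[0]

lines_before = line_count()
# Removing the owning order removes the lines that depend on it.
conn.execute("DELETE FROM orders WHERE invoice_no = 1001")
lines_after = line_count()
```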


Supertype and Subtype Entities

[Diagram. Labels: Employee, Is one of, Sales-rep, Clerk, Other, Invoice, Sold, Manages]


Many to Many Relationships

[Diagram. Labels: Employee (SSN), Project (Proj#), Project Assignment (SSN, Proj#, Hours), with relationships "Is Assigned" and "Assigned"]
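
A common way to implement a many-to-many relationship is an intermediate table. The sketch below assumes SQLite and follows the diagram's labels, with `project_assignment` holding SSN, Proj#, and Hours. The data values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE employee (ssn TEXT PRIMARY KEY);
CREATE TABLE project (proj_no INTEGER PRIMARY KEY);
CREATE TABLE project_assignment (
    ssn     TEXT REFERENCES employee (ssn),
    proj_no INTEGER REFERENCES project (proj_no),
    hours   REAL,
    PRIMARY KEY (ssn, proj_no)  -- one row per employee-project pair
);
""")
conn.executemany("INSERT INTO employee VALUES (?)", [("111",), ("222",)])
conn.executemany("INSERT INTO project VALUES (?)", [(1,), (2,)])
conn.executemany("INSERT INTO project_assignment VALUES (?, ?, ?)",
                 [("111", 1, 10), ("111", 2, 5), ("222", 1, 20)])

# One employee can have many projects...
projects_of_111 = [p for (p,) in conn.execute(
    "SELECT proj_no FROM project_assignment WHERE ssn = '111' ORDER BY proj_no")]
# ...and one project can have many employees.
hours_on_project_1 = conn.execute(
    "SELECT SUM(hours) FROM project_assignment WHERE proj_no = 1").fetchone()[0]
```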


Requirements Analysis

• Conceptual Requirements
  – Systems Analysis Process
    • Examine all of the information sources used in existing applications
    • Identify the characteristics of each data element
      – Numeric
      – Text
      – Date/time
      – Etc.
    • Examine the tasks carried out using the information
    • Examine results or reports created using the information


Conceptual Design

• Conceptual Model
  – Merge the collective needs of all applications
  – Determine what Entities are being used
    • Some object about which information is to be maintained
  – What are the Attributes of those entities?
    • Properties or characteristics of the entity
    • What attributes uniquely identify the entity?
  – What are the Relationships between entities?
    • How do the entities interact with each other?


Developing a Conceptual Model

• Overall view of the database that integrates all the needed information discovered during the requirements analysis

• Elements of the Conceptual Model are represented by diagrams, Entity-Relationship or ER Diagrams, that show the meanings and relationships of those elements independent of any particular database systems or implementation details

• Can also be represented using other modeling tools (such as UML)


Logical Design

• Logical Model
  – How is each entity and relationship represented in the Data Model of the DBMS?
    • Hierarchic?
    • Network?
    • Relational?
    • Object-Oriented?


Physical Design

• Internal Model
  – Choices of index file structure
  – Choices of data storage formats
  – Choices of disk layout


Database Application Design

• External Model
  – User views of the integrated database
  – Making the old (or updated) applications work with the new database design
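
In a relational DBMS, user views are often implemented as SQL views. A minimal sketch, assuming SQLite and a hypothetical `employee` table with two invented views, one for a staff directory and one for payroll:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE employee (ssn TEXT PRIMARY KEY, name TEXT,
                       salary INTEGER, dept TEXT);
INSERT INTO employee VALUES ('111', 'Smith', 50000, 'Sales');
INSERT INTO employee VALUES ('222', 'Jones', 60000, 'Catalog');

-- Each application sees only the part of the integrated database it needs.
CREATE VIEW directory AS SELECT name, dept FROM employee;
CREATE VIEW payroll AS SELECT ssn, salary FROM employee;
""")

# The directory application never sees ssn or salary.
cursor = conn.execute("SELECT * FROM directory")
directory_columns = [column[0] for column in cursor.description]

# The payroll application works from its own view of the same data.
payroll_total = conn.execute("SELECT SUM(salary) FROM payroll").fetchone()[0]
```

An old application written against a differently shaped table could, in the same spirit, be pointed at a view that presents the new design in the old shape.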