COMP 530 Database Architecture and Implementation


Upload: sheena

Post on 06-Feb-2016



TRANSCRIPT

Page 1: COMP 530 Database Architecture and Implementation

Department of Computer Science, HKUST Slide 1

COMP 530 Database Architecture and Implementation


1. Introduction

Page 2: COMP 530 Database Architecture and Implementation

Why Learn DBMS?

You want to find a JOB !!!

Page 3: COMP 530 Database Architecture and Implementation

Big Names in Database Systems

Company | Product | Remarks
Oracle | Oracle 8i, 9i, etc. | World’s 2nd largest software company; CEO, Larry Ellison, world’s 2nd richest
IBM | DB2, Universal Server | World’s 2nd largest after Informix acquisition in 2001
Microsoft | SQL Server, Access | Access comes with MS Office
Sybase | Adaptive Server | CEO John Chen, grew up in HK
Oracle | MySQL | Open Source; acquired by Sun in 2008, which was in turn acquired by Oracle (announced 2009, completed 2010)
Postgres | | “World’s most advanced Open Source DBMS”

Page 4: COMP 530 Database Architecture and Implementation

Who Needs Database Systems

Corporate databases
Typical applications:
• Personnel management
• Inventory and purchase orders
• Insurance policies and customer data
• …

Web data management
Typical applications:
• Web page management
• Personalized web pages
• …

Page 5: COMP 530 Database Architecture and Implementation


There is a difference between DBMSs (Database Management Systems) and Databases

A few people work for Oracle, etc., to develop, enhance or maintain their DBMS products

Most people make a living working as DB designers, DB programmers or DB Administrators


Page 6: COMP 530 Database Architecture and Implementation

What is in a Database?

• A database contains information about a particular enterprise or a particular application. E.g., a database for an enterprise may contain everything needed for the planning and operation of the enterprise: customer information, employee information, product information, sales and expenses, etc.

You don’t have to be a company to use a database: you can store your personal information, expenses, phone numbers in a database (e.g., using Access on a PC).

As a matter of fact, you could store all data pertinent to a particular purpose in a database.

This usually means that a database stores data that are related to each other.

Page 7: COMP 530 Database Architecture and Implementation

Database Design

[Diagram: HKUST, with db designer 1 and db designer 2 each designing a database]

Academic Registration database:
• students: names, address, …
• courses: course-no, course-names, …
• classroom: number, location, …

Estate Management database:
• classroom: number, location, …
• office: number, location, …
• faculty-residence: building-no, …
• student-residence: hall-no, …

Page 8: COMP 530 Database Architecture and Implementation

Is a database the same as a bunch of files?

• You can store data in a file or a set of files, but …
– How do you input data and get the data back from the files?

• A database is managed by a DBMS.

Page 9: COMP 530 Database Architecture and Implementation

Before we had DBMS

[Diagram: each user works through a separate application (Inventory Control, Customer Order), and each application keeps its own data files]

Question: When a customer orders 10 PC monitors, how many files do you have to update?

Key issues: data sharing, data redundancy

Page 10: COMP 530 Database Architecture and Implementation

A Simple Architecture

[Diagram: applications (SQL, C/C++ programs) access the databases through the DBMS]

Shared facilities:
• Backup and recovery
• Data storage and access modules
• Programming tools, etc.

Page 11: COMP 530 Database Architecture and Implementation

Purposes of Database Systems

Database management systems were developed to handle the difficulties caused by different people writing different applications independently.

Page 12: COMP 530 Database Architecture and Implementation

Specifically …

• A DBMS attempts to resolve the following problems:
– Data redundancy and inconsistency, by keeping one copy of a data item in the database
– Difficulty in accessing data, by providing query languages and shared libraries
– Data isolation (multiple files and formats)
– Integrity problems, by enforcing constraints (e.g., age > 0)
– Atomicity of updates
– Concurrent access by multiple users
– Security problems
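As a rough illustration of the integrity point (age > 0), here is a minimal sketch using Python's built-in sqlite3 module. The table name `person` and the sample rows are assumptions made up for this example; they do not come from the slides.

```python
import sqlite3

# In-memory database so the sketch leaves nothing behind.
conn = sqlite3.connect(":memory:")

# The constraint is declared once, in the schema, not in every application.
conn.execute(
    "CREATE TABLE person ("
    " name TEXT PRIMARY KEY,"
    " age INTEGER CHECK (age > 0))"
)
conn.execute("INSERT INTO person VALUES ('John Law', 35)")

# Any application that tries to store an invalid age is stopped by the DBMS.
try:
    conn.execute("INSERT INTO person VALUES ('Bad Row', -5)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM person").fetchone()[0]
print(rejected, row_count)
```

The point of the design is that the rule lives with the data, so independently written applications cannot forget to check it.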

Page 13: COMP 530 Database Architecture and Implementation

Data Independence

• One big problem in application development is the separation of applications from data
• Do I have to change my program when I …
– replace my hard drive?
– store the data in a B-tree instead of a hash file?
– partition the data into two physical files (or merge two physical files into one)?
– store salary as a floating-point number instead of an integer?
– develop other applications that use the same set of data?
– add more data fields to support other applications?
– …

Independence between Data and Programs/Applications

Page 14: COMP 530 Database Architecture and Implementation

Data Independence

• Ability to modify a schema definition in one level without affecting a schema definition in the next higher level.
• The interfaces between the various levels and components should be well defined so that changes in some parts do not seriously influence others.
• Two levels of data independence:
– Physical data independence
– Logical data independence
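The two kinds of independence can be sketched with Python's built-in sqlite3 module. This is only an illustration under assumed names: the `employee` table, the index name, the `office` column and the second sample row are invented for the example. The same query text is run before and after a physical change (adding an index) and a logical change (adding a column).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER, name TEXT, salary INTEGER)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1129, "John Law", 5000), (1130, "Mary Chan", 6000)],
)

# The application's query; it is never edited below.
QUERY = "SELECT name FROM employee WHERE id = ?"
before = conn.execute(QUERY, (1129,)).fetchall()

# Physical change: add an index (SQLite stores indexes as B-trees).
conn.execute("CREATE INDEX idx_employee_id ON employee (id)")
after_index = conn.execute(QUERY, (1129,)).fetchall()

# Logical change: add a field needed by some other application.
conn.execute("ALTER TABLE employee ADD COLUMN office TEXT")
after_column = conn.execute(QUERY, (1129,)).fetchall()

print(before, after_index, after_column)
```

All three results are identical: the program that issues `QUERY` does not need to know how the rows are stored or which extra fields other applications use.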

Page 15: COMP 530 Database Architecture and Implementation

Data Abstraction

• The answer to the previous questions is to introduce levels of abstraction, or indirection.
• Consider how function calls allow you to change a part of your program without affecting other parts.

[Diagram: Main Program → functions → functions → data]
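The function-call analogy can be made concrete with a small Python sketch. Everything here (the `_records` store, `get_name`, `get_salary`, `main_program`) is hypothetical and only illustrates the idea: the main program reaches the data through functions, so the data layout can change without touching it.

```python
# Hypothetical in-memory "storage"; only the accessor functions know its layout.
_records = {1129: ("John Law", 5000)}

def get_name(emp_id):
    return _records[emp_id][0]

def get_salary(emp_id):
    return _records[emp_id][1]

def main_program():
    # The main program uses only the function interface, never _records itself.
    return f"{get_name(1129)} earns {get_salary(1129)}"

first = main_program()

# Change the storage layout from tuples to dicts and rewrite only the accessors.
_records = {1129: {"name": "John Law", "salary": 5000}}

def get_name(emp_id):
    return _records[emp_id]["name"]

def get_salary(emp_id):
    return _records[emp_id]["salary"]

# main_program is untouched and still produces the same answer.
second = main_program()
print(first == second)
```

A DBMS plays the role of the accessor functions for every application at once.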

Page 16: COMP 530 Database Architecture and Implementation

Three Levels of Abstraction

[Diagram: three layers, top to bottom]
View level: view 1, view 2, …, view n (e.g., Payroll, Inventory, Sales)
Logical view: Company database
Physical view: Files on disks

Page 17: COMP 530 Database Architecture and Implementation

[Diagram: an application accesses the data through three levels, each described by its own schema]
view: View definitions
logical: Logical schema
physical: Physical schema

Page 18: COMP 530 Database Architecture and Implementation

Three Levels of Abstraction (cont.)

• Physical level: describes how a record is stored on disk.
– e.g., “Divide the customer records into 3 partitions and store them on disks 1, 2 and 3.”
• Logical level: describes the data stored in the database and the relationships among the data. Similar to defining a record type in Pascal or C:

type customer = record
    name: string;
    street: string;
    city: integer;
end;

• View level: defines a subset of the database for a particular application. Views can also hide information (e.g., salary) for security purposes.
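A minimal sketch of the logical and view levels, using Python's built-in sqlite3 module. The `employee` table, the `employee_directory` view and the sample row are assumptions chosen for the example; the physical level is left entirely to SQLite.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Logical level: what data is stored (here, including a sensitive field).
conn.execute(
    "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary INTEGER)"
)
conn.execute("INSERT INTO employee VALUES (1129, 'John Law', 5000)")

# View level: a subset of the database for one application, with salary hidden.
conn.execute("CREATE VIEW employee_directory AS SELECT id, name FROM employee")

cur = conn.execute("SELECT * FROM employee_directory")
view_columns = [d[0] for d in cur.description]
view_rows = cur.fetchall()
print(view_columns, view_rows)
```

An application that is given only `employee_directory` never sees the salary column, even though it is stored in the same database.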

Page 19: COMP 530 Database Architecture and Implementation

An Example of Data Independence

[Diagram 1: program → data on disk (1129 | John Law | … …)]
A program accessing data directly has to know:
• the first 4 bytes are an ID number
• the next 10 bytes are an employee name

[Diagram 2: program → DBMS → data on disk (1129 | John Law | … …); the DBMS consults the schema]
Schema:
Employee:
ID: integer
Name: char(10)
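The contrast between the two diagrams can be sketched in Python with the standard struct module. The byte order (little-endian), the null padding of the name field and both function names are assumptions made for this illustration; the slide only fixes the 4-byte ID and the 10-byte name.

```python
import struct

# Assumed on-disk layout: 4-byte little-endian integer ID, then a 10-byte name.
raw = struct.pack("<i10s", 1129, b"John Law")

def read_employee_direct(data):
    # The program hard-codes the byte offsets, so any layout change breaks it.
    emp_id = int.from_bytes(data[0:4], "little")
    name = data[4:14].rstrip(b"\x00").decode("ascii")
    return emp_id, name

def read_employee_via_schema(data, schema):
    # A DBMS-like reader: the layout comes from the schema, not from the program.
    fmt = "<" + "".join(code for _, code in schema)
    values = struct.unpack(fmt, data)
    record = {}
    for (field, _), value in zip(schema, values):
        if isinstance(value, bytes):
            value = value.rstrip(b"\x00").decode("ascii")
        record[field] = value
    return record

# Schema for Employee: ID integer, Name char(10), written as struct format codes.
EMPLOYEE_SCHEMA = [("id", "i"), ("name", "10s")]

direct = read_employee_direct(raw)
via_schema = read_employee_via_schema(raw, EMPLOYEE_SCHEMA)
print(direct, via_schema)
```

If the name field grew to 20 bytes, only `EMPLOYEE_SCHEMA` would change in the second approach, while every program written like `read_employee_direct` would have to be edited.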

Page 20: COMP 530 Database Architecture and Implementation

Instances and Schemas

• Each level is defined by a schema, which defines the data at the corresponding level
– A logical schema defines the logical structure of the database (e.g., set of customers and accounts and the relationship between them)
– A physical schema defines the file formats and locations

• A database instance refers to the actual content of the database at a particular point in time. A database instance must conform to the corresponding schema

Page 21: COMP 530 Database Architecture and Implementation

Data Models

• A collection of tools for describing:
– data
– data relationships
– data semantics
– data constraints

Page 22: COMP 530 Database Architecture and Implementation

Entity-Relationship Model

• Example of entity-relationship model

[ER diagram]
Entity CUSTOMER: social-security, customer-name, customer-street, customer-city
Relationship DEPOSITOR: between CUSTOMER and ACCOUNT
Entity ACCOUNT: account-number, balance

Page 23: COMP 530 Database Architecture and Implementation

Relational Model

Example of tabular data in the relational model:

customer-name | social-security | customer-street | customer-city | account-number
Johnson | 192-83-7465 | Alma | Palo Alto | A-101
Smith | 019-28-3746 | North | Rye | A-215
Johnson | 192-83-7465 | Alma | Palo Alto | A-201
Jones | 321-12-3123 | Main | Harrison | A-217
Smith | 019-28-3746 | North | Rye | A-201

account-number | balance
A-101 | 500
A-201 | 900
A-215 | 700
A-217 | 750
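The two tables on this slide can be loaded into a relational DBMS and combined with a join. The sketch below uses Python's built-in sqlite3 module; the underscore column names and the choice of column types are assumptions, since the slide gives only the data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer ("
    " customer_name TEXT, social_security TEXT,"
    " customer_street TEXT, customer_city TEXT, account_number TEXT)"
)
conn.execute(
    "CREATE TABLE account (account_number TEXT PRIMARY KEY, balance INTEGER)"
)

# Rows copied from the two tables on the slide.
conn.executemany(
    "INSERT INTO customer VALUES (?, ?, ?, ?, ?)",
    [
        ("Johnson", "192-83-7465", "Alma", "Palo Alto", "A-101"),
        ("Smith", "019-28-3746", "North", "Rye", "A-215"),
        ("Johnson", "192-83-7465", "Alma", "Palo Alto", "A-201"),
        ("Jones", "321-12-3123", "Main", "Harrison", "A-217"),
        ("Smith", "019-28-3746", "North", "Rye", "A-201"),
    ],
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?)",
    [("A-101", 500), ("A-201", 900), ("A-215", 700), ("A-217", 750)],
)

# The shared account number links a customer row to an account row.
smith_accounts = conn.execute(
    "SELECT c.account_number, a.balance"
    " FROM customer c JOIN account a ON a.account_number = c.account_number"
    " WHERE c.customer_name = 'Smith'"
    " ORDER BY c.account_number"
).fetchall()
print(smith_accounts)
```

The relationship between customers and accounts is carried purely by matching values in the two tables, which is the defining feature of the relational model.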

Page 24: COMP 530 Database Architecture and Implementation

Data Definition Language (DDL)

• Specification notation for defining the database schema
– Expresses what was on the previous two slides to the DBMS in a formal language
• Data storage and definition language: a special type of DDL in which the storage structure and access methods used by the database system are specified

Page 25: COMP 530 Database Architecture and Implementation

Data Manipulation Language (DML)

• Language for accessing and manipulating the data organized by the appropriate data model
• Two classes of languages
– Procedural: user specifies what data is required and how to get those data
– Nonprocedural: user specifies what data is required without specifying how to get those data
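To illustrate the two classes, the sketch below answers the same question ("which accounts hold more than 600?") twice in Python: once with an explicit loop that spells out how to find the rows, and once with an SQL query through the built-in sqlite3 module that states only what is wanted. The threshold of 600 and the variable names are assumptions; the account data matches the relational-model example.

```python
import sqlite3

accounts = [("A-101", 500), ("A-201", 900), ("A-215", 700), ("A-217", 750)]

# Procedural: the program says how to get the data (scan, test, collect, sort).
procedural = []
for number, balance in accounts:
    if balance > 600:
        procedural.append(number)
procedural.sort()

# Nonprocedural: the query says what is required; the DBMS decides how.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_number TEXT, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", accounts)
nonprocedural = [
    row[0]
    for row in conn.execute(
        "SELECT account_number FROM account"
        " WHERE balance > 600 ORDER BY account_number"
    )
]
print(procedural, nonprocedural)
```

Because the SQL version does not fix an access strategy, the DBMS is free to use an index or a different scan order without any change to the query.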

Page 26: COMP 530 Database Architecture and Implementation

Transaction Management

• A transaction is a collection of operations that performs a single logical function in a database application

[Diagram: along a time axis, Transaction 1 and Transaction 2 overlap, with a conflicting read/write between them]

Page 27: COMP 530 Database Architecture and Implementation

Transaction Management (cont.)

• Transaction-management component ensures that the database remains in a consistent (correct) state despite system failures (e.g. power failures and operating system crashes) and transaction failures.
• Concurrency-control manager controls the interaction among the concurrent transactions, to ensure the consistency of the database.
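A transaction failure and its rollback can be sketched with Python's built-in sqlite3 module. The `transfer` function, the `balance >= 0` rule and the amounts are assumptions invented for the example. It shows only atomicity on a single connection; it does not demonstrate concurrency control.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " account_number TEXT PRIMARY KEY,"
    " balance INTEGER CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO account VALUES (?, ?)", [("A-101", 500), ("A-201", 900)]
)
conn.commit()

def transfer(conn, src, dst, amount):
    """Move money as one transaction: both updates happen, or neither does."""
    try:
        with conn:  # commits if the block succeeds, rolls back on an exception
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE account_number = ?",
                (amount, dst),
            )
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE account_number = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False

ok = transfer(conn, "A-101", "A-201", 200)       # succeeds
failed = transfer(conn, "A-101", "A-201", 1000)  # would overdraw A-101
balances = dict(conn.execute("SELECT account_number, balance FROM account"))
print(ok, failed, balances)
```

In the failed transfer the credit to A-201 has already been applied when the debit violates the constraint, yet the rollback undoes it, so the database never shows money created out of nothing.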

Page 28: COMP 530 Database Architecture and Implementation

Storage Management

• A storage manager is a program module that provides the interface between the low-level data stored in the database and the application programs and queries submitted to the system.
• The storage manager is responsible for the following tasks:
– interaction with the file manager
– efficient storing, retrieving, and updating of data

Page 29: COMP 530 Database Architecture and Implementation

Database Administrator (DBA)

• Coordinates all the activities of the database system; the database administrator has a good understanding of the enterprise’s information resources and needs.
• Database administrator’s duties include:
– Schema definition
– Specifying integrity constraints
– Storage structure and access method definition
– Schema and physical organization modification
– Granting user authority to access the database
– Acting as liaison with users
– Monitoring performance and responding to changes in requirements

[Callouts on the list of duties: “Primary job of a database designer”; “More system oriented”]

Page 30: COMP 530 Database Architecture and Implementation

Database Users

• Users are differentiated by the way they are expected to interact with the system
• Application programmers
– develop applications that interact with the DBMS through DML calls
• Sophisticated users
– form requests in a database query language
– mostly one-time ad hoc queries
• End users
– invoke one of the existing application programs (e.g., print monthly sales report)
– interact with applications through a GUI