tddd12 databasteknik: kursinformationtddd12/timetable/tddd12-fo01-introduktion_… · basic...

21
2008-04-01 1 IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering TDDD12 Databasteknik: Kursinformation

Upload: others

Post on 20-Jul-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: TDDD12 Databasteknik: KursinformationTDDD12/timetable/tddd12-fo01-introduktion_… · Basic Definitions Database: A collection of related data. Data: Known facts that can be recorded

2008

-04-

011

IDA

, TD

DD

12, f

örel

äsni

ng 1

: In

trod

uktio

n oc

h E

R-m

odel

lerin

g

TD

DD

12 D

atab

aste

knik

: Kur

sinf

orm

atio

n

Page 2: TDDD12 Databasteknik: KursinformationTDDD12/timetable/tddd12-fo01-introduktion_… · Basic Definitions Database: A collection of related data. Data: Known facts that can be recorded

1

Personal

2008-04-01 3

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Personal

Juha Takkinen, [email protected], kursansvarig, , föreläsare, labbassistent

Katarina Löfstrand, katlokursadministratör

José M. Peña, jospelabbassistent

Lena Strömbäck, lestrföreläsare, studierektor

Tommy Ellkvist, g-tomellabbassistent

Labbpolicy

2008-04-01 5

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

LiU: Disciplinary actions

Any kind of academic dishonesty, such as cheating, plagiarism, use of unauthorized assistance, fraud and failure to comply with University examination rules, may result in the filing of a complaint to the University Disciplinary Committee. The potential penalties include expulsion, suspension, and revocation of previously earned grade or degree. LiU Rules and regulations

2008-04-01 6

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Lab policyYou are expected to do the lab assignments by yourself. Merely copying others solutions will not be tolerated, even if you make cosmetic changes to the code/solution. If we suspect that this, or any other form of cheating, has happened we will report it to the disciplinary board of the university. Be prepared to be asked questions by your laboration assistant about detailed and specific code and also inquiries about why you have selected a specific solution. This applies to all lab group members.If you have problems meeting a deadline it is much better to talk to the instructor about it than to cheat.

(It is a shame that we have to say these things. They should be obvious.)

Är Google ettdatabassystem?

Page 3: TDDD12 Databasteknik: KursinformationTDDD12/timetable/tddd12-fo01-introduktion_… · Basic Definitions Database: A collection of related data. Data: Known facts that can be recorded

2

2008-04-01 8

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Information retrieval (IR) on the Internet

1. Locate document collections2. Formulate query3. Judge relevance.

Traditional IR research and development has been concentrated on 2 and 3. The Internet (the web) requires 1 too.

2008-04-01 9

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

IRS, DBMS, and AI

IRS

DBMS

AI

Data object

document

tabell

logic expressions

Basicfunction

retrieval(probabilistic)retrieval(deterministic)inference

Databasesize

small to verylargesmall to verylargeusuallysmall

2008-04-01 10

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

DBMS

deterministicSQL> select * from kund

where nummer = 17;

meets the exact information need of the userCf.: search for memory stick in discussion forums

TDDD12 DatabasteknikFöreläsning 1: Introduktion

och ER-modellering

av Juha Takkinen 2008-04-01Institutionen för datavetenskap (IDA)

Linköpings universitet

Ljusbilderna baserade på Elmasri & Navathes original

2008-04-01 12

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Ch. 1: Databases and database users

Types of Databases and Database ApplicationsBasic DefinitionsTypical DBMS FunctionalityExample of a Database (UNIVERSITY)Main Characteristics of the Database ApproachDatabase Users

Advantages of Using the Database ApproachWhen Not to Use Databases

2008-04-01 13

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Types of Databases and Database Applications

Traditional Applications:Numeric and Textual Databases

More Recent Applications:Multimedia DatabasesGeographic Information Systems (GIS)Data WarehousesReal-time and Active DatabasesMany other applications

First part of book focuses on traditional applicationsA number of recent applications are described later in the book (for example, Chapters 24,26,28,29,30)

Page 4: TDDD12 Databasteknik: KursinformationTDDD12/timetable/tddd12-fo01-introduktion_… · Basic Definitions Database: A collection of related data. Data: Known facts that can be recorded

3

2008-04-01 14

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Basic Definitions

Database:A collection of related data.

Data:Known facts that can be recorded and have an implicit meaning.

Mini-world:Some part of the real world about which data is stored in a database. For example, student grades and transcripts at a university.

Database Management System (DBMS):A software package/ system to facilitate the creation and maintenance of a computerized database.

Database System:The DBMS software together with the data itself. Sometimes, theapplications are also included.

2008-04-01

Simplified database system environment

2008-04-01 16

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Typical DBMS Functionality

Define a particular database in terms of its data types, structures, and constraintsConstruct or Load the initial database contents on a secondary storage mediumManipulate the database:

Retrieval: Querying, generating reportsModification: Insertions, deletions and updates to its contentAccessing the database through Web applications

Process and Share by a set of concurrent users and application programs – yet, keeping all data valid and consistent

2008-04-01 17

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Typical DBMS Functionality

Other features:Protection or Security measures to prevent unauthorized access“Active” processing to take internal actions on dataPresentation and Visualization of dataMaintaining the database and associated programs over the lifetime of the database application

Called database, software, and system maintenance

2008-04-01 18

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Example of a Database(with a Conceptual Data Model)

Mini-world for the example:Part of a UNIVERSITY environment.

Some mini-world entities:STUDENTsCOURSEs

SECTIONs (of COURSEs)(academic) DEPARTMENTsINSTRUCTORs

2008-04-01 19

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Example of a Database(with a Conceptual Data Model)

Some mini-world relationships:SECTIONs are of specific COURSEsSTUDENTs take SECTIONsCOURSEs have prerequisite COURSEsINSTRUCTORs teach SECTIONsCOURSEs are offered by DEPARTMENTsSTUDENTs major in DEPARTMENTs

Note: The above entities and relationships are typically expressed in a conceptual data model, such as the ENTITY-RELATIONSHIP data model (see Chapters 3, 4)

Page 5: TDDD12 Databasteknik: KursinformationTDDD12/timetable/tddd12-fo01-introduktion_… · Basic Definitions Database: A collection of related data. Data: Known facts that can be recorded

4

2008-04-01

Example of a simple database

2008-04-01 21

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Main Characteristics of the Database Approach

Self-describing nature of a database system:A DBMS catalog stores the description of a particular database (e.g. data structures, types, and constraints)The description is called meta-data.This allows the DBMS software to work with different database applications.

Insulation between programs and data:Called program-data independence.Allows changing data structures and storage organization withouthaving to change the DBMS access programs.

2008-04-01 22

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Example of a simplified database catalog

Example of a simplified database catalog

2008-04-01 23

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Main Characteristics of the Database Approach (continued)

Data Abstraction: A data model is used to hide storage details and present the users with a conceptual view of the database.Programs refer to the data model constructs rather than data storage details

Support of multiple views of the data:Each user may see a different view of the database, which describes only the data of interest to that user.

2008-04-01 24

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Main Characteristics of the Database Approach (continued)

Sharing of data and multi-user transaction processing:Allowing a set of concurrent users to retrieve from and to update the database.Concurrency control within the DBMS guarantees that each transaction is correctly executed or abortedRecovery subsystem ensures each completed transaction has its effect permanently recorded in the databaseOLTP (Online Transaction Processing) is a major part of database applications. This allows hundreds of concurrent transactions toexecute per second.

2008-04-01 25

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Database Users

Users may be divided intoThose who actually use and control the database content, and those who design, develop and maintain database applications (called “Actors on the Scene”), andThose who design and develop the DBMS software and related tools, and the computer systems operators (called “Workers Behind the Scene”).

Page 6: TDDD12 Databasteknik: KursinformationTDDD12/timetable/tddd12-fo01-introduktion_… · Basic Definitions Database: A collection of related data. Data: Known facts that can be recorded

5

2008-04-01 26

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Database Users

Actors on the sceneDatabase administrators:

Responsible for authorizing access to the database, for coordinating and monitoring its use, acquiring software and hardware resources, controlling its use and monitoring efficiency of operations.

Database Designers:Responsible to define the content, the structure, the constraints, and functions or transactions against the database. They must communicate with the end-users and understand their needs.

2008-04-01 27

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Categories of End-users

Actors on the scene (continued)End-users: They use the data for queries, reports and some of them update the database content. End-users can be categorized into:

Casual: access database occasionally when neededNaïve or Parametric: they make up a large section of the end-user population.

They use previously well-defined functions in the form of “canned transactions” against the database.Examples are bank-tellers or reservation clerks who do this activity for an entire shift of operations.

2008-04-01 28

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Categories of End-users (continued)

Sophisticated:These include business analysts, scientists, engineers, others thoroughly familiar with the system capabilities.Many use tools in the form of software packages that work closely with the stored database.

Stand-alone:Mostly maintain personal databases using ready-to-use packaged applications.An example is a tax program user that creates its own internal database.Another example is a user that maintains an address book

2008-04-01 29

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Advantages of Using the Database Approach

Controlling redundancy in data storage and in development and maintenance efforts.

Sharing of data among multiple users.

Restricting unauthorized access to data.Providing persistent storage for program Objects

In Object-oriented DBMSs – see Chapters 20-22.

Providing Storage Structures (e.g. indexes) for efficient Query Processing.

2008-04-01 30

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Advantages of Using the Database Approach (continued)

Providing backup and recovery services.Providing multiple interfaces to different classes of users.Representing complex relationships among data.Enforcing integrity constraints on the database.Drawing inferences and actions from the stored data using deductive and active rules.

2008-04-01 31

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Additional Implications of Using the Database Approach

Potential for enforcing standards:This is very crucial for the success of database applications in large organizations. Standards refer to data item names, display formats, screens, report structures, meta-data (description of data), Web page layouts, etc.

Reduced application development time:Incremental time to add each new application is reduced.

Page 7: TDDD12 Databasteknik: KursinformationTDDD12/timetable/tddd12-fo01-introduktion_… · Basic Definitions Database: A collection of related data. Data: Known facts that can be recorded

6

2008-04-01 32

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Additional Implications of Using the Database Approach (continued)

Flexibility to change data structures:Database structure may evolve as new requirements are defined.

Availability of current information:Extremely important for on-line transaction systems such as airline, hotel, car reservations.

Economies of scale:Wasteful overlap of resources and personnel can be avoided by consolidating data and applications across departments.

2008-04-01 33

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Historical Development of Database Technology

Early Database Applications:The Hierarchical and Network Models were introduced in mid 1960sand dominated during the seventies.A bulk of the worldwide database processing still occurs using these models, particularly, the hierarchical model.

Relational Model based Systems:Relational model was originally introduced in 1970, was heavily researched and experimented within IBM Research and several universities.Relational DBMS Products emerged in the early 1980s.

2008-04-01 34

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Historical Development of Database Technology (continued)

Object-oriented and emerging applications:Object-Oriented Database Management Systems (OODBMSs) were introduced in late 1980s and early 1990s to cater to the need ofcomplex data processing in CAD and other applications.

Their use has not taken off much.

Many relational DBMSs have incorporated object database concepts, leading to a new category called object-relational DBMSs(ORDBMSs)Extended relational systems add further capabilities (e.g. for multimedia data, XML, and other data types)

2008-04-01 35

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Historical Development of Database Technology (continued)

Data on the Web and E-commerce Applications:Web contains data in HTML (Hypertext markup language) with links among pages.This has given rise to a new set of applications and E-commerce is using new standards like XML (eXtendedMarkup Language). (see Ch. 27).Script programming languages such as PHP and JavaScript allow generation of dynamic Web pages that are partially generated from a database (see Ch. 26).

Also allow database updates through Web pages

2008-04-01 36

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

When not to use a DBMS

Main inhibitors (costs) of using a DBMS:High initial investment and possible need for additional hardware.Overhead for providing generality, security, concurrency control, recovery, and integrity functions.

When a DBMS may be unnecessary:If the database and applications are simple, well defined, and not expected to change.If there are stringent real-time requirements that may not be met because of DBMS overhead.If access to data by multiple users is not required.

2008-04-01 37

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

When not to use a DBMS

When no DBMS may suffice:If the database system is not able to handle the complexity of data because of modeling limitationsIf the database users need special operations not supported by the DBMS.

Page 8: TDDD12 Databasteknik: KursinformationTDDD12/timetable/tddd12-fo01-introduktion_… · Basic Definitions Database: A collection of related data. Data: Known facts that can be recorded

7

2008-04-01 38

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Ch. 2: Database System Concepts and Architecture

Data Models and Their CategoriesHistory of Data ModelsSchemas, Instances, and StatesThree-Schema ArchitectureData IndependenceDBMS Languages and InterfacesDatabase System Utilities and ToolsCentralized and Client-Server ArchitecturesClassification of DBMSs

2008-04-01 39

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Data Models

Data Model:A set of concepts to describe the structure of a database, the operations for manipulating these structures, and certain constraints that the database should obey.

Data Model Structure and Constraints:Constructs are used to define the database structureConstructs typically include elements (and their data types) as well as groups of elements (e.g. entity, record, table), and relationships among such groupsConstraints specify some restrictions on valid data; these constraints must be enforced at all times

2008-04-01 40

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Data Models (continued)

Data Model Operations:These operations are used for specifying database retrievalsand updates by referring to the constructs of the data model.Operations on the data model may include basic model operations (e.g. generic insert, delete, update) and user-defined operations (e.g. compute_student_mean_grade, update_inventory)

2008-04-01 41

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Categories of Data Models

Conceptual (high-level, semantic) data models:Provide concepts that are close to the way many users perceive data.

(Also called entity-based or object-based data models.)

Physical (low-level, internal) data models:Provide concepts that describe details of how data is stored in the computer. These are usually specified in an ad-hoc manner through DBMS design and administration manuals

Implementation (representational) data models:Provide concepts that fall between the above two, used by many commercial DBMS implementations (e.g. relational data models used in many commercial systems).

2008-04-01 42

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Schemas versus Instances

Database Schema:The description of a database.Includes descriptions of the database structure, data types, and the constraints on the database.

Schema Diagram:An illustrative display of (most aspects of) a database schema.

Schema Construct:A component of the schema or an object within the schema, e.g., STUDENT, COURSE.

2008-04-01 43

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Schemas versus Instances

Database State:The actual data stored in a database at a particular moment in time. This includes the collection of all the data in the database.Also called database instance (or occurrence or snapshot).

The term instance is also applied to individual database components, e.g. record instance, table instance, entity instance

Page 9: TDDD12 Databasteknik: KursinformationTDDD12/timetable/tddd12-fo01-introduktion_… · Basic Definitions Database: A collection of related data. Data: Known facts that can be recorded

8

2008-04-01 44

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Database Schema vs. Database State

Database State: Refers to the content of a database at a moment in time.

Initial Database State:Refers to the database state when it is initially loaded into the system.

Valid State:A state that satisfies the structure and constraints of the database.

2008-04-01 45

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Database Schema vs. Database State (continued)

DistinctionThe database schema changes very infrequently. The database state changes every time the database is updated.

Schema is also called intension.State is also called extension.

2008-04-01 46

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Example of a Database Schema

2008-04-01

Example of a database state

2008-04-01 48

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Three-Schema Architecture

Proposed to support DBMS characteristics of:Program-data independence.Support of multiple views of the data.

Not explicitly used in commercial DBMS products, but has been useful in explaining database system organization

2008-04-01 49

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

The three-schema architecture

Page 10: TDDD12 Databasteknik: KursinformationTDDD12/timetable/tddd12-fo01-introduktion_… · Basic Definitions Database: A collection of related data. Data: Known facts that can be recorded

9

2008-04-01 50

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Data Independence

Logical Data Independence: The capacity to change the conceptual schema without having to change the external schemas and their associated application programs.

Physical Data Independence:The capacity to change the internal schema without having to change the conceptual schema.For example, the internal schema may be changed when certain file structures are reorganized or new indexes are created to improve database performance

2008-04-01 51

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Data Independence (continued)

When a schema at a lower level is changed, only the mappings between this schema and higher-level schemas need to be changed in a DBMS that fully supports data independence.The higher-level schemas themselves are unchanged.

Hence, the application programs need not be changed since they refer to the external schemas.

2008-04-01 52

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

DBMS Languages

Data Definition Language (DDL)Data Manipulation Language (DML)

High-Level or Non-procedural Languages: These include the relational language SQL

May be used in a standalone way or may be embedded in a programming language

Low Level or Procedural Languages:These must be embedded in a programming language

2008-04-01 53

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

DBMS Languages

Data Definition Language (DDL): Used by the DBA and database designers to specify the conceptual schema of a database.In many DBMSs, the DDL is also used to define internal and external schemas (views).In some DBMSs, separate storage definition language (SDL) and view definition language (VDL) are used to define internal and external schemas.

SDL is typically realized via DBMS commands provided to the DBA and database designers

2008-04-01 54

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

DBMS Languages

Data Manipulation Language (DML):Used to specify database retrievals and updatesDML commands (data sublanguage) can be embedded in a general-purpose programming language (host language), such as COBOL, C, C++, or Java.

A library of functions can also be provided to access the DBMS from a programming language

Alternatively, stand-alone DML commands can be applied directly (called a query language).

2008-04-01 55

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Types of DML

High Level or Non-procedural Language:For example, the SQL relational languageAre “set”-oriented and specify what data to retrieve rather than how to retrieve it. Also called declarative languages.

Low Level or Procedural Language:Retrieve data one record-at-a-time;

Constructs such as looping are needed to retrieve multiple records, along with positioning pointers.

Page 11: TDDD12 Databasteknik: KursinformationTDDD12/timetable/tddd12-fo01-introduktion_… · Basic Definitions Database: A collection of related data. Data: Known facts that can be recorded

10

2008-04-01 56

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

DBMS Interfaces

Stand-alone query language interfacesExample: Entering SQL queries at the DBMS interactive SQL interface (e.g. SQL*Plus in ORACLE)

Programmer interfaces for embedding DML in programming languagesUser-friendly interfaces

Menu-based, forms-based, graphics-based, etc.

2008-04-01 57

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

DBMS Programming Language Interfaces

Programmer interfaces for embedding DML in a programming languages:

Embedded Approach: e.g. embedded SQL (for C, C++, etc.), SQLJ (for Java)Procedure Call Approach: e.g. JDBC for Java, ODBC for other programming languagesDatabase Programming Language Approach: e.g. ORACLE has PL/SQL, a programming language based on SQL; language incorporates SQL and its data types as integral component

2008-04-01 58

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

User-Friendly DBMS Interfaces

Menu-based, popular for browsing on the webForms-based, designed for naïve usersGraphics-based

(Point and Click, Drag and Drop, etc.)

Natural language: requests in written EnglishCombinations of the above:

For example, both menus and forms used extensively in Web database interfaces

2008-04-01 59

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Database System Utilities

To perform certain functions such as:Loading data stored in files into a database. Includes data conversion tools.Backing up the database periodically on tape.Reorganizing database file structures.Report generation utilities.Performance monitoring utilities.

Other functions, such as sorting, user monitoring, data compression, etc.

2008-04-01

Typical DBMS Compo-nentModules

2008-04-01 61

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

A Physical Centralized Architecture

Page 12: TDDD12 Databasteknik: KursinformationTDDD12/timetable/tddd12-fo01-introduktion_… · Basic Definitions Database: A collection of related data. Data: Known facts that can be recorded

11

2008-04-01 62

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Basic two-tier Client-Server Architectures

Specialized Servers with Specialized functionsPrint serverFile serverDBMS server

Web serverE-mail server

Clients can access the specialized servers as needed

2008-04-01 63

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Logical two-tier client server architecture

2008-04-01 64

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Two-tier Client-Server Architecture

A client program may connect to several DBMSs,sometimes called the data sources.In general, data sources can be files or other non-DBMS software that manages data.Other variations of clients are possible: e.g., in some object DBMSs, more functionality is transferred to clients including data dictionary functions, optimization and recovery across multiple servers, etc.

2008-04-01 65

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Three-tier Client-Server Architecture

Common for Web applicationsIntermediate Layer called Application Server or Web Server:

Stores the web connectivity software and the business logic part of the application used to access the corresponding data from the database serverActs like a conduit for sending partially processed data between the database server and the client.

Three-tier Architecture Can Enhance Security: Database server only accessible via middle tierClients cannot directly access database server

2008-04-01 66

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Three-tier client-server architecture

2008-04-01 67

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Classification of DBMSs

Based on the data model usedTraditional: Relational, Network, Hierarchical.Emerging: Object-oriented, Object-relational.

Other classificationsSingle-user (typically used with personal computers)vs. multi-user (most DBMSs – and important in this course!).Centralized (uses a single computer with one database) vs. distributed (uses multiple computers, multiple databases)

Page 13: TDDD12 Databasteknik: KursinformationTDDD12/timetable/tddd12-fo01-introduktion_… · Basic Definitions Database: A collection of related data. Data: Known facts that can be recorded

12

2008-04-01 68

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Variations of Distributed DBMSs(DDBMSs)

Homogeneous DDBMSHeterogeneous DDBMSFederated or Multidatabase SystemsDistributed Database Systems have now come to be known as client-server based database systems because:

They do not support a totally distributed environment, but rather a set of database servers supporting a set of clients.

2008-04-01 69

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

History of Data Models

Network ModelHierarchical ModelRelational ModelObject-oriented Data ModelsObject-Relational Models

2008-04-01 70

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

History of Data Models

Network Model:The first network DBMS was implemented by Honeywell in 1964-65 (IDS System).Adopted heavily due to the support by CODASYL (Conference on Data Systems Languages) (CODASYL -DBTG report of 1971).Later implemented in a large variety of systems - IDMS (Cullinet - now Computer Associates), DMS 1100 (Unisys), IMAGE (H.P. (Hewlett-Packard)), VAX -DBMS (Digital Equipment Corp., next COMPAQ, now H.P.).

2008-04-01 71

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Example of Network Model Schema

2008-04-01 72

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Network Model

Advantages:Network Model is able to model complex relationships and represents semantics of add/delete on the relationships.Can handle most situations for modeling using record types and relationship types.Language is navigational; uses constructs like FIND, FIND member, FIND owner, FIND NEXT within set, GET, etc.

Programmers can do optimal navigation through the database.

2008-04-01 73

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Network Model

Disadvantages:Navigational and procedural nature of processingDatabase contains a complex array of pointers that thread through a set of records.

Little scope for automated “query optimization”

Page 14: TDDD12 Databasteknik: KursinformationTDDD12/timetable/tddd12-fo01-introduktion_… · Basic Definitions Database: A collection of related data. Data: Known facts that can be recorded

13

2008-04-01 74

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

History of Data Models

Hierarchical Data Model:Initially implemented in a joint effort by IBM and North American Rockwell around 1965. Resulted in the IMS family of systems.IBM’s IMS product had (and still has) a very large customer base worldwideHierarchical model was formalized based on the IMS systemOther systems based on this model: System 2k (SAS inc.)

2008-04-01 75

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Hierarchical Model

Advantages:Simple to construct and operateCorresponds to a number of natural hierarchically organized domains, e.g., organization (“org”) chartLanguage is simple:

Uses constructs like GET, GET UNIQUE, GET NEXT, GET NEXT WITHIN PARENT, etc.

Disadvantages:Navigational and procedural nature of processingDatabase is visualized as a linear arrangement of recordsLittle scope for "query optimization"

2008-04-01 76

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Relational Model

Proposed in 1970 by E. F. Codd (IBM), first commercial system in 1981-82.Now in several commercial products (e.g. DB2, ORACLE, MS SQL Server, SYBASE, INFORMIX).Several free open source implementations, e.g. MySQL, PostgreSQLCurrently most dominant for developing database applications.SQL relational standards: SQL-89 (SQL1), SQL-92 (SQL2), SQL-99, SQL3, …Chapters 5 through 11 describe this model in detail

2008-04-01 77

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Object-oriented Data Models

Several models have been proposed for implementing in a database system. One set comprises models of persistent O-O Programming Languages such as C++ (e.g., in OBJECTSTORE or VERSANT), and Smalltalk (e.g., in GEMSTONE).Additionally, systems like O2, ORION (at MCC - then ITASCA), IRIS (at H.P.- used in Open OODB).Object Database Standard: ODMG-93, ODMG-version 2.0, ODMG-version 3.0.Chapters 20 and 21 describe this model.

2008-04-01 78

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Object-Relational Models

Most Recent Trend. Started with Informix Universal Server.Relational systems incorporate concepts from object databases leading to object-relational.Exemplified in the latest versions of Oracle-10i, DB2, and SQL Server and other DBMSs.Standards included in SQL-99 and expected to be enhanced in future SQL standards.Chapter 22 describes this model.

2008-04-01 79

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Ch. 3: Data Modeling Using the Entity-Relationship (ER) Model

Overview of Database Design ProcessExample Database Application (COMPANY)ER Model Concepts

Entities and AttributesEntity Types, Value Sets, and Key AttributesRelationships and Relationship TypesWeak Entity TypesRoles and Attributes in Relationship Types

ER Diagrams - NotationER Diagram for COMPANY SchemaAlternative Notations – UML class diagrams, others

Page 15: TDDD12 Databasteknik: KursinformationTDDD12/timetable/tddd12-fo01-introduktion_… · Basic Definitions Database: A collection of related data. Data: Known facts that can be recorded

14

2008-04-01 80

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Overview of Database Design Process

Two main activities:Database designApplications design

Focus in this chapter on database designTo design the conceptual schema for a database application

Applications design focuses on the programs and interfaces that access the database

Generally considered part of software engineering

2008-04-01

Over-view of Data-base Design Process

2008-04-01 82

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Example COMPANY Database

We need to create a database schema design based on the following (simplified) requirements of the COMPANY Database:

The company is organized into DEPARTMENTs. Each department has a name, number and an employee who manages the department. We keep track of the start date of the department manager. A department may have several locations.Each department controls a number of PROJECTs. Each project has a unique name, unique number and is located at a single location.

2008-04-01 83

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Example COMPANY Database (Contd.)

We store each EMPLOYEE’s social security number, address, salary, sex, and birthdate.

Each employee works for one department but may work onseveral projects.We keep track of the number of hours per week that an employee currently works on each project.We also keep track of the direct supervisor of each employee.

Each employee may have a number of DEPENDENTs.For each dependent, we keep track of their name, sex, birthdate, and relationship to the employee.

2008-04-01 84

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

ER Model Concepts

Entities and AttributesEntities are specific objects or things in the mini-world that are represented in the database.

For example the EMPLOYEE John Smith, the Research DEPARTMENT, the ProductX PROJECT

Attributes are properties used to describe an entity.For example an EMPLOYEE entity may have the attributes Name, SSN, Address, Sex, BirthDate

A specific entity will have a value for each of its attributes.For example a specific employee entity may have Name='John Smith', SSN='123456789', Address ='731, Fondren, Houston, TX', Sex='M', BirthDate='09-JAN-55‘

Each attribute has a value set (or data type) associated with it – e.g. integer, string, subrange, enumerated type, …

2008-04-01 85

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Types of Attributes (1)

SimpleEach entity has a single atomic value for the attribute. For example, SSN or Sex.

CompositeThe attribute may be composed of several components. For example:

Address(Apt#, House#, Street, City, State, ZipCode, Country), orName(FirstName, MiddleName, LastName).Composition may form a hierarchy where some components are themselves composite.

Multi-valuedAn entity may have multiple values for that attribute. For example, Color of a CAR or PreviousDegrees of a STUDENT.

Denoted as {Color} or {PreviousDegrees}.

Page 16: TDDD12 Databasteknik: KursinformationTDDD12/timetable/tddd12-fo01-introduktion_… · Basic Definitions Database: A collection of related data. Data: Known facts that can be recorded

15

2008-04-01 86

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Types of Attributes (2)

In general, composite and multi-valued attributes may be nested arbitrarily to any number of levels, although this is rare.

For example, PreviousDegrees of a STUDENT is a composite multi-valued attribute denoted by {PreviousDegrees (College, Year, Degree, Field)}Multiple PreviousDegrees values can existEach has four subcomponent attributes:

College, Year, Degree, Field

2008-04-01 87

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Example of a composite attribute

2008-04-01 88

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Entity Types and Key Attributes (1)

Entities with the same basic attributes are grouped or typed into an entity type.

For example, the entity type EMPLOYEE and PROJECT.

An attribute of an entity type for which each entity must have a unique value is called a key attribute of the entity type.

For example, SSN of EMPLOYEE.

2008-04-01 89

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Entity Types and Key Attributes (2)

A key attribute may be composite. VehicleTagNumber is a key of the CAR entity type with components (Number, State).

An entity type may have more than one key. The CAR entity type may have two keys:

VehicleIdentificationNumber (popularly called VIN)VehicleTagNumber (Number, State), aka license plate number.

Each key is underlined

2008-04-01 90

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Displaying an Entity Set: Entity Type CAR with two keys and a corresponding Entity Set

2008-04-01 91

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Entity Set

Each entity type will have a collection of entities stored in the database

Called the entity set

Previous slide shows three CAR entity instances in the entity set for CARSame name (CAR) used to refer to both the entity type and the entity setEntity set is the current state of the entities of that type that are stored in the database

Page 17: TDDD12 Databasteknik: KursinformationTDDD12/timetable/tddd12-fo01-introduktion_… · Basic Definitions Database: A collection of related data. Data: Known facts that can be recorded

16

2008-04-01 92

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Initial Design of Entity Types for the COMPANY Database Schema

Based on the requirements, we can identify four initial entity types in the COMPANY database:

DEPARTMENTPROJECTEMPLOYEEDEPENDENT

Their initial design is shown on the following slideThe initial attributes shown are derived from the requirements description

Initial Design of Entity Types:EMPLOYEE, DEPARTMENT, PROJECT, DEPENDENT

2008-04-01 94

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Refining the initial design by introducing relationships

The initial design is typically not completeSome aspects in the requirements will be represented as relationshipsER model has three main concepts:

Entities (and their entity types and entity sets)Attributes (simple, composite, multivalued)

Relationships (and their relationship types and relationship sets)

We introduce relationship concepts next

2008-04-01 95

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Relationships and Relationship Types (1)

A relationship relates two or more distinct entities with a specific meaning.

For example, EMPLOYEE John Smith works on the ProductXPROJECT, or EMPLOYEE Franklin Wong manages the Research DEPARTMENT.

Relationships of the same type are grouped or typed into a relationship type.

For example, the WORKS_ON relationship type in which EMPLOYEEs and PROJECTs participate, or the MANAGES relationship type in which EMPLOYEEs and DEPARTMENTsparticipate.

The degree of a relationship type is the number of participatingentity types.

Both MANAGES and WORKS_ON are binary relationships.

2008-04-01 96

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Relationship instances of the WORKS_FOR N:1 relationship between EMPLOYEE and DEPARTMENT

2008-04-01 97

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Relationship instances of the M:N WORKS_ON relationship between EMPLOYEE and PROJECT

Page 18: TDDD12 Databasteknik: KursinformationTDDD12/timetable/tddd12-fo01-introduktion_… · Basic Definitions Database: A collection of related data. Data: Known facts that can be recorded

17

2008-04-01 98

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Relationship type vs. relationship set (1)

Relationship Type:Is the schema description of a relationshipIdentifies the relationship name and the participating entity typesAlso identifies certain relationship constraints

Relationship Set:The current set of relationship instances represented in the databaseThe current state of a relationship type

2008-04-01 99

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Relationship type vs. relationship set (2)

Previous figures displayed the relationship setsEach instance in the set relates individual participating entities – one from each participating entity typeIn ER diagrams, we represent the relationship type as follows:

Diamond-shaped box is used to display a relationship typeConnected to the participating entity types via straight lines

2008-04-01 100

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Refining the COMPANY database schema by introducing relationships

By examining the requirements, six relationship types are identifiedAll are binary relationships (degree 2)Listed below with their participating entity types:

WORKS_FOR (between EMPLOYEE, DEPARTMENT)MANAGES (also between EMPLOYEE, DEPARTMENT)CONTROLS (between DEPARTMENT, PROJECT)WORKS_ON (between EMPLOYEE, PROJECT)SUPERVISION (between EMPLOYEE (as subordinate), EMPLOYEE (as supervisor))DEPENDENTS_OF (between EMPLOYEE, DEPENDENT)

2008-04-01

ER DIAGRAM – Relationship Types are:WORKS_FOR, MANAGES, WORKS_ON, CONTROLS, SUPERVISION, DEPENDENTS_OF

2008-04-01 102

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Discussion on Relationship Types

In the refined design, some attributes from the initial entity types are refined into relationships:

Manager of DEPARTMENT -> MANAGESWorks_on of EMPLOYEE -> WORKS_ONDepartment of EMPLOYEE -> WORKS_FORetc

In general, more than one relationship type can exist between the same participating entity types

MANAGES and WORKS_FOR are distinct relationship types between EMPLOYEE and DEPARTMENTDifferent meanings and different relationship instances.

2008-04-01 103

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Recursive Relationship Type

A relationship type which is with the same participating entity type in distinct rolesExample: the SUPERVISION relationshipEMPLOYEE participates twice in two distinct roles:

supervisor (or boss) rolesupervisee (or subordinate) role

Each relationship instance relates two distinct EMPLOYEE entities:

One employee in supervisor roleOne employee in supervisee role

Page 19: TDDD12 Databasteknik: KursinformationTDDD12/timetable/tddd12-fo01-introduktion_… · Basic Definitions Database: A collection of related data. Data: Known facts that can be recorded

18

2008-04-01 104

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Weak Entity Types

An entity that does not have a key attributeA weak entity must participate in an identifying relationship type with an owner or identifying entity typeEntities are identified by the combination of:

A partial key of the weak entity typeThe particular entity they are related to in the identifying entity type

Example: A DEPENDENT entity is identified by the dependent’s first name, andthe specific EMPLOYEE with whom the dependent is relatedName of DEPENDENT is the partial keyDEPENDENT is a weak entity typeEMPLOYEE is its identifying entity type via the identifying relationship type DEPENDENT_OF

2008-04-01 105

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Constraints on Relationships

Constraints on Relationship Types(Also known as ratio constraints)Cardinality Ratio (specifies maximum participation)

One-to-one (1:1)One-to-many (1:N) or Many-to-one (N:1)Many-to-many (M:N)

Existence Dependency Constraint (specifies minimum participation) (also called participation constraint)

zero (optional participation, not existence-dependent)one or more (mandatory participation, existence-dependent)

2008-04-01 106

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Displaying a recursive relationshipIn a recursive relationship type:

Both participations are same entity type in different roles.For example, SUPERVISION relationships between EMPLOYEE (in role of supervisor or boss) and (another) EMPLOYEE (in role of subordinate or worker).

In following figure, first role participation labeled with 1 and second role participation labeled with 2.In ER diagram, need to display role names to distinguish participations.

2008-04-01 107

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

A Recursive Relationship Supervision

2008-04-01

Recursive Relationship Type is: SUPERVISION(participation role names are shown)

2008-04-01 109

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Attributes of Relationship types

A relationship type can have attributes:For example, HoursPerWeek of WORKS_ONIts value for each relationship instance describes the number of hours per week that an EMPLOYEE works on a PROJECT.

A value of HoursPerWeek depends on a particular (employee, project) combination

Most relationship attributes are used with M:N relationshipsIn 1:N relationships, they can be transferred to the entity type on the N-side of the relationship

Page 20: TDDD12 Databasteknik: KursinformationTDDD12/timetable/tddd12-fo01-introduktion_… · Basic Definitions Database: A collection of related data. Data: Known facts that can be recorded

19

2008-04-01

Example Attribute of a Relationship Type: Hours of WORKS_ON

2008-04-01 111

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Notation for Constraints on Relationships

Cardinality ratio (of a binary relationship): 1:1, 1:N, N:1, or M:N

Shown by placing appropriate numbers on the relationship edges.

Participation constraint (on each participating entity type): total (called existence dependency) or partial.

Total shown by double line, partial by single line.

NOTE: These are easy to specify for Binary Relationship Types.

2008-04-01 112

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Alternative (min, max) notation for relationship structural constraints:

Specified on each participation of an entity type E in a relationship type RSpecifies that each entity e in E participates in at least min and at most max relationship instances in RDefault(no constraint): min=0, max=n (signifying no limit)Must have min≤max, min≥0, max ≥1Derived from the knowledge of mini-world constraintsExamples:

A department has exactly one manager and an employee can manage at most one department.

Specify (0,1) for participation of EMPLOYEE in MANAGESSpecify (1,1) for participation of DEPARTMENT in MANAGES

An employee can work for exactly one department but a department can have any number of employees.

Specify (1,1) for participation of EMPLOYEE in WORKS_FORSpecify (0,n) for participation of DEPARTMENT in WORKS_FOR

2008-04-01 113

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

The (min,max) notation for relationship constraints

Read the min,max numbers next to the entity type and looking away from the entity type

2008-04-01 114

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

COMPANY ER Schema Diagram using (min, max) notation

2008-04-01 115

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Summary of notation for ER diagrams

Page 21: TDDD12 Databasteknik: KursinformationTDDD12/timetable/tddd12-fo01-introduktion_… · Basic Definitions Database: A collection of related data. Data: Known facts that can be recorded

20

2008-04-01 116

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Alternative diagrammatic notation:UML class diagrams

Represent classes (similar to entity types) as large rounded boxes with three sections:

Top section includes entity type (class) nameSecond section includes attributesThird section includes class operations (operations are not in basic ER model)

Relationships (called associations) represented as lines connecting the classes

Other UML terminology also differs from ER terminologyUsed in database design and object-oriented software designUML has many other types of diagrams for software design (see Chapter 12)

2008-04-01 117

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

UML class diagram for COMPANY database schema

2008-04-01 118

IDA, TDDD12, föreläsning 1: Introduktion och ER-modellering

Other alternative diagrammatic notations