Supplement 01 (b ) Database Introduction-2 HSQ - DATABASES & SQL And Franchise Colleges By MANSHA NAWAZ Supplement 01(b) Database-Introduction-2


Post on 20-Jan-2016


Supplement 01 (b) Database Introduction-2

HSQ - DATABASES & SQL

And Franchise Colleges

By MANSHA NAWAZ

Supplement 01(b) Database-Introduction-2

The Database Concept (4GL TOOLS & TECHNIQUES)

• A Database
– A database is a collection of non-redundant data shareable between different application systems. [Howe, D.R. 1989]
– This does not imply a computer system.
– Is a well-organised filing cabinet a database by this definition?

• A Database Management System (DBMS)
– A sophisticated software development package capable of handling a system's database storage needs.

• We are particularly interested in Relational Database Management Systems (RDBMS) in this module.

Application Systems Sharing Data

• The Applications (Application Systems / Programs etc.)
– The Admission System uses patient and medical staff data.
– The Operation Scheduling Report uses operating theatre, patient and medical staff data.
– The Medical Staff Report uses medical staff data.

• The application systems share data.

[Diagram: the Admission System, the Operation Scheduling Report and the Medical Staff Report all connected to a single shared Database.]

Data Models

• A Data Model provides a particular way of thinking about data, at least in terms of its structure.
– Data models include data descriptions, data relationships, data semantics and consistency constraints. [Silberschatz et al.]
– A data model comprises three components: data structures, data manipulators and general integrity rules. [Codd (1970)]

• There are many types of data model.
– Hierarchic, Network, Relational, Multidimensional, Entity-Relationship, Object-Oriented, Multimedia, etc.

• Each database uses a definition language, which imposes restrictions on
– what can be defined
– how entities relate to each other

• In this module we are interested mainly in the Relational and Entity-Relationship data models.

• Why? The principles involved in Entity-Relationship Modelling apply to all data models.

The Relational Model

• The relational model was first proposed in 1970 by Dr E F (Ted) Codd in the paper ‘A relational model of data for large shared data banks’.
– It achieves program/data independence by treating data in a disciplined way.
– It has a mathematical basis; the term “relation” comes from this.
   • Apply the rigour of mathematics
   • Use set theory

• Determining data structure.
– Data is stored in a structure of relations (tables) defined by a data definition language (DDL).
– The elements of data structure used in relational models are relations (tables), attributes, tuples (rows), and domains. [Rolland p39-44]

• Defining data integrity.
– Data integrity means that data remains stable, secure, and accurate.
– It is maintained by internal constraints known as integrity rules that are invisible to users. [Rolland p45-48]

The Relational Database

• A relational database is made up of relations (tables) in which data are stored.

• A relation (table) is a 2-dimensional structure made up of attributes (columns) and tuples (rows).

Relation

• A relation is a table that obeys the following rules:
– There are no duplicate rows in the table.
– The order of the rows is immaterial.
– The order of the columns is immaterial.
– Each attribute value is atomic, i.e. each cell can contain one and only one data value.
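The four rules can be tried out in a few lines of Python. This is only a sketch under stated assumptions: the helper `make_relation` is hypothetical, and modelling a relation as a set of rows (each row a frozenset of attribute/value pairs) is one convenient illustration, not how a DBMS stores data.

```python
# Hypothetical sketch: a relation modelled as a set of rows.
# Each row is a frozenset of (attribute, value) pairs, so neither row order
# nor column order affects equality, and duplicate rows collapse.

def make_relation(rows):
    """Build a relation from a list of dicts, rejecting non-atomic values."""
    relation = set()
    for row in rows:
        for value in row.values():
            # "Atomic" is taken here to mean a single scalar value.
            if isinstance(value, (list, set, tuple, dict)):
                raise ValueError(f"non-atomic value: {value!r}")
        relation.add(frozenset(row.items()))
    return relation


animals_a = make_relation([
    {"ANAME": "Candice", "AFAMILY": "Camel", "WEIGHT": 1800},
    {"ANAME": "Zona", "AFAMILY": "Zebra", "WEIGHT": 900},
])

# Same data with rows reordered, columns reordered and one row duplicated.
animals_b = make_relation([
    {"WEIGHT": 900, "AFAMILY": "Zebra", "ANAME": "Zona"},
    {"ANAME": "Candice", "AFAMILY": "Camel", "WEIGHT": 1800},
    {"ANAME": "Candice", "AFAMILY": "Camel", "WEIGHT": 1800},
])

same_relation = animals_a == animals_b  # order and duplicates make no difference
row_count = len(animals_b)              # the duplicate row is counted once
```

Passing a row such as `{"ANAME": "Sam", "FOOD": ["Mice", "People"]}` raises `ValueError`, because a list in a cell breaks the atomic-value rule.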

Example of a Table (Relation)

• Relations can be manipulated and changed using a data manipulation language (DML) that employs relational operators.
– These operators are based on the concepts of relational algebra.

• Information is represented as two-dimensional tables, as below.

ANIMAL TABLE

ANAME    AFAMILY   WEIGHT
Candice  Camel     1800
Zona     Zebra     900
Sam      Snake     5
Elmer    Elephant  5000
Leonard  Lion      1200

Tables and Keys – for relationships

• A primary key is a unique identifier for each row in a table. It can consist of one or more columns. Each table contains data about one entity.

• A foreign key is a column or columns in one table which reference(s) a primary key column or columns in another table.

• Values in a foreign key must match an existing value in the primary key or be NULL. This is known as the referential integrity rule.

• ANO in the ANIMAL-FOOD table is part of the primary key and also a foreign key.

ANIMAL

ANO  ANAME    AFAMILY   WEIGHT
CA1  Candice  Camel     1800
ZE4  Zona     Zebra     900
SN1  Sam      Snake     5
EL3  Elmer    Elephant  5000
LI2  Leonard  Lion      1200

ANIMAL-FOOD

ANO  FOOD
CA1  Hay
CA1  Buns
ZE4  Brush
SN1  Mice
SN1  People
EL3  Leaves
LI2  People
LI2  Meat
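The referential integrity rule can be seen in action with Python's built-in `sqlite3` module. This is a minimal sketch, not a prescribed implementation: the lower-case table and column names (`animal`, `animal_food`) are assumed stand-ins for ANIMAL and ANIMAL-FOOD, only some of the rows are loaded, and the `PRAGMA` line is needed because SQLite does not enforce foreign keys unless asked to.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves foreign-key checks off by default

conn.execute("""
    CREATE TABLE animal (
        ano     TEXT PRIMARY KEY,
        aname   TEXT,
        afamily TEXT,
        weight  INTEGER
    )""")
conn.execute("""
    CREATE TABLE animal_food (
        ano  TEXT REFERENCES animal(ano),  -- foreign key back to animal
        food TEXT,
        PRIMARY KEY (ano, food)            -- ano is also part of the primary key
    )""")

conn.executemany("INSERT INTO animal VALUES (?, ?, ?, ?)", [
    ("CA1", "Candice", "Camel", 1800),
    ("SN1", "Sam", "Snake", 5),
])
conn.executemany("INSERT INTO animal_food VALUES (?, ?)", [
    ("CA1", "Hay"), ("CA1", "Buns"), ("SN1", "Mice"),
])

# 'XX9' is a made-up ANO with no matching row in animal, so the insert is refused.
try:
    conn.execute("INSERT INTO animal_food VALUES ('XX9', 'Grass')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

food_rows = conn.execute("SELECT COUNT(*) FROM animal_food").fetchone()[0]
```

Here `orphan_rejected` ends up `True`, and only the three valid food rows are stored.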

Relational Database Terminology

• Relation: a table with rows and columns
• Tuple: a row of a relation
• Attribute: a named column of a relation
• Primary key: a unique identifier for each row in a relation
• Domain: the set of allowable values for a column
• Degree: the number of columns in a relation
• Cardinality: the number of rows in a relation

Flight Table Example

Flight:

Flight#  Origin  Destination  Arrival
BA143    NAP     ROM          10.15
BA142    ROM     NAP          10.22
KT222    LHR     JFK          10.34
KT401    JFK     DUL          10.45
KT402    DUL     JFK          10.54
KT111    CCG     MIA          11.06
KT112    MIA     CCG          11.11
DE477    ATH     CDG          11.34
DE478    CDG     ATH          11.56
BA101    EDI     LHR          12.04
BA102    LHR     EDI          12.33

Table type usually written as follows: Flight: (Flight#, Origin, Destination, Arrival)
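Degree and cardinality can be read straight off the Flight table. The sketch below holds the table as a plain Python list of tuples; the names `heading` and `flight` are illustrative, and `flight_no` is an assumed stand-in for Flight#.

```python
# The table type (heading) and the table occurrence (rows) of Flight.
heading = ("flight_no", "origin", "destination", "arrival")
flight = [
    ("BA143", "NAP", "ROM", "10.15"),
    ("BA142", "ROM", "NAP", "10.22"),
    ("KT222", "LHR", "JFK", "10.34"),
    ("KT401", "JFK", "DUL", "10.45"),
    ("KT402", "DUL", "JFK", "10.54"),
    ("KT111", "CCG", "MIA", "11.06"),
    ("KT112", "MIA", "CCG", "11.11"),
    ("DE477", "ATH", "CDG", "11.34"),
    ("DE478", "CDG", "ATH", "11.56"),
    ("BA101", "EDI", "LHR", "12.04"),
    ("BA102", "LHR", "EDI", "12.33"),
]

degree = len(heading)      # number of columns
cardinality = len(flight)  # number of rows in this occurrence

# Values of Origin actually present; the domain is the larger pool
# of allowable airport codes that these values are drawn from.
origins_in_use = {row[1] for row in flight}
```

For this occurrence the degree is 4 and the cardinality is 11. Adding or deleting a flight changes the cardinality but leaves the degree alone.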

Relational Database Terminology

[Diagram: the Flight table from the previous slide, annotated with the labels below.]

• Intension or table type: Flight: (Flight#, Origin, Destination, Arrival)
• Column, Attribute, Type or Field
• Tuple, Row or Record
• Attribute Value, Data or Value
• Key Field

• Domain = set of values drawn upon by a particular attribute
• Cardinality = No. of Rows in a relation
• Degree = No. of Columns in a relation

Q: Which is most likely to change?

Your turn – define the following terms!!

Order:

Client#  Part#  Qty-ordered  Date
C1       P1     25           22/08/99
C2       P1     12           24/08/99
C2       P2     8            24/08/99
C2       P3     4            24/08/99
C3       P2     3            25/08/99
C4       P1     20           25/08/99

Value            Row                   Column
Domain           Degree of Order       Intension
Attribute type   Cardinality of Order  Record
Attribute Value  Table Type            Field
Tuple            Table Occurrence      Attribute
Extension        Relation

Why Tables ?

Shop:

Shop ID  Area-code  Location
1        2          Edinburgh
2        1          London
3        1          London
4        2          Edinburgh
5        3          Birmingham
6        4          Ipswich
7        1          London

Primary Key Data Duplication

Shop:

Shop ID  Area-code
1        2
2        1
3        1
4        2
5        3
6        4
7        1

Area:

Area-code  Location
2          Edinburgh
1          London
3          Birmingham
4          Ipswich

• Normalisation is the process used to make sure tables are non-redundant.

• Duplication of primary key data maintains the link (relationships) between tables (data).
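To see that nothing is lost by splitting Location out into Area, the two tables can be joined back together on the shared Area-code. A minimal sketch using Python's `sqlite3`; the snake_case table and column names are assumptions standing in for the slide headings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE area (area_code INTEGER PRIMARY KEY, location TEXT)")
conn.execute("""
    CREATE TABLE shop (
        shop_id   INTEGER PRIMARY KEY,
        area_code INTEGER REFERENCES area(area_code)
    )""")

conn.executemany("INSERT INTO area VALUES (?, ?)", [
    (2, "Edinburgh"), (1, "London"), (3, "Birmingham"), (4, "Ipswich"),
])
conn.executemany("INSERT INTO shop VALUES (?, ?)", [
    (1, 2), (2, 1), (3, 1), (4, 2), (5, 3), (6, 4), (7, 1),
])

# The duplicated Area-code values are the link that lets us rebuild
# the original three-column Shop table.
rebuilt = conn.execute("""
    SELECT shop.shop_id, shop.area_code, area.location
    FROM shop
    JOIN area ON area.area_code = shop.area_code
    ORDER BY shop.shop_id
""").fetchall()
```

`rebuilt` holds the same seven rows as the single-table Shop, yet each location name is now stored only once, in Area.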

Keys within the Relational Data Model

• The Primary Key is the field which uniquely identifies a record.

• The Primary Key (Unique Identifier) concept:

Example:

student (student#, name, …) (the # character means 'number')

• If student# is the primary key then a particular student#, e.g. 'S4', can only occur once in that column of the table.

student#  name
S4        Ramesh
S2        Peter
S9        Anthony
S11       Priti

• The Primary Key uniquely identifies a student and thus that student can only have one row in the student table.

• Breaking that rule ….

student#  name
S4        Ramesh
S2        Peter
S9        Anthony
S11       Priti
S4        Fred

• The new row in the table does not make much sense.
– The first row for 'S4' is sufficient to hold the name and we cannot allow a second row with 'S4' as the primary key value.

• Why?
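A relational DBMS enforces this rule for us once the primary key has been declared. The sketch below uses Python's `sqlite3` and tries to add the second 'S4' row; `student_no` is an assumed column name standing in for student#.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_no TEXT PRIMARY KEY, name TEXT)")
conn.executemany("INSERT INTO student VALUES (?, ?)", [
    ("S4", "Ramesh"), ("S2", "Peter"), ("S9", "Anthony"), ("S11", "Priti"),
])

# Attempt to break the rule with a second row for S4.
try:
    conn.execute("INSERT INTO student VALUES ('S4', 'Fred')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

names_for_s4 = [row[0] for row in conn.execute(
    "SELECT name FROM student WHERE student_no = 'S4'")]
```

The insert fails with an integrity error, so a lookup on 'S4' still returns exactly one name.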

• Example

TASK (employee#, project#, role, supervisor, hours-allocated, hours-so-far, hours-required, …)

employee#  project#  role     supervisor  hours-allocated  hours-so-far  hours-required
E2         P9        program  E123        120              85            100
E2         P4        design   E101        300              250           200
E101       P9        design   E101        60               128           56
E22        P11       test     E345        40               0             40

• This table has a composite primary key.
– The primary key is composed of the three attributes [employee#, project#, role].

• In each row, the composite of the values for these attributes must be unique.
– For example, the first row has the values ['E2', 'P9', 'program'] for these attributes.
– No other row is allowed to have the same combination.

• Breaking that rule …

employee#  project#  role     supervisor  hours-allocated  hours-so-far  hours-required
E2         P9        program  E123        120              85            100
E2         P4        design   E101        300              250           200
E101       P9        design   E101        60               128           56
E22        P11       test     E345        40               0             40
E2         P9        program  E99         12               0             2000

• Again, either this makes no sense or the basic design is wrong.
– Perhaps an employee can be re-allocated to a project, with a different supervisor, to do more programming.
– In this case the chosen primary key is wrong and needs the addition of an extra attribute such as date.

• For example, [employee#, project#, role, start_date] might be an appropriate identifier.
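The effect of adding start_date to the composite key can be sketched with Python's `sqlite3`. The snake_case column names are assumed stand-ins for the slide's attributes, only some columns are kept, and the start dates are invented purely for illustration (the slides give none).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE task (
        employee_no     TEXT,
        project_no      TEXT,
        role            TEXT,
        start_date      TEXT,      -- hypothetical extra attribute
        supervisor      TEXT,
        hours_allocated INTEGER,
        PRIMARY KEY (employee_no, project_no, role, start_date)
    )""")

insert = "INSERT INTO task VALUES (?, ?, ?, ?, ?, ?)"

# First allocation of E2 to P9 as a programmer.
conn.execute(insert, ("E2", "P9", "program", "2000-01-10", "E123", 120))

# Re-allocation with a later start date and another supervisor: now allowed,
# because the four-part key differs.
conn.execute(insert, ("E2", "P9", "program", "2000-06-01", "E99", 12))

# Repeating all four key values is still refused.
try:
    conn.execute(insert, ("E2", "P9", "program", "2000-06-01", "E123", 50))
    repeat_rejected = False
except sqlite3.IntegrityError:
    repeat_rejected = True

task_count = conn.execute("SELECT COUNT(*) FROM task").fetchone()[0]
```

With the three-attribute key the second insert would have failed; with the four-attribute key it succeeds and only the exact repeat is rejected.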

Foreign Keys (Posted Identifier) concept

• Example:

student (student#, course#, student_name, …)

course (course#, course_name, …)

• course# is the identifier (primary key) of the course table.

• The course# is posted into the student table and is thus called a FOREIGN KEY (or posted identifier).
– Now for any student we can easily find the appropriate course# and look up further details of that course in the course table if needed.
– This is easy to do in any relational database.
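The look-up described above might be sketched as follows with Python's `sqlite3`. The course numbers and course names are invented for illustration, `course_for` is a hypothetical helper, and the snake_case names stand in for student#, course# and so on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (course_no TEXT PRIMARY KEY, course_name TEXT)")
conn.execute("""
    CREATE TABLE student (
        student_no   TEXT PRIMARY KEY,
        course_no    TEXT REFERENCES course(course_no),  -- posted identifier
        student_name TEXT
    )""")

# Illustrative data only.
conn.executemany("INSERT INTO course VALUES (?, ?)", [
    ("C1", "Databases"), ("C2", "Networks"),
])
conn.executemany("INSERT INTO student VALUES (?, ?, ?)", [
    ("S4", "C1", "Ramesh"), ("S2", "C2", "Peter"), ("S9", "C1", "Anthony"),
])


def course_for(student_no):
    """Follow the foreign key from a student to the course details."""
    row = conn.execute("""
        SELECT course.course_name
        FROM student
        JOIN course ON course.course_no = student.course_no
        WHERE student.student_no = ?
    """, (student_no,)).fetchone()
    return row[0] if row else None
```

`course_for("S4")` follows the posted course# from the student row to the course row; an unknown student number simply returns `None`.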

Candidate Keys (Candidate Identifier)

• A favourite example:
– Employee(emp#, name, address, National_Insurance#, …)
or
– Employee(National_Insurance#, name, address, emp#, …)

• Both National_Insurance# and emp# can be primary keys (unique identifiers) of employee.

• You choose one as the most appropriate from the two candidate (possible) keys.

• You could argue that the composite [name, address] is another candidate primary key.

Examples - Primary and Foreign keys

Shop:

Shop ID  Area-code
1        2
2        1
3        1
4        2
5        3
6        4
7        1

Area:

Area-code  Location
2          Edinburgh
1          London
3          Birmingham
4          Ipswich

Order:

Client#  Part#  Qty-ordered  Date
C1       P1     25           22/08/99
C2       P1     12           24/08/99
C2       P2     8            24/08/99
C2       P3     4            24/08/99
C3       P2     3            25/08/99
C4       P1     20           25/08/99

Visit:

Visit#  Patient#  Doctor#  Date
V1      P1        D1       22/08/99
V2      P1        D1       24/08/99
V3      P2        D1       24/08/99
V4      P3        D1       24/08/99
V5      P1        D2       25/08/99

Examples - Primary and Foreign keys

Employee:

Emp#  Emp_name        Dept#       Status
E1    Fred Brown      Mtg         Manager
E2    Eve Munsen      R&D         Manager
E3    Joyce Goldberg  Admin       G1
E4    Paul Samuels    Mtg         G4
E5    Paul Josephs    R&D         G3
E6    Terry Wain      Production  Manager

Marriage:

Man#  Women#  Date
P1    P6      22/04/94
P2    P7      23/08/95
P2    P8      24/04/97
P3    P9      2/01/99
P4    P10     5/07/99
P5    P8      5/08/99

Anatomy of a table - a reminder

A Table Occurrence: using a variation of the Library Copies table

access_no  isbnx          price   now_price  condition  times-loaned
4,887,642  0-7131-3688-X  £12.95  02.06.92   A2         4
4,887,657  0-7131-3688-X  £12.95  17.09.91   B1         47
6,055,432  0-7248-1045-5  £37.65  12.04.92   A2         17
9,387,263  0-6542-1212-B  £15.99  14.02.91   B2         37
7,365,241  0-2435-3468-V  £27.40  19.11.91   A3         7
3,874,652  0-2435-3468-V  £27.40  19.11.91   A1         11

• Attribute: Example: date-purchased
• Value: Example: 02.06.92
• Table: Example: COPIES(access_no, isbnx, price, now_price, condition, times-loaned)
– What is special about isbnx in this table?


The 4 Rules for Normalised Tables [Rolland p72-]

• No row order significance.

• No column order significance.

• No multiple values at row/column intersections.

• No duplicate rows.

• Snapshots of table occurrences.

– When we look at a paper copy of a table, remember that the data in a real database table can be expected to change all the time.

– The COPIES table could have 5 rows the first time we look and on another day there could be hundreds or thousands.

– Always assume a database table really has thousands of rows.

The 4 rules for Normalised Tables broken

• No row order significance (Rule broken).

access_no  isbnx          price   now_price  condition  times-loaned
4,887,642  0-7131-3688-X  £12.95  02.06.92   A2         4
4,887,657                         17.09.91   B1         47
6,055,432  0-7248-1045-5  £37.65  12.04.92   A2         17
9,387,263  0-6542-1212-B  £15.99  14.02.91   B2         37
7,365,241  0-2435-3468-V  £27.40  19.11.91   A3         7
3,874,652                         19.11.91   A1         11

• If you swap the rows you lose information, as copies are sometimes dependent on the row above for their ISBNX number.

No Column Order Significance (Rule broken).

access_no  isbnx          price                       condition  times-loaned
4,887,642  0-7131-3688-X  £12.95  02.06.92            A2         4
4,887,657  0-7131-3688-X  £12.95  17.09.91  28.07.93  B1         47
6,055,432  0-7248-1045-5  £37.65  12.04.92            A2         17
9,387,263  0-6542-1212-B  £15.99  14.02.91  31.08.93  B2         37
7,365,241  0-2435-3468-V  £27.40  19.11.91            A3         7
3,874,652  0-2435-3468-V  £27.40  19.11.91            A1         11

• The two columns with no attribute type shown are intended to indicate the date-purchased followed by the date-removed.
– The date-removed column would contain a significant number of NULLS (explained later).

• The two columns are now dependent on the column order for their meaning.
– If you move the first date column to the end of the table then the meaning is lost.

• Clearly, giving each column its own attribute type is simpler and makes the columns independent of their order.

No Multiple Values at Row/column Intersections (Rule Broken).

• Or No Repeating Groups

• This is just too complicated – it makes searching and sorting difficult.

• Can you sort this into date order??

• Can it be easily searched on access_no ??

access_no        isbnx          price   date               condition  times-loaned
4887642,4887657  0-7131-3688-X  £12.95  02.06.92,17.09.91  A2, B1     4, 47
6,055,432        0-7248-1045-5  £37.65  12.04.92           A2         17
9,387,263        0-6542-1212-B  £15.99  14.02.91           B2         37
7365241,3874652  0-2435-3468-V  £27.40  19.11.91,19.11.91  A3, A1     7, 11
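One way to see why packed cells are awkward is to look at the work needed to undo them. The Python sketch below unpacks two of the rows above into one row per copy. The helper `unpack` and the dictionary keys are hypothetical, the access numbers are written without thousands separators, and the code assumes every packed cell in a row holds the same number of values in the same order.

```python
# Two rows from the table above, held as strings exactly as they appear in the cells.
packed = [
    {"access_no": "4887642,4887657", "isbnx": "0-7131-3688-X", "price": "£12.95",
     "date": "02.06.92,17.09.91", "condition": "A2, B1", "times_loaned": "4, 47"},
    {"access_no": "6055432", "isbnx": "0-7248-1045-5", "price": "£37.65",
     "date": "12.04.92", "condition": "A2", "times_loaned": "17"},
]

# Columns whose cells may hold several comma-separated values.
MULTI = ("access_no", "date", "condition", "times_loaned")


def unpack(row):
    """Yield one atomic row per copy packed into a multi-valued row."""
    parts = {col: [v.strip() for v in row[col].split(",")] for col in MULTI}
    for values in zip(*(parts[col] for col in MULTI)):
        atomic = dict(row)                 # copy the single-valued columns
        atomic.update(zip(MULTI, values))  # overwrite the packed ones
        yield atomic


atomic_rows = [atomic for row in packed for atomic in unpack(row)]

# With atomic values, a search on access_no is a plain equality test.
found = [r for r in atomic_rows if r["access_no"] == "4887657"]
```

After unpacking there are three rows, and copy 4887657 can be found directly along with its own date, condition and loan count.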

No Duplicate Rows (Rule broken).

access_no  isbnx          price   date      condition  times-loaned
4,887,642  0-7131-3688-X  £12.95  02.06.92  A2         4
4,887,657  0-7131-3688-X  £12.95  17.09.91  B1         47
6,055,432  0-7248-1045-5  £37.65  12.04.92  A2         17
9,387,263  0-6542-1212-B  £15.99  14.02.91  B2         37
7,365,241  0-2435-3468-V  £27.40  19.11.91  A3         7
3,874,652  0-2435-3468-V  £27.40  19.11.91  A1         11
4,887,642  0-7131-3688-X  £12.95  02.06.92  A2         4

• Why is this such a BAD idea?

• If you allow redundantly duplicated data in a real system, what will be the end result?

Domains

• An attribute cannot contain just any data.
– For example, we could have the attribute student_date_of_birth.
– Whilst '11-January-1980' might be a suitable value, 'FRED BLOGGS' clearly is not: it is the wrong data type.

• So any value of student_date_of_birth should at least be a valid date.
– Other rules might apply to the attribute: dates before the year 1900 seem unlikely to be useful, etc.

• The Domain concept carries this a bit further.
– A Domain is the pool of values from which an attribute draws its actual values.

• We may say that the attribute student_name is of data type string.
– So 'JOE BLOGGS' is a valid value.

• However, 'X23Y B&&9' is also a string but is not a student name (unless that student has particularly annoying parents).

• We could argue that the finite list of possible student names is a subset of all possible random strings.
– We can't predict what a student will be called.
– Thus we have to implement the domain of student_name by using the data type string.

• In other cases, for example the attribute student_eye_colour, we could easily define a list of values that defines the full range of possibilities.
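The contrast between the two attributes can be sketched with Python's `sqlite3`. The list of eye colours below is an assumption made up for the example; a real system would agree its own list. A `CHECK` constraint is used here as one possible way to approximate a domain.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE student (
        -- The domain of names cannot be listed, so any string is accepted.
        student_name       TEXT NOT NULL,
        -- The domain of eye colours can be listed in full (assumed list).
        student_eye_colour TEXT CHECK (
            student_eye_colour IN ('blue', 'brown', 'green', 'grey', 'hazel')
        )
    )""")

conn.execute("INSERT INTO student VALUES ('JOE BLOGGS', 'brown')")

# Accepted: the database cannot tell that this is not a real name.
conn.execute("INSERT INTO student VALUES ('X23Y B&&9', 'green')")

# Refused: 'purple' is outside the listed domain.
try:
    conn.execute("INSERT INTO student VALUES ('FRED BLOGGS', 'purple')")
    out_of_domain_rejected = False
except sqlite3.IntegrityError:
    out_of_domain_rejected = True

student_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```

The odd-looking name gets through because its domain is implemented only as "string", whereas the eye colour is held to its list.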

NULLS

• There are two basic types of NULL value: 'not applicable' and 'not known'.

emp#  emp-name  age  car-reg#
E4    D.Jones   34   F345DRT
E77   L.Smith   27
E9    J.Smith        G467BBT
E2    N.Patel   55   K976BJT

• Every employee would have an age but it might not be known in a particular case.

• However, not every employee need own a car (with a car-reg#).
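The employee table can be loaded into SQLite from Python to see how NULLs behave in queries. This is a sketch with assumed column names (`emp_no`, `car_reg`) standing in for emp# and car-reg#; Python's `None` is stored as SQL NULL. Note that SQL uses the same NULL marker for both 'not known' and 'not applicable', so the distinction lives in our understanding of the data, not in the table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE employee (
        emp_no   TEXT PRIMARY KEY,
        emp_name TEXT,
        age      INTEGER,
        car_reg  TEXT
    )""")
conn.executemany("INSERT INTO employee VALUES (?, ?, ?, ?)", [
    ("E4", "D.Jones", 34, "F345DRT"),
    ("E77", "L.Smith", 27, None),      # no car: 'not applicable'
    ("E9", "J.Smith", None, "G467BBT"),  # age 'not known'
    ("E2", "N.Patel", 55, "K976BJT"),
])

# COUNT(*) counts rows; COUNT(age) skips rows where age is NULL.
total, with_age = conn.execute(
    "SELECT COUNT(*), COUNT(age) FROM employee").fetchone()

# NULLs are found with IS NULL ...
no_car = [row[0] for row in conn.execute(
    "SELECT emp_no FROM employee WHERE car_reg IS NULL")]

# ... whereas "= NULL" is never true, so it matches no rows.
equals_null = conn.execute(
    "SELECT COUNT(*) FROM employee WHERE car_reg = NULL").fetchone()[0]
```

Four employees are stored but only three ages are counted, and E77 is the one employee found by the `IS NULL` test.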


Summary

• Introduction to Database Concepts

• Introduction to Data Modelling

• Introduction to Databases and Redundancy.

• Relational Data Model

• Terminology associated with Relational Data Model.

• Duplicated Data

• Primary and Foreign Keys

• Duplicated Data (Foreign key to Primary key references to link data in tables) but not redundant.

• Additional supplementary material on Normalisation available


END OF LECTURE