Fundamentals/ICY: Databases 2010/11 REVISION WEEK John Barnden Professor of Artificial Intelligence School of Computer Science University of Birmingham, UK

Post on 20-Jan-2016

Fundamentals/ICY: Databases 2010/11

REVISION WEEK

John Barnden
Professor of Artificial Intelligence

School of Computer Science
University of Birmingham, UK

EXAM (May/June; fine detail for Resit may differ)

See info re past exams on my module webpages.

Structure of Exam

One and a half hours.

Four questions.

DO THREE.

Material on mathematical relations and relational algebra is only used in one of the four Questions. (That Question may also involve other material.)

The Remaining Three Questions

They will range from precise technical things to more general considerations.

SQL query expressions required in several parts of the three questions. Amount to about 28% of the marks for those three questions.

Extra credit of 8% is available in one of the questions for providing SQL create expressions.

Extra credit of 8% is available in one of the questions for providing creative ERD notation suggestions.

Material Needed for Exam

Anything in the required textbook reading may be useful in the exam, except of course that a detailed memory of specific, data-full examples is not expected, and except for some SQL detail (see next slide).

You need to study all Additional Notes, except that:

• The exam will not rely on the treatment of functional dependencies and normalization there (in the 1st of the three parts in the Week 9 batch).

• The exam will not rely on material on physical design (in the 3rd of the three parts in the Week 9 batch).

You need to study all Exercise Answer Notes.

The content of Funmi’s lecture on experiences in industry will not be relied upon (but of course may help you in overall understanding of some issues).

Textbook Parts (R&C 2009)

See my module website (top page).

On SQL: the exam doesn’t rely on fine detail beyond what’s in the handouts (and occasional lectures).

MODULE REVIEW

What We Mainly Studied

The nature of relational databases, the central modern type of database. Entity types represented as tables, holding relations.

Some basic mathematical concepts underpinning relational databases, and useful also in many other branches of CS.

Key aspects of how to develop the conceptual/logical design of relational databases.

In particular, how to achieve certain types of good structuring, to help achieve certain types of correctness and efficiency.

How to create and query databases using a particular database language: SQL, in the version provided by PostgreSQL (SQL is very widely used in various forms).

Initial Considerations

What a database is, and how it relates to other types of data structure/repository in CS.

Data integrity, data redundancy, data anomalies.

Associative links between parts of a database, as opposed to pointing.

Ways data is stored/linked in physical human media such as diaries, address books and timetables.

Various complications in tables in human documents.

Restricted type of table used in relational DBs.

Entities and Relationships

Any relational DB as consisting of entity types and relationships between them: Entity Relationship Model (ERM) in general.

Specific ERMs for specific applications, and distinction from Entity Relationship Diagrams (ERDs).

Entity Types as represented by tables.

The question of what types of thing should correspond to “entity types,” and hence tables, depends on the application and your design judgment.

Attribute Determination and Keys

One or more attributes determining another attribute. Can also be described as that attribute being functionally dependent on the former attributes.

Various notions of key, especially superkeys, candidate keys and primary keys …

And foreign keys as the (main) implementation of relationships between entity types.
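The key notions above can be illustrated with a minimal Python sketch. The table contents and the helper names `is_superkey` and `candidate_keys` are hypothetical, invented for this illustration. Note the limitation: being a key is a property of the design (it must hold for all permitted data), so a check over one snapshot of rows can only show which attribute sets are consistent with being keys.

```python
from itertools import combinations

# Hypothetical snapshot of a table: each row maps attribute name to value.
rows = [
    {"emp_id": 1, "email": "ann@example.org", "dept": "Sales"},
    {"emp_id": 2, "email": "bob@example.org", "dept": "Sales"},
    {"emp_id": 3, "email": "cat@example.org", "dept": "IT"},
]

def is_superkey(rows, attrs):
    """True if no two rows agree on all of the attributes in attrs."""
    seen = set()
    for row in rows:
        key = tuple(row[a] for a in attrs)
        if key in seen:
            return False
        seen.add(key)
    return True

def candidate_keys(rows, all_attrs):
    """Minimal superkeys: superkeys containing no smaller superkey."""
    found = []
    # Work upwards by size, so any smaller key is found before its supersets.
    for size in range(1, len(all_attrs) + 1):
        for attrs in combinations(all_attrs, size):
            contains_smaller_key = any(set(k) <= set(attrs) for k in found)
            if not contains_smaller_key and is_superkey(rows, attrs):
                found.append(attrs)
    return found

keys = candidate_keys(rows, ["emp_id", "email", "dept"])
print(keys)  # [('emp_id',), ('email',)]
```

A primary key would then be one candidate key chosen by the designer; here either `emp_id` or `email` would be consistent with the snapshot.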

Strength and Weakness

Strong and weak relationships. (Also called identifying and non-identifying relationships respectively.)

Weak entity types, as defined according to the strength of their relationships to other entity types and existence-dependence with respect to those types.

Depiction of strength or weakness in different styles of ERD.

Connectivity, Cardinality & Participation

Connectivity: uniqueness or multiplicity of entities at either end of a relationship.

Cardinality: precise numerical info about how many entities allowed or required at either end of a relationship.

Participation: optionality or mandatoriness of a relationship, in either direction.

Overlap between these notions.

Notation in ERDs.

Table Representation of Relationships of Different Connectivities

Basic case is 1:M non-recursive. (Recursive is when two or more entity types in a relationship are the same.)

1:M recursive—can often be handled within a single table.

M:N, M:N:P, etc. standardly handled by breaking down into two, three, etc. 1:M relationships going to a new entity type: a “bridging” or “linking” type.

Symmetric relationships (e.g. marriage). Special problems of redundancy arise. Can be avoided by a special implementation involving two extra tables, together serving a bridging function.

Symmetric relationships are recursive and are either 1:1 or M:N.
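The M:N case can be sketched as follows. This is an illustration only: the `student`, `module` and `enrolment` tables are hypothetical, and Python's built-in `sqlite3` module is used as a stand-in so the example runs without a server (the module itself uses PostgreSQL, whose syntax may differ in detail).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces FKs when asked
conn.executescript("""
CREATE TABLE student (stu_id INTEGER PRIMARY KEY, name  TEXT NOT NULL);
CREATE TABLE module  (mod_id INTEGER PRIMARY KEY, title TEXT NOT NULL);

-- Bridging type: the M:N relationship becomes two 1:M relationships,
-- student-to-enrolment and module-to-enrolment.
CREATE TABLE enrolment (
    stu_id INTEGER NOT NULL REFERENCES student(stu_id),
    mod_id INTEGER NOT NULL REFERENCES module(mod_id),
    PRIMARY KEY (stu_id, mod_id)
);

INSERT INTO student   VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO module    VALUES (10, 'Databases'), (20, 'Logic');
INSERT INTO enrolment VALUES (1, 10), (1, 20), (2, 10);
""")

# Follow both 1:M links through the bridging table.
modules_for_ann = [row[0] for row in conn.execute("""
    SELECT m.title
    FROM student s
    JOIN enrolment e ON e.stu_id = s.stu_id
    JOIN module m    ON m.mod_id = e.mod_id
    WHERE s.name = 'Ann'
    ORDER BY m.title
""")]
print(modules_for_ann)  # ['Databases', 'Logic']
```

The composite primary key on `enrolment` stops the same pairing being recorded twice, and the two foreign keys are the implementation of the two 1:M relationships.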

1-1 Relationships

If you find yourself using a 1-1 relationship, especially when unchanging, mandatory in both directions, and non-recursive: ask yourself whether the two entity types could be combined into one without causing difficulty.

E.g., if there were an unchanging 1-1 relationship, mandatory both ways round, between employees and phone stations, you could probably combine into one entity type without difficulty, increasing efficiency of some operations.

But if the relationship changes frequently, easier to have 2 types/tables.

And if the relationship is optional in at least one direction, using one type leads to more wasted space.

1-1 Relationships, contd

Some good, standard uses of 1-1 relationships:

[As just mentioned] Cases where there is a significant amount of change or optionality.

Subtype/supertype relationships: naturally 1-1; useful to keep the types separate.

Symmetric 1-1 relationships such as marriage: only one entity type anyway.

Cases where one of the two types is hidden from some users by access controls.

Other Representation Issues

Multivalued attributes. OK in themselves in early stages of design, but should eventually be broken down into single-valued attributes in some way.

A main divergence in ways of doing this is based on whether the different values are for stably identifiable sub-attributes.

Generalization hierarchies. Exhaustiveness, disjointness.

Normalization

What normalization is and what role it plays in the database design process.

The normal forms 1NF, 2NF, 3NF, BCNF. 4NF (two versions) was left as optional material.

How tables can be transformed from lower normal forms to higher normal forms.

That normalization and ER modeling are used concurrently to produce a good database design, helping to eliminate data redundancies & anomalies.

That some situations benefit from non-normalization to gain efficiency for some operations.
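As a small illustration of moving a table to a higher normal form, the Python sketch below removes a transitive dependency by projection. The `staff` data is hypothetical, and the example assumes `emp_id` is the primary key with `emp_id` determining `dept` and `dept` determining `dept_phone`, so the original table is not in 3NF.

```python
# Hypothetical table: dept_phone depends on dept, not directly on the key,
# so the Sales phone number is stored redundantly.
staff = [
    {"emp_id": 1, "dept": "Sales", "dept_phone": "1234"},
    {"emp_id": 2, "dept": "Sales", "dept_phone": "1234"},
    {"emp_id": 3, "dept": "IT",    "dept_phone": "5678"},
]

def project(rows, attrs):
    """Keep only the named attributes and drop duplicate rows."""
    result = []
    for row in rows:
        projected = {a: row[a] for a in attrs}
        if projected not in result:
            result.append(projected)
    return result

# Decompose into two tables, each free of the transitive dependency.
employee = project(staff, ["emp_id", "dept"])
department = project(staff, ["dept", "dept_phone"])

# Joining on dept recovers the original rows, so no information was lost.
rejoined = [{**e, **d} for e in employee for d in department
            if e["dept"] == d["dept"]]

print(len(department))    # 2
print(rejoined == staff)  # True
```

Each department's phone number is now stored once, which removes the update anomaly. The price is a join whenever the combined view is needed, which is the kind of efficiency trade-off that can motivate leaving some tables less normalized.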

Creating ER Models/Diagrams

Designing an ER model for a database is an iterative process, because, e.g.:

• As you proceed, you think of new ways of conceiving what’s going on (much as in ordinary programming).

• Multivalued attributes need to be re-represented.

• M:N relationships can be included as such at an early stage, but generally need to be replaced by bridging entity types at some point.

• 1:1 relationships raise a red flag, though may be justified.

• Special supertype/subtype notation needs eventually to be converted into more standard diagram notation, to correspond to the actual tables used.

• Conversion to Normal Form (NB: different parts of the DB may have different needs).

SQL

Mainly, the module only covers basic SQL mechanisms for querying, creating and updating tables.

See manual and textbook for much more if you want!
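For orientation, here is a minimal sketch of the three basic mechanisms (creating, updating, querying). The `employee` table and its data are hypothetical, and Python's built-in `sqlite3` module stands in for PostgreSQL so the example is self-contained; fine syntactic detail may differ between the two systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Creating a table.
conn.execute("""
    CREATE TABLE employee (
        emp_id INTEGER PRIMARY KEY,
        name   TEXT NOT NULL,
        salary INTEGER            -- may be NULL when unknown
    )
""")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Ann", 30000), (2, "Bob", 25000), (3, "Cat", None)],
)

# Updating rows that satisfy a condition.
conn.execute("UPDATE employee SET salary = salary + 1000 WHERE name = 'Bob'")

# Querying. A comparison with NULL is never true, so Cat is not returned.
well_paid = conn.execute(
    "SELECT name FROM employee WHERE salary > 25500 ORDER BY name"
).fetchall()
print(well_paid)  # [('Ann',), ('Bob',)]
```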

MATHEMATICAL VIEW

Tuples, Relations and Tables

A relation on some sets A, B, C, … is simply a set of tuples all of the same length, where in each tuple the first element is from A, the second from B, etc.

A relation is therefore a subset of the Cartesian product of those sets.

A row is a tuple. Hence a table at any given moment induces a relation over the value domains of the table (augmented as appropriate with the value NULL).

The table consists of not just the induced relation but also the attributes themselves, their domains, specification of primary and foreign keys, etc.
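These definitions translate almost directly into Python sets of tuples. The sketch below uses made-up sets and a made-up two-column table, with `None` standing in for NULL; it is only meant to make the "subset of the Cartesian product" idea concrete.

```python
from itertools import product

# Two small sets and their Cartesian product.
A = {1, 2}
B = {"x", "y"}
cartesian = set(product(A, B))  # all four pairs (a, b)

# A relation on A and B is any set of such pairs.
R = {(1, "x"), (2, "x")}
print(R <= cartesian)  # True

# A hypothetical table with columns (name, salary), one tuple per row.
table_rows = [("Ann", 30000), ("Cat", None)]

# Value domains, with the salary domain augmented by None (for NULL).
names = {"Ann", "Bob", "Cat"}
salaries = {25000, 30000, None}

# The relation induced by the table at this moment.
induced = set(table_rows)
print(induced <= set(product(names, salaries)))  # True
```

The set `induced` captures only the rows. Attribute names, domains and key specifications are the extra information that makes a table more than its induced relation.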

Functionality of Relations

Functional relation from A, B, …, C to some sets:

for each choice of values from A, B, …, C, the relation contains at most one tuple starting with those values.

Also called a partial function.

Functional dependence relationship, i.e. determination relationship X, Y, …, Z → U within a table:

induces a functional relation from the value domains for X, Y, …, Z to the value domain for the determined attribute U.
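Both ideas can be checked mechanically on a finite relation. In this sketch the helper names `is_functional` and `determines` and the sample rows are hypothetical. As with keys, a check on one snapshot of a table can refute a claimed dependency but cannot prove that it holds for all permitted data.

```python
def is_functional(relation, n_inputs):
    """True if at most one tuple starts with any given choice of the
    first n_inputs values."""
    seen = {}
    for t in relation:
        inputs, rest = t[:n_inputs], t[n_inputs:]
        # setdefault records the first continuation seen for these inputs.
        if seen.setdefault(inputs, rest) != rest:
            return False
    return True

def determines(rows, lhs, rhs):
    """Test whether the attributes in lhs determine attribute rhs, by
    checking the relation induced from the lhs domains to the rhs domain."""
    induced = {tuple(row[a] for a in lhs) + (row[rhs],) for row in rows}
    return is_functional(induced, len(lhs))

print(is_functional({(1, "a"), (2, "a")}, 1))  # True
print(is_functional({(1, "a"), (1, "b")}, 1))  # False

rows = [
    {"module": "DB",    "year": 2010, "lecturer": "Barnden"},
    {"module": "DB",    "year": 2011, "lecturer": "Barnden"},
    {"module": "Logic", "year": 2010, "lecturer": "Smith"},
    {"module": "Logic", "year": 2011, "lecturer": "Jones"},
]
print(determines(rows, ["module", "year"], "lecturer"))  # True
print(determines(rows, ["module"], "lecturer"))          # False
```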

Some Operations on Sets in General

Union, intersection, difference, symmetric difference and Cartesian product of two sets X and Y (of any sort).

When X and Y are relations:

the set of all possible concatenations of a tuple within X and a tuple within Y.

I have called this the flattened Cartesian product of X and Y, notated as X Y as opposed to X Y.

Often called the relational product in the DB world.

(Shouldn’t be, but often is, called the Cartesian product.)
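The difference between the two products is easy to see in Python, where tuples can be nested or concatenated. The relations `X` and `Y` below are made up for illustration.

```python
from itertools import product

X = {(1, "a"), (2, "b")}  # a relation whose tuples have length 2
Y = {("p",), ("q",)}      # a relation whose tuples have length 1

# Cartesian product proper: each element is a PAIR of tuples.
cartesian = set(product(X, Y))

# Flattened (relational) product: each element is ONE concatenated tuple.
flattened = {x + y for x, y in product(X, Y)}

print(((1, "a"), ("p",)) in cartesian)  # True
print((1, "a", "p") in flattened)       # True
print((1, "a", "p") in cartesian)       # False
```

Both sets have four elements here, but only the flattened product is again a relation with the combined columns, which is what a DB product of two tables needs to be.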

Relational DB Operators & Relational Algebra

Defines a theoretical way of manipulating tables using “relational DB operators” that mainly manipulate the relations in the tables.

• SELECT

• PROJECT

• JOIN (various sorts)

• INTERSECT

• UNION

• DIFFERENCE

• PRODUCT

• (DIVIDE)

Use of relational DB operators on existing tables produces new tables. Strong connection to SQL commands/operators.

Relational algebra puts relational DB operators into a mathematical notation that is more convenient than, e.g., SQL operators.
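Three of these operators can be sketched in a few lines of Python, treating a table as a list of rows (dicts). The function names, the sample tables and the choice of the natural join as the JOIN variant are assumptions made for this illustration, not a definitive implementation.

```python
def select(rows, predicate):
    """SELECT: keep the rows satisfying a condition (cf. SQL WHERE)."""
    return [r for r in rows if predicate(r)]

def project(rows, attrs):
    """PROJECT: keep the named columns and drop duplicate rows
    (cf. SQL SELECT DISTINCT with a column list)."""
    result = []
    for r in rows:
        projected = {a: r[a] for a in attrs}
        if projected not in result:
            result.append(projected)
    return result

def natural_join(left, right):
    """Natural JOIN: combine rows that agree on all shared attributes."""
    result = []
    for l in left:
        for r in right:
            shared = set(l) & set(r)
            if all(l[a] == r[a] for a in shared):
                result.append({**l, **r})
    return result

# Hypothetical tables.
employees = [
    {"name": "Ann", "dept_id": 1},
    {"name": "Bob", "dept_id": 2},
    {"name": "Cat", "dept_id": 1},
]
depts = [
    {"dept_id": 1, "dept_name": "Sales"},
    {"dept_id": 2, "dept_name": "IT"},
]

# Operators compose: each one takes tables and produces a new table.
joined = natural_join(employees, depts)
sales = select(joined, lambda r: r["dept_name"] == "Sales")
sales_names = project(sales, ["name"])
print(sales_names)  # [{'name': 'Ann'}, {'name': 'Cat'}]
```

The final three lines correspond roughly to the SQL query `SELECT DISTINCT name FROM employees NATURAL JOIN depts WHERE dept_name = 'Sales'`.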

QUESTIONS?