The Data in a Relation

Objectives of the Lecture :

• To consider atomic data in relations;
• To consider data types in a relation;
• To consider missing data & NULLs in relations.

Upload: julian-preston

Post on 02-Jan-2016



The Information Principle

A relation contains only data values, i.e. facts. Each attribute-in-a-tuple contains just a single data value.

In particular there are no pointers or OIDs. Pointers point to variables (because they reference the storage location of the variable) and thus are different in nature to data values.

The reason for this principle is to keep relations simple.

This principle was enunciated by E. F. Codd (the founder of relational DBs) but is still relevant today.


Atomic Data

The data in one attribute of one tuple must be atomic.

Example : in this EMPLOYEE relation there is only one value in each attribute-in-a-tuple.

ENo   EName      M-S   Sal
E3    Smith      S     12,500
E5    Mitchell   M     21,000
E1    Robson     D     32,500
E6    Blake      M     54,000
E8    Jones      W     68,000

Definition : an atomic value is a single, indivisible value, not a composite value or a collection of values.

This requirement maintains the inherent simplicity of relations, and is of great practical benefit.


Non-Atomic Data

Example : a “relation” containing non-atomic data.

Part      Component                     Quantity
Bicycle   Frame, Wheel                  1, 2
Frame     A-frame, Handlebars, Saddle   1, 1, 1
Wheel     Rim, Axle, Spoke              1, 1, 40

The Component and Quantity attributes contain non-atomic data (also sometimes called repeating data); but the Part attribute does contain atomic data.
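To make the contrast concrete, here is a minimal Python sketch of how the repeating data in the bicycle example could be rewritten so that every attribute-in-a-tuple holds one atomic value. The nested dictionary, the function name to_atomic_tuples and the pairing of quantities with components are assumptions made for illustration, read as best as possible from the slide.

```python
# Hypothetical nested structure mirroring the non-atomic "relation":
# each part holds a collection of (component, quantity) pairs.
non_atomic = {
    "Bicycle": [("Frame", 1), ("Wheel", 2)],
    "Frame": [("A-frame", 1), ("Handlebars", 1), ("Saddle", 1)],
    "Wheel": [("Rim", 1), ("Axle", 1), ("Spoke", 40)],
}


def to_atomic_tuples(nested):
    """Return one (part, component, quantity) tuple per component."""
    return [
        (part, component, quantity)
        for part, components in nested.items()
        for component, quantity in components
    ]


atomic_rows = to_atomic_tuples(non_atomic)
for row in atomic_rows:
    print(row)
```

Each printed tuple now carries a single part, a single component and a single quantity, at the cost of repeating the part name on several tuples.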


Data Types

Definition : a Data Type has a set of permissible values.

Each attribute value must be drawn from the set of permissible values of the data type specified for that attribute.

More than one attribute in a relation may draw its values from the same data type.

[Figure : a relation with attributes A, B, C and D. Attribute A draws its values from Type A, attribute B from Type B, and attributes C and D both from Type C-D.]


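Using Data Types : an Example

Attribute   Data type
ENo         Text(2)
EName       any Text
M-S         { S, M, W, D }
Sal         { No > 999 AND No < 100,000 }

ENo   EName    M-S   Sal
E3    Smith    S     12,500
E51   X47/35   A     500

The first tuple satisfies every type. In the second tuple :
• ENo ‘E51’ : too long, 3 characters.
• EName ‘X47/35’ : OK ! Any text.
• M-S ‘A’ : wrong letter, not S, M, W or D.
• Sal 500 : too small, less than 999.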
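The idea that a data type is a set of permissible values can be sketched in Python as a membership test per attribute. This is only an illustration: the helper is_text, the EMPLOYEE_TYPES mapping and the violations function are invented names, not part of any DBMS.

```python
# Each type is modelled as a membership test over its permissible values.
# All names here are illustrative and not taken from any DBMS.
def is_text(value, max_len=None):
    """True if value is text and, when max_len is given, no longer than it."""
    return isinstance(value, str) and (max_len is None or len(value) <= max_len)


EMPLOYEE_TYPES = {
    "ENo": lambda v: is_text(v, max_len=2),       # Text(2)
    "EName": lambda v: is_text(v),                # any Text
    "M-S": lambda v: v in {"S", "M", "W", "D"},   # set enumeration
    "Sal": lambda v: isinstance(v, (int, float)) and 999 < v < 100_000,
}


def violations(row):
    """Return the names of attributes whose values fall outside their type."""
    return [name for name, permitted in EMPLOYEE_TYPES.items()
            if not permitted(row[name])]


good = {"ENo": "E3", "EName": "Smith", "M-S": "S", "Sal": 12_500}
bad = {"ENo": "E51", "EName": "X47/35", "M-S": "A", "Sal": 500}
print(violations(good))  # []
print(violations(bad))   # ['ENo', 'M-S', 'Sal']
```

The second tuple fails on exactly the three attributes flagged in the example, while its EName passes because any text is permitted.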


Data Types - continued

A data type was traditionally known as a domain. The term data type is preferred since it emphasises the inclusion of permissible operators as well as permissible values.

A data type needs logical representations for its values and operators. These are what the user uses. They become part of the logical model.

A data type needs physical representations for its values & executable code for each of its operators. These form the implementation of the logical model.

Examples :-
• Number : +, -, etc. apply to all possible numbers.
• Text : length( ), sub( ), concat( ) apply to all possible texts.

Definition : a Data Type has a set of permissible operators.


A Variety of Data Types

Commonly needed data types are Built-In, i.e. provided with the DBMS.

Examples : Number, Text, (& usually Date).

Depending on the DBMS, it is also usually possible to add new data types.

Examples : special graphics, pictures, audio, video. These are often called User-Defined.

Such types are often inherently large and complex, and their standards are not as well established as for traditional numeric and text data. Thus they usually require considerable work to create, and are usually bought from a specialist vendor.


Underlying & Specific Data Types

When specifying an attribute’s type, there are typically 2 stages :

1. Specify the general kind of data required, e.g. numbers, text, dates. This is the Underlying Type.
2. Specify that subset of it that the attribute values must be limited to, e.g. only the integer numbers 1 .. 1,000. This is the Specific Type. It uses the same operators as the underlying type.

[Figure labels : Underlying Type, Specific Type.]


SQL Data Types

SQL data types are often based on their physical storage.

Examples : in Oracle SQL :-

Char(x) = fixed-length text of exactly x bytes, i.e. x characters in a single-byte character set. (Bytes are a unit of physical storage.)

Varchar2(x) = variable length text of up to x characters/bytes maximum.

Integer = Integer number of standard precision.

Number = Floating point number. (Floating point is a physical storage format.)

SQL attributes/columns are assigned their types when the relation/table is created with the Create Table statement.


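SQL : Example

Create the EMPLOYEE relation :-

    Create Table EMPLOYEE (
        ENo    Char(2),
        EName  Varchar2(30),
        "M-S"  Char(1)
               Check( "M-S" in ('S', 'M', 'W', 'D') ),
        Sal    Number
               Check( Sal > 999 AND Sal < 100000 )
    ) ;

Notes :
• Char, Varchar2 and Number are the underlying domains.
• The parameters (2), (30) and (1) are used to constrain the underlying type to the desired specific type.
• The check on M-S is a set enumeration applied via CHECK.
• The check on Sal is a set comprehension applied via CHECK.
• The name M-S contains a hyphen, so it has to be written as a quoted identifier, "M-S".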
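The effect of the two CHECK constraints can be tried out with Python's built-in sqlite3 module. This is a sketch under stated assumptions, not Oracle behaviour: SQLite accepts the Oracle type names but does not enforce the Char(2) or Varchar2(30) lengths, and the column M-S is renamed M_S here so that it needs no quoting. The helper try_insert is an invented name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumption: SQLite stands in for Oracle. It enforces the CHECK constraints
# but ignores the declared lengths, and M-S is renamed M_S to avoid quoting.
conn.execute("""
    CREATE TABLE EMPLOYEE (
        ENo   Char(2),
        EName Varchar2(30),
        M_S   Char(1) CHECK (M_S IN ('S', 'M', 'W', 'D')),
        Sal   Number  CHECK (Sal > 999 AND Sal < 100000)
    )
""")


def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


ok_row = try_insert(("E3", "Smith", "S", 12500))
bad_status = try_insert(("E5", "Mitchell", "A", 21000))  # 'A' not in the set
bad_salary = try_insert(("E1", "Robson", "D", 500))      # 500 is not > 999
print(ok_row, bad_status, bad_salary)
```

Only the first row is stored; the other two are rejected by the set enumeration and the set comprehension respectively.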


Missing Data

In the real world, data is not always available to put in the DB.

Reasons include the value being :
• unknown
• not available
• not applicable
• not yet known
• undefined
• to be announced
• does not exist
• not supplied
• etc


Solutions to Missing Data

Use a special value to represent missing data.
• Example : ‘N/A’, ‘T.B.A.’ It gives the reason why the data is missing.
• The special value must have the same type as the data that is missing, so it can be stored with the data that is known.
• This requires no special facility from the DBMS, so it can be used with any DBMS.
• However it can be somewhat limiting : e.g. prevents ‘N/A’ being used in a numeric type.

Use NULL to represent missing data.
• NULL is the absence of a value : NULL ≠ 0 and NULL ≠ ‘ ’ (a space).
• NULL is not a member of any type, because there is no value.
• Requires special support from the DBMS. SQL DBMSs provide this support.
• But which reason does it represent ?


Comparisons where NULL arises

Let X be a numeric variable. Consider the comparison X = 3. Suppose X is NULL.

Logically an error should arise. In fact SQL treats the NULL as representing an existing but unknown value, and so gives the truth value maybe (called UNKNOWN in the SQL standard).

Rationale : We don’t know if X = 3 because X is NULL (= not available); even when the NULL in the DB is caused by the value not existing and it never being possible for it to exist !

Let X and Y be numeric variables. Consider the comparison X = Y

Suppose X and Y are both NULL. In SQL the result is maybe, not true.

Thus NULLs require 3 truth values - true, false and maybe.
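These two comparisons can be checked directly, here with Python's sqlite3 module as a stand-in SQL engine (an assumption; the lecture's examples use Oracle). SQLite reports the maybe truth value as NULL, which Python receives as None. The helper evaluate is an invented name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def evaluate(expression):
    """Evaluate a SQL expression and return its single result value."""
    return conn.execute(f"SELECT {expression}").fetchone()[0]


x_equals_3 = evaluate("NULL = 3")     # X = 3 with X being NULL
x_equals_y = evaluate("NULL = NULL")  # X = Y with both being NULL
x_is_null = evaluate("NULL IS NULL")  # the dedicated test for NULL
print(x_equals_3, x_equals_y, x_is_null)
```

Both comparisons come back as None, i.e. maybe, rather than true or false. Only the separate IS NULL test gives a definite answer, which SQLite returns as 1.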


2-Valued (Boolean) Logic

Normal Boolean logic only has the truth values true and false. The result (or value) of any comparison will be true or false. Thus the Boolean logic operators AND, OR and NOT can be used to combine comparisons, forming an expression.

Example :- ( X = 3 AND Y < X ) OR NOT ( Y > 0 )

The value of this expression will be true or false, and can be derived if the values of X and Y are known.

The rules which the logic operators apply to their operands to yield a Boolean result are represented by the truth tables :-

AND | T  F
----+------
T   | T  F
F   | F  F

OR  | T  F
----+------
T   | T  T
F   | T  F

NOT |
----+---
T   | F
F   | T


3-Valued Logic

We still want to use logical expressions. Therefore AND, OR and NOT must be extended to cope with maybe.

This gives the following truth tables :-

AND | T  M  F
----+---------
T   | T  M  F
M   | M  M  F
F   | F  F  F

OR  | T  M  F
----+---------
T   | T  T  T
M   | T  M  M
F   | T  M  F

NOT |
----+---
T   | F
M   | M
F   | T
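The three tables can be reproduced in a few lines of Python. This is a sketch, not how any DBMS implements them: it represents maybe as None and relies on the observation that, if the truth values are ordered false < maybe < true, AND takes the lower of its operands and OR the higher. The names RANK, VALUE, and3, or3 and not3 are invented for the example.

```python
# Truth values ordered false < maybe < true; None stands for maybe.
RANK = {False: 0, None: 1, True: 2}
VALUE = {0: False, 1: None, 2: True}


def and3(a, b):
    """3-valued AND: the lower-ranked of the two operands."""
    return VALUE[min(RANK[a], RANK[b])]


def or3(a, b):
    """3-valued OR: the higher-ranked of the two operands."""
    return VALUE[max(RANK[a], RANK[b])]


def not3(a):
    """3-valued NOT: swaps true and false, leaves maybe unchanged."""
    return VALUE[2 - RANK[a]]


# Print the AND table row by row, in the order T, M, F.
for a in (True, None, False):
    print([and3(a, b) for b in (True, None, False)])
```

Note that false AND maybe is false, and true OR maybe is true: a definite answer is still possible whenever the unknown operand cannot change the outcome.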


SQL : Not NULL Constraint

Sometimes we want to prevent a column from ever holding NULLs. To achieve this in SQL, add a “Not NULL” constraint to the definition of the column.

Example :-

    Create Table EMPLOYEE (
        ENo    Char(2),
        EName  Varchar2(30) NOT NULL,
        "M-S"  Char(1),
        Sal    Number
    ) ;

A name must always be provided.
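A short sketch of the constraint in action, again using Python's sqlite3 module in place of Oracle (an assumption), with the column M-S renamed M_S to avoid quoting. Python's None is passed to the database as NULL.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE EMPLOYEE (
        ENo   Char(2),
        EName Varchar2(30) NOT NULL,
        M_S   Char(1),
        Sal   Number
    )
""")

# Missing marital status and salary are allowed: None is sent as NULL.
conn.execute("INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)",
             ("E3", "Smith", None, None))

# A missing name violates the NOT NULL constraint and is rejected.
try:
    conn.execute("INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)",
                 ("E5", None, "M", 21000))
    name_required = False
except sqlite3.IntegrityError:
    name_required = True

row_count = conn.execute("SELECT COUNT(*) FROM EMPLOYEE").fetchone()[0]
print(name_required, row_count)
```

The first row is stored even though two of its columns are NULL; the second is refused because EName is NULL, so the table ends up holding one row.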