The Data in a Relation
Objectives of the Lecture :
•To consider atomic data in relations;
•To consider data types in a relation;
•To consider missing data & NULLs in relations.
The Information Principle
A relation contains only data values, i.e. facts. Each attribute-in-a-tuple contains just a single data value.
In particular there are no pointers or OIDs. Pointers point to variables (because they reference the storage location of the variable) and thus are different in nature to data values.
The reason for this principle is to keep relations simple.
This principle was enunciated by E. F. Codd (the founder of relational DBs) but is still relevant today.
Atomic Data
The data in one attribute of one tuple must be atomic.
Example : in this EMPLOYEE relation there is only one value in each attribute-in-a-tuple.
ENo   EName      M-S   Sal
E3    Smith      S     12,500
E5    Mitchell   M     21,000
E1    Robson     D     32,500
E6    Blake      M     54,000
E8    Jones      W     68,000
Definition : an atomic value is a single, indivisible value, not a composite value or a collection of values.
This requirement maintains the inherent simplicity of relations, and is of great practical benefit.
Non-Atomic Data
Example : a “relation” containing non-atomic data.
Part      Component                     Quantity
Bicycle   Frame, Wheel                  1, 2
Frame     A-frame, Handlebars, Saddle   1, 1, 1
Wheel     Rim, Axle, Spoke              1, 1, 40
The Component and Quantity attributes contain non-atomic data (also sometimes called repeating data); but the Part attribute does contain atomic data.
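The repeating data can be made atomic by holding one component per tuple. The following is a minimal Python sketch of that idea; the pairing of components with quantities is an assumed reading of the slide, and the names non_atomic and flatten are hypothetical.

```python
# Non-atomic form: each part maps to a collection of (component, quantity) pairs.
# The pairings are an assumed reading of the slide, used for illustration only.
non_atomic = {
    "Bicycle": [("Frame", 1), ("Wheel", 2)],
    "Frame": [("A-frame", 1), ("Handlebars", 1), ("Saddle", 1)],
    "Wheel": [("Rim", 1), ("Axle", 1), ("Spoke", 40)],
}


def flatten(parts):
    """Return one (part, component, quantity) tuple per component."""
    return [
        (part, component, quantity)
        for part, components in parts.items()
        for component, quantity in components
    ]


atomic_tuples = flatten(non_atomic)
for row in atomic_tuples:
    print(row)
```

Every attribute-in-a-tuple of the result holds a single value, so the result satisfies the atomicity requirement.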
Data Types
Definition : a Data Type has a set of permissible values.
Each attribute value must be drawn from the set of permissible values of the data type specified for that attribute.
More than one attribute in a relation may draw its values from the same data type.
[Figure : a relation with attributes A, B, C and D. The values of A are drawn from Type A, the values of B from Type B, and the values of both C and D from the same type, Type C-D.]
Using Data Types : an Example
Attribute   Data type
ENo         Text(2)
EName       any Text
M-S         { S, M, W, D }
Sal         { No > 999 AND No < 100,000 }

ENo   EName    M-S   Sal
E3    Smith    S     12,500
E51   X47/35   A     500

E51 : Too long, 3 characters.
X47/35 : OK ! Any text.
A : Wrong letter, not S, M, W or D.
500 : Too small, less than 999.
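The checks in this example can be sketched in Python by treating each data type as a test of membership in its set of permissible values. This is only an illustration: the function names are hypothetical, and Text(2) is assumed to mean text of at most 2 characters.

```python
# Each attribute's data type is modelled as a predicate over candidate values.
# The rules come from the example; the function names are hypothetical.
def is_eno(value):
    return isinstance(value, str) and len(value) <= 2  # Text(2), assumed "at most 2"


def is_ename(value):
    return isinstance(value, str)  # any text


def is_marital_status(value):
    return value in {"S", "M", "W", "D"}  # set enumeration


def is_sal(value):
    return isinstance(value, (int, float)) and 999 < value < 100_000  # set comprehension


CHECKS = {"ENo": is_eno, "EName": is_ename, "M-S": is_marital_status, "Sal": is_sal}


def invalid_attributes(row):
    """Return the names of attributes whose value lies outside its data type."""
    return [name for name, check in CHECKS.items() if not check(row[name])]


good = {"ENo": "E3", "EName": "Smith", "M-S": "S", "Sal": 12500}
bad = {"ENo": "E51", "EName": "X47/35", "M-S": "A", "Sal": 500}

print(invalid_attributes(good))  # []
print(invalid_attributes(bad))   # ['ENo', 'M-S', 'Sal']
```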
Data Types - continued
A data type was traditionally known as a domain. The term data type is preferred since it emphasises the inclusion of permissible operators as well as permissible values.
A data type needs logical representations for its values and operators. These are what the user uses. They become part of the logical model.
A data type needs physical representations for its values & executable code for each of its operators. These form the implementation of the logical model.
Definition : a Data Type has a set of permissible operators.
Examples :-
• Number : +, -, etc. apply to all possible numbers.
• Text : length( ), sub( ), concat( ) apply to all possible texts.
A Variety of Data Types
Commonly needed data types are Built-In, i.e. provided with the DBMS.
Examples : Number, Text, (& usually Date).
Depending on the DBMS, it is also usually possible to add new data types.
Examples : special graphics, pictures, audio, video. These are often called User-Defined.
Such types are often inherently large and complex, and their standards are not as well established as for traditional numeric and text data. Thus they usually require considerable work to create, and are usually bought from a specialist vendor.
Underlying & Specific Data Types
When specifying an attribute’s type, there are typically 2 stages :
1. Specify the general kind of data required, e.g. numbers, text, dates. This is the Underlying Type.
2. Specify that subset of it that the attribute values must be limited to, e.g. only the integer numbers 1 .. 1,000. This is the Specific Type. It uses the same operators as the underlying type.
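The two stages can be illustrated with a small Python sketch, in which the built-in int plays the role of the underlying type and a subclass restricts it to 1 .. 1,000. The class name SmallInt is hypothetical, and Python is used here only as a stand-in for a DBMS type system.

```python
class SmallInt(int):
    """Specific type: the integers 1..1000, a subset of the underlying int type."""

    def __new__(cls, value):
        if not 1 <= value <= 1000:
            raise ValueError(f"{value} is outside 1..1000")
        return super().__new__(cls, value)


a = SmallInt(250)
b = SmallInt(750)

# '+' is inherited unchanged from the underlying type.
total = a + b
print(total)

# A value outside the permitted subset is rejected when it is created.
try:
    SmallInt(1001)
    rejected = False
except ValueError:
    rejected = True
print(rejected)
```

Note that in this sketch the result of a + b is a plain int, not a SmallInt: the operator belongs to the underlying type, so its result is not automatically checked against the subset.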
SQL Data Types
SQL data types are often based on their physical storage.
Examples : in Oracle SQL :-
Char(x) = text of exactly x bytes, i.e. x characters.
Varchar2(x) = variable length text of up to x characters/bytes maximum.
Integer = Integer number of standard precision.
Number = Floating point number.
Bytes are a unit of physical storage. Floating point is a physical storage format.
SQL attributes/columns are assigned their types when the relation/table is created with the Create Table statement.
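SQL : Example
Create the EMPLOYEE relation :-
Create Table EMPLOYEE (
    ENo   Char(2),
    EName Varchar2(30),
    M_S   Char(1)
          Check( M_S in ('S', 'M', 'W', 'D') ),
    Sal   Number
          Check( Sal > 999 AND Sal < 100000 ) ) ;
Notes on the statement :-
• The attribute M-S is named M_S here, because a hyphen is not allowed in an unquoted SQL column name.
• Char, Varchar2 and Number : the underlying domain.
• (2), (30) and (1) : parameters used to constrain the underlying type to the desired specific type.
• Check( M_S in ... ) : set enumeration applied via CHECK.
• Check( Sal > 999 ... ) : set comprehension applied via CHECK.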
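To see CHECK constraints reject values, the statement can be tried out with Python's built-in sqlite3 module. This is a sketch under stated assumptions, not Oracle: SQLite does not enforce Char(2) lengths, so a length check is added by hand; the column is called M_S because a hyphen is not valid in an unquoted column name; and the helper try_insert is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE EMPLOYEE (
        ENo   TEXT CHECK (length(ENo) <= 2),
        EName TEXT,
        M_S   TEXT CHECK (M_S IN ('S', 'M', 'W', 'D')),
        Sal   NUMERIC CHECK (Sal > 999 AND Sal < 100000)
    )
""")


def try_insert(row):
    """Insert a row; return True if accepted, False if a CHECK rejects it."""
    try:
        conn.execute("INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False


valid_ok = try_insert(("E3", "Smith", "S", 12500))       # all values permissible
status_ok = try_insert(("E5", "Mitchell", "A", 21000))   # 'A' is not S, M, W or D
salary_ok = try_insert(("E1", "Robson", "D", 500))       # 500 is not above 999

row_count = conn.execute("SELECT count(*) FROM EMPLOYEE").fetchone()[0]
print(valid_ok, status_ok, salary_ok, row_count)
```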
Missing Data
In the real world, data is not always available to put in the DB.
Reasons include the value being :
• unknown
• not available
• not applicable
• not yet known
• undefined
• to be announced
• does not exist
• not supplied
• etc.
Solutions to Missing Data
Use a special value to represent missing data.
Example : ‘N/A’, ‘T.B.A.’
It gives the reason why the data is missing.
The special value must have the same type as the data that is missing, so it can be stored with the data that is known.
This requires no special facility from the DBMS, so it can be used with any DBMS.
However it can be somewhat limiting : e.g. prevents ‘N/A’ being used in a numeric type.
Use NULL to represent missing data.
NULL is the absence of a value.
NULL ≠ 0 and NULL ≠ ‘ ’ (a space).
NULL is not a member of any type, because there is no value.
Requires special support from the DBMS. SQL DBMSs provide this support.
But which reason does it represent ?
Comparisons where NULL arises
Let X be a numeric variable. Consider the comparison X = 3
Suppose X is NULL.
Logically an error should arise. In fact SQL treats the NULL as representing an existing but unknown value, and so gives the truth value maybe.
Rationale : We don’t know if X = 3 because X is NULL (= not available); even when the NULL in the DB is caused by the value not existing and it never being possible for it to exist !
Let X and Y be numeric variables. Consider the comparison X = Y
Suppose X and Y are both NULL. In SQL the result is maybe, not true.
Thus NULLs require 3 truth values - true, false and maybe.
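These comparisons can be tried with Python's built-in sqlite3 module. This is a minimal sketch, assuming SQLite behaves like other SQL DBMSs on this point: it reports the maybe truth value as NULL, which Python shows as None. The table T is a hypothetical example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def value_of(expression):
    """Evaluate a SQL expression and return its single result."""
    return conn.execute(f"SELECT {expression}").fetchone()[0]


null_equals_3 = value_of("NULL = 3")        # maybe, reported as None
null_equals_null = value_of("NULL = NULL")  # also maybe, not true
null_equals_zero = value_of("NULL = 0")     # NULL is not the number 0
null_equals_space = value_of("NULL = ' '")  # NULL is not a space

# A WHERE clause keeps a row only when the condition is true, so maybe is filtered out.
conn.execute("CREATE TABLE T (X NUMERIC)")
conn.executemany("INSERT INTO T VALUES (?)", [(3,), (None,)])
rows_where_x_equals_x = conn.execute("SELECT count(*) FROM T WHERE X = X").fetchone()[0]

print(null_equals_3, null_equals_null, null_equals_zero, null_equals_space)
print(rows_where_x_equals_x)
```

Only the row holding 3 satisfies X = X; the row holding NULL yields maybe and is not returned.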
2-Valued (Boolean) Logic
Normal Boolean logic only has the truth values true and false.
The result (or value) of any comparison will be true or false.
Thus the Boolean logic operators AND, OR and NOT can be used to combine comparisons, forming an expression.
Example :- ( X = 3 AND Y < X ) OR NOT ( Y > 0 )
The value of this expression will be true or false, and can be derived if the values of X and Y are known.
The rules which the logic operators apply to their operands to yield a Boolean result are represented by the truth tables :-
AND | T  F
 T  | T  F
 F  | F  F

OR  | T  F
 T  | T  T
 F  | T  F

NOT |
 T  | F
 F  | T
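As a small illustration, the example expression can be evaluated in Python for a few known values of X and Y. The function name expression and the sample values are chosen here only for the sketch.

```python
def expression(x, y):
    """Evaluate ( X = 3 AND Y < X ) OR NOT ( Y > 0 ) with Boolean logic."""
    return (x == 3 and y < x) or not (y > 0)


print(expression(3, 1))   # True: X = 3 and Y < X
print(expression(4, 1))   # False: both operands of OR are false
print(expression(4, -2))  # True: NOT ( Y > 0 ) holds
```

Because X and Y are known in every call, the result is always true or false, as the truth tables require.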
3-Valued Logic
We still want to use logical expressions. Therefore AND, OR and NOT must be extended to cope with maybe.
This gives the following truth tables :-
AND | T  M  F
 T  | T  M  F
 M  | M  M  F
 F  | F  F  F

OR  | T  M  F
 T  | T  T  T
 M  | T  M  M
 F  | T  M  F

NOT |
 T  | F
 M  | M
 F  | T
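The 3-valued truth tables can be written out as a short Python sketch, in which None is assumed to stand for maybe. The function names and3, or3, not3, eq3, lt3, gt3 and expression3 are hypothetical.

```python
# None plays the role of "maybe"; True and False keep their usual meaning.
def not3(a):
    return None if a is None else not a


def and3(a, b):
    if a is False or b is False:
        return False  # one false operand decides the result
    if a is None or b is None:
        return None   # otherwise any maybe leaves the result undecided
    return True


def or3(a, b):
    if a is True or b is True:
        return True   # one true operand decides the result
    if a is None or b is None:
        return None
    return False


# Comparisons give maybe as soon as either operand is missing.
def eq3(a, b):
    return None if a is None or b is None else a == b


def lt3(a, b):
    return None if a is None or b is None else a < b


def gt3(a, b):
    return None if a is None or b is None else a > b


def expression3(x, y):
    """Evaluate ( X = 3 AND Y < X ) OR NOT ( Y > 0 ) when X or Y may be missing."""
    return or3(and3(eq3(x, 3), lt3(y, x)), not3(gt3(y, 0)))


print(expression3(None, -2))  # True: NOT ( Y > 0 ) is true whatever X is
print(expression3(None, 1))   # None: the result depends on the missing X
```

A maybe operand does not always make the whole expression maybe: when the other operand already decides the result, the answer is definite.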
SQL : Not NULL Constraint
Sometimes we want to prevent a column from ever holding NULLs. To achieve this in SQL, add a “Not NULL” constraint to the definition of the column.
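Example :-
Create Table EMPLOYEE (
    EmpNo Char(2),
    EName Varchar2(30) NOT NULL,
    M_S   Char(1),
    Sal   Number ) ;
A name must always be provided. (The attribute M-S is named M_S here, because a hyphen is not allowed in an unquoted SQL column name.)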
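The effect of the constraint can be observed with Python's built-in sqlite3 module. This is a sketch only: SQLite type names are used in place of the Oracle ones, and the column is called M_S because a hyphen is not valid in an unquoted column name.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE EMPLOYEE (
        EmpNo TEXT,
        EName TEXT NOT NULL,
        M_S   TEXT,
        Sal   NUMERIC
    )
""")

# NULLs are still allowed in the columns that have no NOT NULL constraint.
conn.execute("INSERT INTO EMPLOYEE VALUES ('E3', 'Smith', NULL, NULL)")

# A NULL name violates the constraint, so the DBMS refuses the row.
try:
    conn.execute("INSERT INTO EMPLOYEE VALUES ('E5', NULL, 'M', 21000)")
    null_name_accepted = True
except sqlite3.IntegrityError:
    null_name_accepted = False

stored_rows = conn.execute("SELECT count(*) FROM EMPLOYEE").fetchone()[0]
print(null_name_accepted, stored_rows)
```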