Create the Difference: Normalisation (special thanks to Janet Francis for this presentation)
TRANSCRIPT
CREATE THE DIFFERENCE
Normalisation
(special thanks to Janet Francis for this presentation)
Aim
• To demonstrate the meaning of normalisation
• To demonstrate how normalisation can be used to good effect
Normalisation
• A process which uses a set of rules for grouping data elements into logical entities (relations)
• If followed carefully, it will result in a robust database design
• Each stage in the process results in the production of a structure - a normal form.
• For most purposes, the first three stages (to 3rd normal form – 3NF) are sufficient
Un-normalised data
• A list of fields needed for the system
• Scenario
– All staff are released for two hours a week for staff development.
– Employees work at their own pace in a lab.
– A total of six attributes are recorded about each employee including their normal office location (building and room), the date they joined the course and how many hours it is planned for them to work on it.
Un-normalised data
Course ID
Course Name
Employee ID
Name
Building
Room ID
Date Joined Course
Allocated Hours
In this example, Course ID, Employee ID and Room ID are known to be unique.
Problems
• There is no record of the employee until they have joined a course.
• Lots of duplicate employee data is created once employees start to join courses.
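Both problems can be seen in a small sketch of the un-normalised list. The rows below are hypothetical sample data (course names, employee names and dates are invented for illustration); only the column names come from the slides.

```python
# Hypothetical sample rows; only the column names come from the slides.
# Columns: course_id, course_name, employee_id, name, building, room_id,
#          date_joined, allocated_hours
FLAT_ROWS = [
    (1, "Databases", 10, "A. Smith", "Octagon", "K342", "2024-01-08", 20),
    (2, "Networking", 10, "A. Smith", "Octagon", "K342", "2024-02-05", 15),
    (1, "Databases", 11, "B. Jones", "Beacon", "C312", "2024-01-15", 20),
]


def count_employee_copies(rows):
    """Count how many times each employee's details are stored."""
    counts = {}
    for row in rows:
        employee_details = row[2:6]  # employee_id, name, building, room_id
        counts[employee_details] = counts.get(employee_details, 0) + 1
    return counts


copies = count_employee_copies(FLAT_ROWS)
# Employees whose details are stored more than once (duplicate data).
duplicated = {details[0]: n for details, n in copies.items() if n > 1}
# The only employees the list knows about are those already on a course.
known_employees = {row[2] for row in FLAT_ROWS}

print(duplicated)
print(known_employees)
```

In this sketch an employee who has not yet joined a course has no row at all, and an employee on two courses has their office details stored twice.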
First Normal Form (1NF)
• An entity is in 1NF if it has an identifying key and there are no repeating attributes or repeating groups of attributes
• To get to 1NF we must remove all repeating groups
Remember what we started with
• Course details and Employee details are repeating groups
Course ID
Course Name
Employee ID
Name
Building
Room ID
Date Joined Course
Allocated Hours
We need to:
• Take one of the unique identifiers, e.g. “Course ID” (we could have used Room ID or Employee ID)
• For each of the other attributes, check whether it has a one-to-one relationship with “Course ID”
• If so, keep it; if not, move it into a new entity.
• For the new entity, a unique identifier is required and is formed as a composite by using “Course ID” combined with another unique identifier. “Employee ID” is chosen for this example though it could have been Room ID.
Our Example
COURSE         EMP_ON_COURSE
Course ID      Course ID*, Employee ID
Course Name    Name
               Building
               Room ID
               Date Joined Course
               Allocated Hours
NB: Course ID is part of the composite Primary Key of the new entity “EMP_ON_COURSE”. It is also the Foreign Key providing a relationship with COURSE.
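The split into COURSE and EMP_ON_COURSE can be sketched in Python as follows. This is only an illustration under stated assumptions: the sample rows and the function name `split_to_1nf` are hypothetical, and dictionaries keyed by the primary key stand in for the two entities.

```python
# Hypothetical sample rows; only the column names come from the slides.
# Columns: course_id, course_name, employee_id, name, building, room_id,
#          date_joined, allocated_hours
FLAT_ROWS = [
    (1, "Databases", 10, "A. Smith", "Octagon", "K342", "2024-01-08", 20),
    (2, "Networking", 10, "A. Smith", "Octagon", "K342", "2024-02-05", 15),
    (1, "Databases", 11, "B. Jones", "Beacon", "C312", "2024-01-15", 20),
]


def split_to_1nf(rows):
    """Split the flat list into COURSE and EMP_ON_COURSE."""
    course = {}         # Course ID -> Course Name
    emp_on_course = {}  # (Course ID, Employee ID) -> remaining attributes
    for (course_id, course_name, employee_id, name, building, room_id,
         date_joined, hours) in rows:
        # Course Name is one-to-one with Course ID, so it stays in COURSE.
        course[course_id] = course_name
        # Everything else moves to the new entity with a composite key.
        emp_on_course[(course_id, employee_id)] = (
            name, building, room_id, date_joined, hours)
    return course, emp_on_course


course, emp_on_course = split_to_1nf(FLAT_ROWS)
print(course)
print(sorted(emp_on_course))
```

Each course name is now held once, while the employee details are still repeated for every course an employee joins.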
Problems
• There is still a problem with employee details
• There is less duplicate data: course details are now entered only once per course.
Second Normal Form (2NF)
• An entity is in 2NF if it is in 1NF and has no attributes which require only part of the key to identify them uniquely
• To get to 2NF we remove part key dependencies
• All data items must be dependent on the whole of the composite primary key
Not all groups are in 2NF
• COURSE is already in 2NF
• EMP_ON_COURSE is not, because some attributes depend on only part of the key:
Attribute             Depends On
Name                  Employee ID
Building              Employee ID
Room ID               Employee ID
Date Joined Course    Employee ID + Course ID
Allocated Hours       Employee ID + Course ID
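A "depends on" claim like those in the table can be tested against sample data. The sketch below is a minimal check, not a proof: the rows and the helper name `depends_on` are hypothetical, and sample data can only show that a dependency fails, never that it always holds.

```python
# Hypothetical EMP_ON_COURSE rows; only the column names come from the slides.
ROWS = [
    {"course_id": 1, "employee_id": 10, "name": "A. Smith",
     "building": "Octagon", "room_id": "K342",
     "date_joined": "2024-01-08", "hours": 20},
    {"course_id": 2, "employee_id": 10, "name": "A. Smith",
     "building": "Octagon", "room_id": "K342",
     "date_joined": "2024-02-05", "hours": 15},
    {"course_id": 1, "employee_id": 11, "name": "B. Jones",
     "building": "Beacon", "room_id": "C312",
     "date_joined": "2024-01-15", "hours": 20},
]


def depends_on(rows, determinant, attribute):
    """Return True if each determinant value maps to one attribute value."""
    seen = {}
    for row in rows:
        key = tuple(row[column] for column in determinant)
        # setdefault stores the first value seen for this key.
        if seen.setdefault(key, row[attribute]) != row[attribute]:
            return False
    return True


# Name needs only part of the key, so it breaks 2NF in EMP_ON_COURSE.
name_on_part_key = depends_on(ROWS, ["employee_id"], "name")
# Date Joined Course needs the whole composite key.
date_on_part_key = depends_on(ROWS, ["employee_id"], "date_joined")
date_on_whole_key = depends_on(ROWS, ["course_id", "employee_id"], "date_joined")

print(name_on_part_key, date_on_part_key, date_on_whole_key)
```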
So we..
• Take out details that are linked only to “Employee ID” into a separate entity.
• If in any doubt, ask a question such as ‘Are these fields affected when an employee joins a course?’
Attribute    Depends On
Name         Employee ID
Building     Employee ID
Room ID      Employee ID
...end up with three entities
COURSE         EMP_ON_COURSE               EMPLOYEE
Course ID      Course ID*, Employee ID*    Employee ID
Course Name    Date Joined Course          Name
               Allocated Hours             Building
                                           Room ID
The two parts of the composite Primary Key in EMP_ON_COURSE are Foreign Keys that link to the COURSE and EMPLOYEE tables.
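Removing the part-key dependencies can be sketched as a second split. As before, the sample data and the function name `split_to_2nf` are hypothetical, and dictionaries keyed by the primary key stand in for the entities.

```python
# Hypothetical 1NF EMP_ON_COURSE data, keyed by (Course ID, Employee ID).
# Values: name, building, room_id, date_joined, allocated_hours
EMP_ON_COURSE_1NF = {
    (1, 10): ("A. Smith", "Octagon", "K342", "2024-01-08", 20),
    (2, 10): ("A. Smith", "Octagon", "K342", "2024-02-05", 15),
    (1, 11): ("B. Jones", "Beacon", "C312", "2024-01-15", 20),
}


def split_to_2nf(emp_on_course):
    """Move attributes that depend only on Employee ID into EMPLOYEE."""
    employee = {}       # Employee ID -> (name, building, room_id)
    emp_on_course_2nf = {}  # (Course ID, Employee ID) -> (date, hours)
    for (course_id, employee_id), values in emp_on_course.items():
        name, building, room_id, date_joined, hours = values
        employee[employee_id] = (name, building, room_id)
        emp_on_course_2nf[(course_id, employee_id)] = (date_joined, hours)
    return employee, emp_on_course_2nf


employee, emp_on_course_2nf = split_to_2nf(EMP_ON_COURSE_1NF)
print(employee)
print(emp_on_course_2nf)
```

Employee 10 now appears once in EMPLOYEE, however many courses they join.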
Problems
• No problem with courses
• No problem with employees
• But
– Building and Room ID are related, in that a room is in a particular building. If one is updated, the other will be affected.
– If a building name changes, then with the current structure every stored employee record that mentions the building will have to be updated.
Third Normal Form (3NF)
• An entity is in 3NF if it is in 2NF and no non-key attribute depends on another non-key attribute.
• To get to 3NF we must remove attributes that depend on other non-key attributes i.e. resolve the Room and Building problem
We need to:
• Decide on the direction of the dependency between the attributes
• For example
– If, given a value for A, there is only one possible value for B, then
  • A determines B
  • B is dependent on A
– So for rooms at Staffordshire University, the room number is unique. We know, for example, that K342 is in the Octagon Building and C312 is in the Beacon Building.
– If you know the room, you can find out the building. The reverse is not true: if you know the building, you cannot determine the room.
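The direction of the dependency can be checked with a short sketch. K342 and C312 come from the slide; the third room, K100, is a hypothetical second room in the Octagon Building added so that the reverse direction can be seen to fail. The helper name `determines` is also an assumption.

```python
# (room, building) pairs. K342 and C312 are from the slide;
# K100 is a hypothetical extra room in the Octagon Building.
ROOMS = [("K342", "Octagon"), ("C312", "Beacon"), ("K100", "Octagon")]


def determines(pairs):
    """Return True if each left value maps to exactly one right value."""
    mapping = {}
    for left, right in pairs:
        # setdefault stores the first right value seen for this left value.
        if mapping.setdefault(left, right) != right:
            return False
    return True


room_determines_building = determines(ROOMS)
building_determines_room = determines(
    [(building, room) for room, building in ROOMS])

print(room_determines_building, building_determines_room)
```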
And the solution is..
• Leave Room ID in the original entity as a foreign key, but remove Building into a separate entity with Room ID as the Primary Key.
EMPLOYEE       ROOM
Employee ID    Room ID
Name           Building
Room ID*
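The benefit of the separate ROOM entity can be sketched with Python's built-in `sqlite3` module. The table and column names below are assumed SQL spellings of the entities on the slide, the employee data is hypothetical, and Room ID is stored as text because the example room numbers contain letters.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE room (
        room_id  TEXT PRIMARY KEY,
        building TEXT NOT NULL
    );
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        room_id     TEXT NOT NULL REFERENCES room(room_id)
    );
    INSERT INTO room VALUES ('K342', 'Octagon'), ('C312', 'Beacon');
    -- Hypothetical employees; two of them share room K342.
    INSERT INTO employee VALUES
        (10, 'A. Smith', 'K342'),
        (11, 'B. Jones', 'K342'),
        (12, 'C. Patel', 'C312');
""")

# Renaming a building now touches ROOM only, not every employee record.
cursor = conn.execute(
    "UPDATE room SET building = 'Octagon Building' WHERE building = 'Octagon'")
rows_changed = cursor.rowcount

# Each employee still sees the new name through the Room ID foreign key.
buildings = conn.execute("""
    SELECT e.employee_id, r.building
    FROM employee AS e
    JOIN room AS r ON r.room_id = e.room_id
    ORDER BY e.employee_id
""").fetchall()

print(rows_changed)
print(buildings)
```

One row is updated, yet both employees in K342 pick up the new building name.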
Entity Relationship Modelling
[Entity relationship diagram showing the entities Course, Emp_On_Course, Employee and Room]
This is not perfect. Why? There are at least 3 reasons.
To Normalise to 3NF
1. Remove all repeating data elements and ensure that everything is dependent on the Primary Key
2. Ensure data items are dependent on the whole of the composite primary key
3. Remove to new entities all fields dependent on non-key fields
This process is sometimes referred to as "The key, the whole key and nothing but the key!"
Ways to represent Normalisation
To make it easier to write down
– # represents a numeric field
– Primary keys are underlined
– Foreign keys* are in italics with an asterisk
The entities we created would be represented as:
EMPLOYEE (#Employee ID, Name, #Room ID*)
ROOM (#Room ID, Building)
COURSE (#Course ID, Course Name)
EMP_ON_COURSE (#Course ID*, #Employee ID*, Date Joined Course, Allocated Hours)
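The four entities could be declared in a relational database along the following lines. This is a sketch using Python's built-in `sqlite3` module, not a definitive schema: the SQL table and column names, the sample rows and the choice of column types are all assumptions (Room ID is text here because the example rooms, such as K342, contain letters).

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default
conn.executescript("""
    CREATE TABLE room (
        room_id  TEXT PRIMARY KEY,
        building TEXT NOT NULL
    );
    CREATE TABLE employee (
        employee_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL,
        room_id     TEXT NOT NULL REFERENCES room(room_id)
    );
    CREATE TABLE course (
        course_id   INTEGER PRIMARY KEY,
        course_name TEXT NOT NULL
    );
    CREATE TABLE emp_on_course (
        course_id       INTEGER NOT NULL REFERENCES course(course_id),
        employee_id     INTEGER NOT NULL REFERENCES employee(employee_id),
        date_joined     TEXT NOT NULL,
        allocated_hours INTEGER NOT NULL,
        PRIMARY KEY (course_id, employee_id)
    );
    -- Hypothetical sample data.
    INSERT INTO room VALUES ('K342', 'Octagon'), ('C312', 'Beacon');
    INSERT INTO employee VALUES (10, 'A. Smith', 'K342'), (11, 'B. Jones', 'C312');
    INSERT INTO course VALUES (1, 'Databases'), (2, 'Networking');
    INSERT INTO emp_on_course VALUES
        (1, 10, '2024-01-08', 20),
        (2, 10, '2024-02-05', 15);
""")

# Joining the four tables rebuilds the original un-normalised list.
flat = conn.execute("""
    SELECT c.course_id, c.course_name, e.employee_id, e.name,
           r.building, r.room_id, ec.date_joined, ec.allocated_hours
    FROM emp_on_course AS ec
    JOIN course AS c ON c.course_id = ec.course_id
    JOIN employee AS e ON e.employee_id = ec.employee_id
    JOIN room AS r ON r.room_id = e.room_id
    ORDER BY c.course_id, e.employee_id
""").fetchall()

# Employee 11 is recorded even though they have not joined a course.
not_on_course = conn.execute("""
    SELECT employee_id FROM employee
    WHERE employee_id NOT IN (SELECT employee_id FROM emp_on_course)
""").fetchall()

# The foreign key rejects an enrolment for an unknown employee.
try:
    conn.execute("INSERT INTO emp_on_course VALUES (1, 99, '2024-03-01', 10)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

print(flat)
print(not_on_course)
print(rejected)
```

No information is lost by normalising: the join returns the same columns as the original list, while an employee can now exist before joining any course.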