ims 4212: data modeling—super-type/sub-type entities 1 dr. lawrence west, management dept.,...
TRANSCRIPT
IMS 4212: Data Modeling—Super-Type/Sub-Type Entities
1Dr. Lawrence West, Management Dept., University of Central [email protected]
Super-Type & Sub-Type Entities—Topics
• Problems needing subtype entities
• Nature of the solution
• Variations—Specialization and Completeness
• Subtype Identifiers
• Implementing
• Special Topics
• Using Super- and Sub-Types
• Performance Considerations
IMS 4212: Data Modeling—Super-Type/Sub-Type Entities
2Dr. Lawrence West, Management Dept., University of Central [email protected]
Supertype & Subtype Entities
• Some entities have records that come in various ‘flavors’.
– StudentsDoctoral, Masters, Undergraduate
– ProductsSerial-numbered, perishable, animals, etc.
– EmployeesSalaried, hourly, managerial, part time
– Pet Store Products Food, animals, accessories
• These entity sets have two types of attributes
– Attributes common to every occurrence
– Attributes required by one or more subtypes but not used by all occurrences of the entity
IMS 4212: Data Modeling—Super-Type/Sub-Type Entities
3Dr. Lawrence West, Management Dept., University of Central [email protected]
Why is This a Problem?
• Variations on an entity create a space problem
– If we put all possible attributes for all possible variations (subtypes) in one entity we will waste unused fields in most records
– Sport attribute for students who are not athletes
IMS 4212: Data Modeling—Super-Type/Sub-Type Entities
4Dr. Lawrence West, Management Dept., University of Central [email protected]
Supertype & Subtype Entities (cont.)
• Subtypes also create relationship problems– Some relationships will only be with a subtype of the
entity, not with all types– In a pet store a veterinarian will inspect the animals
inventory items but probably not the turtle food
Student
PIDLastNameFirstNameStreetAddressCity :DissertationAdvisorID
Faculty
EmployeeIDLastNameFirstName :FacultyTypeRankDoctorallyQualified
Has DoctoralAdvisor
IMS 4212: Data Modeling—Super-Type/Sub-Type Entities
5Dr. Lawrence West, Management Dept., University of Central [email protected]
Supertype & Subtype Entities
• It is common to split upentities with variationsinto a supertype andsome subtypes
– Supertype containsattributes common toall occurrences
– Subtypes contain attributes needed by the subtype
S tudent
M astersS tudent
D octora lS tudent
U ndergradS tudent
Student
MastersStudent
DoctoralStudent
Under-graduateStudent
d
ERD Notation
Visio Equivalent
IMS 4212: Data Modeling—Super-Type/Sub-Type Entities
6Dr. Lawrence West, Management Dept., University of Central [email protected]
An Example
• Cash is a PaymentTypebut needs no special attributes– Partial Specialization (coming up)
• Payment ID is PK of all entities
• Payment ID is also FK insubtype entities
– In SQL Server besure to set parent this way when implementing relationships
PAYMENT
PaymentIDAccountIDPaymentDatePaymentAmountPaymentType
CHECK_PAYMENT
PaymentIDCheckNum
CC_PAYMENT
PaymentIDCC_TypeCC_NumberSecurityCodeApprovalCode
IMS 4212: Data Modeling—Super-Type/Sub-Type Entities
7Dr. Lawrence West, Management Dept., University of Central [email protected]
Need for Subtypes
• Subtypes are used when an identifiable subset of occurrences have a need for fields not needed by all occurrences
– Many occurrences will have empty attribute values
– An occurrence’s membership in the identifiable subset must be observable
• It is known whether a student is registered as an athlete
• But there is no obvious distinction to distinguish ‘local’ students from ‘transient’ students
IMS 4212: Data Modeling—Super-Type/Sub-Type Entities
8Dr. Lawrence West, Management Dept., University of Central [email protected]
First Variation on Super-/Subtypes
• Completeness Constraint
– Must every supertype occurrence have at least one occurrence in one of the subtypes?
• Total specialization means thata subtype occurrence must exits
– Indicated with a double lineto the connecting circle
• Partial specialization means thata subtype need not exist
– Indicated with a single line tothe connecting circle
S tudent
S tudent
IMS 4212: Data Modeling—Super-Type/Sub-Type Entities
9Dr. Lawrence West, Management Dept., University of Central [email protected]
Total Specialization Completeness Constraint
• Total specialization means that every record in the supertype must have a matching record in one or more subtypes
• Relatively rare (in my experience) but possible
• Model in Visio using a thicker descending line (use Format Line)
– (Visio doesn’t do double lines)
– Increase thickness by two levels
• Watch for SQL Server modeling later
S tudent
STUDENT
IMS 4212: Data Modeling—Super-Type/Sub-Type Entities
10Dr. Lawrence West, Management Dept., University of Central [email protected]
Partial Specialization Completeness Constraint
• Some records in supertypes may have no matching subtype records
• Their subtype groups do not needspecial attributes
– But membership in a groupmay still be important andtracked
• It is possible for a suptertype to haveonly one subtype group
STUDENT
ATHLETE
IMS 4212: Data Modeling—Super-Type/Sub-Type Entities
11Dr. Lawrence West, Management Dept., University of Central [email protected]
Your Turn
• Model the products in a home improvement store as a supertype/subtype relationship
• Identify categories and any specialized attributes needed
IMS 4212: Data Modeling—Super-Type/Sub-Type Entities
12Dr. Lawrence West, Management Dept., University of Central [email protected]
Second Variation on Super-/Subtypes
• You must also determine whether a supertype occurrence can be found in more than one subtype
– A disjoint relationship meansthat a supertype occurrence can only be found in one subtype
– An overlap relationship means that a supertype occurrencecan be found in multiple subtypes(E.g., some universities have ajoint J.D./MBA program)
S tudent
d
S tudent
o
“d”
“o”
IMS 4212: Data Modeling—Super-Type/Sub-Type Entities
13Dr. Lawrence West, Management Dept., University of Central [email protected]
Disjoint Relationships
• A registered vehicle canonly be of one type
VEHICLE
VINManufacturerYearWeightType
d
CAR
VINDoorsSeats
TRUCK
VINBedLengthTowingCapacityTailgateType
IMS 4212: Data Modeling—Super-Type/Sub-Type Entities
14Dr. Lawrence West, Management Dept., University of Central [email protected]
Overlap Relationships
STUDENT
o
ATHLETEPATIENTINTERN EMPLOYEE VETERAN
IMS 4212: Data Modeling—Super-Type/Sub-Type Entities
15Dr. Lawrence West, Management Dept., University of Central [email protected]
Subtype Identifiers
• The supertype entity must indicate which (if any) subtypes are used
• Disjoint subtypes can use one attribute with a code to indicate the type of subtype
– Value of the attribute (‘Cash’, ‘Check’, ‘CC’) identifies the subtype
– Remember that some subtype identifiers (‘Cash’ here) may have no subtype entities
– Sometimes this value may be blank (not part of any group)
PAYMENT
PaymentIDAccountIDPaymentDatePaymentAmountPaymentType
IMS 4212: Data Modeling—Super-Type/Sub-Type Entities
16Dr. Lawrence West, Management Dept., University of Central [email protected]
Subtype Identifiers (cont.)
• Overlapping subtypes must use a collection of yes/no attributes, one for each possible subtype
– Setting attribute to true/yes in a record indicates that a matching subtype record exists
– Leaving all to false/no indicates no matching subtype (partial specialization)
STUDENT
PIDLastNameFirstName :InternPatientAthleteEmployeeVeteran
IMS 4212: Data Modeling—Super-Type/Sub-Type Entities
17Dr. Lawrence West, Management Dept., University of Central [email protected]
Subtype Identifiers (cont.)
• A subset of subtypes may be disjoint while others are overlap
o
MD…INTERN MASTERS PhD
STUDENT
PIDLastNameFirstName :InternPatientAthleteEmployeeVeteranDegreeSought
IMS 4212: Data Modeling—Super-Type/Sub-Type Entities
18Dr. Lawrence West, Management Dept., University of Central [email protected]
Subtypes of Subtypes
• It is possible to have subtypes of subtypes
• Model products in a pet store where some are inanimate, some are food, some are live and of the live animals some are tracked individually…
– Cute puppies with wet noses
– Cats
• … and others are not
– Goldfish
– Mice
• … and some are sold as food
– Cute little mice as food for slithering scaly snakes
IMS 4212: Data Modeling—Super-Type/Sub-Type Entities
19Dr. Lawrence West, Management Dept., University of Central [email protected]
Some Caveats
• ST/ST determined at the group level. Individual records may not have values for all fields
• More than one subtype may have the same field in it
– Field goes in subtype entities if not every subtype group needs it
• Consider eliminating subtypes if they have only one or two attributes
– Roll their attributes back into the suptertype and accept wasted space
– Consider if a large proportion of the population
– Consider if frequently accessed
IMS 4212: Data Modeling—Super-Type/Sub-Type Entities
20Dr. Lawrence West, Management Dept., University of Central [email protected]
Implementing Super-/Subtypes
• There is a Mandatory-1:Optional-1 relationship between entities in a super and subtype relationship
– Mandatory at supertype end
– Optional at each subtype end
• Each subtype occurrence (record) has identifier attribute values that exactly match a record in the supertype (but not vice-versa)
• All entities have the same primary key/ identifier attributes
• PK in the subtype is also the FK from supertype
– Special case of a weak entity
IMS 4212: Data Modeling—Super-Type/Sub-Type Entities
21Dr. Lawrence West, Management Dept., University of Central [email protected]
Implementing in SQL Server—Table Design
IMS 4212: Data Modeling—Super-Type/Sub-Type Entities
22Dr. Lawrence West, Management Dept., University of Central [email protected]
Implementing in SQL Server—Relationships
PK is also FK
IMS 4212: Data Modeling—Super-Type/Sub-Type Entities
23Dr. Lawrence West, Management Dept., University of Central [email protected]
Implementing in SQL Server—Diagrams
• Arrange in org-chart hierarchy
– Gives visual cue that this is a ST/ST relationship
– You will need to wrestle with the relationship lines a little
• Note Key symbols at both ends of the lines
– Indicates 1:1 Cardinality
IMS 4212: Data Modeling—Super-Type/Sub-Type Entities
24Dr. Lawrence West, Management Dept., University of Central [email protected]
Subtypes of an Unimplemented Supertype
• Many, many data models will have records that could be subtypes of a supertype that is not implemented
• For UCF a “Person” entity could have subtypes
– Student − Donor
– Faculty − Contractor
• Tend to not implement this Person supertype unless the entities are regularly queried together
• Occasional queries can be supported with a UNION query
IMS 4212: Data Modeling—Super-Type/Sub-Type Entities
25Dr. Lawrence West, Management Dept., University of Central [email protected]
Subtypes and Object Oriented Design
• Super- and Sub-type design exactly corresponds to the philosophy of inheritance in object oriented design
• If programming using an OO approach you will almost always implement objects with inheritance to match super- and sub-type design
• You can also implement inheritance for the unimplemented supertype discussed in the previous slide, even if not implemented in the DB design
IMS 4212: Data Modeling—Super-Type/Sub-Type Entities
26Dr. Lawrence West, Management Dept., University of Central [email protected]
Using Super-/Sub-type Tables
• Application logic and SQL for super-and sub-type tables becomes more complex
• Inserts must test the subtype identifier to determine where to add records
– Always to the supertype
– Decide which (if any) subtype(s)
• Similar for Updates
IMS 4212: Data Modeling—Super-Type/Sub-Type Entities
27Dr. Lawrence West, Management Dept., University of Central [email protected]
Using Super-/Sub-type Tables (cont.)
• Retrieval also complex
• You cannot simply join the supertype with all subtypes since no records will be returned if a subtype has no match
– Why won’t the following work?
SELECT Payment.*, Check_Payment.*, CC_Payment.*FROM Payment, Check_Payment, CC_PaymentWHERE Payment.PaymentID = Check_Payment.PaymentID AND Payment.PaymentID = CC_Payment.PaymentID AND Payment.PaymentID = 1472
IMS 4212: Data Modeling—Super-Type/Sub-Type Entities
28Dr. Lawrence West, Management Dept., University of Central [email protected]
Using Super-/Sub-type Tables (cont.)
• Two query approaches
• Use conditional logic
• Use Left/Right Outer Joins
SELECT Payment.*, Check_Payment.*, CC_Payment.*FROM Payment LEFT JOIN Check_Payment ON Payment.PaymentID = Check_Payment.PaymentID LEFT JOIN CC_Payment ON Payment.PaymentID = CC_Payment.PaymentIDWHERE Payment.PaymentID = 1472
IMS 4212: Data Modeling—Super-Type/Sub-Type Entities
29Dr. Lawrence West, Management Dept., University of Central [email protected]
Performance Considerations
• Because of the performance considerations and complexity of Super- and Sub-types you will regularly consider eliminating subtypes
• Roll up their attributes into the super-type and accept the wasted columns
• Arguments for retaining subtypes
– Several unique attributes, especially large (text) ones
– Relatively few records in the subtype (compared to overall number of records)
– Relatively few transactions use the subtype
• Look at vertical partitioning later in the course