IMS 4212: Data Modeling—Super-Type/Sub-Type Entities

Dr. Lawrence West, Management Dept., University of Central Florida, [email protected]

Super-Type & Sub-Type Entities—Topics

• Problems needing subtype entities

• Nature of the solution

• Variations—Specialization and Completeness

• Subtype Identifiers

• Implementing

• Special Topics

• Using Super- and Sub-Types

• Performance Considerations

Supertype & Subtype Entities

• Some entities have records that come in various ‘flavors’.

– Students: Doctoral, Masters, Undergraduate

– Products: Serial-numbered, perishable, animals, etc.

– Employees: Salaried, hourly, managerial, part time

– Pet Store Products: Food, animals, accessories

• These entity sets have two types of attributes

– Attributes common to every occurrence

– Attributes required by one or more subtypes but not used by all occurrences of the entity

Why is This a Problem?

• Variations on an entity create a space problem

– If we put all possible attributes for all possible variations (subtypes) in one entity, most records will carry fields they never use

– Example: a Sport attribute is empty for every student who is not an athlete

Supertype & Subtype Entities (cont.)

• Subtypes also create relationship problems

– Some relationships will only be with a subtype of the entity, not with all types

– In a pet store a veterinarian will inspect the animal inventory items but probably not the turtle food

[Diagram: Student and Faculty entities joined by a "Has Doctoral Advisor" relationship]

Student: PID, LastName, FirstName, StreetAddress, City, …, DissertationAdvisorID

Faculty: EmployeeID, LastName, FirstName, …, FacultyType, Rank, DoctorallyQualified

Supertype & Subtype Entities

• It is common to split up entities with variations into a supertype and some subtypes

– Supertype contains attributes common to all occurrences

– Subtypes contain attributes needed by the subtype

[Diagram: Student supertype with subtypes MastersStudent, DoctoralStudent, and UndergradStudent, drawn in ERD notation (with a "d" in the connecting circle) and as the Visio equivalent]

An Example

• Cash is a PaymentType but needs no special attributes

– Partial Specialization (coming up)

• Payment ID is PK of all entities

• Payment ID is also FK in subtype entities

– In SQL Server be sure to set parent this way when implementing relationships

PAYMENT: PaymentID, AccountID, PaymentDate, PaymentAmount, PaymentType

CHECK_PAYMENT: PaymentID, CheckNum

CC_PAYMENT: PaymentID, CC_Type, CC_Number, SecurityCode, ApprovalCode
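
The payment example could be declared roughly as follows. This is a minimal sketch in SQL Server syntax, not the course's own script: the data types, lengths, nullability, and constraint names are all assumptions, and only the table and column names come from the slide. The point it illustrates is that PaymentID is the primary key of every table and, in each subtype, also a foreign key back to the supertype.

```sql
-- Sketch only: data types, lengths, and constraint names are assumptions.
CREATE TABLE Payment (
    PaymentID     INT           NOT NULL PRIMARY KEY,
    AccountID     INT           NOT NULL,
    PaymentDate   DATE          NOT NULL,
    PaymentAmount DECIMAL(10,2) NOT NULL,
    PaymentType   VARCHAR(5)    NOT NULL   -- 'Cash', 'Check', or 'CC'
);

-- Subtype: PaymentID is both the PK and the FK to Payment.
CREATE TABLE Check_Payment (
    PaymentID INT         NOT NULL PRIMARY KEY,
    CheckNum  VARCHAR(20) NOT NULL,
    CONSTRAINT FK_CheckPayment_Payment
        FOREIGN KEY (PaymentID) REFERENCES Payment (PaymentID)
);

-- Subtype: same pattern. Cash has no subtype table at all.
CREATE TABLE CC_Payment (
    PaymentID    INT         NOT NULL PRIMARY KEY,
    CC_Type      VARCHAR(20) NOT NULL,
    CC_Number    VARCHAR(19) NOT NULL,
    SecurityCode VARCHAR(4)  NULL,
    ApprovalCode VARCHAR(10) NULL,
    CONSTRAINT FK_CCPayment_Payment
        FOREIGN KEY (PaymentID) REFERENCES Payment (PaymentID)
);
```

Because the subtype key is a primary key, each Payment row can have at most one Check_Payment row and at most one CC_Payment row.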

Need for Subtypes

• Subtypes are used when an identifiable subset of occurrences needs fields not needed by all occurrences

– Without subtypes, many occurrences will have empty attribute values

– An occurrence’s membership in the identifiable subset must be observable

• It is known whether a student is registered as an athlete

• But there is no obvious way to distinguish ‘local’ students from ‘transient’ students

First Variation on Super-/Subtypes

• Completeness Constraint

– Must every supertype occurrence have at least one occurrence in one of the subtypes?

• Total specialization means that a subtype occurrence must exist

– Indicated with a double line to the connecting circle

• Partial specialization means that a subtype need not exist

– Indicated with a single line to the connecting circle

[Diagrams: Student supertype joined to the connecting circle by a double line (total) and by a single line (partial)]

Total Specialization Completeness Constraint

• Total specialization means that every record in the supertype must have a matching record in one or more subtypes

• Relatively rare (in my experience) but possible

• Model in Visio using a thicker descending line (use Format Line)

– (Visio doesn’t do double lines)

– Increase thickness by two levels

• Watch for SQL Server modeling later

[Diagram: two renderings of the STUDENT supertype illustrating the total specialization line]
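
A foreign key from subtype to supertype does not by itself guarantee that every supertype row has a subtype row, so total specialization usually needs a separate check. The query below is one hedged way to audit it, not the SQL Server modeling the slide promises for later. It assumes a Student supertype keyed on PID with subtype tables named MastersStudent, DoctoralStudent, and UndergradStudent (names taken from the earlier diagram; the key column in the subtypes is an assumption).

```sql
-- Sketch only: lists students that violate total specialization,
-- i.e. supertype rows with no matching row in any subtype table.
SELECT s.PID, s.LastName, s.FirstName
FROM Student AS s
WHERE NOT EXISTS (SELECT 1 FROM MastersStudent   AS m WHERE m.PID = s.PID)
  AND NOT EXISTS (SELECT 1 FROM DoctoralStudent  AS d WHERE d.PID = s.PID)
  AND NOT EXISTS (SELECT 1 FROM UndergradStudent AS u WHERE u.PID = s.PID);
```

An empty result means every student currently has at least one subtype row.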

Partial Specialization Completeness Constraint

• Some records in supertypes may have no matching subtype records

• Their subtype groups do not need special attributes

– But membership in a group may still be important and tracked

• It is possible for a supertype to have only one subtype group

[Diagram: STUDENT supertype with a single subtype, ATHLETE]

Your Turn

• Model the products in a home improvement store as a supertype/subtype relationship

• Identify categories and any specialized attributes needed

Second Variation on Super-/Subtypes

• You must also determine whether a supertype occurrence can be found in more than one subtype

– A disjoint relationship means that a supertype occurrence can only be found in one subtype

– An overlap relationship means that a supertype occurrence can be found in multiple subtypes (e.g., some universities have a joint J.D./MBA program)

[Diagrams: Student supertype with a “d” in the connecting circle (disjoint) and with an “o” in the connecting circle (overlap)]

Disjoint Relationships

• A registered vehicle can only be of one type

[Diagram: VEHICLE supertype with disjoint (“d”) subtypes CAR and TRUCK]

VEHICLE: VIN, Manufacturer, Year, Weight, Type

CAR: VIN, Doors, Seats

TRUCK: VIN, BedLength, TowingCapacity, TailgateType
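
If the database itself should reject a vehicle that is both a car and a truck, one common technique is to repeat the type code in each subtype and reference the supertype on the pair (VIN, Type). The sketch below is an illustration of that technique, not something shown on the slides: the extra Type column in CAR and TRUCK, the type codes, the data types, and all constraint names are assumptions.

```sql
-- Sketch only: enforces disjoint subtypes declaratively.
CREATE TABLE Vehicle (
    VIN          CHAR(17)    NOT NULL PRIMARY KEY,
    Manufacturer VARCHAR(50) NOT NULL,
    [Year]       SMALLINT    NOT NULL,
    Weight       INT         NULL,
    [Type]       VARCHAR(5)  NOT NULL
        CONSTRAINT CK_Vehicle_Type CHECK ([Type] IN ('Car', 'Truck')),
    -- Extra unique key so subtypes can reference (VIN, Type) together.
    CONSTRAINT UQ_Vehicle_VIN_Type UNIQUE (VIN, [Type])
);

CREATE TABLE Car (
    VIN    CHAR(17)   NOT NULL PRIMARY KEY,
    [Type] VARCHAR(5) NOT NULL
        CONSTRAINT CK_Car_Type CHECK ([Type] = 'Car'),
    Doors  TINYINT    NULL,
    Seats  TINYINT    NULL,
    CONSTRAINT FK_Car_Vehicle
        FOREIGN KEY (VIN, [Type]) REFERENCES Vehicle (VIN, [Type])
);

CREATE TABLE Truck (
    VIN            CHAR(17)     NOT NULL PRIMARY KEY,
    [Type]         VARCHAR(5)   NOT NULL
        CONSTRAINT CK_Truck_Type CHECK ([Type] = 'Truck'),
    BedLength      DECIMAL(4,1) NULL,
    TowingCapacity INT          NULL,
    TailgateType   VARCHAR(20)  NULL,
    CONSTRAINT FK_Truck_Vehicle
        FOREIGN KEY (VIN, [Type]) REFERENCES Vehicle (VIN, [Type])
);
```

A Vehicle row has exactly one Type value, so it can satisfy the foreign key of only one subtype table. The cost is a redundant column in each subtype; a simpler design keeps the plain VIN foreign key and leaves disjointness to application logic.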

Overlap Relationships

[Diagram: STUDENT supertype with overlapping (“o”) subtypes ATHLETE, PATIENT, INTERN, EMPLOYEE, and VETERAN]

Subtype Identifiers

• The supertype entity must indicate which (if any) subtypes are used

• Disjoint subtypes can use one attribute with a code to indicate the type of subtype

– Value of the attribute (‘Cash’, ‘Check’, ‘CC’) identifies the subtype

– Remember that some subtype identifiers (‘Cash’ here) may have no subtype entities

– Sometimes this value may be blank (not part of any group)

PAYMENT: PaymentID, AccountID, PaymentDate, PaymentAmount, PaymentType
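
Since the identifier and the subtype rows are stored separately, they can drift apart. The query below is a hedged sketch of a consistency check, assuming the Payment, Check_Payment, and CC_Payment tables from the example and the codes 'Check' and 'CC' in PaymentType. It lists payments whose identifier promises a subtype row that is missing, or that have a subtype row the identifier does not mention.

```sql
-- Sketch only: finds mismatches between PaymentType and the subtype tables.
SELECT p.PaymentID, p.PaymentType
FROM Payment AS p
LEFT JOIN Check_Payment AS ck ON ck.PaymentID = p.PaymentID
LEFT JOIN CC_Payment    AS cc ON cc.PaymentID = p.PaymentID
WHERE (p.PaymentType = 'Check' AND ck.PaymentID IS NULL)      -- missing check row
   OR (p.PaymentType = 'CC'    AND cc.PaymentID IS NULL)      -- missing card row
   OR (ck.PaymentID IS NOT NULL
       AND (p.PaymentType IS NULL OR p.PaymentType <> 'Check'))  -- stray check row
   OR (cc.PaymentID IS NOT NULL
       AND (p.PaymentType IS NULL OR p.PaymentType <> 'CC'));    -- stray card row
```

Cash payments and payments with a blank type pass the check as long as they have no subtype rows.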

Subtype Identifiers (cont.)

• Overlapping subtypes must use a collection of yes/no attributes, one for each possible subtype

– Setting attribute to true/yes in a record indicates that a matching subtype record exists

– Leaving all to false/no indicates no matching subtype (partial specialization)

STUDENT: PID, LastName, FirstName, …, Intern, Patient, Athlete, Employee, Veteran
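
In SQL Server the yes/no attributes would typically be BIT columns. The following is a minimal sketch under that assumption; the data types and defaults are illustrative, and the attributes the slide elides are left out rather than guessed.

```sql
-- Sketch only: one BIT flag per overlapping subtype.
CREATE TABLE Student (
    PID       INT         NOT NULL PRIMARY KEY,
    LastName  VARCHAR(50) NOT NULL,
    FirstName VARCHAR(50) NOT NULL,
    -- (other common attributes omitted)
    Intern    BIT NOT NULL DEFAULT 0,
    Patient   BIT NOT NULL DEFAULT 0,
    Athlete   BIT NOT NULL DEFAULT 0,
    Employee  BIT NOT NULL DEFAULT 0,
    Veteran   BIT NOT NULL DEFAULT 0
);

-- Students who belong to two subtypes at once.
SELECT PID, LastName, FirstName
FROM Student
WHERE Athlete = 1 AND Employee = 1;
```

A student with all five flags left at 0 belongs to no subtype, which is the partial specialization case.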

Subtype Identifiers (cont.)

• A subset of subtypes may be disjoint while others overlap

[Diagram: STUDENT subtypes joined through an “o” circle; subtypes shown include MD, …, INTERN, MASTERS, PhD]

STUDENT: PID, LastName, FirstName, …, Intern, Patient, Athlete, Employee, Veteran, DegreeSought

Subtypes of Subtypes

• It is possible to have subtypes of subtypes

• Model products in a pet store where some are inanimate, some are food, some are live and of the live animals some are tracked individually…

– Cute puppies with wet noses

– Cats

• … and others are not

– Goldfish

– Mice

• … and some are sold as food

– Cute little mice as food for slithering scaly snakes

Some Caveats

• Supertype/subtype (ST/ST) membership is determined at the group level. Individual records may not have values for all fields

• More than one subtype may have the same field in it

– Field goes in subtype entities if not every subtype group needs it

• Consider eliminating subtypes if they have only one or two attributes

– Roll their attributes back into the supertype and accept wasted space

– Consider this if the subtype covers a large proportion of the population

– Consider this if the subtype is frequently accessed

Implementing Super-/Subtypes

• There is a Mandatory-1:Optional-1 relationship between the entities in a supertype/subtype relationship

– Mandatory at supertype end

– Optional at each subtype end

• Each subtype occurrence (record) has identifier attribute values that exactly match a record in the supertype (but not vice-versa)

• All entities have the same primary key/ identifier attributes

• PK in the subtype is also the FK from supertype

– Special case of a weak entity

Implementing in SQL Server—Table Design

Implementing in SQL Server—Relationships

PK is also FK

Implementing in SQL Server—Diagrams

• Arrange in org-chart hierarchy

– Gives visual cue that this is a ST/ST relationship

– You will need to wrestle with the relationship lines a little

• Note Key symbols at both ends of the lines

– Indicates 1:1 Cardinality

Subtypes of an Unimplemented Supertype

• Many, many data models will have records that could be subtypes of a supertype that is not implemented

• For UCF a “Person” entity could have subtypes

– Student

– Faculty

– Donor

– Contractor

• Tend to not implement this Person supertype unless the entities are regularly queried together

• Occasional queries can be supported with a UNION query
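
Such an occasional query might look like the sketch below. It assumes Student and Faculty tables with the key and name columns shown in the earlier diagram, and that PID and EmployeeID have compatible data types; the PersonType and PersonID column names are made up for illustration. Donor and Contractor are left out because the slides give no columns for them.

```sql
-- Sketch only: treats Student and Faculty as one "Person" list on demand.
SELECT 'Student' AS PersonType, PID AS PersonID, LastName, FirstName
FROM Student
UNION ALL
SELECT 'Faculty', EmployeeID, LastName, FirstName
FROM Faculty
ORDER BY LastName, FirstName;
```

UNION ALL is used here instead of plain UNION because the PersonType literal already keeps rows from the two tables distinct, so there is nothing for UNION to de-duplicate across tables. A person who is both a student and a faculty member appears twice, once per role.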

Subtypes and Object Oriented Design

• Super- and Sub-type design exactly corresponds to the philosophy of inheritance in object oriented design

• If programming using an OO approach you will almost always implement objects with inheritance to match super- and sub-type design

• You can also implement inheritance for the unimplemented supertype discussed in the previous slide, even if not implemented in the DB design

Using Super-/Sub-type Tables

• Application logic and SQL for super- and sub-type tables become more complex

• Inserts must test the subtype identifier to determine where to add records

– Always to the supertype

– Decide which (if any) subtype(s)

• Similar for Updates
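
The insert logic described above can be sketched in T-SQL as follows. The tables are the Payment example from earlier; the variable names and every literal value (IDs, amount, date, check number) are hypothetical. Wrapping both inserts in one transaction is an added assumption, so that a supertype row is never left without the subtype row its identifier promises.

```sql
-- Sketch only: insert into the supertype, then into the matching subtype.
SET XACT_ABORT ON;  -- a run-time error rolls back the whole transaction

DECLARE @PaymentID   INT        = 1473;     -- hypothetical new key
DECLARE @PaymentType VARCHAR(5) = 'Check';  -- the subtype identifier

BEGIN TRANSACTION;

-- Always insert into the supertype.
INSERT INTO Payment (PaymentID, AccountID, PaymentDate, PaymentAmount, PaymentType)
VALUES (@PaymentID, 1001, '2024-01-15', 250.00, @PaymentType);

-- Then test the identifier to decide which subtype (if any) gets a row.
IF @PaymentType = 'Check'
    INSERT INTO Check_Payment (PaymentID, CheckNum)
    VALUES (@PaymentID, '1047');
ELSE IF @PaymentType = 'CC'
    INSERT INTO CC_Payment (PaymentID, CC_Type, CC_Number, SecurityCode, ApprovalCode)
    VALUES (@PaymentID, 'Visa', '4111111111111111', NULL, 'A12345');
-- 'Cash' needs no subtype row.

COMMIT TRANSACTION;
```

The supertype insert must come first, because the subtype's foreign key requires the Payment row to exist.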

Using Super-/Sub-type Tables (cont.)

• Retrieval also complex

• You cannot simply inner join the supertype with all subtypes, since no records will be returned if any subtype has no match

– Why won’t the following work?

SELECT Payment.*, Check_Payment.*, CC_Payment.*
FROM Payment, Check_Payment, CC_Payment
WHERE Payment.PaymentID = Check_Payment.PaymentID
  AND Payment.PaymentID = CC_Payment.PaymentID
  AND Payment.PaymentID = 1472

Using Super-/Sub-type Tables (cont.)

• Two query approaches

– Use conditional logic

– Use Left/Right Outer Joins

SELECT Payment.*, Check_Payment.*, CC_Payment.*
FROM Payment
LEFT JOIN Check_Payment ON Payment.PaymentID = Check_Payment.PaymentID
LEFT JOIN CC_Payment ON Payment.PaymentID = CC_Payment.PaymentID
WHERE Payment.PaymentID = 1472
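
The query above shows the outer join approach. The conditional logic approach is not shown on the slide; one possible T-SQL sketch, assuming the same tables and the codes 'Check' and 'CC' in PaymentType, reads the subtype identifier first and then joins only the subtype that applies. The variable names are illustrative.

```sql
-- Sketch only: pick the query based on the subtype identifier.
DECLARE @PaymentID   INT = 1472;
DECLARE @PaymentType VARCHAR(5);

SELECT @PaymentType = PaymentType
FROM Payment
WHERE PaymentID = @PaymentID;

IF @PaymentType = 'Check'
    SELECT p.*, ck.CheckNum
    FROM Payment AS p
    JOIN Check_Payment AS ck ON ck.PaymentID = p.PaymentID
    WHERE p.PaymentID = @PaymentID;
ELSE IF @PaymentType = 'CC'
    SELECT p.*, cc.CC_Type, cc.CC_Number, cc.ApprovalCode
    FROM Payment AS p
    JOIN CC_Payment AS cc ON cc.PaymentID = p.PaymentID
    WHERE p.PaymentID = @PaymentID;
ELSE
    -- Cash, or no subtype at all: only the supertype row exists.
    SELECT p.*
    FROM Payment AS p
    WHERE p.PaymentID = @PaymentID;
```

This version returns only the columns relevant to the payment's type, at the cost of an extra lookup and a result shape that varies by type. The outer join version always returns the same columns, with NULLs for the subtypes that do not apply.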

Performance Considerations

• Because of the performance considerations and complexity of Super- and Sub-types you will regularly consider eliminating subtypes

• Roll up their attributes into the super-type and accept the wasted columns

• Arguments for retaining subtypes

– Several unique attributes, especially large (text) ones

– Relatively few records in the subtype (compared to overall number of records)

– Relatively few transactions use the subtype

• Look at vertical partitioning later in the course
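
As an illustration of rolling a small subtype up, the check payment subtype from the earlier example has a single attribute and is a plausible candidate. The sketch below is an assumption-laden example, not a recommendation from the slides: the table name Payment_RolledUp, the data types, and the CHECK constraint are all made up for illustration.

```sql
-- Sketch only: CheckNum moved into the supertype as a nullable column.
CREATE TABLE Payment_RolledUp (
    PaymentID     INT           NOT NULL PRIMARY KEY,
    AccountID     INT           NOT NULL,
    PaymentDate   DATE          NOT NULL,
    PaymentAmount DECIMAL(10,2) NOT NULL,
    PaymentType   VARCHAR(5)    NOT NULL,
    CheckNum      VARCHAR(20)   NULL,   -- wasted for every non-check payment
    -- Keep the rolled-up column from being filled in for other types.
    CONSTRAINT CK_PaymentRolledUp_CheckNum
        CHECK (CheckNum IS NULL OR PaymentType = 'Check')
);

-- Retrieval no longer needs a join for check payments.
SELECT PaymentID, PaymentAmount, CheckNum
FROM Payment_RolledUp
WHERE PaymentID = 1472;
```

The credit card subtype has several attributes of its own, which by the arguments above makes it a better candidate for staying a separate table.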