enhanced erd
Post on 20-Jan-2016
98 Views
Preview:
DESCRIPTION
TRANSCRIPT
Enhanced E-R Model andEnhanced E-R Model andBusiness RulesBusiness Rules
CS263 Lecture 4
IntroductionIntroduction
The basic ER model was introduced in the mid 1970s Since then business relationships and business data have
become more complex The term Enhanced Entity Relationship (EER) model
refers to the extension of the original ER model
Supertype/Subtype Supertype/Subtype relationshipsrelationships
Most important new modelling construct in the EER model Allows us to model a general entity type (the supertype)
and then subdivide it into several specialised entity types (called subtypes)
Each subtype inherits from its supertype and may have special attributes of its own
Representing supertypes and Representing supertypes and subtypessubtypes
• The supertype is connected with a line to a circle, which in turn is connected by a line to each subtype that has been defined (see Fig.)
• The ‘U’ shaped symbol on each line connecting a subtype to the circle indicates that the subtype is a subset of the supertype, and also indicates the direction of the relationship
• Attributes shared by all the entities are associated with the supertype, whilst attributes that are unique to a particular
subtype are associated with that subtype
Basic notation for supertype/subtype
relationships
An example: the EMPLOYEE An example: the EMPLOYEE supertypesupertype
Suppose that an organisation has 3 basic types of employees:
Hourly: Employee_Number, Employee_Name, Address, Date_Hired, Hourly_Rate
Salaried: Employee_Number, Employee_Name, Address, Date_Hired, Annual_Salary, Stock_Option
Contract consultants: Employee_Number, Employee_Name, Address, Date_Hired, Contract_Number, Billing_Rate
Employee supertypeEmployee supertype
Many attributes the same across 3 types, so we could define a supertype called EMPLOYEE, with subtypes for HOURLY_EMPLOYEE, SALARIED_EMPLOYEE and CONSULTANT (see Fig)
Employee supertype with three subtypes
All employee subtypes will have emp_no., name, address, and date-hired
Each employee subtype will also have its own attributes
When to use When to use supertype/subtype relationssupertype/subtype relations
Should consider using subtypes when either (or both) of the following conditions are present:
1. There are attributes that apply to some (but not all) of the instances of an entity type
2. The instances of a subtype participate in a relationship unique to that subtype
Both are true in the following Fig., where PATIENT has two subtypes: OUTPATIENT and RESIDENT PATIENT (the primary key is PATIENT_ID)
All patients have an Admit_Date and a Patient_Name
Patient examplePatient example
Every patient is cared for by a RESPONSIBLE_PHYSICIAN who develops a treatment plan for the patient
Each subtype also has unique attributes. Outpatients have a Checkback_Date, whilst residents have a Date_Discharged and a unique relationship that assigns each patient to a bed (this is a mandatory relationship, and each bed may or may not be assigned to a patient)
Supertype/subtype relationships of patients
Both outpatients and resident patients are cared for by a responsible physician
Only resident patients are assigned to a bed
Generalization and Generalization and specializationspecialization
Generalization: = the process of defining a more general entity type from a set of more specialized entity types. BOTTOM-UPBOTTOM-UP
Specialization = the process of defining one or more subtypes of the supertype, and forming supertype/subtype relationships. TOP-DOWNTOP-DOWN
Following Figs. Shows Generalisation in a situation where Following Figs. Shows Generalisation in a situation where we have 3 different entity types: CAR, TRUCK and we have 3 different entity types: CAR, TRUCK and MOTORCYCLEMOTORCYCLE
In second Fig. The more general entity type VEHICLE is In second Fig. The more general entity type VEHICLE is shownshown
GeneralisationGeneralisation
MOTORCYCLE is not shown as it does not satisfy MOTORCYCLE is not shown as it does not satisfy conditions for a subtype discussed earlier – the only conditions for a subtype discussed earlier – the only attributes of MOTORCYCLE are those that are common attributes of MOTORCYCLE are those that are common to all vehicles, there are no attributes specific to to all vehicles, there are no attributes specific to motorcycles. Further, MOTORCYCLE does not have a motorcycles. Further, MOTORCYCLE does not have a relationship to another entity typerelationship to another entity type
Example of generalization
Three entity types: CAR, TRUCK, and MOTORCYCLE
All these types of vehicles have common attributes
Generalization to VEHICLE supertype
So we put the shared attributes in a supertype
Note: no subtype for motorcycle, since it has no unique attributes
SpecialisationSpecialisation
In an example of specialisation, an entity type PART has identifier Part_No and other attributes Description, Unit_price, Location, Qty_On_Hand, Routing_Number and Supplier (a multivalued attribute as there may be more than one supplier)
Some parts are internally Manufactured Parts whilst others are externally Purchased Parts (some parts are obtained from both sources – when the choice depends on factors such as manufacturing capacity, unit price of the parts etc.)
SpecialisationSpecialisation
Some attributes apply to all parts regardless of source, others such as Routing_Number depend on the source as they apply only to Manufactured Parts. This suggests that PART should be specialised by defining the subtypes MANUFACTURED_PART and PURCHASED_PART
A new relationship ‘Supplies’ has been created between PURCHASED_PART and SUPPLIER that allows users to more easily associate purchased parts with their suppliers
The attribute Unit_Price is now associated with the relationship ‘Supplies’ so that the price may vary between suppliers
Example of specialization
Entity type PART
Only applies to manufactured
parts
Applies only to purchased parts
Specialization to MANUFACTURED PART and PURCHASED PART
Note: multivalued attribute was replaced by a relationship to another entity
Created 2 subtypes
Completeness constraintsCompleteness constraints
Whether an instance of a supertype must also be a member of at least one subtype. Has two possible rules:
1:Total Specialization Rule: Yes (double line) In following Fig. A PATIENT must either be an OUTPATIENT or a RESIDENT PATIENT. Total specialisation is indicated by the double line extending from the PATIENT identity type to the circle
Examples of completeness constraints
Total specialization rule
A patient must be either an outpatient or a resident patient
Completeness constraintsCompleteness constraints
• Partial Specialization Rule: No (single line) An entity instance of the supertype is allowed not to belong to any subtype.
• In the following Fig. If a VEHICLE is a car it will appear as an instance of CAR, and if a truck as an instance of TRUCK.
• However, if the vehicle is a motorcycle it cannot appear as an instance of any subtype. This example of partial specialisation is specified by the single line from the VEHICLE supertype to the circle
Partial specialization rule
A vehicle could be a car, a truck, or neither
Disjointness constraintDisjointness constraint
Whether an instance of a supertype may simultaneously be a member of two (or more) subtypes. Has two possible rules:
Disjoint Rule: An instance of the supertype can be only ONE of the subtypes. Following Fig. shows that at any one time a PATIENT must be either an outpatient or a resident patient but cannot be both – specified by the letter ‘d’
The subclass of a patient may change over time, but at any
given time a patient is of only one type
Disjoint rule
Examples of disjointness constraints
A patient can either be outpatient or resident, but not both
Disjointness constraintDisjointness constraint
• Overlap Rule: An instance of the supertype can simultaneously be a member of more than one of the subtypes. Some PARTs are both Manufactured and Purchased. An instance of PART is a particular Part Number (i.e. type of part) rather than the individual part itself (Part_No). Considering Part Number 4000, at a given time there may be 250 of this part to hand, of which 100 are manufactured and 150 are purchased. The overlap is specified by placing an ‘o’ in the circle
• Double line suggests any part must be either a purchased part or a manufactured part, or it may simultaneously be both of these
Overlap rule
A part may be both purchased and manufactured
Subtype discriminatorsSubtype discriminators
• Attribute of the supertype whose values determine the target subtype(s)• Disjoint subtypes: a simple attribute with alternative
values to indicate the possible subtypes. In the following Fig. A new attribute ‘Employee_Type’ has been added to the supertype to serve as a subtype discriminator. 3 values ‘H’ = Hourly, ‘S’ = Salaried, ‘C’ = Consultant. This is assigned the correct value when a new employee is added.
Subtype discriminatorsSubtype discriminators
The expression “Employee_Type = “ (the LHS of a condition statement) is placed next to the line leading from the supertype to the circle, with the value of the attribute that selects the appropriate subtype placed adjacent to the line leading to that subtype
Subtype discriminator (disjoint rule)
A simple attribute with different possible values indicating the subtype
Subtype discriminatorsSubtype discriminators
• Overlapping – a composite attribute whose subparts pertain to different subtypes. Each subpart contains a boolean value to indicate whether or not the instance belongs to the associated subtype.
• In the following Fig. a new attribute Part_Type has been added to PART. Part_Type is a composite attribute with components ‘Manufactured?’ and ‘Purchased?’ Each of these attributes is a boolean variable, and can be combined as: Manufactures only = YN, Purchased only = NY, Purchased and manufactured = YY
Subtype discriminator (overlap rule)
A composite attribute with sub-attributes indicating “yes” or “no” to determine whether it is of each subtype
Defining supertype/subtype Defining supertype/subtype hierarchieshierarchies
It is possible for any of the subtypes in these examples to have other subtypes defined on it
A supertype/subtype hierarchy is a hierarchical arrangement of supertypes and subtypes, where each subtype has only one supertype
In modelling the Human resources in a University, the most general entity type would be PERSON (sometimes called the root)
Relevant attributes are SSN (Social Security Number – Identifier), Name, Address, Gender and Date_Of_Birth
Defining supertype/subtype Defining supertype/subtype hierarchieshierarchies
Next we define all major subtypes of the root, here there are 3, EMPLOYEE, STUDENT and ALUMNUS (already graduated)
A person may belong to more than one subtype (such as ALUMNUS and EMPLOYEE) so the overlap rule is used. Overlap allows for any overlap (3-way) so if certain combinations are not allowed a more refined supertype/subtype hierarchy would have to be developed to eliminate these
Defining supertype/subtype Defining supertype/subtype hierarchieshierarchies
The next step is to evaluate whether any of the subtypes already defined qualify for further specialisation. Here employee has two subtypes, FACULTY and STAFF
Because there may be types of employee other than Faculty and Staff, the partial specialisation rule is indicated. However, such an employee cannot be both Faculty and Staff at the same time, so the disjoint rule is indicated in the circle
Defining supertype/subtype Defining supertype/subtype hierarchieshierarchies
Two subtypes are defined for student: GRADUATE_STUDENT and UNDERGRADUATE STUDENT
Notice that total specialisation and the disjoint rule are specified
Example of supertype/subtype hierarchy
Entity clusteringEntity clustering
EER diagrams are difficult to read when there are too many entities and relationships
Solution: group entities and relationships into entity clusters
Entity cluster: set of one or more entity types and associated relationships grouped into a single abstract entity type
Because an entity cluster behaves like an entity type, entity clusters and entity types can be further grouped to form a higher-level entity cluster
Entity clusteringEntity clustering
Entity-clustering is a hierarchical decomposition of a macro-level view of the data model into finer and finer views, eventually resulting in the full, detailed data model. The first of the following Figs. Shows the completed data model with dashed lines drawn around possible entity clusters, and the second shows the final result of transforming this into an EER diagram of only entity clusters and relationships
Entity clusteringEntity clustering
Entity clusters are formed: 1. By abstracting a supertype and its subtypes (see
CUSTOMER) 2. By combining directly related entity types and their
relationships (SELLING_UNIT, ITEM, MATERIAL, MANUFACTURING)
An entity cluster can also be formed by combining a strong entity and its associated weak entity types (not shown here)
Possible entity clusters for Pine Valley Furniture
(PVF)
Related groups of entities could become clusters
EER diagram of PVF entity clusters
More readable, isn’t it?
Business rulesBusiness rules
Statements that define or constrain some aspect of the business. We have already seen some (rules about relationships between entity types [cardinality], supertype/subtype relationships and definitions about attributes and entities). There are 3 main types of business rules:
1. Derivation – rule derived from other knowledge 2. Structural assertion – rule expressing static structure of the
organisation 3. Action assertion – rule expressing constraints/control of
organizational actions Following Fig. Is an EER diagram to describe business rules
EER depiction of business rules
classification
Business rulesBusiness rules
The ER model in the following Fig. Contains 4 entity types: FACULTY, COURSE, SECTION and STUDENT. SECTION is a weak entity type, because it cannot exist without the COURSE entity type
The identifying relationship for SECTION is ‘Is_Scheduled’ and the partial identifier is Section_ID (a composite attribute)
Only selected attributes are shown for the various entity
types to simplify the diagram
Stating a structural assertionStating a structural assertion
Four examples are: 1. A course is a module of instruction in a particular
subject area. This definition of the term course associates the two terms (module and subject area)
2. Student_Name is an attribute of student 3. A student may register for many sections, and a section
may be registered for by many students – this fact states the participation of entity types in a relationship (Is_Registered)
Stating a structural assertionStating a structural assertion
4. A faculty member is an employee of the University. Although not shown in the following Fig., this fact designates a supertype/subtype relationship.
The type of facts above are called base facts, they are fundamental and cannot be derived from other terms or facts
Derived factsDerived facts
A derived fact is a fact that is derived from business rules using an algorithm or inference
A derived attribute is an example of a derived fact Two examples of derived facts follow: 1. Student_GPA (Grade Point Average) =
Quality_Points/Total_Hours_Taken Where Quality_Points = sum [for all courses attempted]
(Credit_Hours*Numerical_Grade) Student_GPA could be shown in the Fig. As a dashed oval
connected to STUDENT
Derived factsDerived facts
2. A student is taught by the faculty assigned to the sections for which the student is registered – this fact can be derived by following the Is_Registered relationship from STUDENT or SECTION and then the Is_Assigned relationship from SECTION to FACULTY
Stating an action assertionStating an action assertion
Action assertions deal with the dynamic aspects of the organisation and impose “must/must not” and “should/should not” constraints on handling data
An action assertion is the property of some business rule (called the anchor object) for a data handling action (such as create, update, delete etc.) and it states how the other business rules (called corresponding objects) act on the anchor object
Stating an action assertionStating an action assertion
Examples of action assertions are: A course (anchor object) must have a course name
(corresponding object). Here the action is updating the course name property of a course
A student (anchor object) must have a value of 2.0 or greater for their Student_GPA (corresponding object) to graduate (action)
A student cannot register for (the anchor object is the Is_Registered relationship) a section of the course for which there is no qualified faculty (the corresponding object is the Is_Qualified relationship)
Types of action assertionsTypes of action assertions
There are 3 ways of classifying action assertions: 1: Based on the type of result from the assertion – 3 sub-
types: I. Condition, which states that if something is true then
another business rule will apply - ‘if a course has a qualified faculty, then the students can register for a section of that course’
II. Integrity constraint, which states something that must always be true – ‘the date a faculty becomes qualified to teach a course cannot be after the semester in which the faculty is assigned to teach a section of that course’
Types of action assertionsTypes of action assertions
III. Authorisation, which states a privilege – ‘only department chairs can qualify a faculty to teach a course’
Types of action assertionsTypes of action assertions
2. Based on the form of the assertion. There are 3 types of assertion:
I. Enabler, which if true permits or leads to the existence of the corresponding object – ‘a faculty can be created once the faculty is qualified to teach at least one course’
II. Timer, which enables/disables or creates/deletes an action – ‘when a student has a GPA above 2.0 and a total credit hours above 125, then the student may graduate’ Here graduating does not occur because of the timer, but the timer enables the action to occur
Types of action assertionsTypes of action assertions
III. Executive, which causes the execution of one or more actions. Can be thought of as a trigger for some action ‘when a student has a GPA below 2.0, then the student goes on academic probation’
Types of action assertionsTypes of action assertions
3. Based on the rigor of the assertion. Looking at action assertions in this way yields 2 types of assertions:
I. Controlling, which state that something must of must not be or happen (above examples fall into this category)
II. Influencing, which are guidelines or items of interest for which a notification must occur – ‘when the number of students exceeds 90% of the capacity of that section, notify the responsible departmental chair’ – in this situation nothing is controlled (students may continue to register) but management wants to know that a particular condition has occurred
Data model segment for class scheduling
Representing and enforcing Representing and enforcing business rulesbusiness rules
The modern approach is to declare action assertions at a conceptual level without specifying how the rule will be implemented
Can do this graphically (EER notation) or using structured grammar
Business rule 1Business rule 1
‘For a faculty member to be assigned to teach a section of a course, they must be qualified to teach the course for which that section is scheduled’
The anchor object is the relationship Is_Assigned, and we are constraining the assignment of the faculty member to the section – the following Fig. Shows a dashed line from Is_Assigned to the action assertion symbol
So in this case there are 2 corresponding objects, represented by the dashed lines from the action assertion symbol to each of the two relationships
Business Rule 1: For a faculty member to be assigned to teach a section of a course, the faculty member must be qualified to teach the course for which that section is scheduled
Action assertion
Anchor object
Corresponding object
Corresponding object
In this case, the action assertion
is a RRestriction
Business rule 2Business rule 2
‘For a faculty member to be assigned to teach a section of the course, the faculty member must not be assigned to teach a total of more than 3 course sections’
This rule imposes a limitation on the total number of sections a faculty member may teach at a given time. Since the rule involves a total, it requires a modification of the previous notation.
As shown in the following Fig., the anchor object is again the relationship Is_Assigned
Business rule 2Business rule 2
However, in this case the corresponding object is also Is_Assigned – it is a count of the total sections assigned to the faculty member
The letters ‘LIM’ in the action assertion symbol stand for ‘limit’
The arrow leaving this symbol then points to a circle with the letter ‘U’ (upper).
The second circle then contains the number 3, which is the upper limit.
Business rule 2Business rule 2
The constraint reads as ‘The corresponding object is a count of the number of sections assigned to the faculty member, which has an upper limit of 3’
If a faculty member is already assigned 3 sections, any transaction that attempts to add another section will be rejected
Business Rule 2: For a faculty member to be assigned to teach a section of a course, the faculty member must not be assigned to teach a total of more than three course sections
Action assertionAnchor object
Corresponding object
In this case, the action assertion is an
UUpper LIMLIMit
Business rules in SQLBusiness rules in SQL
You can implement business rules such as those above using the SQL language, and store the rules as part of the database definitions
One way is to use the CREATE assertion statement, for Business rule 2 we could write:
CREATE ASSERTION Overload_Protect CHECK (SELECT COUNT(*) FROM ASSIGNED WHERE Faculty_ID = ‘12345’) <= 3;
Schema definitionSchema definition The internal schema of a relational database can be
controlled for processing and storage efficiency. Techniques include
Choosing to index primary and/or secondary keys to increase the speed of row selection, table joining, and row ordering. Dropping indexes to increase speed of table updating.
Selecting file organizations for base tables that match the type of processing activity on those tables (e.g. keeping a table sorted by a frequently used reporting sort key)
Schema definitionSchema definition
Selecting file organizations for indexes (which are also tables) appropriate to the way the indexes are used - and allocating extra space for an index file so that an index can grow without having to be reorganised.
Clustering data, so that related rows of frequently joined tables are stored close together in secondary storage to minimise retrieval time
Maintaining statistics about tables and their indexes, so that the DBMS can find the most efficient ways to perform various database operations
Creating indexesCreating indexes
To speed up random/sequential access to base table data. Although users do not directly refer to indexes when writing SQL commands, the DBMS recognises which existing indexes would improve query performance. In this example an alphabetical index on the CUSTOMER_NAME field in the CUSTOMER_T table is created:
CREATE INDEX NAME_IDX ON CUSTOMER_T(CUSTOMER_NAME);
Creating indexesCreating indexes
Indexes may be created or dropped at any time. If data already exists in the key column(s), index population will automatically occur for the existing data
If an index is defined as UNIQUE (CREATE UNIQUE INDEX) and the existing data violate this condition, the index creation will fail
Creating indexesCreating indexes
Once an index is created, it will be updated as data are entered, updated or deleted
When we no longer need an index we can use the DROP statement, for example to remove the index on the customer name in the CUSTOMER table:
DROP INDEX NAME_IDX;
The SELECT StatementThe SELECT Statement Is used for queries on single or multiple tables. It has the following
clauses: SELECT - Lists the columns (and expressions involving columns)
from base tables or views to be projected into the table that will be the result of the command
FROM - Identifies the table(s) or view(s) from which columns will be chosen to appear in the result table, and includes the tables or views needed to join tables to process the query
WHERE - Includes the conditions for row selection within a single table or view, and the conditions between tables or views for joining
SELECTSELECT
The first two are required, and the third is necessary when only certain table rows are to be retrieved, or multiple tables are to be joined
Following Fig. Shows how we can display product name and quantity from the PRODUCT view for all products costing less than $275
SELECT ExampleSELECT Example
Find products with standard price less than $275
SELECT PRODUCT_NAME, STANDARD_PRICE FROM PRODUCT_V WHERE STANDARD_PRICE < 275
Comparison Operators in SQL
Distinct and *Distinct and *
If the user does not want to see duplicate rows in the result, SELECT DISTINCT can be used, so SELECT DISTINCT PRODUCT_NAME would display a results table without duplicate rows
SELECT * (where * is a wildcard to indicate all columns) displays all columns from all the tables or views in the FROM clause
Using expressionsUsing expressions
These are mathematical manipulations of the data in the table
Some are stored functions, such as SUM or AVG e.g. what is the average standard price for each product in
inventory (to be put in the result AVERAGE)? SELECT AVG (STANDARD_PRICE, AS AVERAGE FROM PRODUCT_V;
Using functionsUsing functions Function such as COUNT, MIN, MAX, SUM an AVG of specified
columns in the column list of a SELECT command may be used to specify that the resulting answer table is to contain aggregated data instead of row-level data. Using any of these aggregate functions will give a one-row answer. e.g. using the COUNT aggregate function to find totals, asking how many different items were ordered on order number 1004
SELECT COUNT(*) FROM ORDER_LINE_V WHERE ORDER_ID = 1004;
Note: with aggregate functions you can’t have single-valued columns included in the SELECT clause
Using functionsUsing functions
It is easy to confuse the functions COUNT(*) and COUNT COUNT(*) counts all rows selected by a query, regardless
of whether any of the rows contain null values COUNT tallies only those rows that contain a value, it
ignores null values SUM and AVG can only be used with numeric columns COUNT, COUNT(*), MIN and MAX can be used with
any data type. e.g. using MIN on a text column will find the lowest value in the column - the one which is closest to the beginning of the alphabet
Using wildcardsUsing wildcards
Wildcards can also be used in the WHERE clause if an exact match is not possible
Here the keyword LIKE is paired with wildcard characters and usually a string containing the characters that are known to be the desired matches
The wildcard character ‘%’ is used to represent any collection of characters
e.g. using LIKE ‘%DESK’ when searching PRODUCT_DESCRIPTION will find all the different kinds of desks
Using wildcardsUsing wildcards
The underscore ,_, is used to represent exactly once character
So using LIKE ‘_-drawer’ when searching PRODUCT_NAME will find any products with specified drawers, such as ‘3-drawer’, ‘5-drawer’ etc.
Using Boolean operatorsUsing Boolean operators AND, OR, and NOT Operators for customizing conditions in
WHERE clause. NOT is evaluated first, then AND, then ORThe following query lists product name, finish and unit price for all
desks and all tables that cost more than $300 in the PRODUCT view:
SELECT PRODUCT_DESCRIPTION, PRODUCT_FINISH, STANDARD_PRICE
FROM PRODUCT_V WHERE (PRODUCT_DESCRIPTION LIKE ‘%Desk’ OR PRODUCT_DESCRIPTION LIKE ‘%Table’) AND UNIT_PRICE > 300;
In and not in listsIn and not in lists
To match a list of values, use IN e.g. list all customers who live in the warmer American
states: SELECT CUSTOMER_NAME, CITY, STATE FROM CUSTOMER_V WHERE STATE IN (‘FL’, ’TX’, ‘CA’, ‘HI’);
Sorting resultsSorting results
• ORDER BY - sorts the final result rows in ascending or descending order
• GROUP BY - groups rows in an intermediate results table where the values in those rows are the same for one or more columns
• HAVING - can only be used following GROUP BY and acts as a secondary WHERE clause, returning only those groups which meet a specified condition
ORDER BYORDER BY
e.g. list customer, city and state for all customers in the CUSTOMER view whose address is in Florida, Texas, California or Hawaii. List the customers alphabetically by state, and alphabetically by customer within each state: i.e. sort the results first by STATE, and within a state by CUSTOMER_NAME:
SELECT CUSTOMER_NAME, CITY, STATE FROM CUSTOMER_V WHERE STATE IN (‘FL’, ‘TX’, ‘CA’, ‘HI’) ORDER BY STATE, CUSTOMER_NAME;
GROUP BY GROUP BY
Particularly useful when paired with aggregate functions such as SUM or COUNT
GROUP BY divides a table into subsets (by groups). Then an aggregate function can be used to provide summary information for that group
Scalar aggregate: a single value returned from SQL query using an aggregate function
Vector aggregate: multiple values returned from SQL query with aggregate function (via GROUP BY)
GROUP BYGROUP BY
e.g. count the number of customers with addresses in each state to which we deliver:
SELECT STATE, COUNT(STATE)
FROM CUSTOMER_V
GROUP BY STATE;
GROUP BYGROUP BY
It is also possible to nest groups within groups, the same logic is used as when sorting multiple items
e.g. count the number of customers with addresses in each city to which we deliver. List the cities by state:
SELECT STATE, CITY, COUNT(CITY) FROM CUSTOMER_V GROUP BY STATE, CITY; In general, each column referenced in the SELECT statement
must be referenced in the GROUP BY clause, unless the column is an argument for an aggregate function included in the SELECT clause
HAVINGHAVING
The HAVING clause acts like a WHERE clause, but it identifies groups that meet a criterion, rather than rows. Therefore we usually see a HAVING clause following a GROUP BY
e.g. find only states with more than one customer:SELECT STATE, COUNT(STATE) FROM CUSTOMER_V GROUP BY STATE HAVING COUNT(STATE) > 1;Using WHERE here would not work, because WHERE
does not allow aggregates - also WHERE qualifies rows, whereas HAVING qualifies groups
Using all 6 clausesUsing all 6 clauses
e.g. list the product finish and average standard price for each finish for selected finishes where the average standard price is less than $750:
SELECT PRODUCT_FINISH, AVG(STANDARD_PRICE)
FROM PRODUCT_V WHERE PRODUCT_FINISH IN(‘Cherry’, ‘Natural Ash’, ‘Natural Maple’, ‘White Ash’)
GROUP BY PRODUCT_FINISH HAVING AVG(STANDARD_PRICE) < 750 ORDER BY PRODUCT_FINISH;
Using all 6 clausesUsing all 6 clauses
The following Fig. Shows the order in which SQL processes the clauses of a statement
Arrows indicate the paths that may or may not be followed Only the SELECT and FROM clauses are mandatory The processing order is different from the syntax order As each clause is processed an intermediate results table is
produced that will be used for the next clause (users do not see the intermediate tables, only the final results)
SQL statement processing order
top related