IS 257 - Fall 2002 2002.10.10- SLIDE 1
Relational Algebra and Calculus: Introduction to SQL
University of California, Berkeley
School of Information Management and Systems
SIMS 257: Database Management
IS 257 - Fall 2002 2002.10.10- SLIDE 2
Lecture Outline
• Review
  – Design to Relational Implementation
• Relational Operations
• Relational Algebra
• Relational Calculus
• Introduction to SQL
IS 257 - Fall 2002 2002.10.10- SLIDE 4
Database Design Process
[Figure: boxes labeled Conceptual requirements and External Model for each of Applications 1 through 4, together with the Conceptual Model, the Logical Model, and the Internal Model.]
IS 257 - Fall 2002 2002.10.10- SLIDE 5
Cookie ER Diagram
[Figure: ER diagram. Entities: BIBFILE, LIBFILE, INDXFILE, SUBFILE, CALLFILE, PUBFILE, AUTHORS, AU_BIB. Linking attributes: accno, pubid, libid, subcode, AU_ID, Author.]
Note: diagram contains only attributes used for linking
IS 257 - Fall 2002 2002.10.10- SLIDE 6
What Problems?
• What sorts of problems and missing features arise given the previous ER diagram?
IS 257 - Fall 2002 2002.10.10- SLIDE 7
Problems Identified
• Subtitles, parallel titles?
• Edition information
• Series information
• Lending status
• Material type designation
• Genre, class information
• Better codes (ISBN?)
• Missing information (ISBN)
• Authority control for authors
• Missing/incomplete data
• Data entry problems
• Ordering information
• Illustrations
• Subfield separation (such as last_name, first_name)
• Separate personal and corporate authors
IS 257 - Fall 2002 2002.10.10- SLIDE 8
Problems (Cont.)
• Location field inconsistent
• No notes field
• No language field
• Zipcode doesn’t support plus-4
• No publisher shipping addresses
• No (indexable) keyword search capability
• No support for multivolume works
• No support for URLs
  – to online version
  – to libraries
  – to publishers
IS 257 - Fall 2002 2002.10.10- SLIDE 9
Original Cookie ER Diagram
[Figure: ER diagram. Entities: BIBFILE, LIBFILE, INDXFILE, SUBFILE, CALLFILE, PUBFILE, AUTHORS, AU_BIB. Linking attributes: accno, pubid, libid, subcode, AU_ID, Author.]
Note: diagram contains only attributes used for linking
IS 257 - Fall 2002 2002.10.10- SLIDE 10
Cookie2: Separate Name Authorities
[Figure: ER diagram. Entities: BIBFILE, LIBFILE, INDXFILE, SUBFILE, CALLFILE, PUBFILE, AUTHFILE, AUTHBIB. Attributes shown: accno, pubid, libid, subcode, nameid, authtype, name.]
IS 257 - Fall 2002 2002.10.10- SLIDE 11
Cookie 3: Keywords
[Figure: ER diagram. Entities: BIBFILE, LIBFILE, INDXFILE, SUBFILE, CALLFILE, PUBFILE, AUTHFILE, AUTHBIB, KEYMAP, TERMS. Attributes shown: accno, pubid, libid, subcode, nameid, authtype, name, termid.]
IS 257 - Fall 2002 2002.10.10- SLIDE 12
Cookie 4: Series
[Figure: ER diagram. Entities: BIBFILE, LIBFILE, INDXFILE, SUBFILE, CALLFILE, PUBFILE, AUTHFILE, AUTHBIB, KEYMAP, TERMS, SERIES. Attributes shown: accno, pubid, libid, subcode, nameid, authtype, name, termid, seriesid, ser_title.]
IS 257 - Fall 2002 2002.10.10- SLIDE 13
Cookie 5: Circulation
[Figure: ER diagram. Entities: BIBFILE, LIBFILE, INDXFILE, SUBFILE, CALLFILE, PUBFILE, AUTHFILE, AUTHBIB, KEYMAP, TERMS, SERIES, CIRC, PATRON. Attributes shown: accno, pubid, libid, subcode, nameid, authtype, name, termid, seriesid, ser_title, circid, copynum, patronid.]
IS 257 - Fall 2002 2002.10.10- SLIDE 14
Logical Model: Mapping to Relations
• Take each entity
  – BIBFILE
  – LIBFILE
  – CALLFILE
  – SUBFILE
  – PUBFILE
  – INDXFILE
• And make it a table...
IS 257 - Fall 2002 2002.10.10- SLIDE 16
Relational Algebra Operations
• Select
• Project
• Product
• Union
• Intersect
• Difference
• Join
• Divide
IS 257 - Fall 2002 2002.10.10- SLIDE 17
Select
• Extracts specified tuples (rows) from a specified relation (table).
IS 257 - Fall 2002 2002.10.10- SLIDE 18
Project
• Extracts specified attributes (columns) from a specified relation.
IS 257 - Fall 2002 2002.10.10- SLIDE 19
Product
• Builds a relation from two specified relations consisting of all possible concatenated pairs of tuples, one from each of the two relations. (AKA Cartesian Product)
[Figure: Product]
  First relation:  a, b, c
  Second relation: x, y
  Product:         (a, x), (a, y), (b, x), (b, y), (c, x), (c, y)
IS 257 - Fall 2002 2002.10.10- SLIDE 20
Union
• Builds a relation consisting of all tuples appearing in either or both of two specified relations.
IS 257 - Fall 2002 2002.10.10- SLIDE 21
Intersect
• Builds a relation consisting of all tuples appearing in both of two specified relations.
IS 257 - Fall 2002 2002.10.10- SLIDE 22
Difference
• Builds a relation consisting of all tuples appearing in the first relation but not the second.
IS 257 - Fall 2002 2002.10.10- SLIDE 23
Join
• Builds a relation from two specified relations consisting of all possible concatenated pairs, one from each of the two relations, such that in each pair the two tuples satisfy some condition. (E.g., equal values in a given col.)
[Figure: (Natural or Inner) Join]
  First relation:  (A1, B1), (A2, B1), (A3, B2)
  Second relation: (B1, C1), (B2, C2), (B3, C3)
  Join:            (A1, B1, C1), (A2, B1, C1), (A3, B2, C2)
IS 257 - Fall 2002 2002.10.10- SLIDE 24
Outer Join
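• Outer Joins are similar to JOIN, but they keep every row of the first table and leave NULLs for any row in the first table with no corresponding rows in the second.
[Figure: Outer Join (* marks a NULL)]
  First relation:  (A1, B1), (A2, B1), (A3, B2), (A4, B7)
  Second relation: (B1, C1), (B2, C2), (B3, C3)
  Outer Join:      (A1, B1, C1), (A2, B1, C1), (A3, B2, C2), (A4, *, *)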
IS 257 - Fall 2002 2002.10.10- SLIDE 25
Divide
• Takes two relations, one binary and one unary, and builds a relation consisting of all values of one attribute of the binary relation that match (in the other attribute) all values in the unary relation.
[Figure: Divide]
  Binary relation: (a, x), (a, y), (a, z), (b, x), (c, y)
  Unary relation:  x, y
  Result:          a
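Divide is the least intuitive of the operations, so a small sketch may help. The function below is one possible set-based way to compute it, not a standard library routine; the name `divide` and the sample pairs (read off the slide's figure, whose exact column pairing is an assumption) are illustrative only.

```python
def divide(binary, unary):
    """Return every first-attribute value paired with ALL values in `unary`."""
    unary = set(unary)
    # Collect, for each first-attribute value, the second-attribute values it is paired with.
    paired = {}
    for first, second in binary:
        paired.setdefault(first, set()).add(second)
    # Keep the values whose collected set covers the whole unary relation.
    return {first for first, seconds in paired.items() if unary <= seconds}


# Assumed reading of the figure: only "a" is paired with both x and y.
binary = {("a", "x"), ("a", "y"), ("a", "z"), ("b", "x"), ("c", "y")}
unary = {"x", "y"}
quotient = divide(binary, unary)
print(quotient)
```

Here `b` and `c` are dropped because each is paired with only one of the two required values.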
IS 257 - Fall 2002 2002.10.10- SLIDE 26
ER Diagram: Acme Widget Co.
[Figure: ER diagram. Entities: Part, Customer, Invoice, Line-Item, Sales-Rep, Hourly, Employee. Relationships: Contains, Orders, Writes, ISA. Attributes shown: Part#, Count, Price, Quantity, Cust#, Invoice#, Sales, Rep#, Emp#, Wage.]
IS 257 - Fall 2002 2002.10.10- SLIDE 27
Employee
SSN          Lastname   Firstname  Middlename  Birthdate  Address
123-76-3423  Jones      Janet      Mary        6/25/1963  234 State
342-88-7865  Smith      Thomas     Frederick   8/4/1970   12 Lambert
486-87-6543  Hendersen  Charles    Robert      9/23/1961  44 Central
843-36-7659  Martinez   Roberto    Garcia      7/8/1958   76 Highland
IS 257 - Fall 2002 2002.10.10- SLIDE 28
Part
Part #  Name                Price   Count
1       Big blue widget       3.76      2
2       Small blue Widget     7.35      4
3       Tiny red widget       5.25      7
4       large red widget    157.23     23
5       double widget rack   10.44     12
6       Small green Widget   30.45     58
7       Big yellow widget     7.96      1
8       Tiny orange widget   81.75     42
9       Big purple widget    55.99      9
IS 257 - Fall 2002 2002.10.10- SLIDE 29
Sales-Rep
SSN          Rep #  Sales
123-76-3423  1      $12,345.45
843-36-7659  2      $231,456.75

Hourly
SSN          Wage
342-88-7865  $12.75
486-87-6543  $20.50
IS 257 - Fall 2002 2002.10.10- SLIDE 30
Customer
Cust #  COMPANY                    STREET1                    STREET2                 CITY         STATE  ZIPCODE
1       Integrated Standards Ltd.  35 Broadway                Floor 12                New York     NY     02111
2       MegaInt Inc.               34 Bureaucracy Plaza       Floors 1-172            Phildelphia  PA     03756
3       Cyber Associates           3 Control Elevation Place  Cyber Assicates Center  Cyberoid     NY     08645
4       General Consolidated       35 Libra Plaza                                     Nashua       NH     09242
5       Consolidated MultiCorp     1 Broadway                                         Middletown   IN     32467
6       Internet Behometh Ltd.     88 Oligopoly Place                                 Sagrado      TX     78798
7       Consolidated Brands, Inc.  3 Independence Parkway                             Rivendell    CA     93456
8       Little Mighty Micro        34 Last One Drive                                  Orinda       CA     94563
9       SportLine Ltd.             38 Champion Place          Suite 882               Compton      CA     95328
IS 257 - Fall 2002 2002.10.10- SLIDE 31
Invoice
Invoice #  Cust #  Rep #
93774      3       1
84747      4       1
88367      5       2
88647      9       1
776879     2       2
65689      6       2
IS 257 - Fall 2002 2002.10.10- SLIDE 32
Line-Item
Invoice #  Part #  Quantity
93774      3       10
84747      23      1
88367      75      2
88647      4       3
776879     22      5
65689      76      12
93774      23      10
88367      34      2
IS 257 - Fall 2002 2002.10.10- SLIDE 33
Join Items
[Figure: the Part, Line-Item, Invoice, and Customer tables from the preceding slides, repeated together on one slide.]
IS 257 - Fall 2002 2002.10.10- SLIDE 35
Relational Algebra
• What is the name of the customer who ordered Large Red Widgets?
  – Select “large red widget” from Part as temp1
  – Join temp1 with Line-Item on Part # as temp2
  – Join temp2 with Invoice on Invoice # as temp3
  – Join temp3 with Customer on Cust # as temp4
  – Project the customer name (COMPANY) from temp4
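The same select, join, join, join, project sequence can be sketched as a single SQL query. This is a minimal illustration using Python's built-in `sqlite3` module and a few rows copied from the Acme tables; the lower-case table and column names (`part_no`, `cust_no`, and so on) are assumptions made to avoid the spaces and `#` signs in the slide headings.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE part      (part_no INTEGER PRIMARY KEY, name TEXT, price REAL);
    CREATE TABLE line_item (invoice_no INTEGER, part_no INTEGER, quantity INTEGER);
    CREATE TABLE invoice   (invoice_no INTEGER PRIMARY KEY, cust_no INTEGER, rep_no INTEGER);
    CREATE TABLE customer  (cust_no INTEGER PRIMARY KEY, company TEXT);
    INSERT INTO part VALUES (3, 'Tiny red widget', 5.25), (4, 'large red widget', 157.23);
    INSERT INTO line_item VALUES (93774, 3, 10), (88647, 4, 3);
    INSERT INTO invoice VALUES (93774, 3, 1), (88647, 9, 1);
    INSERT INTO customer VALUES (3, 'Cyber Associates'), (9, 'SportLine Ltd.');
""")

rows = conn.execute("""
    SELECT c.company                          -- project
    FROM part p, line_item l, invoice i, customer c
    WHERE p.name = 'large red widget'         -- select (temp1)
      AND p.part_no = l.part_no               -- join   (temp2)
      AND l.invoice_no = i.invoice_no         -- join   (temp3)
      AND i.cust_no = c.cust_no               -- join   (temp4)
""").fetchall()
print(rows)
```

With the sample rows above, the only matching line item is on invoice 88647, which belongs to customer 9.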
IS 257 - Fall 2002 2002.10.10- SLIDE 37
Relational Calculus
• Relational Algebra provides a set of explicit operations (select, project, join, etc.) that can be used to build some desired relation from the database.
• Relational Calculus provides a notation for formulating the definition of that desired relation in terms of the relations in the database, without explicitly stating the operations to be performed.
• SQL is based on the relational calculus.
IS 257 - Fall 2002 2002.10.10- SLIDE 39
SQL
• Structured Query Language
• Database Definition and Querying
• Basic language is standardized across relational DBMSs. Each system may have proprietary extensions to standard.
• Relational Calculus combines Select, Project and Join operations in a single command: SELECT.
IS 257 - Fall 2002 2002.10.10- SLIDE 40
SQL - History
• Structured Query Language
• SEQUEL from IBM San Jose
• ANSI 1992 Standard is the version used by most DBMS today (SQL92)
• Basic language is standardized across relational DBMSs. Each system may have proprietary extensions to standard.
IS 257 - Fall 2002 2002.10.10- SLIDE 41
SQL99
• In 1999, SQL99 – also known as SQL3 – was adopted and contains the following eight parts:
  – The SQL/Framework (75 pages)
  – SQL/Foundation (1100 pages)
  – SQL/Call Level Interface (400 pages)
  – SQL/Persistent Stored Modules (PSM) (160 pages)
  – SQL/Host Language Bindings (250 pages)
  – SQL Transactions (??)
  – SQL Temporal objects (??)
  – SQL Objects (??)
• Designed to be compatible with SQL92
IS 257 - Fall 2002 2002.10.10- SLIDE 42
SQL99
• The SQL/Framework --SQL basic concepts and general requirements.
• SQL/Call Level Interface (CLI) -- An API for SQL. This is similar to ODBC.
• SQL/Foundation --The syntax and SQL operations that are the basis for the language.
IS 257 - Fall 2002 2002.10.10- SLIDE 43
SQL99
• SQL/Persistent Stored Modules (PSM) --Defines the rules for developing SQL routines, modules, and functions such as those used by stored procedures and triggers. This is implemented in many major RDBMSs through proprietary, nonportable languages, but for the first time we have a standard for writing procedural code that is transportable across databases.
IS 257 - Fall 2002 2002.10.10- SLIDE 44
SQL99
• SQL/Host Language Bindings --Define ways to code embedded SQL in standard programming languages. This simplifies the approach used by CLIs and provides performance enhancements.
• SQL Transactions --Transactional support for RDBMSs.
• SQL Temporal objects --Deal with Time-based data.
• SQL Objects --The new Object-Relational features, which represent the largest and most important enhancements to this new standard.
IS 257 - Fall 2002 2002.10.10- SLIDE 45
SQL99 Data Types
[Figure: type hierarchy. SQL Data Types: Predefined Types, Ref Types, Arrays, ROW (Data Struct), User-Defined Types. Predefined Types: Numeric (Exact, Approximate), String (Bit, Character, Blob, with Fixed, Varying, and CLOB variants), DateTime (Date, Time, Timestamp), Interval, Boolean. Some types are marked "NEW IN SQL99".]
IS 257 - Fall 2002 2002.10.10- SLIDE 46
SQL Uses
• Database Definition and Querying
  – Can be used as an interactive query language
  – Can be embedded in programs
• Relational Calculus combines Select, Project and Join operations in a single command: SELECT
IS 257 - Fall 2002 2002.10.10- SLIDE 47
SELECT
• Syntax:
  – SELECT [DISTINCT] attr1, attr2,…, attr3
    FROM rel1 r1, rel2 r2,… rel3 r3
    WHERE condition1 {AND | OR} condition2
    ORDER BY attr1 [DESC], attr3 [DESC]
IS 257 - Fall 2002 2002.10.10- SLIDE 48
SELECT
• Example:
  – SELECT a.author, b.title
    FROM authors a, bibfile b, au_bib c
    WHERE a.AU_ID = c.AU_ID AND c.accno = b.accno
    ORDER BY a.author;
• Examples in Access...
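For readers without Access, the author/title query can be tried with Python's built-in `sqlite3` module. This is only a sketch: the three tables follow the names in the query, but the rows are invented, and `b.title` is added to the ORDER BY purely so the output order is deterministic.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (au_id INTEGER PRIMARY KEY, author TEXT);
    CREATE TABLE bibfile (accno INTEGER PRIMARY KEY, title TEXT);
    CREATE TABLE au_bib  (au_id INTEGER, accno INTEGER);  -- links authors to books
    -- Invented sample rows: book 10 has two authors, author 2 has two books.
    INSERT INTO authors VALUES (1, 'Smith, A.'), (2, 'Jones, B.');
    INSERT INTO bibfile VALUES (10, 'First Title'), (11, 'Second Title');
    INSERT INTO au_bib  VALUES (1, 10), (2, 10), (2, 11);
""")

rows = conn.execute("""
    SELECT a.author, b.title
    FROM authors a, bibfile b, au_bib c
    WHERE a.au_id = c.au_id AND c.accno = b.accno
    ORDER BY a.author, b.title
""").fetchall()

for author, title in rows:
    print(author, "|", title)
```

The linking table `au_bib` is what lets one book appear under several authors and one author under several books.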
IS 257 - Fall 2002 2002.10.10- SLIDE 49
SELECT Conditions
• = equal to a particular value
• >= greater than or equal to a particular value
• > greater than a particular value
• <= less than or equal to a particular value
• < less than a particular value
• <> not equal to a particular value
• LIKE "*term*" (may be other wild cards in other systems)
• IN ("opt1", "opt2", …, "optn")
• BETWEEN val1 AND val2
• IS NULL
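A few of these conditions are sketched below with Python's built-in `sqlite3` module. Note that SQLite, like standard SQL, uses `%` as the LIKE wildcard where Access uses `*`. The `color` column and the helper `part_numbers` are hypothetical additions for the example; the names and prices come from the Part table.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE part (part_no INTEGER PRIMARY KEY, name TEXT, price REAL, color TEXT);
    INSERT INTO part VALUES
        (1, 'Big blue widget',     3.76, 'blue'),
        (3, 'Tiny red widget',     5.25, 'red'),
        (5, 'double widget rack', 10.44, NULL),
        (9, 'Big purple widget',  55.99, 'purple');
""")


def part_numbers(condition):
    """Return the part numbers matching a WHERE condition (hypothetical helper)."""
    sql = "SELECT part_no FROM part WHERE " + condition + " ORDER BY part_no"
    return [row[0] for row in conn.execute(sql)]


like_rows = part_numbers("name LIKE '%red%'")             # wildcard match
in_rows = part_numbers("color IN ('blue', 'purple')")     # membership in a list
between_rows = part_numbers("price BETWEEN 5 AND 11")     # inclusive range
greater_rows = part_numbers("price > 10")                 # simple comparison
null_rows = part_numbers("color IS NULL")                 # missing value
print(like_rows, in_rows, between_rows, greater_rows, null_rows)
```

`IS NULL` is needed for the rack because a comparison such as `color = NULL` never evaluates to true.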
IS 257 - Fall 2002 2002.10.10- SLIDE 50
Relational Algebra Selection using SELECT
• Syntax:
  – SELECT * FROM rel1 WHERE condition1 {AND | OR} condition2;
IS 257 - Fall 2002 2002.10.10- SLIDE 51
Relational Algebra Projection using SELECT
• Syntax:
  – SELECT [DISTINCT] attr1, attr2,…, attr3
    FROM rel1 r1, rel2 r2,… rel3 r3;
IS 257 - Fall 2002 2002.10.10- SLIDE 52
Relational Algebra Join using SELECT
• Syntax:
  – SELECT * FROM rel1 r1, rel2 r2
    WHERE r1.linkattr = r2.linkattr;
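The three algebra operations expressed through SELECT can be compared side by side. The sketch below uses Python's built-in `sqlite3` module with the Invoice rows and three of the Line-Item rows from the Acme example; the lower-case column names are assumptions, and the ORDER BY clauses are added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE invoice   (invoice_no INTEGER PRIMARY KEY, cust_no INTEGER, rep_no INTEGER);
    CREATE TABLE line_item (invoice_no INTEGER, part_no INTEGER, quantity INTEGER);
    INSERT INTO invoice VALUES
        (93774, 3, 1), (84747, 4, 1), (88367, 5, 2),
        (88647, 9, 1), (776879, 2, 2), (65689, 6, 2);
    INSERT INTO line_item VALUES (93774, 3, 10), (93774, 23, 10), (88367, 75, 2);
""")

# Selection: keep whole rows that satisfy a condition.
selection = conn.execute(
    "SELECT * FROM invoice WHERE rep_no = 2 ORDER BY invoice_no"
).fetchall()

# Projection: keep chosen columns; DISTINCT removes duplicate rows.
projection = conn.execute(
    "SELECT DISTINCT rep_no FROM invoice ORDER BY rep_no"
).fetchall()

# Join: pair rows from two tables on a shared linking attribute.
join = conn.execute("""
    SELECT i.invoice_no, i.cust_no, l.part_no
    FROM invoice i, line_item l
    WHERE i.invoice_no = l.invoice_no
    ORDER BY i.invoice_no, l.part_no
""").fetchall()

print(selection, projection, join, sep="\n")
```

Without DISTINCT, the projection would return one `rep_no` per invoice row, six values instead of two.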
IS 257 - Fall 2002 2002.10.10- SLIDE 53
Sorting
• SELECT BIOLIFE.[Common Name], BIOLIFE.[Length (cm)]
FROM BIOLIFE
ORDER BY BIOLIFE.[Length (cm)] DESC;
Note: the square brackets are not part of the standard, but are used in Access for names with embedded blanks
IS 257 - Fall 2002 2002.10.10- SLIDE 54
Subqueries
• SELECT SITES.[Site Name], SITES.[Destination no]
FROM SITES
WHERE sites.[Destination no] IN (SELECT [Destination no] from DEST where [avg temp (f)] >= 78);
• Can be used as a form of JOIN.
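To illustrate the point that a subquery can stand in for a join, the sketch below runs both forms against the same data using Python's built-in `sqlite3` module. The table and column names are simplified versions of the Access ones (underscores instead of embedded blanks), and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dest  (destination_no INTEGER PRIMARY KEY, destination_name TEXT, avg_temp_f REAL);
    CREATE TABLE sites (site_name TEXT, destination_no INTEGER);
    -- Invented sample rows.
    INSERT INTO dest  VALUES (1, 'Warm Bay', 82.0), (2, 'Cold Cove', 60.0);
    INSERT INTO sites VALUES ('Reef A', 1), ('Reef B', 1), ('Wreck C', 2);
""")

# Subquery form: the inner SELECT produces the list tested by IN.
with_subquery = conn.execute("""
    SELECT site_name, destination_no
    FROM sites
    WHERE destination_no IN
          (SELECT destination_no FROM dest WHERE avg_temp_f >= 78)
    ORDER BY site_name
""").fetchall()

# Join form: the same restriction written as a join condition.
with_join = conn.execute("""
    SELECT s.site_name, s.destination_no
    FROM sites s, dest d
    WHERE s.destination_no = d.destination_no AND d.avg_temp_f >= 78
    ORDER BY s.site_name
""").fetchall()

print(with_subquery)
print(with_join)
```

The two forms agree here because `destination_no` is unique in `dest`; if the inner table could match a site more than once, the join form would repeat that site while the IN form would not.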
IS 257 - Fall 2002 2002.10.10- SLIDE 55
Aggregate Functions
• Count
• Avg
• SUM
• MAX
• MIN
• Others may be available in different systems
IS 257 - Fall 2002 2002.10.10- SLIDE 56
Using Aggregate functions
• SELECT attr1, Sum(attr2) AS name FROM tab1, tab2 ...
GROUP BY attr1, attr3 HAVING condition;
IS 257 - Fall 2002 2002.10.10- SLIDE 57
Using an Aggregate Function
• SELECT DIVECUST.Name, Sum([Price]*[qty]) AS Total
FROM (DIVECUST INNER JOIN DIVEORDS ON DIVECUST.[Customer No] = DIVEORDS.[Customer No]) INNER JOIN DIVEITEM ON DIVEORDS.[Order No] = DIVEITEM.[Order No]
GROUP BY DIVECUST.Name HAVING (((DIVECUST.Name) Like "*Jazdzewski"));
IS 257 - Fall 2002 2002.10.10- SLIDE 58
GROUP BY
• SELECT DEST.[Destination Name], Count(*) AS Expr1
FROM DEST INNER JOIN DIVEORDS ON DEST.[Destination Name] = DIVEORDS.Destination
GROUP BY DEST.[Destination Name]
HAVING ((Count(*))>1);
• Provides a list of Destinations that have more than one order, with the number of orders going to each destination
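The effect of the HAVING clause is easiest to see by running the grouped query with and without it. This sketch uses Python's built-in `sqlite3` module; the table layouts are cut down to the columns the query needs, the names use underscores instead of blanks, and the rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE dest     (destination_name TEXT PRIMARY KEY);
    CREATE TABLE diveords (order_no INTEGER PRIMARY KEY, destination TEXT);
    -- Invented sample rows: two orders to Warm Bay, one to Cold Cove.
    INSERT INTO dest     VALUES ('Warm Bay'), ('Cold Cove');
    INSERT INTO diveords VALUES (1, 'Warm Bay'), (2, 'Warm Bay'), (3, 'Cold Cove');
""")

grouped_sql = """
    SELECT d.destination_name, COUNT(*) AS order_count
    FROM dest d INNER JOIN diveords o ON d.destination_name = o.destination
    GROUP BY d.destination_name
"""

# All destinations with their order counts.
all_counts = conn.execute(grouped_sql + " ORDER BY d.destination_name").fetchall()

# HAVING filters the groups after counting, keeping only busy destinations.
busy_only = conn.execute(
    grouped_sql + " HAVING COUNT(*) > 1 ORDER BY d.destination_name"
).fetchall()

print(all_counts)
print(busy_only)
```

WHERE filters individual rows before grouping, whereas HAVING filters whole groups after the aggregate has been computed.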
IS 257 - Fall 2002 2002.10.10- SLIDE 59
Create Table
• CREATE TABLE table-name (attr1 attr-type PRIMARY KEY, attr2 attr-type,…,attrN attr-type);
• Adds a new table with the specified attributes (and types) to the database.
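A concrete CREATE TABLE statement, sketched with Python's built-in `sqlite3` module. The `patron` table and its columns are hypothetical, and the type names are SQLite's rather than those of Access or Oracle; `PRAGMA table_info` is a SQLite-specific way to inspect the result.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: one key column and two text attributes.
conn.execute("""
    CREATE TABLE patron (
        patronid INTEGER PRIMARY KEY,
        name     TEXT,
        zipcode  TEXT
    )
""")

# Each row of table_info is (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2], row[5]) for row in conn.execute("PRAGMA table_info(patron)")]
for name, col_type, is_key in columns:
    print(name, col_type, "PRIMARY KEY" if is_key else "")
```

Storing the zipcode as text rather than a number preserves leading zeros and leaves room for a plus-4 extension.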
IS 257 - Fall 2002 2002.10.10- SLIDE 60
Access Data Types
• Numeric (1, 2, 4, 8 bytes, fixed or float)
• Text (255 max)
• Memo (64000 max)
• Date/Time (8 bytes)
• Currency (8 bytes, 15 digits + 4 digits decimal)
• Autonumber (4 bytes)
• Yes/No (1 bit)
• OLE (limited only by disk space)
• Hyperlinks (up to 64000 chars)
IS 257 - Fall 2002 2002.10.10- SLIDE 61
Access Numeric types
• Byte
  – Stores numbers from 0 to 255 (no fractions). 1 byte
• Integer
  – Stores numbers from –32,768 to 32,767 (no fractions). 2 bytes
• Long Integer (Default)
  – Stores numbers from –2,147,483,648 to 2,147,483,647 (no fractions). 4 bytes
• Single
  – Stores numbers from –3.402823E38 to –1.401298E–45 for negative values and from 1.401298E–45 to 3.402823E38 for positive values. 4 bytes
• Double
  – Stores numbers from –1.79769313486231E308 to –4.94065645841247E–324 for negative values and from 4.94065645841247E–324 to 1.79769313486231E308 for positive values. 15 digits of decimal precision. 8 bytes
• Replication ID
  – Globally unique identifier (GUID). Decimal precision N/A. 16 bytes
IS 257 - Fall 2002 2002.10.10- SLIDE 62
Oracle Data Types
• CHAR (size) -- max 2000
• VARCHAR2(size) -- up to 4000
• DATE
• DECIMAL, FLOAT, INTEGER, INTEGER(s), SMALLINT, NUMBER, NUMBER(size,d)
  – All numbers internally in same format…
• LONG, LONG RAW, LONG VARCHAR
  – up to 2 Gb -- only one per table
• BLOB, CLOB, NCLOB -- up to 4 Gb
• BFILE -- file pointer to binary OS file
IS 257 - Fall 2002 2002.10.10- SLIDE 63
Creating a new table from existing tables
• Syntax:
  – SELECT [DISTINCT] attr1, attr2,…, attr3
    INTO newtablename
    FROM rel1 r1, rel2 r2,… rel3 r3
    WHERE condition1 {AND | OR} condition2
    ORDER BY attr1 [DESC], attr3 [DESC]
IS 257 - Fall 2002 2002.10.10- SLIDE 64
Alter Table
• ALTER TABLE table-name ADD COLUMN attr1 attr-type;
• … DROP COLUMN attr1;
• ADD COLUMN adds a new column to an existing database table; DROP COLUMN removes one.
IS 257 - Fall 2002 2002.10.10- SLIDE 65
INSERT
• INSERT INTO table-name (attr1, attr4, attr5,…, attrK) VALUES ("val1", val4, val5,…, "valK");
• Adds new row(s) to a table.
• INSERT INTO table-name (attr1, attr4, attr5,…, attrK) SELECT ...
  – The rows returned by the SELECT are inserted; the VALUES keyword is not used in this form.
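Both forms of INSERT can be sketched with Python's built-in `sqlite3` module. The `cheap_part` table is a hypothetical target invented for the example; the part rows come from the Part table. The `?` placeholders are the `sqlite3` module's parameter syntax, not part of the SQL shown on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE part       (part_no INTEGER PRIMARY KEY, name TEXT, price REAL);
    CREATE TABLE cheap_part (part_no INTEGER PRIMARY KEY, name TEXT);
""")

# Form 1: INSERT ... VALUES adds one row per statement execution.
insert_sql = "INSERT INTO part (part_no, name, price) VALUES (?, ?, ?)"
conn.execute(insert_sql, (1, "Big blue widget", 3.76))
conn.executemany(insert_sql, [(4, "large red widget", 157.23), (3, "Tiny red widget", 5.25)])

# Form 2: INSERT ... SELECT copies the rows produced by a query.
conn.execute("""
    INSERT INTO cheap_part (part_no, name)
    SELECT part_no, name FROM part WHERE price < 10
""")

cheap = conn.execute("SELECT part_no, name FROM cheap_part ORDER BY part_no").fetchall()
print(cheap)
```

Listing the target columns explicitly in both forms keeps the statements valid if columns are later added to the table.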
IS 257 - Fall 2002 2002.10.10- SLIDE 66
DELETE
• DELETE FROM table-name WHERE <where clause>;
• Removes rows from a table.
IS 257 - Fall 2002 2002.10.10- SLIDE 67
UPDATE
• UPDATE tablename SET attr1=newval, attr2 = newval2 WHERE <where clause>;
• Changes values in existing rows in a table (those that match the WHERE clause).
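UPDATE and DELETE both act only on the rows selected by their WHERE clause, which the sketch below makes visible by checking how many rows each statement touched. It uses Python's built-in `sqlite3` module with three rows from the Part table; the new price of 4.00 is an arbitrary value chosen for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE part (part_no INTEGER PRIMARY KEY, name TEXT, price REAL);
    INSERT INTO part VALUES
        (1, 'Big blue widget',    3.76),
        (3, 'Tiny red widget',    5.25),
        (4, 'large red widget', 157.23);
""")

# UPDATE changes only the rows matching the WHERE clause.
cursor = conn.execute("UPDATE part SET price = 4.00 WHERE part_no = 1")
updated = cursor.rowcount

# DELETE removes only the rows matching the WHERE clause.
cursor = conn.execute("DELETE FROM part WHERE price > 100")
deleted = cursor.rowcount

remaining = conn.execute("SELECT part_no, price FROM part ORDER BY part_no").fetchall()
print(updated, deleted, remaining)
```

Omitting the WHERE clause from either statement would apply it to every row in the table, so it is worth running the condition as a SELECT first.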
IS 257 - Fall 2002 2002.10.10- SLIDE 68
DROP Table
• DROP TABLE tablename;
• Removes a table from the database.