SQL, Data Storage Technologies, and Web-Data Integration Week 4

Upload: meredith-eaton

Post on 31-Dec-2015


Page 1: SQL, Data Storage Technologies, and Web-Data Integration Week 4

SQL, Data Storage Technologies, and Web-Data Integration

Week 4

Page 2: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Today’s Agenda

• Review
• Intro to SQL Continued
  – SELECT, GROUP BY, HAVING, DELETE, UPDATE
• “Advanced” SQL
  – Joins, Functions, Locking tables, Transactions

Page 3: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Week 3 Review

• Physical database design
  – Column options
    • NOT NULL, DEFAULT, AUTO_INCREMENT, PRIMARY KEY
• Client/Server Architecture
• Connecting to SQL
  – command line mysql client
• Introduction to SQL
  – SHOW, USE, CREATE, INSERT

Page 4: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Selecting Data

• Syntax
  – SELECT column_name,… FROM TableName
  – Use a “*” in place of the column_name list to retrieve all columns
  – Examples:
    • Show me all the data stored about our donors
    • mysql> SELECT * FROM Donor;
    • What are all the names of all our donors?
    • mysql> SELECT name FROM Donor;

Page 5: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Being More Specific

• Syntax
  – SELECT column_name,… FROM TableName WHERE statement
  – Example
    • Show me all the donors with the name “Jake Johnson”
    • mysql> SELECT * FROM Donor WHERE name = 'Jake Johnson';

Page 6: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Conditional Operators

SQL Symbol                   Definition
=                            Matches if the values are equal
!=                           Matches if the values are not equal
>                            Matches if the left value is greater than the right value
<                            Matches if the left value is less than the right value
>=                           Matches if the left value is greater than or equal to the right value
<=                           Matches if the left value is less than or equal to the right value
IN (value,…)                 Matches if value is among the values listed
BETWEEN value1 AND value2    Matches if value is between value1 and value2 or equal to one of them
LIKE value1                  Matches if value matches the pattern expressed in value1 using any series of wildcard characters and anchors
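As a rough illustration of the less familiar operators in the table, the sketch below tries IN, BETWEEN, and != on a scratch table. The table name DonationDemo and its four rows are invented for this example and are not part of the course data; a temporary table is used so nothing permanent is touched.

```sql
-- Hypothetical scratch data: a temporary table that disappears when the session ends.
DROP TEMPORARY TABLE IF EXISTS DonationDemo;
CREATE TEMPORARY TABLE DonationDemo (
  donationid INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
  amount     DECIMAL(5,2) NOT NULL,
  donorid    INT UNSIGNED
);
INSERT INTO DonationDemo (amount, donorid)
VALUES (50.00, 1), (175.00, 2), (25.00, 2), (100.00, 3);

-- IN: donations made by donor 1 or donor 3
SELECT * FROM DonationDemo WHERE donorid IN (1, 3);

-- BETWEEN includes both endpoints: matches 50.00 and 100.00, not 25.00 or 175.00
SELECT * FROM DonationDemo WHERE amount BETWEEN 50 AND 100;

-- != : every donation that was not made by donor 2
SELECT * FROM DonationDemo WHERE donorid != 2;
```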

Page 7: SQL, Data Storage Technologies, and Web-Data Integration Week 4

The LIKE comparison

• Uses wildcard characters to match column data
  – '_' represents any one character
    • SELECT name FROM Donor WHERE name LIKE '_ob';
    • Matches “Bob”, “Rob”, “Job”, etc.
  – '%' represents any number of characters
    • SELECT name FROM Fruit WHERE name LIKE '%apple';
    • Matches “Pineapple” and “Apple”

Page 8: SQL, Data Storage Technologies, and Web-Data Integration Week 4

More LIKE

• Find all donors whose name starts with a “J”:
• mysql> SELECT * FROM Donor WHERE name LIKE 'J%';

• Use “AND” or “OR” to add multiple restrictions in your WHERE clause

• Find all donors whose name starts with a “J” and have a 206 area code

• mysql> SELECT * FROM Donor WHERE name LIKE "J%" AND phone_number LIKE "206%";

Page 9: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Other Constraints

• GROUP BY
  – Fun with aggregates

• HAVING

Page 10: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Ordering Your Data

• Syntax
  – SELECT column_name,… FROM TableName ORDER BY column_name, …
• Example
  – List all donors in alphabetical order
  – mysql> SELECT * FROM Donor ORDER BY name;

Page 11: SQL, Data Storage Technologies, and Web-Data Integration Week 4

More Ordering

• Feel free to combine with other constraints, such as WHERE
  – mysql> SELECT * FROM Donor WHERE name LIKE 'J%' ORDER BY name;
• You can order by more than one column
  – SELECT * FROM Donor ORDER BY lastname, firstname;
• Swap the order with DESC or ASC
  – mysql> SELECT * FROM Donor WHERE name LIKE 'J%' ORDER BY name DESC;

Page 12: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Grouping your data

• The GROUP BY clause groups data together so that aggregate functions can be performed on it.

• Very common for reports and statistics

• More interesting with large sets of data

Page 13: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Piping SQL commands to MySQL

• Sometimes we have a big file of SQL commands that we want to run.

• Quit your mysql client application
  – mysql> quit;
• Retrieve, and then upload to your dante account:
  – https://courses.washington.edu/wtcampus/spring/examples/sql/donation.sql
• Look at the big file of SQL commands
  – $ less donation.sql
• Use Unix input redirection to send the file of commands to MySQL
  – $ /usr/local/mysql-5.0.67-linux-i686/bin/mysql -u uwnetid -p uwnetid < donation.sql
• The “<” operator takes all the lines of text from donation.sql and sends them to MySQL as its input

Page 14: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Back to Grouping

• What does the Donation table look like?
• mysql> DESCRIBE Donation;

Column Name      Type            Options
DONATIONID       Int unsigned    Primary key auto_increment
date             Datetime        Not null
amount           Decimal(5,2)    Not null
processorName    Varchar(255)
DONORID          Int unsigned

Page 15: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Group By

• Syntax
  – SELECT column_name,… FROM TableName GROUP BY column_name, …
• Example
  – What is the total amount donated by each donor?
  – mysql> SELECT donorid, SUM(amount) FROM Donation GROUP BY donorid;

Page 16: SQL, Data Storage Technologies, and Web-Data Integration Week 4

GROUP BY with other constraints

• GROUP BY must come after any WHERE clause

• GROUP BY must come before any LIMIT or ORDER BY clause
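To make the clause order concrete, here is a small sketch that uses WHERE, GROUP BY, ORDER BY, and LIMIT in one statement. The DonationDemo table and its rows are hypothetical, created only so the query can be run on its own.

```sql
-- Hypothetical scratch data in a temporary table.
DROP TEMPORARY TABLE IF EXISTS DonationDemo;
CREATE TEMPORARY TABLE DonationDemo (
  donationid INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
  amount     DECIMAL(5,2) NOT NULL,
  donorid    INT UNSIGNED
);
INSERT INTO DonationDemo (amount, donorid)
VALUES (50.00, 1), (175.00, 2), (25.00, 2), (100.00, 3), (10.00, 3);

-- Clause order: WHERE, then GROUP BY, then ORDER BY, then LIMIT.
SELECT donorid, SUM(amount)
FROM DonationDemo
WHERE amount >= 20            -- drops the 10.00 donation before grouping
GROUP BY donorid
ORDER BY SUM(amount) DESC
LIMIT 2;                      -- donor 2 (200.00), then donor 3 (100.00)
```

Moving GROUP BY ahead of WHERE, or after ORDER BY or LIMIT, produces a syntax error.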

Page 17: SQL, Data Storage Technologies, and Web-Data Integration Week 4

The Groupies

Aggregate Function    Definition
AVG(column)           Returns the average of the column values.
COUNT(column)         Returns the number of rows in which the column is not NULL.
MAX(column)           Returns the maximum value of the column.
MIN(column)           Returns the minimum value of the column.
STD(column)           Returns the standard deviation of the column values.
SUM(column)           Returns the sum of the column values.
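A short sketch of several of these functions in a single query may help. The table DonationDemo and its rows are made up for illustration; one processorName is deliberately left NULL to show that COUNT(column) skips NULLs while COUNT(*) counts every row.

```sql
-- Hypothetical scratch data in a temporary table.
DROP TEMPORARY TABLE IF EXISTS DonationDemo;
CREATE TEMPORARY TABLE DonationDemo (
  donationid    INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
  amount        DECIMAL(5,2) NOT NULL,
  processorName VARCHAR(255),
  donorid       INT UNSIGNED
);
INSERT INTO DonationDemo (amount, processorName, donorid)
VALUES (50.00, 'Alice', 1), (175.00, NULL, 2), (25.00, 'Bob', 2);

SELECT donorid,
       COUNT(*)             AS donations,        -- donor 2: 2
       COUNT(processorName) AS with_processor,   -- donor 2: 1 (the NULL is skipped)
       SUM(amount)          AS total,
       AVG(amount)          AS average,
       MIN(amount)          AS smallest,
       MAX(amount)          AS largest
FROM DonationDemo
GROUP BY donorid;
```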

Page 18: SQL, Data Storage Technologies, and Web-Data Integration Week 4

GROUP BY

• SELECT column_name (or aggregate function), … FROM TableName WHERE clause GROUP BY column_name, …
• You can GROUP BY multiple columns
  – Example:
    • How many of our donors have the same name?
    • SELECT fname, lname, COUNT(*) FROM Donor GROUP BY fname, lname;

Page 19: SQL, Data Storage Technologies, and Web-Data Integration Week 4

HAVING clause

• Syntax
  – SELECT column_name,… FROM TableName HAVING statement
• “statement” is the same set of conditionals that the WHERE clause has
  – So what is the difference between HAVING and WHERE?

Page 20: SQL, Data Storage Technologies, and Web-Data Integration Week 4

HAVING vs. WHERE

• The WHERE clause happens as MySQL is looking through its table

• The HAVING clause happens on the rows returned by the WHERE clause

• mysql> SELECT * FROM Donor HAVING name = 'Jake Johnson';
  – Twice as slow! First scan the Donor table for all the rows, then scan all the rows again for names matching 'Jake Johnson'.

Page 21: SQL, Data Storage Technologies, and Web-Data Integration Week 4

HAVING

• Let's say we're interested in sending a letter to our top donors – those who donated more than $150.

• Use the GROUP BY clause and the SUM aggregate function to get a list of the total amounts.

• Adding the HAVING clause we can further restrict the results.

• mysql> SELECT donorid, SUM(amount) FROM Donation GROUP BY donorid HAVING SUM(amount) > 150;
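One way to see the split between the two clauses is to use both in the same query: WHERE filters individual rows before grouping, and HAVING filters the groups afterwards. The sketch below assumes a made-up DonationDemo table; the names and amounts are illustrative only.

```sql
-- Hypothetical scratch data in a temporary table.
DROP TEMPORARY TABLE IF EXISTS DonationDemo;
CREATE TEMPORARY TABLE DonationDemo (
  donationid INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
  amount     DECIMAL(5,2) NOT NULL,
  donorid    INT UNSIGNED
);
INSERT INTO DonationDemo (amount, donorid)
VALUES (50.00, 1), (175.00, 2), (25.00, 2), (100.00, 3), (60.00, 3);

-- WHERE removes the single 25.00 donation first;
-- HAVING then keeps only donors whose remaining total exceeds 150.
SELECT donorid, SUM(amount)
FROM DonationDemo
WHERE amount >= 50
GROUP BY donorid
HAVING SUM(amount) > 150;
-- donor 2: 175.00 (not 200.00, because the 25.00 row was filtered out)
-- donor 3: 160.00
```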

Page 22: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Deleting rows

• Syntax
  – DELETE FROM TableName [where_clause]
• Example
  – mysql> DELETE FROM Donation;
  – With no WHERE clause, this deletes every row in the table

Page 23: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Deleting is too Easy!

• Rows are hard to create, easy to destroy!

• Always use a WHERE clause!

• Example:
  – mysql> DELETE FROM Donor WHERE name = 'Jake Johnson';

• Best to write a SELECT first
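The "SELECT first" habit can be sketched as follows: preview the rows with a SELECT, then reuse exactly the same WHERE clause in the DELETE. The DonorDemo table and its rows are hypothetical scratch data.

```sql
-- Hypothetical scratch data in a temporary table.
DROP TEMPORARY TABLE IF EXISTS DonorDemo;
CREATE TEMPORARY TABLE DonorDemo (
  donorid INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
  name    VARCHAR(255) NOT NULL
);
INSERT INTO DonorDemo (name)
VALUES ('Jake Johnson'), ('Sue Smith'), ('Joe Jones');

-- Step 1: preview exactly which rows the WHERE clause matches.
SELECT * FROM DonorDemo WHERE name = 'Jake Johnson';

-- Step 2: once the preview looks right, reuse the same WHERE clause.
DELETE FROM DonorDemo WHERE name = 'Jake Johnson';

-- Step 3: confirm that only the intended row is gone.
SELECT * FROM DonorDemo;
```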

Page 24: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Updating data

• Syntax
  – UPDATE TableName SET column_name = value, … [where_clause]
• Let’s learn our lesson from delete, and always use the WHERE clause
• Example:
  – mysql> UPDATE Donor SET address = '123 Home Lane', phone_number = '555-1212' WHERE donorid = 1;

Page 25: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Practice: Using the Aggregates

• How many donations has each donor made?
• What is the maximum donation amount made by each donor?
• What are the donorIDs for the top ten donors?
• Of those who donated in 2003, what are the donorIDs of the ten worst donors in 2003?
• What is the total amount of donations we’ve received in 2004?
• What are the donorIDs for the ten best and ten worst donors?

Page 26: SQL, Data Storage Technologies, and Web-Data Integration Week 4

“Advanced” SQL

• Joining tables
  – Inner vs. Outer

• Built in functions

• Table locking

• Transactions

Page 27: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Joining

• Our queries on the Donation table only return the DonorID.
• Typically, we want to know the Donor’s name, not their ID.
• We could do two selects and collate the data
  – SELECT donorid, SUM(amount) FROM Donation GROUP BY donorid;
  – SELECT donorid, name FROM Donor;
  – match these up in our code
• Or, we could simply do this with one query

Page 28: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Joining

• So far we’ve seen SELECTs on a single table
  – How is this any better than using a Berkeley DB or text file on a local computer?
• Joins allow us to select information from more than one table and model the relationships in the conceptual model
  – We don’t want to know the donorIDs, we want to know the donor names!

Page 29: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Simple Joining

• SELECT * FROM Table1, Table2;
  – Listing multiple tables after “FROM” joins those tables together
  – This effectively creates a new schema, with new tuples.
• Donor(donorid, name, address)
• Donation(donationid, amount, donorid)
• DonorDonation(donorid, name, address, donationid, amount, donorid)

Page 30: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Cartesian (Cross) Product

• New “virtual” schema:
  – DonorDonation(donorid, name, address, donationid, amount, donorid)
• New tuples:
  – (1, “Bob”, “123 St.”, 8, 50.00, 1)
  – (2, “Sue”, “456 Pl.”, 8, 50.00, 1)
  – (3, “Joe”, “789 Rd.”, 8, 50.00, 1)
• Every row in table A is joined with every row in table B (A x B).
• mysql> SELECT * FROM Donor, Donation;
• 500 Donors x 3000 Donations = 1.5 Million rows!!
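The row-count arithmetic can be checked on a small scale. The sketch below assumes two made-up scratch tables, DonorDemo with 3 rows and DonationDemo with 4 rows, so the cross product should contain 3 x 4 = 12 rows.

```sql
-- Hypothetical scratch data in temporary tables.
DROP TEMPORARY TABLE IF EXISTS DonorDemo;
DROP TEMPORARY TABLE IF EXISTS DonationDemo;
CREATE TEMPORARY TABLE DonorDemo (
  donorid INT UNSIGNED PRIMARY KEY,
  name    VARCHAR(255) NOT NULL
);
CREATE TEMPORARY TABLE DonationDemo (
  donationid INT UNSIGNED PRIMARY KEY,
  amount     DECIMAL(5,2) NOT NULL,
  donorid    INT UNSIGNED
);
INSERT INTO DonorDemo VALUES (1, 'Bob'), (2, 'Sue'), (3, 'Joe');
INSERT INTO DonationDemo VALUES (8, 50.00, 1), (19, 175.00, 2), (33, 25.00, 2), (40, 10.00, 1);

-- Every donor row is paired with every donation row: 3 x 4 = 12.
SELECT COUNT(*) FROM DonorDemo, DonationDemo;
```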

Page 31: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Enforcing the relationship

– (1, “Bob”, “123 St.”, 8, 50.00, 1)
– (2, “Sue”, “456 Pl.”, 8, 50.00, 1)
– (3, “Joe”, “789 Rd.”, 8, 50.00, 1)

• Knowing we have a Donation of (8, 50.00, 1), we are only interested in the row where the Donor was (1, “Bob”, “123 St.”)

• Solution: Use a WHERE clause just like we did before

Page 32: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Enforcing the Relationship

• Our new, new “virtual” schema:
  – DonorDonation(Donor.donorid, Donor.name, Donor.address, Donation.donationid, Donation.amount, Donation.donorid)
• Select just the tuples that have matching donorIDs:
  – mysql> SELECT * FROM Donor, Donation WHERE Donor.donorid = Donation.donorid;

Page 33: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Enforcing the Relationship

• Tuples now only have data where the donorIDs match
  – (1, “Bob”, “123 St.”, 8, 50.00, 1)
  – (2, “Sue”, “456 Pl.”, 19, 175.00, 2)
  – (2, “Sue”, “456 Pl.”, 33, 25.00, 2)
• Our Donor - Donation relationship is now successfully modeled
  – One Donor (e.g., “Sue”) has one or many Donations (e.g., 175.00, 25.00)
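MySQL also accepts an explicit INNER JOIN ... ON spelling for the same inner join, which keeps the join condition apart from any other WHERE filters. The sketch below runs both forms on made-up scratch tables (DonorDemo and DonationDemo are hypothetical names); both should return the same three rows, and Joe, who has no donations, appears in neither.

```sql
-- Hypothetical scratch data in temporary tables.
DROP TEMPORARY TABLE IF EXISTS DonorDemo;
DROP TEMPORARY TABLE IF EXISTS DonationDemo;
CREATE TEMPORARY TABLE DonorDemo (
  donorid INT UNSIGNED PRIMARY KEY,
  name    VARCHAR(255) NOT NULL,
  address VARCHAR(255)
);
CREATE TEMPORARY TABLE DonationDemo (
  donationid INT UNSIGNED PRIMARY KEY,
  amount     DECIMAL(5,2) NOT NULL,
  donorid    INT UNSIGNED
);
INSERT INTO DonorDemo VALUES (1, 'Bob', '123 St.'), (2, 'Sue', '456 Pl.'), (3, 'Joe', '789 Rd.');
INSERT INTO DonationDemo VALUES (8, 50.00, 1), (19, 175.00, 2), (33, 25.00, 2);

-- Comma-separated tables with the relationship enforced in WHERE
SELECT * FROM DonorDemo, DonationDemo
WHERE DonorDemo.donorid = DonationDemo.donorid;

-- The same inner join written with INNER JOIN ... ON
SELECT * FROM DonorDemo
INNER JOIN DonationDemo ON DonorDemo.donorid = DonationDemo.donorid;
```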

Page 34: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Refining your join

• Just like a single table SELECT statement, you can refine your multiple table SELECT statement
  – AND, OR, GROUP BY, HAVING, ORDER BY, LIMIT
• Example: What are the names of the top five donors that have donated more than $150, and how much have they donated?

SELECT Donor.name, SUM(Donation.amount)

FROM Donor, Donation

WHERE Donor.donorid = Donation.donorid

GROUP BY Donor.donorid

HAVING SUM(Donation.amount) > 150

ORDER BY SUM(Donation.amount) DESC

LIMIT 5;

Page 35: SQL, Data Storage Technologies, and Web-Data Integration Week 4

More than two tables

• SELECT * FROM Donor, Donation, Processor WHERE Donor.donorid = Donation.donorid AND Donation.processorid = Processor.processorid;

• Order of the tables is not important

Page 36: SQL, Data Storage Technologies, and Web-Data Integration Week 4

The Equality Test

• Typically you need an equality test for each extra table you add to the FROM clause.

• The equality checks are almost always between the primary key and the foreign keys of tables. (That’s why the foreign keys are there!)

Page 37: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Outer Joins

• The joins we’ve looked at only return a Donor who has made a Donation

• What if we want to know which Donors have not made any Donations?

• The solution is to use an Outer Join (MySQL supports this through the Left Join command)

Page 38: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Outer Joins

• An Outer Join will take all the rows from the Left table (or the Right, depending on the SQL/RDBMS), without requiring a match on the other table.

Page 39: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Outer Joins

• SELECT columns FROM Table1 LEFT JOIN Table2 ON equality_test [WHERE|GROUP BY|etc.]
• Example:
  – mysql> SELECT Donor.name, Donation.donationid FROM Donor LEFT JOIN Donation ON Donor.donorid = Donation.donorid;

Page 40: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Outer Joins

• An outer join will NULL-fill any columns from Table2 where the ON condition doesn’t match.
  – (1, “Bob”, “123 St.”, 8, 50.00, 1)
  – (2, “Sue”, “456 Pl.”, 19, 175.00, 2)
  – (3, “Joe”, “789 Rd.”, NULL, NULL, NULL)
• If a tuple from Table1 can be joined with any tuple from Table2, it will not be NULL-filled.

Page 41: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Outer Joins

• What if we want to know which Donors have not made any Donations?

• mysql> SELECT Donor.name, Donation.amount FROM Donor LEFT JOIN Donation ON Donor.donorid = Donation.donorid WHERE Donation.amount IS NULL;
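A related sketch: because COUNT(column) ignores NULLs, a LEFT JOIN combined with GROUP BY can report a donation count of zero for donors who have never given. The scratch tables DonorDemo and DonationDemo and their rows are assumptions made for this example.

```sql
-- Hypothetical scratch data in temporary tables.
DROP TEMPORARY TABLE IF EXISTS DonorDemo;
DROP TEMPORARY TABLE IF EXISTS DonationDemo;
CREATE TEMPORARY TABLE DonorDemo (
  donorid INT UNSIGNED PRIMARY KEY,
  name    VARCHAR(255) NOT NULL
);
CREATE TEMPORARY TABLE DonationDemo (
  donationid INT UNSIGNED PRIMARY KEY,
  amount     DECIMAL(5,2) NOT NULL,
  donorid    INT UNSIGNED
);
INSERT INTO DonorDemo VALUES (1, 'Bob'), (2, 'Sue'), (3, 'Joe');
INSERT INTO DonationDemo VALUES (8, 50.00, 1), (19, 175.00, 2), (33, 25.00, 2);

-- Joe has no matching Donation row, so his donationid is NULL
-- and COUNT(DonationDemo.donationid) reports 0 for him.
SELECT DonorDemo.name, COUNT(DonationDemo.donationid) AS donations
FROM DonorDemo
LEFT JOIN DonationDemo ON DonorDemo.donorid = DonationDemo.donorid
GROUP BY DonorDemo.donorid, DonorDemo.name;
-- Bob: 1, Sue: 2, Joe: 0
```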

Page 42: SQL, Data Storage Technologies, and Web-Data Integration Week 4

A Lot of Typing

• SELECT Donor.name, SUM(Donation.amount), Processor.name FROM Donor LEFT JOIN Donation ON Donor.donorid = Donation.donorid LEFT JOIN Processor ON Donation.processorid = Processor.processorid GROUP BY Donor.donorid

Page 43: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Less Typing with Aliases

• You can give your Tables nicknames:

• SELECT Dr.name, SUM(Dn.amount), P.name FROM Donor AS Dr LEFT JOIN Donation AS Dn ON Dr.donorid = Dn.donorid LEFT JOIN Processor AS P ON Dn.processorid = P.processorid GROUP BY Dr.donorid

Page 44: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Renaming Output

• You can give your selected columns nicknames too

• SELECT Donor.name, SUM(Donation.amount) AS total, Processor.name FROM Donor LEFT JOIN Donation ON Donor.donorid = Donation.donorid LEFT JOIN Processor ON Donation.processorid = Processor.processorid GROUP BY Donor.donorid ORDER BY total;

• Aggregate functions are not always accepted directly in an ORDER BY or HAVING clause; a column alias such as total gives you a name you can use there instead
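As a sketch of the alias at work, the query below names the sum total and then refers to that name in both HAVING and ORDER BY (MySQL allows select-list aliases in HAVING). The DonationDemo table and its rows are hypothetical.

```sql
-- Hypothetical scratch data in a temporary table.
DROP TEMPORARY TABLE IF EXISTS DonationDemo;
CREATE TEMPORARY TABLE DonationDemo (
  donationid INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
  amount     DECIMAL(5,2) NOT NULL,
  donorid    INT UNSIGNED
);
INSERT INTO DonationDemo (amount, donorid)
VALUES (50.00, 1), (175.00, 2), (25.00, 2), (100.00, 3), (60.00, 3);

-- "total" is defined once and reused, instead of repeating SUM(amount).
SELECT donorid, SUM(amount) AS total
FROM DonationDemo
GROUP BY donorid
HAVING total > 100
ORDER BY total DESC;
-- donor 2: 200.00, then donor 3: 160.00
```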

Page 45: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Joining to Other Databases

• Sometimes you may want to share a database with other databases

• Example: You have a “Users” database that is shared between two applications, each of which has its own database.

• SELECT C.name, CN.nickname FROM Users.Customer AS C, CustomerNickname AS CN WHERE C.customerid = CN.customerid;

Page 46: SQL, Data Storage Technologies, and Web-Data Integration Week 4

SQL Functions

• MySQL provides a lot of functions to munge the results of a query

• Example, returning a date
  – SELECT date FROM Donation WHERE donationid=1;
    • 2004-10-14 15:52:08
    • Not very “pretty” for a user to see

Page 47: SQL, Data Storage Technologies, and Web-Data Integration Week 4

SQL Functions

• Use the DATE_FORMAT() function instead!
  – SELECT DATE_FORMAT(date, '%m/%d/%y') FROM Donation WHERE donationid=1;
    • 10/14/04
    • Prettier!
  – SELECT DATE_FORMAT(date, '%M %D, %Y') FROM Donation WHERE donationid=1;
    • October 14th, 2004

Page 48: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Some Common Functions

Function       Result
ABS            returns the absolute value of the column
CONCAT         returns the string formed by joining together all of the function arguments
DATE_ADD       returns the date formed by adding a given amount of time to the date
DATE_SUB       returns the date formed by subtracting a given amount of time from the date
DATE_FORMAT    returns a date formatted as you specify in the format string. This is one of the most useful functions for printing dates in the format you desire
FORMAT         returns a neatly formatted number with commas and the specified number of decimal places
ISNULL         returns 1 if the value is NULL, zero otherwise
LENGTH         returns the length of a string in bytes (CHAR_LENGTH returns the number of characters)
NOW            returns the current date and time.
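Most of these functions can be tried directly from the mysql prompt without any table. The literal values below are arbitrary examples chosen for illustration.

```sql
SELECT ABS(-12.5);                                        -- 12.5
SELECT CONCAT('Jake', ' ', 'Johnson');                    -- Jake Johnson
SELECT DATE_ADD('2004-10-14 15:52:08', INTERVAL 10 DAY);  -- 2004-10-24 15:52:08
SELECT DATE_SUB('2004-10-14 15:52:08', INTERVAL 1 DAY);   -- 2004-10-13 15:52:08
SELECT DATE_FORMAT('2004-10-14 15:52:08', '%M %D, %Y');   -- October 14th, 2004
SELECT FORMAT(1234567.891, 2);                            -- 1,234,567.89
SELECT ISNULL(NULL), ISNULL(0);                           -- 1 and 0
SELECT LENGTH('donor');                                   -- 5
SELECT NOW();                                             -- the current date and time
```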

Page 49: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Example using functions

• Return a nicely formatted list of donation dates made in the last 10 days

• mysql> SELECT DATE_FORMAT(date, '%m/%d/%Y') FROM Donation WHERE date > DATE_SUB(NOW(), INTERVAL 10 DAY);

Page 50: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Practice: Joining Tables

• Who Processed the most Donations?
• Which Donors have made no Donations?
• Which Division received the most money?
• Which Donor gave the most to Healthcare?

mysql> SOURCE non-profit.sql

Page 51: SQL, Data Storage Technologies, and Web-Data Integration Week 4
Page 52: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Transactions and Table Locking

• Usually, a database is being used concurrently by many different users
  – Example: Multiple processors will be entering donation information

• Very important to maintain data integrity

• Transactions or Table Locking can help provide that data integrity

Page 53: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Table Locking

• Example: Say one of our data integrity checks is to make sure no two Donors have the same name and address

• In our application, we would probably do something like:

– Check for any "Donor" with the same name and address.

– If there are no matches, insert our new "Donor."

Page 54: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Table Locking

• A two step process is okay for one user, very bad for more than one user
• What if this happens:

User 1                                           User 2
Check for any donor with the name "John Doe."
                                                 Check for any donor with the name "John Doe."
No matches
                                                 No matches
insert new record
                                                 insert new record

Page 55: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Table Locking

• One solution is to use table locking

• User 1 would lock the table so that only s/he can use the table

• They can then check and insert to the table while User 2 waits for the table to become available

Page 56: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Table Locking

User 1                                           User 2
Lock table
Check for any donor with the name "John Doe."    Wait for table to unlock
No matches                                       Wait for table to unlock
insert new record                                Wait for table to unlock
Unlock table                                     Wait for table to unlock
                                                 Lock table
                                                 Check for any donor with the name "John Doe."
                                                 Match exists – do NOT insert
                                                 Unlock table

Page 57: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Table Locking

• Why not lock everything?

• Slower
  – Processing overhead to lock tables
  – Users forced to wait for tables to be unlocked
• Potentially Dangerous
  – What if you forget to unlock your table?
  – What if your application crashes?

Page 58: SQL, Data Storage Technologies, and Web-Data Integration Week 4

More on Table Locking

• Syntax:
  – LOCK TABLE TableName (WRITE|READ), …
  – UNLOCK TABLES
• Use WRITE for when you need to insert into the table, and READ for when you just need to query the table
• You must lock any table you plan to use between your LOCK TABLE and UNLOCK TABLES commands.
  – mysql> LOCK TABLE Donor WRITE, Donation READ;
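Put together, the check-then-insert sequence from the earlier timeline might look like the sketch below. DonorLockDemo is a hypothetical scratch table (a regular one, since the point is a lock that other sessions have to respect), and the decision whether to run the INSERT is assumed to be made by the application after it reads the count.

```sql
-- Hypothetical scratch table; name and columns are assumptions for this example.
DROP TABLE IF EXISTS DonorLockDemo;
CREATE TABLE DonorLockDemo (
  donorid INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
  name    VARCHAR(255) NOT NULL,
  address VARCHAR(255)
);

-- Take a WRITE lock: other sessions wait until UNLOCK TABLES.
LOCK TABLES DonorLockDemo WRITE;

-- Step 1: check for an existing donor with the same name and address.
SELECT COUNT(*) FROM DonorLockDemo
WHERE name = 'John Doe' AND address = '1 Example St.';

-- Step 2: the application runs this INSERT only if the count above was 0.
INSERT INTO DonorLockDemo (name, address) VALUES ('John Doe', '1 Example St.');

-- Always release the lock, even when the insert is skipped.
UNLOCK TABLES;
```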

Page 59: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Transactions

• Transactions work by isolating a set of commands such that no other command can alter the data currently being worked on.
  – Treats a set of commands as one command
  – Works like table locking in this sense
  – Also slow like table locking in this sense

Page 60: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Transactions

• Transactions aren’t automatically available on all of your database tables.
  – You have to be using the InnoDB or BDB table types
  – Default table type is MyISAM
• Let’s create one!
  – mysql> CREATE TABLE Account (ACCOUNTID INT UNSIGNED PRIMARY KEY AUTO_INCREMENT, balance DOUBLE) ENGINE = InnoDB;

Page 61: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Transactions

• Start a transaction with BEGIN
• Transaction isn’t completed until a COMMIT
• Any commands in between are treated as one command
• If something goes wrong, you can ROLLBACK
  – reverts the database to the state it was when you began
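A minimal sketch of both outcomes, assuming a scratch InnoDB table named AccountDemo (a made-up name, as are the balances): the first transaction moves 25 between two accounts and commits, and the second is abandoned with ROLLBACK. If the server does not have InnoDB available, the table may be created with another engine and the ROLLBACK will not undo the change.

```sql
-- Hypothetical scratch table; created before BEGIN because
-- CREATE TABLE and DROP TABLE commit implicitly.
DROP TABLE IF EXISTS AccountDemo;
CREATE TABLE AccountDemo (
  accountid INT UNSIGNED PRIMARY KEY AUTO_INCREMENT,
  balance   DOUBLE
) ENGINE = InnoDB;
INSERT INTO AccountDemo (accountid, balance) VALUES (1, 100.00), (2, 50.00);

-- Transfer 25 from account 1 to account 2 as one unit of work.
BEGIN;
UPDATE AccountDemo SET balance = balance - 25 WHERE accountid = 1;
UPDATE AccountDemo SET balance = balance + 25 WHERE accountid = 2;
COMMIT;      -- both updates become permanent together: balances are 75 and 75

-- Something goes wrong part-way through: undo everything since BEGIN.
BEGIN;
UPDATE AccountDemo SET balance = 0 WHERE accountid = 1;
ROLLBACK;    -- account 1 is back to 75

SELECT * FROM AccountDemo;
```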

Page 62: SQL, Data Storage Technologies, and Web-Data Integration Week 4

Transactions

• Transactions are fairly complex
  – Not covered in-depth for “Intro to SQL” course
• Transactions are fairly powerful as well
  – Some external resources
  – MySQL Manual, "1.8.5.3 Transactions and Atomic Operations": http://dev.mysql.com/doc/mysql/en/ANSI_diff_Transactions.html
  – Sams Publishing, "MySql Transactions Overview": http://www.samspublishing.com/articles/article.asp?p=29312