example simpsons database cs380 1. querying multi-table databases when we have larger datasets...

26
Example simpsons database CS380 1

Upload: clementine-gibson

Post on 12-Jan-2016

226 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: Example simpsons database CS380 1. Querying multi-table databases When we have larger datasets spread across multiple tables, we need queries that can

CS380

1

Example simpsons database

Page 2: Example simpsons database CS380 1. Querying multi-table databases When we have larger datasets spread across multiple tables, we need queries that can

CS380

2

Querying multi-table databasesWhen we have larger datasets spread across multiple tables, we need queries that can answer high-level questions such as:

What courses has Bart taken and gotten a B- or better?

What courses have been taken by both Bart and Lisa?

Who are all the teachers Bart has had? How many total students has Ms. Krabappel

taught, and what are their names?

To do this, we'll have to join data from several tables in our SQL queries.

Page 3: Example simpsons database CS380 1. Querying multi-table databases When we have larger datasets spread across multiple tables, we need queries that can

3

Cross product with JOIN

cross product or Cartesian product: combines each row of first table with each row of second produces M * N rows, where table 1 has M rows

and table 2 has N problem: produces too much

irrelevant/meaningless data

SELECT column(s) FROM table1 JOIN table2; SQL

SELECT * FROM students JOIN grades; SQL

Page 4: Example simpsons database CS380 1. Querying multi-table databases When we have larger datasets spread across multiple tables, we need queries that can

4

Joining with ON clauses

join: a relational database operation that combines records from two or more tables if they satisfy certain conditions

the ON clause specifies which records from each table are matched

often the rows are linked by their key columns

SELECT column(s)FROM table1

JOIN table2 ON condition(s)...JOIN tableN ON condition(s);

SQLSELECT *FROM students

JOIN grades ON id = student_id; SQL

Page 5: Example simpsons database CS380 1. Querying multi-table databases When we have larger datasets spread across multiple tables, we need queries that can

5

Join example

table.column can be used to disambiguate column names:

SELECT *FROM students

JOIN grades ON id = student_id; SQL

SELECT *FROM students

JOIN grades ON students.id = grades.student_id; SQL

CS380

Page 6: Example simpsons database CS380 1. Querying multi-table databases When we have larger datasets spread across multiple tables, we need queries that can

CS380

6

Filtering columns in a join

if a column exists in multiple tables, it may be written as table.column

SELECT name, course_id, gradeFROM students

JOIN grades ON students.id = student_id;

SQL

Page 7: Example simpsons database CS380 1. Querying multi-table databases When we have larger datasets spread across multiple tables, we need queries that can

CS380

7

Giving names to tables

can give names to tables, like a variable name in Java

to specify all columns from a table, write table.*

SELECT name, g.*FROM students s

JOIN grades g ON s.id = g.student_id; SQL

Page 8: Example simpsons database CS380 1. Querying multi-table databases When we have larger datasets spread across multiple tables, we need queries that can

CS380

8

Giving names to tables

FROM / JOIN glue the proper tables together, and WHERE filters the results

what goes in the ON clause, and what goes in WHERE? ON directly links columns of the joined tables WHERE sets additional constraints such as particular

values (123, 'Bart')

SELECT name, course_id, gradeFROM students s

JOIN grades g ON s.id = g.student_idWHERE s.id = 123; SQL

Page 9: Example simpsons database CS380 1. Querying multi-table databases When we have larger datasets spread across multiple tables, we need queries that can

CS380

9

More simple join practice

Show the names of the classes that Mr. Krabappel is teaching

Show the classes that Milhouse is taking Print the results on an html table

Page 10: Example simpsons database CS380 1. Querying multi-table databases When we have larger datasets spread across multiple tables, we need queries that can

CS380

10

PHP MySQL functions

name description

mysql_connect connects to a database server

mysql_select_db chooses which database on server to use (similar to SQL USE database; command)

mysql_query performs a SQL query on the database

mysql_real_escape_string encodes a value to make it safe for use in a query

mysql_fetch_array, ... returns the query's next result row as an associative array

mysql_close closes a connection to a database

Page 11: Example simpsons database CS380 1. Querying multi-table databases When we have larger datasets spread across multiple tables, we need queries that can

CS380

11

HTML tables: <table>, <tr>, <td>

table defines the overall table, tr each row, and td each cell's data

tables are useful for displaying large row/column data sets

NOTE: tables are sometimes used by novices for web page layout, but this is not proper semantic HTML and should be avoided

<table><tr><td>1,1</td><td>1,2 okay</td></tr><tr><td>2,1 real wide</td><td>2,2</td></tr>

</table> HTML

Page 12: Example simpsons database CS380 1. Querying multi-table databases When we have larger datasets spread across multiple tables, we need queries that can

CS380

12

Table headers, captions: <th>, <caption>

th cells in a row are considered headers; by default, they appear bold

a caption at the start of the table labels its meaning

<table><caption>My important data</caption>

<tr><th>Column 1</th><th>Column 2</th></tr><tr><td>1,1</td><td>1,2 okay</td></tr><tr><td>2,1 real wide</td><td>2,2</td></tr>

</table> HTML

Page 13: Example simpsons database CS380 1. Querying multi-table databases When we have larger datasets spread across multiple tables, we need queries that can

CS380

13

Styling tables

all standard CSS styles can be applied to a table, row, or cell

table specific CSS properties: border-collapse, border-spacing, caption-side, empty-cells, table-layout

<table { border: 2px solid black; caption-side: bottom; }tr { font-style: italic; }td { background-color: yellow; text-align: center; width: 30%; } CSS

Page 14: Example simpsons database CS380 1. Querying multi-table databases When we have larger datasets spread across multiple tables, we need queries that can

CS380

14

The border-collapse property

by default, the overall table has a separate border from each cell inside

the border-collapse property merges these borders into one

table, td, th { border: 2px solid black; }table { border-collapse: collapse; }

CSS

Page 15: Example simpsons database CS380 1. Querying multi-table databases When we have larger datasets spread across multiple tables, we need queries that can

CS380

15

Table headers, captions: <th>, <caption>

colspan makes a cell occupy multiple columns; rowspan multiple rows

text-align and vertical-align control where the text appears within a cell

<table><tr><th>Column 1</th><th>Column 2</th><th>Column

3</th></tr><tr><td colspan="2">1,1-1,2</td>

<td rowspan="3">1,3-3,3</td></tr><tr><td>2,1</td><td>2,2</td></tr><tr><td>3,1</td><td>3,2</td></tr>

</table> HTML

Page 16: Example simpsons database CS380 1. Querying multi-table databases When we have larger datasets spread across multiple tables, we need queries that can

CS380

16

Table headers, captions: <th>, <caption>

col tag can be used to define styles that apply to an entire column (self-closing)

colgroup tag applies a style to a group of columns (NOT self-closing)

<table><col class="urgent" /><colgroup class="highlight" span="2"></colgroup>

<tr><th>Column 1</th><th>Column 2</th><th>Column 3</th></tr>

<tr><td>1,1</td><td>1,2</td><td>1,3</td></tr><tr><td>2,1</td><td>2,2</td><td>2,3</td></tr>

</table> HTML

Page 17: Example simpsons database CS380 1. Querying multi-table databases When we have larger datasets spread across multiple tables, we need queries that can

CS380

17

Other MySQL PHP functions

name description

mysql_num_rows returns number of rows matched by the query

mysql_num_fields returns number of columns per result in the query

mysql_list_dbs returns a list of databases on this server

mysql_list_tables returns a list of tables in current database

mysql_list_fields returns a list of fields in the current data

complete list

Page 18: Example simpsons database CS380 1. Querying multi-table databases When we have larger datasets spread across multiple tables, we need queries that can

CS380

18

Multi-way join

grade column sorts alphabetically, so grades better than B- are ones <= it

SELECT c.nameFROM courses c

JOIN grades g ON g.course_id = c.idJOIN students bart ON g.student_id = bart.id

WHERE bart.name = 'Bart' AND g.grade <= 'B-'; SQL

Page 19: Example simpsons database CS380 1. Querying multi-table databases When we have larger datasets spread across multiple tables, we need queries that can

CS380

19

A suboptimal query

problem: requires us to know Bart/Lisa's Student IDs, and only spits back course IDs, not names.

Write a version of this query that gets us the course names, and only requires us to know Bart/Lisa's names, not their IDs.

SELECT bart.course_idFROM grades bart

JOIN grades lisa ON lisa.course_id = bart.course_idWHERE bart.student_id = 123

AND lisa.student_id = 888; SQL

What courses have been taken by both Bart and Lisa?

Page 20: Example simpsons database CS380 1. Querying multi-table databases When we have larger datasets spread across multiple tables, we need queries that can

CS380

20

Improved query

SELECT DISTINCT c.nameFROM courses c

JOIN grades g1 ON g1.course_id = c.idJOIN students bart ON g1.student_id = bart.idJOIN grades g2 ON g2.course_id = c.idJOIN students lisa ON g2.student_id = lisa.id

WHERE bart.name = 'Bart'AND lisa.name = 'Lisa'; SQL

What courses have been taken by both Bart and Lisa?

Page 21: Example simpsons database CS380 1. Querying multi-table databases When we have larger datasets spread across multiple tables, we need queries that can

CS380

21

Practice queries

What are the names of all teachers Bart has had?

How many total students has Ms. Krabappel taught, and what are their names?

Page 22: Example simpsons database CS380 1. Querying multi-table databases When we have larger datasets spread across multiple tables, we need queries that can

CS380

22

Practice queries

SELECT DISTINCT t.nameFROM teachers t

JOIN courses c ON c.teacher_id = t.idJOIN grades g ON g.course_id = c.idJOIN students s ON s.id = g.student_id

WHERE s.name = 'Bart'; SQL

What are the names of all teachers Bart has had?

How many total students has Ms. Krabappel taught, and what are their names?

SELECT DISTINCT s.nameFROM students s

JOIN grades g ON s.id = g.student_idJOIN courses c ON g.course_id = c.idJOIN teachers t ON t.id = c.teacher_id

WHERE t.name = 'Krabappel'; SQL

Page 23: Example simpsons database CS380 1. Querying multi-table databases When we have larger datasets spread across multiple tables, we need queries that can

23

Designing a query

Figure out the proper SQL queries in the following way: Which table(s) contain the critical data?

(FROM) Which columns do I need in the result set?

(SELECT) How are tables connected (JOIN) and values

filtered (WHERE)? Test on a small data set (imdb_small). Confirm on the real data set (imdb). Try out the queries first in the MySQL

console. Write the PHP code to run those same

queries. Make sure to check for SQL errors at every

step!!

Page 24: Example simpsons database CS380 1. Querying multi-table databases When we have larger datasets spread across multiple tables, we need queries that can

CS380

24

Example imdb database

other tables: directors (id, first_name, last_name) movies_directors (director_id, movie_id) movies_genres (movie_id, genre)

Page 25: Example simpsons database CS380 1. Querying multi-table databases When we have larger datasets spread across multiple tables, we need queries that can

CS380

25

IMDb query example

select * from actors where first_name like '%mick%'; SQL

Page 26: Example simpsons database CS380 1. Querying multi-table databases When we have larger datasets spread across multiple tables, we need queries that can

CS380

26

IMDb table relationships / ids