dbms & sql
2
Learning Objectives:
Understand database systems.
Learn how to use SQL to query and update relational databases, and how to use SQL together with a programming language.
3
Table of Contents:
Chapter  Topic                                    Page No
1        Database                                 5
2        Relational Database Management System    21
3        DBMS Assignment                          34
4        History of SQL                           35
5        Data Definition Language                 46
6        Data Modification Language               77
7        Assignment 1                             82
8        Assignment 2                             85
9        Assignment 3                             88
10       Assignment 4                             93
11       Assignment 5                             96
12       Assignment 6                             102
13       Assignment 7                             107
14       References & Books                       108
15       Reference web sites                      109
4
Every organization has data that needs to be collected, managed, and analyzed. Most people are familiar with some kind of spreadsheet, such as Microsoft Excel. Spreadsheets are easy and convenient to use, and they may be employed by an individual. Spreadsheets are commonly used to store information in a tabular format. A spreadsheet can store data in rows and columns, it can link cells on one sheet to those on another sheet, and it can force data to be entered in a specific cell in a specific format. It’s easy to calculate formulas from groups of cells on the spreadsheet, create charts, and work with data in other ways.
A database fulfills these needs. Along with the powerful features of a relational database come requirements for developing and maintaining the database. Data analysts, database designers, and database administrators (DBAs) need to be able to translate the data in a database into useful information for both day-to-day operations and long-term planning.
Introduction
5
Database
Originally, the database was flat.
Information was stored in a delimited text file.
Each entry in the file was separated by a special character, such as a vertical bar (|).
It was difficult to search for specific information.
Lname, FName, Age, Salary|Smith, John, 35, $280|Doe, Jane, 28, $325|Brown, Scott, 41, $265|Howard, Shemp, 48, $359|Taylor, Tom, 22, $250
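To make the search problem concrete, here is a minimal Python sketch that reads the sample record string above. The function names parse_flat_file and find_by_last_name are hypothetical, and the ", " field separator is an assumption read off the sample. Without any index, every lookup has to scan every record.

```python
# The sample flat file from the slide: records separated by "|",
# fields separated by ", ". The first record is the header row.
FLAT_FILE = (
    "Lname, FName, Age, Salary|Smith, John, 35, $280|Doe, Jane, 28, $325|"
    "Brown, Scott, 41, $265|Howard, Shemp, 48, $359|Taylor, Tom, 22, $250"
)

def parse_flat_file(text):
    """Split the text into a header list and a list of row dicts."""
    records = [record.split(", ") for record in text.split("|")]
    header, rows = records[0], records[1:]
    return header, [dict(zip(header, row)) for row in rows]

def find_by_last_name(rows, last_name):
    """Linear scan: every row is examined because a flat file has no index."""
    return [row for row in rows if row["Lname"] == last_name]

header, rows = parse_flat_file(FLAT_FILE)
matches = find_by_last_name(rows, "Howard")
print(matches)
```

Note that every value, including Age and Salary, comes back as a plain string; the file carries no data types.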
6
Types of Databases
By Function
Analytical Database: also referred to as On-Line Analytical Processing (OLAP); used to keep track of statistics
• Read-only access to analyze the data
Operational Database: also referred to as On-Line Transactional Processing (OLTP); lets you actually change and manipulate the data in the database
7
Types of Databases
By Data Model
Flat File Database Model
• Data is stored in numerous files
• No linkage between files, so information is repeated in different files
Relational Database Model
• Data can be stored in different tables/databases
• The tables/databases can be connected using keys
Object-Oriented Database Model
• Stores not only text, but also sounds, images, and all sorts of media clips
8
Database Systems
The big commercial database vendors:
• Oracle
• IBM (with DB2)
• Microsoft (SQL Server)
• Sybase
• Teradata
Some free database systems (Unix):
• Postgres
• MySQL
• Predator
9
What is DBMS?
Need for information management
A database is a very large, integrated collection of data.
It models a real-world enterprise:
• Entities (e.g., students, courses)
• Relationships
A Database Management System (DBMS) is a software package designed to store and manage databases.
10
Why Use a DBMS?
• Data independence and efficient access
• Data integrity and security
• Uniform data administration
• Concurrent access, recovery from crashes
• Replication control
• Reduced application development time
11
Why Study Databases?
Shift from computation to information
• at the “low end”: access to the physical world
• at the “high end”: scientific applications
Datasets are increasing in diversity and volume.
• Digital libraries, interactive video, Human Genome project, e-commerce, sensor networks
• ... the need for DBMS/data services is exploding
DBMS encompasses several areas of CS
• OS, languages, theory, AI, multimedia, logic
12
Data Models
A data model is a collection of concepts for describing data.
A schema is a description of a particular collection of data, using a given data model.
The relational model of data is the most widely used model today.
• Main concept: relation, basically a table with rows and columns.
• Every relation has a schema, which describes the columns, or fields.
14
Key
Primary Key: a field (or set of fields) whose value makes every record unique and cannot be null. Ex: SSO id in Emp_Table in Genpact
Foreign Key: a field in one table that matches the primary key of another table. Ex: CoE in COE_Table is the primary key, while the CoE field in Emp_Table is a foreign key
15
Key
Emp_Table
SSO_Id   Emp_Name        CoE
123456   David D'Souza   Industrial
234567   Ram Kumar       Analytics
345678   Naveen P        External
COE_Table
COE_ID   COE_Name
ICoE     Industrial
ACoE     Analytics
ECoE     External
Primary Key Foreign Key
16
Example of a Traditional Database Application
Suppose we are building a system to store information about:
• Employee
• CoE
• Supervisor
• who goes where, who reports to whom
17
Can we do it without a DBMS ?
Sure we can! Start by storing the data in files:
• Employee.txt
• coe.txt
• supervisor.txt
Now write programs to implement specific tasks
18
Doing it without a DBMS...
Write an algorithm to do the following:
Add “XYZ” in “Analytics”:
Read ‘Employee.txt’
Read ‘CoE.txt’
Find & update the record “XYZ”
Find & update the record “Analytics”
Write “Employee.txt”
Write “coe.txt”
19
Problems without a DBMS...
System crashes:
Read ‘Employee.txt’
Read ‘CoE.txt’
Find & update the record “XYZ”
Find & update the record “Analytics”
Write “Employee.txt”
Write “coe.txt”
CRASH !
• What is the problem?
Large data sets (say 50GB)
• Why is this a problem?
Simultaneous access by many users
• Lock employee.txt: what is the problem?
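The crash scenario can be sketched with a DBMS transaction. The example below is an illustration only: it uses SQLite through Python's sqlite3 module, the table layout and the add_employee helper are hypothetical, and a raised exception stands in for a real crash. The point is that the two writes either both happen or neither does.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical tables standing in for Employee.txt and coe.txt.
conn.execute("CREATE TABLE employee (name TEXT PRIMARY KEY, coe TEXT)")
conn.execute("CREATE TABLE coe (coe_name TEXT PRIMARY KEY, headcount INTEGER)")
conn.execute("INSERT INTO coe VALUES ('Analytics', 0)")
conn.commit()

def add_employee(conn, name, coe_name, crash=False):
    """Insert the employee and bump the CoE headcount as one transaction."""
    try:
        with conn:  # commits on success, rolls back if an exception escapes
            conn.execute("INSERT INTO employee VALUES (?, ?)", (name, coe_name))
            if crash:
                # Simulated failure between the two writes (not a real crash).
                raise RuntimeError("simulated crash")
            conn.execute(
                "UPDATE coe SET headcount = headcount + 1 WHERE coe_name = ?",
                (coe_name,),
            )
    except RuntimeError:
        pass  # the partial insert has already been rolled back

add_employee(conn, "XYZ", "Analytics", crash=True)
after_crash = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]

add_employee(conn, "XYZ", "Analytics")
after_success = conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]
headcount = conn.execute("SELECT headcount FROM coe").fetchone()[0]
print(after_crash, after_success, headcount)
```

With plain files, the first write would survive the failure and the two files would disagree; here the failed attempt leaves no trace.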
20
Enter a DBMS
[Diagram: Applications connect (via ODBC, JDBC) to a Database server (someone else’s C program), which manages the Data files]
“Two tier system” or “client-server”
21
What is RDBMS ?
Relational Database Management System
RDBMS stands for:
• Relational
• Database
• Management
• System
It is a general-purpose software system that facilitates the process of defining, constructing and manipulating a database.
It helps in defining data types, structures and constraints.
It provides functions such as:
• Querying the database to retrieve specific data
• Updating the database to reflect changes in the miniworld
• Generating reports from the data
22
How is data represented in RDBMS ?
Relational Database Management System Contd...
•Data is represented by a collection of relations
-Some cardholders make transactions
-Every account is sent a statement
•Each relation is depicted as a table
-Account
•Each row in a table represents a tuple
-Account Number 1, Account Number 2, etc.
•Attributes of an entity are stored in different columns
-First Name, Last Name, Credit Line
23
What is a relational database?
Relational Database Management System Contd…
A relational database is a collection of related data
Data represents known facts that can be recorded and have an implicit meaning
•Each database consists of various database objects. Some of the database objects are:
-Tables, Views, Indexes, Partitions
-Constraints
-Meta Data
24
Relational Database Management System Contd…
What are the benefits of a Relational Database?
•It provides a mechanism to organize data which helps in reducing data redundancy
•Each logical data item is stored in one place allowing a consistent way to store data
•Access to information can be controlled depending upon the role of the user
Popular Products : Oracle , IBM DB2 , MySQL, Sybase etc
Let's learn about the different database objects in the next slides.
25
What is a Table?
Database Objects
Data is stored in Rows & Columns
•Rows represent different tuples
•Columns represent attributes of entities
•Ex: Account_Dim in CDCI
-Columns: Account Number, First Name, Last Name
        Col 1            Col 2        Col 3
        Account Number   First Name   Last Name
Row 1   ABC              George       Bush
Row 2   DBC              Manmohan     Singh
26
What is a View?
Database Objects
•A view is a single table derived from other table(s) in the database
•Views are dynamically updated whenever the underlying tables are refreshed
•Also called Virtual Tables
•Ex: Account_Dim_Level1, Account_Dim_Level2 in CDCI
•Has been used to support field masking in CDCI
27
What is an Index?
Database Objects
•Pointers to the physical location of the information in the DB
•Defined on field(s) in the DB
•Used to retrieve or update information faster in the DB
•Ex: Account_Key in Account_Dim
28
What are partitions ?
Database Objects
[Figure: table Statement Fact split by Billing Cycle Date into partitions STMT_20060301, STMT_20060302 and STMT_20060303; each partition holds SAMS CLUB and Wal-Mart data (labels: Client 1, Client 2)]
•Partitions are a way of dividing tables & indexes into “Manageable Pieces”
•Partitions can be spread out across different locations/disks
•Partitions support parallelism
•Types: List, Range, Hash
•Ex: Posting_date in Transaction_fact
•Billing_cycle_date in Statement_Fact
29
What are Constraints?
Constraints are restrictions imposed on data
Constraints help in enforcing consistency in the representation of data
Constraints enforce integrity in data
Types
• Domain Constraints - Account Number can only be of 16 digits
• Key Constraint - No 2 accounts can have the same account key
• Entity Integrity - Account Key cannot be null on Account_dim
• Referential Integrity - Every transaction made by a cardholder needs to be associated with the cardholder’s information in Account_dim
• Foreign Key Constraint - Account Key in transaction fact is a primary key in Account_dim
• NOT NULL constraint - Account number cannot be NULL on Account_dim
Database Objects…
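As a rough illustration of how these constraint types can be declared and enforced, the sketch below uses SQLite through Python's sqlite3 module. The column layout of account_dim and transaction_fact and the is_rejected helper are assumptions made for this example, not the actual CDCI schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("""
    CREATE TABLE account_dim (
        account_key    INTEGER PRIMARY KEY,          -- key constraint / entity integrity
        account_number TEXT NOT NULL UNIQUE          -- NOT NULL constraint
            CHECK (length(account_number) = 16       -- domain constraint: 16 digits
                   AND account_number NOT GLOB '*[^0-9]*')
    )
""")
conn.execute("""
    CREATE TABLE transaction_fact (
        txn_id      INTEGER PRIMARY KEY,
        account_key INTEGER NOT NULL REFERENCES account_dim(account_key),  -- foreign key
        amount      REAL
    )
""")
conn.execute("INSERT INTO account_dim VALUES (1, '1234567890123456')")

def is_rejected(sql):
    """Return True if the DBMS refuses the statement on integrity grounds."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

domain_violation = is_rejected("INSERT INTO account_dim VALUES (2, '123')")
not_null_violation = is_rejected("INSERT INTO account_dim VALUES (3, NULL)")
key_violation = is_rejected("INSERT INTO account_dim VALUES (1, '6543210987654321')")
referential_violation = is_rejected("INSERT INTO transaction_fact VALUES (1, 99, 10.0)")
valid_insert_rejected = is_rejected("INSERT INTO transaction_fact VALUES (1, 1, 10.0)")
print(domain_violation, not_null_violation, key_violation,
      referential_violation, valid_insert_rejected)
```

The first four statements are refused by the database itself; only the last one, which respects every constraint, is stored.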
30
What is Meta Data?
[Figures: All Tables output, All Indexes output]
•Meta Data is “data about data”
•Information about data stored in the DB
•Ex: All_tables: stores information about every table in the DB
•All_Indexes: stores information about every index available in the DB
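All_tables and All_Indexes are Oracle catalog views. As a stand-in that can be run anywhere, the sketch below queries SQLite's own catalog, sqlite_master, through Python's sqlite3 module; the sample table and index are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account_dim (account_key INTEGER PRIMARY KEY, first_name TEXT)")
conn.execute("CREATE INDEX idx_account_first_name ON account_dim (first_name)")

# sqlite_master is SQLite's catalog: one row per table, index, view or trigger.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
indexes = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'index' ORDER BY name")]
# PRAGMA table_info describes the columns of one table; field 1 is the column name.
columns = [row[1] for row in conn.execute("PRAGMA table_info(account_dim)")]
print(tables, indexes, columns)
```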
31
Are Spreadsheets Like Databases?
Spreadsheet vs. Database
Spreadsheet: More than one data type can be stored in a spreadsheet column.
Database: Usually, only one data type can be stored in a database table column.
Spreadsheet: Cells in a spreadsheet can be defined as a formula, making the contents variable depending on other cells.
Database: Columns in a database table have a fixed value.
Spreadsheet: A spreadsheet has only the physical row number to make it unique, and no built-in way to enforce uniqueness of a given spreadsheet row.
Database: Single rows of a database table are uniquely identified by a unique value (typically a primary key, as described later in this chapter).
Spreadsheet: Usually, only one user can have write access to the spreadsheet at any given time; anyone else is locked out, even if the second user is on a different part of the spreadsheet.
Database: Multiple users can access a database table at the same time, with various combinations of read and write capabilities in different parts of the database.
Spreadsheet: A spreadsheet does not have any built-in transaction-control capabilities, such as ensuring that a group of changes to the sheet is completely applied or not applied at all. The Save button is about the best a spreadsheet can do to simulate transaction control.
Database: A database usually has transaction-control capabilities, making it possible to “roll back” a change if something happened to prevent it from completing successfully (such as a power failure).
Spreadsheet: A corrupt spreadsheet cannot usually be repaired; the entire spreadsheet must be restored from a backup, which may have occurred yesterday, last week, or never!
Database: There are many tools for repairing and recovering databases.
34
• What is the difference between DBMS & RDBMS?
• What is Primary Key in any database table?
• Can Primary key be null?
• What is Metadata?
• How do you differentiate between a View and a table?
• Row of a table represents “Tuple”. Is it true?
• What is the advantage of creating an index on a table?
• If Account_Schdl_No is the Primary Key in table “Account”, it can store a null value as well. Is this statement true?
Database Assignments
35
An influential paper, "A Relational Model of Data for Large Shared Data Banks", by Dr. Edgar F. Codd, was published in June, 1970 in the Association for Computing Machinery (ACM) journal, Communications of the ACM, although drafts of it were circulated internally within IBM in 1969. Codd's model became widely accepted as the definitive model for relational database management systems (RDBMS or RDMS).
History of SQL
36
During the 1970s, a group at IBM's San Jose research center developed a database system "System R" based upon, but not strictly faithful to, Codd's model. Structured English Query Language ("SEQUEL") was designed to manipulate and retrieve data stored in System R. The acronym SEQUEL was later condensed to SQL because the word 'SEQUEL' was held as a trademark by the Hawker-Siddeley aircraft company of the UK. Although SQL was influenced by Codd's work, Donald D. Chamberlin and Raymond F. Boyce at IBM were the authors of the SEQUEL language design.[1] Their concepts were published to increase interest in SQL.
History
37
The first non-commercial, relational, non-SQL database, Ingres, was developed in 1974 at U.C. Berkeley.
In 1978, methodical testing commenced at customer test sites. Demonstrating both the usefulness and practicality of the system, this testing proved to be a success for IBM. As a result, IBM began to develop commercial products based on their System R prototype that implemented SQL, including the System/38 (announced in 1978 and commercially available in August 1979), SQL/DS (introduced in 1981), and DB2 (in 1983).[1]
History
38
At the same time Relational Software, Inc. (now Oracle Corporation) saw the potential of the concepts described by Chamberlin and Boyce and developed their own version of a RDBMS for the Navy, CIA and others. In the summer of 1979 Relational Software, Inc. introduced Oracle V2 (Version2) for VAX computers as the first commercially available implementation of SQL. Oracle is often incorrectly cited as beating IBM to market by two years, when in fact they only beat IBM's release of the System/38 by a few weeks. Considerable public interest then developed; soon many other vendors developed versions, and Oracle's future was ensured.
History
39
Standardization
SQL was adopted as a standard by ANSI (American National Standards Institute) in 1986 and ISO (International Organization for Standardization) in 1987. ANSI has declared that the official pronunciation for SQL is /ɛs kjuː ɛl/, although many English-speaking database professionals still pronounce it as sequel.
History
40
Year Name Alias Comments
1986 SQL-86 SQL-87 First published by ANSI. Ratified by ISO in 1987.
1989 SQL-89 Minor revision.
1992 SQL-92 SQL2 Major revision.
1999 SQL:1999 SQL3 Added regular expression matching, recursive queries, triggers, non-scalar types and some object-oriented features. (The last two are somewhat controversial and not yet widely supported.)
2003 SQL:2003 Introduced XML-related features, window functions, standardized sequences and columns with auto-generated values (including identity-columns).
History
41
SQL is a syntax for querying and manipulating relational databases. It was originally known as SEQUEL (Structured English Query Language), but this was shortened to SQL due to a trademark dispute. You can use SQL to read data from a database. Such queries can be quite sophisticated - you can choose which columns of the table to extract, you can use conditional expressions to decide which rows to extract, you can sort the result, and limit the number of rows returned. It is also possible to "join" tables, that is to retrieve data from multiple related tables in a single query. SQL also allows you to insert and modify records in a table. In that sense, the term "query" is something of a misnomer. You can also create, modify or delete entire tables within the database using SQL queries.
What is SQL?
42
SQL stands for Structured Query Language.
It is the most commonly used relational database language today.
SQL works with a variety of different fourth-generation (4GL) programming languages, such as Visual Basic.
Topic: cont..
43
• Data Manipulation
• Data Definition
• Data Administration
All are expressed as an SQL statement or command.
Topic: cont..
44
• Represent all info in the database as tables
• Keep the logical representation of data independent from its physical storage characteristics
• Use one high-level language for structuring, querying, and changing info in the database
• Support the main relational operations
• Support alternate ways of looking at data in tables
• Provide a method for differentiating between unknown values (nulls) and zero or blank
• Support mechanisms for integrity, authorization, transactions, and recovery
Topic: cont..
45
SQL commands can be divided into two main sublanguages. The Data Definition Language (DDL) contains the commands used to create and destroy databases and database objects. After the database structure is defined with DDL, database administrators and users can utilize the Data Manipulation Language to insert, retrieve and modify the data contained within it.
Topic: cont..
47
The Data Definition Language (DDL) is used to create and destroy databases and database objects. These commands will primarily be used by database administrators during the setup and removal phases of a database project. Let's take a look at the structure and usage of the basic DDL commands:
Topic: Data Definition Language
48
SQL Data Definition Language (DDL)
The Data Definition Language (DDL) part of SQL permits database tables to be created or deleted. We can also define indexes (keys), specify links between tables, and impose constraints between database tables.
The most important DDL statements in SQL are:
• CREATE TABLE - creates a new database table
• ALTER TABLE - alters (changes) a database table
• DROP TABLE - deletes a database table
• CREATE INDEX - creates an index (search key)
• DROP INDEX - deletes an index
Topic: cont ..
49
CREATE
The CREATE command can be used to create a table. The command:
CREATE TABLE personal_info (first_name char(20) not null, last_name char(20) not null, employee_id int not null)
establishes a table titled "personal_info" in the current database. In our example, the table contains three attributes: first_name, last_name and employee_id.
Topic: Create Table
50
Once you've created a table within a database, you may wish to modify the definition of it. The ALTER command allows you to make changes to the structure of a table without deleting and recreating it. Take a look at the following command:
ALTER TABLE personal_info
ADD salary money null
This example adds a new attribute to the personal_info table -- an employee's salary. The "money" argument specifies that an employee's salary will be stored using a dollars and cents format. Finally, the "null" keyword tells the database that it's OK for this field to
contain no value for any given employee.
Topic: Alter Table
51
The final command of the Data Definition Language, DROP, allows us to remove entire database objects from our DBMS. For example, if we want to permanently remove the personal_info table that we created, we'd use the following command:
DROP TABLE personal_info
Similarly, the command below would be used to remove the entire employees database:
DROP DATABASE employees
Use this command with care! Remember that the DROP command removes entire data structures from your database. If you want to remove individual records, use the DELETE command of the Data Manipulation Language
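The CREATE, ALTER and DROP steps for personal_info can be tried end to end with the sketch below, which runs them in SQLite through Python's sqlite3 module. This is an approximation: SQLite has no money type, so NUMERIC is substituted as an assumption, and the column_names helper is hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE personal_info (
        first_name  char(20) NOT NULL,
        last_name   char(20) NOT NULL,
        employee_id int      NOT NULL
    )
""")

def column_names(table):
    """List the column names of a table using SQLite's table_info pragma."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]

before = column_names("personal_info")

# SQLite has no "money" type; NUMERIC stands in for it here (an assumption).
conn.execute("ALTER TABLE personal_info ADD COLUMN salary NUMERIC")
after = column_names("personal_info")

# DROP removes the whole table definition, not just its rows.
conn.execute("DROP TABLE personal_info")
remaining = conn.execute(
    "SELECT COUNT(*) FROM sqlite_master WHERE name = 'personal_info'"
).fetchone()[0]
print(before, after, remaining)
```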
Topic: Drop Table
52
Tables are the basic structure where data is stored in the database. Given that in most cases, there is no way for the database vendor to know ahead of time what your data storage needs are, chances are that you will need to create tables in the database yourself. Many database tools allow you to create tables without writing SQL, but given that tables are the container of all the data, it is important to include the CREATE TABLE syntax in this tutorial.
Topic: Create Table
53
Before we dive into the SQL syntax for CREATE TABLE, it is a good idea to understand what goes into a table. Tables are divided into rows and columns. Each row represents one piece of data, and each column can be thought of as representing a component of that piece of data. So, for example, if we have a table for recording customer information, then the columns may include information such as First Name, Last Name, Address, City, Country, Birth Date, and so on. As a result, when we specify a table, we include the column headers and the data types for that particular column.
Topic: Create Table
54
So what are data types? Typically, data comes in a variety of forms. It could be an integer (such as 1), a real number (such as 0.55), a string (such as 'sql'), a date/time expression (such as '2000-JAN-25 03:22:22'), or even in binary format. When we specify a table, we need to specify the data type associated with each column (i.e., we will specify that 'First Name' is of type char(50) - meaning it is a string with 50 characters). One thing to note is that different relational databases allow for different data types, so it is wise to consult with a database-specific reference first.
Topic: Create Table
55
The SQL syntax for CREATE TABLE is
CREATE TABLE "table_name"
("column 1" "data_type_for_column_1",
"column 2" "data_type_for_column_2",
... )
So, if we are to create the customer table specified as above, we would type in
CREATE TABLE customer
(First_Name char(50),
Last_Name char(50),
Address char(50),
City char(50),
Country char(25),
Birth_Date date)
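To try the customer table, the sketch below creates it in an in-memory SQLite database through Python's sqlite3 module and reads back the declared column types. The inserted row is made-up sample data. SQLite records the declared types but does not enforce lengths such as char(50), which is one example of databases differing in how they treat data types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer (
        First_Name char(50),
        Last_Name  char(50),
        Address    char(50),
        City       char(50),
        Country    char(25),
        Birth_Date date
    )
""")

# Hypothetical sample row, used only to show that the table accepts data.
conn.execute(
    "INSERT INTO customer VALUES (?, ?, ?, ?, ?, ?)",
    ("Jane", "Doe", "1 Main St", "Boston", "USA", "1990-01-25"),
)

# Map each column name to the data type it was declared with.
declared = {row[1]: row[2] for row in conn.execute("PRAGMA table_info(customer)")}
row = conn.execute("SELECT First_Name, Birth_Date FROM customer").fetchone()
print(declared)
print(row)
```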
Topic: Create Table
56
Views can be considered as virtual tables. Generally speaking, a table has a set of definitions, and it physically stores the data. A view also has a set of definitions, which is built on top of table(s) or other view(s), and it does not physically store the data.
The syntax for creating a view is as follows:
CREATE VIEW "VIEW_NAME" AS "SQL Statement"
"SQL Statement" is a SELECT query over one or more tables or views.
Topic: SQL View
57
Let's use a simple example to illustrate. Say we have the following table:
TABLE Customer
(First_Name char(50),
Last_Name char(50),
Address char(50),
City char(50),
Country char(25),
Birth_Date date)
and we want to create a view called V_Customer that contains only the First_Name, Last_Name, and Country columns from this table, we would type in,
CREATE VIEW V_Customer
AS SELECT First_Name, Last_Name, Country
FROM Customer
Topic: SQL View
58
Now we have a view called V_Customer with the following structure:
View V_Customer
(First_Name char(50),
Last_Name char(50),
Country char(25))
We can also use a view to apply joins to two tables. In this case, users only see one view rather than two tables, and the SQL statement users need to issue becomes much simpler. Let's say we have the following two tables:
Topic: SQL View
59
Topic: View
Table Store_Information
store_name    Sales    Date
Los Angeles   $1,500   Jan-05-1999
San Diego     $250     Jan-07-1999
Los Angeles   $300     Jan-08-1999
Boston        $700     Jan-08-1999
Table Geography
region_name   store_name
East          Boston
East          New York
West          Los Angeles
West          San Diego
and we want to build a view that has sales by region information. We would issue the following SQL statement:
CREATE VIEW V_REGION_SALES
AS SELECT A1.region_name REGION, SUM(A2.Sales) SALES
FROM Geography A1, Store_Information A2
WHERE A1.store_name = A2.store_name
GROUP BY A1.region_name
This gives us a view, V_REGION_SALES, that has been defined to present sales by region records. If we want to find out the content of this view, we type in,
SELECT * FROM V_REGION_SALES
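The V_REGION_SALES view can be tried with the two sample tables using the sketch below, which runs in SQLite through Python's sqlite3 module. Sales are stored as plain integers (an assumption, so that SUM works), and the extra Boston sale at the end is made-up data added to show that a view is recomputed from its base tables rather than stored.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Store_Information (store_name TEXT, Sales INTEGER, Date TEXT);
    CREATE TABLE Geography (region_name TEXT, store_name TEXT);

    INSERT INTO Store_Information VALUES
        ('Los Angeles', 1500, 'Jan-05-1999'),
        ('San Diego',    250, 'Jan-07-1999'),
        ('Los Angeles',  300, 'Jan-08-1999'),
        ('Boston',       700, 'Jan-08-1999');
    INSERT INTO Geography VALUES
        ('East', 'Boston'), ('East', 'New York'),
        ('West', 'Los Angeles'), ('West', 'San Diego');

    CREATE VIEW V_REGION_SALES AS
        SELECT A1.region_name AS REGION, SUM(A2.Sales) AS SALES
        FROM Geography A1, Store_Information A2
        WHERE A1.store_name = A2.store_name
        GROUP BY A1.region_name;
""")

query = "SELECT * FROM V_REGION_SALES ORDER BY REGION"
region_sales = conn.execute(query).fetchall()

# Hypothetical extra sale: the view reflects it without being redefined.
conn.execute("INSERT INTO Store_Information VALUES ('Boston', 100, 'Jan-09-1999')")
updated_sales = conn.execute(query).fetchall()
print(region_sales, updated_sales)
```

New York contributes nothing to the East total because it has no rows in Store_Information.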
60
Topic: SQL View
Indexes help us retrieve data from tables quicker. Let's use an example to illustrate this point: Say we are interested in reading about how to grow peppers in a gardening book. Instead of reading the book from the beginning until we find a section on peppers, it is much quicker for us to go to the index section at the end of the book, locate which pages contain information on peppers, and then go to these pages directly. Going to the index first saves us time and is by far a more efficient method for locating the information we need.
61
Topic: Create Index
The same principle applies for retrieving data from a database table. Without an index, the database system reads through the entire table (this process is called a 'table scan') to locate the desired information. With the proper index in place, the database system can then first go through the index to find out where to retrieve the data, and then go to these locations directly to get the needed data. This is much faster.
62
Topic: Create Index
Therefore, it is often desirable to create indexes on tables. An index can cover one or more columns. The general syntax for creating an index is:
CREATE INDEX "INDEX_NAME" ON "TABLE_NAME" (COLUMN_NAME)
Let's assume that we have the following table,
TABLE Customer
(First_Name char(50),
Last_Name char(50),
Address char(50),
City char(50),
Country char(25),
Birth_Date date)
63
Topic: Create Index
and we want to create an index on the column Last_Name, we would type in,
CREATE INDEX IDX_CUSTOMER_LAST_NAME
on CUSTOMER (Last_Name)
If we want to create an index on both City and Country, we would type in,
CREATE INDEX IDX_CUSTOMER_LOCATION
on CUSTOMER (City, Country)
There is no strict rule on how to name an index. The generally accepted method is to place a prefix, such as "IDX_", before an index name to avoid confusion with other database objects. It is also a good idea to provide information on which table and column(s) the index is used on.
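One way to see an index at work is to ask the database for its query plan before and after creating it. The sketch below does this in SQLite through Python's sqlite3 module; the query_plan helper is hypothetical, and the exact wording of the plan text varies between SQLite versions, so it is printed rather than quoted here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE customer (
        First_Name char(50), Last_Name char(50), Address char(50),
        City char(50), Country char(25), Birth_Date date
    )
""")

def query_plan(sql):
    """Join the human-readable detail column of EXPLAIN QUERY PLAN rows."""
    return " ".join(row[-1] for row in conn.execute("EXPLAIN QUERY PLAN " + sql))

query = "SELECT * FROM customer WHERE Last_Name = 'Smith'"

plan_before = query_plan(query)   # no index yet: a full table scan
conn.execute("CREATE INDEX IDX_CUSTOMER_LAST_NAME ON customer (Last_Name)")
plan_after = query_plan(query)    # the planner can now search via the index

print(plan_before)
print(plan_after)
```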
64
Topic: Alter Table
Once a table is created in the database, there are many occasions where one may wish to change the structure of the table. Typical cases include the following:
• Add a column
• Drop a column
• Change a column name
• Change the data type for a column
Please note that the above is not an exhaustive list. There are other instances where ALTER TABLE is used to change the table structure, such as changing the primary key specification.
65
Topic: Alter Table
The SQL syntax for ALTER TABLE is
ALTER TABLE "table_name"
[alter specification]
[alter specification] is dependent on the type of alteration we wish to perform. For the uses cited above, the [alter specification] statements are:
• Add a column: ADD "column 1" "data type for column 1"
• Drop a column: DROP "column 1"
• Change a column name: CHANGE "old column name" "new column name" "data type for new column name"
• Change the data type for a column: MODIFY "column 1" "new data type"
The exact keywords vary between database systems; CHANGE and MODIFY as shown here follow MySQL.
66
Topic: Alter Table
Let's run through examples for each one of the above, using the "customer" table created in the CREATE TABLE section:
Table customer
Column Name Data Type
First_Name char(50)
Last_Name char(50)
Address char(50)
City char(50)
Country char(25)
Birth_Date date
First, we want to add a column called "Gender" to this table. To do this, we key in:
ALTER table customer add Gender char(1)
Resulting table structure:
Table customer
Column Name Data Type
First_Name char(50)
Last_Name char(50)
Address char(50)
City char(50)
Country char(25)
Birth_Date date
Gender char(1)
67
Topic: Alter Table
Next, we want to rename "Address" to "Addr". To do this, we key in,
ALTER table customer change Address Addr char(50)
Table customer
Column Name Data Type
First_Name char(50)
Last_Name char(50)
Addr char(50)
City char(50)
Country char(25)
Birth_Date date
Gender char(1)
68
Topic: Alter Table
Then, we want to change the data type for "Addr" to char(30). To do this, we key in,
ALTER table customer modify Addr char(30)
Resulting table structure:
Table customer
Column Name Data Type
First_Name char(50)
Last_Name char(50)
Addr char(30)
City char(50)
Country char(25)
Birth_Date date
Gender char(1)
69
Topic: Alter Table
Finally, we want to drop the column "Gender". To do this, we key in,
ALTER table customer drop Gender
Resulting table structure:
Table customer
Column Name Data Type
First_Name char(50)
Last_Name char(50)
Addr char(30)
City char(50)
Country char(25)
Birth_Date date
70
Topic: Primary Key
A primary key is used to uniquely identify each row in a table. It can either be part of the actual record itself , or it can be an artificial field (one that has nothing to do with the actual record). A primary key can consist of one or more fields on a table. When multiple fields are used as a primary key, they are called a composite key. Primary keys can be specified either when the table is created (using CREATE TABLE) or by changing the existing table structure (using ALTER TABLE).
71
Topic: Primary Key
Below are examples for specifying a primary key when creating a table:
CREATE TABLE Customer (SID integer PRIMARY KEY, Last_Name varchar(30), First_Name varchar(30));
Below are examples for specifying a primary key by altering a table:
ALTER TABLE Customer ADD PRIMARY KEY (SID);
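A composite key, as described in the primary key section, is declared by listing several columns in one PRIMARY KEY clause. The sketch below is an illustration in SQLite through Python's sqlite3 module; the enrollment table and its rows are hypothetical and not part of the Customer example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: neither column is unique alone, but the pair must be.
conn.execute("""
    CREATE TABLE enrollment (
        student_id INTEGER NOT NULL,
        course_id  TEXT    NOT NULL,
        PRIMARY KEY (student_id, course_id)
    )
""")
conn.execute("INSERT INTO enrollment VALUES (1, 'SQL101')")
conn.execute("INSERT INTO enrollment VALUES (1, 'DB201')")   # same student, other course
conn.execute("INSERT INTO enrollment VALUES (2, 'SQL101')")  # same course, other student

try:
    conn.execute("INSERT INTO enrollment VALUES (1, 'SQL101')")  # repeats an existing pair
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM enrollment").fetchone()[0]
print(duplicate_rejected, row_count)
```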
72
Topic: Foreign Key
A foreign key is a field (or fields) that points to the primary key of another table. The purpose of the foreign key is to ensure referential integrity of the data. In other words, only values that are supposed to appear in the database are permitted. For example, say we have two tables, a CUSTOMER table that includes all customer data, and an ORDERS table that includes all customer orders. The constraint here is that all orders must be associated with a customer that is already in the CUSTOMER table. In this case, we will place a foreign key on the ORDERS table and have it relate to the primary key of the CUSTOMER table. This way, we can ensure that all orders in the ORDERS table are related to a customer in the CUSTOMER table. In other words, the ORDERS table cannot contain information on a customer that is not in the CUSTOMER table.
73
Topic: Foreign Key
The structure of these two tables will be as follows:
Table CUSTOMER
column name    characteristic
SID            Primary Key
Last_Name
First_Name
Table ORDERS
column name    characteristic
Order_ID       Primary Key
Order_Date
Customer_SID   Foreign Key
Amount
In the above example, the Customer_SID column in the ORDERS table is a foreign key pointing to the SID column in the CUSTOMER table.
74
Topic: Foreign Key
Below is an example of specifying a foreign key when creating the ORDERS table:
CREATE TABLE ORDERS (Order_ID integer primary key, Order_Date date, Customer_SID integer references CUSTOMER(SID), Amount double);
Below is an example of specifying a foreign key by altering a table. This assumes that the ORDERS table has been created, and the foreign key has not yet been put in:
ALTER TABLE ORDERS ADD CONSTRAINT fk_orders1 FOREIGN KEY (customer_sid) REFERENCES CUSTOMER(SID);
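The effect of the foreign key on ORDERS can be observed with the sketch below, run in SQLite through Python's sqlite3 module. Two assumptions apply: the sample rows are made up, and SQLite only enforces foreign keys after PRAGMA foreign_keys = ON, a SQLite-specific switch that other databases do not need.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: enable enforcement
conn.execute("""
    CREATE TABLE CUSTOMER (
        SID integer PRIMARY KEY, Last_Name varchar(30), First_Name varchar(30)
    )
""")
conn.execute("""
    CREATE TABLE ORDERS (
        Order_ID integer PRIMARY KEY,
        Order_Date date,
        Customer_SID integer REFERENCES CUSTOMER(SID),
        Amount double
    )
""")
conn.execute("INSERT INTO CUSTOMER VALUES (1, 'Doe', 'Jane')")
conn.execute("INSERT INTO ORDERS VALUES (100, '1999-01-05', 1, 25.0)")

def is_rejected(sql):
    """Return True if the statement violates an integrity constraint."""
    try:
        conn.execute(sql)
        return False
    except sqlite3.IntegrityError:
        return True

# An order for a customer that does not exist is refused.
orphan_order_rejected = is_rejected("INSERT INTO ORDERS VALUES (101, '1999-01-06', 2, 10.0)")
# A customer that still has orders cannot be deleted (default NO ACTION rule).
parent_delete_rejected = is_rejected("DELETE FROM CUSTOMER WHERE SID = 1")
order_count = conn.execute("SELECT COUNT(*) FROM ORDERS").fetchone()[0]
print(orphan_order_rejected, parent_delete_rejected, order_count)
```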
Topic: Drop Table
Sometimes we may decide that we need to get rid of a table in the database for some reason. In fact, it would be problematic if we could not do so, because this could create a maintenance nightmare for the DBAs. Fortunately, SQL allows us to do it with the DROP TABLE command. The syntax for DROP TABLE is
DROP TABLE "table_name"
So, if we wanted to drop the table called customer that we created in the CREATE TABLE section, we simply type
DROP TABLE customer
Topic: Truncate Table
Sometimes we wish to get rid of all the data in a table. One way of doing this is with DROP TABLE, which we saw in the last section. But what if we wish to simply get rid of the data but not the table itself? For this, we can use the TRUNCATE TABLE command. The syntax for TRUNCATE TABLE is
TRUNCATE TABLE "table_name"
So, if we wanted to truncate the table called customer that we created in the CREATE TABLE section, we simply type
TRUNCATE TABLE customer
SELECT Statement
The SELECT statement is used to query the database and retrieve selected data that match the criteria that you specify. The SELECT statement has five main clauses to choose from, although FROM is the only required clause. Each of the clauses has a vast selection of options, parameters, etc. The clauses are listed below, but each of them will be covered in more detail later in the tutorial.
Here is the format of the SELECT statement:
SELECT [ALL | DISTINCT] column1[,column2]
FROM table1[,table2]
[WHERE "conditions"]
[GROUP BY "column-list"]
[HAVING "conditions"]
[ORDER BY "column-list" [ASC | DESC] ]
Example 1.1:
SELECT name, age, salary
FROM employee
WHERE age > 50;
The above statement will select the name, age, and salary values from every row of the employee table whose age is greater than 50.
Note: Remember to put a semicolon at the end of your SQL statements. The ; indicates that your SQL statement is complete and is ready to be interpreted.
Comparison Operators
=           Equal
>           Greater than
<           Less than
>=          Greater than or equal to
<=          Less than or equal to
<> or !=    Not equal to
LIKE        String comparison test
Example: Using Comparison Operators
SELECT name, title, dept
FROM employee
WHERE title LIKE 'Pro%';
The above statement will select all of the rows/values in the name, title, and dept columns from the employee table whose title starts with 'Pro'. This may return job titles including Programmer or Pro-wrestler.
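The two WHERE examples can be run against a tiny table. This is a minimal sketch, assuming Python's built-in sqlite3 module; the three employee rows are hypothetical sample data, and an ORDER BY is added only to make the output order predictable.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (name varchar(30), age integer, salary integer,
                           title varchar(30), dept varchar(30));
    INSERT INTO employee VALUES ('Ann', 52, 60000, 'Programmer', 'IT');
    INSERT INTO employee VALUES ('Bob', 45, 40000, 'Professor', 'Research');
    INSERT INTO employee VALUES ('Cid', 61, 55000, 'Sales Rep', 'Sales');
""")

# Comparison operator: only rows whose age is greater than 50.
over_50 = conn.execute(
    "SELECT name, age, salary FROM employee WHERE age > 50 ORDER BY name"
).fetchall()

# LIKE with %: titles that start with 'Pro', whatever follows.
pro_titles = conn.execute(
    "SELECT name, title, dept FROM employee WHERE title LIKE 'Pro%' ORDER BY name"
).fetchall()

print(over_50)
print(pro_titles)
```

Bob is filtered out of the first query by his age, and Cid is filtered out of the second because 'Sales Rep' does not start with 'Pro'.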
USING ALL AND DISTINCT
ALL and DISTINCT are keywords used to select either all (the default) or only the distinct (unique) records in your query results. If you would like to retrieve just the unique records in specified columns, use the DISTINCT keyword. DISTINCT discards the duplicate records for the columns you specify after the SELECT keyword.
Example:
SELECT DISTINCT age FROM employee_info;
This statement will return all of the unique ages in the employee_info table.
ALL will display all of the specified columns including all of the duplicates. The ALL keyword is the default if nothing is specified.
Note: The following two tables will be used throughout this course. It is recommended to have them open in another window or print them out.
Assignment 1:
1. From the items_ordered table, select a list of all items purchased for customerid 10449. Display the customerid, item, and price for this customer.
2. Select all columns from the items_ordered table for whoever purchased a Tent.
3. Select the customerid, order_date, and item values from the items_ordered table for any items in the item column that start with the letter "S".
4. Select the distinct items in the items_ordered table. In other words, display a listing of each of the unique items from the items_ordered table.
5. Make up your own select statements and submit them
Aggregate Functions :
MIN returns the smallest value in a given column
MAX returns the largest value in a given column
SUM returns the sum of the numeric values in a given column
AVG returns the average value of a given column
COUNT returns the total number of values in a given column
COUNT(*) returns the number of rows in a table
Aggregate functions are used to compute against a "returned column of numeric data" from your SELECT statement. They basically summarize the results of a particular column of selected data. We are covering these here since they are required by the next topic, "GROUP BY". Although they are required for the "GROUP BY" clause, these functions can be used without the "GROUP BY" clause.
Examples:
SELECT AVG(salary)
FROM employee;
This statement will return a single result which contains the average value of everything returned in the salary column from the employee table.
SELECT AVG(salary)
FROM employee
WHERE title = 'Programmer';
This statement will return the average salary for all employees whose title is equal to 'Programmer'.
SELECT Count(*)
FROM employees;
This particular statement is slightly different from the other aggregate functions since there isn't a column supplied to the count function. This statement will return the number of rows in the employees table.
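The aggregate functions can be checked by hand on a very small table. The following is a minimal sketch, assuming Python's built-in sqlite3 module; the three rows are hypothetical sample data chosen so the results are easy to verify.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (name varchar(30), title varchar(30), salary integer);
    INSERT INTO employee VALUES ('Ann', 'Programmer', 60000);
    INSERT INTO employee VALUES ('Bob', 'Programmer', 40000);
    INSERT INTO employee VALUES ('Cid', 'Sales', 35000);
""")

# Average over every row: (60000 + 40000 + 35000) / 3
avg_all = conn.execute("SELECT AVG(salary) FROM employee").fetchone()[0]

# Average over the rows that pass the WHERE filter: (60000 + 40000) / 2
avg_programmers = conn.execute(
    "SELECT AVG(salary) FROM employee WHERE title = 'Programmer'"
).fetchone()[0]

# Several aggregates can be computed in one SELECT.
lowest, highest, total, rows = conn.execute(
    "SELECT MIN(salary), MAX(salary), SUM(salary), COUNT(*) FROM employee"
).fetchone()

print(avg_all, avg_programmers, lowest, highest, total, rows)
```

Each query returns a single row, because without GROUP BY the whole result set is treated as one group.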
Assignment 2:
1. Select the maximum price of any item ordered in the items_ordered table. Hint: Select the maximum price only.
2. Select the average price of all of the items ordered that were purchased in the month of Dec.
3. What is the total number of rows in the items_ordered table?
4. For all of the tents that were ordered in the items_ordered table, what is the lowest price of a tent? Hint: Your query should return the price only.
GROUP BY clause:
The GROUP BY clause gathers together all of the rows that have the same value in the specified column(s) and allows aggregate functions to be performed on each group.
GROUP BY clause syntax:
SELECT column1, SUM(column2) … column(n)
FROM table1 …
GROUP BY column1 … column(n)
Examples:
Let's say you would like to retrieve a list of the highest paid salaries in each dept:
SELECT max(salary), dept
FROM employee
GROUP BY dept;
This statement will select the maximum salary for the people in each unique department. Basically, the salary of the person who makes the most in each department will be displayed. Their salary and their department will be returned.
Let's say you want to group everything of quantity 1 together, everything of quantity 2 together, everything of quantity 3 together, etc. If you would like to determine what the largest cost item is for each grouped quantity (all quantity 1's, all quantity 2's, all quantity 3's, etc.), you would enter:
SELECT quantity, max(price)
FROM items_ordered
GROUP BY quantity;
Enter the statement above, and take a look at the results to see if it returned what you were expecting. Verify that the maximum price in each quantity group is really the maximum price.
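The quantity example can be verified with a minimal sketch, assuming Python's built-in sqlite3 module. The five rows are hypothetical and use only three columns of items_ordered; an ORDER BY is added so the groups come back in a predictable order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE items_ordered (item varchar(30), quantity integer, price real);
    INSERT INTO items_ordered VALUES ('Tent',      1, 85.0);
    INSERT INTO items_ordered VALUES ('Lantern',   1, 35.0);
    INSERT INTO items_ordered VALUES ('Gloves',    2, 15.0);
    INSERT INTO items_ordered VALUES ('Snowboard', 2, 45.0);
    INSERT INTO items_ordered VALUES ('Ear Muffs', 3, 12.5);
""")

# One output row per distinct quantity, with the largest price in that group.
max_per_quantity = conn.execute(
    "SELECT quantity, max(price) FROM items_ordered "
    "GROUP BY quantity ORDER BY quantity"
).fetchall()

print(max_per_quantity)
```

Five input rows collapse into three groups, one for each quantity value.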
Assignment 3:
1. How many people are in each unique state in the customers table? Select the state and display the number of people in each. Hint: count is used to count rows in a column, sum works on numeric data only.
2. From the items_ordered table, select the item, maximum price, and minimum price for each specific item in the table. Hint: The items will need to be broken up into separate groups.
3. How many orders did each customer make? Use the items_ordered table. Select the customerid, number of orders they made, and the sum of their orders.
HAVING clause:
The HAVING clause allows you to specify conditions on each group; in other words, which groups are returned is based on the conditions you specify. The HAVING clause should follow the GROUP BY clause if you are going to use it.
HAVING clause syntax:
SELECT column1, SUM(column2)
FROM "list-of-tables"
GROUP BY "column-list"
HAVING "condition";
Examples:
Let's say you have an employee table containing the employee's name, department, salary, and age. If you would like to select the average salary of the employees in each department, you could enter:
SELECT dept, avg(salary)
FROM employee
GROUP BY dept;
But, let's say that you want to display ONLY the departments whose average salary is over 20000:
SELECT dept, avg(salary)
FROM employee
GROUP BY dept
HAVING avg(salary) > 20000;
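HAVING is easy to confuse with WHERE, so the following minimal sketch runs both on the same data. It assumes Python's built-in sqlite3 module, and the four employee rows are hypothetical. HAVING tests each group after aggregation, while WHERE tests each row before aggregation.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (name varchar(30), dept varchar(30), salary integer);
    INSERT INTO employee VALUES ('Ann', 'IT',    30000);
    INSERT INTO employee VALUES ('Bob', 'IT',    20000);
    INSERT INTO employee VALUES ('Cid', 'Sales', 15000);
    INSERT INTO employee VALUES ('Dee', 'Sales', 21000);
""")

# HAVING: keep only departments whose average salary exceeds 20000.
# IT averages 25000 (kept); Sales averages 18000 (dropped).
having_rows = conn.execute(
    "SELECT dept, avg(salary) FROM employee "
    "GROUP BY dept HAVING avg(salary) > 20000 ORDER BY dept"
).fetchall()

# WHERE: drop individual rows at or below 20000 first, then average the rest.
where_rows = conn.execute(
    "SELECT dept, avg(salary) FROM employee "
    "WHERE salary > 20000 GROUP BY dept ORDER BY dept"
).fetchall()

print(having_rows)
print(where_rows)
```

Sales disappears under HAVING even though one of its employees earns more than 20000, because the condition is applied to the group's average and not to individual rows.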
ORDER BY clause:
ORDER BY is an optional clause which will allow you to display the results of your query in a sorted order (either ascending order or descending order) based on the columns that you specify to order by.
ORDER BY clause syntax:
SELECT column1, SUM(column2)
FROM "list-of-tables"
ORDER BY "column-list" [ASC | DESC];
[ ] = optional
ASC = Ascending Order - default
DESC = Descending Order
Examples:
This statement will select the employee_id, dept, name, age, and salary from the employee_info table where the dept equals 'Sales' and will list the results in ascending (default) order based on their salary:
SELECT employee_id, dept, name, age, salary
FROM employee_info
WHERE dept = 'Sales'
ORDER BY salary;
If you would like to order based on multiple columns, you must separate the columns with commas. For example:
SELECT employee_id, dept, name, age, salary
FROM employee_info
WHERE dept = 'Sales'
ORDER BY salary, age DESC;
Here DESC applies only to age: rows are sorted by salary in ascending order, and rows with the same salary are sorted by age in descending order.
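How a multi-column ORDER BY breaks ties can be seen in a minimal sketch, assuming Python's built-in sqlite3 module. The four rows are hypothetical; two Sales employees deliberately share the same salary so that the second sort column decides their order.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee_info (employee_id integer, dept varchar(30),
                                name varchar(30), age integer, salary integer);
    INSERT INTO employee_info VALUES (1, 'Sales', 'Ann', 30, 40000);
    INSERT INTO employee_info VALUES (2, 'Sales', 'Bob', 45, 40000);
    INSERT INTO employee_info VALUES (3, 'Sales', 'Cid', 28, 35000);
    INSERT INTO employee_info VALUES (4, 'IT',    'Dee', 50, 90000);
""")

rows = conn.execute(
    "SELECT employee_id, dept, name, age, salary FROM employee_info "
    "WHERE dept = 'Sales' ORDER BY salary, age DESC"
).fetchall()

# Keep just the names to make the ordering easy to read.
names_in_order = [row[2] for row in rows]
print(names_in_order)
```

Cid comes first with the lowest salary; Bob and Ann tie on salary, so the older of the two (Bob) is listed before Ann.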
Assignment 4:
1. Select the lastname, firstname, and city for all customers in the customers table. Display the results in Ascending Order based on the lastname.
2. Same thing as exercise #1, but display the results in Descending order.
3. Select the item and price for all of the items in the items_ordered table whose price is greater than 10.00. Display the results in Ascending order based on the price.
Combining conditions and Boolean Operators:
The AND operator can be used to join two or more conditions in the WHERE clause. Both sides of the AND condition must be true in order for the condition to be met and for those rows to be displayed.
SELECT column1, SUM(column2)
FROM "list-of-tables"
WHERE "condition1" AND "condition2";
The OR operator can be used to join two or more conditions in the WHERE clause also. However, either side of the OR operator can be true and the condition will be met - hence, the rows will be displayed. With the OR operator, either side can be true or both sides can be true.
Examples:
SELECT employeeid, firstname, lastname, title, salary
FROM employee_info
WHERE salary >= 50000.00 AND title = 'Programmer';
This statement will select the employeeid, firstname, lastname, title, and salary from the employee_info table where the salary is greater than or equal to 50000.00 AND the title is equal to 'Programmer'. Both of these conditions must be true in order for the rows to be returned in the query. If either is false, then the row will not be displayed.
Although they are not required, you can use parentheses around your conditional expressions to make them easier to read:
SELECT employeeid, firstname, lastname, title, salary
FROM employee_info
WHERE (salary >= 50000.00) AND (title = 'Programmer');
SELECT firstname, lastname, title, salary
FROM employee_info
WHERE (title = 'Sales') OR (title = 'Programmer');
This statement will select the firstname, lastname, title, and salary from the employee_info table where the title is either equal to 'Sales' OR the title is equal to 'Programmer'.
Assignment 5:
1. Select the customerid, order_date, and item from the items_ordered table for all items except 'Snow Shoes' and 'Ear Muffs'. Display a row only if its item is neither of these two.
2. Select the item and price of all items that start with the letters 'S', 'P', or 'F'.
IN and BETWEEN Conditional Operators:
SELECT col1, SUM(col2)
FROM "list-of-tables"
WHERE col3 IN (list-of-values);
SELECT col1, SUM(col2)
FROM "list-of-tables"
WHERE col3 BETWEEN value1 AND value2;
The IN conditional operator is really a set membership test operator. That is, it is used to test whether or not a value (stated before the keyword IN) is "in" the list of values provided after the keyword IN.
Examples:
SELECT employeeid, lastname, salary
FROM employee_info
WHERE lastname IN ('Hernandez', 'Jones', 'Roberts', 'Ruiz');
This statement will select the employeeid, lastname, salary from the employee_info table where the lastname is equal to either: Hernandez, Jones, Roberts, or Ruiz. It will return the rows if it is ANY of these values.
The IN conditional operator can be rewritten by using compound conditions using the equals operator and combining it with OR - with exact same output results:
SELECT employeeid, lastname, salary
FROM employee_info
WHERE lastname = 'Hernandez' OR lastname = 'Jones' OR lastname = 'Roberts' OR lastname = 'Ruiz';
As you can see, the IN operator is much shorter and easier to read when you are testing for more than two or three values.
You can also use NOT IN to exclude the rows in your list.
Examples:
The BETWEEN conditional operator is used to test to see whether or not a value (stated before the keyword BETWEEN) is "between" the two values stated after the keyword BETWEEN.
SELECT employeeid, age, lastname, salary
FROM employee_info
WHERE age BETWEEN 30 AND 40;
This statement will select the employeeid, age, lastname, and salary from the employee_info table where the age is between 30 and 40 (including 30 and 40).
This statement can also be rewritten without the BETWEEN operator:
SELECT employeeid, age, lastname, salary
FROM employee_info
WHERE age >= 30 AND age <= 40;
You can also use NOT BETWEEN to exclude the values between your range.
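The equivalences described above (IN versus a chain of ORs, BETWEEN versus a pair of comparisons) can be confirmed with a minimal sketch, assuming Python's built-in sqlite3 module. The five rows are hypothetical; ages 30 and 40 are included on purpose to show that BETWEEN is inclusive at both ends.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee_info (employeeid integer, lastname varchar(30),
                                age integer, salary integer);
    INSERT INTO employee_info VALUES (1, 'Hernandez', 30, 50000);
    INSERT INTO employee_info VALUES (2, 'Jones',     40, 45000);
    INSERT INTO employee_info VALUES (3, 'Smith',     35, 40000);
    INSERT INTO employee_info VALUES (4, 'Ruiz',      41, 60000);
    INSERT INTO employee_info VALUES (5, 'Lee',       29, 30000);
""")

def ids(where_clause):
    """Return the employeeids matching a WHERE clause, in ascending order."""
    sql = f"SELECT employeeid FROM employee_info WHERE {where_clause} ORDER BY employeeid"
    return [row[0] for row in conn.execute(sql)]

in_ids = ids("lastname IN ('Hernandez', 'Jones', 'Roberts', 'Ruiz')")
or_ids = ids("lastname = 'Hernandez' OR lastname = 'Jones' "
             "OR lastname = 'Roberts' OR lastname = 'Ruiz'")

between_ids = ids("age BETWEEN 30 AND 40")
range_ids = ids("age >= 30 AND age <= 40")
not_between_ids = ids("age NOT BETWEEN 30 AND 40")

print(in_ids, or_ids)
print(between_ids, range_ids, not_between_ids)
```

The rows aged exactly 30 and 40 are returned by BETWEEN, and NOT BETWEEN returns exactly the remaining rows.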
ABS(x) returns the absolute value of x
SIGN(x) returns the sign of input x as -1, 0, or 1 (negative, zero, or positive respectively)
MOD(x,y) modulo - returns the integer remainder of x divided by y (same as x%y)
FLOOR(x) returns the largest integer value that is less than or equal to x
CEILING(x) or CEIL(x) returns the smallest integer value that is greater than or equal to x
POWER(x,y) returns the value of x raised to the power of y
ROUND(x) returns the value of x rounded to the nearest whole integer
ROUND(x,d) returns the value of x rounded to the number of decimal places specified by the value d
SQRT(x) returns the square-root value of x
Examples:
SELECT round(salary), firstname
FROM employee_info;
This statement will select the salary rounded to the nearest whole value and the firstname from the employee_info table.
Assignment 6:
1. Select the item and per unit price for each item in the items_ordered table. Hint: Divide the price by the quantity.
Table Joins:
All of the queries up until this point have been useful with the exception of one major limitation - that is, you've been selecting from only one table at a time with your SELECT statement. It is time to introduce you to one of the most beneficial features of SQL & relational database systems - the "Join". To put it simply, the "Join" makes relational database systems "relational".
Joins allow you to link data from two or more tables together into a single query result, from one single SELECT statement.
A "Join" can be recognized in a SQL SELECT statement if it has more than one table after the FROM keyword:
SELECT "list-of-columns"
FROM table1,table2
WHERE "search-condition(s)"
Joins can be explained easier by demonstrating what would happen if you worked with one table only, and didn't have the ability to use "joins". This single table database is also sometimes referred to as a "flat table". Let's say you have a one-table database that is used to keep track of all of your customers and what they purchase from your store:
Every time a new row is inserted into the table, all columns must be filled in, resulting in unnecessary "redundant data". For example, every time Wolfgang Schultz purchases something, the following rows will be inserted into the table:
id     first     last     address         city  state  zip    date    item         price
10982  Wolfgang  Schultz  300 N. 1st Ave  Yuma  AZ     85002  032299  snowboard    45.00
10982  Wolfgang  Schultz  300 N. 1st Ave  Yuma  AZ     85002  082899  snow shovel  35.00
10982  Wolfgang  Schultz  300 N. 1st Ave  Yuma  AZ     85002  091199  gloves       15.00
10982  Wolfgang  Schultz  300 N. 1st Ave  Yuma  AZ     85002  100999  lantern      35.00
10982  Wolfgang  Schultz  300 N. 1st Ave  Yuma  AZ     85002  022900  tent         85.00
An ideal database would have two tables:
1. One for keeping track of your customers
2. And the other to keep track of what they purchase
"Customer_info" table:
customer_number  firstname  lastname  address  city  state  zip
"Purchases" table:
customer_number  date  item  price
Now, whenever a purchase is made from a repeating customer, the 2nd table, "Purchases" only needs to be updated! We've just eliminated useless redundant data, that is, we've just normalized this database!
Notice how each of the tables has a common "customer_number" column. This column, which contains the unique customer number, will be used to JOIN the two tables. Using the two new tables, let's say you would like to select the customer's name and the items they've purchased.
Examples:
SELECT customer_info.firstname, customer_info.lastname, purchases.item
FROM customer_info, purchases
WHERE customer_info.customer_number = purchases.customer_number;
This particular "Join" is known as an "Inner Join" or "Equijoin". This is the most common type of "Join" that you will see or use.
Notice that each of the columns is always preceded with the table name and a period. This isn't always required; however, it IS good practice so that you won't confuse which columns go with what tables. It is required when the two tables share a column name. I recommend preceding all of your columns with the table names when using joins.
SELECT employee_info.employeeid, employee_info.lastname, employee_sales.comission
FROM employee_info, employee_sales
WHERE employee_info.employeeid = employee_sales.employeeid;
This statement will select the employeeid, lastname (from the employee_info table), and the comission value (from the employee_sales table) for all of the rows where the employeeid in the employee_info table matches the employeeid in the employee_sales table.
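The customer and purchase join can be tried on the normalized two-table layout with a minimal sketch, assuming Python's built-in sqlite3 module. Only a few columns are kept, and the second customer (who has no purchases) is hypothetical. The sketch also runs the same query with the explicit INNER JOIN ... ON form, which is an alternative spelling not used in the examples above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer_info (customer_number integer PRIMARY KEY,
                                firstname varchar(30), lastname varchar(30));
    CREATE TABLE purchases (customer_number integer, item varchar(30), price real);
    INSERT INTO customer_info VALUES (10982, 'Wolfgang', 'Schultz');
    INSERT INTO customer_info VALUES (10983, 'Anna', 'Lee');  -- no purchases
    INSERT INTO purchases VALUES (10982, 'snowboard', 45.00);
    INSERT INTO purchases VALUES (10982, 'tent', 85.00);
""")

# Join written as in the text: two tables after FROM, match in WHERE.
comma_join = conn.execute("""
    SELECT customer_info.firstname, customer_info.lastname, purchases.item
    FROM customer_info, purchases
    WHERE customer_info.customer_number = purchases.customer_number
    ORDER BY purchases.item
""").fetchall()

# Same inner join with the explicit JOIN ... ON spelling.
explicit_join = conn.execute("""
    SELECT customer_info.firstname, customer_info.lastname, purchases.item
    FROM customer_info
    INNER JOIN purchases
        ON customer_info.customer_number = purchases.customer_number
    ORDER BY purchases.item
""").fetchall()

print(comma_join)
print(comma_join == explicit_join)
```

The customer's name is stored once but appears on every purchase row of the result, and the customer without purchases does not appear at all, which is the defining behavior of an inner join.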
Assignment 7:
1. Write a query using a join to determine which items were ordered by each of the customers in the customers table. Select the customerid, firstname, lastname, order_date, item, and price for everything each customer purchased in the items_ordered table.
2. Repeat exercise #1, however display the results sorted by state in descending order.
Additional References:
•Judith Bowman, Sandra Emerson, and Marcy Darnovsky, The Practical SQL Handbook: Using Structured Query Language, Third Edition, Addison-Wesley, ISBN 0-201-44787-8, 1996.
•C. J. Date and Hugh Darwen, A Guide to the SQL Standard: A User's Guide to the Standard Database Language SQL, Fourth Edition, Addison-Wesley, ISBN 0-201-96426-0, 1997.
•C. J. Date, An Introduction to Database Systems, Volume 1, Sixth Edition, Addison-Wesley, 1994.
•Ramez Elmasri and Shamkant Navathe, Fundamentals of Database Systems, 3rd Edition, Addison-Wesley, ISBN 0-805-31755-4, August 1999.
•Jim Melton and Alan R. Simon, Understanding the New SQL: A Complete Guide, Morgan Kaufmann, ISBN 1-55860-245-3, 1993.
•Jeffrey D. Ullman, Principles of Database and Knowledge-Base Systems, Volume 1, Computer Science Press, 1988.
Reference websites:
•Wikipedia: SQL - http://en.wikipedia.org/wiki/SQL
  History and overview of the language.
•SQLCourse - http://www.sqlcourse.com/
  Interactive/On-line SQL Tutorial with SQL Interpreter & live practice database.
•A Gentle Introduction to SQL - http://sqlzoo.net/
  Online SQL tutorial featuring a live interpreter to test SQL commands.
•An Introduction to Database Normalization - http://dev.mysql.com/tech-resources/articles/intro-to-normalization.html
•SQL Tutorial - http://www.firstsql.com/tutor.htm
  Complete SQL Tutorial using SQL92.
•SQL Tutorial - http://www.1keydata.com/sql/sql.html
  This site aims to teach beginners the building blocks of SQL.
•Database and SQL eLearning - http://db.grussell.org/
  Database theory and an online tutorial interface to an Oracle database system, allowing a user to learn SQL interactively. The site automatically checks and marks SQL and gives instant feedback.
•SQL exercises - http://www.sql-ex.ru
•Introduction to Structured Query Language - http://riki-lb1.vet.ohio-state.edu/mqlin/computec/tutorials/SQLTutorial.htm
•SQL for Web Nerds - http://eveander.com/arsdigita/books/sql/
  A nicely structured manuscript on SQL by Philip Greenspun, based on the Oracle database. Queries, transactions, triggers, and RDBMS concepts are covered.
•SQL School - http://www.w3schools.com/sql/
•http://directory.google.com/Top/Computers/Programming/Languages/SQL/FAQs,_Help,_and_Tutorials/
-The End-