connecting with computer science, 2e chapter 6 database fundamentals
Post on 20-Dec-2015
237 views
TRANSCRIPT
Connecting with Computer Science, 2e
Chapter 6Database Fundamentals
Connecting with Computer Science, 2e 2
Objectives
• In this chapter you will:– Consider the widespread use of databases– Take a brief tour of database development history– Learn basic database concepts– Be introduced to popular database management
software– See how normalization makes your data more
organized
Connecting with Computer Science, 2e 3
Objectives (cont’d.)
• In this chapter you will (cont’d.):– Explore the database design process– Understand data relationships– Gain an understanding of Structured Query Language
(SQL)– Learn some common SQL commands
Connecting with Computer Science, 2e 4
Why You Need to Know About...Databases
• Data must be organized
• Effective computer professionals know correct database design– Normalization
• Ensures an accurate and reliable database
– Structured Query Language (SQL)• Describes how information is retrieved from relational
database
Database Applications
• Database– Data logically related and organized into a file or set
of files to allow access and use
• Database applications– Student grading and library inventory– Genealogy studies and Social Security payments– Real estate sales, video store rentals, and retail sales– Space shuttle missions
• Database development– Essential part of computer professional’s daily life
Connecting with Computer Science, 2e 5
Brief History of Database Management Systems
• 1970 to 1975– Work of IBM employees E. F. Codd and C. J. Date
• Created theoretical model for designing data structures• Model became foundation for database design
– Software for organizing and sorting data• System R by IBM and Ingres by UC-Berkeley• Structured Query Language (SQL)• SQL: database standard• Database management system (DBMS) for PCs
• 1978– C. Wayne Ratliff of Martin Marietta develops Vulcan
Connecting with Computer Science, 2e 6
Connecting with Computer Science, 2e 7
Brief History of Database Management Systems (cont’d.)
• 1980 to the present – Vulcan renamed dBASE II (no dBase I)– Popularity of dBASE II inspires other companies
• Paradox, Microsoft Access, and FoxPro
– Databases become essential for business• Corporate decision making
• Systems: inventory management to customer support
Connecting with Computer Science, 2e 8
Table 6-1, Popular database management systems
Brief History of Database Management Systems (cont’d.)
Connecting with Computer Science, 2e 9
Database Management System Fundamentals
• Six main DBMS functions– Manage database security– Manage multiple users’ access to the database– Manage database backup and recovery– Ensure data integrity– Provide an end-user interface to the database– Provide a query language allowing users to modify
and view database information easily
Connecting with Computer Science, 2e 10
Database Concepts
• Basic database elements– Database: collection of one or more tables (entities)– Table or entity: divided into rows and columns– Row (record or tuple): collection of columns– Column (field or attribute): represents specific
information– Domain: set of possible column values
Connecting with Computer Science, 2e 11
Figure 6-1, A database table consists of rows and columns
Database Concepts (cont’d.)
Connecting with Computer Science, 2e 12
Indexes
• Special files occupying their own space– Specify columns determining how information stored
in a table can be accessed more efficiently• Examples: music database and the telephone book
• Advantages– Flexibility: many different columns to sort against– Searching and retrieval speeds up
• Disadvantages– Extra storage space– Updating takes longer
Connecting with Computer Science, 2e 13
An Example of Indexing
Figure 6-2, You use database concepts in your everyday life
Connecting with Computer Science, 2e 14
An Example of Indexing (cont’d.)
• Each database row has similar attributes
• Sort key: one or more columns used to determine the data’s sort order– One key or a combination of keys determines sort
order
• Database information is stored in natural or sequential order– Order of records displayed equals the order records
are entered
Connecting with Computer Science, 2e 15
Figure 6-3, Database records sorted by using the UPC column as a key
An Example of Indexing (cont’d.)
Connecting with Computer Science, 2e 16
Figure 6-4, Database records sorted by Brand_Name and Description
An Example of Indexing (cont’d.)
Connecting with Computer Science, 2e 17
Normalization
• Set of rules dictating database design– Eliminates duplication and inconsistencies– Process: sequence of stages called normal forms
• Five normal forms
• Third normal form provides sufficient structure
• Three database design problems solved– Representation of certain real-world items– Redundancies (repetitions) in data– Excluded and inconsistent information
Connecting with Computer Science, 2e 18
Preparing for Normalization: Gathering Columns
• Make a list of all pertinent fields (columns or attributes)– Source of fields: end-user reports or song inventory– Write fields on the column list
• Review user-specified input forms– Convert each field from the report to column in table
Connecting with Computer Science, 2e 19
Figure 6-5, End-user report with table columns highlighted
Preparing for Normalization: Gathering Columns (cont’d.)
Connecting with Computer Science, 2e 20
Figure 6-6, Additional table columns can be gleanedfrom input forms
Preparing for Normalization: Gathering Columns (cont’d.)
Connecting with Computer Science, 2e 21
Preparing for Normalization: Gathering Columns (cont’d.)
• Reconcile fields in report to column list
• Create tables of columns by combining associated fields– Logically group related information
• Example: information on artist and song files
• Gather data to create physical music database
Connecting with Computer Science, 2e 22
First Normal Form
• Unnormalized table– Row-column intersection with two or more values
• First normal form (1NF) eliminates redundancies– Create new record for duplicated column– Fill in blanks so all columns in record have a value– Columns with duplications: Album_Num,
Album_Name, Artist_Code, Artist_Name, Media_Type, and Genre_Code
• Remaining redundancies are addressed later
Connecting with Computer Science, 2e 23
Second Normal Form
• Next steps:– Assign a primary key to the table– Identify functional dependencies within the table
• Primary key (PK)– Column or combination of columns (composite)
uniquely identifying a row within a table• Examples: Student ID or Artist_Code
Second Normal Form (cont’d.)
• Determinant: column(s) used to determine value assigned to another column in the same row– Example: Artist_Code determinant for Artist_Name
• Functional dependency: column’s value dependent on another column’s value– Each value of first column is matched to single value
in second • Example: Artist_Name functionally dependent on
Artist_Code
• Composite key: primary key made up of more than one column
Connecting with Computer Science, 2e 24
Connecting with Computer Science, 2e 25
Second Normal Form (cont’d.)
• Second normal form (2NF)– First normal form and non-PK columns functionally
dependent on PK
• Creating 2NF– Determine columns not dependent on PK
• Remove columns and place in new table
– Default 2NF: table without composite PK
• Primary 2NF benefit: save disk space
Connecting with Computer Science, 2e 26
Figure 6-10, 2NF: Remove any columns that aren’t dependent on the composite primary key and create a new table
Second Normal Form (cont’d.)
Connecting with Computer Science, 2e 27
Third Normal Form
• Third normal form (3NF)– Eliminate transitive dependencies
• Column dependent on another column not part of PK
• Example: Genre_Desc depends on Genre_Code
– Each nonkey field should be a fact about the PK
Connecting with Computer Science, 2e 28
Figure 6-11, Songs table with the Genre_Desc column added
Third Normal Form (cont’d.)
Connecting with Computer Science, 2e 29
Third Normal Form (cont’d.)
• Creating 3NF– Remove transitive dependencies– Place removed columns in new table
• Primary 3NF benefit: eliminates repetition and saves disk space
Connecting with Computer Science, 2e 30
Figure 6-12, Songs and Genre tables in 3NF
Third Normal Form (cont’d.)
Connecting with Computer Science, 2e 31
Figure 6-13, Eliminating repetition saves storage space
Third Normal Form (cont’d.)
Connecting with Computer Science, 2e 32
The Database Design Process
• Six steps to designing a normalized database
• Example:– Creation of student-grading system
Connecting with Computer Science, 2e 33
Step 1: Investigate and Define
• Investigate and research modeled information
• Define purposes and uses of the database
• Use any documents end users work with to complete tasks
• Involve end users in design process– Example: student-grading system based on a course
syllabus
Step 2: Make a Master Column List
• Create list of fields for information• Possible field properties:
– Field name– Data type (char, varchar, number, date, etc.)– Length– Number of decimal places (if any)
• Review users’ documents for fields – Forms and reports good source for fields
• Example fields: Student ID, First Name, Last Name, E-Mail, Grade Level, Grade Level Description, Homework Average, Quiz Average, Test Average
Connecting with Computer Science, 2e 34
Connecting with Computer Science, 2e 35
Step 3: Create the Tables
• Logically group defined columns into tables– Heart of design process – Relies heavily on normalization rules
• Main rules in database design: 1NF – 3NF – Table in 3NF: well defined
• Normalizing databases– Like cleaning a closet
Connecting with Computer Science, 2e 36
Figure 6-14, Tables created for the student-grading system
Step 3: Create the Tables (cont’d.)
Connecting with Computer Science, 2e 37
Step 4: Work on Relationships
• Relationship: defines table relations– Two types of relationships discussed in this chapter
• One-to-many (1:M)
• One-to-one (1:1)
• Primary and foreign keys are defined in each of the tables– Primary key (PK): determinant discussed earlier – Foreign key (FK): column in one table is PK in
another
Connecting with Computer Science, 2e 38
Step 4: Work on Relationships (cont’d.)
• One-to-many (1:M) – Most common relationship – Each record in Table A relates to multiple records in
Table B – Requires that FK column(s) in “many” table refer to
PK column in “one” table• Example:
– Grades Table to Student Table
Copy editor:
Chg. “multiple” to “zero to many”?
Copy editor:
Chg. “multiple” to “zero to many”?
Connecting with Computer Science, 2e 39
Figure 6-15, The relationship of Student to Course Grade is one-to-many (1:M)
Step 4: Work on Relationships (cont’d.)
Connecting with Computer Science, 2e 40
Step 4: Work on Relationships (cont’d.)
• One-to-one (1:1) – For every record in Table A, there can be one and
only one matching record in Table B– Consider combining tables in 1:1 relationship– Normally the two tables can be combined into one
table
Connecting with Computer Science, 2e 41
Figure 6-16, The relationship of Student to Grade Level is one-to-one (1:1)
Step 4: Work on Relationships (cont’d.)
Connecting with Computer Science, 2e 42
Step 5: Analyze the Design
• Analyze work completed– Search for design errors and refine tables as needed– Follow normalization forms (ideally to 3NF) – Correct any violations
• ER models– Visual diagram composed of entities and relationships
• Entities represent the database tables
• Relationships show how tables relate to each other
• Cardinality– Shows numeric relations between entities
Connecting with Computer Science, 2e 43
Step 5: Analyze the Design (cont’d.)
• Types of cardinality (and their notation) include: – 0..1, 0:1 (zero to one)– 0..M, 0:N, 0..*, 0..n (zero to many)– 1..1, 1:1 (one to one)– 1..M, 1:M, 1:N, 1..*, 1..n (one to many)– M..1, M:1, N:1, *..1, n..1 (many to one)– M..M, M:M, N:N, *..*, n..n (many to many)
• Example:– ER model for the student-grading system
Connecting with Computer Science, 2e 44
Figure 6-17, The student-grading system ER model in Visio
Step 5: Analyze the Design (cont’d.)
Connecting with Computer Science, 2e 45
Step 6: Reevaluate
• Reevaluate database performance– Ensure database meets all reporting and form needs– Include the end users– Explain each table and field being used– Make sure fields are defined to users’ requirements– Manipulate data structure with SQL commands
Connecting with Computer Science, 2e 46
Structured Query Language (SQL)
• Structured Query Language (SQL) functions– Manipulate data– Define data– Administer data
• Many different “dialects” of SQL
Connecting with Computer Science, 2e 47
Structured Query Language (SQL) (cont’d.)
• SQL advantages– Reduces training time (syntax based in English)– Makes applications portable (SQL is standardized)– Reduces the amount of data being transferred– Increases application speed
• Following sections show basic SQL commands– Creating tables– Adding (inserting) rows of data– Querying table to select certain information
Connecting with Computer Science, 2e 48
Figure 6-18, A sample SQL statement and results
Structured Query Language (SQL) (cont’d.)
Connecting with Computer Science, 2e 49
CREATE TABLE Statement
• CREATE TABLE statement: make a new table– NULL/NOT NULL
• Optional property indicates whether data required
• Syntax:
Connecting with Computer Science, 2e 50
CREATE TABLE Statement (cont’d.)
• SQL statements to create table called Songs:
Connecting with Computer Science, 2e 51
INSERT INTO Statement
• INSERT INTO statement: add new rows of data– Requires a table name– Square brackets ([…]) specify optional columns– Columns are on separate lines for readability
• Syntax: INSERT INTO table_name [(column1,column2, . . . )]
VALUES (constant1, constant2, . . . )
Connecting with Computer Science, 2e 52
Figure 6-19, SQL INSERT INTO statement to add a record to the Songs table and its result
INSERT INTO Statement (cont’d.)
Connecting with Computer Science, 2e 53
SELECT Statement
• Retrieves data from one or more tables
• Syntax:
• Specified order determines order of retrieval/display
Connecting with Computer Science, 2e 54
Figure 6-20, SQL SELECT statement to return the name, media type, and track number for songs
SELECT Statement (cont’d.)
Connecting with Computer Science, 2e 55
WHERE Clause
• Specifies additional criteria for retrieving data– Fields should be included in fields selected
• AND and OR keywords– Allow specification of multiple search criteria– AND indicates that all criteria must be met– OR indicates only one criterion needs to be met
Connecting with Computer Science, 2e 56
Figure 6-21, SQL SELECT statement with a WHERE clause and the results
WHERE Clause (cont’d.)
Connecting with Computer Science, 2e 57
Figure 6-22, More descriptive SQL SELECT statement with a WHERE clause
WHERE Clause (cont’d.)
Connecting with Computer Science, 2e 58
Figure 6-23, SQL SELECT statement with a WHERE clause and AND versus OR
WHERE Clause (cont’d.)
Connecting with Computer Science, 2e 59
ORDER BY Clause
• Permits user to change how the data is returned– Makes for a more meaningful presentation
• Default: data returned in sequential order
• User can specify the ORDER BY column name(s)
• Also returns data in ascending (default) or descending order
Connecting with Computer Science, 2e 60
Figure 6-24, SQL SELECT statement with an ORDER BY clause and the results
ORDER BY Clause (cont’d.)
Connecting with Computer Science, 2e 61
Figure 6-25, SQL SELECT statement, using an ORDER BY clause with the default ascending option
ORDER BY Clause (cont’d.)
Connecting with Computer Science, 2e 62
Figure 6-26, SQL SELECT statement, using an ORDER BY clause with the DESC option
ORDER BY Clause (cont’d.)
Connecting with Computer Science, 2e 63
ORDER BY Clause (cont’d.)
• Many more options– Specified on SELECT statement
• Many more SQL commands used– Maintaining, defining, and administering data found
within a database
Connecting with Computer Science, 2e 64
One Last Thought
• Poorly organized database– More of a hindrance than a benefit
• Careful planning required– How data will be structured and stored– Applying normalization rules– Using the right SQL statements to extract the kind of
information desired
• Database – Powerful tool used in many areas
• Including business and computing
Connecting with Computer Science, 2e 65
Summary
• Database– Collection of logically related records
• DBMS– Software used to design, manage, and interface with
databases
• Indexes– Files that revise default sequential order of data
• Normalization– Process of removing data redundancies
Connecting with Computer Science, 2e 66
Summary (cont’d.)
• Data is normalized with five normal forms– First three normal forms most important
• Primary key– Uniquely identifies table entries
• Foreign key– Primary keys in other tables
• Entity relationship model– Visual diagram of tables and relationships
Connecting with Computer Science, 2e 67
Summary (cont’d.)
• 1:M and 1:1 notations indicate cardinality
• Six-step database design process
• Structured Query Language (SQL)– Manipulates, defines, and administers data
• Basic SQL statements– CREATE TABLE, INSERT INTO, SELECT