introduction to databases “when i use a word,” humpty dumpty said in rather a scornful tone,...
TRANSCRIPT
Introduction to
Databases
“When I use a word,” Humpty Dumpty said in rather a scornful tone,
“it means just what I choose it to mean - neither more nor less.”Lewis Carroll, Through the Looking Glass
Class Outline
What is data and why is it important? What is a database and database schema? What is a database management system? What is a database application and what are its
components? What are the levels of database representation? What were the limitations of the systems that led to the
development of the current relational database systems? What are various types of database systems? What is a table, file and record?
When do I use a Database program?
Word processing
Spreadsheet
Text handling excellent fairMathematical functions poor excellent
Ease of Use excellent good
Training Cost low moderate
Software Cost low moderate
Volume of data low moderate
Multiuser Access low moderate
Database
poor
very good
fair
high
high
very high
very high
Word processing
Spreadsheet
Text handling excellent fairMathematical functions poor excellent
Ease of Use excellent good
Training Cost low moderate
Software Cost low moderate
Volume of data low moderate
Multiuser Access low moderate
Database
poor
very good
fair
high
high
very high
very high
Principles of Information Resource Management
Organizational resources flow into and out of the organization Two types of major organizational resources: Physical resources,
Conceptual resources (data & information) As scale of organization grows, it becomes increasingly difficult
to manage by observation (i.e., reliance on conceptual resources) Conceptual resources can be managed just like physical resources
or assets (e.g., employees, $$, equipment, widgets, etc.) Management of data & information means getting it before it’s
needed, protecting it, assuring quality, and getting rid of it when no longer required
Management of data & information can be achieved only through organizational commitment
Adapted from McFadden, F.R. & Hoffer, J.A. (1994). Modern Database Management. Redwood City, CA:Benjamin/Cummings Publishing (p. 6)
Information is a major organizational resourcepr
oces
sing
Jane bought 30 in JulyJane bought 20 in Aug
John bought 50 in JulyJohn bought 10 in AugData
(isolated facts)
Average/ July is 40
Average/ Aug is 15Information(organized data)
Sales have dropped between July and August
Knowledge
Survey customers; invest in advertising; cut costs, expand product line
Action
What is a Database?
Organized collection of related information or data stored on a computer disk for easy, efficient useOrderNum InvoiceAmt CustomerName Ow nerName Phone
201 854.00 Cottage Grill Ms. Doris Reaume (616) 643-8821
209 1,106.00 Cleo's Dow ntow n Restaurant Ms. Joan Hoffman (616) 888-2046
214 1,070.50 Jean's Country Restaurant Ms. Jean Brooks (517) 620-4431
221 1,607.00 Maxw ell's Restaurant Ms. Barbara Feldon (219) 333-0000
235 1,004.50 Embers Restaurant Mr. Clifford Merritt (219) 816-2456
239 1,426.50 The Empire Ms. Curtis Haiar (616) 762-9144
Outstanding Invoice Amounts By Order
201
209
214
221
235
239
data
information
(e.g., Oracle, dBase,
Access, Paradox)
Database DBMSDatabase
Application
user
What is a Database Management System (DBMS)?
“A set of programs used to define,administer, and process the database and its applications conveniently and efficiently”Program (or collection of programs) that enables users to create the database. The DBMS manages the storage and retrieval of data, and provides the user with certain functionalities to guarantee that the data will be logically organized and consistently applied.
What is a Database Application?
A computer program that performs a specific task of practical value in a business situationAn interface that allows the user to enter and manipulate data; User can request abstract views of dataCreated by database designers and developers using a DBMS program or a programming language
Database
DBMS
Database application
Major Components of a Database Application
3. Query- asks questions of data
1. Form- data entry 2. Report- summarizes & prints
5. Program - used to automate a database
4. Menu - organizes components
Features of a DBMS
DBMS
Engine
Database• user data• metadata• indexes• application
metadata
Design Tools Subsystem• Table Creation Tool• Form Creation Tool• Query Creation Tool• Report Creation Tool• Procedural Language
Compiler
Applicationprogram
developer
Run Time Subsystem• Form Processor• Query Processor• Report Writer• Procedural Language
RunTime Applicationprogram
users
DBMS
Types of Database Systems
Distributed >1 site, requires networknot widely adapted yet due to many problems
Centralized (single site)microcomputer (desktop)legacy mainframe/ mini computer (1
CPU)client/server architecture (>1 CPU)
our focus;centralized,micro-computerdatabase
Type Example# of concurrent
users
PersonalJoe's House Painting Service
1
Workgroup Video rental store < 25
Organizational (enterprise)
Larger Corporations or Government
hundreds
Multimedia (Internet technology)
Holiday resort bookings (with photos)
possibly hundreds
Typical size of database
Any
> 1 Trillion bytes
< 100 Megabytes
< 10 Megabytes
Three levels of Database Representation
database
design,
logical,
abstract
description of
data elements
& their
relationships
Conceptual level
physical implementation
- access methods, index construction,
data structure; database exists in reality only
here
Internal level
Primary focus of the lectures of this course is the conceptual level because the creation of a database begins with its design; the focus of the laboratories
is the external level, using a RDBMS, which manages the internal level.
External level
each user
group will
have its own
view of the
database;
database is
accessed from
here
Focus of this course
Lectures Conceptual design of
databases: determining their purpose, developing a model, identifying the tables that are required, designing normalized tables and identifying their relationship to one another.
Laboratories Implement a database at
the external level: create databases (tables) and database applications (queries, forms, reports, programs) using a typical microcomputer relational database management system, MS Access 97.
The Database System Environment
Hardware - physical devices computer, peripherals, network devices
Software DBMS (manages the database) operating systems software (manages hardware & software) application programs (user access and manipulate database)
People system administrators (manage general operations) database designers (architects of database structure) database administrators (ensure the database is functioning) systems analysts & programmers (design & implement database) end users (use application programs)
Procedures - rules of the company governing use of data Data
you are here
In the beginning…(in the 1950s)
…There were no databases. Just file (or data processing) systems.
Name: Jane Doe Address: 123 Easy St.City: LondonPhone: 455-0897
Date: Sept 14, 1955Time: 2:00 p.m.Patient: Jane Doe, 455-0897OHIP: 123456789
File systems were typically organized by function (use)
The first data management systems performed clerical tasks (transactional processing) such as order entry processing, payroll, work scheduling.
e.g., files for patients (file folder analogy); each record for a single patient; another file for appointment/ billing information
Limitations of Data File Systems
Worked adequately if data collection needs were relatively small. Problems arose as data files, information needs, and reporting
requirements grow in complexity due to: Extensive programming - use of third-generation languages (e.g.,
COBOL, FORTRAN) in which the programmer must specify what is be done as well as how it is to be done
Customerfile
CustomerprocessingApplication
Orderfile
OrderprocessingApplication
Limitations of Data File Systems
Poor mechanisms for sharing data across organization - files
are often incompatible with one another (separate, isolated data)
Data redundancy - duplicate information in two or more files
Program/ data dependence - if the file structure changed,
ALL programs using the file had to be modified - time-
consuming
Lack of flexibility - could not do ad hoc queries or reports;
required separate programs for every report or query
Poor security - difficult to program, therefore, often omitted
Difficulty of representing data in the users’ perspective
Historical Roots of Database Systems
Developed to overcome limitations of file systems, developed initially on mainframe computers in late 60s and early 70s - a typical early DBMS cost $100,000 (many are still in use)
First general databases were created for General Electric Company (GEC) - Integrated Data Store (IDS), designed to run on GEC machines; B.F. Goodrich ported IDS to IBM 360 - became dominant until 1980s
As PCs gained popularity (1980s), single-user, personal databases developed; at present, most database technology is used in workgroups
DatabaseDBMS
CustomerprocessingApplication
OrderprocessingApplication
EmployeeprocessingApplication
`
`
Better Definition of a Database
A collection of users’ data, organized logically and managed by a unifying set of principles, procedures, and functionalities, which help guarantee the consistent application and interpretation of that data
(a) organized collection of related information or data stored on a computer disk for easy, efficient use; represented in tabular format
OrderNum InvoiceAmt CustomerName Ow nerName Phone
201 854.00 Cottage Grill Ms. Doris Reaume (616) 643-8821
209 1,106.00 Cleo's Dow ntow n Restaurant Ms. Joan Hoffman (616) 888-2046
214 1,070.50 Jean's Country Restaurant Ms. Jean Brooks (517) 620-4431
221 1,607.00 Maxw ell's Restaurant Ms. Barbara Feldon (219) 333-0000
235 1,004.50 Embers Restaurant Mr. Clifford Merritt (219) 816-2456
239 1,426.50 The Empire Ms. Curtis Haiar (616) 762-9144
Better Definition of a Database (cont'd)
(b) A database is self-describing (metadata or system catalogues or data dictionary) A database contains a description of its own
structure (e.g., the names of all the tables, the names and types of data in each column in all the tables)
Kroenke, D.M., Database Processing: Fundamentals, Design & Implementation, Prentice Hall, 1998
Better Definition of a Database (cont'd)
(c) Indexes are stored with the database Data accessed from a source table for sorting and searching is
time-consuming without a “pointer” system, which improves performance and accessibility of the database
The “overhead cost” of indexing is that each time data is updated, all indexes must also be updated, therefore, reserve index for cases in which they are needed
Salesperson
Employee ID Name Office27 Rodney Jones Toronto44 Goro Azuma Tokyo35 Francine Moire Brussels37 Anne Abel Tokyo
Office Index
Office Employee IDToronto 27Tokyo 44, 37Brussels 35
(d) Application Metadata - stores structure and format of application components; not all DBMS support this feature
Evolution of Database Models
Hierarchical
Network
Relational
Semantic
Object-Oriented
Object-Relationa
l
still in use in many older (1970s) legacysystems; very few new databases;referred to “navigational systems”
the vast majority currently use this, therefore, our course’s focus is here
Very few new databases are being created using Object-Oriented Programming (not many ODBMS for businesses to implement this model)
The Relational Database Model
Agents
Clients Entertainers
Engagements Entertainer styles
Instruments
represented by tables (like spreadsheets) tables are NOT linked with physical pointers unlike earlier systems, all three types of relationships can be
represented accommodates the design of larger databases that involve
complex relationships and intricate manipulations
But #1 problem still is
Evaluation of the Relational database model
Advantages mechanisms for minimizing data redundancy and inconsistency logical database design is separated from physical aspects relatively program-data independent management of data for access, manipulation, and security flexible mechanisms for generating reports and queries program development and maintenance costs are reduced data can be accessed in a multiplicity of ways within and
amongst organizations
Disadvantages ease of use - many untrained people create and use databases
without considering its design - usually incorporate many errors
Comparison of Database models
Hierarchical, Network DBMS• data independence• structural dependence• demands upon programmer
Relational DBMS• data independence• structural independence• demands upon computer
File Systems• data dependence• structural dependence• demands upon
programmer
table = file = relation
Table
Users view their data in two-dimensional tables.
column = field = attribute
Field
The fields within records contain data.Data within a field must be of the same data type. Each field within a table must have a unique name. Order of fields is unimportant.
row = record = tuple
Record
A record is a group of related fields of information about a single instance of one object or event in a database.
Tables consist of zero, one, or more records. Order of rows is unimportant.
Database Schema
Database schema defines database’s structure, tables, relationships, domains, and constraint rules
Tables BOOK (ISBN, Title, AuthID, PubID, Price) PUBLISHER (PubID, PubName, PubPhone) AUTHOR (AuthID, AuthName, AuthPhone)
Relationships Each book is published by one and only one publisher Each publisher publishes one or more books
Domains (set of values in a column) Physical description (e.g., set of integers 0 < x < 99999)
Constraints (business rules) Price cannot be less than zero; Author phone field cannot be left
blank