11 Intro to Databases Part I, IS 340, by Chandra S. Amaravadi
TRANSCRIPT
2
IN THIS PRESENTATION..
Data vs information
Database organization: file, record, attribute/field
DBMS activities and functions
DBMS components
Importance of databases
4
DEFINING DATA AND INFORMATION
Data: Data consists of raw facts, e.g. 298-2034
Information: Data presented in a context, e.g. someone’s phone #
Some more examples (data first, then the same data as information):
a) IS340 / IS340 is a core course for BBA.
b) Joan Smith / Joan Smith is an arts & sciences student.
c) Score was 94-72 / Score in the Celtics-Bulls game was 94-72.
5
DATA IS NOT USUALLY ISOLATED
Example information about a person
Name: Steve Nash
DOB: 2-4-1965
Profession: Basketball player
Address: 415 Walnut Street
These describe a person. They are called ????.
6
ANOTHER EXAMPLE
Example information about a credit card transaction:
Date: 12/5/08
Time: 4:48 pm
Merchant: WIU Union
Card type: Master Card
Issuer: First Bank
These describe a transaction.
7
ONE ATTRIBUTE HAS A UNIQUE VALUE
One of these is always unique:
Acct#: 4555950
Date: 12/5/08
Time: 4:48 pm
Merchant: WIU Union
Card type: Master Card
Issuer: First Bank
The unique attribute is called ???
8
CHARACTERISTICS OF DATA/INFORMATION
WHAT CAN WE SAY ABOUT THE TWO TYPES OF DATA?
Transaction#: 55643
Date: 12/5/08
Time: 4:08 pm
Merchant: WIU Union
Card type: Master Card
Issuer: First Bank

Transaction#: 55644
Date: 12/5/08
Time: 4:20 pm
Merchant: Vitales
Card type: Master Card
Issuer: Midwest Bank
9
CHARACTERISTICS OF DATA/INFORMATION…
Transaction#: 55643
Date: 12/5/08
Time: 4:08 pm
Merchant: WIU Union
Card type: Master Card
Issuer: First Bank
Data is typically organized into ??
Transaction ID   Transaction Date   Transaction Time   Merchant
55643            12/05/08           4:08 pm            WIU Union
55644            12/05/08           4:20 pm            Vitales
10
HISTORICAL APPROACH TO DATA
Transaction ID   Transaction Date   Transaction Time   Merchant
55643            12/05/08           4:08 pm            WIU Union
55644            12/05/08           4:20 pm            Vitales

55643   12/05/08   4:08pm   WIU Union
55644   12/05/08   4:20pm   Vitales
55644   12/05/08   4:20pm   Vitales
55646   01/07/09   8:05am   Pizza Hut

55643   $25.60   MasterCard
55644   $38.00   Visa
55644   $63.00   Visa
55646   $55.00   American Express
This is called ___________ __________.
13
DEFINITIONS
Database - Organized collection of information
DBMS - Software program to manage and provide controlled access to the data
14
DBMS ORGANIZATION
Database – A group of related files
File/table – A group of related records
Record – A grouping of related fields (one row)
Field – A grouping of characters
Schema -- The logical view of the database (tables and fields)
Primary key – An attribute whose values are unique within a file
Hierarchy, largest to smallest: Database, Files/tables, Records, Fields
15
DBMS ORGANIZATION
PROD#    DESCR.         PRICE    QUANTITY
IR888E   Refrigerator   $1,800   20
TS3233   Television     $67      32
Callouts: the column headings are the attributes/field names (the structure/schema); PROD# is the primary key; each row is a record; each cell holds an attribute value.
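A minimal sketch of the same table in code, assuming Python's built-in sqlite3 module stands in for the DBMS. The table and column names (product, prod_no and so on) and the "Freezer" row are made up for illustration. It shows what "primary key" means in practice: a second record with an existing PROD# is refused.

```python
import sqlite3

# In-memory database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Attributes/field names form the structure (schema); prod_no is the primary key.
conn.execute(
    "CREATE TABLE product ("
    " prod_no TEXT PRIMARY KEY,"
    " descr TEXT,"
    " price REAL,"
    " quantity INTEGER)"
)

# Each tuple is one record (one row); each element is an attribute value.
rows = [
    ("IR888E", "Refrigerator", 1800.0, 20),
    ("TS3233", "Television", 67.0, 32),
]
conn.executemany("INSERT INTO product VALUES (?, ?, ?, ?)", rows)


def try_insert(record):
    """Return True if the record was stored, False if the key already exists."""
    try:
        conn.execute("INSERT INTO product VALUES (?, ?, ?, ?)", record)
        return True
    except sqlite3.IntegrityError:
        return False


# Hypothetical record that reuses an existing PROD# value.
duplicate_accepted = try_insert(("IR888E", "Freezer", 900.0, 5))
record_count = conn.execute("SELECT COUNT(*) FROM product").fetchone()[0]
```

The duplicate is rejected, so the file still holds the two original records.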
17
DBMS ACTIVITIES & FUNCTIONS
Activities with DBMS:
Define structure / schema
Enter data
Modify data
Query data
Get reports
18
DBMS ACTIVITIES & FUNCTIONS
Define structure/schema
Field Name   Data type   Description              Length   Decimals
Prod#        Numeric     Unique prod code         6        0
Descr        Text        Short prod description   25       0
Price        Currency    Product price            6        2
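The structure definition above could be written as a table definition roughly as follows. This is a sketch using Python's sqlite3 module; the column names and SQL type names are assumptions chosen to mirror the Field Name, Data type, Length and Decimals columns. SQLite records the declared types but does not enforce the lengths.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# One line per field: name, declared type (length, decimals), description.
conn.execute("""
    CREATE TABLE product (
        prod_no NUMERIC(6, 0) PRIMARY KEY,  -- unique prod code
        descr   VARCHAR(25),                -- short prod description
        price   DECIMAL(6, 2)               -- product price
    )
""")

# Read the stored structure back from the DBMS.
# Each row is (cid, name, type, notnull, default_value, pk).
schema = conn.execute("PRAGMA table_info(product)").fetchall()
field_names = [row[1] for row in schema]
primary_key_fields = [row[1] for row in schema if row[5] > 0]
```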
19
DBMS ACTIVITIES & FUNCTIONS
Enter data -- Create a data entry form
Data entry form (a heading, then a label and a field on each line):
PRODUCT DATA ENTRY
Product #: ________
Description: ________
21
DBMS ACTIVITIES & FUNCTIONS
Query data -- list products costing more than $2,000
Query by example form:
Prod#   Descr.   Price     Quantity
                 > 2,000
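The same query can be sketched in SQL, here run through Python's sqlite3 module. The table and column names are assumptions, and the "Home theater" row is hypothetical, added only so that one product exceeds $2,000.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE product ("
    " prod_no TEXT PRIMARY KEY, descr TEXT, price REAL, quantity INTEGER)"
)
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?, ?)",
    [
        ("IR888E", "Refrigerator", 1800.0, 20),
        ("TS3233", "Television", 67.0, 32),
        ("HT0001", "Home theater", 2500.0, 4),  # hypothetical row
    ],
)

# The "> 2,000" criterion under Price becomes a WHERE clause.
expensive = conn.execute(
    "SELECT prod_no, descr, price, quantity "
    "FROM product WHERE price > ? ORDER BY prod_no",
    (2000,),
).fetchall()
```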
22
A report specification:
RH (report header)   Title: PRODUCT LISTING
PH (page header)     Column headings: PRODUCT #   DESCR.   PRICE
Detail               Fields: Product #   Descr.   Price
PF (page footer)
RF (report footer)   Footer: Average Price
23
GETTING DATA OUT..
A generated report:
PRODUCT LISTING
PROD#   DESCRIPTION   PRICE
M100    Chair         $ 50.00
M150    Table         $200.00
Average Price         $153.00
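A rough sketch of how a report generator might turn detail rows into this layout: a title, column headings, one detail line per record, and a footer with the average. The function name and column widths are assumptions. Run on only the two rows shown, the footer computes to 125.00, so the 153.00 on the slide presumably covers rows that are not shown.

```python
def product_report(rows):
    """Build a plain-text report from (prod_no, description, price) rows.

    Assumes at least one row, since the footer divides by the row count.
    """
    lines = [
        "PRODUCT LISTING",                                  # title
        f"{'PROD#':<8}{'DESCRIPTION':<14}{'PRICE':>8}",     # column headings
    ]
    for prod_no, descr, price in rows:                      # detail lines
        lines.append(f"{prod_no:<8}{descr:<14}{price:>8.2f}")
    average = sum(price for _, _, price in rows) / len(rows)
    lines.append(f"{'Average Price':<22}{average:>8.2f}")   # footer
    return "\n".join(lines)


report = product_report([("M100", "Chair", 50.0), ("M150", "Table", 200.0)])
print(report)
```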
24
Activities with DBMS
◦ Define structure / schema
◦ Enter data
◦ Modify data
◦ Query data
◦ Get reports
DBMS components
Major Components of DBMS (diagram): DBMS Kernel, Data Defn., SQL, Prog. Language Interface, Data Dictionary, Screen/Report Gen., Appln. Gen., Export/Import
DBMS COMPONENTS..
Data definition – the facility through which the schema is defined (how new tables are created).
SQL interface – the facility through which SQL commands are typed in.
Programming language interface – the facility which processes SQL commands embedded in an application program. Also known as the host language interface.
Data dictionary – the facility that records details about the schema, reports, data entry forms etc.
Screen & reports – the facility through which data entry screens and reports are created.
Appln. generation – the facility through which applications are created.
Export/Import -- the facility through which files can be imported/exported in different DBMS formats.
DBMS Kernel -- the actual programs which interact with the O/S and carry out data I/O.
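To make a few of these components concrete, here is a small sketch with SQLite standing in for the DBMS and Python as the host language. It is one illustration, not how every DBMS exposes these facilities; the product table is an assumed example.

```python
import sqlite3

# Programming language interface: SQL commands embedded in a host program.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (prod_no TEXT PRIMARY KEY, descr TEXT, price REAL)")
conn.execute("INSERT INTO product VALUES ('M100', 'Chair', 50.0)")
conn.commit()

# Data dictionary: SQLite keeps its schema details in the sqlite_master table.
tables = [
    name
    for (name,) in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# Export: dump the whole database as SQL text.
dump_sql = "\n".join(conn.iterdump())

# Import: replay that text into a second, empty database.
copy = sqlite3.connect(":memory:")
copy.executescript(dump_sql)
copied_rows = copy.execute("SELECT prod_no, descr, price FROM product").fetchall()
```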
27
USAGE OF DATA/INFORMATION
Suppose we have detailed information on each and every transaction in a store. What can we do with that?
What if a customer wants to return a shirt purchased in the store?
What if a manager wants to know what products were sold on a particular day?
28
IMPORTANCE OF DBMS’s
Databases are used:
To store and record information, e.g. balance, price, grades etc.
To retrieve information, e.g. check #432 cashed?
To report information, e.g. daily sales
To answer queries, e.g. how many shoes were sold?
Advanced uses:
To analyze trends
To identify sales prospects
31
EVOLUTION OF DBMS
File processing systems
Hierarchical systems
Relational systems
1 March 2002 Greece
2 July 2003 Italy
3 ……………..
Person
Photo1 Photo2
Greece Italy
PhotoId
32
EVOLUTION OF DBMS..
FILE PROCESSING SYSTEMS:
A system where data was managed by the program in the form of flat files.
Data in the form of “flat files” (ASCII files)
Each program had its own specific files
Data was duplicated across files
33
EVOLUTION OF DBMS..
PROBLEMS CAUSED BY FILE PROCESSING
Uncontrolled redundancy
Poor data quality (a.k.a.?)
Lack of data sharing
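A small sketch of how uncontrolled redundancy turns into poor data quality. Two hypothetical programs (billing and shipping) each keep their own flat file with the same customer's address; the file layouts and addresses are invented for illustration.

```python
# Each program owns its own flat file; the address is duplicated across files.
billing_file = ["8895,Smith,12 Oak St"]    # hypothetical billing program file
shipping_file = ["8895,Smith,12 Oak St"]   # hypothetical shipping program file


def update_address(flat_file, acct_no, new_address):
    """Rewrite the address field for one account in one program's file."""
    updated = []
    for line in flat_file:
        acct, name, address = line.split(",")
        if acct == acct_no:
            address = new_address
        updated.append(",".join([acct, name, address]))
    return updated


def address_of(flat_file, acct_no):
    """Look up the address stored for an account in one program's file."""
    for line in flat_file:
        acct, _, address = line.split(",")
        if acct == acct_no:
            return address
    return None


# Only the billing program learns that the customer moved.
billing_file = update_address(billing_file, "8895", "90 Elm Ave")

# The two copies now disagree, and neither program can tell which is right.
inconsistent = address_of(billing_file, "8895") != address_of(shipping_file, "8895")
```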
34
HIERARCHICAL MODEL..
Structure is hierarchical (not flat)
There are one or more root segments
Segments are linked hierarchically
Difficult to write queries but very efficient (fast)!
35
HIERARCHICAL MODEL
(D) DEPARTMENT: NAME, TOTAL_ENROLL, NO_OF_PHD
    (A) ADMIN-STAFF: ASNAME, JOB_TITLE
    (F) FACULTY: FNAME, RANK, DEGREE
What do you call this?
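The DEPARTMENT structure can be sketched as nested data, with the root segment holding its child segments. This is only an analogy in Python, not how a hierarchical DBMS stores data, and all the values are hypothetical. It hints at why queries are harder to write: every access starts at the root and walks down.

```python
# Root segment (D) with two kinds of child segments, (A) and (F).
department = {
    "NAME": "Information Systems",   # hypothetical values throughout
    "TOTAL_ENROLL": 420,
    "NO_OF_PHD": 9,
    "ADMIN-STAFF": [
        {"ASNAME": "Lee", "JOB_TITLE": "Office manager"},
    ],
    "FACULTY": [
        {"FNAME": "Rao", "RANK": "Professor", "DEGREE": "PhD"},
        {"FNAME": "Kim", "RANK": "Instructor", "DEGREE": "MS"},
    ],
}


def faculty_with_degree(root, degree):
    """Navigate from the root segment down to its FACULTY children."""
    return [f["FNAME"] for f in root["FACULTY"] if f["DEGREE"] == degree]


phd_faculty = faculty_with_degree(department, "PhD")
```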
36
RELATIONAL MODEL
Data organized into tables
Data retrieved by using SQL, a 4th-generation language (4GL)
SQL joins tables by equating values of cross reference keys.
ACCT
ACCT#   NAME     DT. OPENED   BALANCE
8895    Smith    4/16/90      $35,000
8896    Farley   4/22/91      $300
8897    Gomez    1/10/00      $2,000
39
THE DATABASE CONCEPT..
Organizations need to collect and manage information.
Typically this need can be summarized in terms of entity classes and their relationships
Known as a (conceptual) data model/ER model
ER model is converted to database model
Implemented
40
Integrated conceptualization
Convert it to design
Go through development process
Different users/programs can share the information easily
Can access info via SQL
Can get reports easily
Diagram: Prog. 1, Prog. 2, DBMS, Database
41
CUSTOMER ACCTS. LOANS
CROSS REFERENCE
REPORT
ACCT. DATA LOAN DATA
GETTING INFORMATION FROM FILE PROCESSING
42
GETTING INFORMATION FROM A DATABASE
ACCT
ACCT#   NAME     DT. OPENED   BALANCE
8895    Smith    4/16/90      $35,000
8896    Farley   4/22/91      $300
8897    Gomez    1/10/00      $2,000
LOAN
LID     AMOUNT   INT. RATE   BALANCE
9978    $6,000   6.0%        $2,440
9978    $5,000   7.1%        $5,000
9992    $1,000   8.5%        $ 400
How can we combine these two tables?
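One possible answer, sketched with Python's sqlite3 module: give LOAN a cross reference key that holds the owning ACCT#, then join the tables by equating the key values. The LOAN table as shown has no such column, so the acct_no column and the account assigned to each loan are assumptions made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE acct ("
    " acct_no INTEGER PRIMARY KEY, name TEXT, dt_opened TEXT, balance REAL)"
)
conn.execute(
    "CREATE TABLE loan ("
    " lid INTEGER, acct_no INTEGER, amount REAL, int_rate REAL, balance REAL)"
)
conn.executemany(
    "INSERT INTO acct VALUES (?, ?, ?, ?)",
    [
        (8895, "Smith", "4/16/90", 35000.0),
        (8896, "Farley", "4/22/91", 300.0),
        (8897, "Gomez", "1/10/00", 2000.0),
    ],
)
# The acct_no value on each loan is invented for illustration.
conn.executemany(
    "INSERT INTO loan VALUES (?, ?, ?, ?, ?)",
    [
        (9978, 8895, 6000.0, 6.0, 2440.0),
        (9978, 8895, 5000.0, 7.1, 5000.0),
        (9992, 8897, 1000.0, 8.5, 400.0),
    ],
)

# Join: pair each loan with the account whose ACCT# equals its cross reference key.
joined = conn.execute(
    "SELECT acct.name, loan.lid, loan.balance "
    "FROM acct JOIN loan ON acct.acct_no = loan.acct_no "
    "ORDER BY loan.lid, loan.amount"
).fetchall()
```

Farley has no loans under these assumed assignments, so that account does not appear in the joined result.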
44
DATA MODELS -- BASIC CONCEPTS
Data models: representations of relationships among entity classes
Entity – Individual example of person, place or thing.
Entity Class – Collection of related entities (eclass).
Attributes – Properties of entity classes about which we would like to collect information.
Primary key -- An attribute whose values are unique within an entity class (pkey).
45
DISCUSSION
Classify the following as entity, entity class or attribute:
San Francisco
Tires
Customer
Book
ISBN#
Neon
Peoplesoft
Cust. balance
Microsoft
Microsoft employees
Date of Birth
Invoice
46
DATA MODELS…
Cardinality: Number of entities participating in a relationship
A to B, 1:1 -- For each value of A, one and only one value of B and vice versa.
A to B, 1:M -- For each value of A, multiple values of B, but for each B only one A.
A to B, M:N -- For each value of A, many values of B and vice versa.
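The three definitions can be turned into a small check over sample (A, B) pairs. This is a sketch with a hypothetical helper name and made-up data. Cardinality is really a business rule, so sample data can only suggest it, not prove it.

```python
from collections import defaultdict


def cardinality(pairs):
    """Classify the relationship between A and B from observed (a, b) pairs."""
    bs_per_a = defaultdict(set)
    as_per_b = defaultdict(set)
    for a, b in pairs:
        bs_per_a[a].add(b)
        as_per_b[b].add(a)
    a_has_many_b = any(len(bs) > 1 for bs in bs_per_a.values())
    b_has_many_a = any(len(as_) > 1 for as_ in as_per_b.values())
    if a_has_many_b and b_has_many_a:
        return "M:N"
    if a_has_many_b:
        return "1:M"
    if b_has_many_a:
        return "M:1"
    return "1:1"


# Hypothetical sample data for three of the discussion relationships.
company_president = [("Acme", "Pat"), ("Zenith", "Lou")]
department_employees = [("Sales", "Ann"), ("Sales", "Bob"), ("IT", "Cy")]
books_authors = [("B1", "X"), ("B1", "Y"), ("B2", "X")]
```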
47
DISCUSSION
Identify the types of the following relationships:
company -- president
instructor -- students
plane -- pilots
city -- convention centers
department -- employees
team -- players
company -- city
books -- authors
city -- subdivisions
product -- warranty
48
AN EXAMPLE ER CHART
CUSTOMER (Cust#, Name) -- Places -- ORDERS (Ord#, Ord dt) -- Are for -- PRODUCTS
Legend: ENTITY CLASS, ATTRIBUTE, RELATIONSHIP
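One way the ER chart might be converted to a relational design, sketched with Python's sqlite3 module. The chart does not state cardinalities, so this assumes a customer places many orders (1:M) and an order can be for many products (M:N, handled with a linking table). The table names, column names and the prod_no attribute are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks cross references only when asked
conn.executescript("""
    CREATE TABLE customer (cust_no INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE product  (prod_no TEXT PRIMARY KEY);
    CREATE TABLE orders (
        ord_no  INTEGER PRIMARY KEY,
        ord_dt  TEXT,
        cust_no INTEGER NOT NULL REFERENCES customer(cust_no)  -- "Places"
    );
    CREATE TABLE order_product (                                -- "Are for"
        ord_no  INTEGER REFERENCES orders(ord_no),
        prod_no TEXT    REFERENCES product(prod_no),
        PRIMARY KEY (ord_no, prod_no)
    );
""")
conn.execute("INSERT INTO customer VALUES (1, 'Joan Smith')")


def order_accepted(ord_no, cust_no):
    """Return True if the order was stored, False if its customer does not exist."""
    try:
        conn.execute("INSERT INTO orders VALUES (?, '12/5/08', ?)", (ord_no, cust_no))
        return True
    except sqlite3.IntegrityError:
        return False


valid_order = order_accepted(100, 1)     # customer 1 exists
orphan_order = order_accepted(101, 99)   # no customer 99
```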
49
DISCUSSION
Draw data models (ER models) corresponding to the following situations:
Students take courses from professors.
Clients can book one or more properties with a rental agency. Each property can be booked by more than one client (for different dates).
Trains arrive and depart from platforms in train stations, corresponding to the cities that they stop in. A city can have many stations.
Airlines operate a number of flights which arrive and depart from gates. There could be several flights from a gate, but each flight is assigned only one gate. Each airline operates a gate or a set of gates which are leased to them by the airport.
50
DISCUSSION
Design a database for the following situation: Bellsouth, a telephone company, needs a database to maintain records of its automobiles, repairs, mechanics and special equipment needed to repair cars. Mechanics are assigned skill codes based on the type of repairs they can carry out. For example, Class III mechanics can perform reborings, engine overhauls and transmission repairs. The repair shops are located in 300 cities and towns throughout the state. Mechanics can be assigned to only one repair shop at a time. The application requires knowing what equipment (e.g. diagnostic equipment, hydraulic machinery) is at what location and also the repairs carried out on each vehicle. For each vehicle, the application requires knowing its identification number, mileage and repair history.
52
CENTRALIZED DATABASES
The database is in one physical location.
All applications, regardless of whether the clients are located in Seoul or Detroit, need to access data from that physical location.
What are the limitations?
Centralized databases
53
DISTRIBUTED DATABASES
The data is physically stored in multiple geographical locations
Replicated -- copies of the database maintained in multiple sites
Partitioned -- database is physically divided into chunks
55
PARTITIONED DATABASES..
Emp ID   Emp Name   Emp Address
11049    Cathy      200 Meadowbrook
11051    Richard    13 E. Willow
11054    Hugh       1400 E. Washington
…….      ……..       …………………..
56
PARTITIONED DATABASES
The database is in multiple physical locations
The database is divided into chunks. What is the logical way of doing this?
Different chunks are placed in different locations.
All the locations are connected.
What are the advantages and limitations?
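One possible way to divide the table is by rows, using a rule on the key to decide which location holds each record. The sketch below uses the employee rows from the earlier slide; the two site names and the ID cut-off are hypothetical. Other rules are possible, such as splitting by columns or by region.

```python
employees = [
    (11049, "Cathy", "200 Meadowbrook"),
    (11051, "Richard", "13 E. Willow"),
    (11054, "Hugh", "1400 E. Washington"),
]


def site_for(emp_id):
    """Hypothetical rule: IDs below 11050 live at site A, the rest at site B."""
    return "site_a" if emp_id < 11050 else "site_b"


def partition(rows):
    """Place each row in the chunk for the site its key maps to."""
    sites = {"site_a": [], "site_b": []}
    for row in rows:
        sites[site_for(row[0])].append(row)
    return sites


sites = partition(employees)


def lookup(emp_id):
    """Route the request to the one site that can hold the row."""
    for row in sites[site_for(emp_id)]:
        if row[0] == emp_id:
            return row
    return None
```

A lookup by Emp ID touches only one location, but a query over all employees has to visit every site, which is one of the trade-offs the slide asks about.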
57
MULTI-MEDIA DATABASES
A database that is able to handle multiple data types:
Conventional information
Images
Audio & video
Capable of storing, retrieving and updating this type of information
Example columns: Emp ID, Emp Name, Emp. Picture (Emp.jpg), Emp Address
60
DATA WAREHOUSES
DATA WAREHOUSE: Historical data organized for analysis and decision making.
DATA MART: Smaller version of a data warehouse, specialized by functional area, e.g. marketing.
ON-LINE ANALYTICAL PROCESSING (OLAP): Organizes data into a “cube” for decision analysis.
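A toy sketch of the "cube" idea: each fact is indexed by several dimensions, and analysis means summing along the dimensions you do not care about. The sales figures, dimension names and function name are hypothetical, and real OLAP tools do much more than this.

```python
from collections import defaultdict

# Hypothetical sales facts: (product, region, quarter, amount).
facts = [
    ("Chair", "East", "Q1", 500.0),
    ("Chair", "West", "Q1", 300.0),
    ("Table", "East", "Q1", 800.0),
    ("Table", "East", "Q2", 400.0),
]
DIMENSIONS = ("product", "region", "quarter")


def roll_up(rows, keep):
    """Sum the amounts, grouping by the dimensions named in `keep`."""
    idx = [DIMENSIONS.index(d) for d in keep]
    totals = defaultdict(float)
    for row in rows:
        totals[tuple(row[i] for i in idx)] += row[-1]
    return dict(totals)


by_product = roll_up(facts, ["product"])
by_region_quarter = roll_up(facts, ["region", "quarter"])
```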
61
DATA WAREHOUSING & OLAP
INTERNAL DATA SOURCES
EXTERNAL DATA SOURCES
OPERATIONAL, HISTORICAL DATA
EXTRACT, TRANSFORM
DATA WAREHOUSE
HISTORICAL INFORMATION
OLAP
DATA MINING
62
DISCUSSION QUESTIONS
Are Databases and DBMSs one and the same?
Does a record consist of files?
What is the smallest unit of data in the database?
Does data integrity refer to data quality?
What does the detail line of a report consist of?
Is an “employee” an entity class or an attribute?
Is the data dictionary useful while creating the database?
Why do we use SQL?
Is it necessary to use SQL to produce database reports?
If we have a number of diskettes in alphabetical sequence, is this an example of a file organization? If so, what type?
Would multi-national organizations prefer centralized databases?