chapter eight managing data resources file organization and concepts
Post on 20-Dec-2015
230 views
TRANSCRIPT
Chapter Eight
Managing Data Resources
File Organization and Concepts
The Data Hierarchy
Human Resources DatabasePersonnel Table
Database Payroll Table
TableJames Thompson 36 220 Larkin Drive Anna Harvey 24 63 Whalen Street
Record James Thompson 36 220 Larkin Drive
Field James Thompson (name field)
Byte 1011 10010
Bit 1, 0
File Organization Terms• Field - a grouping a characters into a word, a
group of words, or a complete number, such as a person’s name and age. (Key field - uniquely identifies)
• Record - A group of related fields.• File - a group of records of the same type• Entity -a person, place,thing, or event about which
information must be kept• Attribute - A piece of information describing a
particular entity.
Illustration of a Record
Entity = ORDERAttributes
Order Order Item Quantity AmountNumber Date Number
4340 2/8/99 1583 2 17.4 FIELDS
KEY FIELD
Accessing Records from Computer Files
• Sequential File Organization– a method of storing data records in which the
records must be retrieved in the same physical sequence in which they are stored
• Direct/Random File Organization– a method of storing data records in a file so that
they can be accessed in any sequence without regard to their actual physical order on the storage media
Data Access Methods
• Indexed Sequential Access Method– a file access method to directly access records organized
sequentially using an index of key fields
• Direct File Access Method- – a method of accessing records by mathematically
transforming the key fields into specific addresses for the records
• Transform Algorithm– a mathematical formula used to translate a record’s key field
directly into the record’s direct physical storage location
Indexed Sequential Access Method
Record230 Data
Cylinder IndexCylinder Highest Key
1 2002 3923 500. .. .
Track Index Track Index Track Index Cylinder 1 Cylinder 1 Cylinder 1Track # Highest Key Track # Highest Key Track # Highest Key
1 9 1 208 1 3992 19 2 238 2 4193 28 3 260 3 4404 39 4 279 4 468
Direct File Access Method
• Basically uses an algorithm within a disk stack to access the record
• Example - File Size -1000 records– take the prime number closet to the max
number of records (in this case 997), divide the key field number by this prime number and the remainder is the address location number.
A Disk StackAccessmechanism
10 Accessarms
11 disks
000 403
404 Tracks
20 Read/writeheads
20 Tracks
2 0 9 0 7 0 0 3
Track number
Read/write head number
Record number
A DASD AddressA DASD Address
Traditional File Environment
• A way of collecting and maintaining data in an organization that leads to each functional area or division creating its own data files and programs. Can be either flat or data approached.
• Problems with TFE– data redundancy, program data dependence,
inflexibility, poor data security, inability to share
SALES NO
112
128
153
159
162
166
NAME
ADAMS
WINKLER
HOUSE
FRANCIS
WILLIS
GROVETON
SALES NO
112
128
153
159
162
166
TERR
1
3
2
1
1
2
An Example of Two Flat FilesAn Example of Two Flat Files
Table A Table B
Problems with TFE Explained
• Data Redundancy– presence of duplicate data in multiple data files
• Program-Data Dependence – the close relationship between data and programs. Any update
in the data requires a format change within the software (ZIP/AREA CODE ex)
• Inflexibility - cannot produce custom reports• Security - with little control of data, hard to control
access• Data sharing - with data in many places hard to
share b/c you do not know where it is
The Database Concept• Database definition
An integrated collection of computer data, organized and stored in a manner that facilitates easy retrieval.
• Data independence -- keep data specifications separate from programs, in tables and indexes
Database Management System(DBMS)
• The special software to create and maintain a database and enable individual business applications to extract the data they need without having to create separate files or data definitions in their computer programs– a DBMS has three components– a data definition language– a data manipulation language– a data dictionary
Hierarchical Database
Custormers
Kathy Cindy Debra
Dog food
Cat food
Bird food
Dogfood
Bird food
.. ..
Customer
Order
Network Database
Custormers
Kathy Cindy Debra
Dog food
Cat food
Bird food
Dogfood
Bird food
.. ..
Customer
Order
Dog Food
Relational Database
• Table format• Easy access (query)
Customer(CustomerID, Name, OrderID,…)Order(OrderID, CustomerID, OrderDate,…)ItemsOrdered(OrderID, ItemID, Quantity)
CustomerCustomerID
Name
..
OrderOrderID
..…
ItemsOrderedOrderID
..
..
Database Differences
Comparison of Database AlternativesDB Type Processing Flexibility End-User Programming
Efficiency Friendliness ComplexityHierarchical high low low high
Network medium-high low-med low-moderate high
Relational low but high high lowimproving
Design a database
Sally’s Pet Store
Supplier
Employee
CustomerSale
Animal
The Pet store’s Database
Name AnimalID Category Breed Dateborn Gender Color
Cathy 123 Bird Lovebird 6/5/98 Male Yellow Terry 353 dog Canary 7/1/97 Female Red Sandy 345 dog African 8/5/98 Male Yellow Charles 232 Bird Parrot 4/6/99 Male Yellow Ruby 565 dog Other 6/4/99 Male Yellow Curtis 675 cat Grey 8/5/98 Male Grey
Employee Name Soc.Sec.No Address Salary AnimalIDDaniel 232423435 ….. $23000 123……… ……… ……
Animal
Structural Query Language (SQL)
• SELECT columns Which property do you want ?
• FROM tables What tables are involved?
• JOIN conditions How are the table joined?
• WHERE criteria What are the constraints?
Four basic commands:
SELECT AnimalID, Category, DatebornFROM AnimalWHERE ((Category=“Dog”) AND (Dateborn>#6/1/98#));
Query
Resulting table:AnimalID Category Dateborn
345 dog 8/5/98 565 dog 6/4/99 SELECT Name, AnimalID, Salary
FROM Employee
WHERE (AnimalID=#123#)
SQL(2)
• List all birds who are male or who are born before 6/1/98 and have red in their color
SELECT AnimalID, Category, Gender, Dateborn, Color
FROM Animal
WHERE (((Category=“Bird”) AND (Gender=“Male”))
OR ((Dateborn<#6/1/98#) AND
(Color Like “*Red*”)))
Back to Database Design
• The main idea is to separate all the 1 to many relationships.– Look at page 243, or to your access quiz for
examples
• Each table holds specific information that is tied to the other tables through ‘keys’ and relationships
• this process is referred to as normalization
Normalization
• Why?– To prevent anomalies - occurrences in the database
that result from inefficient database structure. They also cause harmful events that can change or erase data.
• There are three types of anomalies– insertion– deletion– modification
Insertion anomalies
• Refers to the inability to add data to the database because the tables are not referenced correctly.
• This is to say if we want to add a new element we would have to add to numerous tables, and failure to do so would cause inconsistent data.ex. The inability to enter a new employee w/o adding in multiple tables.
Deletion Anomalies
• This is the opposite of the insertion anomaly. When items are not referenced correctly we can lose information.
• Ex. We have an employee database that does not have a separate department database, if we delete that employee we will lose that dept. info.
Modification Anomalies
• This occurs when you can change the value of one of the attributes (columns) of a dept, but that data is not changed throughout the database, unless we go in and manually change each table.
Other Database Concepts
• Databases on the web– technology makes this really easy to do– the gsc example– makes for more efficient transfer of data as well
as collection, with db security we can have EDI which is the underlying foundation of Electronic Commerce
– related to Online Analytical Processing (OLAP) see book for example
Mgmt Requirements for DB Systems
• Data administration- this is the definition of information requirements and access– usually formulated into a information policy
• Data Planning and modeling - the actual database formation process, what are ‘keys’, attributes, and relationships
• database administration• users- the idea of making your database
useable by all employees, plus support
Database Administration
• Defines and organizes database structure and content
• develops security procedures to safeguard the database
• develops database documentation
• maintains the database mgmt. software
Other terms to look up and know
• Distributed database
• data warehouse
• datamart