![Page 1: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/1.jpg)
DATA RESOURCE MANAGEMENT
![Page 2: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/2.jpg)
Data Hierarchy in a Computer System
![Page 3: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/3.jpg)
Entities and Attributes
![Page 4: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/4.jpg)
Problems with the Traditional File Environment
• Data redundancy
• Program-data dependence
• Lack of flexibility
• Poor security
• Lack of data sharing and availability
![Page 5: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/5.jpg)
Traditional File Processing
Figure 7-3
![Page 6: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/6.jpg)
Database Management System (DBMS)
• Creates and maintains databases
• Eliminates requirement for data definition statements in application programs
• Acts as interface between application programs and physical data files
• Separates logical and physical views of data
![Page 7: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/7.jpg)
The Contemporary Database Environment
![Page 8: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/8.jpg)
Components of DBMS
• Data definition language: Specifies content and structure of database and defines each data element
• Data manipulation language: Manipulates data in a database
• Data dictionary: Stores definitions of data elements, and data characteristics
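The three components can be illustrated with a minimal sketch using Python's standard `sqlite3` module. The `employee` table and its columns are hypothetical, and SQLite's `PRAGMA table_info` is used here only as a stand-in for a data dictionary report; other DBMS products expose this information differently.

```python
import sqlite3

# In-memory database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")

# Data definition language (DDL): declares structure and each data element.
conn.execute(
    "CREATE TABLE employee ("
    " emp_id INTEGER PRIMARY KEY,"
    " name TEXT NOT NULL,"
    " salary REAL)"
)

# Data manipulation language (DML): adds and retrieves data.
conn.execute("INSERT INTO employee (emp_id, name, salary) VALUES (1, 'Ann', 52000.0)")
names = [row[0] for row in conn.execute("SELECT name FROM employee")]

# Data dictionary stand-in: PRAGMA table_info returns one row per column,
# shaped as (cid, name, type, notnull, dflt_value, pk).
dictionary = [(col[1], col[2]) for col in conn.execute("PRAGMA table_info(employee)")]

print(names)
print(dictionary)
```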
![Page 9: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/9.jpg)
Sample Data Dictionary Report
![Page 10: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/10.jpg)
Figure 7-6
Relational Data Model
![Page 11: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/11.jpg)
Three Basic Operations in a Relational Database
• Select: Creates subset of rows that meet specific criteria
• Join: Combines relational tables to provide users with information
• Project: Enables users to create new tables containing only relevant information
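A small sketch of the three operations, expressed as SQL queries run through Python's `sqlite3` module. The `supplier` and `part` tables and their contents are invented for illustration and are not taken from the figure.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, name TEXT, city TEXT);
CREATE TABLE part (part_id INTEGER PRIMARY KEY, description TEXT, supplier_id INTEGER);
INSERT INTO supplier VALUES (1, 'Acme', 'Boston'), (2, 'Bolt Co', 'Denver');
INSERT INTO part VALUES (10, 'Hinge', 1), (11, 'Latch', 2), (12, 'Knob', 1);
""")

# Select: keep only the rows that meet a criterion.
selected = conn.execute(
    "SELECT * FROM part WHERE supplier_id = 1 ORDER BY part_id"
).fetchall()

# Join: combine two tables on their shared supplier_id column.
joined = conn.execute(
    "SELECT part.part_id, part.description, supplier.name "
    "FROM part JOIN supplier ON part.supplier_id = supplier.supplier_id "
    "ORDER BY part.part_id"
).fetchall()

# Project: keep only the columns of interest.
projected = conn.execute(
    "SELECT description, supplier_id FROM part ORDER BY part_id"
).fetchall()

print(selected, joined, projected, sep="\n")
```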
![Page 12: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/12.jpg)
Figure 7-7
Three Basic Operations in a Relational Database
![Page 13: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/13.jpg)
FLAT FILE – NOT NORMALIZED
![Page 14: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/14.jpg)
A Normalized Relation of ORDER
![Page 15: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/15.jpg)
Ensuring Database Integrity
Database integrity involves the maintenance of the logical and business rules of the database.
There are two kinds of “DB Integrity” that must be addressed:
• Entity Integrity
• Referential Integrity
![Page 16: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/16.jpg)
Entity Integrity
Entity integrity deals with within-entity rules.
These rules deal with acceptable ranges and whether null values are permitted in attributes, or possibly with rules between records.
![Page 17: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/17.jpg)
Examples of Entity Integrity
Data Type Integrity: very common and most basic. Checks only for “data type” compatibility with the DB schema, such as numeric, character, logical, date format, etc.
Commonly referred to in GIS manuals as Range and List domains:
• Range - acceptable numeric ranges for input
• List - acceptable text entries or drop-down lists
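One way a range domain and a list domain might be enforced at the database level is with `CHECK` constraints. The sketch below uses SQLite through Python; the `parcel` table, its columns, and the allowed values are all hypothetical. Note that SQLite itself is lenient about column data types, so this example shows range, list, and null rules rather than strict data type checking.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE parcel (
    parcel_id INTEGER PRIMARY KEY,
    -- List domain: only these text entries are acceptable; nulls not permitted.
    land_use  TEXT NOT NULL
              CHECK (land_use IN ('residential', 'commercial', 'industrial')),
    -- Range domain: acceptable numeric range for input.
    slope_pct REAL CHECK (slope_pct BETWEEN 0 AND 100)
)""")

def try_insert(row):
    """Return True if the row satisfies the entity integrity rules."""
    try:
        conn.execute("INSERT INTO parcel VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert((1, "residential", 12.5)),  # valid
    try_insert((2, "swamp", 5.0)),         # not in the list domain
    try_insert((3, "commercial", 140.0)),  # outside the range domain
    try_insert((4, None, 3.0)),            # null not permitted
]
print(results)
```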
![Page 18: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/18.jpg)
Enforcing Integrity
Not a trivial task!
Not all database management systems or GIS software enable users to “enforce data integrity” during attribute entry or edit sessions.
Therefore, the programmer or the Database Administrator must enforce and/or check for “Integrity.”
![Page 19: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/19.jpg)
Referential Integrity
Referential integrity concerns two or more tables that are related.
Example: IF table A contains a foreign key that references the primary key of table B, THEN each value of this foreign key must either match the primary key value of some row in table B or be null.
Necessary to avoid: Update anomaly, Delete anomaly.
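The rule for tables A and B can be sketched with a foreign key constraint in SQLite. The table and column names below are hypothetical. SQLite only enforces foreign keys when `PRAGMA foreign_keys = ON` has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # enforcement is off by default in SQLite
conn.executescript("""
CREATE TABLE table_b (b_id INTEGER PRIMARY KEY, label TEXT);
CREATE TABLE table_a (
    a_id INTEGER PRIMARY KEY,
    b_id INTEGER REFERENCES table_b (b_id)  -- foreign key into table B
);
INSERT INTO table_b VALUES (1, 'parent');
""")

def allowed(sql):
    """Return True if the statement does not violate an integrity rule."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False

matching = allowed("INSERT INTO table_a VALUES (1, 1)")       # matches a row in B
null_fk = allowed("INSERT INTO table_a VALUES (2, NULL)")     # null is permitted
orphan = allowed("INSERT INTO table_a VALUES (3, 99)")        # no such row in B
delete_parent = allowed("DELETE FROM table_b WHERE b_id = 1")  # would orphan a row in A

print(matching, null_fk, orphan, delete_parent)
```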
![Page 20: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/20.jpg)
Querying Databases: Elements of SQL
Basic SQL Commands
SELECT: Specifies columns
FROM: Identifies tables or views
WHERE: Specifies conditions
![Page 21: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/21.jpg)
Using SQL - Structured Query Language
SQL is a standard database language, adopted by most ‘relational’ databases.
Provides syntax for data:
• Definition
• Retrieval
• Functions (COUNT, SUM, MIN, MAX, etc.)
• Updates and Deletes
![Page 22: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/22.jpg)
SQL Examples
CREATE TABLE SALESREP
  item definition expression(s): {item, type, (width)}
DELETE FROM table WHERE expression
![Page 23: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/23.jpg)
Data Retrieval
SELECT list FROM table WHERE condition
list - a list of items, or * for all items
WHERE - a logical expression limiting the number of records selected; conditions can be combined with Boolean logic: AND, OR, NOT
ORDER BY may be used to sort results
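A retrieval query along these lines, together with the aggregate functions SQL provides, might look like the following sketch. The `salesrep` columns and rows are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE salesrep (rep_id INTEGER PRIMARY KEY, name TEXT, region TEXT, sales REAL);
INSERT INTO salesrep VALUES
    (1, 'Lee', 'East', 120.0),
    (2, 'Kim', 'West', 300.0),
    (3, 'Roy', 'East', 80.0),
    (4, 'Ada', 'North', 210.0);
""")

# WHERE combines conditions with AND, OR, NOT; ORDER BY sorts the result.
rows = conn.execute(
    "SELECT name, sales FROM salesrep "
    "WHERE (region = 'East' OR region = 'West') AND NOT sales < 100 "
    "ORDER BY sales DESC"
).fetchall()

# Aggregate functions summarize the whole table in one row.
summary = conn.execute(
    "SELECT COUNT(*), SUM(sales), MIN(sales), MAX(sales) FROM salesrep"
).fetchone()

print(rows)
print(summary)
```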
![Page 24: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/24.jpg)
UPDATE table SET item = expression WHERE expression
INSERT INTO table VALUES …..
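The insert, update, and delete statements can be tried together in a short sketch. The `salesrep` columns and values here are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE salesrep (rep_id INTEGER PRIMARY KEY, name TEXT, quota REAL)")

# INSERT INTO table VALUES (...): add new rows.
conn.execute("INSERT INTO salesrep VALUES (1, 'Lee', 100.0)")
conn.execute("INSERT INTO salesrep VALUES (2, 'Kim', 150.0)")

# UPDATE table SET item = expression WHERE expression: change matching rows.
conn.execute("UPDATE salesrep SET quota = quota + 25 WHERE rep_id = 2")

# DELETE FROM table WHERE expression: remove matching rows.
conn.execute("DELETE FROM salesrep WHERE rep_id = 1")

remaining = conn.execute("SELECT rep_id, name, quota FROM salesrep").fetchall()
print(remaining)
```

Omitting the WHERE clause from UPDATE or DELETE applies the statement to every row in the table, so it is worth checking the condition before running either.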
![Page 25: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/25.jpg)
Database Normalization
Normalization: The process of structuring data to minimize duplication and inconsistencies.
The process usually involves breaking down a single Table into two or more tables and defining relationships between those tables.
Normalization is usually done in stages, with each stage applying more rigorous rules to the types of information which can be stored in a table.
![Page 26: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/26.jpg)
Normalization
Normalization: a process for analyzing the design of a relational database
Database Design - arrangement of attributes into entities
It permits the identification of potential problems in your database design
Concepts related to Normalization: KEYS and FUNCTIONAL DEPENDENCE
![Page 27: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/27.jpg)
Ex: Database Normalization (1)
Sample Student Activities DB Table
Poorly Designed
Non-unique records: John Smith
Test the Design by developing sample reports and queries
![Page 28: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/28.jpg)
Ex: Database Normalization (2)
Created a unique “ID” for each Record in the Activities Table
Required the creation of an “ID” look-up table for reporting (Students Table)
Converted the “Flat-File” into a Relational Database
![Page 29: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/29.jpg)
Ex: Database Normalization (3)
Wasted Space
Redundant data entry
What about taking a 3rd Activity?
Query Difficulties - trying to find all swimmers
Data Inconsistencies - conflicting prices
![Page 30: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/30.jpg)
Ex: Database Normalization (4)
Students table is fine
Eliminating two columns and restructuring the Activities Table simplifies the Table
BUT, we still have redundant data (activity fees) and data insertion anomalies.
Problem: If student #219 transfers we lose all references to Golf and its price (a deletion anomaly).
![Page 31: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/31.jpg)
Ex: Database Normalization (5)
Modify the Design to ensure that “every non-key field is dependent on the whole key”
Creation of the Participants Table corrects our problems and forms a link between the 2 tables.
This is a Better Design!
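A sketch of what the three-table design could look like in SQLite follows. The column names, student names other than John Smith, student ID 84, and the fee amounts are assumptions made for illustration; the slide images are not reproduced here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE students   (student_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE activities (activity TEXT PRIMARY KEY, fee REAL);
-- Participants links a student to an activity; the key is the whole pair.
CREATE TABLE participants (
    student_id INTEGER REFERENCES students (student_id),
    activity   TEXT    REFERENCES activities (activity),
    PRIMARY KEY (student_id, activity)
);
INSERT INTO students VALUES (84, 'John Smith'), (219, 'Pat Jones');
INSERT INTO activities VALUES ('Golf', 47.0), ('Swimming', 15.0);
INSERT INTO participants VALUES (84, 'Swimming'), (219, 'Golf'), (219, 'Swimming');
""")

# Finding all swimmers is now a single, simple query.
swimmers = [row[0] for row in conn.execute(
    "SELECT s.name FROM participants p "
    "JOIN students s ON s.student_id = p.student_id "
    "WHERE p.activity = 'Swimming' ORDER BY s.student_id"
)]

# Student 219 transfers: remove the student and their participation rows.
conn.execute("DELETE FROM participants WHERE student_id = 219")
conn.execute("DELETE FROM students WHERE student_id = 219")

# Golf and its fee are still recorded, because the fee lives in Activities.
golf_fee = conn.execute("SELECT fee FROM activities WHERE activity = 'Golf'").fetchone()[0]

print(swimmers, golf_fee)
```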
![Page 32: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/32.jpg)
The Normal Forms
A series of logical steps to take to normalize data tables
• First Normal Form
• Second
• Third
• Boyce-Codd
There’s more, but beyond scope of this
![Page 33: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/33.jpg)
First Normal Form (1NF)
All columns (fields) must be atomic
Means: no repeating items in columns

| OrderDate | Customer | Items |
|---|---|---|
| 11/30/1998 | Joe Smith | Hammer, Saw, Nails |

| OrderDate | Customer | Item1 | Item2 | Item3 |
|---|---|---|---|---|
| 11/30/1998 | Joe Smith | Hammer | Saw | Nails |

Solution: make a separate table for each set of attributes with a primary key (parser, append query)

Customers: CustomerID, Name
Orders: OrderID, Item, CustomerID, OrderDate
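The "parser" step can be sketched in plain Python: split the multi-valued Items field so that each row holds one atomic item. The ID numbering scheme below is an assumption; the slide does not say how CustomerID and OrderID are assigned.

```python
# One unnormalized record, as on the slide: Items holds a list of values.
unnormalized = [
    {"OrderDate": "11/30/1998", "Customer": "Joe Smith", "Items": "Hammer, Saw, Nails"},
]

customers = {}  # Customer name -> CustomerID (IDs assigned sequentially, an assumption)
orders = []     # Rows shaped as (OrderID, Item, CustomerID, OrderDate)

for order_id, record in enumerate(unnormalized, start=1):
    # Reuse the CustomerID if this customer was already seen.
    customer_id = customers.setdefault(record["Customer"], len(customers) + 1)
    # Parse the list so that every row carries exactly one item.
    for item in record["Items"].split(","):
        orders.append((order_id, item.strip(), customer_id, record["OrderDate"]))

for row in orders:
    print(row)
```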
![Page 34: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/34.jpg)
Second Normal Form (2NF)
In 1NF and every non-key column is fully dependent on the (entire) primary key
Means: Do(es) the key field(s) imply the rest of the fields? Do we need to know both OrderID and Item to know the Customer and Date?
Clue: repeating fields

| OrderID | Item | CustomerID | OrderDate |
|---|---|---|---|
| 1 | Hammer | 1 | 11/30/1998 |
| 1 | Saw | 1 | 11/30/1998 |
| 1 | Nails | 1 | 11/30/1998 |

Solution: Remove to a separate table (Make Table)

OrderDetails: OrderID, Item
Orders: OrderID, CustomerID, OrderDate
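The split into OrderDetails and Orders can be sketched in plain Python. CustomerID and OrderDate depend only on OrderID, so they move to a table keyed by OrderID alone, which removes the repeated values.

```python
# 1NF rows shaped as (OrderID, Item, CustomerID, OrderDate), as on the slide.
orders_1nf = [
    (1, "Hammer", 1, "11/30/1998"),
    (1, "Saw", 1, "11/30/1998"),
    (1, "Nails", 1, "11/30/1998"),
]

# OrderDetails keeps the composite key (OrderID, Item).
order_details = [(order_id, item) for order_id, item, _, _ in orders_1nf]

# Orders keeps what depends on OrderID alone; the set removes the repeats.
orders = sorted({(order_id, customer_id, date) for order_id, _, customer_id, date in orders_1nf})

print(order_details)
print(orders)
```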
![Page 35: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/35.jpg)
Third Normal Form (3NF)
In 2NF and every non-key column is mutually independent
Means: Calculations (Total is derived from Quantity and Price)

| Item | Quantity | Price | Total |
|---|---|---|---|
| Hammer | 2 | $10 | $20 |
| Saw | 5 | $40 | $200 |
| Nails | 8 | $1 | $8 |

Solution: Put calculations in queries and forms

OrderDetails: OrderID, Item, Quantity, Price
Put expression in text control or in query: =Quantity * Price
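Computing the total in a query rather than storing it might look like this sketch in SQLite. The lowercase table and column names and the OrderID value are assumptions; the quantities and prices are the ones on the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- No Total column is stored; it is derived when queried.
CREATE TABLE order_details (order_id INTEGER, item TEXT, quantity INTEGER, price INTEGER);
INSERT INTO order_details VALUES (1, 'Hammer', 2, 10), (1, 'Saw', 5, 40), (1, 'Nails', 8, 1);
""")

# The calculation lives in the query, so it cannot drift out of sync
# with quantity or price.
totals = conn.execute(
    "SELECT item, quantity * price AS total FROM order_details ORDER BY rowid"
).fetchall()

print(totals)
```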
![Page 36: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/36.jpg)
Data Warehousing and Data Mining
Data warehouse
Supports reporting and query tools
Stores current and historical data
Consolidates data for management analysis and decision making
![Page 37: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/37.jpg)
What is a Data Warehouse?
"A warehouse is a subject-oriented, integrated, time-variant and non-volatile collection of data in support of management's decision making process". Bill Inmon (1990)
"A Data Warehouse is a repository of integrated information, available for queries and analysis. Data and information are extracted from heterogeneous sources as they are generated.…" Anonymous
![Page 38: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/38.jpg)
Components of a Data Warehouse
![Page 39: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/39.jpg)
Data Mining
ON-LINE ANALYTICAL PROCESSING (OLAP): ability to manipulate and analyze large volumes of data from multiple perspectives
MINING: Seeking relationships that are not known in advance. A function of the software and data organization.
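As a very small illustration of viewing the same data from multiple perspectives, the sketch below aggregates one hypothetical `sales` table by region and then by product. Real OLAP tools work on far larger volumes and more dimensions; this only shows the idea, and all names and figures are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sales (region TEXT, product TEXT, quarter TEXT, amount INTEGER);
INSERT INTO sales VALUES
    ('East', 'Nuts',  'Q1', 50),
    ('East', 'Bolts', 'Q1', 60),
    ('West', 'Nuts',  'Q1', 40),
    ('West', 'Bolts', 'Q2', 70),
    ('East', 'Nuts',  'Q2', 80);
""")

# Perspective 1: total sales by region.
by_region = conn.execute(
    "SELECT region, SUM(amount) FROM sales GROUP BY region ORDER BY region"
).fetchall()

# Perspective 2: the same facts, totalled by product.
by_product = conn.execute(
    "SELECT product, SUM(amount) FROM sales GROUP BY product ORDER BY product"
).fetchall()

print(by_region)
print(by_product)
```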
![Page 40: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/40.jpg)
DW Characteristics
Subject Oriented: Data that gives information about a particular subject instead of about a company's ongoing operations.
Integrated: Data that is gathered into the data warehouse from a variety of sources and merged into a coherent whole.
Time Variant: All data in the data warehouse is identified with a particular time period.
![Page 41: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/41.jpg)
Data Acquisition
The process of moving company data from the source systems into the warehouse.
Often the most time-consuming and costly effort.
Performed with software products known as ETL (Extract/Transform/Load) tools.
Over 50 ETL tools on market.
![Page 42: DATARESOURCEMANAGEMENT. Data Hierarchy in a Computer System](https://reader036.vdocuments.site/reader036/viewer/2022062309/56649d9f5503460f94a89c1f/html5/thumbnails/42.jpg)
Data Cleansing
Typically performed in conjunction with data acquisition.
A complicated process that validates and, if necessary, corrects the data before it is inserted.
AKA "data scrubbing" or "data quality assurance".
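Acquisition and cleansing can be sketched together as a tiny extract/transform/load pass in Python. The source rows, the validation rules, and the `warehouse_orders` table are all hypothetical; commercial ETL tools do far more than this, and many correct bad records instead of rejecting them.

```python
import sqlite3
from datetime import datetime

# Extract: rows as they might arrive from a source system (made-up data).
source_rows = [
    {"customer": " joe smith ", "order_date": "11/30/1998", "amount": "20"},
    {"customer": "Ann Lee", "order_date": "1998-13-45", "amount": "15"},  # bad date
    {"customer": "", "order_date": "12/01/1998", "amount": "8"},          # missing name
]

def cleanse(row):
    """Validate and standardize one row; return None if it cannot be used."""
    name = row["customer"].strip().title()
    if not name:
        return None
    try:
        date = datetime.strptime(row["order_date"], "%m/%d/%Y").date().isoformat()
        amount = float(row["amount"])
    except ValueError:
        return None
    return (name, date, amount)

# Transform: keep only rows that pass validation.
clean = [r for r in map(cleanse, source_rows) if r is not None]
rejected = len(source_rows) - len(clean)

# Load: insert the cleansed rows into the warehouse table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE warehouse_orders (customer TEXT, order_date TEXT, amount REAL)")
conn.executemany("INSERT INTO warehouse_orders VALUES (?, ?, ?)", clean)
loaded = conn.execute("SELECT * FROM warehouse_orders").fetchall()

print(loaded, rejected)
```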