decision support and data warehouse. decision supports systems components data management function...
Post on 19-Dec-2015
222 views
TRANSCRIPT
Decision supports Systems Components
• Data management function– Data warehouse
• Model management function– Analytical models:
• Statistical model, management science model
• User interface– Data visualization
On-Line Analytical Processing (OLAP)• The use of a set of graphical tools that
provides users with multidimensional views of their data and allows them to analyze the data using simple windowing techniques
• OLAP Operations– Cube slicing–come up with 2-D view of data– Drill-down–going from summary to more
detailed views– Roll-up – the opposite direction of drill-down– Reaggregation – rearrange the order of
dimensions
Example of drill-down
Summary report
Drill-down with color added
Starting with summary data, users can obtain details for particular cells
Data Warehouse• A subject-oriented, integrated, time-variant,
non-updatable collection of data used in support of management decision-making processes– Subject-oriented: e.g. customers, employees,
locations, products, time periods, etc.• Dimensions for data analysis
– Integrated: Consistent naming conventions, formats, encoding structures; from multiple data sources
– Time-variant: Can study trends and changes– Nonupdatable: Read-only, periodically refreshed
Data Warehouse Design- Star Schema -
• Fact table– contain detailed business data
• Ex. Line items of orders to compute total sales by product, by salesperson.
• Dimension tables– Dimension is a term used to describe any category or subjects of
the business used in analyzing data, such as customers, employees, locations, products, time periods, etc.
– contain descriptions about the subjects of the business such as customers, employees, locations, products, time periods, etc.
– Example: A sold item related to many business subjects such as salesperson, customer, product and time period.
Example:Order Processing System
Customer Order
Product
Has
Has
1 M
M
M
CID Cname City OID ODate
PIDPname
Price
RatingSalesPerson
Qty
Star Schema
FactTableLocationCodePeriodCode
RatingPIDQty
Amount
LocationDimension
LocationCodeStateCity
CustomerRatingDimension
RatingDescription
ProductDimension
PIDPname
Category
PeriodDimensionPeriodCode
YearQuarter
Can group by State, City
Snowflake Schema
FactTableLocationCodePeriodCode
RatingPIDQty
Amount
LocationDimension
LocationCodeStateCity
CustomerRatingDimension
RatingDescription
ProductDimension
PIDPname
CategoryID
ProductCategory
CategoryIDDescription
PeriodDimensionPeriodCode
YearQuarter
Can group by State, City
The ETL Process
E
T
LOne, company-wide warehouse
Periodic extraction data is not completely current in warehouse
The ETL Process
• Capture/Extract• Transform
– Scrub(data cleansing),derive– Example:
• City -> LocationCode, State, City• OrderDate -> PeriodCode, Year, Quarter
• Load and Index
ETL = Extract, transform, and load
From SalesDB to MyDataWarehouse
• Extract data from SalesDB:– Create query to get the data– Download to MyDataWareHouse
• File/Import/Save as Table
• Transform:– Transform City to Location– Transform Odate to Period
• Query FactPC
• Load data to FactTable
Star schema example
Fact table provides statistics for sales broken down by product, period and store dimensions
Dimension tables contain descriptions about the subjects of the business
SQL GROUPING SETS
• GROUPING SETS– SELECT CITY,RATING,COUNT(CID) FROM CUSTOMERS
– GROUP BY GROUPING SETS(CITY,RATING,(CITY,RATING),())
– ORDER BY CITY;
• Note: Compute the subtotals for every member in the GROUPING SETS. () indicates that an overall total is desired.
ResultsCITY Rating COUNT(CID)-------------------- - ---------- ------------------CHICAGO A 1CHICAGO B 2CHICAGO 3LOS ANGELES A 1LOS ANGELES C 1LOS ANGELES 2SAN FRANCISCO A 2SAN FRANCISCO B 1SAN FRANCISCO 3 A 4 8
CITY R COUNT(CID)-------------------- - ---------- B 3 C 1
SQL CUBE
• Perform aggregations for all possible combinations of columns indicated.– SELECT CITY,RATING,COUNT(CID) FROM CUSTOMERS
– GROUP BY CUBE(CITY,RATING)
– ORDER BY CITY, RATING;
ResultsCITY Rating COUNT(CID)-------------------- - ------- ----------CHICAGO A 1CHICAGO B 2CHICAGO 3LOS ANGELES A 1LOS ANGELES C 1LOS ANGELES 2SAN FRANCISCO A 2SAN FRANCISCO B 1SAN FRANCISCO 3 A 4 B 3
CITY R COUNT(CID)-------------------- - ---------- C 1 8
SQL ROLLUP
• The ROLLUP extension causes cumulative subtotals to be calculated for the columns indicated. If multiple columns are indicated, subtotals are performed for each of the columns except the far-right column.– SELECT CITY,RATING,COUNT(CID) FROM CUSTOMERS– GROUP BY ROLLUP(CITY,RATING)– ORDER BY CITY, RATING;
Results
CITY Rating COUNT(CID)-------------------- - ----------CHICAGO A 1CHICAGO B 2CHICAGO 3LOS ANGELES A 1LOS ANGELES C 1LOS ANGELES 2SAN FRANCISCO A 2SAN FRANCISCO B 1SAN FRANCISCO 3 8