data warehousing, access, analysis, mining, and visualization

Upload: dr-singh

Post on 06-Apr-2018

221 views

Category:

Documents


0 download

TRANSCRIPT

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    1/45

    1

    CHAPTER 4

    Data Warehousing, Access,

    Analysis, Mining, and Visualization

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    2/45

    2

    Data Warehousing, Access,Analysis, Mining,and

    Visualization

    MSS foundation

    Many new concepts

    Object-oriented databases

    Intelligent databases

    Data warehouse

    Data mining

    Online analytical processing

    Multidimensionality

    Internet / Intranet / Web

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    3/45

    3

    Data Warehousing, Access,Analysis,and Visualization

    What to do with all the data that organizationscollect, store, and use?(Information overload!)

    Solution

    Data warehousing

    Data access

    Data mining

    Online analytical processing (OLAP)

    Data visualization

    Data sources

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    4/45

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    5/45

    5

    The Natureand SourcesofData

    Data: Raw

    Information: Data organized to convey meaning

    Knowledge: Data items organized and processed to

    convey understanding, experience, accumulated

    learning, and expertise

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    6/45

    6

    DSS Data Items

    Documents

    Pictures

    Maps

    Sound

    Animation

    Video

    Can be hard or soft

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    7/45

    7

    Data Sources Internal

    External

    Personal

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    8/45

    8

    Data Collection, Problems,and Quality

    Problems (Table 4.1)

    Quality: determines usefulness of data

    Intrinsic data quality

    Accessibility data quality

    Representation data quality

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    9/45

    9

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    10/45

    10

    Data Quality Issuesin

    Data Warehousing Uniformity

    Version Completeness check

    Conformity check

    Genealogy check (drill down)

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    11/45

    11

    TheInternet andCommercialDatabase

    Services

    For external data

    The Internet: major supplier of external data

    Commercial Data Banks: sell access to specialized

    databases

    Can add external data to the MSS in a timely

    manner and at a reasonable cost

    Decision Support Systems and Intelligent Systems, Efraim Turban and Jay E. Aronson, 6th edition

    Copyright 2001, Prentice Hall, Upper Saddle River, NJ

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    12/45

    12

    The InternetandCommercialDatabases

    ServersUse Web Browsers to

    Access vital information by employees and

    customers

    Implement executive information systems

    Implement group support systems (GSS)

    Database management systems provide data in

    HTML, on Web servers directly

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    13/45

    13

    Database Management SystemsinDSS

    DBMS: Software program for entering (or adding)

    information into a database; updating, deleting,

    manipulating, storing, and retrieving information

    A DBMS + modeling language to develop DSS

    DBMS to handle LARGE amounts of information

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    14/45

    14

    Database Organizationand Structure

    Relational databases

    Hierarchical databases

    Network databases Object-oriented databases

    Multimedia-based databases

    Document-based databases Intelligent databases

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    15/45

    15

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    16/45

    16

    Data Warehousing

    Physical separation of operational and decision support

    environments

    Purpose: to establish a data repository making operational data

    accessible

    Transforms operational data to relational form

    Only data needed for decision support come from the TPS

    Data are transformed and integrated into a consistent structure

    Data warehousing (information warehousing): solves the data

    access problem

    End users perform ad hoc query, reporting analysis and

    visualization

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    17/45

    17

    Data Warehousing Benefits

    Increase in knowledge worker productivity

    Supports all decision makers data requirements

    Provide ready access to critical data

    Insulates operation databases from ad hocprocessing

    Provides high-level summary information

    Provides drill down capabilities

    Yields Improved business knowledge

    Competitive advantage

    Enhances customer service and satisfaction

    Facilitates decision making

    Help streamline business processes

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    18/45

    18

    Data WarehouseArchitectureand Process

    Two-tier architecture

    Three-tier architecture

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    19/45

    19

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    20/45

    20

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    21/45

    21

    Data Warehouse Components Large physical database

    Logical data warehouse

    Data mart Decision support systems (DSS) and executive

    information system (EIS)

    Can feed OLAP

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    22/45

    22

    Data Marts

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    23/45

    23

    DW Suitability

    For organizations where

    Data are in different systems

    Information-based approach to management in use

    Large, diverse customer base

    Same data have different representations in different

    systems

    Highly technical, messy data formats

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    24/45

    24

    CharacteristicsofDataWarehousing

    1. Data organized by detailed subject with

    information relevant for decision support

    2. Integrated data

    3. Time-variant data

    4. Non-volatile data

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    25/45

    25

    OLAP: Data AccessandMining, Querying,and

    Analysis

    Online analytical processing (OLAP)

    DSS and EIS computing done by end-users in online

    systems

    Versus online transaction processing (OLTP)

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    26/45

    26

    OLAP Activities Generating queries

    Requesting ad hoc reports

    Conducting statistical and other analyses

    Developing multimedia applications

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    27/45

    27

    OLAP usesthedata warehouseandasetoftools,usually with

    multidimensional capabilities

    Query tools

    Spreadsheets

    Data mining tools

    Data visualization tools

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    28/45

    28

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    29/45

    29

    UsingSQL for Querying

    SQL (Structured Query Language)

    Data language

    English-like, nonprocedural, very user friendly

    language

    Free format

    Example:

    SELECT Name, Salary

    FROM Employees

    WHERE Salary >2000

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    30/45

    30

    Data Miningfor

    Knowledge discovery in databases

    Knowledge extraction

    Data archeology

    Data exploration

    Data pattern processing

    Data dredging

    Information harvesting

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    31/45

    31

    MajorData MiningCharacteristicsand Objectives

    Data are often buried deep

    Client/server architecture

    Sophisticated new tools--including advanced visualization

    tools--help to remove the information ore End-user miner empowered by data drills and other power

    query tools with little or no programming skills

    Often involves finding unexpected results

    Tools are easily combined with spreadsheets, etc.

    Parallel processing for data mining

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    32/45

    32

    Data Mining Application Areas

    Marketing

    Banking

    Retailing and sales

    Manufacturing and production

    Brokerage and securities trading

    Insurance

    Computer hardware and software

    Government and defense Airlines

    Health care

    Broadcasting

    Law enforcement

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    33/45

    33

    IntelligentData Mining

    Use intelligent search to discover information withindata warehouses that queries and reports cannoteffectively reveal

    Find patterns in the data and infer rules from them

    Use patterns and rules to guide decision making andforecasting

    Five common types of information that can be yieldedby data mining: 1) association, 2) sequences, 3)classifications, 4) clusters, and 5) forecasting

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    34/45

    34

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    35/45

    35

    Main ToolsUsedinIntelligentData Mining

    Case-based Reasoning

    Neural Computing

    Intelligent Agents

    Other Tools

    Decision trees Rule induction

    Data visualization

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    36/45

    36

    Data VisualizationandMultidimensionality

    Data Visualization Technologies

    Digital images

    Geographic information systems

    Graphical user interfaces

    Multidimensions

    Tables and graphs

    Virtual reality

    Presentations

    Animation

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    37/45

    37

    Multidimensionality

    3-D + Spreadsheets (OLAP has this)

    Data can be organized the way managers like to seethem, rather than the way that the system analysts do

    Different presentations of thesame data can bearranged easily and quickly

    Dimensions: products, salespeople, market segments,business units, geographical locations, distributionchannels, country, or industry

    Measures: money, sales volume, head count, inventoryprofit, actual versus forecast

    Time: daily, weekly, monthly, quarterly, or yearly

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    38/45

    38

    MultidimensionalityLimitations

    Extra storage requirements

    Higher cost

    Extra system resource and time consumption More complex interfaces and maintenance

    Multidimensionality is especially popular in

    executive information and support systems

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    39/45

    39

    Geographic Information

    Systems (GIS) A computer-based system for capturing, storing,

    checking, integrating, manipulating, and displaying

    data using digitized maps

    Spatially-oriented databases

    Useful in marketing, sales, voting estimation, planned

    product distribution

    Available via the Web

    Can use with GPS

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    40/45

    40

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    41/45

    41

    Virtual Reality

    An environment and/or technology that provides

    artificially generated sensory cues sufficient to

    engender in the user some willing suspension of

    disbelief

    Can share data and interact

    Can analyze data by creating a landscape

    Useful in marketing, prototyping aircraft designs

    VR over the Internet through VRML

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    42/45

    42

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    43/45

    43

    Business Intelligence

    onthe Web Can capture and analyze data from Web

    Tools deployed on Web

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    44/45

    44

    Summary

    Data for decision making come from internal andexternal sources

    The database management system is one of the

    major components of most management supportsystems

    Familiarity with the latest developments is critical

    Data contain a gold mine of information if they candig it out

    Organizations are warehousing and mining data

    Multidimensional analysis tools and new enterprise-wide system architectures are useful

    OLAP tools are also useful

  • 8/3/2019 Data Warehousing, Access, Analysis, Mining, And Visualization

    45/45

    45

    Summary (contd.)

    New data formats for multimedia DBMS

    Internet and intranets via Web browser

    interfaces for DBMS access

    Built-in artificial intelligence methods inDBMS