a powerpointwiki.hct.ac.uk/_media/computing/hnd/hndu33_lecture05.pdf · web enabled database...


Post on 21-Apr-2020


Page 1: A Powerpointwiki.hct.ac.uk/_media/computing/hnd/hndu33_lecture05.pdf · web enabled database applications other developments e.g. multimedia databases, document management systems,

2015-2016

Page 2:

Phil Smith

Page 3:

Learning outcomes

On successful completion of this unit you will:

1. Understand data models and database technologies. (Assignment 1)

2. Today we will complete LO1.

Page 4:

Recap

Last lesson – Normalisation.

Page 5:

Today

The first task is to progress the activity started last lesson on normalisation.

You have 40 minutes for this; remember that it will be part of Assignment 1.

Then we will cover new developments, which is the final part of learning outcome 1.

Finally, Assignment 1 will be reviewed and issued.

Page 6:

Task 1

The first task is to progress the activity started last lesson on normalisation.

You have 40 minutes for this; remember that it will be part of Assignment 1.

Page 7:

New developments

data mining and data warehousing

dynamic storage

web enabled database applications

other developments e.g. multimedia databases, document management systems, digital libraries

I will cover data warehousing but then you will research and present the other “new” developments listed above.

Page 8:

Data mining and data warehousing

Page 9:

Motivation and context

“Modern organizations are drowning in data but starving for information”.

Operational processing (transaction processing) captures, stores and manipulates data to support daily operations. This is the main thrust of this unit.

Information processing is the analysis of data or other forms of information to support decision making.

A data warehouse can consolidate and integrate information from many internal and external sources and arrange it in a meaningful format for making business decisions.

Page 10:

Definition

Data Warehouse (W. H. Inmon):

A subject-oriented, integrated, time-variant, non-updatable collection of data used in support of management decision-making processes.

Subject-oriented: e.g. customers, patients, students, products.

Integrated: Consistent naming conventions and formats for data drawn from multiple sources.

Time-variant: Can study trends and changes.

Non-updatable: Read-only, periodically refreshed.

Page 11:

Need for Data Warehousing

Integrated, company-wide view of high-quality information (from disparate databases)

Separation of operational and informational systems and data (for improved performance)

Table 11-1: comparison of operational and informational systems

Page 12:

Meaning of disparate

Definition of disparate in English:

adjective

Essentially different in kind; not able to be compared http://www.oxforddictionaries.com/definition/english/disparate

In our case it relates to different sources of data.

E.g. SQL Server, MySQL, Excel, Access, etc.
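As a rough illustration of what integrating disparate sources involves, the sketch below pulls customer records from a relational table and from a spreadsheet export, then maps both onto one naming convention and one date format. Everything in it is assumed for the example: the in-memory SQLite database stands in for a server such as SQL Server or MySQL, the CSV text stands in for an Excel export, and the table, column and customer names are hypothetical.

```python
import csv
import io
import sqlite3
from datetime import datetime

# Source 1 (assumed): a relational database; SQLite stands in for a database server.
src_db = sqlite3.connect(":memory:")
src_db.execute("CREATE TABLE customers (cust_name TEXT, joined TEXT)")
src_db.execute("INSERT INTO customers VALUES ('Ann Lee', '2015-09-01')")

# Source 2 (assumed): a spreadsheet export with different column names and date format.
SHEET = "Customer,Join Date\nBen Ode,14/10/2015\n"


def from_database(conn):
    """Read the relational source and rename its columns to the common names."""
    query = "SELECT cust_name, joined FROM customers"
    return [{"name": name, "joined": joined} for name, joined in conn.execute(query)]


def from_spreadsheet(text):
    """Read the spreadsheet source and convert dd/mm/yyyy dates to ISO format."""
    rows = []
    for record in csv.DictReader(io.StringIO(text)):
        joined = datetime.strptime(record["Join Date"], "%d/%m/%Y").date().isoformat()
        rows.append({"name": record["Customer"], "joined": joined})
    return rows


def integrate():
    """Combine both sources into one consistently named, consistently formatted list."""
    combined = from_database(src_db) + from_spreadsheet(SHEET)
    return sorted(combined, key=lambda row: row["name"])


if __name__ == "__main__":
    for row in integrate():
        print(row)
```

The point of the sketch is only that each source needs its own small adapter before the data can be compared side by side.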

Page 13:

Figure 1: Generic architecture (diagram labelled ETL)

One, company-wide warehouse

Periodic extraction, so data is not completely current in the warehouse

Page 14:


The ETL Process

Capture

Scrub or data cleansing

Transform

Load and Index

ETL = Extract, transform, and load
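The four steps named on this slide (capture, scrub, transform, load and index) can be sketched as a small pipeline. This is a minimal illustration under stated assumptions, not a production ETL tool: the raw rows, the `sales` table and the function names are all hypothetical, and an in-memory SQLite database plays the part of the warehouse.

```python
import sqlite3

# Hypothetical raw rows captured from source systems (assumed for illustration).
RAW_ROWS = [
    {"customer": " Alice ", "amount": "10.50", "date": "2016-01-05"},
    {"customer": "BOB", "amount": "n/a", "date": "2016-01-06"},  # unparseable amount
    {"customer": "alice", "amount": "4.50", "date": "2016-01-07"},
]


def capture():
    """Capture (extract): take a snapshot of the source data."""
    return list(RAW_ROWS)


def scrub(rows):
    """Scrub: drop rows whose amount cannot be parsed as a number."""
    clean = []
    for row in rows:
        try:
            float(row["amount"])
        except ValueError:
            continue
        clean.append(row)
    return clean


def transform(rows):
    """Transform: trim and lower-case names, convert amounts to numbers."""
    return [(r["customer"].strip().lower(), float(r["amount"]), r["date"]) for r in rows]


def load_and_index(conn, rows):
    """Load and index: insert the rows into the warehouse table and index it."""
    conn.execute("CREATE TABLE sales (customer TEXT, amount REAL, sale_date TEXT)")
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.execute("CREATE INDEX idx_sales_customer ON sales (customer)")
    conn.commit()


def run_etl():
    """Run the whole pipeline and return the warehouse connection."""
    conn = sqlite3.connect(":memory:")
    load_and_index(conn, transform(scrub(capture())))
    return conn


if __name__ == "__main__":
    warehouse = run_etl()
    print(warehouse.execute("SELECT * FROM sales").fetchall())
```

Keeping each step as its own function mirrors the slide: a row rejected during scrubbing never reaches the transform or load steps.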

Page 15:


Company Facts

The data warehouse will have a table of facts, usually specified by business analysts and implemented by data analysts who understand the warehouse topology.

The facts are predefined objects (e.g. stored procedures or views) which, when combined, can produce information for decision making. The data itself can be derived from multiple sources.
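One way to picture a fact as a predefined object is a database view that an analyst defines once and decision makers then query by name. The sketch below assumes a tiny warehouse with hypothetical `sales` and `product` tables and a hypothetical view called `fact_sales_by_product`; SQLite is used only because it runs without a server.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE sales (product_id INTEGER, region TEXT, amount REAL);

    INSERT INTO product VALUES (1, 'Widget'), (2, 'Gadget');
    INSERT INTO sales VALUES (1, 'North', 100.0), (1, 'South', 50.0), (2, 'North', 25.0);

    -- The predefined "fact": total sales per product, combining two tables.
    CREATE VIEW fact_sales_by_product AS
    SELECT p.name AS product, SUM(s.amount) AS total_sales
    FROM sales AS s
    JOIN product AS p ON p.product_id = s.product_id
    GROUP BY p.name;
    """
)


def sales_by_product(connection):
    """Query the predefined fact without needing to know the underlying tables."""
    query = "SELECT product, total_sales FROM fact_sales_by_product ORDER BY product"
    return connection.execute(query).fetchall()


if __name__ == "__main__":
    for product, total in sales_by_product(conn):
        print(product, total)
```

The person asking the business question only needs the view name; the join and grouping stay with whoever understands the warehouse layout.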

Page 16:


Data Mining

Goals:

Explain observed events or conditions

Confirm hypotheses

Explore data for new or unexpected relationships

Techniques:

Case-based reasoning

Rule discovery

Signal processing

Neural nets

Fractals

Page 17:

Data Mining

Data mining is knowledge discovery using a blend of statistical, AI, and computer graphics techniques.

New buzzword, old idea.

Inferring new information from already collected data.

Traditionally the job of data analysts.

Computers have changed this. It is far more efficient to comb through data using a machine than to eyeball statistical data.

Page 18:

Data Mining – Two Main Components

Knowledge Discovery

Concrete information gleaned from known data. Information you may not have known, but which is supported by recorded facts.

Knowledge Prediction

Uses known data to forecast future trends, events, etc. (e.g. stock market predictions)
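As a very small illustration of knowledge prediction, the sketch below fits a straight line to past values by ordinary least squares and extends it one period ahead. The sales figures and function names are hypothetical, and a straight-line trend is just one simple stand-in for the forecasting methods real tools use.

```python
def fit_line(ys):
    """Least-squares line through the points (0, ys[0]), (1, ys[1]), ..."""
    n = len(ys)
    if n < 2:
        raise ValueError("need at least two observations to fit a trend")
    xs = range(n)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


def forecast(ys, steps_ahead=1):
    """Extend the fitted trend a number of periods past the last observation."""
    slope, intercept = fit_line(ys)
    return intercept + slope * (len(ys) - 1 + steps_ahead)


if __name__ == "__main__":
    monthly_sales = [100.0, 110.0, 120.0, 130.0]  # hypothetical known data
    print(forecast(monthly_sales))
```

Knowledge discovery would stop at describing the four known months; prediction is the step of projecting the pattern forward.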

Page 19:

Data Mining vs. Data Analysis

In terms of software and the marketing thereof:

Data Mining != Data Analysis

Data Mining implies that the software uses some intelligence, beyond simple grouping and partitioning of data, to infer new information.

Data Analysis is more in line with standard statistical software (e.g. web stats). These packages usually present information about subsets and relations within the recorded data set (e.g. browser/search engine usage, average visit time, etc.).

Page 20:

Key Component of Data Mining

Whether Knowledge Discovery or Knowledge Prediction, data mining takes information that was once quite difficult to detect and presents it in an easily understandable format (e.g. graphical or statistical).

Data mining techniques involve sophisticated algorithms, including decision tree classification, association detection, and clustering.

Data mining goes hand in hand with data warehouses.
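Of the techniques named on this slide, clustering is the easiest to show in a few lines. The sketch below is a bare-bones one-dimensional k-means, offered only as an illustration: the spending figures and starting centroids are hypothetical, and real tools use more robust variants.

```python
def kmeans_1d(values, centroids, iterations=10):
    """Group numbers around the given starting centroids using k-means."""
    centroids = list(centroids)
    clusters = [[] for _ in centroids]
    for _ in range(iterations):
        # Assignment step: put each value in the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for value in values:
            distances = [abs(value - c) for c in centroids]
            clusters[distances.index(min(distances))].append(value)
        # Update step: move each centroid to the mean of its cluster.
        # An empty cluster keeps its previous centroid.
        centroids = [
            sum(cluster) / len(cluster) if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
    return centroids, clusters


if __name__ == "__main__":
    weekly_spend = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]  # hypothetical customer data
    centres, groups = kmeans_1d(weekly_spend, centroids=[1.0, 12.0])
    print(centres)
    print(groups)
```

Nobody told the algorithm that there are low spenders and high spenders; the two groups emerge from the data, which is the sense in which mining goes beyond simple grouping.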

Page 21:

Uses of Data Mining

AI/Machine Learning

Combinatorial/Game Data Mining: Good for analyzing winning strategies to games, and thus developing intelligent AI opponents (e.g. chess).

Business Strategies

Market Basket Analysis: Identify customer demographics, preferences, and purchasing patterns.

Risk Analysis

Product Defect Analysis: Analyze product defect rates for given plants and predict possible complications (read: lawsuits) down the line.
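Market basket analysis is usually described in terms of support (how often items are bought together) and confidence (how often one item is bought given another). The sketch below computes both for a handful of baskets; the baskets and function names are hypothetical and the approach is the simplest possible counting, not a full association-rule algorithm.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets (assumed for illustration).
BASKETS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def pair_support(baskets):
    """Fraction of baskets that contain each pair of items."""
    counts = Counter()
    for basket in baskets:
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return {pair: count / len(baskets) for pair, count in counts.items()}


def confidence(baskets, antecedent, consequent):
    """Of the baskets containing the antecedent, the fraction also containing the consequent."""
    with_antecedent = [b for b in baskets if antecedent in b]
    if not with_antecedent:
        return 0.0
    both = sum(1 for b in with_antecedent if consequent in b)
    return both / len(with_antecedent)


if __name__ == "__main__":
    print(pair_support(BASKETS))
    print(confidence(BASKETS, "bread", "milk"))
```

A retailer might read a high-confidence pair as a hint about shelf placement or joint promotions, which is the purchasing-pattern use mentioned above.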

Page 22:

Data warehouse/mining

This was a very brief overview of data warehousing and data mining.

We shall revisit data warehousing and data mining next semester in the unit on Distributed Computing.

Page 23:

New developments

Task 2

Research the following terms.

You need to include:

What the term means

How it may be used

1. dynamic storage

2. data mining and data warehousing

3. web enabled database applications

4. other developments, e.g. multimedia databases, document management systems, digital libraries

You will be asked to give a precis of your research.

Page 24:

New developments

precis (verb): make a precis of (a text or speech).

synonyms: summarize, sum up, give a summary/synopsis/precis of, give the main points of; abridge, condense, shorten, synopsize, abstract, outline

Example: "another strategy for improving your writing skills is to precis a passage"

Page 25:

New developments

Assignment 1

Review and issue.

Page 26:

Summary

What have we learnt today?