Construction of a database Per Weidenman PAR AB

Upload: teige

Post on 22-Jan-2016


TRANSCRIPT

Page 1: Construction of a database Per Weidenman PAR AB

Construction of a database

Per Weidenman, PAR AB

Page 2: Construction of a database Per Weidenman PAR AB

Database
• A collection of data
• It belongs together
• It models the "world"

Database management system (DBMS)
• The database (a collection of interrelated data)
• Software to manage and access the data

Page 3: Construction of a database Per Weidenman PAR AB

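[Diagram: DBMS with input (transactions), organised data, users and statistical analysis; label: DBMS requirements]

DBMS

Input: transactions

User:
• Searching
• Reporting

Statistical analysis

Organised data
• "Database"
• Data warehouse
• etc.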

Page 4: Construction of a database Per Weidenman PAR AB

Database management systems (DBMS)
• Microsoft Access
• Microsoft SQL Server
• DB2
• Oracle
• MySQL
• FirebirdSQL
• etc.

SQL – Structured Query Language
A computer language to define and search data

Page 5: Construction of a database Per Weidenman PAR AB

Relational databases

Tables containing data, organised in rows and columns

Keys, used for linking data in different tables

Page 6: Construction of a database Per Weidenman PAR AB

Example

Simple database for collecting and organising statistical papers

Created in Microsoft Access

Page 7: Construction of a database Per Weidenman PAR AB

Paper name and details

Link to document (PDF file)

Authors

Page 8: Construction of a database Per Weidenman PAR AB

A database with four tables

Keys

Page 9: Construction of a database Per Weidenman PAR AB

One of the tables, containing paper name and details

One paper on each row

Rows containing paper name and other details

Key

Page 10: Construction of a database Per Weidenman PAR AB

The keys are used to link data in the four tables

Page 11: Construction of a database Per Weidenman PAR AB

[Diagram: three linked tables with sample key values and sample names Aaaa, Bbbb, Cccc, Dddd]

Table "artiklar" (papers). Key: artikel_id

Table "författare2" (authors). Keys: artikel_id, person_id

Table "personer2" (persons). Key: person_id

One paper having 3 authors

One person being the author of 2 papers

Page 12: Construction of a database Per Weidenman PAR AB

A query: the result of asking the database about papers and authors

One paper and the corresponding 3 authors

One author and the corresponding 2 papers
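A query of this kind can be sketched with Python's built-in sqlite3 module rather than Microsoft Access. The table and key names follow the slides; the columns `titel` and `namn` and all sample rows are assumptions made only for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE artiklar  (artikel_id INTEGER PRIMARY KEY, titel TEXT);
CREATE TABLE personer2 (person_id  INTEGER PRIMARY KEY, namn  TEXT);
-- Link table: one row per (paper, person) pair.
CREATE TABLE "författare2" (
    artikel_id INTEGER REFERENCES artiklar(artikel_id),
    person_id  INTEGER REFERENCES personer2(person_id),
    PRIMARY KEY (artikel_id, person_id)
);
INSERT INTO artiklar  VALUES (1, 'Paper one'), (2, 'Paper two');
INSERT INTO personer2 VALUES (1, 'Aaaa'), (2, 'Bbbb'), (3, 'Cccc');
-- Paper 1 has three authors; person 1 is an author of two papers.
INSERT INTO "författare2" VALUES (1, 1), (1, 2), (1, 3), (2, 1);
""")

# Follow the keys from papers through the link table to persons.
rows = conn.execute("""
    SELECT a.titel, p.namn
    FROM artiklar AS a
    JOIN "författare2" AS f ON f.artikel_id = a.artikel_id
    JOIN personer2 AS p ON p.person_id = f.person_id
    ORDER BY a.artikel_id, p.person_id
""").fetchall()

for titel, namn in rows:
    print(titel, "|", namn)
```

The link table is what lets one paper carry several authors and one person appear on several papers without repeating paper or person details.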

Page 13: Construction of a database Per Weidenman PAR AB

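[Diagram: DBMS with input (transactions), organised data, users and statistical analysis; label: DBMS requirements]

DBMS

Input: transactions

User:
• Searching
• Reporting

Statistical analysis

Organised data
• "Database"
• Data warehouse
• etc.

IT Department

"Business" users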

Page 14: Construction of a database Per Weidenman PAR AB

DBMS requirements from a statistical / analytical viewpoint

• Data quality
• Data types
• Performance
• Maximum information
• Historical data
• Regulation and secrecy

Page 15: Construction of a database Per Weidenman PAR AB

DBMS requirements from a statistical / analytical viewpoint

Data quality

[Diagram: DBMS with input (transactions) and users (searching, reporting)]

Instead of entering text/data by typing…

Sales System X
Enter customer name:

… use, if possible, selection from a list of valid values

Sales System X
Choose customer name:
• Volvo Personvagnar AB
• Volvo Lastvagnar AB
• Volvo Construction AB
• Volvo Bussar AB
• Volvo Logistics AB
• …
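On the database side, a list of valid values can be enforced with a lookup table and a foreign key. The following is a minimal sketch using Python's sqlite3 module; the table names `customers` and `sales` and the helper `record_sale` are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite enforces foreign keys only when this is switched on per connection.
conn.execute("PRAGMA foreign_keys = ON")
conn.executescript("""
CREATE TABLE customers (customer_name TEXT PRIMARY KEY);
CREATE TABLE sales (
    sale_id INTEGER PRIMARY KEY,
    customer_name TEXT NOT NULL REFERENCES customers(customer_name)
);
INSERT INTO customers VALUES ('Volvo Personvagnar AB'), ('Volvo Lastvagnar AB');
""")

def record_sale(name):
    """Insert a sale; return True if accepted, False if the name is not in the list."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO sales (customer_name) VALUES (?)", (name,))
        return True
    except sqlite3.IntegrityError:
        return False

accepted = record_sale("Volvo Lastvagnar AB")
rejected = record_sale("Volvo Lastvagner AB")  # misspelt free-text entry
print(accepted, rejected)
```

With the constraint in place, a misspelt name cannot create a second "customer" that later has to be merged by hand before analysis.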

Page 16: Construction of a database Per Weidenman PAR AB

DBMS requirements from a statistical / analytical viewpoint

Data quality

[Diagram: DBMS with input (transactions) and users (searching, reporting)]

Sales System X
Enter customer age:

Define rules for valid input (values, intervals, etc.)

We don't want:
• Negative values
• 40+
• 1982
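Such rules can be written into the table definition as a CHECK constraint. This is a sketch with Python's sqlite3 module; the upper limit of 120 and the helper `try_insert` are assumptions, not values from the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
CREATE TABLE customers (
    customer_id INTEGER PRIMARY KEY,
    -- Accept a missing age (NULL) or an integer in a plausible interval.
    age INTEGER CHECK (typeof(age) IN ('integer', 'null')
                       AND age BETWEEN 0 AND 120)
)
""")

def try_insert(age):
    """Return True if the database accepts the value, False if the rule rejects it."""
    try:
        with conn:
            conn.execute("INSERT INTO customers (age) VALUES (?)", (age,))
        return True
    except sqlite3.IntegrityError:
        return False

# One valid age and the three unwanted inputs from the slide.
results = {value: try_insert(value) for value in (34, -5, "40+", 1982)}
print(results)
```

Placing the rule in the database means every input system writing to the table is held to the same definition of a valid age.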

Page 17: Construction of a database Per Weidenman PAR AB

DBMS requirements from a statistical / analytical viewpoint

Data quality

[Diagram: DBMS with input (transactions) and users (searching, reporting)]

Handling of missing values …

Missing values should be stored as "null" in the database, not as 0 (digit zero).
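The difference shows up as soon as a statistic is computed. In this sketch (Python's sqlite3 module, hypothetical column names and ages) the same three customers are stored twice, once with the unknown age as null and once as 0.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (age_null INTEGER, age_zero INTEGER)")
# Three customers; the third did not report an age.
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [(30, 30), (40, 40), (None, 0)],
)

# AVG and COUNT(column) skip nulls, but treat 0 as a real observation.
mean_null, mean_zero, n_known = conn.execute(
    "SELECT AVG(age_null), AVG(age_zero), COUNT(age_null) FROM customers"
).fetchone()
print(mean_null, mean_zero, n_known)
```

With null, the mean is taken over the two known ages; with 0, the missing value silently drags the mean down.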

Page 18: Construction of a database Per Weidenman PAR AB

DBMS requirements from a statistical / analytical viewpoint

Data types

Text

Numeric

Page 19: Construction of a database Per Weidenman PAR AB

DBMS requirements from a statistical / analytical viewpoint

Performance

[Diagram: DBMS with input (transactions), organised data, users (searching, reporting) and statistical analysis; label: DBMS requirements]

User (searching, reporting):
• Searching for individual records
• Creating "prepared" reports by counting or summing

Statistical analysis:
• Large datasets
• Multivariate methods
• Iterative estimation
• Etc.

Page 20: Construction of a database Per Weidenman PAR AB

DBMS requirements from a statistical / analytical viewpoint

Maximum information

[Diagram: DBMS with input (transactions) and users (searching, reporting)]

Sales System X
Enter customer age: 34

We need to report on age groups:
• 20-29
• 30-39
• 40-49
• …

Thus we store age as an interval, not as a value!

The fallacy of being too user oriented!
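Read under the heading "maximum information", the slide appears to argue for the opposite design: store the exact value and derive the groups when reporting. A sketch with Python's sqlite3 module, using hypothetical ages:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, age INTEGER)")
conn.executemany("INSERT INTO customers (age) VALUES (?)", [(34,), (29,), (41,), (38,)])

# Integer division by 10 maps each exact age to its ten-year group.
groups = conn.execute("""
    SELECT (age / 10) * 10 AS age_from, (age / 10) * 10 + 9 AS age_to, COUNT(*) AS n
    FROM customers
    GROUP BY age_from, age_to
    ORDER BY age_from
""").fetchall()

# The exact values remain available for analysis; stored intervals would not allow this.
mean_age = conn.execute("SELECT AVG(age) FROM customers").fetchone()[0]
print(groups, mean_age)
```

The report users still get their age groups, while the analyst keeps the option of other groupings or a mean.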

Page 21: Construction of a database Per Weidenman PAR AB

DBMS requirements from a statistical / analytical viewpoint

Historical data

[Diagram: DBMS with input (transactions) and users (searching, reporting)]

Sales System X
Customer name:
Customer address:
Order date:
Order value:

Table: Orders
• CustomerID
• Orderdate
• Ordervalue

Each new order for a specific customer …

… will be added to table Orders and stored as a "new row"

Page 22: Construction of a database Per Weidenman PAR AB

DBMS requirements from a statistical / analytical viewpoint

Historical data

[Diagram: DBMS with input (transactions) and users (searching, reporting)]

Sales System X
Customer name:
Customer address:
Order date:
Order value:

Table: Customers
• CustomerID
• Customername
• Customeraddress

But a new address …

… will probably UPDATE the existing record (row) for the specific customer

Thus, the old value of "customer address" will be deleted and replaced with the new value.

But this will do fine for users focusing on searching / reporting!

Page 23: Construction of a database Per Weidenman PAR AB

DBMS requirements from a statistical / analytical viewpoint

Historical data

[Diagram: DBMS with input (transactions) and users (searching, reporting)]

Table: Customers
• CustomerID
• Customername
• Customeraddress

Table: Customers_history
• CustomerID
• Customername
• Customeraddress
• From
• To

Create a new table to contain historic records

Each time a value is UPDATED for a certain customer …

… the complete (previous) record is transferred to the table Customers_history
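One way to automate the transfer is a database trigger. The sketch below uses Python's sqlite3 module; the table and column names follow the slides, while the `valid_from` column on Customers, the trigger name and the sample addresses and dates are assumptions added so that "From" and "To" can be filled in.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (
    CustomerID INTEGER PRIMARY KEY,
    Customername TEXT,
    Customeraddress TEXT,
    valid_from TEXT              -- assumed: date the current values took effect
);
CREATE TABLE Customers_history (
    CustomerID INTEGER,
    Customername TEXT,
    Customeraddress TEXT,
    "From" TEXT,
    "To" TEXT
);
-- Before a row changes, copy the previous version to the history table.
CREATE TRIGGER keep_history BEFORE UPDATE ON Customers
BEGIN
    INSERT INTO Customers_history
    VALUES (OLD.CustomerID, OLD.Customername, OLD.Customeraddress,
            OLD.valid_from, NEW.valid_from);
END;
""")

conn.execute(
    "INSERT INTO Customers VALUES (1, 'Volvo Bussar AB', 'Old Street 1', '2005-01-01')"
)
conn.execute("""
    UPDATE Customers
    SET Customeraddress = 'New Street 2', valid_from = '2007-06-01'
    WHERE CustomerID = 1
""")

current = conn.execute("SELECT Customeraddress FROM Customers").fetchone()[0]
history = conn.execute(
    'SELECT Customeraddress, "From", "To" FROM Customers_history'
).fetchall()
print(current, history)
```

The input system keeps issuing ordinary UPDATE statements; the old address survives in Customers_history without any change to that system.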

Page 24: Construction of a database Per Weidenman PAR AB

DBMS requirements from a statistical / analytical viewpoint

Historical data

[Diagram: DBMS with input (transactions) and users (searching, reporting)]

Table: Customers
• CustomerID
• Customername
• Customeraddress

Table: Customers_history
• CustomerID
• Customername
• Customeraddress
• From
• To

This structure will make analysis of processes possible

But not easy!
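A simple question such as "what was the address on a given date?" already needs both tables. The following sketch uses Python's sqlite3 module; the `valid_from` column, the function `address_on` and all sample rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Customers (CustomerID INTEGER, Customeraddress TEXT, valid_from TEXT);
CREATE TABLE Customers_history (
    CustomerID INTEGER, Customeraddress TEXT, "From" TEXT, "To" TEXT
);
INSERT INTO Customers VALUES (1, 'Street C', '2007-06-01');
INSERT INTO Customers_history VALUES
    (1, 'Street A', '2001-01-01', '2004-03-15'),
    (1, 'Street B', '2004-03-15', '2007-06-01');
""")

def address_on(customer_id, day):
    """Return the address valid on `day` (ISO date string), or None if unknown."""
    row = conn.execute("""
        SELECT Customeraddress FROM Customers_history
        WHERE CustomerID = ? AND "From" <= ? AND ? < "To"
        UNION ALL
        SELECT Customeraddress FROM Customers
        WHERE CustomerID = ? AND valid_from <= ?
    """, (customer_id, day, day, customer_id, day)).fetchone()
    return row[0] if row else None

print(address_on(1, "2005-10-01"), address_on(1, "2010-01-01"), address_on(1, "1999-01-01"))
```

ISO-formatted dates compare correctly as text, which is why plain string comparison is enough here.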

Page 25: Construction of a database Per Weidenman PAR AB

DBMS requirements from a statistical / analytical viewpoint

Regulation and secrecy

Page 26: Construction of a database Per Weidenman PAR AB

DBMS requirements from a statistical / analytical viewpoint

Current data

Current + historical data

Operating on individual records

Operating on many records

Next on this channel…

Page 27: Construction of a database Per Weidenman PAR AB

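[Diagram: DBMS with input (transactions), organised data, users and statistical analysis; label: DBMS requirements]

DBMS

Input: transactions

User:
• Searching
• Reporting

Statistical analysis

Organised data

A database containing historic transactions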

Page 28: Construction of a database Per Weidenman PAR AB

Board data

PAR / Bisnode database

Tables
• FTG: Basic company data. One record per company. Contains name, address, start date, end date, line of business, etc.
• FTG_H: Historic company data. Many records per company. Contains the accumulated historic records from table FTG.
• BOKSLUT: Balance sheet data. One record per annual report (thus many records per company). Turnover, profit, key ratios, etc.
• FUNKTION_PERIOD: Board member data. Many records per company and person.
• And many more tables!

Serrano

Statistical analysis

How?
• Historic names etc.
• Sampling for time series statistics

Page 29: Construction of a database Per Weidenman PAR AB

END

Page 30: Construction of a database Per Weidenman PAR AB

Basic company data
One record per company. Contains name, address, start date, end date, line of business, etc.

Page 31: Construction of a database Per Weidenman PAR AB

Historic company data
Many records per company. Contains the accumulated historic records from table FTG.

Page 32: Construction of a database Per Weidenman PAR AB

Balance sheet data
One record per annual report (thus many records per company). Turnover, profit, key ratios, etc.

Page 33: Construction of a database Per Weidenman PAR AB

Board member data
Many records per company and person.

Page 34: Construction of a database Per Weidenman PAR AB

Serrano
Balance sheet data from different periods transformed to yearly data records

Page 35: Construction of a database Per Weidenman PAR AB

Serrano
Historic transactions from FTG_H transformed to yearly data records

Page 36: Construction of a database Per Weidenman PAR AB
Page 37: Construction of a database Per Weidenman PAR AB

Serrano Board Data
Board member data from any mix of start date, end date and period length transformed to yearly data records

Page 38: Construction of a database Per Weidenman PAR AB

Summing up register data to annual figures

[Timeline diagram. Axis: year (ÅR), marked 3, 2, 1 and now (Nu). Company: A]

Example. Register containing balance sheet data:
• Number of employees
• Turnover
• Profit
• Tangible assets
• Etc.

Page 39: Construction of a database Per Weidenman PAR AB

Summing up register data to annual figures

[Timeline diagram. Axis: year (ÅR), marked 3, 2, 1 and now (Nu). Companies: A, B]

Non-calendar fiscal year (brutet räkenskapsår)

Page 40: Construction of a database Per Weidenman PAR AB

Summing up register data to annual figures

[Timeline diagram. Axis: year (ÅR), marked 3, 2, 1 and now (Nu). Companies: A, B, C]

Changed fiscal years (omlagda räkenskapsår)

Page 41: Construction of a database Per Weidenman PAR AB

Summing up register data to annual figures

[Timeline diagram. Axis: year (ÅR), marked 3, 2, 1 and now (Nu). Companies: A, B, C, D]

Missing data

Page 42: Construction of a database Per Weidenman PAR AB

Summing up register data to annual figures

[Timeline diagram. Axis: year (ÅR), marked 3, 2, 1 and now (Nu). Company: B]

Proposal: Break down the flow variables (turnover, profit, etc.) into monthly values …

Page 43: Construction of a database Per Weidenman PAR AB

Summing up register data to annual figures

[Timeline diagram. Axis: year (ÅR), marked 3, 2, 1 and now (Nu). Company: B]

Proposal: … and sum the monthly values to a "fictitious" calendar-year value.

Proposal: … and impute for full coverage during the last year.
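The proposal can be sketched in a few lines of Python. The slides do not say how the monthly values or the imputation are computed; this sketch assumes an equal share per month and a simple scale-up to twelve months, and the function names and figures are hypothetical.

```python
from collections import defaultdict

def months_in(start, end):
    """Yield (year, month) for each month from start to end, inclusive."""
    year, month = start
    while (year, month) <= end:
        yield year, month
        month += 1
        if month > 12:
            year, month = year + 1, 1

def to_calendar_years(reports):
    """Spread each report's flow value over its months, then sum per calendar year.

    reports: list of (first_month, last_month, value), months given as (year, month).
    Returns (total per year, number of covered months per year).
    """
    totals = defaultdict(float)
    covered = defaultdict(int)
    for start, end, value in reports:
        months = list(months_in(start, end))
        for year, _ in months:
            totals[year] += value / len(months)  # assumed: equal share per month
            covered[year] += 1
    return dict(totals), dict(covered)

def impute_full_year(total, months_covered):
    """Scale a partly covered year up to twelve months (assumed imputation rule)."""
    return total * 12 / months_covered

# A company with a July to June fiscal year and two annual reports (turnover).
reports = [((2004, 7), (2005, 6), 1200.0), ((2005, 7), (2006, 6), 2400.0)]
totals, covered = to_calendar_years(reports)
last_year = impute_full_year(totals[2006], covered[2006])
print(totals, covered, last_year)
```

Only flow variables such as turnover and profit are spread this way; stock variables such as tangible assets would need a different treatment.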

Page 44: Construction of a database Per Weidenman PAR AB

Summing up register data to annual figures

[Timeline diagram. Axis: year (ÅR), marked 3, 2, 1 and now (Nu). Company: B]

Database

Page 45: Construction of a database Per Weidenman PAR AB

First example
Register-based transport statistics for SIKA:

Decreased response burden

Increased understanding of the transporting companies (as a complement to the "usual" focus on type of goods)

Time series describing economic status and change.

Page 46: Construction of a database Per Weidenman PAR AB

Objective: Describing economic status and change in transporting companies during the last ten years.

Total number of employees and turnover …

[Chart, 1997 to 2006. Series: number of employees (Anställda), axis 0 to 140,000; net turnover (Nettoomsättning), MKR, axis 0 to 250,000]

Page 47: Construction of a database Per Weidenman PAR AB

Objective: Describing economic status and change in transporting companies during the last ten years.

… or turnover growth compared to GDP (BNP)

[Chart, 1997 to 2006, axis 80 to 200. Series: the transport industries (Transportbranscherna) and GDP (BNP)]

Page 48: Construction of a database Per Weidenman PAR AB

Objective: Describing economic status and change in transporting companies during the last ten years.

… or profit development for different types of freight companies

Page 49: Construction of a database Per Weidenman PAR AB

Objective: Describing economic status and change in transporting companies during the last ten years.

… or the number of employees in a cohort of new companies.

Page 50: Construction of a database Per Weidenman PAR AB

        Active companies               Active limited companies            GDP
Year    Total    Of which limited      Employees    Net turnover (Mkr)    Current prices (Mkr)
1997    12912    10599                 98259        120284                1927001
1998    12788    10626                 100663       127745                2012091
1999    12547    10543                 102531       133078                2123971
2000    12562    10704                 106811       145496                2249987
2001    12383    10659                 112685       163418                2326176
2002    12432    10741                 114426       168214                2420761
2003    12616    10935                 115135       178294                2515150
2004    12689    11067                 118015       188913                2624964
2005    12709    11100                 119387       209819                2735218
2006    12514    11012                 121683       224225                2899653

Tables based on balance sheet data from each company
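A comparison of turnover growth with GDP can be derived from the table by indexing both series to a common base year. The slides do not state how the comparison was calculated; the base 1997 = 100 and the function name `index_series` are assumptions, while the figures are taken from the table.

```python
# Net turnover of active limited companies and GDP at current prices, both in Mkr.
turnover = {1997: 120284, 1998: 127745, 1999: 133078, 2000: 145496, 2001: 163418,
            2002: 168214, 2003: 178294, 2004: 188913, 2005: 209819, 2006: 224225}
gdp = {1997: 1927001, 1998: 2012091, 1999: 2123971, 2000: 2249987, 2001: 2326176,
       2002: 2420761, 2003: 2515150, 2004: 2624964, 2005: 2735218, 2006: 2899653}

def index_series(values, base_year=1997):
    """Express each year's value as an index with base_year = 100."""
    base = values[base_year]
    return {year: round(100 * value / base, 1) for year, value in values.items()}

turnover_index = index_series(turnover)
gdp_index = index_series(gdp)

for year in sorted(turnover):
    print(year, turnover_index[year], gdp_index[year])
```

Indexing removes the difference in scale between the two series, so their growth can be read from the same axis.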

Page 51: Construction of a database Per Weidenman PAR AB

[Table repeated from page 50]

What data is needed?

Company data including micro level history. Exactly which companies were active in transport during each year?

Balance sheet data from all transporting companies for each year

Page 52: Construction of a database Per Weidenman PAR AB

[Table repeated from page 50]

What data is needed?

Company data including micro level history. Exactly which companies were active in transport during each year?

Balance sheet data from all transporting companies for each year

Faster access to "last year's" data compared to taxation-based registers

Page 53: Construction of a database Per Weidenman PAR AB

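Sampling companies for time series statistics

[Timeline diagram. Axis: year (ÅR), marked 3, 2, 1 and now (Nu). Companies: A, B, C, D]

Page 54: Construction of a database Per Weidenman PAR AB

Sampling companies for time series statistics

[Timeline diagram. Axis: year (ÅR), marked 3, 2, 1 and now (Nu). Companies: A, B, C, D]

Page 55: Construction of a database Per Weidenman PAR AB

Sampling companies for time series statistics

[Timeline diagram. Axis: year (ÅR), marked 3, 2, 1 and now (Nu). Companies: A, B, C, D]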

Page 56: Construction of a database Per Weidenman PAR AB

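Sampling companies for time series statistics

[Timeline diagram. Axis: year (ÅR), marked 3, 2, 1 and now (Nu). Companies: A, B, C, D. Sets of companies shown at the year marks: ACD, ABCD, ABC]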