Construction of a database (Per Weidenman, PAR AB)

Post on 22-Jan-2016



TRANSCRIPT

Construction of a database

Per Weidenman
PAR AB

Database
• A collection of data
• It belongs together
• It models the "world"

Database management system (DBMS)
• The database (a collection of interrelated data)
• Software to manage and access the data

[Diagram labels]
DBMS
Input: transactions
User:
• Searching
• Reporting
Statistical analysis
Organised data:
• "Database"
• Data Warehouse
• etc.
DBMS requirements

Database management systems (DBMS)
• Microsoft Access
• Microsoft SQL Server
• DB2
• Oracle
• MySQL
• FirebirdSQL
• etc.

SQL – Structured Query Language
A computer language to define and search data

Relational databases

Tables containing data, organised in rows and columns

Keys, used for linking data in different tables

Example

Simple database for collecting and organising statistical papers

Created in Microsoft Access

Paper name and details

Link to document (PDF file)

Authors

A database with four tables

Keys

One of the tables, containing paper name and details

One paper on each row
Rows containing paper name and other details

Key

The keys are used to link data in the four tables

[Diagram: the key columns link rows in the tables "artiklar" (papers), "författare2" (authors) and "personer2" (persons)]

Table "artiklar"
Key: artikel_id

Table "författare2"
Keys: artikel_id, person_id

Table "personer2"
Key: person_id

One paper having 3 authors

One person being the author of 2 papers

A query: the result of asking the database about papers and authors

One paper and the corresponding 3 authors

One author and the corresponding 2 papers
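The linking of tables through keys can be sketched in a few lines. This is only an illustration, assuming SQLite through Python's built-in sqlite3 module instead of Microsoft Access. The table and key names (artiklar, författare2, personer2, artikel_id, person_id) are taken from the slides; the columns titel and namn and all rows are made up. Only three of the four tables are named in the slides, so only those are modelled.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript('''
    CREATE TABLE artiklar  (artikel_id INTEGER PRIMARY KEY, titel TEXT);  -- papers
    CREATE TABLE personer2 (person_id  INTEGER PRIMARY KEY, namn  TEXT);  -- persons
    -- Link table: one row per (paper, author) pair
    CREATE TABLE "författare2" (
        artikel_id INTEGER REFERENCES artiklar(artikel_id),
        person_id  INTEGER REFERENCES personer2(person_id)
    );
    -- Hypothetical sample rows
    INSERT INTO artiklar  VALUES (1, 'Paper one'), (2, 'Paper two');
    INSERT INTO personer2 VALUES (1, 'Aaaa'), (2, 'Bbbb'), (3, 'Cccc');
    INSERT INTO "författare2" VALUES (1, 1), (1, 2), (1, 3), (2, 1);
''')

def papers_and_authors(con):
    """Join the three tables on their keys: one result row per paper/author pair."""
    return con.execute('''
        SELECT a.titel, p.namn
        FROM artiklar AS a
        JOIN "författare2" AS f ON f.artikel_id = a.artikel_id
        JOIN personer2     AS p ON p.person_id  = f.person_id
        ORDER BY a.artikel_id, p.person_id
    ''').fetchall()

rows = papers_and_authors(con)
for titel, namn in rows:
    print(titel, "|", namn)
```

In this made-up data, "Paper one" has three authors and the person "Aaaa" is an author of two papers, which mirrors the two cases shown on the slides.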

[The DBMS diagram again, with two groups added]
• IT Department
• "Business" users

DBMS requirements from a statistical / analytical viewpoint

• Data quality
• Data types
• Performance
• Maximum information
• Historical data
• Regulation and secrecy

Data quality

Instead of entering text/data by typing …

Sales System X
Enter customer name:

Sales System X
Choose customer name:
• Volvo Personvagnar AB
• Volvo Lastvagnar AB
• Volvo Construction AB
• Volvo Bussar AB
• Volvo Logistics AB
• …

… use, if possible, selection from a list of valid values
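One way to enforce a list of valid values inside the database itself is a reference table plus a foreign key. The sketch below assumes SQLite via Python's sqlite3 module; the table names valid_customers and sales and the helper try_insert are hypothetical, and only two of the customer names from the slide are loaded.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when this is on
con.executescript("""
    -- The list of valid values
    CREATE TABLE valid_customers (customer_name TEXT PRIMARY KEY);
    -- Transactions may only refer to names in that list
    CREATE TABLE sales (
        sale_id       INTEGER PRIMARY KEY,
        customer_name TEXT NOT NULL REFERENCES valid_customers(customer_name)
    );
    INSERT INTO valid_customers VALUES
        ('Volvo Personvagnar AB'), ('Volvo Lastvagnar AB');
""")

def try_insert(name):
    """Return True if the sale was stored, False if the name is not in the list."""
    try:
        with con:  # commits on success, rolls back on error
            con.execute("INSERT INTO sales (customer_name) VALUES (?)", (name,))
        return True
    except sqlite3.IntegrityError:
        return False

ok = try_insert("Volvo Lastvagnar AB")   # chosen from the list
typo = try_insert("Volvo Lastvagnar")    # free-text variant that is not in the list
print(ok, typo)
```

The free-text variant is rejected, so the same customer cannot end up under several spellings.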

Data quality

Sales System X
Enter customer age:

Define rules for valid input (values, intervals, etc.)

We don't want:
• Negative values
• 40+
• 1982
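Such input rules can be written as a CHECK constraint, so the database refuses invalid values no matter which application sends them. A minimal sketch, assuming SQLite via Python's sqlite3 module; the table name customer, the helper accepts and the upper limit of 120 years are assumptions made for the example.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,
        -- Age must be missing (NULL) or a whole number in a plausible interval.
        -- The interval 0 to 120 is an assumption for this example.
        age INTEGER CHECK (
            age IS NULL OR (typeof(age) = 'integer' AND age BETWEEN 0 AND 120)
        )
    )
""")

def accepts(value):
    """Return True if the database stores the value, False if the rule rejects it."""
    try:
        with con:
            con.execute("INSERT INTO customer (age) VALUES (?)", (value,))
        return True
    except sqlite3.IntegrityError:
        return False

for candidate in (34, -5, "40+", 1982):
    print(repr(candidate), accepts(candidate))
```

The three unwanted inputs from the slide (a negative value, the text "40+" and a birth year typed into the age field) are rejected, while 34 is stored.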

Data quality

Handling of missing values …

Missing values should be stored as NULL in the database, not as 0 (digit zero).
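The difference matters as soon as anything is computed. The sketch below, assuming SQLite via Python's sqlite3 module and made-up ages, stores the same three records twice: once with the missing age as NULL and once as 0. The table names with_null and with_zero are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE with_null (age INTEGER);
    CREATE TABLE with_zero (age INTEGER);
    -- Two known ages and one missing age (hypothetical data)
    INSERT INTO with_null VALUES (30), (40), (NULL);
    INSERT INTO with_zero VALUES (30), (40), (0);
""")

# AVG ignores NULL, but treats 0 as a real observation
avg_null = con.execute("SELECT AVG(age) FROM with_null").fetchone()[0]
avg_zero = con.execute("SELECT AVG(age) FROM with_zero").fetchone()[0]

# With NULL the number of missing values can still be counted
n_missing = con.execute(
    "SELECT COUNT(*) FROM with_null WHERE age IS NULL"
).fetchone()[0]

print(avg_null, avg_zero, n_missing)
```

With NULL the mean is based on the two known ages only; with 0 the mean is pulled down and the missing value can no longer be told apart from a real zero.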

Data types
• Text
• Numeric

Performance

User (searching, reporting):
• Searching for individual records
• Creating "prepared" reports by counting or summing

Statistical analysis:
• Large datasets
• Multivariate methods
• Iterative estimation
• Etc.

Maximum information

Sales System X
Enter customer age: 34

We need to report on age groups:
• 20-29
• 30-39
• 40-49
• …

Thus we store age as an interval, not as a value!

The fallacy of being too user-oriented!
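Read as a warning, the slide suggests the opposite design: store the exact value and derive the groups when reporting. A small sketch of that idea, with made-up ages and a hypothetical helper age_group:

```python
from collections import Counter

def age_group(age, width=10):
    """Map an exact age to an interval label, e.g. 34 -> '30-39' for width 10."""
    lower = (age // width) * width
    return f"{lower}-{lower + width - 1}"

# Exact ages as they would be stored (hypothetical data)
ages = [34, 27, 41, 38, 29]

# The report the users asked for: ten-year groups
by_decade = Counter(age_group(a) for a in ages)

# Other groupings and statistics stay possible because the exact value was kept
by_five = Counter(age_group(a, width=5) for a in ages)
mean_age = sum(ages) / len(ages)

print(dict(by_decade))
print(dict(by_five))
print(mean_age)
```

Had only the ten-year interval been stored, neither the five-year groups nor the mean could be computed afterwards.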

Historical data

Sales System X
Customer name:
Customer address:
Order date:
Order value:

Table: Orders
• CustomerID
• Orderdate
• Ordervalue

Each new order for a specific customer will be added to table Orders and stored as a "new row".

Historical data

Table: Customers
• CustomerID
• Customername
• Customeraddress

But a new address will probably UPDATE the existing record (row) for the specific customer.

Thus, the old value of "customer address" will be deleted and replaced with the new value.

But this will do fine for users focusing on searching / reporting!

Historical data

Table: Customers
• CustomerID
• Customername
• Customeraddress

Table: Customers_history
• CustomerID
• Customername
• Customeraddress
• From
• To

Create a new table to contain historic records.

Each time a value is UPDATED for a certain customer, the complete (previous) record is transferred to the table Customers_history.
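One possible way to automate this transfer is a database trigger. The sketch below assumes SQLite via Python's sqlite3 module. The table and column names follow the slides, except valid_from, which is a hypothetical extra column recording when the current row became valid, and the trigger name keep_history. The customer and addresses are made up.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE Customers (
        CustomerID      INTEGER PRIMARY KEY,
        Customername    TEXT,
        Customeraddress TEXT,
        valid_from      TEXT   -- hypothetical: date the current row became valid
    );
    CREATE TABLE Customers_history (
        CustomerID      INTEGER,
        Customername    TEXT,
        Customeraddress TEXT,
        "From"          TEXT,
        "To"            TEXT
    );
    -- Before a row changes, copy the complete previous record to the history table
    CREATE TRIGGER keep_history BEFORE UPDATE ON Customers
    BEGIN
        INSERT INTO Customers_history
        VALUES (OLD.CustomerID, OLD.Customername, OLD.Customeraddress,
                OLD.valid_from, NEW.valid_from);
    END;
""")

con.execute("INSERT INTO Customers VALUES (1, 'Customer X', 'Old Street 1', '2015-01-01')")
con.execute("""
    UPDATE Customers
    SET Customeraddress = 'New Road 2', valid_from = '2016-06-01'
    WHERE CustomerID = 1
""")

current = con.execute("SELECT Customeraddress FROM Customers").fetchall()
history = con.execute(
    'SELECT Customeraddress, "From", "To" FROM Customers_history'
).fetchall()
print(current)
print(history)
```

After the update, Customers holds only the new address, while Customers_history holds the old address together with the period during which it was valid.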

Historical data

This structure (Customers plus Customers_history) will make analysis of processes possible.

But not easy!
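A small example hints at why it is not easy: even the simple question "what was the address on a given day?" has to combine both tables. The sketch assumes SQLite via Python's sqlite3 module, ISO-formatted date strings, a hypothetical valid_from column on Customers, a hypothetical helper address_on, and made-up rows.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.executescript("""
    CREATE TABLE Customers (
        CustomerID INTEGER PRIMARY KEY, Customername TEXT,
        Customeraddress TEXT,
        valid_from TEXT  -- hypothetical: date the current row became valid
    );
    CREATE TABLE Customers_history (
        CustomerID INTEGER, Customername TEXT,
        Customeraddress TEXT, "From" TEXT, "To" TEXT
    );
    INSERT INTO Customers VALUES (1, 'Customer X', 'New Road 2', '2016-06-01');
    INSERT INTO Customers_history
        VALUES (1, 'Customer X', 'Old Street 1', '2015-01-01', '2016-06-01');
""")

def address_on(con, customer_id, day):
    """Return the address valid on `day` (ISO date string), or None if unknown."""
    row = con.execute('''
        SELECT Customeraddress FROM (
            SELECT CustomerID, Customeraddress,
                   "From" AS valid_start, "To" AS valid_end
            FROM Customers_history
            UNION ALL
            -- The current row is treated as valid until further notice
            SELECT CustomerID, Customeraddress, valid_from, '9999-12-31'
            FROM Customers
        )
        WHERE CustomerID = ? AND valid_start <= ? AND ? < valid_end
    ''', (customer_id, day, day)).fetchone()
    return row[0] if row else None

print(address_on(con, 1, "2015-03-15"))
print(address_on(con, 1, "2017-01-01"))
print(address_on(con, 1, "2014-01-01"))
```

Every analysis over time needs this kind of stitching of current and historic rows, which is manageable for one customer but becomes laborious for whole populations and many variables.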

Regulation and secrecy

• Current data
• Current + historical data
• Operating on individual records
• Operating on many records

Next on this channel…

A database containing historic transactions

Board data

PAR / Bisnode database

Tables

FTG: Basic company data
One record per company. Contains name, address, startdate, enddate, line of business, etc.

FTG_H: Historic company data
Many records per company. Contains the accumulated historic records from table FTG.

BOKSLUT: Balance sheet data
One record per annual report (thus many records per company). Turnover, profit, key ratios, etc.

FUNKTION_PERIOD: Board member data
Many records per company and person.

And many more tables!

Serrano
Statistical analysis

How? Historic names etc.

Sampling for time series statistics

END


Serrano: balance sheet data from different periods transformed to yearly data records

Serrano: historic transactions from FTG_H transformed to yearly data records

Serrano Board Data: board member data from any mix of startdate, enddate and period length transformed to yearly data records

Summing up register data to annual figures

Example. Register containing balance sheet data:
• Number of employees
• Turnover
• Profit
• Tangible assets
• Etc.

[Timeline figure, built up in steps: companies A, B, C and D plotted against the years 3, 2, 1 and "Now" (axis label ÅR = year, Nu = now)]

• Company B: broken financial year (a financial year that does not follow the calendar year)
• Company C: changed financial year
• Company D: missing data

Proposal (illustrated for company B):
• Break down the flow variables (turnover, profit, etc.) into monthly values …
• … and sum the monthly values into a "notional" calendar-year value …
• … and impute to get full coverage for the last year.
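The proposal can be sketched as follows. This is not the method actually used in Serrano, only an illustration under simple assumptions: each reported flow value is spread evenly over the months of its reporting period, and a partly covered year is scaled up in proportion to the number of covered months. The function names and the two annual reports are hypothetical.

```python
from collections import defaultdict
from datetime import date

def months_between(start, end):
    """Yield (year, month) for every month from start to end, inclusive."""
    y, m = start.year, start.month
    while (y, m) <= (end.year, end.month):
        yield y, m
        m += 1
        if m > 12:
            y, m = y + 1, 1

def calendar_year_values(reports):
    """reports: list of (period_start, period_end, flow_value).

    Returns (yearly, covered): the notional calendar-year sums and the
    number of months per year that are covered by some report.
    """
    monthly = defaultdict(float)
    for start, end, value in reports:
        months = list(months_between(start, end))
        for ym in months:
            monthly[ym] += value / len(months)  # assumption: even spread over the period
    yearly, covered = defaultdict(float), defaultdict(int)
    for (year, _month), value in monthly.items():
        yearly[year] += value
        covered[year] += 1
    return dict(yearly), dict(covered)

def impute_full_year(yearly, covered, year):
    """Naive imputation: scale a partly covered year up to 12 months."""
    return yearly[year] * 12 / covered[year]

# Hypothetical company with a financial year running from July to June
reports = [
    (date(2004, 7, 1), date(2005, 6, 30), 1200.0),  # turnover in the first report
    (date(2005, 7, 1), date(2006, 6, 30), 2400.0),  # turnover in the second report
]
yearly, covered = calendar_year_values(reports)
last_year = impute_full_year(yearly, covered, 2006)
print(yearly, covered, last_year)
```

Only flow variables such as turnover and profit are treated this way; stock variables such as number of employees or tangible assets are not sums over time and would need a different rule.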

Database

First example
Register-based transport statistics for SIKA:

• Decreased response burden
• Increased understanding of the transporting companies (as a complement to the "usual" focus on type of goods)
• Time series describing economic status and change.

[Chart, 1997 to 2006: number of employees (Anställda) and net turnover in MSEK (Nettoomsättning, MKR), each on its own vertical axis]

Objective: Describing economic status and change in transporting companies during the last ten years.
Total number of employees and turnover …

[Chart, 1997 to 2006: the transport industries (Transportbranscherna) compared with GDP (BNP), vertical axis from 80 to 200]

Objective, continued:
… or turnover growth compared to GDP
… or profit development for different types of freight companies
… or the number of employees in a cohort of new companies.

Year  Active     Of which   Employees in    Net turnover in    GDP, current
      companies  limited    active limited  active limited     prices (MSEK)
      (total)    companies  companies       companies (MSEK)
1997  12912      10599       98259          120284             1927001
1998  12788      10626      100663          127745             2012091
1999  12547      10543      102531          133078             2123971
2000  12562      10704      106811          145496             2249987
2001  12383      10659      112685          163418             2326176
2002  12432      10741      114426          168214             2420761
2003  12616      10935      115135          178294             2515150
2004  12689      11067      118015          188913             2624964
2005  12709      11100      119387          209819             2735218
2006  12514      11012      121683          224225             2899653

Tables based on balance sheet data from each company
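A comparison of turnover growth with GDP can be derived from the table by turning both series into index numbers. The sketch below uses the net turnover and GDP figures from the table; taking 1997 as base year (1997 = 100) is an assumption, as is the helper name index_series.

```python
years = list(range(1997, 2007))

# Net turnover in active limited companies (MSEK), from the table
turnover = [120284, 127745, 133078, 145496, 163418,
            168214, 178294, 188913, 209819, 224225]

# GDP in current prices (MSEK), from the table
gdp = [1927001, 2012091, 2123971, 2249987, 2326176,
       2420761, 2515150, 2624964, 2735218, 2899653]

def index_series(values, base=0):
    """Express each value relative to the base observation (base = 100)."""
    return [100 * v / values[base] for v in values]

turnover_index = index_series(turnover)
gdp_index = index_series(gdp)

for year, t, g in zip(years, turnover_index, gdp_index):
    print(year, round(t, 1), round(g, 1))
```

With this base year, turnover in the transport companies grows noticeably faster than GDP over the period.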


What data is needed?

• Company data including micro-level history. Exactly which companies were active in transport during each year?
• Balance sheet data from all transporting companies for each year


• Faster access to "last year's" data compared to taxation-based registers

Sampling companies for time series statistics

[Timeline figure, built up in steps: companies A, B, C and D plotted against the years 3, 2, 1 and "Now" (axis label ÅR = year, Nu = now)]

Company groups marked in the last step: ACD, ABCD, ABC
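Before companies can be sampled for a given year, the population of that year has to be established: which companies were active then? A minimal sketch, assuming each company has a start year and an optional end year (in the spirit of the startdate and enddate fields mentioned for table FTG). The years below are hypothetical and chosen only so that the result reproduces the three groups named above; active_in_year is a hypothetical helper.

```python
def active_in_year(companies, year):
    """Return the names of the companies that were active during `year`.

    companies maps a name to (start_year, end_year); end_year None means
    the company is still active.
    """
    return sorted(
        name
        for name, (start, end) in companies.items()
        if start <= year and (end is None or year <= end)
    )

# Hypothetical life spans
companies = {
    "A": (2003, None),
    "B": (2005, None),   # enters later
    "C": (2003, None),
    "D": (2003, 2005),   # leaves before the last year
}

populations = {year: active_in_year(companies, year) for year in (2004, 2005, 2006)}
for year, names in populations.items():
    print(year, "".join(names))
```

A time series based only on companies that exist in every year would miss entries such as B and exits such as D, so the sampling frame has to be rebuilt for each year.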
