CONCEPTUAL MODELLING OF ADMINISTRATIVE REGISTER INFORMATION AND XML
- TAXATION METADATA AS AN EXAMPLE
Ottawa, 16-18 May 2005
Heikki Rouhuvirta, Statistical Methodology R&D
2.5.2005 2Heikki Rouhuvirta
Contents
Background
The Challenge
Primary Questions
Test Case – Finnish Taxation
Data Semantics of Register Data
Taxation Metadata Definition
Some Results
The Future
Some Practical Steps on the Way
Background
Present state of compilation of administrative data as the challenge
CoSSI as the methodological framework for data semantics of registers
Codacmos as the organizational base for concept testing
Present state of compilation of administrative data
[Figure: data-flow diagram from the Administrative Data Source to the Destination NSI. Elements shown: operational systems or data warehouses (relational DB) as data sources; data gathering (e.g. SQL); transmission files (sequential/flat file); data communication network (internet, WAN; FTP + VPN); physical media (CD-ROM, magnetic tape); data extraction/transformation/loading with tailor-made programs or ETL products (e.g. Informatica, Oracle); data stores (flat file, relational DB); data combining; statistical register data; survey data; statistical application; statistical information; statistician; Handbook of Taxation etc.]
CoSSI
Common Structure of Statistical Information – CoSSI
covers different ways of organising statistical data (statistical data matrix and statistical table)
includes a model for defining the content information of statistics
includes a model for defining the methodology used in statistics (e.g. measuring and classification)
manages the complexity of statistical information (e.g. nested variable structures)
includes definitions for all types of statistical information: data, metadata for files, statistical metadata, quality declarations, charts
the main objective was to organise statistical data so that they also contain statistical metadata (describing both the structure and logic of statistical metadata at the same time)
Definition Descriptions available on the web at: http://www.stat.fi/org/tut/dthemes/drafts/cossi_definition_descriptions_v_09_2003.pdf
Statistical metadata see also from the web: http://www.stat.fi/org/tut/dthemes/papers/alternative_approach_to_metadata_codacmos_2004.pdf
Codacmos
Cluster of Data Collection Integration & Metadata Systems for Official Statistics
EU Project 2003-2004 (IST-2001-38636)
Consortium: Italian National Statistical Institute, Statistics Finland, University Of Edinburgh, National Statistical Service of Greece, DESAN Research Solutions, Statistical Division Of Municipality Of Milan, The Finnish Tax Administration, University Of Patras, Institute Of Informatics And Statistics, University Of Athens, National Social Security Institute, Tietokarhu Ltd, Statistics Norway
http://www.codacmos.eu.org
TAXATION METADATA Partners: Statistics Finland, The Finnish Tax Administration and Tietokarhu Ltd
The Challenge:
How can the present process, in which the description of administrative data can mostly be read only from the authorities' administrative handbooks, be transformed into one in which the content description of the data is present and usable both for statistics producers in the production process and for users of statistics in the distribution of statistical information?
Primary Questions
What are the metadata of administrative data?
How to process the metadata specifying the interpretation and use of administrative data collection and register data?
How to combine the original data description (e.g. concept definitions of register fields) with the variable description and measurement information of statistics?
Can accumulating interpretive metadata be "transported" through the processing of information and, if so, how?
Test Case – Finnish Taxation
(Finnish taxation on the web at: http://www.vero.fi)
Taxation: Types and Sources of income
type of income: earned income; capital income (income accruing from wealth)
income source: (personal income); business income; agricultural income
(income):
salary: remuneration based on regular employment or the performance of personal labour services
pension
income from foreign employment: wages received from work done abroad
capital gains: income accruing from profit on the sale or exchange of property or assets
fringe benefit: benefit from an employer in other than monetary form as part of normal wages or salary
(others)
Income tax deductions
deduction: amount deducted from tax, income subject to tax or net worth before final determination of tax to be paid
taxable income: income remaining after deduction from income subject to tax and serving as the basis for determining final tax
expense incurred in acquiring or maintaining income: expense or financial loss incurred in acquiring or maintaining income
Groups shown in the diagram: (Natural deduction), (deduction from pure income), (deduction from other than pure income), (deduction from taxable income), (others)
standard deduction for work-related expenses: deduction from wages granted ex officio by the tax authorities, replacing up to a certain sum the deduction for expenses incurred in acquiring or maintaining income
deduction for travel expenses: deduction allowed from earned income, determined on the basis of the costs of travelling to and from home and work
deduction for pension insurance premiums: deduction allowed from earned income, determined on the basis of premiums paid by the taxpayer for pension insurance and additional pension insurance
deduction for interest expenses: deduction for interest expenses related to business income, agricultural income, or capital income
discretionary allowance for circumstantial incapacity to pay taxes: deduction allowed from earned income on the basis of the taxpayer's reduced ability to pay taxes
Pension income allowance: deduction granted ex officio by the tax authorities on the basis of wages or business income
earned income allowance: deduction granted ex officio by the tax authorities on the basis of wages or business income
Low-income allowance: deduction from earned income granted ex officio by the tax authorities to persons with a low income
child maintenance credit: deduction from income tax granted in state taxation, based on child maintenance payments made by the taxpayer
domestic help credit: deduction granted from income tax, based on the payment for or employer's contributions paid on work that the taxpayer has done in his/her home or leisure-time residence
credit for capital income deficit: deduction granted from tax on earned income on the basis of expenses incurred in acquiring income, interest expenses or certain forms of financial losses when the taxpayer has insufficient capital income to claim deductions for these
Data Semantics of Register Data
Modelling methodology: the starting point is to distinguish between the substance concept model and the information model whereby the concepts are described
Information organizing method: any which doesn't lose information
Technology: any, without restrictions
Result: Taxation metadata definition (taxmeta.dtd)
Basic Substance Concept
Tax type: e.g. personal taxation
Type of income: e.g. earned income, capital income
Income: e.g. salary, pension
Type of tax deduction
Deduction
[Diagram of the concepts above, with parts labelled A) and B)]
Description Information
Internal instruction: instruction on a specific income and deduction area
Law: reference to a section of law
Law case: reference to a law case
Formula: how the tax is calculated
Income: e.g. salary, pension
Deduction
[Diagram of the items above, with parts labelled 1), 2) and 3)]
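The two slides above keep the substance concepts (tax type, type of income, income) apart from the description information attached to them (internal instruction, law, law case, formula). A minimal Python sketch of that separation might look as follows. The class and field names are illustrative assumptions, not taken from taxmeta.dtd, and the sample values only echo wording used elsewhere in this presentation. The deduction branch would mirror the income branch and is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Description:
    """Information model: how a substance concept is described (hypothetical fields)."""
    internal_instruction: Optional[str] = None  # instruction on a specific income/deduction area
    law: Optional[str] = None                   # reference to a section of law
    law_case: Optional[str] = None              # reference to a law case
    formula: Optional[str] = None               # how the tax is calculated


@dataclass
class Income:
    """Substance concept: a single income, e.g. salary or pension."""
    name: str
    description: Description = field(default_factory=Description)


@dataclass
class IncomeType:
    """Substance concept: a type of income, e.g. earned income."""
    name: str
    incomes: List[Income] = field(default_factory=list)


@dataclass
class TaxType:
    """Substance concept: a tax type, e.g. personal taxation."""
    name: str
    income_types: List[IncomeType] = field(default_factory=list)


wages = Income(
    name="wages",
    description=Description(
        law="Section 13 of the Preliminary Tax Withholding Act",
        law_case="Decision of the Supreme Administrative Court 1988/2290",
    ),
)
personal = TaxType("personal taxation", [IncomeType("earned income", [wages])])

# Walk the substance concepts and show the description attached to each income.
for income_type in personal.income_types:
    for income in income_type.incomes:
        print(income_type.name, "/", income.name, "/", income.description.law)
```

Keeping the description in its own class means the same substance hierarchy could be described in more or less detail without changing its shape.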
Taxation Metadata Definition (taxmeta.dtd)
Available on the web at: http://www.stat.fi/org/tut/dthemes/drafts/taxmeta_dtd_v_01.txt
Taxation Metadata - Logical Concept Model (I)
[Figure: tree diagram rooted at Taxation Metadata. Elements shown: taxation type (taxationname, taxationdescription); type of income (incometypename, incometypedescription, taxation law*, income); type of deduction (deductiontypename, deductiontypedescription, taxation law*, deduction); tax; taxation law; computing. Name and description leaves are #PCDATA; branches carry the occurrence indicators + and *; further branches are elided as ". . .".]
Taxation Metadata - Logical Concept Model (II)
[Figure: tree diagram rooted at income/deduction. Elements shown: name; description; internal instruction; formula; lawcase; taxation law; title; instruction; year; court of law; number; content. Text leaves are #PCDATA; branches carry the occurrence indicators + and *; further branches are elided as ". . .".]
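As a rough illustration of what an instance document shaped by these two diagrams might look like, the sketch below builds one with Python's standard xml.etree.ElementTree. The names taxationname, incometypename, incomename, lawcase and content appear on the slides. The container names (taxmeta, taxationtype, incometype, income) and the nesting are assumptions made for this example and may differ from the actual taxmeta.dtd.

```python
import xml.etree.ElementTree as ET

# Container element names and nesting are assumptions, not taken from taxmeta.dtd.
root = ET.Element("taxmeta")
taxation = ET.SubElement(root, "taxationtype")
ET.SubElement(taxation, "taxationname").text = "Personal taxation"

income_type = ET.SubElement(taxation, "incometype")
ET.SubElement(income_type, "incometypename").text = "Earned income"

income = ET.SubElement(income_type, "income")
ET.SubElement(income, "incomename").text = "Wages"

# A law case attached to the income; the text is shortened from the slides.
lawcase = ET.SubElement(income, "lawcase")
ET.SubElement(lawcase, "content").text = (
    "Wage paid during period of notice is also counted as wage income."
)

xml_text = ET.tostring(root, encoding="unicode")
print(xml_text)

# Collect every income name in the document, whatever its position in the tree.
names = [element.text for element in root.iter("incomename")]
print(names)
```

Because the description travels inside the same tree as the concept it describes, a consumer can reach the law case from the income without consulting a separate handbook.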
… result from register standpoint
Demonstration Report is available on the web at: http://www.stat.fi/org/tut/dthemes/papers/demoreport_on_taxation_metadata_codacmos_2004.pdf
Taxation register view
[Figure: screenshot of a taxpayer's tax register record, with a structure view and a metadata view. Callouts: tax type code used in the register; value in euro; plain-language code (derived or column name); metadata.]
… and result from statistics standpoint
Income distribution statistics – statistical metadata

Disposable income
Conceptual Definition: The key concept of "disposable income" in income distribution statistics is arrived at when current transfers paid are deducted from gross income.
Operational Definition: The formation of the disposable income of individuals or households is as follows:
+ Distributed factor income (Wages and salaries, Entrepreneurial income, Property income)
+ Current transfers received
- Current transfers paid
= Disposable income

Wages and salaries (= variablename)
Conceptual Definition: Wages and salaries refer to the compensations as money or benefits in kind received by households or persons during the year. The acquisition costs, excluding travel costs, of wages and salaries are deducted from them. The concept of wages and salaries used in income distribution statistics comprises pay for regular working hours, as well as overtime pay and income from a secondary job.
Operational Definition: The formation of the wages and salaries of individuals or households is as follows:
+ wages and salaries in cash
+ fringe benefits in kind based on employment relationship
+ reimbursement of costs based on employment relationship
- wage and salary acquisition costs (excl. travel costs)
= wages and salaries
Source: Taxation register
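The two operational definitions on this slide are plain sums, so they can be written out directly. The Python sketch below does that. The function and parameter names are chosen for this example, and the euro amounts are made up purely to exercise the arithmetic.

```python
def wages_and_salaries(cash, fringe_benefits, cost_reimbursements, acquisition_costs):
    """Wages and salaries as defined on the slide.

    + wages and salaries in cash
    + fringe benefits in kind based on employment relationship
    + reimbursement of costs based on employment relationship
    - wage and salary acquisition costs (excl. travel costs)
    """
    return cash + fringe_benefits + cost_reimbursements - acquisition_costs


def disposable_income(wages, entrepreneurial_income, property_income,
                      transfers_received, transfers_paid):
    """Disposable income as defined on the slide.

    + distributed factor income (wages and salaries, entrepreneurial income,
      property income)
    + current transfers received
    - current transfers paid
    """
    factor_income = wages + entrepreneurial_income + property_income
    return factor_income + transfers_received - transfers_paid


# Made-up euro amounts, for illustration only.
w = wages_and_salaries(cash=30000, fringe_benefits=1200,
                       cost_reimbursements=300, acquisition_costs=620)
d = disposable_income(wages=w, entrepreneurial_income=0, property_income=500,
                      transfers_received=2000, transfers_paid=8000)
print(w, d)
```

Each term of the sum corresponds to a concept that has its own description in the register metadata, which is what makes the link between statistical and register metadata possible.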
Income distribution statistics – taxation register metadata (I)

statistical metadata
Disposable income
Conceptual Definition: …..
Operational Definition: + … (Wages and salaries …
Wages and salaries (= variable)
Conceptual Definition: Wages and salaries refer …...
Operational Definition: …. + wages and salaries in cash + fringe benefits in kind based on employment relationship …

register metadata
Wages (= incomename)
Description: Section 13 of the Preliminary Tax Withholding Act defines the concept of wage as:
1) Any wage, commission, benefit or compensation received in an employment relationship;
2) Meeting attendance fee, personal compensation for lecturing, fee for the membership of an administrative organ, managing director's fee, wage drawn by a partner in a partnership company or limited partnership company and compensation received for a position of trust.
Wage income refers to any pay, fee, partial fee and other benefit or compensation paid for an office or post, or for work performed for its provider against compensation. Additional payments, such as seniority bonus, cost-of-living allowance, bonus for location in an isolated or sparsely populated area, rent allowance, Christmas bonus, gift commission and percentage of profits, count as wage income. Housing, meal and other fringe benefits, as well as staff benefits subject to tax also count as wage income.
lawcase
Content Description: Wage paid during period of notice, compensation paid for dismissal without justifiable grounds and retaining pay are also counted as wage income.
Ref: Decision of the Supreme Administrative Court 1988/2290
Link to the Decision
Income distribution statistics – taxation register metadata (II)

statistical metadata
Disposable income
Conceptual Definition: …..
Operational Definition: + … (Wages and salaries …
Wages and salaries (= variable)
Conceptual Definition: Wages and salaries refer …...
Operational Definition: …. + wages and salaries in cash + fringe benefits in kind based on employment relationship …

register metadata
Fringe benefits (= incomename)
Description: A fringe benefit received from the employer is valued at current value according to detailed annual specifications issued by the National Board of Taxation. Only the values of the most common fringe benefits have been confirmed by a decision of the National Board of Taxation.
Instruction: The values of the most common fringe benefits have been confirmed by a decision of the National Board of Taxation. The decisions of the National Board of Taxation on the principles for calculating fringe benefit values for taxation purposes (fringe benefit decisions) are:
- company housing [Decision No ...],
- right to unlimited consumption of electricity in connection with company housing [Decision No...],
- company garage [Decision No…],
- meal benefit [Decision No…] and
- company telephone [Decision No…].
Link to the Decision
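The two slides above place each component of a statistical variable next to the register metadata of the concept it is derived from. One way to picture that link is a lookup from component to register income name, sketched below in Python. The dictionary layout, the key names and the pairing of "wages and salaries in cash" with the register concept "Wages" are assumptions made for illustration; the quoted texts are shortened from the slides.

```python
# Register metadata keyed by income name (layout is hypothetical).
register_metadata = {
    "Wages": {
        "law": "Section 13 of the Preliminary Tax Withholding Act",
        "lawcase": "Decision of the Supreme Administrative Court 1988/2290",
    },
    "Fringe benefits": {
        "instruction": (
            "The values of the most common fringe benefits have been confirmed "
            "by a decision of the National Board of Taxation."
        ),
    },
}

# Statistical metadata: components of the variable and the register concept behind each.
wages_and_salaries = {
    "variablename": "Wages and salaries",
    "source": "Taxation register",
    "components": {
        "wages and salaries in cash": "Wages",
        "fringe benefits in kind based on employment relationship": "Fringe benefits",
    },
}


def register_view(variable, register):
    """Return the register metadata behind each component of a statistical variable."""
    return {
        component: register[income_name]
        for component, income_name in variable["components"].items()
    }


view = register_view(wages_and_salaries, register_metadata)
for component, metadata in view.items():
    print(component, "->", sorted(metadata))
```

With such a link in place, a statistician reading the variable definition can follow each component back to the law, law case or instruction that defines it in the register.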
The Future
Could it be ….
integrated register metadata
a genuinely metadata-driven statistical production process
rich metadata is present and available in all production stages, including editing as well as transforming of register concepts to statistical concepts
metadata accumulates as the process advances without losing old metadata
rich metadata is also available for users during the dissemination process of statistical information
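The idea that metadata accumulates as the process advances without losing old metadata can be pictured as an append-only trail. The Python sketch below is one possible reading, not a description of any existing system. The MetadataTrail class, its fields and the stage names are hypothetical.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass(frozen=True)
class MetadataTrail:
    """Append-only record of metadata attached to a data item (hypothetical structure)."""
    entries: Tuple[Tuple[str, str], ...] = ()

    def add(self, stage: str, note: str) -> "MetadataTrail":
        # Return a new trail; earlier entries are never modified or dropped.
        return MetadataTrail(self.entries + ((stage, note),))


register = MetadataTrail().add(
    "register", "Wages: Section 13 of the Preliminary Tax Withholding Act"
)
edited = register.add("editing", "checked values")
statistical = edited.add("statistics", "variable: Wages and salaries")

# The final trail still starts with the original register description.
for stage, note in statistical.entries:
    print(stage, ":", note)
```

Because each step returns a new trail instead of changing the old one, the register-level description stays available at the dissemination stage.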
XML based metadata-driven statistical production
[Figure: flow diagram of an xml based production system. Elements shown: Handbook of Register; Register Metadata (xml); data gathering; Questionnaires (xml); collection routines; transaction based data storage; units based data report with meta; 1° aggregation; data transmission; RDB; XML DB; units and variable based data organisation; data combining; collected data matrix based on CoSSI; statistical metadata based on CoSSI; combined data; data editing; checked values; conceptual formation; new variables; new metadata.]

Collected data matrix based on CoSSI (excerpt; elements shown collapsed on the slide are written here as empty elements):

    <?xml version="1.0" encoding="iso-8859-1" ?>
    <statmatrix>
      <matrixtitlegrp>
        <matrixtitle>
          <matrixmaintitle>Employees 2001.</matrixmaintitle>
        </matrixtitle>
      </matrixtitlegrp>
      <docmeta/>
      <statxdf>
        <array>
          <fieldAxis/>
          <axis axisId="axisvar1" axisIdRef="var1"/>
          <axis axisId="axisvar2" axisIdRef="var2"/>
          <axis axisId="axisvar3" axisIdRef="var3"/>
          <read/>
          <statmeta/>
          <data/>
        </array>
      </statxdf>
    </statmatrix>

Data matrix (rows: statistical units; columns: variables):

    Statistical unit   x1    x2    …   xj    …   xp
    a1                 x11   x12   …   x1j   …   x1p
    a2                 x21   x22   …   x2j   …   x2p
    .                  .     .         .         .
    ai                 xi1   xi2   …   xij   …   xip
    .                  .     .         .         .
    an                 xn1   xn2   …   xnj   …   xnp
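The statmatrix excerpt on this slide carries the title and the variable axes in the same document as the data. As a small illustration, the Python sketch below reads both with the standard xml.etree.ElementTree module. The fragment is the one from the slide, except that the elements shown collapsed there are written as empty elements so that it parses. That simplification is an assumption of this sketch and says nothing about the real content of those elements.

```python
import xml.etree.ElementTree as ET

# Excerpt from the slide; collapsed elements are written as empty elements
# so that the fragment is well-formed (an assumption of this sketch).
STATMATRIX = """<?xml version="1.0" encoding="iso-8859-1" ?>
<statmatrix>
  <matrixtitlegrp>
    <matrixtitle>
      <matrixmaintitle>Employees 2001.</matrixmaintitle>
    </matrixtitle>
  </matrixtitlegrp>
  <docmeta/>
  <statxdf>
    <array>
      <fieldAxis/>
      <axis axisId="axisvar1" axisIdRef="var1"/>
      <axis axisId="axisvar2" axisIdRef="var2"/>
      <axis axisId="axisvar3" axisIdRef="var3"/>
      <read/>
      <statmeta/>
      <data/>
    </array>
  </statxdf>
</statmatrix>"""

# Parse from bytes so that the declared encoding is honoured.
root = ET.fromstring(STATMATRIX.encode("iso-8859-1"))

# Title of the matrix, read from the metadata part of the document.
title = root.findtext("matrixtitlegrp/matrixtitle/matrixmaintitle")

# Each axis points to a variable through its axisIdRef attribute.
axes = [(axis.get("axisId"), axis.get("axisIdRef")) for axis in root.iter("axis")]

print(title)
for axis_id, variable_ref in axes:
    print(axis_id, "->", variable_ref)
```

Since title, axes and data sit in one document, a production step that reads the data also has the metadata at hand.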
Some Practical Steps on the Way
Plan to apply this scheme to the metadata of other registers (e.g. population register)
Integration of the structured statistical metadata system with statistical software packages (e.g. SAS, SuperStar) for simultaneous use