TRANSCRIPT
Francis Gross, DG Statistics, External Statistics Division
6 & 7 May 2010, Committee for the Coordination of Statistical Activities
Conference on Data Quality for International Organisations
Data Quality Management for Securities – the case of the Centralised Securities DataBase
Financial statistics based on micro data
• Statistics based on individual securities data
– Micro-data generated from business processes
– Classifications by statisticians
• Benefits
– Flexibility in serving event-driven policy needs, near time
– Ability to drill down, linking macro- to micro-issues
• Tool: the Centralised Securities Database (CSDB)
– Holds data on nearly 7 million securities
– Is in production with 27 National Central Banks online
[Diagram: securities data, issuer data and holdings data* feed macro-statistics on “who finances whom” and “how”.
* Holdings data aggregated by economic sector and country of residence.]
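To make the aggregation concrete, here is a minimal sketch of how security-by-security holdings roll up into a “who finances whom” view. The column names and figures are illustrative assumptions, not the actual CSDB schema.

```python
# Minimal sketch: aggregating security-by-security holdings into a
# "who finances whom" matrix. Column names and figures are illustrative,
# not the actual CSDB schema.
import pandas as pd

holdings = pd.DataFrame({
    "isin":           ["NL0000000001", "NL0000000002", "DE0000000003"],
    "issuer_country": ["NL", "NL", "DE"],
    "issuer_sector":  ["S.11", "S.11", "S.122"],  # ESA sector codes
    "holder_country": ["DE", "FR", "NL"],
    "holder_sector":  ["S.122", "S.128", "S.14"],
    "market_value":   [120.0, 80.0, 200.0],        # EUR millions
})

# "Who finances whom": holdings summed by (holder, issuer) sector/country.
who_finances_whom = (
    holdings
    .groupby(["holder_country", "holder_sector",
              "issuer_country", "issuer_sector"])["market_value"]
    .sum()
)
print(who_finances_whom)
```

Because the micro data stay linked to individual ISINs, any cell of the aggregate can be drilled down to the securities behind it.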
Centralised Securities Database (CSDB)
• CSDB provides consistent and up-to-date reference, income, price and volume data on individual securities
• CSDB is shared by the European System of Central Banks (ESCB)
• CSDB is intended to be the backbone for producing consistent and harmonised securities statistics
• CSDB plays a pivotal role in s-b-s* reporting as the reference database for the ESCB & associated institutions
* s-b-s: security by security
Main features of the CSDB
• Multi-source system
For coverage and quality: data from several providers (5 commercial data providers, National Central Banks (NCBs))
• Daily update frequency
2 million price records and 1 million reference data records per day
• Automated construction of a “golden copy”
Data on an instrument are grouped using algorithms; the most plausible attribute values are selected (the idea is sketched below)
• Data Quality Management (DQM) Network
Staff from all NCBs contribute to DQM, through human intervention and, increasingly, systems, to check “raw” data and validate “golden copy” results
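As a rough illustration of the compounding step, the sketch below groups feed records by instrument and selects the most plausible value per attribute through a source-weighted vote. The sources, weights and fields are assumptions made for this example; the CSDB’s actual algorithms are richer.

```python
# Sketch of "golden copy" compounding: group feeds by instrument and pick
# the most plausible value per attribute by source-weighted majority vote.
# Sources, weights and records are assumed for illustration only.
from collections import defaultdict

SOURCE_WEIGHT = {"NCB": 3.0, "vendor_A": 2.0, "vendor_B": 1.0}  # assumed

records = [  # several feeds reporting on the same instrument
    {"isin": "XS0000000001", "source": "vendor_A", "issuer_country": "NL", "coupon": 4.25},
    {"isin": "XS0000000001", "source": "vendor_B", "issuer_country": "DE", "coupon": 4.25},
    {"isin": "XS0000000001", "source": "NCB",      "issuer_country": "NL", "coupon": 4.25},
]

def golden_copy(records, attributes):
    votes = {attr: defaultdict(float) for attr in attributes}
    for rec in records:
        weight = SOURCE_WEIGHT.get(rec["source"], 1.0)
        for attr in attributes:
            if rec.get(attr) is not None:
                votes[attr][rec[attr]] += weight
    # Keep, per attribute, the candidate value carrying the most weight.
    return {attr: max(vals, key=vals.get) for attr, vals in votes.items()}

print(golden_copy(records, ["issuer_country", "coupon"]))
# {'issuer_country': 'NL', 'coupon': 4.25}
```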
Data quality – two pillars
Data Source Management
• Tackle the issue at source
• UPSTREAM
• Preferably in bulk
• Centralised process

Data Quality Management
• Tackle the issue at the NCB
• DOWNSTREAM
• Individual securities
• Decentralised process
• The data quality of the CSDB depends heavily on both of these pillars.
• Data quality managers face two critical problems:
– What data is correct? Where to look for the “truth”? (verify against the prospectus, internal databases, Google…)
– How can dubious data be identified?
Metrics to steer and prioritise DQM
• Problem: no access to benchmark data
• Macro metrics:
– distributions of different reference data attributes (e.g. price, income)
• Micro metrics:
– inter-temporal comparison (stability index, concentration change; one reading of the latter is sketched below)
– consistency checks
– can drill down to the level of an individual security
Strategy: identification, quantification, prioritisation
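One concrete reading of “concentration change” is to compare a Herfindahl index of an aggregate between two reference dates and flag large shifts for drill-down. The metric and threshold below are an illustrative interpretation, not necessarily the CSDB’s own definition.

```python
# Illustrative micro metric: flag a country/sector aggregate when the
# concentration of amounts outstanding shifts sharply between two dates.
# Using a Herfindahl index to read "concentration change" is an assumption.

def herfindahl(amounts):
    total = sum(amounts)
    return sum((a / total) ** 2 for a in amounts)

amounts_t0 = [100.0, 100.0, 100.0, 100.0]  # issuance evenly spread
amounts_t1 = [350.0, 30.0, 10.0, 10.0]     # one issuer now dominates

change = herfindahl(amounts_t1) - herfindahl(amounts_t0)
if abs(change) > 0.1:                       # threshold is illustrative
    print(f"concentration shifted by {change:+.2f} -> drill down")
```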
Example: metric for change in country / sector
• The system calculates indices of stability in country / sector for the relevant group of securities between t0 and t1
• An index below 1 for a country / sector pair shows change in the group of securities (country, sector or both)
• Two indices are calculated: one from the t0 perspective (sees leavers) and one from the t1 perspective (sees joiners) in a country / sector group
[Diagram: the NL/S.11 group contains Securities 1, 2 and 3 at t0 and Securities 2, 3 and 4 at t1. Securities 2 and 3 stay; Security 4 joins the sector/country group; Security 1 leaves it.]
The t0 perspective (Laspeyres concept) covers leavers but no joiners.
The t1 perspective (Paasche concept) covers joiners but no leavers.
Joining both concepts: Fisher index = √(Laspeyres index × Paasche index)
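A minimal sketch of these indices, assuming each index is simply the share of securities that remain in the group: the t0 (Laspeyres-type) index divides stayers by the t0 membership, the t1 (Paasche-type) index divides stayers by the t1 membership, and the Fisher index is their geometric mean.

```python
# Stability indices for a country/sector group between t0 and t1,
# assuming the index is the share of securities that stay in the group.
from math import sqrt

t0 = {"sec1", "sec2", "sec3"}       # NL/S.11 members at t0
t1 = {"sec2", "sec3", "sec4"}       # NL/S.11 members at t1

stayers = t0 & t1                   # securities present at both dates

laspeyres = len(stayers) / len(t0)  # t0 perspective: sees the leaver (sec1)
paasche   = len(stayers) / len(t1)  # t1 perspective: sees the joiner (sec4)
fisher    = sqrt(laspeyres * paasche)

print(f"Laspeyres={laspeyres:.2f}  Paasche={paasche:.2f}  Fisher={fisher:.2f}")
# Any value below 1 signals movement into or out of the group.
```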
[Example screenshot: 19 instruments, of which 14 kept their issuer identifier (CH_IDENT = 1).]
CSDB Data Quality Management made us face a choice:
either SISYPHUS
or CHANGE, change beyond Statistics
Finding data quality further upstream
[Diagram: the data supply chain upstream of the CSDB.
• Prospectus: “perfect”, a public data source, but no common language
• Commercial data sets: error prone, selective, costly production, duplicated efforts, proprietary formats
• Candidates for compounding: a costly collection
• Compounding: a costly process, yielding the compound first shot
• Data Quality Management & defaulting: costly
• Golden copy after DQM & defaulting: still not back to “perfect”, and not standard]
Duplication and non-standardisation in the very data generation process hamper the whole downstream value chain.
Where to start?
The first layer of data out of reality: its generation process matters.
Data capture drives IT output quality
Large scale IT processing can be simple and cheap when data fulfils the programmers’ quality assumptions.
Messy data capture delivers “garbage in, garbage out”.
• Once good data is in the system, processing can work well.
• Data capture from the “real” world is the key step.
• Once lost at capture, information in data is lost.
• No “data cleaning” will help: data must be captured again.
• Messy data capture at source is very expensive downstream:
– Most applications perform badly
– “Data cleaning” and fixing failed processes are costly for all
– Processes and IT must be designed in complicated ways
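The cheapest defence is to validate at the point of capture, before bad data ever enters the system. As one concrete example, an ISIN’s final check digit can be verified with the standard Luhn algorithm; a minimal sketch:

```python
def isin_is_valid(isin: str) -> bool:
    """Validate an ISIN's Luhn check digit at the point of capture."""
    if len(isin) != 12 or not isin.isalnum() or not isin[:2].isalpha():
        return False
    # Letters expand to two digits (A=10 ... Z=35); digits stay as they are.
    expanded = "".join(str(int(c, 36)) for c in isin.upper())
    # Luhn: from the right, double every second digit, then sum digit-wise.
    total = 0
    for i, digit in enumerate(reversed(expanded)):
        n = int(digit)
        if i % 2 == 1:
            n *= 2
        total += n // 10 + n % 10
    return total % 10 == 0

print(isin_is_valid("US0378331005"))  # True: a well-formed ISIN
print(isin_is_valid("US0378331006"))  # False: corrupted check digit
```

Rejecting the record at entry forces a correct re-capture; no later “data cleaning” can recover a digit that was never recorded.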
Progress is on its way
“…a standard for reference data on securities and issuers, with the aim of making such data available to policy-makers, regulators and the financial industry through an international public infrastructure.” (J.C. Trichet, 23.2.09)
The industry expresses demand for a Utility
• Industry panel at a conference on 15 February 2010 in London:
– “An international Utility for reference data has its place, but
– keep it simple (concept of a ‘Thin Utility’),
– ask industry to design the standards (ISO does exactly that), and
– give us the legal stick.”
A viable reference data infrastructure benefits from constructive dialogue.
From browsing to farming for data: the long way to standardisation
Climbing the stairway to action
Business leaders, policy makers, regulators & legislators now embrace the dialogue with the data community:
• Understand the role of data as a necessary infrastructure
• Understand how basic data is generated
• Understand basic data as a shared strategic resource
• Understand the dynamics of standardisation
• Imagine a feasible way; accept that way as useful
• Accept the issue among priorities
• Design a legal framework
• Build into the data ecosystem
• Build the business case with all stakeholders
• Imagine solutions addressing legacy
“Thin” Utility
“Thin Utility”: a unique, shared reference frame
• Two registers: one for instruments, one for entities
• Simple and light, complete and unequivocal
• Hard focus on identification and minimal description
• The shared infrastructure of basic reference data for:
– data users in the financial industry
– data vendors
– authorities
– the public
• An internationally shared infrastructure of reference data

A “Thin Utility” provides the certainty of a single source on known, bare basics.
Two reference registers: the Thin Utility’s frame
[Diagram: the Reference Data Utility rests on a register of instruments and a register of entities, each holding a unique identifier, key attributes, interrelations and classifications.]
Two reference registers: the Thin Utility’s frame (continued)
[Same diagram as above, with one addition to the registers’ contents: an electronic contact address.]
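To make the frame concrete, here is a sketch of what minimal records in the two registers might look like. The field names simply mirror the diagram’s bullets (identifier, key attributes, interrelations, classifications, contact address); they are illustrative, not a proposed standard.

```python
# Sketch of the Thin Utility's two registers. Fields mirror the diagram's
# bullets and are illustrative only, not a proposed standard.
from dataclasses import dataclass
from typing import Optional

@dataclass
class EntityRecord:
    entity_id: str                    # unique identifier
    name: str                         # key attribute
    country: str                      # classification
    sector: str                       # classification (e.g. an ESA code)
    parent_id: Optional[str] = None   # interrelation: group structure
    contact: Optional[str] = None     # electronic contact address

@dataclass
class InstrumentRecord:
    isin: str                         # unique identifier
    instrument_type: str              # classification
    issuer_id: str = ""               # interrelation: link to entity register

# The registers themselves are keyed collections of such records.
entities = {"E1": EntityRecord("E1", "Example Issuer NV", "NL", "S.11")}
instruments = {"XS0000000001": InstrumentRecord("XS0000000001", "bond", "E1")}

# Unequivocal identification: each instrument resolves to exactly one issuer.
print(entities[instruments["XS0000000001"].issuer_id].name)
```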
The Utility grows from a quickly feasible base
[Diagram: the Reference Data Utility with its register of instruments and register of entities.]
For both registers:
• begin with a feasible scope,
• grow over time by adding instruments & entities and adding attribute classes,
• driven by demand from industry and authorities, and by feasibility
The international aspect
Global Utility vs. National Law: an option

[Diagram, international level (International Community):
• International Institutions, e.g. the G20: discuss the new regulatory framework for financial markets; define principles / goals for data
• Governance of the Operational Entity: global „tour de table“ (IMF, BIS, industry, etc.); establishment of the Int’l Operational Entity; seed funding of the Int’l Operational Entity?
• International Operational Entity (the Utility) runs the service: collects data; distributes / sells data; certifies analysts; monitors compliance; informs national authorities; releases new standard items. It holds service agreements with national authorities.
• Standards College, ideally ISO-based, develops / maintains standards: designs initial standards; monitors market developments; steers the evolution of standards; designs new standard items

National level (National Constituency):
• National Legislator issues law: mandates the national authority; empowers it to enforce the process and to farm out operations to an international entity (the EU issues specific EU law)
• National Authority executes the legal mandate: farms out operations to the Int’l Operational Entity; monitors compliance; enforces and applies sanctions
• Entity complies with national law: delivers and maintains data in the Utility as required, possibly using services]
Positioning and Design
Positioning in the data supply chain
[Diagram: data flows from the Issuer through the Utility and commercial data providers (CDPs), in a competitive market, to the Data User, and on to the Public, policy makers and regulators. Option: the Utility as Tier 1.]
Initially, the downstream supply chain remains untouched, except for quality: data users don’t need to invest.
Utility value chain: monopoly vs. competition
Organisational model along the value chain:
• Standards: design and setting – multilateral (ISO?)
• Analyst training – monopoly vs. competition
• Analyst certification – monopoly vs. competition
• Data production – monopoly vs. competition
• Data distribution: primary and secondary – monopoly vs. competition
Each stage of the value chain should be given the most efficient organisation.