

© 2010 Incisive Media Investments. All rights reserved. Used by permission. First published in IRD March 2011

Data Management

Data Dictionaries: The Complete A-Z

In IT departments across the industry, data officers are asking the same question: how do I manage my data? One solution that should be part of the foundation is a data dictionary, a comprehensive description of all the data within an organization that provides a common key to all the repositories across an enterprise. But will it lead to a universal language? Faye Kilburn reports.

A–C Acquisition to Complexity

The data assets of the average financial firm are in a constant state of flux. Be it through organic growth, mergers and acquisitions or restructuring, existing departments are perennially reshuffled and new departments tacked onto the side like conservatories. Take the spate of global mergers and banking acquisitions since the recession: as you read this, the respective IT departments of these companies are likely to still be working to bridge the gulf between their reference data repositories.

In an ideal world, firms would be able to mastermind and execute seamless integrations, but in reality they have had to take a more pragmatic approach. A lot of middle- and back-office systems are so old that it is not worth re-plumbing them to take data from a different source.

Therefore, to keep a lid on cost but speed up transition, new systems are added piecemeal to organizations, a practice out of which today’s glutinous, sprawling data repositories have grown.

Making matters worse is a new wave of more complex instruments and products. “Our products used to be managed in silos, but more and more our customers want cross-products,” says Paris-based Pierre-Jean Crouy, head of reference data at European bank Société Générale. “Therefore, in terms of reference data, we are facing an increasing need to bring various legs into a combined product.”

D–G Data Management to Governance

To deal with this patchwork quilt of problems, a pre-crisis movement emerged towards creating a single golden copy repository. Unfortunately, as firms flocked to acquire what was touted as financial nirvana, they found the associated costs to be too high to warrant investment.

Today, the industry’s attitude towards data repositories has taken a U-turn. “The industry can cope with a degree of ambiguity if it means we don’t spend so much,” says London-based Martin Cole, managing director of SIX Telekurs UK. As such, a new objective to have aligned data across all repositories has taken the place of the industry’s zealous obsession with a single hub.

Tying in nicely with this shift in the boardroom is a post-crash regulatory focus on proper data governance. “The data governance flag has been waving for a few years, but it was largely unnoticed in the wider organization. Before that, the data guy sat in the corner dealing with data. They weren’t operations, they weren’t IT and no-one really cared about them because they didn’t make any money,” says London-based Hugh Stewart, sales director for SmartStream’s DClear Services business. Now, when a portfolio is priced or capital calculated, the regulator wants a breakdown of what data was used, where it came from and how it was cleaned, so it is vital firms know their data.

H–N Homogeneity to Nomenclature

This is where a data dictionary comes in. Simply put, a data dictionary is a set of data about the data, or rather, a set of definitional standards for what the data will be. If a data asset is defined as “a bid price” and that definition is standardized, anything else that is “a bid price” in any other data repository across the firm will always have the same definition, regardless of its actual label.

[Photo caption: Hugh Stewart, SmartStream]


www.waterstechnology.com/ird March 2011

This means that if bid prices are labelled “BIDPRICE” in a Bloomberg feed, “BPRICE” in a Thomson Reuters feed and “bid” in the firm’s own repository, the data dictionary would enable IT departments to identify them as the same attribute based on their meaning.
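A minimal sketch of that idea in Python may help make it concrete. It assumes a dictionary keyed by one canonical attribute name per meaning, with the label each source uses recorded against it. The three labels are the ones quoted above (they are the article's examples, not verified vendor field names); the canonical name `bid_price`, the source keys and the wording of the definition are illustrative assumptions.

```python
# Hypothetical data dictionary: one canonical attribute per meaning, plus the
# label each source uses for it. Source keys and definition text are assumed.
DATA_DICTIONARY = {
    "bid_price": {
        "definition": "Highest price a buyer is currently willing to pay",
        "labels": {
            "bloomberg": "BIDPRICE",
            "thomson_reuters": "BPRICE",
            "internal": "bid",
        },
    },
}


def build_label_index(dictionary):
    """Invert the dictionary into (source, label) -> canonical attribute."""
    index = {}
    for attribute, entry in dictionary.items():
        for source, label in entry["labels"].items():
            index[(source, label)] = attribute
    return index


def normalize(record, source, index):
    """Rename a source record's fields to canonical attribute names.

    Fields the dictionary does not know keep their original label.
    """
    return {
        index.get((source, label), label): value
        for label, value in record.items()
    }


LABEL_INDEX = build_label_index(DATA_DICTIONARY)
```

With this in place, `normalize({"BIDPRICE": 101.5}, "bloomberg", LABEL_INDEX)` and `normalize({"BPRICE": 101.5}, "thomson_reuters", LABEL_INDEX)` both yield a record keyed by `bid_price`, so downstream systems match on meaning rather than on the label.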

“What a data dictionary actually does is allow non-techies to deal with data,” says Stewart, whose firm has developed a data dictionary as part of its DClear offering. “If you look at the operating instructions and the user manuals of data feeds or large software products, it is perfectly complicated. You go from page 9 to page 52 to page 256, and then forget why you were looking in the first place.”

As such, vendors such as SmartStream turn manuals into simple descriptions of each data label, marking a significant shift away from a focus on the ‘word,’ which is often hard to standardize, to a focus on the ‘meaning,’ which is not.

O–R Obstacles to Returns

Although data dictionaries are regarded as best practice, there are nuances to their creation that can be overlooked by firms. For example, being sure one data vendor’s “BIDPRICE” correlates with another’s “BPRICE” requires a dedicated team of IT boffins who are skilled in identifying and matching data.

Furthermore, even after a dictionary has been constructed, the process is ongoing. The next merger or acquisition comes along and a whole raft of systems need to be brought into the fold.

However, for all the drawbacks, the data dictionary has the potential to deliver significant returns on investment. “One of the great things about implementing a data dictionary,” says London-based Mike Davis, senior analyst at Ovum (part of the Datamonitor Group), “is there should be a level of return in the fact that a) you should get a better view of your customer and b) you could potentially reduce the number of systems to maintain and support, and ultimately the size of IT department.”

In fact, as data dictionaries become more widely used, these efficiency savings are set to grow. Vendors such as software provider SmartStream, for example, plan to transform the data dictionary into an open source community of users so that if someone extends the dictionary, the extension can be shared, which reduces resource expenditure for all involved.

Elsewhere, firms are looking to use the dictionary to enhance the business process itself. “For us, a major incentive in 2011 is to improve our data dictionary by including the business processes the data is contributing to,” says Société Générale’s Crouy.

“Currently, we have a simple yes or no code that says whether we have authorization to trade with a counterparty or not, but now we are adding more complex conditions: for example, the authorization for specific products, and if the counterparty is based in Spain.”
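The move Crouy describes, from a single yes/no code to conditional authorization, can be sketched roughly as follows. This is an illustration only, not Société Générale's model: the class, its field names and the product codes are hypothetical, and because the quote does not say which way the Spain condition cuts, the sketch assumes a generic list of permitted counterparty countries (ISO 3166-1 alpha-2 codes, "ES" for Spain).

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class TradingAuthorization:
    """Hypothetical dictionary entry for a counterparty's trading authorization."""

    counterparty: str
    authorized: bool                             # the original yes/no code
    products: Optional[FrozenSet[str]] = None    # None means no product restriction
    countries: Optional[FrozenSet[str]] = None   # None means no country restriction

    def permits(self, product: str, country: str) -> bool:
        """Return True only if the flag and every stated condition allow the trade."""
        if not self.authorized:
            return False
        if self.products is not None and product not in self.products:
            return False
        if self.countries is not None and country not in self.countries:
            return False
        return True
```

An entry with no conditions behaves exactly like the old yes/no code, so existing records would stay valid while richer rules are added one counterparty at a time.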

S–U Semantics to Universality

While the outlook for data dictionaries is fairly positive, there is one last worry for the regulator when it comes to external reporting: with banks, asset management firms and vendors all adapting their own data dictionary definitions, how can they be sure how one firm’s definition equates to another’s?

Attempting to address this problem is the EDM Council, an international trade association that has devoted the past two and a half years to compiling a semantics repository.

“The semantics repository is slightly different to a data dictionary, in that a data dictionary will define words and terms; whereas our ontology has mandatory and optional data attributes and business relationships described for all the instruments,” says Washington DC-based Mike Atkin, managing director of the EDM Council.
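To illustrate the distinction Atkin draws, the sketch below models an instrument type with mandatory attributes, optional attributes and named relationships, then checks a record against it. None of this is taken from the EDM Council's repository: the `bond` entry, its attribute names and the `validate` helper are invented for illustration.

```python
# Hypothetical ontology fragment: each instrument type lists mandatory and
# optional attributes plus its relationships to other business concepts.
INSTRUMENT_TYPES = {
    "bond": {
        "mandatory": {"issuer", "maturity_date", "coupon_rate"},
        "optional": {"call_schedule"},
        # Relationships are recorded here but not checked by validate().
        "relationships": {"issued_by": "legal_entity"},
    },
}


def validate(instrument_type, attributes, ontology=INSTRUMENT_TYPES):
    """Return (missing mandatory attributes, unrecognised attributes), both sorted."""
    spec = ontology[instrument_type]
    supplied = set(attributes)
    missing = spec["mandatory"] - supplied
    unknown = supplied - spec["mandatory"] - spec["optional"]
    return sorted(missing), sorted(unknown)
```

A plain data dictionary would stop at defining each term; a structure like this also says which terms an instrument must carry and how it relates to other entities.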

Although the semantics repository is currently an open source activity, which means subscription to its definitions is not compulsory, there are aspirations to transform it into an industry standard.

“The Council is working with the Object Management Group (OMG) to establish how the repository can be maintained as an OMG standard. We operate in a global environment, so it makes sense to do it across the industry,” Atkin explains.

V–Z Vendors to Zeitgeists

At the behest of a standards body, the industry may be dragged towards adopting a universal language and, generally speaking, it probably won’t be a bad thing.

There is, however, one missing ingredient: with independent trade associations such as the EDM Council calling for industry-wide standardization, should the likes of Bloomberg, Thomson Reuters et al have to conform with their data dictionaries too?

While SIX Telekurs’ Cole thinks it is enough for vendors to make sure their data dictionary is easily available and understandable, the EDM Council’s Atkin says anything that makes it easier for clients to integrate data into the environments of their customers is good for vendors. “Indeed, a few major ones are mapping our semantics work to their structures,” he adds.

In fairness to both, the discussion may be a little premature. While most firms do have a data dictionary or definitional standards, an industry-wide semantics repository is still some way off. Even so, the fact that proper data governance is on the table at all is a reflection of the times.