  • www.analytixds.com

    WHITEPAPER

    Big Data Governance: How To Avoid The Pitfalls Of Big Data Governance

    Table of Contents

    The need to provide answers quickly

    You can't measure what you don't manage

    Aligning the overall architecture with the problem

    Big Data Governance: Managing the difference

    So, what really changes?

    What is really driving all this change?

    How to build a Big Data Governance solution in an agile way?

    AnalytiX Mapping Manager Big Data Edition

    AnalytiX Big Data Governance Framework

    Addressing Big Data needs with AnalytiX Mapping Manager Big Data Edition

    "So once you know what the question actually is, you'll know what the answer means." Deep Thought, The Hitchhiker's Guide to the Galaxy, Douglas Adams

    The need to provide answers quickly

    Typically, we answer questions using data and information stored in disparate systems, with completely different access methods, in completely different places inside and outside the corporate walls. Just how do you manage the data you need to measure?

    We need these answers to decide on overall corporate direction, make product marketing and pricing decisions (reviewing customer behavior so that we can time offers to get customers back before our competitors do), commit budget to different projects, and improve planning and performance.

    More questions come to mind quickly. Where does the data come from? Was the data vetted? What is the overall quality of this data? Who owns it? Is it documented?

    Add to the mix the fact that more answers now require data that, until three or four years ago, wasn't even planned as part of the corporate data warehousing landscape. This is what came to be termed Big Data (social network interaction data, voice, video, etc.), and it is so different from everything that came before it that you now have a major mess on your hands.

    You need to integrate this Big Data into your Business Intelligence (BI) and Corporate Performance Management (CPM) solutions, and the pressure to deliver it correctly, cheaply and quickly will still be there.

    What was the question again?

    Oh, yes. Just how do you actually manage the data you need to measure?

    You can't measure what you don't manage

    Let's face it: the future of data management is in multi-structured data sets (highly structured, semi-structured and unstructured data) and extremely flexible data acquisition and integration methods spread across multiple platforms.

    To succeed, you need to examine how Big Data differs from what came before and how it changes the way we manage information; how leading companies in many industries use this information to achieve better decisions and performance; and how this data will change information governance and management.

    While some will say this future means the end of highly structured data platforms (such as traditional databases and data warehouse environments), this is really more about the emergence of a new breed of data management platforms that have been built from the ground up to support multi-structured data and a flexible approach to applying schemas and accessing this data.

    AnalytiX Mapping Manager was built from the beginning to accelerate the development and deployment of successful Business Intelligence solutions, but it can also accelerate the delivery of data (all sorts of data: structured, semi-structured and unstructured) into this new breed of data management platform.

    The strength of AnalytiX Mapping Manager lies at the heart of its architecture: highly adaptive and lean, capable of quickly aligning the fluid needs of the business with the structure imposed by Information Technology, and of accelerating successful Business Intelligence and Analytical solutions.

    Aligning the overall architecture with the problem

    At the heart of its architecture, AnalytiX Mapping Manager is a complete requirements management solution that allows analysts to capture data, business and technical requirements. From these requirements, analysts and data scientists can quickly build an initial architecture model that, coupled with the metadata management capabilities native to AnalytiX Mapping Manager, brings an awesome arsenal to bear on the Big Data battlefield.

    This deluge of data we are calling Big Data is a familiar theme to any IT department today. More and more Big Data initiatives start every day, and most of them start without much of a roadmap or a safety net.

    When it comes to Big Data initiatives, however, the challenge isn't simply that they start without a roadmap or a safety net, or even the growing amount of data, but the variety of data that must be consumed, compiled, governed and analyzed.

    There are many ways to define Big Data, but the typical definition is framed in terms of the four Vs of Big Data:

    The Volume of information (usually in the billions of interactions);

    The Variety of information (from structured, semi-structured and non-structured data);

    The Velocity of information (typically the speed at which data becomes available and can be analyzed); and

    The Value of information (which can seldom be determined before the data is acquired, loaded, compiled and thoroughly analyzed).

    While these four Vs may drive organizations to deploy new technologies, techniques and methods to cope with these new requirements, these initiatives are usually incomplete and incompatible with their existing business intelligence and analytics infrastructures, seldom taking into consideration the requirements of information governance in Big Data. Remember: you can't measure what you don't manage!

    Big Data forces organizations to take a different look at how they integrate this data into the existing Business Intelligence and Analytics landscape. Big Data Governance forces these organizations to rethink how they manage all this data.

    What was the question again?

    Oh, yes. Just how do you actually manage the data you need to measure?

    Big Data Governance: Managing the difference

    Big Data comes into play from the need to take Business Intelligence and Analytics well beyond the great systems of record (yes, we need these too!), which are focused on efficiency, to a whole new breed of solutions centered on effectiveness: what is now being termed the systems of engagement (the systems that hold the data from all other corporate systems).

    Future solutions are clearly evolving from data that comes from highly efficient and well-behaved systems to data that comes from the extended enterprise, where data is not necessarily so well structured or behaved. This forces the organization into a more collaborative mode of operation, with its core infrastructure being adapted from the consumer space.

    Is Big Data Governance really that different from traditional data governance initiatives?

    The needs, reasons and benefits of governance remain the same whether you are dealing with Big Data or with small data. To understand what really changes with Big Data Governance, we must go back to the four Vs of Big Data (Volume, Variety, Velocity and Value):

    Volume. With extremely large data volumes, highly sophisticated tools to assess and profile data become a necessity, as traditional approaches of profiling and managing data primarily through observation become unfeasible;

    Variety. The different types of data associated with Big Data (video, voice, sensor readings, images, social network feeds, etc.) require us to store all of this in something other than a traditional relational database and to govern this data according to the types of data we need to measure;

    Velocity. The different latency requirements for intake and consumption, analysis, integration and publication or promotion of the information make it necessary for the raw data to be available for analysis in real time, with data quality and database load schedules pretty much having to take a back seat to the pressure of getting to market quickly.

    Value. Typically, with Big Data, the Value of information can only be determined once the data has been acquired, loaded, compiled and thoroughly analyzed. This goes directly against what we, for so long, held as one of the tenets of (small) data governance: that we would make a significant investment of time and effort designing data models and building out the data integration architecture before the first row of data is acquired.
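    As a rough illustration of what tool-driven profiling means under the Volume point, the following Python sketch computes per-field fill rates, distinct counts and observed types in a single streaming pass, so that nobody has to inspect the rows by eye. The `profile_records` function and the sample records are hypothetical and are not part of any AnalytiX product.

```python
from collections import Counter, defaultdict


def profile_records(records):
    """Profile an iterable of dict records in one streaming pass.

    For every field that has at least one non-null value, report the share
    of records that carry a value, the number of distinct values, and the
    Python types that were observed.
    """
    total = 0
    filled = Counter()
    distinct = defaultdict(set)
    types = defaultdict(Counter)

    for record in records:
        total += 1
        for field, value in record.items():
            if value is None:
                continue  # nulls count against the fill rate only
            filled[field] += 1
            distinct[field].add(value)
            types[field][type(value).__name__] += 1

    return {
        field: {
            "fill_rate": filled[field] / total,
            "distinct": len(distinct[field]),
            "types": dict(types[field]),
        }
        for field in filled
    }


# Hypothetical sample: three interaction records with inconsistent content.
sample = [
    {"customer_id": 1, "channel": "web", "duration": 12.5},
    {"customer_id": 2, "channel": None, "duration": "n/a"},
    {"customer_id": 2, "channel": "voice"},
]

report = profile_records(sample)
for field, stats in report.items():
    print(field, stats)
```

    The sketch keeps exact sets of distinct values, which is fine for a sample; at real Big Data volumes one would likely swap in an approximate counter and run the pass in parallel.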

    So, what really changes?

    Just as the navigational, pointer-driven data structures lost out to the current standard of highly structured, well-behaved data in the relational data model, the current requirements for flexibility of structure and access are moving Business Intelligence and Analytics solutions to a world where data will have to be consumed in a very dynamic way. And that forces changes to the approach with which we govern Big Data.

    The approach demanded by Big Data Governance requires you to be as flexible and adaptive as the data you are consuming, but that doesn't mean you have a "get out of jail free" card. Quite the contrary: you have to be just as diligent as you have always been; you just have to be quicker and more flexible than you have ever been.

    Data will continue to grow at staggering rates, and you will need new methods and tools capable of dealing with this unbridled growth. There is growing awareness of the need for data governance, but there is an even bigger challenge in defining who owns the data and identifying all the various silos in which the data is contained. It will now become increasingly important to manage and govern all the siloed data stores both inside and outside of the enterprise and, more importantly, to govern them so that it is understandable how new answers to business questions are being answered in this rapidly changing world of Big Data.

    Fig 1 AnalytiX Mapping Manager end to end Big Data Governance

    Consider the figure above, which depicts a typical corporate architecture in a Big Data Governance environment. Each of the data sources (internal and external) needs to be managed, and the data federation processes that move data between environments and to outside partners must be managed and governed.

    What is really driving all this change?

    The simple answer would depend on what you do. But while the needs tend to differ, the drivers behind them tend to converge on the need to analyze data in support of the extended enterprise.

    Just as Government has different needs than those found in private industry, organizations in the same industry segment might have needs that are completely divergent from one another. But they all point to the need to evolve and support the extended enterprise, and this need is forcing organizations everywhere to increase their competitiveness or effectiveness or suffer commoditization of the goods and services they provide.

    Businesses and government agencies need to be much more cooperative today than ever before in order to make the chains of extended enterprise partnerships perform as expected. Supply and delivery chains, social media sites, customer service and support centers, government agencies and the like all need to tap and manage the unstructured data from these new sources. While this data is a lot more difficult to organize and tag than structured data from transactional databases, it is not an impossible task.

    Recent research indicates that this unstructured data is growing at alarming rates (greater than 65% per year in most cases), and the places this growth is coming from are not necessarily new. While most Data Governance methods, architectures and tools focus on the structured data kept in the databases of business applications, the reality is that up to 90% of the relevant corporate information is contained in semi-structured and unstructured data stores (PDFs, contracts, weblogs, product manuals, instructional videos, images, spreadsheets and the backbone of the personal archive system: the mail system). And not all of it is inside the corporate walls anymore.

    As users demand even more access to data, mobile devices (notebooks, PDAs, smartphones, tablets, etc.) now store up to 60% of the typical organization's data, creating a compliance and governance nightmare. Issues range from having a large number of copies of a single document stored in primary, secondary, archive, backup and mobile environments (making it almost impossible to tell which version was the last, as time stamps are not adequate anymore) to allowing serious privacy and security breaches to occur.

    How to build a Big Data Governance solution in an agile way?

    The rise in the use of non-traditional data sources, and the changes to the way Big Data is being used in solutions to increase not efficiency but effectiveness, reaffirm the need for Big Data to be managed and governed using a formal data management discipline within a formal enterprise architecture program, using tools that are specialized, flexible and well adapted to working in an environment as fluid as Big Data.

    In our initial white paper we focused on the use of AnalytiX Mapping Manager as an enabler for the data governance program. Today we will focus on using AnalytiX Mapping Manager as an enabler for an end to end Enterprise Governance solution that addresses all facets of the enterprise, including the need to govern Big Data along with traditional structured data.

    AnalytiX Mapping Manager is much more than what the name implies (a data mapping tool or a mapping management tool). It is an enterprise-level framework and source to target mapping tool that allows the management of all metadata related to the sources, targets and business rules associated with the data and information reporting needs of the enterprise. It is also a centralized metadata repository capable of improving the speed with which an organization deploys a Business Intelligence solution, as well as a very capable data governance tool: one that allows any data governance program to define ownership, standards, definitions, processes, policies and security requirements not only around well-behaved, highly structured data, but also around the semi-structured and unstructured data commonly found in what we today term Big Data.

    AnalytiX Mapping Manager Big Data Edition

    What is AnalytiX Mapping Manager Big Data Edition and how can it help deploy a complex governance solution that deals with both Traditional and Big Data Governance problems?

    AnalytiX Mapping Manager Big Data Edition integrates well into the Big Data governance strategy of any organization by extending its already powerful centralized repository to support the current leading Big Data technologies and by providing a very flexible and highly scalable structure for Big Data constructs.

    One of the key benefits of AnalytiX Mapping Manager Big Data Edition is the ability to support the acquisition of unstructured data file definitions without having to create a schema or acquire the entire data set first. You simply point to the right data file, acquire the technical metadata, take in the data you are interested in, then evaluate and make changes to the data requirements. Once you are satisfied with your data set, you can simply acquire your data into Hadoop and allow the consuming programs to determine and apply structure when necessary. AnalytiX Mapping Manager Big Data Edition will track and document these new structures in a centralized metadata repository, which is available to all consuming applications through a simple Application Program Interface.
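    The schema-on-read flow described above can be sketched in a few lines of Python. This is an illustration only: an in-memory list stands in for raw files landed in Hadoop, and `registry` and `read_with_schema` are hypothetical stand-ins for the central metadata repository and a consuming program, not the product's actual API.

```python
import json

# Raw lines are landed exactly as received; no schema is enforced on write.
raw_lines = [
    '{"user": "ann", "text": "great service", "likes": "3"}',
    '{"user": "bob", "text": "late delivery", "lang": "en"}',
]

# Hypothetical stand-in for a central metadata repository: each consumer
# records the structure it applies so that others can discover and reuse it.
registry = {}


def read_with_schema(lines, name, schema):
    """Apply a schema at read time and register it under `name`.

    `schema` maps field name -> (converter, default). Fields not named in
    the schema are ignored; missing fields fall back to the default.
    """
    registry[name] = {field: conv.__name__ for field, (conv, _) in schema.items()}
    rows = []
    for line in lines:
        record = json.loads(line)
        row = {}
        for field, (conv, default) in schema.items():
            row[field] = conv(record[field]) if field in record else default
        rows.append(row)
    return rows


# One consumer cares only about engagement counts.
engagement = read_with_schema(
    raw_lines, "engagement_v1", {"user": (str, ""), "likes": (int, 0)}
)
print(engagement)
print(registry)
```

    The raw lines are never rewritten: a second consumer could register a different schema over the same data, which is the flexibility the paragraph above describes.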

    Using AnalytiX Mapping Manager Big Data Edition you will be able to manage all operational system metadata, including the unstructured data stores and silos inside and outside the organization, as well as manage the business and transformation rules in order to understand how this data is mapped and analyzed to support the corporate decision-making process. AnalytiX Mapping Manager Big Data Edition also features advanced impact analysis and a visual data lineage analyzer to analyze how data is federated inside and outside of the organization.
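    Impact analysis over data lineage is, at its core, a graph traversal: start from the element that changes and follow every mapping downstream. The Python sketch below shows that idea with a breadth-first search. The `lineage` edges and dataset names are invented for illustration and do not reflect how the product stores lineage.

```python
from collections import deque

# Hypothetical lineage: each key feeds the datasets listed as its value.
lineage = {
    "crm.customers": ["staging.customers"],
    "social.feed": ["staging.mentions"],
    "staging.customers": ["dw.dim_customer"],
    "staging.mentions": ["dw.fact_sentiment"],
    "dw.dim_customer": ["report.churn", "report.sentiment_by_segment"],
    "dw.fact_sentiment": ["report.sentiment_by_segment"],
}


def downstream_impact(graph, changed):
    """Return every dataset reachable from `changed`, in breadth-first order."""
    seen = set()
    order = []
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for target in graph.get(node, []):
            if target not in seen:
                seen.add(target)
                order.append(target)
                queue.append(target)
    return order


impacted = downstream_impact(lineage, "crm.customers")
print(impacted)
```

    Reversing the edges and running the same traversal answers the opposite question: which sources feed a given report.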

    Fig 2 AnalytiX Mapping Manager provides an end to end Enterprise Governance solution

    To address the need for customization, the Mapping Manager Framework features an open metadata repository for customized integration and analysis of metadata and data mappings. The framework supports a set of out-of-the-box plug-in modules as well as the ability for customers to create custom in-house applications that integrate with the open repository to further analyze or integrate data, or to create customized views of the repository data.

    AnalytiX Big Data Governance Framework

    A strong Data Governance program framework is crucial to the success of any big data initiative and the management of that data. If the program is implemented early on and uses the right organizational, technical, metadata and business processes, enterprises will be able to realize the promises of big data processing and reduce the risk associated with such a program from a cost and time perspective.

    This is just the beginning. To succeed, businesses need to follow a known path and use the proper tools to bring order and governance to Big Data. This includes addressing security, privacy, consistency, reliability and repeatability from a single tool.

    Fig 3 AnalytiX DIAL-M Framework for Big Data Governance

    The highly simplified AnalytiX DIAL-M for Big Data Governance framework model above demonstrates how a team of architects, analysts, developers and testers can bring a Business Intelligence solution to market. To bring agility to the Business Intelligence solution, AnalytiX Mapping Manager allows the alignment of business and IT needs and the quick deployment of solutions that incorporate both traditional (small) data and Big Data while addressing key enterprise needs, such as:

    Documentation of data, business and technical requirements and automatic generation of a logical data model from these requirements;

    Documentation and regulatory compliance (such as Basel, PHI, SPI, Healthcare, SOX, Solvency II, etc.) of data mappings, aggregations and transformation rules that have to be highly adaptive as requirements change;

    Definition, documentation and creation of a central repository of business terms, rules, data elements, KPIs and metrics;

    Establishment of data and information requirements that span multiple data sources (including non-traditional data sources), mixed latency and extreme data volumes;

    Creation of a detailed physical data model for Business Intelligence and Business Analytics solutions.

    AnalytiX Mapping Manager Big Data Edition can, by virtue of its architecture, provide significant improvement over traditional methods that use spreadsheets, documents or decoupled metadata management solutions by coordinating all aspects of Big Data Governance.

    AnalytiX Mapping Manager Big Data Edition improves efficiency in processes, people and solution deployment not only in the analysis, design and development phases of integration, but also in downstream workflow management and regulatory compliance and reporting, as we have seen in previous articles.

    Typically, all metadata management tools provide some sort of repository and can import and export data definitions from sources and targets. What they typically lack is a set of well-defined processes to accelerate development and deployment and the ability to manage change.

    AnalytiX Mapping Manager Big Data Edition bridges the gap between leveraging metadata and making the data definition and acquisition processes automated, disciplined and predictable for all enterprise data stakeholders, via its web-enabled portal and customized business-oriented views. For those customers demanding more flexibility and customization, the framework allows customers to create custom modules that interact directly with the open repository and plug directly into the user interface. An ecosphere of free and third-party plug-ins is available to extend the Mapping Manager's out-of-the-box governance framework.

    Fig 4 AnalytiX Custom Application Plug-In Framework

    Addressing Big Data needs with AnalytiX Mapping Manager Big Data Edition

    If you recall, we started this White Paper by quoting Deep Thought: "So once you know what the question actually is, you'll know what the answer means."

    What was the question again? Oh, yes. Just how do you actually manage the data you need to measure?

    First you have to acquire your data, be it big data or small data, and bring it into the fold. Then you have to create a governance program that addresses all needs and provides the benefits that align business and IT, promote the right architecture (both flexible and scalable) and accelerate delivery by reducing time to market and increasing the quality of the overall solution.

    AnalytiX Mapping Manager accelerates deployment of a Business Intelligence solution by providing the following:

    the ability to quickly build conceptual, logical and physical data models from captured requirements;

    the ability to create, maintain and consolidate data dictionaries for all enterprise systems;

    the ability to quickly build source to target maps (STM) and use these STM documents to automatically generate data integration processes;

    the ability to quickly analyze data lineage and evaluate impact, and thus manage and communicate change efficiently and effectively, especially in highly regulated industries such as healthcare and finance, where regulatory compliance is required surrounding data and how it is governed.
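    To make the source to target map (STM) idea concrete, here is a minimal Python sketch in which the mapping is plain data and the integration step is derived from it rather than hand-coded. The field names, the rule names and `apply_stm` are hypothetical; a real tool would generate jobs for an integration engine rather than run the transformation in-process.

```python
# Hypothetical transformation rules, referenced by name from the mapping.
RULES = {
    "copy": lambda v: v,
    "upper": lambda v: v.upper(),
    "cents_to_dollars": lambda v: v / 100,
}

# The STM itself: one entry per target column, kept as reviewable data.
stm = [
    {"source": "cust_nm", "target": "customer_name", "rule": "upper"},
    {"source": "amt_cents", "target": "amount_usd", "rule": "cents_to_dollars"},
    {"source": "ord_id", "target": "order_id", "rule": "copy"},
]


def apply_stm(mapping, source_row):
    """Build a target row by applying each mapping entry to the source row."""
    return {
        entry["target"]: RULES[entry["rule"]](source_row[entry["source"]])
        for entry in mapping
    }


def lineage_of(mapping, target_column):
    """Answer 'where does this target column come from?' from the same STM."""
    return [e["source"] for e in mapping if e["target"] == target_column]


row = apply_stm(stm, {"cust_nm": "acme", "amt_cents": 1250, "ord_id": 7})
print(row)
print(lineage_of(stm, "amount_usd"))
```

    Because the same mapping drives both the transformation and the lineage lookup, a change to one STM entry is visible to both, which is the point of keeping the mappings in a governed repository instead of scattered spreadsheets.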

    Fig 5 AnalytiX Mapping Manager Data Governance Framework

  • Contact Us

    For further information, please contact:

    AnalytiX Data Services Corporate HQ
    14175 Sullyfield Circle, Suite # 400
    Chantilly, VA 20151 USA
    Tel: +1 (800)-656-9860
    Website: www.analytixds.com
    Email: [email protected]

    SALES
    For Sales - Please call (800)-603-4790 (9am-5pm EST) or Email: [email protected]

    SUPPORT
    For Product Support - Please call (800)-617-9620 or Email: [email protected]

    Copyright 2014 | All Rights Reserved.

