avoid big data governance pitfalls

Post on 12-Nov-2015

DESCRIPTION

AnalytiX™ Mapping Manager™ Big Data Edition™ integrates well into the Big Data governance strategy of any organization by extending its already powerful centralized repository to support current leading Big Data technologies and by providing a very flexible and highly scalable structure to support Big Data constructs.

TRANSCRIPT

  • www.analytixds.com

    WHITEPAPER

    Big Data Governance: How to Avoid the Pitfalls of Big Data Governance

    Table of Contents

    The need to provide answers quickly

    You can't measure what you don't manage

    Aligning the overall architecture with the problem

    Big Data Governance: Managing the difference

    So, what really changes?

    What is really driving all this change?

    How to build a Big Data Governance solution in an agile way?

    AnalytiX Mapping Manager Big Data Edition

    AnalytiX Big Data Governance Framework

    Addressing Big Data needs with AnalytiX Mapping Manager Big Data Edition

    "So once you know what the question actually is, you'll know what the answer means." Deep Thought, The Hitchhiker's Guide to the Galaxy, Douglas Adams

    The need to provide answers quickly

    Typically, we answer questions using data and information stored in disparate systems, with completely different access methods, in completely different places inside and outside the corporate walls. Just how do you manage the data you need to measure?

    We need these answers to decide on overall corporate direction, make product marketing and pricing decisions (reviewing customer behavior so that we can time offers to get customers back before our competitors do), commit budget to different projects, and improve planning and performance.

    More questions come to mind quickly. Where does the data come from? Was the data vetted? What is the overall quality of this data? Who owns it? Is it documented?

    Add to the mix the fact that more answers now require data that, up until three or four years ago, wasn't even planned as part of the corporate data warehousing landscape. This is what came to be termed Big Data (social network interaction data, voice, video, etc.), and it is so different from everything that came before it that you now have a major mess on your hands.

    You need to integrate this Big Data into your Business Intelligence (BI) and Corporate Performance Management (CPM) solutions, and the pressure to deliver it correctly, cheaply and quickly will still be there.

    What was the question again?

    Oh, yes. Just how do you actually manage the data you need to measure?

    You can't measure what you don't manage

    Let's face it: the future of data management is in multi-structured data sets (highly structured, semi-structured and unstructured data) and extremely flexible data acquisition and integration methods spread across multiple platforms.

    To succeed, you need to examine how Big Data differs from what came before and how it changes the way we manage information; how leading companies in many industries use this information to achieve better decisions and performance; and how this data will change information governance and management.

    While some will say this future means the end of highly structured data platforms (such as traditional databases and data warehouse environments), this is really more about the emergence of a new breed of data management platforms, built from the ground up to support multi-structured data and a flexible approach to applying schemas and accessing this data.

    AnalytiX Mapping Manager was built from the beginning to accelerate the development and deployment of successful Business Intelligence solutions, but it can also accelerate the delivery of data (all sorts of data: structured, semi-structured and unstructured) into this new breed of data management platform.

    The strength of AnalytiX Mapping Manager lies at the heart of its architecture: highly adaptive and lean, capable of quickly aligning the fluid needs of the business with the structure imposed by Information Technology and of accelerating successful Business Intelligence and analytical solutions.

    Aligning the overall architecture with the problem

    At the heart of its architecture, AnalytiX Mapping Manager is a complete requirements management solution that allows analysts to capture data, business and technical requirements. From these requirements, analysts and data scientists can quickly build an initial architecture model that, coupled with the metadata management capabilities native to AnalytiX Mapping Manager, can bring an awesome arsenal to bear on the Big Data battlefield.

    This deluge of data we are calling Big Data is a familiar theme to any IT department today. More and more Big Data initiatives start every day, and most of them start without much of a roadmap or a safety net.

    When it comes to Big Data initiatives, however, the challenge isn't simply that they start without a roadmap or a safety net, or even the growing amount of data, but the variety of data that must be consumed, compiled, governed and analyzed.

    There are many ways to define Big Data, but the typical definition is framed in terms of the four Vs of Big Data:

    The Volume of information (usually in the billions of interactions);

    The Variety of information (from structured, semi-structured and unstructured data);

    The Velocity of information (typically the speed at which data becomes available and can be analyzed); and

    The Value of information (which can seldom be determined before the data is acquired, loaded, compiled and thoroughly analyzed).

    While these four Vs may drive organizations to deploy new technologies, techniques and methods to cope with these new requirements, these initiatives are usually incomplete and incompatible with their existing business intelligence and analytics infrastructures, and they seldom take into consideration the requirements of information governance in Big Data. Remember: you can't measure what you don't manage!

    Big Data forces organizations to take a different look at how they integrate this data into the existing Business Intelligence and Analytics landscape. Big Data Governance forces these organizations to rethink how they manage all this data.

    What was the question again?

    Oh, yes. Just how do you actually manage the data you need to measure?

    Big Data Governance: Managing the difference

    Big Data comes into play from the need to take Business Intelligence and Analytics well beyond the great systems of record (yes, we need these too!), which are focused on efficiency, to a whole new breed of solutions centered on effectiveness: what is now being termed the systems of engagement (the systems that hold the data from all other corporate systems).

    Future solutions are clearly evolving from data that comes from highly efficient and well-behaved systems to data that comes from the extended enterprise, where data is not necessarily so well structured or behaved. This forces the organization into a more collaborative mode of operation, with its core infrastructure being adapted from the consumer space.

    Is Big Data Governance really that different from traditional data governance initiatives?

    While the needs, reasons and benefits of governance remain the same whether you are dealing with Big Data or with small data, to see what really changes with Big Data Governance we must go back to the four Vs of Big Data (Volume, Variety, Velocity and Value):

    Volume. With extremely large data volumes, highly sophisticated tools to assess and profile data become a necessity, as traditional approaches of profiling and managing data primarily through observation become unfeasible;

    Variety. The different types of data associated with Big Data (video, voice, sensor readings, images, social network feeds, etc.) definitely require us to look at storing all of this in something other than a traditional relational database and to govern this data according to the types of data we need to measure;

    Velocity. The different latency requirements for intake and consumption, analysis, integration and publication or promotion of the information make it necessary for the raw data to be available for analysis in real time, with data quality and database load schedules pretty much having to take a back seat to the pressures of getting to market quickly.

    Value. Typically, with Big Data, the Value of information can only be determined once the data has been acquired, loaded, compiled and thoroughly analyzed. This goes directly against what we, for so long, held as one of the tenets for (small) data governance that we would have a significant investment in time a