why do i need a data warehouse?

Upload: blumshapiro

Post on 03-Jun-2018


  • 8/12/2019 Why Do I Need a Data Warehouse?


    Blum, Shapiro & Company, P.C. An independent Member of Baker Tilly International

    Why do I need a Data Warehouse?

    Todd Chittenden

    Business Intelligence Architect

    "If you can measure that of which you speak and can express it by a number, you know something of your subject; but if you cannot measure it, your knowledge is meager and unsatisfactory."

    This quote from the 1800s, attributed to William Thomson, 1st Baron Kelvin, is often paraphrased as a much catchier phrase that still holds true in modern business:

    "If you can't measure it, you can't manage it."

    Businesses measure all sorts of things in order to better manage their business. They measure how much a customer has spent on their products, which products have the best margin and how long it takes to get a product delivered to the customer after the order is confirmed. You name it, and most assuredly someone or some business out there is measuring it.

    But measurements for a business are only as good as the delivery of this information to the people who manage what is being measured. This is where business reports come into play.

    Reports offer consolidation of the items measured. Generally speaking, the higher up in an organization a manager is, the more consolidated his or her reports need to be.

    Computerized database technologies have been developed over the last few decades to allow businesses to store their data in highly efficient electronic formats, a.k.a. databases. Businesses rely on database systems geared toward capturing data from the users. These front-end systems are often referred to as On-Line Transaction Processing (OLTP) systems. Most transaction systems, in addition to their data entry and capture capabilities, also offer some amount of reporting. Even the most basic of reports can offer consolidation and some bit of context. For example, "How much did we sell last quarter as opposed to our projections?" But even the best and most complex of OLTP systems will have some reporting shortcomings for one or more of several reasons.

    The concept of a data warehouse was developed to combat some of these issues. A data warehouse can be thought of as another database with high-volume reporting and analytics as its main purpose, as opposed to row-by-row data retrieval and manipulation. Data warehouses typically contain copies of data already managed by transaction systems, but they are designed and indexed for efficient bulk retrieval and reporting. In the following sections, we'll explore some possible shortcomings of OLTP systems in regard to reporting and analysis and see how the implementation of a data warehouse can alleviate them.

    Report Response Time

    OLTP systems can be quite complex. For example, consider the complexity of the database structure needed to run an airline's ticketing, passenger boarding and luggage tracking processes. And it's this complexity and detail that are often their downfall when it comes to high-level reports. A ticket agent may be able to tell you how many empty seats are on the next flight, but finding the number of empty seats on all the flights this year, compared to the total seating capacity (a measure of lost opportunities), may take several minutes or hours to compile.

    A data warehouse may make use of aggregated data sets for faster response times. When the data is summarized ahead of time by day, week or month, a report reads a handful of summary rows instead of every detail row, which can drastically reduce reporting times.
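
    As a rough illustration of the idea, the sketch below rolls flight-level detail up to one row per month and then answers the empty-seat question from the summary alone. The flight figures and the function names (summarize_by_month, empty_seat_ratio) are made up for this example and do not come from any particular system.

```python
from collections import defaultdict
from datetime import date

# Hypothetical flight-level detail rows, as a transaction system might hold them:
# (flight date, seats on the aircraft, seats sold)
flights = [
    (date(2018, 1, 5), 150, 120),
    (date(2018, 1, 19), 150, 150),
    (date(2018, 2, 2), 180, 90),
    (date(2018, 2, 16), 180, 171),
]


def summarize_by_month(rows):
    """Roll detail rows up to one summary row per (year, month)."""
    summary = defaultdict(lambda: {"capacity": 0, "sold": 0})
    for flight_date, capacity, sold in rows:
        key = (flight_date.year, flight_date.month)
        summary[key]["capacity"] += capacity
        summary[key]["sold"] += sold
    return dict(summary)


def empty_seat_ratio(summary):
    """Share of seats that flew empty, computed from the summary rows only."""
    capacity = sum(month["capacity"] for month in summary.values())
    sold = sum(month["sold"] for month in summary.values())
    return (capacity - sold) / capacity


monthly = summarize_by_month(flights)
print(monthly[(2018, 1)])
print(round(empty_seat_ratio(monthly), 3))
```

    In a real warehouse the summary would be built once during the data load, so the yearly report never has to touch the individual flights.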

    Purged Data

    Often, aged data is purged from a front-end system. Without such periodic housecleaning, the volumes of data to plow through to get to a specific record may become prohibitive. In most cases, these historical data are archived before they are removed, and stored in other tables, databases or even on completely different systems. Reporting on historical data then becomes difficult at best.

    Such periodic purging of data in a data warehouse is seldom required and often frowned upon. Trend reporting, for example by season or month of the year, is then more accurate if there are more years to average. Global climate databases look at seasonal temperature averages over millennia!
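
    To make the point about averaging over more years concrete, here is a small sketch that computes a per-month average twice: once from a single retained year and once from the full history. The sales figures and the seasonal_average helper are invented for illustration.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical monthly sales totals keyed by (year, month)
monthly_sales = {
    (2016, 6): 60.0, (2017, 6): 70.0, (2018, 6): 80.0,
    (2016, 12): 120.0, (2017, 12): 150.0, (2018, 12): 90.0,
}


def seasonal_average(sales, years):
    """Average each calendar month over the given set of years."""
    by_month = defaultdict(list)
    for (year, month), amount in sales.items():
        if year in years:
            by_month[month].append(amount)
    return {month: mean(values) for month, values in by_month.items()}


# A system that purged everything before 2018 sees only one data point per month
recent_only = seasonal_average(monthly_sales, {2018})
# A warehouse that kept the history can average across all three years
full_history = seasonal_average(monthly_sales, {2016, 2017, 2018})

print(recent_only)
print(full_history)
```

    With only 2018 available, an unusually weak December looks like the norm; the three-year average smooths it out.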

    Indexing Strategies

    To pull up a single row out of millions in an OLTP system requires a specific indexing structure. Think of a typical white pages phone book that is indexed by town, then by last name, then by first name. How hard would it be to use such an indexing structure to find all the people with the first name of John?

    Data warehouses, on the other hand, employ multiple indexes to allow this type of data searching. The local library will most likely have at least two sets of card catalogs: one by author and one by subject. Each set of card catalogs can be thought of as an index to the books on the shelf.
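
    The phone book and card catalog analogy can be sketched in a few lines. This is only a toy model using in-memory dictionaries, not how any database engine actually stores its indexes; the sample entries and the build_index helper are assumptions made for the example.

```python
from collections import defaultdict

# Hypothetical phone-book entries: (town, last name, first name)
people = [
    ("Hartford", "Smith", "John"),
    ("Hartford", "Jones", "Mary"),
    ("Providence", "Brown", "John"),
    ("Providence", "Smith", "Ann"),
]


def build_index(rows, key_func):
    """Map each key to the positions of the rows that carry it."""
    index = defaultdict(list)
    for position, row in enumerate(rows):
        index[key_func(row)].append(position)
    return index


# One index in phone-book order and a second one by first name,
# like the library's two card catalogs
by_town_and_last = build_index(people, lambda row: (row[0], row[1]))
by_first_name = build_index(people, lambda row: row[2])

# The single-row lookup a transaction system is built for
smith_in_hartford = [people[i] for i in by_town_and_last[("Hartford", "Smith")]]

# "Everyone named John", answered without scanning every entry
johns = [people[i] for i in by_first_name["John"]]
print(johns)
```

    Each extra index costs storage and slows down writes a little, which is a trade-off a reporting database can usually afford more easily than a busy data-entry system.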

    Non-Intuitive and Complex Database Structure

    Data warehouses are typically a stripped-down version of their OLTP counterpart. Very seldom is everything in the source system a candidate to be tracked in the data warehouse. For this reason, data warehouses usually have fewer tables and fewer fields or columns in those tables. In addition, multiple tables related to one item in an OLTP system can often be represented by a single table in a data warehouse. Having a single table for customers and a single table for products, instead of several or dozens for each, makes the report designer's job much easier.
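
    A minimal sketch of that flattening step follows, assuming three small source tables held as Python dictionaries. The table names, columns and the build_customer_dimension function are hypothetical and chosen only to show the shape of the result.

```python
# Hypothetical source tables from a transaction system
customers = {
    1: {"name": "Acme Corp", "segment_id": 10},
    2: {"name": "Bolt Supply", "segment_id": 20},
}
addresses = {1: {"city": "Springfield", "state": "CT"}}  # customer 2 has none on file
segments = {10: {"segment": "Wholesale"}, 20: {"segment": "Retail"}}


def build_customer_dimension(customers, addresses, segments):
    """Flatten the related tables into one wide row per customer."""
    dimension = []
    for customer_id, customer in sorted(customers.items()):
        address = addresses.get(customer_id, {})
        segment = segments.get(customer["segment_id"], {})
        dimension.append({
            "customer_id": customer_id,
            "name": customer["name"],
            "city": address.get("city"),      # None when no address exists
            "state": address.get("state"),
            "segment": segment.get("segment"),
        })
    return dimension


customer_dimension = build_customer_dimension(customers, addresses, segments)
for row in customer_dimension:
    print(row)
```

    A report designer can then work from customer_dimension alone, without knowing how the source tables relate to each other.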

    No Insight from Other Sources

    OLTP systems are good at what they do, but sometimes they don't do enough to generate truly meaningful reports. A system that tracks and monitors manufacturing plant performance may have a lot of data points it collects, but the price of the raw materials may not be one of them.

    A data warehouse, on the other hand, can collect price fluctuations from other external sources, such as the supply chain, and lay them alongside the performance data for a more representative picture of the plant's cost effectiveness.
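
    As a sketch of what "laying the data alongside" might look like, the example below combines plant figures with an external price feed to get a material cost per unit produced. All figures, the dictionary names and the material_cost_per_unit function are assumptions made for illustration.

```python
# Hypothetical daily figures from the plant-monitoring system
plant_output = {"2018-03-01": 500, "2018-03-02": 450}        # units produced
material_used = {"2018-03-01": 1000.0, "2018-03-02": 990.0}  # kg consumed

# Hypothetical raw-material prices from an external source
material_price = {"2018-03-01": 2.00, "2018-03-02": 2.50}    # price per kg


def material_cost_per_unit(output, used, price):
    """Combine both sources by day; skip days missing from either one."""
    result = {}
    for day in sorted(output):
        if day in used and day in price:
            result[day] = used[day] * price[day] / output[day]
    return result


cost_per_unit = material_cost_per_unit(plant_output, material_used, material_price)
print(cost_per_unit)
```

    Neither source could produce this figure on its own: the plant system does not know the price, and the price feed does not know the output.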

    Historical Changes are Lost

    Most OLTP systems do not store historical changes to base data (base data being the "nouns" of the business: the "who" and "what", as opposed to the "how much"). A customer may move from one state of residence to the next, and all the OLTP system records is where that customer currently resides. All transactions for that customer are then associated with where the customer lives now, not where they lived at the time of the transaction.

    To solve this issue, a data warehouse will, in the preceding example, have two rows to represent that one customer: one row for before they moved, and one for their new place of residence. This allows the data warehouse to accurately report truly historical data because the transaction can be thought of as staying in the location at which it occurred, instead of transferring to the new location.
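
    The two-row approach can be sketched as follows. This pattern is commonly called a Type 2 slowly changing dimension, a name the article itself does not use. The validity-date columns, the sample customer and the version_on helper are assumptions for this example; real warehouses vary in how they mark the current row.

```python
from datetime import date

# Hypothetical customer dimension: one row per version of the customer.
# A valid_to of None marks the current row.
customer_versions = [
    {"row_id": 1, "customer_id": 42, "state": "CT",
     "valid_from": date(2015, 1, 1), "valid_to": date(2017, 6, 30)},
    {"row_id": 2, "customer_id": 42, "state": "RI",
     "valid_from": date(2017, 7, 1), "valid_to": None},
]


def version_on(versions, customer_id, on_date):
    """Return the customer row that was in effect on the given date."""
    for row in versions:
        if row["customer_id"] != customer_id:
            continue
        started = row["valid_from"] <= on_date
        not_ended = row["valid_to"] is None or on_date <= row["valid_to"]
        if started and not_ended:
            return row
    return None


# Hypothetical transactions: (customer id, transaction date, amount)
transactions = [
    (42, date(2016, 3, 15), 100.0),
    (42, date(2018, 2, 1), 250.0),
]

# Each sale is credited to the state the customer lived in at the time
sales_by_state = {}
for customer_id, txn_date, amount in transactions:
    row = version_on(customer_versions, customer_id, txn_date)
    sales_by_state[row["state"]] = sales_by_state.get(row["state"], 0.0) + amount

print(sales_by_state)
```

    With only the current address on file, both sales would have been reported under the customer's new state.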

    Summary

    On-Line Transaction Processing systems have become almost ubiquitous in today's business world, though most business users will call them by their varied commercial names. But business managers often experience frustration with these systems because they are slow to report large volumes of data, lack data older than a few months or years, are difficult to understand when building customized reports, lack certain pieces of data, are simply inaccurate regarding historical changes, or any combination thereof.

    The solution is a properly designed data warehouse built for speed, comprehensiveness, completeness and, in short, true business analytics. In future installments on the data warehouse theme, we'll dive into some of the details of how a data warehouse addresses some of the issues above.