8/12/2019 Why Do I Need a Data Warehouse?
Blum, Shapiro & Company, P.C. An independent Member of Baker Tilly International
Why do I need a Data Warehouse?
Todd Chittenden
Business Intelligence Architect
"If you can measure that of which you speak and can express it by a number, you know
something of your subject; but if you cannot measure it, your knowledge is meager and
unsatisfactory."
This quote from the 1800s attributed to William Thomson, 1st Baron Kelvin, is often
paraphrased as a much catchier phrase, which still holds true in modern business:
"If you can't measure it, you can't manage it."
Businesses measure all sorts of things in order to better manage their business. They measure
how much a customer has spent on their products, what products have the best margin and how
long it takes to get a product delivered to the customer after the order is confirmed. You name it,
and most assuredly someone or some business out there is measuring it.
But measurements for a business are only as good as the delivery of this information to the
people who manage what is being measured. This is where business reports come into play.
Reports offer consolidation of the items measured. Generally speaking, the higher up in an
organization a manager is, the more consolidated his or her reports need to be.
Computerized database technologies have been developed over the last few decades to allow
businesses to store their data in highly efficient electronic formats, a.k.a. databases. Businesses
rely on database systems geared toward capturing data from the users. These front-end systems
are often referred to as On-Line Transaction Processing (OLTP) systems. Most transaction
systems, in addition to their data entry and capture capabilities, also offer some amount of
reporting. Even the most basic of reports can offer consolidation and some bit of context. For
example: how much did we sell last quarter as opposed to our projections? But even the best
and most complex of OLTP systems will have some reporting shortcomings for one of several
reasons.
The concept of a data warehouse was developed to combat some of these issues. A data
warehouse can be thought of as another database with high volume reporting and analytics as its
main purpose, as opposed to row-by-row data retrieval and manipulation. It typically contains
copies of data already managed by transaction systems, but it is designed and indexed for
efficient bulk retrieval and reporting. In the following sections, we'll explore some possible
shortcomings of OLTP systems in regard to reporting and analysis and see how the
implementation of a data warehouse can alleviate them.
Report Response Time
OLTP systems can be quite complex. For example, consider the complexity of the database
structure needed to run an airline's ticketing, passenger boarding and luggage tracking processes.
It is this complexity and detail that are often their downfall when it comes to high-level
reports. A ticket agent may be able to tell you how many empty seats are on the next flight, but
finding the number of empty seats on all the flights this year, compared to the total seating
capacity (a measure of lost opportunities), may take several minutes or hours to compile.
A data warehouse may make use of aggregated data sets for faster response times. Summarizing
the data by day, week or month will drastically reduce reporting times.
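As a rough illustration of the aggregation idea, the sketch below rolls flight-level records up into one summary row per month, so that a yearly report reads a handful of rows rather than every flight. The record layout, the sample figures and the names Flight and summarize_by_month are assumptions made for this example, not part of any particular product.

```python
from collections import defaultdict
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class Flight:
    """One detail row, as a transaction system might store it (hypothetical layout)."""
    flight_date: date
    capacity: int      # seats available on the aircraft
    seats_sold: int    # seats actually sold


def summarize_by_month(flights):
    """Roll detail rows up to one (capacity, empty seats) pair per (year, month)."""
    summary = defaultdict(lambda: [0, 0])
    for f in flights:
        key = (f.flight_date.year, f.flight_date.month)
        summary[key][0] += f.capacity
        summary[key][1] += f.capacity - f.seats_sold
    return {key: tuple(totals) for key, totals in summary.items()}


# Made-up sample data for illustration only.
flights = [
    Flight(date(2019, 1, 5), 150, 120),
    Flight(date(2019, 1, 20), 150, 150),
    Flight(date(2019, 2, 3), 200, 170),
]
monthly = summarize_by_month(flights)

# A year-level report now scans the small summary instead of every flight.
total_capacity = sum(cap for cap, _ in monthly.values())
total_empty = sum(empty for _, empty in monthly.values())
empty_seat_rate = total_empty / total_capacity
```

In practice the summary would be computed once when the warehouse is loaded and then stored, so that each report pays only the cost of reading it.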
Purged Data
Often, aged data is purged from a front-end system. Without such periodic housecleaning, the
volumes of data to plow through to get to a specific record may become prohibitive. In most
cases, these historical data are archived before they are removed, and stored in other tables,
databases or even on completely different systems. Reporting on historical data then becomes
difficult at best.
Such periodic purging of data in a data warehouse is seldom required and often frowned upon.
Trend reporting, for example by season or month of the year, is then more accurate if there are
more years to average. Global climate databases look at seasonal temperature averages over
millennia!
Indexing Strategies
To pull up a single row out of millions in an OLTP system requires a specific indexing structure.
Think of a typical white pages phone book that is indexed by town, then by last name, then by
first name. How impossible would it be to use such an indexing structure to find all the people
with the first name of John?
Data warehouses, on the other hand, employ multiple indexes to allow this type of data
searching. The local library will most likely have at least two sets of card catalogs: one by author
and one by subject. Each set of card catalogs can be thought of as an index to the books on the
shelf.
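The card catalog analogy can be sketched in a few lines. In this minimal example, each index maps a value (an author or a subject) to the positions of the matching books, so either lookup avoids scanning the whole shelf. The book list and the helper name build_index are hypothetical and serve only to illustrate the idea.

```python
from collections import defaultdict

# Hypothetical books on the shelf: (title, author, subject).
books = [
    ("Title A", "Smith", "History"),
    ("Title B", "Jones", "History"),
    ("Title C", "Smith", "Science"),
]


def build_index(rows, column):
    """Map each distinct value in `column` to the positions of the matching rows."""
    index = defaultdict(list)
    for position, row in enumerate(rows):
        index[row[column]].append(position)
    return dict(index)


# Two "card catalogs" over the same shelf.
by_author = build_index(books, column=1)
by_subject = build_index(books, column=2)

# Either catalog leads straight to the matching books.
smith_titles = [books[i][0] for i in by_author["Smith"]]
history_titles = [books[i][0] for i in by_subject["History"]]
```

Each extra index costs storage and must be updated whenever a book is added. That cost is easier to accept in a warehouse that is loaded in bulk than in a system that is written to row by row.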
Non-Intuitive and Complex Database Structure
Data warehouses are typically a stripped-down version of their OLTP counterparts. Very seldom
is everything in the source system a candidate to be tracked in the data warehouse. For this
reason, data warehouses usually have fewer tables and fewer fields or columns in those tables. In
addition, multiple tables related to one item in an OLTP system can often be represented by a
single table in a data warehouse. Having a single table for customers and a single table for
products, instead of several or dozens for each, makes the report designer's job much easier.
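To make the single-table idea concrete, here is a minimal sketch that merges three source tables about customers into one wide customer table. The table contents, column names and the function build_customer_dimension are all assumptions chosen for illustration.

```python
# Hypothetical source tables from a transaction system, keyed by customer id.
customers = {1: {"name": "Acme Corp"}, 2: {"name": "Globex"}}
addresses = {
    1: {"city": "Hartford", "state": "CT"},
    2: {"city": "Providence", "state": "RI"},
}
segments = {1: {"segment": "Manufacturing"}}  # customer 2 has no segment on file


def build_customer_dimension(customers, addresses, segments):
    """Merge the per-customer rows from each source table into one wide row."""
    dimension = []
    for customer_id, base in sorted(customers.items()):
        row = {"customer_id": customer_id, **base}
        # Fill gaps with explicit placeholders so every row has the same columns.
        row.update(addresses.get(customer_id, {"city": None, "state": None}))
        row.update(segments.get(customer_id, {"segment": "Unknown"}))
        dimension.append(row)
    return dimension


customer_dim = build_customer_dimension(customers, addresses, segments)
```

A report designer can then filter or group on any customer attribute from one table, without needing to know how the source system splits that information up.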
No Insight from Other Sources
OLTP systems are good at what they do, but sometimes they don't do enough to generate truly
meaningful reports. A system that tracks and monitors manufacturing plant performance may
have a lot of data points it collects, but the price of the raw materials may not be one of them.
A data warehouse, on the other hand, can collect price fluctuations from other external
sources, such as the supply chain, and lay them alongside the performance data for a more
representative picture of the plant's cost effectiveness.
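A small sketch of what laying the two sources side by side could look like: daily plant output is matched by date with externally sourced raw material prices to give a material cost per unit. The figures, field names and the function material_cost_per_unit are made up for illustration.

```python
# Hypothetical daily output from the plant-monitoring system: units produced
# and kilograms of raw material consumed.
plant_performance = {
    "2019-08-01": {"units": 1000, "material_kg": 500.0},
    "2019-08-02": {"units": 1000, "material_kg": 500.0},
}

# Hypothetical raw-material prices per kilogram from an external source.
material_price_per_kg = {"2019-08-01": 2.00, "2019-08-02": 3.00}


def material_cost_per_unit(performance, prices):
    """Join the two sources on date and compute material cost per unit produced."""
    result = {}
    for day, perf in performance.items():
        price = prices.get(day)
        if price is None:
            continue  # no external price for this day; leave it out rather than guess
        result[day] = perf["material_kg"] * price / perf["units"]
    return result


cost_per_unit = material_cost_per_unit(plant_performance, material_price_per_kg)
```

In this sample the plant performs identically on both days, yet its material cost per unit rises with the price. Neither source could show that on its own.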
Historical Changes are Lost
Most OLTP systems do not store historical changes to base data (base data being the nouns of
the business: who and what, as opposed to the how much). A customer may move from one state
of residence to the next, and all the OLTP system records is where that customer currently
resides. All transactions for that customer are then associated with where the customer lives now,
not where they lived at the time of the transaction.
To solve this issue, a data warehouse will, in the preceding example, have two rows to represent
that one customer, one row for before they moved, and one for their new place of residence. This
allows the data warehouse to accurately report truly historical data because the transaction can be
thought of as staying in the location at which it occurred, instead of transferring to the new
location.
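The two-row approach described here is commonly known as a Type 2 slowly changing dimension. The sketch below shows one way it might work: each customer row carries the date range during which it was current, and each transaction is matched to the row that was in effect on the transaction date. The column names valid_from and valid_to, the sample data and the helper state_at are assumptions for this example.

```python
from datetime import date

# Two rows for the same customer: one per place of residence, each with the
# date range during which it was current (the end date is exclusive).
customer_history = [
    {"customer_id": 42, "state": "CT",
     "valid_from": date(2015, 1, 1), "valid_to": date(2018, 6, 1)},
    {"customer_id": 42, "state": "RI",
     "valid_from": date(2018, 6, 1), "valid_to": date.max},
]


def state_at(customer_id, on_date, history):
    """Return the state the customer lived in on `on_date`, or None if unknown."""
    for row in history:
        if (row["customer_id"] == customer_id
                and row["valid_from"] <= on_date < row["valid_to"]):
            return row["state"]
    return None


# Hypothetical transactions for customer 42, before and after the move.
transactions = [
    {"customer_id": 42, "date": date(2017, 3, 15), "amount": 100.0},
    {"customer_id": 42, "date": date(2019, 2, 10), "amount": 250.0},
]

# Each sale is credited to the state where the customer lived at the time.
sales_by_state = {}
for t in transactions:
    state = state_at(t["customer_id"], t["date"], customer_history)
    sales_by_state[state] = sales_by_state.get(state, 0.0) + t["amount"]
```

With only the current address on file, both sales would be reported under the new state. With the history rows, the earlier sale stays with the state where it occurred.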
Summary
On-Line Transaction Processing systems have become almost ubiquitous in today's business
world, though most business users will call them by their varied commercial names. But business
managers often experience frustration with these systems because they are slow to report large
volumes of data, lack data past a few months or years, are difficult to understand for customized
reports, lack certain pieces of data, are simply inaccurate regarding historical changes, or any
combination thereof.
The solution is a properly designed data warehouse built for speed, comprehensiveness,
completeness, and, in a word, true business analytics. In future installments on the data
warehouse theme, we'll dive into some of the details of how a data warehouse addresses some of
the issues above.