what is a data mart



TECH TOPIC 4

Pine Cone Systems, Inc.

WHAT IS A DATA MART?

by W.H. Inmon and Ed Young

The world of DSS has created a number of architectural structures. The most notable of the DSS

structures is the data warehouse. The data warehouse is the center of the DSS universe. The data

warehouse contains integrated, historical data that is common to the entire corporation. The data

warehouse contains both summarized and detailed information. The data warehouse contains

metadata that describes the contents and source of data that flows into the data warehouse.

What is a Data Mart?

Data flows from the data warehouse to various departments for their customized DSS usage. These

departmental DSS data bases are called data marts. A data mart is a body of DSS data for a

department that has an architectural foundation of a data warehouse. Figure 1 shows data marts as they emanate from the data warehouse.

Figure 1: Data marts emanating from the current level detail (data warehouse). Data is held at the most granular level in the current level detail. That detail is reshaped by each data mart to meet its own peculiar needs.

In Figure 1 it is seen that the data residing in the data warehouse is at a very granular level and the

data in the data mart is at a refined level. The different data marts contain different combinations and

selections of the same detailed data found at the data warehouse. In some cases a data mart selects

detailed data other than that which is found in other data marts. In other cases the data warehouse

detailed data is added differently across the different data marts. Yet in other cases a data mart will

structure detailed data differently from other data marts. But in every case the data warehouse

provides the granular foundation for all of the data found in all of the data marts. Because of the

© 1997 William H. Inmon. All rights reserved.


singular data warehouse foundation that all data marts have, all of the data marts have a common

heritage and are able to be reconciled at the most basic level.
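The idea that differently shaped data marts can still be reconciled against the same granular detail can be illustrated with a minimal sketch. The rows, field layout, and function names below are hypothetical and are not taken from the paper; they only show two departmental summaries built from one set of detail records.

```python
from collections import defaultdict

# Hypothetical granular warehouse detail: (sale_date, region, product, amount)
WAREHOUSE_DETAIL = [
    ("1997-01-15", "east", "widget", 100.0),
    ("1997-01-20", "west", "widget", 250.0),
    ("1997-02-03", "east", "gadget", 75.0),
    ("1997-02-11", "west", "gadget", 125.0),
]

def build_finance_mart(detail):
    """Finance view (assumed): revenue summarized by month."""
    totals = defaultdict(float)
    for sale_date, _region, _product, amount in detail:
        totals[sale_date[:7]] += amount
    return dict(totals)

def build_sales_mart(detail):
    """Sales view (assumed): revenue summarized by region and product."""
    totals = defaultdict(float)
    for _sale_date, region, product, amount in detail:
        totals[(region, product)] += amount
    return dict(totals)

finance_mart = build_finance_mart(WAREHOUSE_DETAIL)
sales_mart = build_sales_mart(WAREHOUSE_DETAIL)

# Both marts reshape the same detail, so their grand totals reconcile.
grand_total = sum(row[3] for row in WAREHOUSE_DETAIL)
```

Because both summaries are derived from the same detail rows, the sum of either mart equals the grand total of the warehouse detail, which is the kind of reconciliation the common foundation makes possible.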

In addition, Figure 1 shows that different departments will have their own independent data marts.

There are several names for data marts. Those names include:

• departmental DSS data bases,

• OLAP data bases,

• multi dimensional data bases (md dbms), and

• lightly summarized data bases, and so forth.

Different operating departments within the organization want their own DSS data marts. The

departments that typically have their own data marts include:

• finance,

• marketing,

• sales, and

• accounting, etc.

Of course, any department can have its own data mart.

The Data Mart Community

The user of the data mart environment can be called the departmental DSS analyst. The departmental

DSS analyst is an individual who does decision making with a departmental bias. The departmental

DSS analyst is not a technician, but is a business person first and foremost. The decisions DSS

analysts make are mid to long term, strategic decisions.

The departmental DSS analyst community can be divided into several categories. One category is a

"farmer". A DSS farmer is someone who knows what they want and regularly and predictably goes

to the same place to find it. Another category is the DSS "explorer". A DSS explorer is an individual

who does not know what he/she wants. The DSS explorer looks at data in a random, sporadic fashion. The DSS explorer is an individual who often finds nothing, but occasionally finds

spectacular results.

Farmers and explorers are both found at the data mart level. However, many more farmers than

explorers are found here. The data mart environment has a very strong bias for farmers rather than

explorers.

The Appeal of the Data Mart

What is the appeal of the data mart? Why do departments find it convenient to do their DSS processing

in their own data mart? What is wrong with the data warehouse as a basis for DSS processing?

There are several factors that lead to the popularity of the data mart. As long as the data warehouse

doesn't contain much data, then the data warehouse may serve the needs of different departments as

a basis for DSS processing. But data warehouses do not stay small very long. For a variety of reasons,

data warehouses quickly grow large. And as data warehouses grow large, the motivation for data

marts increases.


As data warehouses grow large:

• The competition to get inside the warehouse grows fierce. More and more departmental DSS

processing is done inside the data warehouse to the point that resource consumption

becomes a real problem.

• Data becomes harder to customize. In the face of a small amount of data in a data warehouse,

the DSS analyst can afford to customize and summarize data every time a DSS analysis is done. But in the face of lots of data in a data warehouse, the DSS analyst does not have the

time and resources in order to summarize and customize the data.

• The cost of doing processing in the data warehouse increases as the volume of data increases.

• The software that is available for the access and analysis of large amounts of data (that is

typical of the data warehouse) is not nearly as elegant as the software that can process

smaller amounts of data (as is typical of the data mart).

The departmental DSS analyst discovers data marts and finds them to be very attractive. Data marts

become a natural extension of the data warehouse and are attractive for the following reasons:

• When a department has its own data mart, it can customize the data as the data flows into

the data mart from the data warehouse. There is no need for the data in the data mart to have to serve the entire corporation; therefore, the department can summarize, sort, select,

structure, etc., their own data to their heart's content with no consideration of any other

department.

• The amount of historical data that is needed is a function of the department, not the

corporation. In almost every case, the department can select a much smaller amount of

historical data than that which is found in the data warehouse.

• The department can do whatever DSS processing they want, whenever they want, with no

consideration of the impact for resource utilization on other departments.

• The department can build the data mart on their own budget, thereby making all the

technological decisions that they want.

• The department can select software for their data mart that is very elegant and tailored to fit

their needs.

• The department can select analytical software as they wish. There is a wealth of access and

analysis software at the level of processor housing the data mart.

• The unit cost of processing and storage on the size machine that is appropriate to the data

mart is significantly less than the unit cost of processing and storage for the machine that

houses the data warehouse, and so forth.

There are many reasons as to why the data mart becomes attractive as the data warehouse grows in volume. There are organizational, technological and economic reasons why the data mart is so

beguiling and is a natural outgrowth of the data warehouse.


Data Mart Source

The proper source of data for the data mart is the data warehouse. Figure 2 shows that under

normal conditions the source of data that flows into the data mart is the current level detail, or the

data warehouse.

Figure 2: Data is fed into the data mart from either the current level detail (data warehouse) or external sources.

Detailed data is customized, selected and summarized as it is placed into the data mart. In addition,

the data mart can be fed data from external sources.

The programs that interface the data mart and the data warehouse become an important part of the

documentation and metaprocess infrastructure. Note that Figure 2 does not show that data coming

from the operational or legacy environment is a legitimate source for data mart data.

Of course, the operational environment CAN be used as a basis for the building of a data mart.

However, using the operational environment as the basis for the data mart is a short term, near

sighted mistake in every case. In the long run, there are many reasons why the operational

environment is not a proper foundation for the data mart.

When the operational environment is used as a basis for the data mart environment:

• there is no integrated source of data,

• MANY interface programs must be written, maintained and executed,

• there is a massive amount of redundant data, and so forth.

Building the Data Mart First

Because the operational environment is NEVER a legitimate source for data in the data mart, there

is the implication that the data warehouse should be built before the data mart is built. Figure 3

depicts this phenomenon.


Figure 3: The order in which the data marts and the data warehouse should be built: 1 - data warehouse (current level detail, fed from the legacy applications), 2 - data mart.

It may not be intuitively obvious why data marts should never be built directly from the operational

environment. For an in-depth discussion of this subject, please refer to the Pine Cone Systems' Tech

Topic #7 on "BUILDING THE DATA MART OR THE DATA WAREHOUSE FIRST?".

For many important reasons building the data warehouse first is the proper way to structure the DSS

environment.

Different Kinds of Data Marts

There are different kinds of data marts. Figure 4 shows some of the more interesting data marts.

Figure 4: There are different types of data marts fed from the data warehouse: md dbms and ROLAP (general purpose).

One type of data mart is the md dbms ("multidimensional") data mart. The md dbms data mart is

one that is used for slicing and dicing numeric data in a free form fashion (i.e., free form within the


framework of the dbms that holds the multi dimensional data). Some of the characteristics of the

md dbms are:

• sparsely populated matrices,

• numeric data, and

• rigid structure of data once the data enters the md dbms framework.

Md dbms support the management capability of analytically looking at the same data in different ways.
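A minimal sketch of the slice-and-dice idea follows, assuming a sparsely populated cube stored as a dictionary of only the cells that hold values. The dimension names, cell values, and function names are hypothetical; a real multi dimensional dbms would use its own storage structures and query language.

```python
from collections import defaultdict

# Dimension order of each cell key (assumed for this sketch).
DIMENSIONS = ("product", "region", "month")

# Sparse cube: only populated cells are stored; all other combinations are empty.
cube = {
    ("widget", "east", "1997-01"): 100,
    ("widget", "west", "1997-01"): 250,
    ("gadget", "east", "1997-02"): 75,
}

def slice_cube(cells, **fixed):
    """Keep only the cells whose coordinates match the fixed dimension values."""
    positions = {DIMENSIONS.index(name): value for name, value in fixed.items()}
    return {
        key: value
        for key, value in cells.items()
        if all(key[pos] == wanted for pos, wanted in positions.items())
    }

def roll_up(cells, dimension):
    """Total the numeric measure along one dimension, collapsing the others."""
    pos = DIMENSIONS.index(dimension)
    totals = defaultdict(int)
    for key, value in cells.items():
        totals[key[pos]] += value
    return dict(totals)

east_only = slice_cube(cube, region="east")
by_product = roll_up(cube, "product")
```

The same three cells answer both questions, which is the sense in which the same data is looked at in different ways. The fixed `DIMENSIONS` tuple also hints at the rigidity noted above: a question outside those dimensions cannot be asked of this structure.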

Another type of data mart is one that can be called "ROLAP," for relational OLAP. ROLAP data

marts are general purpose data marts that contain both numeric and textual data. ROLAP data

marts serve a much wider purpose than their md dbms counterparts. Unlike md dbms, which are

supported by specialized data base management systems, ROLAP dbms are supported by

relational technology. Some of the characteristics of a ROLAP dbms are:

• numeric and textual data,

• general purpose DSS analysis,

• freely structured data,

• numerous indexes, and

• support of star schemas, etc.

ROLAP data marts can have both disciplined and ad hoc usage. Some of the processing in the

ROLAP data mart environment is very predictable. Other processing in the ROLAP environment is

very unpredictable.

The ROLAP data mart environment contains both detailed and summarized data.

Loading the Data Mart

The data mart is properly loaded from the data warehouse by means of a load program. Figure 5

shows a load program from the data warehouse to the data mart.

Figure 5: The programmatic interface feeding data from the current level detail (data warehouse) to the data mart. The load program is run periodically; may be a total refresh, a partial refresh, or an appendage; customizes data by selecting, summarizing, restructuring, merging, and aggregating data; and produces metadata/metaprocess information.


Some of the considerations of the data mart load program are:

• the schedule of loading,

• how frequently the program is run,

• total or partial refreshment,

• is the data mart table to be refreshed in its entirety or is the data mart table to be only added

onto,

• customization of data warehouse data,

• selection of data, re-sequencing of data, merging of data, aggregation of data, summarization of data, and so forth,

• efficiency of execution - how quickly can the loading be accomplished,

• integrity of data - integrity of relationships, integrity of data domains, etc., and

• production of metadata describing the load process itself.
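Several of the considerations above (total refresh versus append, selection, summarization, and production of metadata about the load) can be sketched with Python's standard `sqlite3` module. The table names `warehouse_sales` and `mart_sales_by_region`, the column layout, and the function `load_mart` are assumptions made for illustration only.

```python
import sqlite3
from datetime import datetime, timezone

def load_mart(conn, since, total_refresh=True):
    """Load a hypothetical summary table in the mart from warehouse detail.

    Returns a small metadata record describing the load itself.
    In append mode the caller is assumed to advance `since` between runs.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS mart_sales_by_region "
        "(region TEXT, month TEXT, total REAL)"
    )
    if total_refresh:
        conn.execute("DELETE FROM mart_sales_by_region")
    before = conn.execute("SELECT COUNT(*) FROM mart_sales_by_region").fetchone()[0]
    # Selection (WHERE) and summarization (GROUP BY) happen as the data flows in.
    conn.execute(
        "INSERT INTO mart_sales_by_region "
        "SELECT region, substr(sale_date, 1, 7), SUM(amount) "
        "FROM warehouse_sales WHERE sale_date >= ? "
        "GROUP BY region, substr(sale_date, 1, 7)",
        (since,),
    )
    after = conn.execute("SELECT COUNT(*) FROM mart_sales_by_region").fetchone()[0]
    conn.commit()
    return {
        "source": "warehouse_sales",
        "target": "mart_sales_by_region",
        "mode": "total refresh" if total_refresh else "append",
        "selection": f"sale_date >= {since}",
        "rows_loaded": after - before,
        "loaded_at": datetime.now(timezone.utc).isoformat(),
    }

# Demonstration with an in-memory database standing in for the warehouse.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE warehouse_sales (sale_date TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO warehouse_sales VALUES (?, ?, ?)",
    [
        ("1996-12-30", "east", 40.0),
        ("1997-01-15", "east", 100.0),
        ("1997-01-20", "east", 50.0),
        ("1997-01-20", "west", 250.0),
    ],
)
load_info = load_mart(conn, since="1997-01-01")
mart_rows = conn.execute(
    "SELECT region, month, total FROM mart_sales_by_region ORDER BY region"
).fetchall()
```

The returned dictionary is one possible shape for the load metadata; what a given installation records would depend on its own metadata infrastructure.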

Metadata in the Data Mart

One of the most important components of the data mart is that of metadata. Figure 6 depicts

metadata in the data mart environment.

Figure 6: Metadata is an integral part of the data mart environment, both at the data mart level and at the end user terminal of the DSS analyst.

Metadata in the data mart environment serves the same purpose as metadata in the data warehouse.

Data mart metadata allows the data mart DSS analyst to find out where data is in the process of

discovery and exploration. Data mart metadata contains the following components:

• identification of the source of data,

• description of the customization that has occurred as the data passes from the data warehouse to the data mart,

• simple descriptive information about the data mart, including tables, attributes,

relationships, etc.,

• definitions, and so forth.


Data mart metadata is created by and updated from the load programs that move data into the data

marts.

There needs to be linkage between the metadata found in the data mart and the metadata found in

the data warehouse. Among other things, the metadata linkage between the two environments

needs to describe how the data in the two environments is related. This description is necessary if

there is to be drill down capability between the two environments. With the linkage, the manager using data mart metadata can easily find the heritage of the data in the data warehouse. In

addition, the DSS analyst needs to be able to see how calculations were made and how data was

selected for the data mart environment.

The metadata found in the data mart must be available to the end user at the end user work station

in order to be effective.

Data Modeling for the Data Mart

There is an important question in the building of the data mart, and that question is - is a data

model required in order to build a data mart? Figure 7 illustrates this question.

Figure 7: In some cases the data model is applicable to the building of the data mart from the current level detail (data warehouse).

The issue of whether a data model is required depends on the size and formality of the data mart.

Some data marts are very small and informal. For this kind of data mart, no data model is required.

Other data marts are large and formal. In those kinds of data marts it is normal to have some

amount of repetitive, predictable processing. For data marts where there is much data and where there will be predictable processing, it makes sense to build a formal data model.

One of the challenges of building and using a data model in the building of the data mart

environment is that of the influence of the data mart dbms. Some data mart dbms - especially the

multi dimensional dbms - require the data to be placed in such a rigid format that it is questionable

whether a data model is of much use. The multi dimensional dbms dictates so much of the data

structure that attempting to apply a model to the dbms may be an exercise in futility.


The data mart data model follows the same conventions as a data model for any other part of the information processing environment with the following exceptions. The data mart data model is strongly influenced by the department for which the model was built, and the data mart data model frequently incorporates detailed data AND summary data. Most data models do not have these two characteristics.

Purging the Data Mart

Like the data warehouse, periodically the data mart needs to be purged, as seen in Figure 8.

Figure 8: An important part of the data mart processing infrastructure is the program(s) that purges, archives, or consolidates data mart data.

The data mart in Figure 8 is being read and some data is selectively removed. The data that is removed may be purged, archived, or condensed. The criteria for purge can be related to date and time, or can be based on some other criteria.
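A date-based purge with optional archiving might look like the following sketch. The row layout, the `as_of` field, and the function name `purge_mart` are hypothetical; a criterion other than date could be substituted in the same place.

```python
from datetime import date

def purge_mart(rows, cutoff, archive=None):
    """Return the rows to keep; rows older than `cutoff` are removed.

    Removed rows are appended to `archive` when one is supplied,
    otherwise they are simply purged.
    """
    kept = []
    for row in rows:
        if row["as_of"] < cutoff:
            if archive is not None:
                archive.append(row)
        else:
            kept.append(row)
    return kept

# Hypothetical mart rows, each stamped with the date it describes.
mart_rows = [
    {"as_of": date(1995, 3, 31), "region": "east", "total": 90.0},
    {"as_of": date(1996, 12, 31), "region": "east", "total": 120.0},
    {"as_of": date(1997, 6, 30), "region": "west", "total": 200.0},
]
archive = []
mart_rows = purge_mart(mart_rows, cutoff=date(1996, 1, 1), archive=archive)
```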

Data Mart Contents

The data mart contains whatever data is needed for departmental DSS processing. As such,

there is a diversity of data in the data mart. Figure 9 shows that diversity.

Figure 9: There are different types of data in the data mart environment (summary, detailed, ad hoc, prepared).

            summary   detailed
ad hoc      lots      some
prepared    some      lots


The data mart contains both summary data and detailed data. In addition the data mart contains

prepared data and ad hoc data. The matrix in Figure 9 shows the diversity of data.

As a rule, the data mart has lots of ad hoc summary data and lots of prepared detailed data.

While other data can be found in the data mart, these two categories comprise the bulk of the data

found there.

Software in the Data Mart

There are many different kinds of software found in the data mart. Figure 10 shows some of the

typical software that is found there.

Figure 10: There are different kinds of software in the data mart environment: dbms, access/analysis, automatic creation, system management, data usage tracker, data content tracker, purge/archive, and metadata management.

The software found in the data mart includes:

• dbms software,

• access and analysis software,

• automated data mart creation,

• purge / archival software,

• metadata management software, and so forth.

Easily the most interesting of the software is access and analysis software. Access and analysis

software allows the end user to search through the data mart to find and/or calculate the data the

end user desires. The access and analysis software is distinguished by its elegance of presentation.

The two types of dbms that are found in the data mart are relational and multi dimensional, as seen

in Figure 11.


Figure 11: Dbms for data marts: md dbms and relational. There are two basic types of dbms software for the data warehouse environment.

Multi dimensional technology is at the heart of numeric slice and dice processing. Relational

technology forms the basis for all other standard data mart processing.

The kinds of tables and the numbers of tables that are found in the data mart are many and varied.

Figure 12 shows that over time the numbers of and the types of tables grow.

Figure 12: Over time (day 1 to day n) the number of tables in the data mart and the types of tables in the data mart grow.

The types of tables found in the data mart include:

• summary tables,

• detailed tables,

• reference tables,

• historical tables,

• ad hoc created tables, and

• spread sheet analytical tables, and so forth.

As time passes, the number of tables and the number of types of tables grow. In addition, the data that

comes from external sources grows as well. The result is a backlog of DSS data that serves as a "library".

The library is available for management when they want a quick reply or a survey of available data.


Cross Reference with Other Data Marts

One of the most useful things that the data warehouse administrator (dwa) can do is to cross reference data mart data across all data

marts. Figure 13 shows the cross referencing.

Figure 13: Among the different data marts there needs to be synchronization, cross referencing, and integrity of data and processing.

In Figure 13, the dwa has created a cross reference of data across all data marts. The cross reference

tells such things as:

• where there is common or related data among the data marts,

• where there is a common source of data,

• where there are differences of note across the data, and so forth.

Where there is common data, the dwa needs to make sure the nuances and distinctions that separate

the data are obvious. When one data mart treats a calculation one way and another data mart makes

almost the same, but not quite the same calculation, the dwa needs to be able to document the

differences.
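One way to picture such a cross reference is as a comparison of the metadata each data mart keeps about its data elements. The sketch below is an assumption about how that could be organized; the mart names, element names, sources, and calculation strings are all hypothetical.

```python
# Hypothetical per-mart metadata: element name -> source and calculation.
MART_METADATA = {
    "finance": {
        "net_revenue": {"source": "warehouse.sales", "calc": "SUM(amount) - SUM(returns)"},
        "headcount": {"source": "warehouse.personnel", "calc": "COUNT(*)"},
    },
    "marketing": {
        "net_revenue": {"source": "warehouse.sales", "calc": "SUM(amount)"},
        "campaign_cost": {"source": "external.agency", "calc": "SUM(cost)"},
    },
}

def cross_reference(marts):
    """Report elements found in more than one mart and whether they agree."""
    by_element = {}
    for mart, elements in marts.items():
        for name, info in elements.items():
            by_element.setdefault(name, {})[mart] = info
    report = {}
    for name, per_mart in by_element.items():
        if len(per_mart) < 2:
            continue  # element is private to a single mart
        sources = {info["source"] for info in per_mart.values()}
        calcs = {info["calc"] for info in per_mart.values()}
        report[name] = {
            "marts": sorted(per_mart),
            "common_source": len(sources) == 1,
            "calculations_differ": len(calcs) > 1,
        }
    return report

xref = cross_reference(MART_METADATA)
```

Here `net_revenue` appears in both marts with a common source but a different calculation, which is exactly the kind of nuance the cross reference is meant to surface and document.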

Metadata in the data mart environment is the most effective place to accomplish the cross referencing

of data in the data mart environment.

Difference in Development Life Cycles

The data mart attracts farmers while the current level of detail attracts explorers. With the different

communities that are attracted, there comes a difference in development life cycles. Figure 14 shows

these differences.


Figure 14: There is a different audience and a different development life cycle for the data mart (SDLC) and the data warehouse (CLDS).

Farmers operate from a classical SDLC development life cycle. Explorers operate from a life cycle

that can be described as a CLDS (i.e., a development life cycle that is the reverse of the SDLC.) Of

course, when ad hoc processing is being done in the data mart, development may well proceed along

the lines of CLDS. And for that reason the data mart holds a diversity of data. But for the

predictable, repetitive portion of the data warehouse the classical SDLC is the norm.

Structure Within the Data Mart

The data is structured in the data mart along the lines of star joins and normalized tables. The farmer

oriented tables that are produced for the data mart center around star joins. Star joins are created

where there is a predictable pattern of usage and where there is a significant amount of data. Relational tables are used as a basis for design where there is not a predictable pattern of usage.

Figure 15 shows the two predominant patterns of design that are found in the data mart.

Figure 15: Since the data mart contains both predictable and unpredictable data, both star joins and normalized data as defined from a data model apply to the design of a data mart.

Both forms of structure are able to reside in the data mart with no conflict of interest.
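A star join can be sketched as a central fact table joined to the dimension tables that surround it. The example below uses Python's standard `sqlite3` module; the table and column names (`fact_sales`, `dim_product`, `dim_region`) are hypothetical and chosen only to show the shape of the design.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Dimension tables: small, descriptive, one row per product or region.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, product_name TEXT);
    CREATE TABLE dim_region  (region_id  INTEGER PRIMARY KEY, region_name  TEXT);

    -- Fact table: the bulk of the data, keyed by the dimensions.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product,
        region_id  INTEGER REFERENCES dim_region,
        sale_month TEXT,
        amount     REAL
    );

    INSERT INTO dim_product VALUES (1, 'widget'), (2, 'gadget');
    INSERT INTO dim_region  VALUES (1, 'east'), (2, 'west');
    INSERT INTO fact_sales  VALUES
        (1, 1, '1997-01', 100.0),
        (1, 2, '1997-01', 250.0),
        (2, 1, '1997-02', 75.0);
    """
)

# The star join: the fact table joined outward to each dimension.
STAR_QUERY = """
    SELECT p.product_name, r.region_name, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_product AS p ON p.product_id = f.product_id
    JOIN dim_region  AS r ON r.region_id  = f.region_id
    GROUP BY p.product_name, r.region_name
    ORDER BY p.product_name, r.region_name
"""
star_result = conn.execute(STAR_QUERY).fetchall()
```

The design suits predictable usage because the join paths are fixed in advance; questions that do not follow those paths are better served by the normalized tables.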


External Data in the Data Mart

The use of external data holds great promise for inclusion in the data mart. Figure 16 shows that

external data has a place in the data mart.

Figure 16: External data belongs in the data mart under the following conditions: if the external data is used anywhere else in the DSS environment, the external data is placed in the data warehouse first; the external data is available to anyone who wants to use it; and the external data can be modified as it enters the data mart.

There are several important issues that relate to the usage of external data in the data mart. The first

issue is that of redundancy. When the external data needs to be used in more than one data mart, it

is best to place the external data inside the data warehouse, then move the data to the data mart. This

practice assures that redundancy can be controlled. In addition this practice ensures that the external

data needs to be purchased only once. When each data mart is "doing their own thing" in the

purchase and acquisition of external data there is always the chance that two or more data marts will

be purchasing the same external data.

A second issue is that of storage of secondary data along with the external data. When external data is acquired, the "pedigree" of the data needs to be stored as well as the data. The pedigree of external

data includes:

• the source of the external data,

• the date the external data was acquired,

• how much external data was acquired,

• data descriptions of the external data, and

• editing and filtering criteria that have been applied to the external data, and so forth.

Reference Tables

Reference tables play an important role in the data mart environment. Data marts allow the data

mart end user to relate data back to the expanded version of the data. In doing so, the end user can operate in data "short hand" if desired. Figure 17 shows the use of reference data in the data mart

environment.


Figure 17: Reference tables are a normal part of the data mart environment. Reference tables can be copied over from the data warehouse environment.

The reference data typically is copied over from the data warehouse. Although it can happen, it is

very unusual for the data mart to store and manage its own reference tables. When reference tables

are stored and managed in the DSS environment, there is a need to manage their contents over time.

Time management of reference data is a complex task, and is best managed from the data warehouse.
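To show why reference data has to be managed over time, the sketch below keeps an effective date with each reference entry and expands a "short hand" code as of a given date. The codes, descriptions, dates, and the function name `expand` are hypothetical.

```python
from datetime import date

# Hypothetical time-managed reference table:
# (code, effective_from, expanded description)
REFERENCE = [
    ("NE", date(1995, 1, 1), "Northeast Region"),
    ("NE", date(1997, 1, 1), "New England Region"),
    ("SW", date(1995, 1, 1), "Southwest Region"),
]

def expand(code, as_of):
    """Return the description in force for `code` on `as_of`, or None."""
    candidates = [
        (effective_from, text)
        for ref_code, effective_from, text in REFERENCE
        if ref_code == code and effective_from <= as_of
    ]
    if not candidates:
        return None
    # The entry with the latest effective date not after `as_of` applies.
    return max(candidates)[1]

old_meaning = expand("NE", date(1996, 6, 1))
new_meaning = expand("NE", date(1997, 6, 1))
```

The same code resolves to different text depending on the date of the data being read, which is one reason the paper suggests leaving this bookkeeping to the data warehouse.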

Performance in the Data Mart

The issue of performance in the DSS environment is an entirely different issue than in the OLTP

environment. In the DSS environment response time requirements are relaxed. One minute up to 24

hours is the expectation for performance in the DSS environment. The issue of performance is especially relaxed for the data warehouse, where there is an abundance of data and where there is a

lot of exploration occurring.

Performance in the data mart is somewhat different from the data warehouse for two reasons - the

data mart houses mostly farmers, and there is much less data in the data mart environment than in

the data warehouse environment. Where the designer is working with farmers, there can be an

anticipation of requirements. And where there can be an anticipation of requirements, reasonable

performance objectives can be attained. The star join is one way in which the needs of farmers in the

data mart environment can be accommodated.

Figure 18 illustrates the need for performance in the data mart environment.


Figure 18: The expectations for performance are high in the data mart environment. Given the small amounts of data and the level of user, it is easy to see why the end user expects good performance.

When the data mart is md dbms, then there is a different expectation for performance. As long as the

analysis is being done within the boundaries of that which has been defined to the md dbms, and as

long as the md dbms has not been overburdened with too much data, then there is the expectation of

good performance.

Performance is achieved in several ways in the data mart environment, such as:

• making extensive use of indexes,

• using star joins,

• limiting the volume of data that is found in the data mart,

• creating arrays of data,

• creating profile records (i.e., aggregate records), and

• creating pre-joined tables, etc.
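As an illustration of one technique from the list, a profile (aggregate) record can be built once at load time so that queries read a single record per customer instead of scanning every transaction. The transaction layout and the function name `build_profiles` are assumptions made for this sketch.

```python
from collections import defaultdict

def build_profiles(transactions):
    """Condense many detail transactions into one profile record per customer."""
    profiles = defaultdict(lambda: {"count": 0, "total": 0.0, "last_date": ""})
    for customer, tx_date, amount in transactions:
        profile = profiles[customer]
        profile["count"] += 1
        profile["total"] += amount
        # ISO-formatted date strings compare correctly as plain text.
        profile["last_date"] = max(profile["last_date"], tx_date)
    return dict(profiles)

# Hypothetical detail transactions: (customer, date, amount)
transactions = [
    ("acme", "1997-01-05", 100.0),
    ("acme", "1997-03-17", 50.0),
    ("zenith", "1997-02-01", 75.0),
]
profiles = build_profiles(transactions)
```

The trade-off is the usual one for pre-aggregation: answers to the anticipated questions are fast, but detail that was not carried into the profile cannot be recovered from it.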

Monitoring the Data Mart Environment

Like the data warehouse environment, the data mart environment requires periodic monitoring.

Figure 19 shows this requirement.


Figure 19: activity monitor and data monitor on the data mart, with current level detail (data warehouse). Monitoring the data mart environment is mandatory when the data mart grows to a large enough size.


The two types of monitoring that are required are data usage tracking and data content tracking. The

data usage tracker looks at the types of requests the DSS analyst is submitting. The data usage tracker

outlines such things as:

• what data is being accessed,

• what users are active against the data mart,

• what response time is being achieved,

• how much data is being requested, and

• what are the busiest times of the day? of the week? of the month? and so forth.
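The usage questions above can be answered by wrapping query execution in a thin logging layer. The sketch below is one possible shape, assuming Python's sqlite3 module as a stand-in for the data mart DBMS; the UsageTracker and UsageRecord classes, their method names, and the sample sales table are hypothetical. It records the SQL text as a rough proxy for what data is being accessed, rather than parsing out table names.

```python
import sqlite3
import time
from collections import Counter
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class UsageRecord:
    user: str          # who submitted the request
    sql: str           # what was asked (proxy for the data accessed)
    rows: int          # how much data came back
    seconds: float     # response time achieved
    started: datetime  # when the request was submitted


@dataclass
class UsageTracker:
    records: list = field(default_factory=list)

    def run(self, conn, user, sql, params=()):
        """Execute a query and log one usage record for it."""
        started = datetime.now()
        t0 = time.perf_counter()
        rows = conn.execute(sql, params).fetchall()
        elapsed = time.perf_counter() - t0
        self.records.append(UsageRecord(user, sql, len(rows), elapsed, started))
        return rows

    def active_users(self):
        """Number of requests per user."""
        return Counter(r.user for r in self.records)

    def rows_requested(self):
        """Total number of rows returned across all requests."""
        return sum(r.rows for r in self.records)

    def busiest_hours(self):
        """Hours of the day ordered from most to least active."""
        return Counter(r.started.hour for r in self.records).most_common()


# Hypothetical sample data mart table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO sales (amount) VALUES (?)", [(100.0,), (50.0,), (25.0,)])

tracker = UsageTracker()
tracker.run(conn, "ann", "SELECT * FROM sales")
tracker.run(conn, "bob", "SELECT * FROM sales WHERE amount > ?", (60,))
tracker.run(conn, "ann", "SELECT COUNT(*) FROM sales")

print(tracker.active_users())
print(tracker.rows_requested())
```

Grouping the same records by weekday or by month would answer the remaining "busiest time" questions in the same way.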

The data content tracker looks at other aspects of the data mart. The data content tracker addresses

such things as:

• what are the actual contents of the data mart,

• is there bad data in the data mart,

• how much is the data mart growing, and

• what is the most effective way to access data in the data mart, and so forth.
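A content tracker can start out as a few catalog queries run on a schedule. The following sketch again assumes sqlite3 as a stand-in DBMS. The function names (content_snapshot, growth, bad_rows), the sample sales table, and the rule that a NULL or negative amount counts as bad data are assumptions chosen for illustration. It covers contents, bad data, and growth; the question of the most effective access path is not addressed.

```python
import sqlite3


def table_names(conn):
    """List the user tables recorded in the SQLite catalog."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    ).fetchall()
    return [r[0] for r in rows]


def content_snapshot(conn):
    """Row count per table: a coarse picture of the actual contents."""
    # Table names come from the catalog, not from user input.
    return {
        t: conn.execute(f'SELECT COUNT(*) FROM "{t}"').fetchone()[0]
        for t in table_names(conn)
    }


def growth(before, after):
    """Change in row count per table between two snapshots."""
    return {t: after[t] - before.get(t, 0) for t in after}


def bad_rows(conn, table, rule):
    """Count rows matching a 'bad data' predicate written by the administrator."""
    return conn.execute(f'SELECT COUNT(*) FROM "{table}" WHERE {rule}').fetchone()[0]


# Hypothetical sample data, including one NULL and one negative amount.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, amount REAL)")
conn.executemany("INSERT INTO sales (amount) VALUES (?)", [(100.0,), (None,), (-5.0,)])

before = content_snapshot(conn)
conn.executemany("INSERT INTO sales (amount) VALUES (?)", [(20.0,), (30.0,)])
after = content_snapshot(conn)

print(growth(before, after))
print(bad_rows(conn, "sales", "amount IS NULL OR amount < 0"))
```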

The monitors for the data mart grow in importance in relation to the volume of data in the data mart

and the amount of query activity that passes through the mart. When the data mart is new and small

there may be no need for monitoring the environment. But when the data mart grows in volume of

data and/or in usage of the data, then a monitor facility makes sense.

Security in the Data Mart Environment

The data mart needs to be secure, just as other components of the DSS environment need security.

Figure 20 shows this need.

Figure 20: security around the data mart, with current level detail (data warehouse). For some data marts, security is a large issue. For other data marts, security is no problem.

When the data mart contains sensitive information, there is a need to secure that information. Typical

sensitive information might include:

• financial information,

• medical record information, and

• human resource information, etc.

The data mart administrator needs to ask - what are the consequences of the information contained

in the data mart being misused? If there are no consequences, then there is no need for security. If,

however, there is a downside for the misuse of information then the data mart needs security.


There are many different forms of security. The level of security that is needed is dependent on the

sensitivity of the data. If the security exposure is not large, then a casual approach to security may

suffice. If the security exposure is large, then a much more rigorous approach to security will be

required. Some of the approaches to security in the data mart environment include:

• firewalls,

• logon/logoff security,

• application based security,

• dbms security (i.e., VIEW based security), and

• encryption/decryption.

As a rule, the greater the degree of security required, the more expensive and complex the

infrastructure that will be required.
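VIEW based security, one of the approaches listed above, can be illustrated with a view that leaves out a sensitive column. This is a sketch under stated assumptions: SQLite, used here through Python's sqlite3 module, has no user accounts or GRANT statement, so a small role-to-object table in the application (ROLE_OBJECTS, a hypothetical name) stands in for the permissions a server DBMS would enforce. The employee table, the employee_public view, and the role names are also made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
-- Hypothetical human resource data; salary is the sensitive column.
CREATE TABLE employee (
    emp_id INTEGER PRIMARY KEY,
    name   TEXT,
    dept   TEXT,
    salary REAL
);
INSERT INTO employee VALUES
    (1, 'Ann', 'sales', 90000.0),
    (2, 'Bob', 'sales', 70000.0),
    (3, 'Cy',  'hr',    65000.0);

-- The view exposes everything except the sensitive column.
CREATE VIEW employee_public AS
    SELECT emp_id, name, dept FROM employee;
""")

# Stand-in for DBMS permissions: which object each role may read.
ROLE_OBJECTS = {"analyst": "employee_public", "hr_admin": "employee"}


def read_employees(role):
    """Return employee rows as dicts, limited to what the role may see."""
    obj = ROLE_OBJECTS.get(role)
    if obj is None:
        raise PermissionError(f"role {role!r} has no access to employee data")
    cur = conn.execute(f"SELECT * FROM {obj} ORDER BY emp_id")
    cols = [d[0] for d in cur.description]
    return [dict(zip(cols, row)) for row in cur.fetchall()]


print(read_employees("analyst")[0])
print(read_employees("hr_admin")[0])
```

In a DBMS with user accounts, the same effect would normally be enforced by the database itself, by granting most users access to the view only.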

Summary

The data mart is a powerful and natural extension of the data warehouse. The data mart extends DSS

to the departmental environment. The data warehouse provides the granular data and the different

data marts interpret and structure that granular data to suit their needs.

The appropriate source for the data mart is the data warehouse. Under no circumstances is the

operational environment an appropriate source for the data mart. The data mart can contain external

data.

There are different types of data marts - md dbms data marts and ROLAP data marts. Each of the

types of data mart plays a different role. Metadata is an integral part of the data mart environment.

When the data mart contains data that is repetitively and predictably accessed, then a data model

makes sense. But if a data mart is small and contains ad hoc data then a data model does not make

sense.

The software found at the data mart includes:

• dbms,

• access and analysis,

• automatic interface generation,

• system management,

• purge / archival, and

• metadata management, etc.

Metadata is an integral part of the data mart environment. Among other things, metadata allows the

different data marts to achieve a degree of cohesiveness. In addition, the metadata allows the end

user to efficiently access data in the data mart.

The data structures found in the data mart include star joins and normalized data, emanating from

the data model.