DATA MART www.notesvillage.com

Upload: philippa-carson

Post on 27-Dec-2015


Page 1: DATA MART . In some data warehouse implementations, a data mart is a miniature data warehouse; In others, it is just one segment of

DATA MART

www.notesvillage.com


• In some data warehouse implementations, a data mart is a miniature data warehouse;

• In others, it is just one segment of the data warehouse.

• Data marts are often used to provide information to functional segments of the organization.


When Is a Data Mart Appropriate

• Data marts are sometimes designed as complete individual data warehouses and contribute to the overall organization as members of a distributed data warehouse.

• In other designs, data marts receive data from a master data warehouse through periodic updates, in which case the data mart functionality is often limited to presentation services for clients.

• Data marts are created for the following reasons:
– To speed up queries by reducing the volume of data scanned
– To structure data for a user access tool
– To partition data in order to impose access control strategies
– To segment data across different hardware platforms
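The periodic-update design described above can be sketched in a few lines. This is only an illustration under stated assumptions: the table `warehouse_sales`, the `mart_<department>` naming, and the helper `refresh_mart` are hypothetical, and an in-memory SQLite database stands in for the master data warehouse.

```python
import sqlite3

# Hypothetical master warehouse table; every name here is illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE warehouse_sales ("
    "sale_id INTEGER PRIMARY KEY, department TEXT, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO warehouse_sales (department, region, amount) VALUES (?, ?, ?)",
    [
        ("grocery", "north", 120.0),
        ("grocery", "south", 80.0),
        ("clothing", "north", 200.0),
        ("clothing", "south", 50.0),
    ],
)


def refresh_mart(conn, department):
    """Rebuild a small mart table that holds only one department's rows."""
    mart = f"mart_{department}"  # department comes from trusted code, not user input
    conn.execute(f"DROP TABLE IF EXISTS {mart}")
    conn.execute(f"CREATE TABLE {mart} (sale_id INTEGER, region TEXT, amount REAL)")
    conn.execute(
        f"INSERT INTO {mart} "
        "SELECT sale_id, region, amount FROM warehouse_sales WHERE department = ?",
        (department,),
    )
    return mart


# One periodic update: queries against the mart scan fewer rows than the warehouse.
mart_name = refresh_mart(conn, "grocery")
mart_rows = conn.execute(f"SELECT COUNT(*) FROM {mart_name}").fetchone()[0]
warehouse_rows = conn.execute("SELECT COUNT(*) FROM warehouse_sales").fetchone()[0]
print(mart_rows, warehouse_rows)
```

Because the mart keeps the same column names and types as the warehouse table, reports written against either one stay comparable.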


DESIGN OF DATA MART

• Regardless of the functionality provided by data marts, they must be designed as components of the master data warehouse so that data organization, format, and schemas are consistent throughout the data warehouse.

• Inconsistent table designs, update mechanisms, or dimension hierarchies can prevent data from being reused throughout the data warehouse, and they can result in inconsistent reports from the same data.

• Example:
– It is unlikely that summary reports produced from a finance department data mart that organizes the sales force by management reporting structure will agree with summary reports produced from a sales department data mart that organizes the same sales force by geographical region.
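A small sketch makes the example concrete. The sales figures, representative names, and the `summarize` helper are invented for illustration; the point is only that the same rows, rolled up along two different hierarchies, produce summaries that cannot be compared line by line.

```python
from collections import defaultdict

# Hypothetical sales force: each representative has one manager and one region.
sales = [
    {"rep": "Ann", "manager": "M1", "region": "East", "amount": 100},
    {"rep": "Bob", "manager": "M1", "region": "West", "amount": 150},
    {"rep": "Cy", "manager": "M2", "region": "East", "amount": 200},
    {"rep": "Di", "manager": "M2", "region": "West", "amount": 50},
]


def summarize(rows, key):
    """Total the sales amount for each value of the chosen hierarchy level."""
    totals = defaultdict(int)
    for row in rows:
        totals[row[key]] += row["amount"]
    return dict(totals)


by_manager = summarize(sales, "manager")  # finance mart: management structure
by_region = summarize(sales, "region")    # sales mart: geographical region

print(by_manager)
print(by_region)
```

The grand totals agree, but no line of one report matches a line of the other, which is why shared dimension hierarchies matter.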

Before designing a data mart, we must confirm that a data mart is the appropriate solution:

– Identify whether there is a natural functional split within the organization.
– Identify whether there is a natural split of data.
– Design data marts from the perspective that they are components of the data warehouse, regardless of their individual functionality or construction. This provides consistency and usability of information throughout the organization.


IDENTIFY FUNCTIONAL SPLIT

• We must check whether the split will benefit the organisation.

• Example:
– Consider retail sales in an organisation in which a merchant is responsible for sales. Their brief could be to maximize sales by ensuring adequate sales.

• In practice, the following information would be of value:
– Sales transactions at a daily level, to monitor actual sales
– Sales forecasts on a weekly basis
– Stock position on a daily basis
– Stock movements on a daily basis


Importance of Data Mart

• Easy access to frequently needed data
• Creates a collective view for a group of users
• Improves end-user response time
• Ease of creation
• Lower cost than implementing a full data warehouse
• Potential users are more clearly defined than in a full data warehouse


META DATA

• Metadata is loosely defined as data about data.

• Metadata is a concept that applies mainly to electronically archived or presented data. It is used to describe the
– a) definition,
– b) structure, and
– c) administration
of data files with all contents in context, to ease the use of the captured and archived data for further use.

– Example: a web page may include metadata specifying what language it is written in, what tools were used to create it, where to go for more on the subject, and so on.


What is Meta data

• Metadata (meta data, or sometimes metainformation) is "data about other data", of any sort in any media. An item of metadata may describe an individual datum, or content item, or a collection of data including multiple content items and hierarchical levels, such as a database schema. In data processing, metadata provides information about, or documentation of, other data managed within an application or environment. This commonly defines the structure or schema of the primary data.

– Metadata would document data about data elements or attributes (name, size, data type, etc.), data about records or data structures (length, fields, columns, etc.), and data about data (where it is located, how it is associated, ownership, etc.). Metadata may include descriptive information about the context, quality and condition, or characteristics of the data. It may be recorded with high or low granularity.

Definition: Metadata contains information about that data or other data.

Metadata is structured, encoded data that describe characteristics of information-bearing entities to aid in the identification, discovery, assessment, and management of the described entities.
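Most database engines already keep this kind of element-level metadata in a catalog. As a hedged illustration, the sketch below reads the column names, types, and key flags of a hypothetical `product` table from SQLite using its `PRAGMA table_info` statement; the table and its columns are assumptions made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table; the names are illustrative only.
conn.execute(
    "CREATE TABLE product ("
    "product_id INTEGER PRIMARY KEY, name TEXT NOT NULL, unit_cost REAL)"
)

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [
    {
        "name": row[1],
        "type": row[2],
        "not_null": bool(row[3]),
        "primary_key": bool(row[5]),
    }
    for row in conn.execute("PRAGMA table_info(product)")
]

for column in columns:
    print(column)
```

None of the rows printed here is business data; each one describes a data element, which is exactly the "data about data" the definition refers to.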


Why Metadata is important

• Assume that the project team has successfully completed the development of the first data mart. The user may still have several questions in mind:
– Are there predefined queries I can look at?
– What are the various elements in the data warehouse?
– Is there information about unit sales and unit costs by product?
– How can I browse and see what is available?
– Where did the data for the data warehouse come from? From which source system?
– How old is the data in the data warehouse?
– When was the last time fresh data was brought in?
– Are there summaries by month and product?


• In data warehousing terms, metadata can be described as:
– Data about data
– Table of contents for the data
– Catalog for the data
– Data warehouse roadmap
– Data warehouse directory


Applications of Metadata

• Libraries
– Metadata has been used in various forms as a means of cataloging archived information.

• Photographs
– Metadata may be written into a digital photo file to identify who owns it, copyright and contact information, what camera created the file, along with exposure information and descriptive information such as keywords about the photo, making the file searchable on the computer and/or the Internet.

• Web pages
– Web pages often include metadata in the form of meta tags. Description and keywords meta tags are commonly used to describe the Web page's content. Most search engines use this data when adding pages to their search index.
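To make the web page case concrete, the following sketch pulls the description and keywords meta tags out of a page using Python's standard `html.parser` module. The sample page and the `MetaTagParser` class are hypothetical and exist only for this illustration.

```python
from html.parser import HTMLParser


class MetaTagParser(HTMLParser):
    """Collect <meta name="..." content="..."> pairs from an HTML document."""

    def __init__(self):
        super().__init__()
        self.meta = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = attributes.get("name")
        if name and "content" in attributes:
            self.meta[name.lower()] = attributes["content"]


# Hypothetical page used only for this example.
page = """<html><head>
<meta name="description" content="Notes on data marts and metadata">
<meta name="keywords" content="data mart, metadata, data warehouse">
</head><body></body></html>"""

parser = MetaTagParser()
parser.feed(page)
print(parser.meta)
```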


Critical Need for Metadata in the Data Warehouse

• Metadata is an absolute necessity in building a data warehouse:

– For using the data warehouse:
  • To run ad hoc queries and format reports, users need to know about the data in the data warehouse.
  • Users should gain the maximum from the data warehouse, and ignorance of the data should not lead them to wrong conclusions.

– For building the data warehouse:
  • For data extraction, we must know the source systems.
  • Source structures and content help in determining the mapping.
  • The DBA needs metadata for physical loading and staging.

– For data administration:
  • Data administration is not possible without knowing the metadata.
  • Metadata is absolutely necessary for building the data warehouse.


Data warehouse Metadata

• Metadata systems in a data warehouse are sometimes separated into two sections:
1. Back room metadata, used for extract, transform, load (ETL) functions to get OLTP data into the data warehouse
2. Front room metadata, used to label screens and create reports


Business Intelligence metadata

• Business Intelligence is the process of analyzing large amounts of corporate data, usually stored in large databases such as a Data Warehouse, tracking business performance, detecting patterns and trends, and helping enterprise business users make better decisions.

• Business Intelligence metadata describes how data is queried, filtered, analyzed, and displayed in Business Intelligence software tools, such as reporting tools, OLAP tools, and data mining tools.

– Examples:
  • Data Mining metadata: the descriptions and structures of Data Sets, Algorithms, Queries
  • OLAP metadata: the descriptions and structures of Dimensions, Cubes, Measures (Metrics), Hierarchies, Levels, Drill Paths
  • Reporting metadata: the descriptions and structures of Reports, Charts, Queries, Data Sets, Filters, Variables, Expressions


Building the data warehouse

• To extract data for the data warehouse, the programmer needs to know:
– The source system and its data structures
– The data content
– How to handle the data

• The DBA needs metadata for:
– Incremental loading
– Last compared data
– Populating tables


Administering the Data Warehouse

• How to add a new summary table
• How to expand storage
• How to add information delivery to the users
• When to schedule backups
• How to maintain the security system
• How to keep data definitions up to date
• How to verify external data on an ongoing basis


Metadata used for Transformation and Load

• Metadata may be used during data transformation and load to describe the source data and any changes made to it.

• The greater the difference in the source, the greater the requirement for metadata.

• An advantage of storing metadata is that any transformation change that takes place as the source data changes can be captured in the metadata.

• For source data, the following information is required:
– Source field (needs to be uniquely identified)
  • Unique identifier
  • Name
  • Type
  • Location
    – System
    – Object
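One possible way to hold the source field attributes listed above is a small record type. The sketch below is an assumption-laden illustration, not a prescribed format: the class names `SourceField` and `Transformation`, the identifier `SRC-0001`, the system `orders_oltp`, and the rule itself are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SourceField:
    """Metadata that uniquely identifies one field in a source system."""
    unique_id: str
    name: str
    data_type: str
    system: str       # location: the source system
    object_name: str  # location: the table or file within that system


@dataclass(frozen=True)
class Transformation:
    """Records which source field feeds a target column and under which rule."""
    source: SourceField
    target_column: str
    rule: str


# Hypothetical source field and mapping.
cust_name = SourceField("SRC-0001", "CUST_NM", "CHAR(40)", "orders_oltp", "CUSTOMER")
mapping = Transformation(cust_name, "customer_name", "trim and convert to title case")


def apply_rule(value: str) -> str:
    """Apply the rule described in the mapping above to one value."""
    return value.strip().title()


print(mapping)
print(apply_rule("  jane DOE "))
```

If the source field later changes type or moves to another system, only the metadata record needs updating for the change to be visible to everyone who reads the mapping.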


Data management

• Metadata is required to describe the data as it resides in the data warehouse.
• This is needed by the warehouse manager to track and control all data movement.
• Metadata is needed for all of the following:
– Tables
  • Columns
    – Name
    – Type
– Indexes
  • Columns
    – Name
    – Type
– Views
  • Columns
    – Name
    – Type
– Constraints (name, type, tables)


Data management

• For each table, the following information is stored:
– Table name (should be the name in the data dictionary)
– Columns
  • Column name
  • Reference identifier

• Aggregations are stored in the same way as tables, with the aggregation name and columns.

• Similarly, partitions need information such as the partition key and the data range each partition holds within the table.
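As a rough sketch of what such a catalog of tables, indexes, and views looks like in practice, the example below reads SQLite's built-in `sqlite_master` table and `PRAGMA index_info`. The objects `sales_fact`, `idx_sales_month`, and `monthly_sales` are hypothetical; a real warehouse manager would keep its own, richer catalog, including aggregation and partition details.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical warehouse objects: one table, one index, one summary view.
conn.executescript(
    """
    CREATE TABLE sales_fact (
        sale_id INTEGER PRIMARY KEY, product_id INTEGER,
        sale_month TEXT, amount REAL);
    CREATE INDEX idx_sales_month ON sales_fact (sale_month);
    CREATE VIEW monthly_sales AS
        SELECT sale_month, SUM(amount) AS total
        FROM sales_fact GROUP BY sale_month;
    """
)

# sqlite_master holds one row per schema object (type, name, owning table).
catalog = {}
for obj_type, name, table_name in conn.execute(
    "SELECT type, name, tbl_name FROM sqlite_master ORDER BY type, name"
):
    catalog.setdefault(obj_type, []).append({"name": name, "table": table_name})

# PRAGMA index_info lists the columns of an index: (seqno, cid, name).
index_columns = [row[2] for row in conn.execute("PRAGMA index_info(idx_sales_month)")]

print(catalog)
print(index_columns)
```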


Data ETL

• How to handle data changes
• How to include new sources
• Where to cleanse the data
• How to change the data cleansing method
• How to switch to a new data transformation technique
• How to add a new external data source
• How to drop an external data source
• How merging and acquisition take place

Data Warehouse

• How to add a new summary table
• How to expand storage
• How to add new information tools for users
• How to continue ongoing training
• How to improve ad hoc queries
• When to schedule backups
• How to maintain security systems
• How to monitor load distribution


Why Metadata Is Vital for End Users

• Metadata helps the user understand the complexity of the data and how it should be transformed into information.

• When a business analyst in a company analyses the reasons for a loss or profit, the analyst asks the following questions:
– Are the sales stored as individual transactions or as summary totals?
– Can sales be analyzed by product, promotion, store, and month?
– Can the current month's sales be compared with the previous month's sales?
– Where do the sales figures come from, and what is the source system?
– How old is the sales data, and how does it get updated?

• If the analyst is not sure of the data, the analysis cannot be sound.
• An analyst is best served by a complete road map of the metadata.


Metadata Vital for End users

• Data content
• Summary data
• Business dimensions
• Business metrics
• Navigation paths
• Source systems
• External data
• Last update data
• Report formats
• OLAP data


Who Needs Metadata

Information discovery
– IT professionals: databases, tables, columns, server
– Power users: databases, tables, columns
– Casual users: queries, reports

Meaning of data
– IT professionals: data structures, data definitions, cleansing functions
– Power users: cleansing functions, transformation rules
– Casual users: data owners, filters

Information access
– IT professionals: SQL, 3GL, 4GL
– Power users: query tools
– Casual users: authorization requests, information retrieval


Query Generation

• Metadata is required by the query manager to enable it to generate queries.

• The query manager generates metadata about the queries it has run.

• This metadata can be used to build a history of all queries run and to generate a query profile.


Query Generation

• The metadata required for each query is:
– Query
  • Tables accessed
    – Columns accessed
      » Name
      » Reference identifier
  • Restrictions applied
    – Column name
    – Table name
    – Reference identifier
    – Restriction
  • Join criteria applied
    – Column name
    – Table name
    – Reference identifier
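The per-query metadata listed above, and the query history and profile built from it, might be kept along the following lines. This is a minimal sketch under assumptions: `QueryRecord`, `record_query`, and the table and column names are hypothetical, and a real query manager would capture these details automatically rather than by hand.

```python
from collections import Counter
from dataclasses import dataclass, field


@dataclass
class QueryRecord:
    """Metadata kept for one executed query."""
    tables: list
    columns: list
    restrictions: list = field(default_factory=list)
    joins: list = field(default_factory=list)


history = []  # the query history, oldest entry first


def record_query(tables, columns, restrictions=(), joins=()):
    """Append the metadata of one query to the history and return the record."""
    record = QueryRecord(list(tables), list(columns), list(restrictions), list(joins))
    history.append(record)
    return record


# Two hypothetical queries as a query manager might log them.
record_query(
    ["sales_fact"],
    ["sales_fact.amount"],
    restrictions=["sales_fact.sale_month = '2015-12'"],
)
record_query(
    ["sales_fact", "product"],
    ["product.name", "sales_fact.amount"],
    joins=["sales_fact.product_id = product.product_id"],
)

# A simple query profile: how often each table is accessed.
table_profile = Counter(table for rec in history for table in rec.tables)
print(table_profile.most_common())
```

A profile like this can show which tables are accessed most often and which join criteria recur, for instance when deciding where a new summary table would help.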


Why Metadata is essential for IT

• From data extraction through to information delivery, metadata is crucial.
• IT needs metadata about the following in order to process data:
– Source data structures
– Source platforms
– Data extraction methods
– External data
– Data transformation rules
– Data cleansing rules
– Staging area structures
– Dimensional models
– OLAP systems
– Query/report design


Automation of datawarehouse tasks

• Tools perform the major functions of the data warehouse.
• Tools enable data movement, transformation, and so on.
• When designing the data warehouse, we must arrange from the beginning for a tool to handle metadata.
• In back-end processes, each tool records its own metadata for:
– Source data structure definition
– Data extraction
– Initial reformatting/merging
– Preliminary data cleansing
– Data transformation
– Validation
– Data warehouse structure definition
– Load merge creation


Classification of Metadata types

• Classification of metadata types by functional area:
– Data acquisition
– Data storage
– Information delivery


• Acquisition processes:
– Data extraction
– Data transformation
– Data cleansing
– Data integration
– Data staging

• Metadata types:
– Source system platforms
– Source structure definitions
– Data extraction methods
– Data transformation rules
– Data cleansing rules
– External data structures
– External data definitions
– Summarization rules
– Target physical and logical models


Data Storage

• The metadata recorded by the processes in the data storage area is used for development, for administration, and by users.

• Users would like to see when data was last loaded.

• The DBA will use the metadata for backup processes and incremental loads.
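Both uses can be served by a simple load log. The sketch below is illustrative only: the log layout, the `high_water_mark` field, and the function names are assumptions, and a real warehouse would persist this metadata rather than keep it in a list.

```python
from datetime import datetime

load_log = []  # one entry per completed load


def record_load(table, loaded_at, rows_loaded, high_water_mark):
    """Store metadata about one completed load."""
    load_log.append(
        {
            "table": table,
            "loaded_at": loaded_at,
            "rows_loaded": rows_loaded,
            "high_water_mark": high_water_mark,  # highest source id loaded so far
        }
    )


def last_load(table):
    """Answer the user's question: when was this table last loaded?"""
    entries = [entry for entry in load_log if entry["table"] == table]
    return max(entries, key=lambda entry: entry["loaded_at"]) if entries else None


def rows_to_load(source_rows, table):
    """Incremental load: pick only source rows newer than the last load."""
    previous = last_load(table)
    mark = previous["high_water_mark"] if previous else 0
    return [row for row in source_rows if row["id"] > mark]


# Hypothetical source rows and two nightly loads.
source_rows = [{"id": i, "amount": 10 * i} for i in range(1, 6)]
record_load("sales_fact", datetime(2015, 12, 26, 2, 0), rows_loaded=3, high_water_mark=3)
pending = rows_to_load(source_rows, "sales_fact")
record_load("sales_fact", datetime(2015, 12, 27, 2, 0), rows_loaded=len(pending), high_water_mark=5)

print([row["id"] for row in pending])
print(last_load("sales_fact")["loaded_at"])
```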


Information Delivery

• Information delivery processes:
– Report generation
– Query processing
– Complex analysis

• Metadata types:
– Source systems
– Source data definitions
– Data extraction tools
– Query templates
– Preformatted reports
– OLAP content


Technical Metadata

• Technical metadata:
– Data about the processes, the tool sets, the repositories, and the physical layers of data under the covers
– Data about run times, performance averages, table structures, indexes, and constraints
– Data about relationships, sources and targets, up-time, system failure ratios, system resource utilization ratios, and performance numbers


Technical Metadata

• Questions that technical metadata can answer:
– What databases and tables exist?
– What are the columns for each table?
– What are the keys and indexes?
– What are the physical files?
– What are the load and refresh schedules?
– What types of aggregations are available?
– What is the source-to-target mapping in the data warehouse?


Business Metadata

• Business metadata is better understood by looking at a list of examples:
– Source systems
– Source-to-target mapping
– Data transformation business rules
– Data transformation
– Attributes and business definitions
– Query and reporting tools
– Predefined tools
– Predefined reports
– Report distribution information
– Currency of OLAP reports
– Rules for analysis using OLAP reports


Behaviour of Business Metadata

• How can I sign on to the metadata?
• Which parts of the data warehouse can I access?
• Which definitions do I need for my query?
• What types of aggregation are available for my metrics?
• How old is the OLAP data? Should I wait for the next update?
• Beneficiaries:
– Managers
– Business analysts
– Regular users


Business Metadata: In IT, business metadata is additional text or statements around a particular term that add value to the data. Business metadata is about creating definitions and business rules. For example, when tables and columns are created, the following business metadata is useful for generating reports for functional and technical teams. The advantage of this business metadata is that everybody, technical or non-technical, can understand what is going on within the organization.

Table's Metadata: When creating a table, metadata for the definition of the table, source system name, source entity names, business rules to transform the source table, and the usage of the table in reports should be added so that it is available for metadata reports.

Column's Metadata: Similarly, for columns, the source column name (mapping), business rules to transform the source column, and the usage of the column in reports should be added for metadata reports.
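A minimal sketch of how the table and column metadata described above could be stored and turned into a metadata report follows. Every value in it (the `customer_dim` table, the `orders_oltp` source system, the rules, and the report name) is hypothetical, and the dictionary layout is just one possible choice.

```python
# Hypothetical business metadata for one warehouse table and one of its columns.
table_metadata = {
    "table": "customer_dim",
    "definition": "One row per customer who has placed at least one order",
    "source_system": "orders_oltp",
    "source_entities": ["CUSTOMER"],
    "business_rules": "Merge duplicate customers that share a tax number",
    "used_in_reports": ["Monthly sales by customer"],
    "columns": [
        {
            "name": "customer_name",
            "source_column": "CUST_NM",
            "business_rules": "Trim spaces and convert to title case",
            "used_in_reports": ["Monthly sales by customer"],
        }
    ],
}


def metadata_report(meta):
    """Return a plain-text metadata report for one table."""
    lines = [
        f"Table: {meta['table']}",
        f"  Definition: {meta['definition']}",
        f"  Source: {meta['source_system']} ({', '.join(meta['source_entities'])})",
        f"  Business rules: {meta['business_rules']}",
        f"  Used in: {', '.join(meta['used_in_reports'])}",
    ]
    for col in meta["columns"]:
        lines.append(f"  Column: {col['name']} <- {col['source_column']}")
        lines.append(f"    Business rules: {col['business_rules']}")
        lines.append(f"    Used in: {', '.join(col['used_in_reports'])}")
    return "\n".join(lines)


report = metadata_report(table_metadata)
print(report)
```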


Business rules in the data warehouse

• In the course of designing and populating a data warehouse, some key questions must be answered about the data being incorporated in the warehouse. More often than not, many of these answers are not known at the outset of the project, but must be established if the data warehouse is to succeed. Interestingly, these for the most part represent the same contextual information about the data that business users of the warehouse will need to know to be able to fully understand the information provided, and to trust in its reliability. The questions include:

– What are the valid values for the attributes of the data warehouse?
– What are the valid data sources for the data warehouse?
– At what point in the data's life cycle in the operational world should it be captured and sent to the data warehouse?
– What are the "cleansing rules" for the source data?
– What are the transformation rules to move the source data to the target database?
– How was the data calculated in the operational database?


Example of business metadata


Difference between Technical metadata and business metadata

• Metadata is divided into technical metadata (the tool-specific metadata used by IT and vendors) and business metadata (what a businessperson needs to know about what the data represents).

• The technology person thinks about a data column - how it's defined in a database, represented in a data model, mapped and transformed in the ETL tool and defined in the BI report. All of this, however, is very much related to how the tools store and process the data. The primary challenge is gathering and integrating the metadata across tools.

• The businessperson thinks about where the data came from, its associated data quality level, how it was filtered from its source and what types of business rules and algorithms were applied to it. Most of this metadata is either not stored in the tools or needs some serious translation from technical terms to business language.


METADATA MANAGEMENT

• The requirements for metadata management are:

– Capturing and storing business
  • Algorithms and methodology change when data is stored over several years.
  • Versioning must be maintained.

– Variety of metadata sources
  • Metadata is available from different sources.

– Metadata integration
  • Metadata must be unified and merged to be meaningful to the end user.

– Metadata standardization
  • All metadata should be stored in the same manner.

– Rippling through revisions
  • Revisions will occur as business rules change.

– Metadata exchange
  • End users should be able to exchange one set of metadata for another.

– Support for end users
  • Metadata must provide simple graphical and tabular representations to make it easy to browse through.


Challenges

• Major challenges for metadata management are:
– Each software tool has its own proprietary metadata. If we are using several tools, how can we reconcile them?
– No industry-wide accepted standards exist for metadata formats.
– Preserving version control of metadata uniformly across the data warehouse is very difficult.
– Unifying data sources is very difficult, since we have to deal with conflicting standards, formats, data naming conventions, units, and measures.


META DATA REPOSITORY

• A metadata repository may be thought of as two distinct categories of information:
– Technical metadata
– Business metadata
