· • oracle database 11g new features for data warehouse and business intelligence • wrap-up....

53

Upload: others

Post on 08-Jul-2020

2 views

Category:

Documents


0 download

TRANSCRIPT

<Insert Picture Here>

Building Scalable and High-Performance Data WarehouseJungok NahPrincipal Consultant

Agenda

• Today’s Information System• Oracle Database for Data Warehousing• Oracle Database 11g New Features for Data Warehouse

and Business Intelligence• Wrap-up

Agenda

• Today’s Information System• Oracle Database for Data Warehousing• Oracle Database 11g New Features for Data Warehouse

and Business Intelligence• Wrap-up

Trends of the Information System by period

1995 1996 1997 1998 1999 2000

2001 2002 2003 2004 2005 2006 / 2007

정보계구축

---> Data Warehouse 구축

콜센터

구축

영업지원

시스템

구축

eCRM

구축

고객

분석모델

개발

고객

세그멘테이션

고객

LTV모델

개발

DW 리모델링 / 고도화 / 재구축

통합

CRM 구축

분석/운영 CRM 연계

현업에서의

활용성보다,DW구축하기위한

프로젝트에

집중

고객

접점

채널별,최적화된

채널구축에집중

체계화된

데이터

구조를기반으로, 현업의

활용성을극대화한

DW체계를갖추기

위한

재구축

기개발된

고객접점채널을

통합하고, 이를

분석CRM과연동하는데

초점

R

T

E

데이터

축적

분석위주

E D W

인프라

실시간

구분 EIS/DSS BI/DW Next Generation

Main Focus Insight for small Single version of truth Single version of truth across trading partners

Audience Senior management Internal stakeholder All stakeholder

Stated Business Driver Drowning in paper Drowning in data Data is too latent for the decision- making process

Typical Data Volume gigabyte terabyte terabyte,exabyteData Organization Simple Star schema or MMDB Hybrid relational/Object

Data Quality Surfaced through EIS&DSS Baked in at Start of effort Across stakeholder and business unit mandatory

Data Quality Approach Passive Reactive Proactive

Early Signal Requiring Senior management buried in paper

Stovepipe or Island of BI applications

Disjointed portal, BPM,BAM, Predictive modeling, BI/DW applications &Operational systems

Data Refresh Rate Monthly Daily Near Real Time

Data Provisioning Simple scripted process E T LMerged ETL/EAI/EII/BPM/

Change Data Capture

Business Intelligence Reporting/OLAP Scorecard/DashboardPredictive analysis/Alert &

NotificationEducational Vehicle Specialized development team Center of excellence To be evolved

* Next Generation System (TDWI 2005. 10 )

Staging 5Active

Staging 4Operational

Staging 3Predictive

Staging 2Analytical

Staging 1Reporting

BATCH RIGHT-TIME NEAR-REAL-TIME REAL-TIMEWEEKS DAYS HOURS MINUTES SECONDS SUB-SECONDS

INACTIVE REACTIVE PROACTIVE AUTONOMOUSWHAT

HAPPENED ?WHAT DO ITHAPPEN ?

WHAT WILLHAPPEN ?

WHAT ISHAPPENING ?

WHAT DO I WANTTO HAPPEN ?

* DW Maturity Step

The Next Generation Information System

Source: TDWI

Source: TDWI

*Data Integration / *Consolidation / *Federation(고객 / 채널 ) + 통합방법 + 표준화체계*Data Integration / *Consolidation / *Federation(고객 / 채널 ) + 통합방법 + 표준화체계

데이터 통합 측면데이터 통합 측면

Data Warehousing / Architecture / Real-Time DWData Warehousing / Architecture / Real-Time DW정보제공 인프라 측면정보제공 인프라 측면

OLAP Architecture / Portal, Dashboard / 시스템연계OLAP Architecture / Portal, Dashboard / 시스템연계정보활용 측면정보활용 측면

C P M ( Corporate Performance Management )C P M ( Corporate Performance Management )성과분석 측면성과분석 측면

11

22

33

44

DW Maturity Model & ROI

Require Single View with Enterprise- wide integrated Data

휴대폰

자택전화 Fax

직장전화 Fax

전자우편 수신거부

자택주소

직장주소

채널경로 우수고객 발신자번호

고객명주민/사업자번호

찾기찾기지움지움

고객등급

계약건수

개인고객구분

직원여부고객만족도

고객사항 개인고객개인고객 법인고객법인고객 세대정보세대정보 가망고객가망고객 고객의소리고객의소리 e-서비스e-서비스 팩스발송팩스발송 메일발송메일발송

고객구분 고객분석

연락처 주소

계약사항 캠페인

접촉이력 지원/상담내역

접촉이력

계약/계좌

서비스

캠페인

소재지접촉

포인트

고객

정보

관계정보분석

정보

통합된데이터

고객

고객 고객

고객

Account

고객

영역

(기간업무,홈페이지…)

계약

정보(기간업무)

접촉이력(콜센터,홈페이지… )

캠페인관리

시스템

영업지원

시스템

분석영역(EDW,DM…)

휴대폰자택전

화Fax

직장전

화Fax

전자우

수신거

자택주

소직장주

채널경

로우수고

객발신자번

고객명주민/사업자

번호

찾기찾기지움지움고객등

급계약건

수개인고객구

직원여

부 고객만

족도

고객사

개인고객개인고객 법인고객법인고객 세대정보세대정보 가망고객가망고객 고객의소리고객의소리e-서비스e-서비스 팩스발송팩스발송 메일발송메일발송

고객구분 고객분석

연락처 주소

계약사항 캠페인

접촉이력 지원/상담내역

접촉이력

계약

서비스

캠페인

소재지접촉

포인트

고객정보관계정보

분석

정보

고객

+ 계좌(상품) + 분석

+ 접촉

+ 상담

+ 서비스

관계

연락처

고객

접촉이력 상담내역

고객

거래내역

반응정보

캠페인

통계정보

고객분석정보

영업서비스

서비스요청

실시간실시간 + + 배치배치 실시간실시간 + + 배치배치

데이터데이터

통합통합

=> => Data Integration ? Consolidation? Federation?Data Integration ? Consolidation? Federation?

Agenda

• Today’s Information System• Oracle Database for Data Warehousing• Oracle Database 11g New Features for Data Warehouse

and Business Intelligence• Wrap-up

Comprehensive Database Platform

• For business intelligence and data warehousing• Industry-leading performance and scalability• Deep integrated analytics• Embedded ELT and data quality

• All running on a low-cost enterprise grid• Reliable and secure• Easy to manage

Data Warehouse Product Portfolio

Industry-leading performance and scalabilityOracle DatabaseOracle PartitioningOracle Real Application Clusters

Deep integrated analytics Oracle OLAPOracle Data Mining

Embedded data quality and integrationOracle Warehouse Builder (OWB)OWB Enterprise ETLOWB Data QualityOWB Connectors

Data Warehousing Strategy

• Best database for BI/DW solutions• No other database can compare on the breadth and sophistication

of Oracle’s database features

• Complete solution portfolio• Complete database platform including ELT and Analytics• Oracle BI and Performance Management solutions• Broadest array of third-party technologies and solutions

• On the right hardware infrastructure• Offer customers a choice of solutions tailored to meet needs

Gateway

Mainframe Unix

[ 데이터 설계 ]Oracle Designer

orOther Case Tool

[ ETL관리 ]: Oracle Warehouse Builder[ DB 관리 ]: Oracle Enterprise Manager

[ 데이터 설계 ]Oracle Designer

orOther Case Tool

[ 데이터 변환 ]Merge(Upsert)

Multitable InsertTable Function

Parallel DDL/DML

[ 구축/운영

]3rd Normal Form

Multi-Dimensional ModelMaterialized View

Partition tableParallel ProcessingAnalytical Function

Compress

[ 데이터 파티셔닝 ]Partition table

Partition Exchange LoadingCompress

Staging/ODS

TargetDW

TransformationERP

System

[데이터추출 및 적재]SQL*Loader

Transportable TablespaceExternal table

Change Data Capture Streams

Materialized ViewOracle Gateway

Oracle DBMS DW Function Architecture

Key Features for VLDB

• 다양한

분할기법• 디스크

장애

해당

Partition만 영향

• 개별

Partition 단위

의 관리 기능(DML, Load, Import, Export, Exchange 등) 지원

• 사용자

질의시

해당

되는

파티션만

Access

• 로드

밸런싱

• 7*24 작업가능

파티션테이블/인덱스

파티션테이블/인덱스

• 블록내의

중복제거• 질의성능

향상• 스토리지

절약• 테이블

테이블스

페이스, 파티션레벨

에서

압축

지정가능• Materialized View도

압축

가능

데이터압축

데이터압축

• 질의성능향상

제를

위한

기능• 자주

쓰이는

유형의

데이터

Set을 미리

수행 한 후 테이블에

저장

• Query Rewrite기능• 주기적인

요약정보

갱신

가능

요약View

요약View

• Parallel Query• Parallel Load (with

SQL*Loader Utility)• Parallel DDL, DML• Parallel Recovery• Parallel Analyze• Parallel Read from

External table• Parallel Propagation

(Replication)• Parallel-wise Join• Parallel Server (with

RAC)

병렬프로세싱

병렬프로세싱

• 변경데이터를

정보

분석요구주기에

곧바로

적용할

있는

다양한

CDC(Change Data Capture) 기능

제공

• SAM File을 곧바로

DBMS의

테이블처럼

인식하여

SQL을 적

용할 수 있는 기능

제공

• 소스시스템의

상황

에 따라 적용할 수

있는

다양한

ETL 기

능 제공

데이터공유

데이터공유

대용량

데이터의

처리

#1 for Data Warehousing

Source: IDC, 2007 – Data Warehouse Platform Tools 2006 Vendor Shares

Multi-Terabyte Data WarehousesCPG / Transportation

Retail Communications Financial Services Energy Manufacturing

Oracle Information Appliance Initiative

Goals for Oracle data warehouse solutions:

• Provide superior system performance• Provide a superior customer experience

Choice of Configurations

• Database Options

• Management Packs

FoundationFoundation• Documented best-practice

configurations for data warehousing

• Benefits:High performanceSimple to scale;

modular building blocksIndustry-leading

database and hardwareAvailable today with HP,

IBM, Sun, EMC/Dell

• Flexibility for the most demanding data warehouse

• Benefits:High performanceUnlimited scalabilityCompletely

customizableIndustry-leading

database and hardware

CustomCustom

• Database Options

• Management Packs

• Partitioning• RAC

ApplianceAppliance• Scalable systems pre-

installed and pre- configured: ready to run out-of-the-box

• Benefits:High performanceSimple to buyFast to implementEasy to maintainCompetitively priced

Flexibility

Pre-configured, Pre-installed, Validated

Agenda

• Today’s Information System• Oracle Database for Data Warehousing• Oracle Database 11g New Features for Data

Warehouse and Business Intelligence• Wrap-up

Oracle 7.3

Partitioned Tables and IndexesPartition PruningParallel Index ScansParallel Insert, Update, DeleteParallel Bitmap Star QueryParallel ANALYZEParallel Constraint EnablingServer Managed Backup/RecoveryPoint-in-Time Recovery

Oracle 8.0

Hash and Composite PartitioningResource ManagerProgress MonitorAdaptive Parallel QueryServer-based Analytic FunctionsMaterialized ViewsTransportable TablespacesDirect Loader APIFunctional IndexesPartition-wise JoinsSecurity Enhancements

Oracle9iList and Range-List PartitioningTable CompressionBitmap Join IndexSelf-Tuning Runtime Memory New Analytic FunctionsGrouping SetsExternal TablesMERGEMulti-Table InsertProactive Query GoverningSystem Managed Undo

Oracle8i

Oracle10gSelf-tuning SQL OptimizationSQL Access AdvisorAutomatic Storage ManagerSelf-tuning MemoryChange Data CaptureSQL ModelsSQL Frequent ItemsetsSQL Partition Outer JoinsStatistical functionsand much more ...

파티션기능(Range)병렬처리(DML)

파티션기능

추가(Hash, Composite)병렬처리

향상Materialized View기능분석합수

기능

파티션기능

추가(List, Range+List)테이블압축

기능External Table 기능분석합수

추가Change Data Capture기능ETL기능(확장

SQL 등)자가관리

DB기능RAC기능향상CDC 기능향상분석함수

기능향상보안기능

향상. . . .

DW를

위한다양한 기능 반영됨

Oracle Database DW Function Roadmap

New in Oracle Database 11g

• Advanced compression• Online upgrades and patching• Real Application Testing• Active Data Guard• Secure Files• Total Recall• Real Application Clusters Performance Management• OLAP-based Materialized Views• Continuous Query Notification• And more…

New Features for BI/DW• VLDB

• Composite Range-Range• Composite List-Range• Composite List-List• Composite List-Hash• REF Partitioning• Virtual Column Partitioning• Compression enhancements

• Performance• Query Result Cache

• Data loading• Change data capture enhancements• Materialized view refresh enhancements

• Manageability • Partition Advisor• Interval Partitioning• SQL Plan Management• Automatic SQL Tuning with Self-Learning Capabilities• Enhanced Optimizer Statistics Maintenance• Multi-Column Optimizer Statistics• ASM Fast Resync, Fast VLDB Startup and other

enhancements

• SQL• SQL Pivot and Unpivot• Continuous Query Notification

• OLAP• Continued database integration

• Cube metadata in the Data Dictionary• Materialized view refresh and SQL rewrite• Fine-grained data security on cubes

• Simplified application development• Fully declarative cube calculations• Cost-Based Aggregation• Simpler calculation definitions

• Data Mining• Simplified development and deployment of models

• Supermodels: data preparation combined with mining model

• Additional packaged predictive analytics • Integration in database dictionary

• New algorithms: “General Linear Models”• Encapsulates several widely used analytic

methods• Multivariate linear regression; logistic regression

Database Result Cache

• Automatically caches results of queries, query blocks, or pl/sql function calls• Cache is shared across statements and sessions on server• Significant speed up for read-only / read-mostly data• Full consistency and proper semantics• Cache refreshed when any underlying table updated

join

Table4 join

Table5 Table6

Q1 is executedGroup-bycache

Q1 result is cachedjoin

join

Table1 Group-by

join

Table2 Table3

Q2: Uses result transparently

cache

Query Result Cache Opportunity• Retail customer data (~50 GB)

• Concurrent users submitting queries randomly• Executive dashboard with 12 heavy analytical queries

• Cache results only at in-line view level• 12 queries run in random, different order – 4 queries cached

• Measure average, total response time for all users

447 s

267 s

186 s

No cache

334 s

201 s

141 s

Cache

25%

25%

24%

Improvement

8

4

2

# Users

Growing Data Volumes

Source: 2005 TopTen Program, November 2005 © Winter Corporation, Waltham, MA, USA

0

20

40

60

80

100

1998 1999 2000 2001 2002 2003 2004 2005

DatabaseSize(TB)

Size of the largest data warehouse in TopTen Programs

245% Increase just from 2003 to 2005

Large-Scale Data Warehouse Features

0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%

DB Res Mgr

RMAN

ASM

Read Only

VPD

MV Use

Compression

Parallel Exec

Partitioning

Source: TB Club Report: A survey of 30 multi-TB Oracle DW’s – data July 2006

Partitioning, parallelism, and compression are the foundation for large-scale data warehousing

Oracle Partitioning Transparent to applications

Large TableDifficult to Manage

PartitionDivide and ConquerEasier to ManageImprove Performance

Composite PartitionBetter PerformanceMore flexibility to match business needs

JAN FEBJAN FEB

USA

EUROPEORDERSORDERS

ORDERS

Interval Partitioning

JAN FEB MAR APR

ORDERS

JAN FEB

ORDERS

MARJAN FEB

INVENTORY

• Partitions are created automatically as data arrives

Composite Partitioning

• Two-dimensional partitioning schemes• Extensions in Oracle Database 11g

• e.g. List-range:• Partition by country, then by week• Partition by line-of-business, then by week

Range List HashRange 11g 9i 8iList 11g 11g 11g

Before REF Partitioning

Table ORDERS

Jan 2006

... ...

Feb 2006

Table LINEITEMS

Jan 2006

... ...

Feb 2006

• RANGE(order_date)• Primary key order_id

• RANGE(order_date)• Foreign key order_id

• Redundant storage of order_date• Redundant maintenance

With REF PartitioningTable ORDERS

Jan 2006

... ...

Feb 2006

Table LINEITEMS

Jan 2006

... ...

Feb 2006

• RANGE(order_date)• Primary key order_id

• RANGE(order_date)• Foreign key order_id• RANGE(order_date)• Foreign key order_id

PARTITION BY REFERENCE• Partitioning key inherited through

PK-FK relationship

Virtual Column Partitioning

ORDERSORDER_ID ORDER_DATE CUSTOMER_ID...---------- ----------- ----------- --9834-US-14 12-JAN-2007 659208300-EU-97 14-FEB-2007 396543886-EU-02 16-JAN-2007 45292566-US-94 19-JAN-2007 153273699-US-63 02-FEB-2007 18733

JAN FEB

USA

EUROPEORDERS• REGION requires no storage• Partition by ORDER_DATE, REGION

REGION AS (SUBSTR(ORDER_ID,6,2)) ------

USEUEUUSUS

Compression

• Tables and indexes can be compressed• Can be specified on a per-partition basis• Typical compression ratio 3:1

• Requires more CPU to load data• Decompression hardly costs resources• Compress for all DML operations

• Less data on disk• Requires less time to read• Little or no impact on query performance

• Completely transparent 3XCompression

Information Lifecycle Management

OrdersQ1

Orders

Q2Orders

Q3Orders

Q4Orders

OlderOrders

ActiveHigh PerformanceStorage Tier

Less ActiveLow Cost Storage Tier

HistoricalOnline Archive Storage Tier

Data Mining

OLAP Statistics

SQL Analytics

• Bring the analytics to the data• Leverage core database infrastructure

Embedded Analytics

What is the Oracle OLAP Option? Embedded, manageable, enterprise-ready

• A multidimensional calculation and aggregation ‘engine’• Multidimensional data types: Cubes and dimensions• An OLAP API for cube definition and multidimensional queries• A SQL interface to OLAP Cubes and dimensions• OLAP-based materialized views

• Embedded in the Oracle Database• Runs within Oracle instance on same server • OLAP cubes are stored in Oracle data files• OLAP metadata in the Oracle Data Dictionary

• All key database functionality extends to OLAP• Security: Object and data security• Scalability: Real Application Clusters• Availability: Backup-Recovery, Disaster Recovery, RAC• Administration: Enterprise Manager

Oracle Database 11g The OLAP Goal

• Improve business intelligence applications by:• Accelerating query performance• Completely transparent to the application• Delivering rich analytic content • Advanced analytic calculations using simple SQL

Oracle OLAP in every data warehouse

Materialized ViewsSales

by RegionSales

by Date

Sales by Product

Sales by Channel

Query Rewrite

Materialized Views Typical Architecture Today

Region Date

Product Channel

SQL Query

Relational Star Schema

New in Oracle Database 11g Cube Organized Materialized Views

Materialized Views

Region Date

Product Channel

SQL Query

Query Rewrite

Automatic Refresh

OLAP Cube

Oracle OLAP in Oracle Database 11g

• Improves BI applications• Optimized for fast query response and fast refresh

• Database-managed summary management• Embedded BI calculations and metadata

• Accessible by any application• SQL or OLAP API based

• Economies of scale• Consolidate on grid infrastructure

Oracle OLAP Customers

Note – Not all customers are references. Internal use only

What is the Oracle Data Mining Option? Embedded, manageable, enterprise-ready

• A complete data-mining solution• 10+ data-mining algorithms: addresses every business scenario• Oracle Data Miner: Graphical user-interface for analysts• A Data Mining Java API for application developers• A Data Mining PL/SQL API for database developers

• Embedded in the Oracle Database• Runs within Oracle instance on same server • Data Mining algorithms execute directly on database tables• Data Mining models and metadata stored within Oracle database

• All key database functionality extends to Data Mining• Scalability: Parallelism, Real Application Clusters• Security: Object and data security• Administration: Enterprise Manager• Portability: All major platforms

Data Mining Partnerships Enabling faster solutions

• Integration with major ISV’s:• SPSS

• Clementine supports Oracle Data Mining• InforSense

• InforSense supports Oracle Data Mining• SAP

• BW Connector to Oracle Data Mining• Consulting Partner Program

• Growing network of 75 external DM experts• Help in ODM pre-sales & delivering “solutions”• Notable DM experts:

• Michael Berrry, author best-selling DM books• Gregory Piatetsky-Shapiro• Karl Rexer, Skilled & entrepreneurial• KC Cerney, MIA Consulting• Dean Abbott, TDWI presenter

What is Oracle Warehouse Builder?• Data integration and data quality tool embedded within Oracle Database

• Integrate data from disparate data sources• High performance (ELT & ETL architecture) optimized for Oracle Database• Extreme scalability with support for RAC• Declarative, highly productive drag and drop design capabilities• ERP/CRM connectivity (EBS, Siebel, SAP and PeopleSoft)

• Improve enterprise data quality• Name and Address cleansing• Fuzzy matching and merging• Data Profiling and data auditing• Business rule support in profiling and ETL

• Provide data modeling capabilities for Oracle• Relational (tables, views, MVs, UDT, etc.)• Dimensional (slowly changing dimenions, OLAP support, etc.)

• Enterprise metadata management• End-to-end attribute level lineage and impact analysis• Extensible repository to hold any metadata

OWB Market Position

Oracle Warehouse Builder Packaging

Oracle Warehouse Builder

• Part of Oracle Database 11g install• Embedded within the database installer• OWB metadata repository installed by default

• New features include:• Siebel Connector• Enhanced SQL generation• Expanded dimensional support• Graphical creation of views and materialized views

High-level Roadmap• Best support for OWB for Fusion Applications

• OWB supports native Heterogeneous Targets

• OWB is SOA enabled

• OWB supports direct integration with OBI EE

• Official Roadmap Document:http://www.oracle.com/technology/products/warehouse/pdf/TWP_BIDW_OWB_Roadmap_1006.pdf

Customer Success Select Oracle OWB Customers

Agenda

• Today’s Information System• Oracle Database for Data Warehousing• Oracle Database 11g New Features for Data Warehouse

and Business Intelligence• Wrap-up

Why Oracle Database 11g for Data Warehousing?

• Best-of-breed for data warehouses and data marts• Consolidation platform

• eliminates the costs and inefficiencies of multiple data stores

• Complete, integrated ELT and data-quality capabilities• Unmatched analytic capabilities

• embedded OLAP, data-mining and statistics all accessible via SQL

• Optimized for grid computing• Next-generation BI with integrated OLTP and DSS• Information Appliances on Industry Standard Hardware• Foundation of Oracle Business Intelligence solutions

For More Information

www.oracledatabase11g.com

Look out for

A dedicated microsite for Oracle Database 11g with more information, assets and the latest updates

(To be launched in early October 07)