· • oracle database 11g new features for data warehouse and business intelligence • wrap-up....
TRANSCRIPT
<Insert Picture Here>
Building Scalable and High-Performance Data WarehouseJungok NahPrincipal Consultant
Agenda
• Today’s Information System• Oracle Database for Data Warehousing• Oracle Database 11g New Features for Data Warehouse
and Business Intelligence• Wrap-up
Agenda
• Today’s Information System• Oracle Database for Data Warehousing• Oracle Database 11g New Features for Data Warehouse
and Business Intelligence• Wrap-up
Trends of the Information System by period
1995 1996 1997 1998 1999 2000
2001 2002 2003 2004 2005 2006 / 2007
정보계구축
---> Data Warehouse 구축
콜센터
구축
영업지원
시스템
구축
eCRM
구축
고객
분석모델
개발
고객
세그멘테이션
고객
LTV모델
개발
DW 리모델링 / 고도화 / 재구축
통합
CRM 구축
분석/운영 CRM 연계
현업에서의
활용성보다,DW구축하기위한
프로젝트에
집중
각
고객
접점
채널별,최적화된
채널구축에집중
체계화된
데이터
구조를기반으로, 현업의
활용성을극대화한
DW체계를갖추기
위한
재구축
기개발된
고객접점채널을
통합하고, 이를
분석CRM과연동하는데
초점
차
세
대
R
T
E
데이터
축적
분석위주
E D W
인프라
실시간
구분 EIS/DSS BI/DW Next Generation
Main Focus Insight for small Single version of truth Single version of truth across trading partners
Audience Senior management Internal stakeholder All stakeholder
Stated Business Driver Drowning in paper Drowning in data Data is too latent for the decision- making process
Typical Data Volume gigabyte terabyte terabyte,exabyteData Organization Simple Star schema or MMDB Hybrid relational/Object
Data Quality Surfaced through EIS&DSS Baked in at Start of effort Across stakeholder and business unit mandatory
Data Quality Approach Passive Reactive Proactive
Early Signal Requiring Senior management buried in paper
Stovepipe or Island of BI applications
Disjointed portal, BPM,BAM, Predictive modeling, BI/DW applications &Operational systems
Data Refresh Rate Monthly Daily Near Real Time
Data Provisioning Simple scripted process E T LMerged ETL/EAI/EII/BPM/
Change Data Capture
Business Intelligence Reporting/OLAP Scorecard/DashboardPredictive analysis/Alert &
NotificationEducational Vehicle Specialized development team Center of excellence To be evolved
* Next Generation System (TDWI 2005. 10 )
Staging 5Active
Staging 4Operational
Staging 3Predictive
Staging 2Analytical
Staging 1Reporting
BATCH RIGHT-TIME NEAR-REAL-TIME REAL-TIMEWEEKS DAYS HOURS MINUTES SECONDS SUB-SECONDS
INACTIVE REACTIVE PROACTIVE AUTONOMOUSWHAT
HAPPENED ?WHAT DO ITHAPPEN ?
WHAT WILLHAPPEN ?
WHAT ISHAPPENING ?
WHAT DO I WANTTO HAPPEN ?
* DW Maturity Step
The Next Generation Information System
Source: TDWI
Source: TDWI
*Data Integration / *Consolidation / *Federation(고객 / 채널 ) + 통합방법 + 표준화체계*Data Integration / *Consolidation / *Federation(고객 / 채널 ) + 통합방법 + 표준화체계
데이터 통합 측면데이터 통합 측면
Data Warehousing / Architecture / Real-Time DWData Warehousing / Architecture / Real-Time DW정보제공 인프라 측면정보제공 인프라 측면
OLAP Architecture / Portal, Dashboard / 시스템연계OLAP Architecture / Portal, Dashboard / 시스템연계정보활용 측면정보활용 측면
C P M ( Corporate Performance Management )C P M ( Corporate Performance Management )성과분석 측면성과분석 측면
11
22
33
44
DW Maturity Model & ROI
Require Single View with Enterprise- wide integrated Data
휴대폰
자택전화 Fax
직장전화 Fax
전자우편 수신거부
자택주소
직장주소
채널경로 우수고객 발신자번호
고객명주민/사업자번호
찾기찾기지움지움
고객등급
계약건수
개인고객구분
직원여부고객만족도
고객사항 개인고객개인고객 법인고객법인고객 세대정보세대정보 가망고객가망고객 고객의소리고객의소리 e-서비스e-서비스 팩스발송팩스발송 메일발송메일발송
고객구분 고객분석
연락처 주소
계약사항 캠페인
접촉이력 지원/상담내역
접촉이력
계약/계좌
서비스
캠페인
소재지접촉
포인트
고객
정보
관계정보분석
정보
통합된데이터
고객
고객 고객
고객
Account
고객
영역
(기간업무,홈페이지…)
계약
정보(기간업무)
접촉이력(콜센터,홈페이지… )
캠페인관리
시스템
영업지원
시스템
분석영역(EDW,DM…)
휴대폰자택전
화Fax
직장전
화Fax
전자우
편
수신거
부
자택주
소직장주
소
채널경
로우수고
객발신자번
호
고객명주민/사업자
번호
찾기찾기지움지움고객등
급계약건
수개인고객구
분
직원여
부 고객만
족도
고객사
항
개인고객개인고객 법인고객법인고객 세대정보세대정보 가망고객가망고객 고객의소리고객의소리e-서비스e-서비스 팩스발송팩스발송 메일발송메일발송
고객구분 고객분석
연락처 주소
계약사항 캠페인
접촉이력 지원/상담내역
접촉이력
계약
서비스
캠페인
소재지접촉
포인트
고객정보관계정보
분석
정보
고객
+ 계좌(상품) + 분석
+ 접촉
+ 상담
+ 서비스
관계
연락처
고객
접촉이력 상담내역
고객
거래내역
반응정보
캠페인
통계정보
고객분석정보
영업서비스
서비스요청
실시간실시간 + + 배치배치 실시간실시간 + + 배치배치
데이터데이터
통합통합
=> => Data Integration ? Consolidation? Federation?Data Integration ? Consolidation? Federation?
Agenda
• Today’s Information System• Oracle Database for Data Warehousing• Oracle Database 11g New Features for Data Warehouse
and Business Intelligence• Wrap-up
Comprehensive Database Platform
• For business intelligence and data warehousing• Industry-leading performance and scalability• Deep integrated analytics• Embedded ELT and data quality
• All running on a low-cost enterprise grid• Reliable and secure• Easy to manage
Data Warehouse Product Portfolio
Industry-leading performance and scalabilityOracle DatabaseOracle PartitioningOracle Real Application Clusters
Deep integrated analytics Oracle OLAPOracle Data Mining
Embedded data quality and integrationOracle Warehouse Builder (OWB)OWB Enterprise ETLOWB Data QualityOWB Connectors
Data Warehousing Strategy
• Best database for BI/DW solutions• No other database can compare on the breadth and sophistication
of Oracle’s database features
• Complete solution portfolio• Complete database platform including ELT and Analytics• Oracle BI and Performance Management solutions• Broadest array of third-party technologies and solutions
• On the right hardware infrastructure• Offer customers a choice of solutions tailored to meet needs
Gateway
Mainframe Unix
[ 데이터 설계 ]Oracle Designer
orOther Case Tool
[ ETL관리 ]: Oracle Warehouse Builder[ DB 관리 ]: Oracle Enterprise Manager
[ 데이터 설계 ]Oracle Designer
orOther Case Tool
[ 데이터 변환 ]Merge(Upsert)
Multitable InsertTable Function
Parallel DDL/DML
[ 구축/운영
]3rd Normal Form
Multi-Dimensional ModelMaterialized View
Partition tableParallel ProcessingAnalytical Function
Compress
[ 데이터 파티셔닝 ]Partition table
Partition Exchange LoadingCompress
Staging/ODS
TargetDW
TransformationERP
System
[데이터추출 및 적재]SQL*Loader
Transportable TablespaceExternal table
Change Data Capture Streams
Materialized ViewOracle Gateway
Oracle DBMS DW Function Architecture
Key Features for VLDB
• 다양한
분할기법• 디스크
장애
시
해당
Partition만 영향
• 개별
Partition 단위
의 관리 기능(DML, Load, Import, Export, Exchange 등) 지원
• 사용자
질의시
해당
되는
파티션만
Access
• 로드
밸런싱
• 7*24 작업가능
파티션테이블/인덱스
파티션테이블/인덱스
• 블록내의
중복제거• 질의성능
향상• 스토리지
절약• 테이블
및
테이블스
페이스, 파티션레벨
에서
압축
지정가능• Materialized View도
압축
가능
데이터압축
데이터압축
• 질의성능향상
및
복
제를
위한
기능• 자주
쓰이는
유형의
데이터
Set을 미리
수행 한 후 테이블에
저장
• Query Rewrite기능• 주기적인
요약정보
갱신
가능
요약View
요약View
• Parallel Query• Parallel Load (with
SQL*Loader Utility)• Parallel DDL, DML• Parallel Recovery• Parallel Analyze• Parallel Read from
External table• Parallel Propagation
(Replication)• Parallel-wise Join• Parallel Server (with
RAC)
병렬프로세싱
병렬프로세싱
• 변경데이터를
정보
분석요구주기에
따
라
곧바로
적용할
수
있는
다양한
CDC(Change Data Capture) 기능
제공
• SAM File을 곧바로
DBMS의
테이블처럼
인식하여
SQL을 적
용할 수 있는 기능
제공
• 소스시스템의
상황
에 따라 적용할 수
있는
다양한
ETL 기
능 제공
데이터공유
데이터공유
대용량
데이터의
처리
Multi-Terabyte Data WarehousesCPG / Transportation
Retail Communications Financial Services Energy Manufacturing
Oracle Information Appliance Initiative
Goals for Oracle data warehouse solutions:
• Provide superior system performance• Provide a superior customer experience
Choice of Configurations
• Database Options
• Management Packs
FoundationFoundation• Documented best-practice
configurations for data warehousing
• Benefits:High performanceSimple to scale;
modular building blocksIndustry-leading
database and hardwareAvailable today with HP,
IBM, Sun, EMC/Dell
• Flexibility for the most demanding data warehouse
• Benefits:High performanceUnlimited scalabilityCompletely
customizableIndustry-leading
database and hardware
CustomCustom
• Database Options
• Management Packs
• Partitioning• RAC
ApplianceAppliance• Scalable systems pre-
installed and pre- configured: ready to run out-of-the-box
• Benefits:High performanceSimple to buyFast to implementEasy to maintainCompetitively priced
Flexibility
Pre-configured, Pre-installed, Validated
Agenda
• Today’s Information System• Oracle Database for Data Warehousing• Oracle Database 11g New Features for Data
Warehouse and Business Intelligence• Wrap-up
Oracle 7.3
Partitioned Tables and IndexesPartition PruningParallel Index ScansParallel Insert, Update, DeleteParallel Bitmap Star QueryParallel ANALYZEParallel Constraint EnablingServer Managed Backup/RecoveryPoint-in-Time Recovery
Oracle 8.0
Hash and Composite PartitioningResource ManagerProgress MonitorAdaptive Parallel QueryServer-based Analytic FunctionsMaterialized ViewsTransportable TablespacesDirect Loader APIFunctional IndexesPartition-wise JoinsSecurity Enhancements
Oracle9iList and Range-List PartitioningTable CompressionBitmap Join IndexSelf-Tuning Runtime Memory New Analytic FunctionsGrouping SetsExternal TablesMERGEMulti-Table InsertProactive Query GoverningSystem Managed Undo
Oracle8i
Oracle10gSelf-tuning SQL OptimizationSQL Access AdvisorAutomatic Storage ManagerSelf-tuning MemoryChange Data CaptureSQL ModelsSQL Frequent ItemsetsSQL Partition Outer JoinsStatistical functionsand much more ...
파티션기능(Range)병렬처리(DML)
파티션기능
추가(Hash, Composite)병렬처리
향상Materialized View기능분석합수
기능
파티션기능
추가(List, Range+List)테이블압축
기능External Table 기능분석합수
추가Change Data Capture기능ETL기능(확장
SQL 등)자가관리
DB기능RAC기능향상CDC 기능향상분석함수
기능향상보안기능
향상. . . .
DW를
위한다양한 기능 반영됨
Oracle Database DW Function Roadmap
New in Oracle Database 11g
• Advanced compression• Online upgrades and patching• Real Application Testing• Active Data Guard• Secure Files• Total Recall• Real Application Clusters Performance Management• OLAP-based Materialized Views• Continuous Query Notification• And more…
New Features for BI/DW• VLDB
• Composite Range-Range• Composite List-Range• Composite List-List• Composite List-Hash• REF Partitioning• Virtual Column Partitioning• Compression enhancements
• Performance• Query Result Cache
• Data loading• Change data capture enhancements• Materialized view refresh enhancements
• Manageability • Partition Advisor• Interval Partitioning• SQL Plan Management• Automatic SQL Tuning with Self-Learning Capabilities• Enhanced Optimizer Statistics Maintenance• Multi-Column Optimizer Statistics• ASM Fast Resync, Fast VLDB Startup and other
enhancements
• SQL• SQL Pivot and Unpivot• Continuous Query Notification
• OLAP• Continued database integration
• Cube metadata in the Data Dictionary• Materialized view refresh and SQL rewrite• Fine-grained data security on cubes
• Simplified application development• Fully declarative cube calculations• Cost-Based Aggregation• Simpler calculation definitions
• Data Mining• Simplified development and deployment of models
• Supermodels: data preparation combined with mining model
• Additional packaged predictive analytics • Integration in database dictionary
• New algorithms: “General Linear Models”• Encapsulates several widely used analytic
methods• Multivariate linear regression; logistic regression
Database Result Cache
• Automatically caches results of queries, query blocks, or pl/sql function calls• Cache is shared across statements and sessions on server• Significant speed up for read-only / read-mostly data• Full consistency and proper semantics• Cache refreshed when any underlying table updated
join
Table4 join
Table5 Table6
Q1 is executedGroup-bycache
Q1 result is cachedjoin
join
Table1 Group-by
join
Table2 Table3
Q2: Uses result transparently
cache
Query Result Cache Opportunity• Retail customer data (~50 GB)
• Concurrent users submitting queries randomly• Executive dashboard with 12 heavy analytical queries
• Cache results only at in-line view level• 12 queries run in random, different order – 4 queries cached
• Measure average, total response time for all users
447 s
267 s
186 s
No cache
334 s
201 s
141 s
Cache
25%
25%
24%
Improvement
8
4
2
# Users
Growing Data Volumes
Source: 2005 TopTen Program, November 2005 © Winter Corporation, Waltham, MA, USA
0
20
40
60
80
100
1998 1999 2000 2001 2002 2003 2004 2005
DatabaseSize(TB)
Size of the largest data warehouse in TopTen Programs
245% Increase just from 2003 to 2005
Large-Scale Data Warehouse Features
0% 10% 20% 30% 40% 50% 60% 70% 80% 90% 100%
DB Res Mgr
RMAN
ASM
Read Only
VPD
MV Use
Compression
Parallel Exec
Partitioning
Source: TB Club Report: A survey of 30 multi-TB Oracle DW’s – data July 2006
Partitioning, parallelism, and compression are the foundation for large-scale data warehousing
Oracle Partitioning Transparent to applications
Large TableDifficult to Manage
PartitionDivide and ConquerEasier to ManageImprove Performance
Composite PartitionBetter PerformanceMore flexibility to match business needs
JAN FEBJAN FEB
USA
EUROPEORDERSORDERS
ORDERS
Interval Partitioning
JAN FEB MAR APR
ORDERS
JAN FEB
ORDERS
MARJAN FEB
INVENTORY
• Partitions are created automatically as data arrives
Composite Partitioning
• Two-dimensional partitioning schemes• Extensions in Oracle Database 11g
• e.g. List-range:• Partition by country, then by week• Partition by line-of-business, then by week
Range List HashRange 11g 9i 8iList 11g 11g 11g
Before REF Partitioning
Table ORDERS
Jan 2006
... ...
Feb 2006
Table LINEITEMS
Jan 2006
... ...
Feb 2006
• RANGE(order_date)• Primary key order_id
• RANGE(order_date)• Foreign key order_id
• Redundant storage of order_date• Redundant maintenance
With REF PartitioningTable ORDERS
Jan 2006
... ...
Feb 2006
Table LINEITEMS
Jan 2006
... ...
Feb 2006
• RANGE(order_date)• Primary key order_id
• RANGE(order_date)• Foreign key order_id• RANGE(order_date)• Foreign key order_id
PARTITION BY REFERENCE• Partitioning key inherited through
PK-FK relationship
Virtual Column Partitioning
ORDERSORDER_ID ORDER_DATE CUSTOMER_ID...---------- ----------- ----------- --9834-US-14 12-JAN-2007 659208300-EU-97 14-FEB-2007 396543886-EU-02 16-JAN-2007 45292566-US-94 19-JAN-2007 153273699-US-63 02-FEB-2007 18733
JAN FEB
USA
EUROPEORDERS• REGION requires no storage• Partition by ORDER_DATE, REGION
REGION AS (SUBSTR(ORDER_ID,6,2)) ------
USEUEUUSUS
Compression
• Tables and indexes can be compressed• Can be specified on a per-partition basis• Typical compression ratio 3:1
• Requires more CPU to load data• Decompression hardly costs resources• Compress for all DML operations
• Less data on disk• Requires less time to read• Little or no impact on query performance
• Completely transparent 3XCompression
Information Lifecycle Management
OrdersQ1
Orders
Q2Orders
Q3Orders
Q4Orders
OlderOrders
ActiveHigh PerformanceStorage Tier
Less ActiveLow Cost Storage Tier
HistoricalOnline Archive Storage Tier
Data Mining
OLAP Statistics
SQL Analytics
• Bring the analytics to the data• Leverage core database infrastructure
Embedded Analytics
What is the Oracle OLAP Option? Embedded, manageable, enterprise-ready
• A multidimensional calculation and aggregation ‘engine’• Multidimensional data types: Cubes and dimensions• An OLAP API for cube definition and multidimensional queries• A SQL interface to OLAP Cubes and dimensions• OLAP-based materialized views
• Embedded in the Oracle Database• Runs within Oracle instance on same server • OLAP cubes are stored in Oracle data files• OLAP metadata in the Oracle Data Dictionary
• All key database functionality extends to OLAP• Security: Object and data security• Scalability: Real Application Clusters• Availability: Backup-Recovery, Disaster Recovery, RAC• Administration: Enterprise Manager
Oracle Database 11g The OLAP Goal
• Improve business intelligence applications by:• Accelerating query performance• Completely transparent to the application• Delivering rich analytic content • Advanced analytic calculations using simple SQL
Oracle OLAP in every data warehouse
Materialized ViewsSales
by RegionSales
by Date
Sales by Product
Sales by Channel
Query Rewrite
Materialized Views Typical Architecture Today
Region Date
Product Channel
SQL Query
Relational Star Schema
New in Oracle Database 11g Cube Organized Materialized Views
Materialized Views
Region Date
Product Channel
SQL Query
Query Rewrite
Automatic Refresh
OLAP Cube
Oracle OLAP in Oracle Database 11g
• Improves BI applications• Optimized for fast query response and fast refresh
• Database-managed summary management• Embedded BI calculations and metadata
• Accessible by any application• SQL or OLAP API based
• Economies of scale• Consolidate on grid infrastructure
What is the Oracle Data Mining Option? Embedded, manageable, enterprise-ready
• A complete data-mining solution• 10+ data-mining algorithms: addresses every business scenario• Oracle Data Miner: Graphical user-interface for analysts• A Data Mining Java API for application developers• A Data Mining PL/SQL API for database developers
• Embedded in the Oracle Database• Runs within Oracle instance on same server • Data Mining algorithms execute directly on database tables• Data Mining models and metadata stored within Oracle database
• All key database functionality extends to Data Mining• Scalability: Parallelism, Real Application Clusters• Security: Object and data security• Administration: Enterprise Manager• Portability: All major platforms
Data Mining Partnerships Enabling faster solutions
• Integration with major ISV’s:• SPSS
• Clementine supports Oracle Data Mining• InforSense
• InforSense supports Oracle Data Mining• SAP
• BW Connector to Oracle Data Mining• Consulting Partner Program
• Growing network of 75 external DM experts• Help in ODM pre-sales & delivering “solutions”• Notable DM experts:
• Michael Berrry, author best-selling DM books• Gregory Piatetsky-Shapiro• Karl Rexer, Skilled & entrepreneurial• KC Cerney, MIA Consulting• Dean Abbott, TDWI presenter
What is Oracle Warehouse Builder?• Data integration and data quality tool embedded within Oracle Database
• Integrate data from disparate data sources• High performance (ELT & ETL architecture) optimized for Oracle Database• Extreme scalability with support for RAC• Declarative, highly productive drag and drop design capabilities• ERP/CRM connectivity (EBS, Siebel, SAP and PeopleSoft)
• Improve enterprise data quality• Name and Address cleansing• Fuzzy matching and merging• Data Profiling and data auditing• Business rule support in profiling and ETL
• Provide data modeling capabilities for Oracle• Relational (tables, views, MVs, UDT, etc.)• Dimensional (slowly changing dimenions, OLAP support, etc.)
• Enterprise metadata management• End-to-end attribute level lineage and impact analysis• Extensible repository to hold any metadata
Oracle Warehouse Builder
• Part of Oracle Database 11g install• Embedded within the database installer• OWB metadata repository installed by default
• New features include:• Siebel Connector• Enhanced SQL generation• Expanded dimensional support• Graphical creation of views and materialized views
High-level Roadmap• Best support for OWB for Fusion Applications
• OWB supports native Heterogeneous Targets
• OWB is SOA enabled
• OWB supports direct integration with OBI EE
• Official Roadmap Document:http://www.oracle.com/technology/products/warehouse/pdf/TWP_BIDW_OWB_Roadmap_1006.pdf
Agenda
• Today’s Information System• Oracle Database for Data Warehousing• Oracle Database 11g New Features for Data Warehouse
and Business Intelligence• Wrap-up
Why Oracle Database 11g for Data Warehousing?
• Best-of-breed for data warehouses and data marts• Consolidation platform
• eliminates the costs and inefficiencies of multiple data stores
• Complete, integrated ELT and data-quality capabilities• Unmatched analytic capabilities
• embedded OLAP, data-mining and statistics all accessible via SQL
• Optimized for grid computing• Next-generation BI with integrated OLTP and DSS• Information Appliances on Industry Standard Hardware• Foundation of Oracle Business Intelligence solutions
For More Information
www.oracledatabase11g.com
Look out for
A dedicated microsite for Oracle Database 11g with more information, assets and the latest updates
(To be launched in early October 07)