![Page 1: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/1.jpg)
©2011 Hewlett-Packard Development Company, L.P.
The information contained herein is subject to change without notice
Present and Future of Enterprise BI
January 17, 2013
Prepared for DAMA
![Page 2: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/2.jpg)
Agenda
1. DB engines for BI and contrast with OLTP DB engines
• Row-DB, column-DB, index/non-index, in-memory
• Contrasts BI workload with OLTP/ERP workload
• Our experience
2. MPP systems for BI: shared-nothing and shared-disk
• IQ, HANA, Exadata, Netezza, MySQL
• Our experience
3. Including unstructured data (“Big Data”) in BI
• Our experience
4. Design the “dream” BI system
5. Comparing various “real” and “dream” BI systems
• Facts and our experience
6. Brief EDMT BI Overview
7. Open Discussion
![Page 3: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/3.jpg)
Background
BMMsoft offers consulting and BI products/solutions for:
• DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza, MySQL
• Extending enterprise BI with Big Data
• HA, SLA (“how many nines?”), DR, B/R solutions for BI systems
• Scale and speed: “World’s Largest DW” (2002, 2004, 2007) and “World’s Fastest Data Loader” (2011, 330 TB/day on HP DL980)

Paul Krneta, CTO of BMMsoft
• 20 years of industry experience in computer and database technology and architecture
• CTO of Sybase IQ 2000-2007
  • architected the MPP option for Sybase IQ (“IQ Multiplex”)
  • designed NonStopIQ (HA, DR and B/R for VLDB version of Sybase IQ)
  • optimized IQ for VLDB, certified IQ 3 times as the “World’s Largest Data Warehouse”
    • 2002 – 48 TB (200 B rows)
    • 2004 – 150 TB (1 T rows)
    • 2007 – 1,030 TB (1 PB in 6 T rows) of structured and (opt) unstructured data
• Technical Director for DB Technology at Digital Equipment (DEC) 1994-2000
  • Designed first in-memory DB: Oracle VLM Option (“Very Large Memory”) in 1995
  • 1 TB/hour live backup of Oracle, Sybase, Informix, SQL Server, Adabas in 1995-1996
![Page 4: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/4.jpg)
DB LANDSCAPE:
BI VS. OLTP
![Page 5: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/5.jpg)
Categories of DBs for BI
Different DB architectures:
1. R = row-oriented DB
2. C = columnar DB
3. H-RC = hybrid row+columnar DB
4. Compression
5. NI = non-indexed DB
6. I = indexed DB
7. MPP-SN = shared-nothing DB
8. MPP-SD = shared-disk DB
9. In-memory DB
10. SQL, NoSQL, object, KV-pair DB
11. ACID and non-ACID “unreliable” DB
12. HA, DR, B/R, Test/Dev
13. BLOB storage: in-row/column, separate store, external BLOB
14. Text search: in-DB or external
15. UDF
16. Storage efficiency, Green
Types of queries:
1. Pin-point query
• Interested in a small # of rows selected from billions and trillions of rows, e.g. call center, ATM etc.
2. Analytic query
• Analyzes millions, billions or trillions of rows (1%-100% of the entire DB)
3. Mixed search of structured+unstructured data
• Single query cross-searches SQL rows and text
• Best: both queries done using a single engine
• Worst: two engines (one SQL, one text) = relic of the “divided SQL/text world”
4. Text search/analytics
5. OLTP (heavy updates)
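A minimal sketch of query type 3 (mixed search in a single engine), using Python's built-in `sqlite3` module purely for illustration. The `messages` table, its columns and the sample rows are hypothetical, and `LIKE` is a plain substring match standing in for a real text index; none of this is the engine discussed in this deck.

```python
import sqlite3

# Hypothetical schema: one table holding structured fields plus free text.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, sender TEXT, amount REAL, body TEXT)"
)
conn.executemany(
    "INSERT INTO messages (sender, amount, body) VALUES (?, ?, ?)",
    [
        ("alice", 120.0, "please wire the funds today"),
        ("bob", 15.5, "lunch receipt attached"),
        ("alice", 9800.0, "urgent: wire transfer to new account"),
        ("carol", 9900.0, "quarterly report"),
    ],
)

def mixed_search(conn, min_amount, keyword):
    """One statement combines a structured filter (amount) with a text match (body)."""
    cur = conn.execute(
        "SELECT id, sender, amount FROM messages "
        "WHERE amount >= ? AND body LIKE ? ORDER BY id",
        (min_amount, f"%{keyword}%"),
    )
    return cur.fetchall()

hits = mixed_search(conn, 1000.0, "wire")
print(hits)
```

With two separate engines, the same question would need one SQL result set, one text result set and a join between them outside either engine.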
1-5 day sessions: “BI for Today and Tomorrow”, “NonStopIQ bootcamp”, “BI Assessment” http://download.sybase.com/presentation/TW2005/AM21.pdf
![Page 6: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/6.jpg)
Quick overview of “original” row DB
1. Record (“row”) has multiple fields (e.g. date, name, amount etc.)
  1) Fields of a row are placed next to each other (on disk and in RAM)
  2) Each field has a single value (typ.)
  3) Order of fields in DDL is irrelevant (sort of)
  4) To get to the Nth field in the row, the DB “scans” each of the previous fields
2. DB page contains multiple (unrelated) records
  1) DB page is the unit of storage management, IO and caching in RAM
  2) Row (typ.) can’t span multiple pages = limits # of fields and length of row
3. ACID applied all the time
4. Locking at record level
5. Small number of fields indexed
Diagram: Row DB, 1 2 3 4 … 100; DB page (“block”): 2-32KB
SQL: Create table ABC yellow, blue, red, magenta
SQL: Select sum (red) from ABC
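A minimal sketch of the row layout described on this slide: the fields of one record sit next to each other, so reaching the Nth field means stepping over the fields before it. The length-prefixed encoding and the sample values are assumptions for illustration, not the on-disk format of any particular DBMS.

```python
import struct

def encode_row(fields):
    """Pack fields back to back, each as a 2-byte length followed by its bytes."""
    out = bytearray()
    for value in fields:
        data = str(value).encode("utf-8")
        out += struct.pack("<H", len(data)) + data
    return bytes(out)

def read_field(row_bytes, n):
    """Return field n (0-based) by stepping over every field stored before it."""
    offset = 0
    for _ in range(n):
        (length,) = struct.unpack_from("<H", row_bytes, offset)
        offset += 2 + length  # skip the length prefix and the field body
    (length,) = struct.unpack_from("<H", row_bytes, offset)
    return row_bytes[offset + 2 : offset + 2 + length].decode("utf-8")

# Hypothetical values for the columns yellow, blue, red, magenta of table ABC.
row = encode_row(["2013-01-17", "ACME", 42, 7])
red = read_field(row, 2)
print(red)
```

`Select sum (red) from ABC` has to repeat this walk for every record on every page unless an index avoids it.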
![Page 7: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/7.jpg)
Row-DB vs. Columnar DB
Diagram: Columnar DB vs. Row DB, 1 2 3 4 … 100
SQL: Create table ABC yellow, blue, red, magenta
SQL: Select sum (red) from ABC
1. Both use ANSI SQL & ODBC/JDBC
2. Column structure (invisible to apps and admin)
• Reduces I/O by 90-99% (eliminates full-table scans)
• Flex schema = add/remove columns on the fly
• Wide tables = simple, rich schema (e.g. +42,000 columns)
• Large I/O can use large (+400GB), low-cost disks
• Great match for BLOB data (image, video, email, docs…)
3. All row DBs have indices (almost unusable w/o indices)
4. Column DB w/ indices: bitmap, bit-wise, text + more
• Column+index queries 2x-1,000x faster than “classic” DBMS
• Fast to load, small size, have all data statistics
5. Data compression: 90% cost reduction
• “Row DB” is 4x-10x larger than column DB
• Disks for row DB cost 8x-20x more
• Fast, no fragmentation, always ON, no LVM nor FS
6. Multi-node • Multi-
Diagram: Db page = 2-32KB; 1 2 3 4 … 100; Db page = 512KB
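A minimal sketch of why a columnar layout reduces I/O for `Select sum (red) from ABC`. The 100-column, 1,000-row table and the "values read" counter are illustrative assumptions; real engines read pages, not individual values.

```python
# Hypothetical table ABC with 100 columns; the query is SELECT SUM(red) FROM ABC.
NUM_COLUMNS = 100
NUM_ROWS = 1_000
RED = 2  # position of the "red" column

# Row layout: each record keeps all of its fields together.
rows = [[r * NUM_COLUMNS + c for c in range(NUM_COLUMNS)] for r in range(NUM_ROWS)]
# Columnar layout: each column keeps all of its values together.
columns = [[rows[r][c] for r in range(NUM_ROWS)] for c in range(NUM_COLUMNS)]

def sum_red_row_store(rows):
    """Without an index, a row store reads whole records to get one field."""
    values_read = sum(len(record) for record in rows)
    return sum(record[RED] for record in rows), values_read

def sum_red_column_store(columns):
    """A column store reads only the one column the query names."""
    return sum(columns[RED]), len(columns[RED])

row_total, row_read = sum_red_row_store(rows)
col_total, col_read = sum_red_column_store(columns)
saved = 1 - col_read / row_read
print(f"values read: row={row_read}, column={col_read}, reduction={saved:.0%}")
```

With 100 equally sized columns, touching one column reads 1% of the values, which sits at the top of the 90-99% range quoted on the slide.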
![Page 8: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/8.jpg)
BI/DW vs. OLTP
OLTP = simple query
• “touch/update” 10s of rows per query
• query takes seconds and few resources
• simple SQL statements
DSS = complex query
• “touch” Th-M-(B-Tr)illions of rows
• query takes sec-hours to finish
• complex (10-page) SQL statements
• (typ.) 10x larger than OLTP DBMS
Chart: y-axis = Speed, Scalability (# user & data size); axis ticks 1, 10, 100, 1000, 10,000; labels: VLDB, In-memory DB (HANA)
![Page 9: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/9.jpg)
To Index or not to Index ?
Diagram: Column DB vs. Row DB, 1 2 3 4 … 100
SQL: Create table ABC yellow, blue, red, magenta
SQL: Select sum (red) from ABC
1. Row DB: index is critical to avoid slow, costly full-table scans
• Reduces I/O by 90-99% (eliminates full-table scans)
2. Column DB without indices
• Every query scans column(s) = slow, heavy I/O & CPU load
• Complex queries scan many columns (= much of a DB)
• May be faster (but not much) to load
• Uses less space (but needs faster disks for scans)
3. Column DB w/ indices: bitmap, bit-wise, text + more
• Many queries use index only (= fast, low I/O and CPU use)
• Indices have statistics about data = better QEP
• No scans = reduced I/O
• Large I/O = use large (4TB), low-cost disks ($400/TB)
Diagram: Db page = 2-32KB; 1 2 3 4 … 100; Db page = 512KB
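A minimal sketch of how a bitmap index can answer a query from the index alone, with no column scan. Python integers serve as bitsets here; the columns, values and helper name are hypothetical, and production bitmap indices add compression and other refinements not shown.

```python
from collections import defaultdict

def build_bitmap_index(column):
    """Map each distinct value to an int whose bit i is set when row i holds it."""
    index = defaultdict(int)
    for row_id, value in enumerate(column):
        index[value] |= 1 << row_id
    return dict(index)

# Hypothetical low-cardinality columns of a 6-row table.
color = ["red", "blue", "red", "red", "blue", "green"]
region = ["EU", "EU", "US", "EU", "US", "EU"]

color_idx = build_bitmap_index(color)
region_idx = build_bitmap_index(region)

# COUNT(*) WHERE color = 'red' AND region = 'EU', answered from the indices alone.
matches = color_idx["red"] & region_idx["EU"]
count = bin(matches).count("1")
row_ids = [i for i in range(len(color)) if matches >> i & 1]
print(count, row_ids)
```

The predicate is evaluated with one bitwise AND over the two bitmaps; the per-value bit counts double as the data statistics mentioned above.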
![Page 10: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/10.jpg)
BI: Reporting vs. Advanced (Ad-hoc)
Reporting
• “interested” in many rows per query
• predictable queries
Advanced (ad-hoc) query
• “touch” Th-M-(B-Tr)illions of rows
• query takes sec-hours to finish
• unpredictable, complex queries
Chart: y-axis = Speed, Scalability (# user & data size); axis ticks 1, 10, 100, 1000, 10,000; labels: COLUMN DB with index, VLDB, In-memory DB (SAP HANA)
![Page 11: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/11.jpg)
BI: Data Scalability
Chart: y-axis = Speed, Scalability (# user & data size); x-axis = DB size (TB, PB), # columns; axis ticks 1, 10, 100, 1000, 10,000; series: COLUMN DB with index, “row” DBs, “row” DBs with HW “column” filters, In-memory DB (HANA)
![Page 12: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/12.jpg)
BI: Resource consumption
Chart: y-axis = Resource usage (CPU, RAM, IOPS and BW); x-axis = DB size (TB, PB), # columns; axis ticks 1, 10, 100, 1000, 10,000
![Page 13: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/13.jpg)
BI: Speed and efficiency
Chart: y-axis = Performance, Resource efficiency (CPU, RAM, IO); axis ticks 1, 10, 100, 1000, 10,000; labels: Predictable/static data and queries, UN-predictable data and queries; series: COLUMN DB with index, In-memory DB (HANA)
![Page 14: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/14.jpg)
In-memory DB (OLTP & BI): SAP HANA
Diagram: Column DB vs. Row DB, 1 2 3 4 … 100
SQL: Create table ABC yellow, blue, red, magenta
SQL: Select sum (red) from ABC
1. HANA: ANSI SQL and ODBC/JDBC
2. HANA: compression is always on, 5:1 – 20:1
• Single HANA server (4 TB RAM) can hold 15-60 TB of data
• No transactional I/O to disk (except log file and start/stop)
• Row or column DB (at table level)
3. HANA: much more than DB cache in RAM
• Data access optimized for RAM
• Supports multi-node configurations
• 100s and 1,000s of times faster than “std” on-disk row DB
4. HANA: RAM DB for 0.1 TB – 50 TB data (even more)
• Good fit for complex, real-time BI/OLTP/ERP workloads
• Benefits from cheap/big RAM and fast CPUs
• Pricey (“too fast”?) for huge “warm/cold” data (+100TB?)
• HANA+IQ = good mix of in-memory and on-disk DB
Db page = 2-32KB
1 2 3 4 ………. 100
Db page = 512KB
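A rough check of the capacity figure on this slide. The split of RAM between compressed data and working memory is an assumption made here, not something the slide states: with about 3 TB of the 4 TB holding data, the 5:1 and 20:1 ratios reproduce the quoted 15-60 TB range.

```python
def data_capacity_tb(ram_for_data_tb, compression_ratio):
    """Uncompressed data volume that fits in the given RAM at the given ratio."""
    return ram_for_data_tb * compression_ratio

# Assumption (not stated on the slide): about 3 TB of the 4 TB server holds
# compressed data and the rest is working memory.
RAM_FOR_DATA_TB = 3
low = data_capacity_tb(RAM_FOR_DATA_TB, 5)
high = data_capacity_tb(RAM_FOR_DATA_TB, 20)
print(f"{low}-{high} TB")
```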
![Page 15: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/15.jpg)
MASSIVELY PARALLEL PROCESSING - MPP
(“DIVIDE-AND-CONQUER “)
SHARED-NOTHING VS. SHARED-DISK
![Page 16: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/16.jpg)
3 ways to add more CPU and storage
There are 3 ways to add more CPU power and storage:
1. Use a larger server (more CPUs, RAM, I/O channels)
  1. Limited by the size of the largest SMP server (128 cores, maybe 512 cores)
  2. Can be expensive
  3. HA and DR can be expensive
2. Divide data into many small partitions (MPP Shared Nothing or MPP S-N)
  1. Add a server (“node”) to “own” and process each data partition
  2. Node = server + data “slice”: adding a server requires adding storage and vice versa
  3. Query has to be spread to every node
  4. Results have to be collected and merged
  5. Simple to implement, has some drawbacks
3. Many servers access shared data and process it (MPP Shared Disk or MPP S-D)
  1. Optimal for indexed column DB because of low IO
  2. Difficult to implement, smart and flexible to use
  3. Suboptimal for row DB, scanning DB or storage HW filters: all need heavy IO
  4. Server can be added without affecting storage
  5. Storage can be added without affecting servers
  6. Architectural HA: server crash does not affect data access
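A minimal sketch of the shared-nothing pattern in item 2: each node owns one partition, the query is sent to every node, and the partial results are collected and merged. The key-modulo partitioning rule, the function names and the sample rows are assumptions for illustration; real MPP S-N products use their own distribution schemes.

```python
def partition(rows, num_nodes):
    """Assign each row to exactly one node by hashing its key (here: key mod N)."""
    nodes = [[] for _ in range(num_nodes)]
    for key, amount in rows:
        nodes[key % num_nodes].append((key, amount))
    return nodes

def local_sum(node_rows):
    """Work done on one node, against only the slice it owns."""
    return sum(amount for _, amount in node_rows)

def scatter_gather_sum(nodes):
    """Send the query to every node, then collect and merge the partial results."""
    partials = [local_sum(node_rows) for node_rows in nodes]
    return sum(partials), partials

rows = [(key, key * 10) for key in range(1, 11)]  # hypothetical (key, amount) rows
nodes = partition(rows, 5)
total, partials = scatter_gather_sum(nodes)
print(total, partials)
```

In a shared-disk design there is no `partition` step tied to a server: any server can read any part of the data, which is why servers and storage can be added independently.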
![Page 17: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/17.jpg)
MPP S-N (“shared nothing”)
Diagram: servers A, B, C, D and E, each with its own group of five 36 TB arrays
Add/remove node: significant time (new node is “empty”, data must be redistributed from other nodes)
Add storage: significant time (must take data from other nodes)
Remove storage: hours/days (data must be redistributed to other nodes)
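A minimal sketch of why adding a node to a shared-nothing system takes significant time. It assumes the simplest possible placement rule, key modulo node count, which is an illustration only; actual products may use schemes that move less data.

```python
def rows_moved(num_keys, old_nodes, new_nodes):
    """Count keys whose owning node changes when the node count changes."""
    return sum(1 for key in range(num_keys) if key % old_nodes != key % new_nodes)

NUM_KEYS = 30_000
moved = rows_moved(NUM_KEYS, 5, 6)
print(f"{moved} of {NUM_KEYS} keys change owner ({moved / NUM_KEYS:.0%})")
```

Under this naive rule, growing from 5 to 6 nodes relocates most of the data, not just one sixth of it.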
![Page 18: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/18.jpg)
MPP S-D (“shared disk”)
Scalable performance and data, flexible, config
Diagram: five DL980 servers, each labeled (A,B,C,D,E), connected through an FC switch to five groups (A, B, C, D, E) of five 36 TB arrays
Add/remove server: <1 min
Add storage: <1 min
Remove storage: <1 min(*)
![Page 19: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/19.jpg)
MPP : S-N vs. S-D
Current BI and Big Data: servers+storage are “sold together”
MPP S-D:
Small data,
Many CPUs
![Page 20: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/20.jpg)
MPP S-D (indexed) vs. MPP S-N
Flexibly combining storage and servers
MPP S-N
MPP S-D(I) High-CPU Low-data
MPP S-D (I) High-CPU High Data
MPP S-D (I) Low-CPU High Data
MPP S-D(I) Low-CPU Low-data
![Page 21: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/21.jpg)
MPP S-N , S-D, C-non-index and C-indexed Sybase IQ/EDMT 4XL (Full Rack)
MPP Shared Disk
2
160 Intel E7-4870
(2.4GHz) (No need for HW filter)
+100 TB/sec (indexed, no scan)
+30 TB/Hr
432 TB
+1,000 TB
96 racks (+500 custom)
15,360 (+700,000 custom)
On-line addition or removal of nodes Requires reorganization/repartitioning of Data with addition or removal of nodes
http://www.zdnet.com/blog/btl/emcs-launches-greenplum-appliance/40281
![Page 22: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/22.jpg)
MPP S-N – HANA (in-memory DB)
Server (A)
Server (B)
Server (C)
Server (D)
Server (E)
![Page 23: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/23.jpg)
HANA (in-memory DB)
![Page 24: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/24.jpg)
ADDING UNSTRUCTURED DATA TO BI,
STORING TBS, AND PBS OF DATA,
TEXT SEARCH
![Page 25: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/25.jpg)
Adding unstructured data to BI :
Load/Storage and Cross-Analysis
Problem 1: Load and Store
Load+store= too much for IT
1. Volume=Too Big: 100s of TB, multi-PB
2. Volume= Too Many: Billions & Trillions
3. Variety = too many different data types
4. Velocity=Slow Load+Index of Data
5. Cost of Data Storage is high
Problem 2: Cross-analysis
No cross-analysis of SQL and Text
1. BI = only SQL analysis (no text)
2. Text analysis= only text, no SQL
3. No cross-analysis of SQL and Text data (at large scale)
![Page 26: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/26.jpg)
Storing 1 PB in Hadoop (default config)
Hadoop 1 PB of data (default config)
• Hadoop node: 8 TB of data/node (24 TB raw, w/ 3x copy)
• Node = 8-core Xeon, 16 GB RAM, 12x 2TB disks, 2RU = $4K
• HW = 125 nodes (6 racks), 3 PB raw, 1,000 disks = $500,000
• Power = 125 kW (incl. A/C) = $109,500/year ($0.1/kWh)
• ~600 tons of CO2 per year (= ~120 cars)
Hadoop 1 PB
![Page 27: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/27.jpg)
Storing 10 PB in Hadoop (default config)
Hadoop 10 PB
• 1,200 servers, 12,000 disks, 60 racks, $5M ($4K/node)
• 1,200 kW = $1.1M/year in electricity (@ $0.10/kWh)
• ~6,000 tons of CO2 per year (= ~1,200 cars)
Hadoop 10 PB
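The sizing arithmetic behind the two Hadoop slides can be sketched as follows. The per-node figures come from the slides; the 1 kW per node is inferred from "125 kW for 125 nodes" and is an assumption, as is the function name. Straight scaling gives 1,250 nodes for 10 PB, where the slide rounds to 1,200 servers.

```python
import math

# Figures from the slides (default config with 3 copies of each block).
USABLE_TB_PER_NODE = 8      # 24 TB raw per node divided by 3 copies
RAW_TB_PER_NODE = 24
COST_PER_NODE_USD = 4_000
PRICE_PER_KWH_USD = 0.10
HOURS_PER_YEAR = 24 * 365

def cluster_estimate(data_tb, kw_per_node=1.0):
    """Node count, raw capacity, hardware cost and yearly power cost for data_tb.

    kw_per_node=1.0 is inferred from the slide's 125 kW for 125 nodes (incl. A/C).
    """
    nodes = math.ceil(data_tb / USABLE_TB_PER_NODE)
    power_kw = nodes * kw_per_node
    return {
        "nodes": nodes,
        "raw_tb": nodes * RAW_TB_PER_NODE,
        "hw_cost_usd": nodes * COST_PER_NODE_USD,
        "power_cost_usd_per_year": round(power_kw * HOURS_PER_YEAR * PRICE_PER_KWH_USD),
    }

one_pb = cluster_estimate(1_000)
ten_pb = cluster_estimate(10_000)
print(one_pb)
print(ten_pb)
```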
![Page 28: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/28.jpg)
Hadoop SW
![Page 29: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/29.jpg)
OPERATIONS:
HA, B/R, DR, UPGRADES, LIFECYCLE
AND MORE
![Page 30: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/30.jpg)
HA, DR and Backup/Restore ?
1. HA and DR are tricky for MPP S-N
2. MPP S-D handles HA, failure and change more easily, but needs a plan
3. Text Engines: HA, DR and B/R BI engines is an afterthought
4. Tapes? Not a good medium, very slow,
Uptime downtime per year
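The relationship between uptime ("how many nines?") and downtime per year is simple arithmetic, sketched below. The function name is ours and a 365-day year is assumed; the figures are computed, not copied from the slide's table.

```python
HOURS_PER_YEAR = 24 * 365

def downtime_hours_per_year(nines):
    """Allowed downtime per year for an availability of `nines` nines (3 -> 99.9%)."""
    unavailability = 10 ** -nines
    return HOURS_PER_YEAR * unavailability

for nines in range(2, 6):
    hours = downtime_hours_per_year(nines)
    print(f"{nines} nines: {hours:.2f} h/year ({hours * 60:.1f} min)")
```

Each additional nine cuts the yearly downtime budget by a factor of ten, which is why backup, restore and DR procedures that need hours of outage are incompatible with the higher SLA levels.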
![Page 31: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/31.jpg)
Specialists for HA/DR for MPP
1. Some of world’s largest DW use NonStopIQ
2. Zero-downtime backup
3. Near-zero-downtime restore
4. Full DR and HA
5. Storage cost of $400/TB (HP P2000 MSA) opens new possibilities
6. Tapes? Should you even bother when storage cost is $400/TB?
![Page 32: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/32.jpg)
Building Large BI since 2002
![Page 33: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/33.jpg)
DESIGNING THE “DREAM” BI
![Page 34: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/34.jpg)
Dream BI
1. Fast, scalable and flexible BI engine
  1. Speed: query and data loading speed
  2. Scales well with data volume, query complexity and # users
  3. Flexible configuration: add/remove storage/server as needed
  4. Compatibility with 3rd-party enterprise reporting and analytic tools
2. Integrates rich text search into BI queries
  1. Easy and cost-free inclusion of text search into BI analytics
  2. Fast loading of text data, without jeopardizing existing SQL data
3. Able to store large volumes of structured and unstructured data
  1. “Deep history” of SQL data and unstructured data
4. HA, DR, B/R, ACID, flexibility etc.
5. Price: affordable and comparable with Open Source
BI/DW Analytics + Text Search & Analytics + Big Data Store (‘Archive’) = Dream BI Solution
![Page 35: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/35.jpg)
EDMT SOLUTION
![Page 36: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/36.jpg)
Terminology
EDMT stands for:
Emails (any type of communications: email, SMS, Skype…)
Documents (100s of file and doc formats)
Multimedia ( image, audio, video and more)
Transactions (“standard” DB records )
![Page 37: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/37.jpg)
EDMT Solution:
Pragmatic Approach to Data
Store Data
EDMT Solution stores emails, SMS, Documents, Multimedia and DB Transactions in an RDBMS (e.g. IQ) for data retention and mixed BI+text analysis
SQL+Text Analysis of All Data
EDMT cross-analyzes all data using SQL+Text analysis to run Fraud Detection, e-Discovery, CRM, Audit, GRC, BI etc. 10x, 100x or 1,000x faster than before
BI/DW Analytics + Text Search & Analytics + Big Data Store (Archive) = Dream BI Solution ?
![Page 38: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/38.jpg)
EDMT solves what others cannot
EDMT: Big Data 2.0
Innovating data technology
• Enables BI systems to store and analyze unstructured data
• Broad DB support
• Supports all DB architectures:
  • R = row-oriented
  • C = columnar
  • I = indexed
  • NI = not indexed
  • SD = MPP shared disk
  • SN = MPP shared nothing
• OS:
• Certified: Linux, HP-UX (incl. Poulson)
• Verified: AIX, Solaris, Windows
SAP Sybase IQ (C-I-SD)
SAP Sybase ASE (R-I-SD)
Oracle RAC (R-I-SD)
Netezza (R-F-NI-SN)
Oracle Exadata (R-F-I-SD)
MySQL (R-I)
SAP HANA (RC-I-SN) (Q1 ‘13)
![Page 39: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/39.jpg)
2007: 1 Petabyte EDMT Solution
EDMT Big Data 2.0
• 1 PB of data (= 6 trillion rows) loaded and indexed
• Loading speed: 285 B rows per day (= 35 TB/day)
• Load latency: < 2 sec
• Pin-point search of 6 T rows = 0.5 sec
• DB = Sybase IQ
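To put the daily loading figures into per-second terms, a short conversion sketch follows. It uses decimal units (1 TB = 1,000,000 MB) and the two daily rates quoted on this slide; the variable names are ours.

```python
SECONDS_PER_DAY = 24 * 60 * 60

rows_per_day = 285e9   # 285 B rows per day (from the slide)
tb_per_day = 35        # 35 TB per day (from the slide)

rows_per_second = rows_per_day / SECONDS_PER_DAY
mb_per_second = tb_per_day * 1_000_000 / SECONDS_PER_DAY  # decimal units
print(f"{rows_per_second:,.0f} rows/s, {mb_per_second:,.0f} MB/s")
```

That is roughly 3.3 million rows and about 400 MB loaded and indexed every second, sustained.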
![Page 40: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/40.jpg)
2012 : 1 PB + new HW = PB for masses
Diagram: 40-core DL980; half of rack empty
• Same data capacity and speed as 2007 “1 PB”
  1. 1/15 in physical size, cost, electricity, weight
  2. Deploys in 1 week
  3. 288 TB of raw storage (~$115,000 at $400/TB)
  4. 40-core Xeon Linux server
• Price:
  SW + HW = ~$500,000
  Amount of data stored = 1,030 TB
  $/TB of data = ~$480/TB of data
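The price figures above can be checked with a few lines of arithmetic. All inputs are the slide's own numbers; the ratio of stored data to raw disk is derived here and is not stated on the slide.

```python
raw_storage_tb = 288
storage_price_per_tb = 400      # USD, from the slide
system_price = 500_000          # USD, SW + HW, approximate per the slide
data_stored_tb = 1_030

storage_cost = raw_storage_tb * storage_price_per_tb
price_per_tb_of_data = system_price / data_stored_tb
data_to_raw_ratio = data_stored_tb / raw_storage_tb
print(storage_cost, round(price_per_tb_of_data), round(data_to_raw_ratio, 2))
```

The storage cost comes to $115,200 and the system price to about $485 per TB of data, in line with the rounded ~$115,000 and ~$480/TB on the slide. Holding 1,030 TB of data on 288 TB of raw disk implies the data occupies well under a third of its original size.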
![Page 41: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/41.jpg)
EDMT architecture
Innovating data technology
Diagram labels:
IQ, HANA, Oracle, Exadata, Netezza, MySQL
DB Storage
EDMT Server
EDMT HW
Data Management, Access Control, Alerts, Auto-Classification, Collaboration, Taxonomy, Data Retention, Connectivity, Search API
EDMT API & Connectors
EDMT
ETL (Ingest)
EDMT Modules
Real-time ETL Parser, Metadata Manager, Parallel Loader
ETL Storage
Linux, HP-UX; Linux x86
ETL and Application Servers; Database Servers
EDMT Data Access & Analysis Layer
EDMT GUI; Web Services; Data Export; Proxy; Mobile GUI; eDiscovery, Audit, Fraud Modules; Social Net Analysis
![Page 42: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/42.jpg)
2012: 1 Petabyte EDMT : for masses
• Out-of-the-box features of EDMT
  1. Enterprise BI engines (IQ/HANA: SQL, ACID)
  2. Connector for Business Objects, Cognos etc.
  3. Complex data reporting and visualization
  4. eDiscovery, Litigation hold, Audit, Compliance
  5. Full-text, proximity, and dictionary search
  6. FINRA post-review and random sampling workflow
  7. Cross analysis of structured+unstructured data
  8. Email+file archive, indexing & auto-categorization
  9. Multimedia archiving, indexing, and auto-cat
  10. DB record analytics and archiving
  11. Retention, WORM and records management
• Price: SW + HW = ~$500,000 or ~$480/TB of data
BI/DW Analytics + Text Search & Analytics + Big Data Store (Archive) = Dream BI Solution
Diagram: 40-core DL980; empty ½ rack
![Page 43: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/43.jpg)
EDMT systems
EDMT Big Data Appliance: Certified and Pre-Configured
• And beyond…
EDMT® Solution Models and Specifications

| # | Model | Description | # cores | Disk size (TB) | # emails & files, store + index (100KB each) | # emails & files, index only (100KB each) | DB rows (150-byte) |
|---|---|---|---|---|---|---|---|
| 7 | 16K | 96 racks | 15,360 | 41,472 | 180 B | 1,800 B | 640 Trillion |
| 6 | 4K | 24 racks | 3,840 | 10,368 | 48 B | 480 B | 160 Trillion |
| 5 | 1K | 6 racks | 960 | 2,592 | 12 B | 120 B | 42 Trillion |
| 4 | 4XL | Full rack | 160 | 432 | 2 B | 20 B | 7 Trillion |
| 3 | PB | 1/2 rack | 80 | 288 | 1.6 B | 16 B | 6 Trillion |
| 2 | XL | 1/3 rack | 40 | 144 | 600 M | 6 B | 2 Trillion |
| 1 | L | 1/4 rack | 24 | 72 | 300 M | 3 B | 1 Trillion |
|  | M | 2 RU | 12 | 36 | 150 M | 1.5 B | 500 B |
|  | S | 2 RU | 6 | 36 | 150 M | 1.5 B | 500 B |
|  | XS | 2 RU | 4 | 36 | 150 M | 1.5 B | 500 B |

Row-group labels in the figure: High, Mid, Entry, On Line
Configuration Rule 1 Two or more EDMT® Solutions can be combined in one larger EDMT Solution
Configuration Rule 2 Storage can grow in 36TB increments ("Array", $14,000 or $400/TB )
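Configuration Rule 2 can be turned into a small planning helper. The array size and price are the slide's figures; the function name and the 300 TB example are ours.

```python
import math

ARRAY_TB = 36
ARRAY_PRICE_USD = 14_000

def storage_plan(target_tb):
    """Arrays needed to reach target_tb, with resulting capacity and list cost."""
    arrays = math.ceil(target_tb / ARRAY_TB)
    return arrays, arrays * ARRAY_TB, arrays * ARRAY_PRICE_USD

arrays, capacity_tb, cost_usd = storage_plan(288)   # the 1/2-rack "PB" model
print(arrays, capacity_tb, cost_usd)
print(storage_plan(300))  # a target that is not a multiple of 36 TB rounds up
```

The disk sizes listed for the models (36, 72, 144, 288, 432 TB and up) are all whole multiples of the 36 TB array.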
![Page 44: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/44.jpg)
EDMT systems
EDMT Big Data Appliance: Certified and Pre-Configured
• Entry level – hardware valued at $125,000 to $250,000 (US List Price)
# email & files (store + index) (100KB each): 1.6B
# email & files (index only) (100KB each): 16B
DB rows (150-byte): 6 Trillion

# email & files (store + index) (100KB each): 600M
# email & files (index only) (100KB each): 6B
DB rows (150-byte): 2 Trillion
![Page 45: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/45.jpg)
EDMT systems
EDMT Big Data Appliance: Certified and Pre-Configured
• Mid level – hardware valued at $350K to $2,000,000 (US List Price)
# email & files (store + index) (100KB each): 12B
# email & files (index only) (100KB each): 120B
DB rows (150-byte): 42 Trillion

# email & files (store + index) (100KB each): 2B
# email & files (index only) (100KB each): 20B
DB rows (150-byte): 7 Trillion
![Page 46: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/46.jpg)
EDMT systems
EDMT Big Data Appliance: Certified and Pre-Configured
• High level – hardware valued at $8M (US). 4x larger system 16K at $30M (US)
| # emails & files (store + index, 100 KB each) | # emails & files (index only, 100 KB each) | DB rows (150-byte) |
|---|---|---|
| 48B | 480B | 160 trillion |
![Page 47: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/47.jpg)
EDMT systems
EDMT Big Data Appliance: Certified and Pre-Configured
• Highest level – EDMT supports up to 12,000 nodes

| # emails & files (store + index, 100 KB each) | # emails & files (index only, 100 KB each) | DB rows (150-byte) |
|---|---|---|
| 180B | 1,800B | 640 trillion |
![Page 48: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/48.jpg)
Federated EDMT using IQ and HANA
+12 racks
More info about 1 PB HANA: http://www.saphana.com/community/blogs/blog/2012/11/12/the-sap-hana-one-petabyte-test
EDMT 1 PB
• 1 server / 80 cores / 1 TB RAM
• 1/2 rack, 288 TB disks
• ~$500K (HW+SW) IQ + HANA ($TBD)
• 1 PB of raw data
• 6 trillion rows, star schema
• Load: 285 B rows/day
• Search of 6 T rows = 0.5 sec
• 50 concurrent streams
[Diagram labels: HANA; disks; IQ (HP-UX or DL980); HANA 1 PB; switch]
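The load and size figures on this slide can be cross-checked with a few lines of arithmetic. This is a rough sketch, assuming a decimal petabyte (10^15 bytes) and a constant load rate around the clock; the variable and function names are made up for illustration.

```python
SECONDS_PER_DAY = 24 * 60 * 60

# Figures quoted on the slide.
TOTAL_ROWS = 6_000_000_000_000        # 6 trillion rows
LOAD_ROWS_PER_DAY = 285_000_000_000   # 285 B rows/day
RAW_BYTES = 10**15                    # 1 PB, assuming a decimal petabyte

def days_to_load(total_rows: int, rows_per_day: int) -> float:
    """Days needed to load total_rows at a steady daily rate."""
    return total_rows / rows_per_day

def rows_per_second(rows_per_day: int) -> float:
    """Sustained rows/second implied by a daily load rate."""
    return rows_per_day / SECONDS_PER_DAY

def bytes_per_row(raw_bytes: int, total_rows: int) -> float:
    """Average raw row width implied by total size and row count."""
    return raw_bytes / total_rows

if __name__ == "__main__":
    print(f"days to load:   {days_to_load(TOTAL_ROWS, LOAD_ROWS_PER_DAY):.1f}")
    print(f"rows/second:    {rows_per_second(LOAD_ROWS_PER_DAY):,.0f}")
    print(f"bytes per row:  {bytes_per_row(RAW_BYTES, TOTAL_ROWS):.0f}")
```

Under these assumptions the full 6 trillion rows take about 21 days to load, at roughly 3.3 million rows per second, and the average raw row is about 167 bytes, close to the 150-byte rows used in the sizing tables.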
![Page 49: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/49.jpg)
EDMT: Federated IQ/HANA vs. Size/Speed
[Chart: speed (log scale from 1 to 10,000; Low / Med / High) versus data size (Small: < 100 TB; Med: 100 TB - 1 PB; Large: +1 PB) for three configurations: EDMT @ IQ, EDMT @ HANA, and EDMT @ HANA+IQ]
![Page 50: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/50.jpg)
Multi-site DR w/ NonStopEDMT (2010)
IQ 1
Server 3 - PowerExpress 520; AIX Internal: 10.26.51.62 [hqiq01] External:
IQ 2
IQ 3
Remote Site
Server 1 - Xeon 8-core; Linux; Internal: 10.26.51.61 [hqetl01]
External: 216.207.70.33 Ports: Smtp & Http
Server 2 - Xeon 8-core; Linux; Internal: 10.26.51.65 [hqetl02]
External: 216.207.70.32 Ports: Smtp & Http
SAN
Staging_1 Staging_2 Staging_3 Staging_4
EDMT 1
EDMT 2
Node 1 10.26.51.35 [hqatg05]
Node 2 10.26.51.36 [hqatg06]
Node 3 10.26.51.37 [hqatg07]
Remote EDMT 3
![Page 51: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/51.jpg)
Storing 1 PB in EDMT & Hadoop
EDMT 1 PB (1/2 rack): $450K, 10 kW ($9K/year)
Storage: 96 TB, $20,090 ($209/TB)
8-core Xeon server: $1,172
16-core Xeon server: $4,860
![Page 52: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/52.jpg)
CUSTOMER SUCCESS STORIES
![Page 53: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/53.jpg)
EDMT Success Story 1:
Global Telecom & ISP (US)
Challenge: SQL DW with structured data (CDR, SMS, billing)
Solution:
Step 1: Store 16 B CDR+SMS per day with EDMT (= 85% of the world’s SMS data)
Step 2: Enable cross-correlation of CDR data with fully indexed text content
Benefit: create new services for more than 900 telco carriers
![Page 54: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/54.jpg)
EDMT Success Story 2:
University Research Clinic and Hospital
Challenge: Email and file archive with text search
Solution:
Step 1: Store, search, and apply retention to all emails/SMS/IM, with collaboration
Step 2: Add patient insurance payment data and cross-analyze
Benefit: full 360-degree view of patients, carriers, and physicians
![Page 55: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/55.jpg)
EDMT Success Story 3:
Taxation Office of European Country
Challenge: SQL DW, DB consolidation project
Solution:
Step 1: Consolidate 30 years of SQL records for 10 M taxpayers
Step 2: Capture audit data (emails, voicemails, faxes, etc.) exchanged with audited taxpayers for audit, litigation, and compliance purposes
Benefit: at ZERO cost, the tax office gets a 360-degree customer view
![Page 56: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/56.jpg)
EDMT Success Story 4:
EU Country Intelligence Agency
Challenge: Email/SMS/IM archive and text search
Solution:
Step 1: Load and cross-correlate huge volumes of email/SMS to prevent cyber crime, online attacks, web fraud, and digital threats. Loads more than 20 TB of data per day. Real-time, sub-second searches
Step 2: Store financial and travel data (= SQL) and cross-correlate with emails and SMS in real time. More than 1,000 TB (1 PB) in size
Benefit: previously impossible real-time monitoring and actionable intelligence
![Page 57: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/57.jpg)
COMPARISONS AND SIZING RULES
![Page 58: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/58.jpg)
Price-per-TB of User data (compressed)
EDMT Solution : Three-year Cost per COMPRESSED TB of User Data < $3,000
Download the entire document from:
ftp://public.dhe.ibm.com/software/data/sw-library/infosphere/analyst-reports/ITG-ISAS-Exadata-Teradata.pdf
![Page 59: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/59.jpg)
2012: 1 PB + new HW = PB for the masses
[Diagram labels: 40-core Linux server; half of the rack is empty]
• Same data capacity and speed as the 2007 “1 PB” system:
1. 1/15 the physical size, cost, electricity, and weight
2. Deploys in 1 week
3. 288 TB of raw storage (~$115,000, $400/TB)
4. 40-core Xeon Linux server (HP DL980)
• Price:
SW + HW = ~$500,000
Amount of data stored = 1,030 TB
$/TB of data = ~$480/TB of data
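The price-per-TB figures on this slide follow from simple division. A minimal sketch, using only the numbers quoted above (the function name is hypothetical):

```python
def price_per_tb(total_price_usd: float, capacity_tb: float) -> float:
    """Price divided by capacity, in USD per TB."""
    if capacity_tb <= 0:
        raise ValueError("capacity_tb must be positive")
    return total_price_usd / capacity_tb

# Whole system: ~$500,000 (SW + HW) for 1,030 TB of stored data.
system_usd_per_tb = price_per_tb(500_000, 1_030)

# Raw storage only: ~$115,000 for 288 TB of raw disk.
raw_usd_per_tb = price_per_tb(115_000, 288)

if __name__ == "__main__":
    print(f"system:      ${system_usd_per_tb:,.0f} per TB of data")
    print(f"raw storage: ${raw_usd_per_tb:,.0f} per raw TB")
```

The results come out near $485 per TB of stored data and $399 per raw TB, in line with the rounded ~$480 and $400 on the slide. The gap between 288 TB raw and 1,030 TB stored implies roughly 3.6x compression.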
![Page 60: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/60.jpg)
EDMT 1 PB vs. Hadoop 1 PB (3c-def)
Hadoop 1 PB of data (default config):
• Hadoop node: 8 TB of data per node (24 TB raw, with 3x copy)
• Node = 8-core Xeon, 16 GB RAM, 12 x 2 TB disks, 2RU = $4K
• HW = 125 nodes (6 racks), 3 PB raw, 1,000 disks = $500,000
• Power = 125 kW (incl. A/C) = $109,500/year ($0.10/kWh)
• ~600 tons of CO2 per year (~120 cars)
EDMT 1 PB (1/2 rack):
• ~$500K
• 10 kW ($9K/year)
• 50 tons CO2/year (~10 cars)
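The Hadoop node count and hardware price follow from the per-node figures and the 3x replication factor. The sketch below reproduces that sizing; `NodeSpec` and `size_cluster` are hypothetical names, and the model ignores operating-system overhead, scratch space, and master nodes.

```python
import math
from dataclasses import dataclass

@dataclass(frozen=True)
class NodeSpec:
    """Per-node figures quoted on the slide for a default Hadoop node."""
    disks: int = 12
    disk_tb: float = 2.0
    replication: int = 3      # "3x copy"
    price_usd: int = 4_000

    @property
    def raw_tb(self) -> float:
        return self.disks * self.disk_tb

    @property
    def usable_tb(self) -> float:
        # Each block is stored `replication` times, so usable = raw / replication.
        return self.raw_tb / self.replication

def size_cluster(data_tb: float, node: NodeSpec = NodeSpec()) -> dict:
    """Nodes, raw capacity, disk count, and hardware price for data_tb of user data."""
    nodes = math.ceil(data_tb / node.usable_tb)
    return {
        "nodes": nodes,
        "raw_tb": nodes * node.raw_tb,
        "disks": nodes * node.disks,
        "hardware_usd": nodes * node.price_usd,
    }

if __name__ == "__main__":
    print(size_cluster(1_000))  # 1 PB of user data
```

For 1 PB this gives 125 nodes, 3 PB raw, and $500,000 of hardware, matching the slide. With 12 disks per node the same model yields 1,500 disks, more than the 1,000 quoted.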
![Page 61: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/61.jpg)
AMAZON Cloud: 288 TB of storage (“PB”)
1. Four monthly cloud-storage payments may pay for 288 TB of EDMT storage; the other 44 months (out of a typical 48-month HW cycle) are free
2. Savings could be significant
![Page 62: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/62.jpg)
EDMT 10 PB vs. Hadoop 10 PB
Hadoop 10 PB:
• 1,200 servers, 12,000 disks, 60 racks
• $5M ($4K/node)
• 1,200 kW = $1.1M/year in electricity (@ $0.10/kWh)
• ~6,000 tons of CO2 per year (~1,200 cars)
EDMT 10 PB (6 racks):
• ~$2M-$5M
• 100 kW ($90K/year)
• 500 tons CO2/year (~100 cars)
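The electricity figures can be reproduced from the power draw and the quoted $0.10/kWh rate. The sketch below also derives the CO2 intensity that the slide's numbers imply; that factor is back-calculated from the slide, not taken from an external source, and the function names are hypothetical.

```python
HOURS_PER_YEAR = 24 * 365  # assumes continuous operation, non-leap year

def annual_electricity_usd(power_kw: float, usd_per_kwh: float = 0.10) -> float:
    """Yearly electricity cost for a constant power draw."""
    return power_kw * HOURS_PER_YEAR * usd_per_kwh

def implied_tons_co2_per_mwh(tons_per_year: float, power_kw: float) -> float:
    """CO2 intensity implied by a yearly tonnage and a constant power draw."""
    mwh_per_year = power_kw * HOURS_PER_YEAR / 1_000
    return tons_per_year / mwh_per_year

if __name__ == "__main__":
    for name, kw, tons in (("Hadoop 10 PB", 1_200, 6_000), ("EDMT 10 PB", 100, 500)):
        cost = annual_electricity_usd(kw)
        intensity = implied_tons_co2_per_mwh(tons, kw)
        print(f"{name}: ${cost:,.0f}/year, {intensity:.2f} t CO2/MWh implied")
```

This gives about $1,051,200 per year for the 1,200 kW Hadoop cluster and $87,600 per year for the 100 kW EDMT system, which the slide rounds to $1.1M and $90K. Both CO2 figures imply the same intensity of roughly 0.57 tons per MWh.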
![Page 63: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/63.jpg)
AMAZON Cloud storage for 10 PB
$300K/month for 3 PB of storage.
We sell 3 PB for $1.2M.
1. Four monthly cloud-storage payments may pay for 3 PB of EDMT storage; the other 44 months (out of a typical 48-month HW cycle) are free
2. Savings could be significant
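The break-even claim is a one-line calculation. A minimal sketch using the slide's figures ($300K/month for cloud storage, $1.2M purchase price, 48-month hardware cycle); the function names are hypothetical, and the model ignores power, staffing, and financing costs on either side.

```python
import math

def breakeven_months(purchase_usd: float, cloud_usd_per_month: float) -> int:
    """Whole months of cloud fees needed to equal the purchase price."""
    return math.ceil(purchase_usd / cloud_usd_per_month)

def months_after_breakeven(purchase_usd: float, cloud_usd_per_month: float,
                           hw_cycle_months: int = 48) -> int:
    """Months left in the hardware cycle once the purchase has paid for itself."""
    return max(hw_cycle_months - breakeven_months(purchase_usd, cloud_usd_per_month), 0)

if __name__ == "__main__":
    purchase, monthly = 1_200_000, 300_000
    print("break-even months:", breakeven_months(purchase, monthly))
    print("remaining months: ", months_after_breakeven(purchase, monthly))
    print("cloud cost over 48 months: $", f"{48 * monthly:,}")
```

With these inputs the purchase breaks even after 4 months, leaving 44 months of the cycle, while 48 months of cloud storage at the quoted rate would total $14.4M.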
![Page 64: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/64.jpg)
Hadoop v1
![Page 65: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/65.jpg)
EDMT Million Channel Real Time Ingestor
![Page 66: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/66.jpg)
EDMT: store Hadoop data in EDMT for speed and SQL/ACID
HADOOP
![Page 67: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/67.jpg)
EDMT® vs. Google Search Appliance (GSA)
Dell.com
1. EDMT Solution can handle more data
2. GSA is more expensive “per document” than EDMT®
http://search.dell.com/results.aspx?s=gen&c=us&l=en&cs=&k=gb-7007&cat=all&x=7&y=6
EDMT® “L”
Google GSA
![Page 68: Present and Future of Enterprise BI - DAMA IndianaPresent and Future of Enterprise BI ... • DW/DM and ETL on Sybase IQ, HANA, Oracle, Exadata, Netezza , ... Row-DB vs. Columnar DB](https://reader033.vdocuments.site/reader033/viewer/2022052710/5aa47bd97f8b9a7c1a8c338d/html5/thumbnails/68.jpg)
BMMsoft – Services and products
1/17/2013
• Assessment of your current BI and Big Data situation
• Design of “Dream BI” to meet your future BI and Big Data needs
• EDMT Solution (on any DB supported platform)
• 2-hour consultation block