ramunas balukonis. research dwh

Research of technologies for Big Data Analytics (2013-2014). Ramūnas Balukonis, Adform


Post on 10-May-2015


DESCRIPTION

#BigDataBY

TRANSCRIPT

Page 1: Ramunas Balukonis. Research DWH

VISIT OUR BLOG: adform.com TWITTER: adforminsider

Research of technologies for Big Data Analytics

(2013-2014)


Ramūnas Balukonis, Adform

Page 2: Ramunas Balukonis. Research DWH

Our impressions growth

Now: 2 billion transactions, or 1.4 TB of raw data, per day

In 2012 we started researching technologies to process, load, and provide data for analytics

[Chart: Impressions per year, in billions of rows, 2001 to 2014; vertical axis from 0 to 500]
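To put the daily volume in perspective, the two figures quoted on this slide (2 billion transactions and 1.4 TB raw per day) can be turned into average per-transaction and per-second rates. This is only a rough sketch: it assumes decimal terabytes and a perfectly even load across the day, neither of which the slide states, and the function name is made up for illustration.

```python
# Figures quoted on the slide.
TRANSACTIONS_PER_DAY = 2_000_000_000  # 2 billion transactions per day
RAW_BYTES_PER_DAY = 1.4e12            # 1.4 TB per day; decimal TB is an assumption
SECONDS_PER_DAY = 24 * 60 * 60


def daily_volume_summary(transactions=TRANSACTIONS_PER_DAY,
                         raw_bytes=RAW_BYTES_PER_DAY):
    """Return average rates, assuming the load is spread evenly over the day."""
    return {
        "bytes_per_transaction": raw_bytes / transactions,
        "transactions_per_second": transactions / SECONDS_PER_DAY,
        "megabytes_per_second": raw_bytes / SECONDS_PER_DAY / 1e6,
    }


summary = daily_volume_summary()
for name, value in summary.items():
    print(f"{name}: {value:,.1f}")
```

Real traffic has peaks, so the sustained ingest rate a platform must handle is likely higher than these averages.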

Page 3: Ramunas Balukonis. Research DWH

Where we are now


Page 4: Ramunas Balukonis. Research DWH

DWH – our needs for Big Data Analytics


Query response times of moments at most

No downtime window

Short time to market

Near real time latency

No backups

Unattended scaling

Minor data loss and data discrepancies are tolerable

Page 5: Ramunas Balukonis. Research DWH


Page 6: Ramunas Balukonis. Research DWH

How we tested

Testing took up to 3 months per technology

Test environment: 3 × (24 cores + 96 GB RAM + 800 GB RAID 10)

Loaded 5 TB of uncompressed data
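The test setup above can be summed up with a little arithmetic. The sketch below totals the resources of the three nodes and compares them with the 5 TB dataset. It assumes decimal units and treats the 800 GB figure as each node's entire storage, which the slide does not confirm; the `Node` class is purely illustrative.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    cores: int
    ram_gb: int
    disk_gb: int  # RAID 10 size as quoted; raw vs usable capacity is not stated


# Three identical nodes, as described on the slide.
NODES = [Node(cores=24, ram_gb=96, disk_gb=800) for _ in range(3)]
DATASET_GB = 5_000  # 5 TB uncompressed, assuming decimal units

total_cores = sum(node.cores for node in NODES)
total_ram_gb = sum(node.ram_gb for node in NODES)
total_disk_gb = sum(node.disk_gb for node in NODES)

# How many times larger the uncompressed data is than RAM and disk.
data_to_ram_ratio = DATASET_GB / total_ram_gb
data_to_disk_ratio = DATASET_GB / total_disk_gb

print(f"cores={total_cores}, ram={total_ram_gb} GB, disk={total_disk_gb} GB")
print(f"data/RAM={data_to_ram_ratio:.1f}x, data/disk={data_to_disk_ratio:.1f}x")
```

Under these assumptions the uncompressed dataset is far larger than memory and about twice the quoted disk space, so the tests would have exercised on-disk compression rather than in-memory processing alone.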

Page 7: Ramunas Balukonis. Research DWH

Candidates for BIG Data Analytics


Page 8: Ramunas Balukonis. Research DWH

IBM Netezza


Appliance: no commodity HW

No elastic scale out

Global presence, sales, delivery and support.

Page 9: Ramunas Balukonis. Research DWH

HP Vertica


Elastic scale out

Brilliant performance (Load/Select)

No stored procedures

No UI

Price per TB

Page 10: Ramunas Balukonis. Research DWH

SAP Sybase IQ


Scaling using shared disk

Similar to MS SQL Server (tools, logic, stored procedures, system views and system procedures, documentation similar to Books Online)

Concerns about ease of implementation and use

Price per core

Page 11: Ramunas Balukonis. Research DWH

Amazon Redshift


Price – the only player we tested that publishes prices online

Filters badly hurt query performance

Cluster resize/scaling

Unstable connection

Page 12: Ramunas Balukonis. Research DWH

Calpont InfiniDB


Shared nothing

MySQL as front end – tools, connectors, procedures, etc.

Community edition (offers prebuilt solutions) or Enterprise Edition

Super fast load

Relatively slow query performance

Slow insert/update/delete

Page 13: Ramunas Balukonis. Research DWH

Where we are now


Page 14: Ramunas Balukonis. Research DWH

What we learned

The number of suitable technologies drops as the TBs increase

Adapt technology to your requirements, not vice versa

No Silver Bullet: Queries vs row store – 10X

Load speed vs row store – 4X

Compression vs row store – 4X

... And we'll learn much more after we run our first report
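As a rough illustration of what the 10X / 4X / 4X ratios above would mean in practice, the sketch below applies them to a row-store baseline. The baseline values (a 60-second query, an 8-hour load, 1.4 TB stored per day) are hypothetical and chosen only for the example; the ratios are treated as fixed point values, although real gains vary by query and data.

```python
# Ratios quoted on the slide, relative to a row store.
QUERY_SPEEDUP = 10
LOAD_SPEEDUP = 4
COMPRESSION_RATIO = 4


def project_from_row_store(query_seconds, load_hours, stored_tb):
    """Scale hypothetical row-store figures by the ratios from the slide."""
    return {
        "query_seconds": query_seconds / QUERY_SPEEDUP,
        "load_hours": load_hours / LOAD_SPEEDUP,
        "stored_tb": stored_tb / COMPRESSION_RATIO,
    }


# Hypothetical baseline: none of these three numbers comes from measured results.
projection = project_from_row_store(query_seconds=60.0,
                                    load_hours=8.0,
                                    stored_tb=1.4)
for name, value in projection.items():
    print(f"{name}: {value:.2f}")
```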
