real time api delivering data @ scale

Akash Mishra Real Time API delivering Data @ Scale

Upload: akash-mishra

Post on 17-Jul-2015

Agenda

API Overview

Key System Requirements

Big Data System Vs RDBMS

Architecture

Data Flow

Questions?

API Overview

API details

REST based API

Partners can request various types of reports

Each report draws on data on the order of terabytes (TB)

Sample Request

?start-date=2012-10-01&end-date=2012-10-29&partner=1&aggregate-by=state,city

Response

Zip file [size on the order of 10-30 MB]
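As an illustration of how the sample request above might be read on the server side, here is a minimal sketch that parses the query string into typed fields. The class name `ReportRequest`, the validation rules, and the choice to treat `aggregate-by` as optional are assumptions; the slides only show the parameter names.

```java
import java.net.URLDecoder;
import java.nio.charset.StandardCharsets;
import java.time.LocalDate;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical holder for the parameters shown in the sample request. */
final class ReportRequest {
    final LocalDate startDate;
    final LocalDate endDate;
    final int partnerId;
    final List<String> aggregateBy;

    private ReportRequest(LocalDate startDate, LocalDate endDate,
                          int partnerId, List<String> aggregateBy) {
        this.startDate = startDate;
        this.endDate = endDate;
        this.partnerId = partnerId;
        this.aggregateBy = aggregateBy;
    }

    /** Parses a raw query string such as "?start-date=...&end-date=...". */
    static ReportRequest parse(String query) {
        if (query.startsWith("?")) {
            query = query.substring(1);
        }
        Map<String, String> params = new LinkedHashMap<>();
        for (String pair : query.split("&")) {
            int eq = pair.indexOf('=');
            if (eq <= 0) {
                continue; // skip malformed pairs
            }
            String key = URLDecoder.decode(pair.substring(0, eq), StandardCharsets.UTF_8);
            String value = URLDecoder.decode(pair.substring(eq + 1), StandardCharsets.UTF_8);
            params.put(key, value);
        }
        // Dates in the sample use ISO format (yyyy-MM-dd), which LocalDate parses directly.
        LocalDate start = LocalDate.parse(require(params, "start-date"));
        LocalDate end = LocalDate.parse(require(params, "end-date"));
        if (end.isBefore(start)) {
            throw new IllegalArgumentException("end-date is before start-date");
        }
        int partner = Integer.parseInt(require(params, "partner"));
        // Assumption: aggregate-by is optional and comma-separated.
        String agg = params.get("aggregate-by");
        List<String> aggregateBy = (agg == null || agg.isEmpty())
                ? List.of()
                : Arrays.asList(agg.split(","));
        return new ReportRequest(start, end, partner, aggregateBy);
    }

    private static String require(Map<String, String> params, String name) {
        String value = params.get(name);
        if (value == null) {
            throw new IllegalArgumentException("missing parameter: " + name);
        }
        return value;
    }
}
```

Calling `ReportRequest.parse` on the sample query string yields a start date of 2012-10-01, partner 1, and the aggregation columns `state` and `city`.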

Key System Requirements

Interactive Filtering Query: partners can filter data on various parameters.

Real Time Response: SLA of 1-3 min.

Security

Extremely private and confidential data.

Must pass an audit by an external vendor

Scalability

Serving more customers should require only adding more machines

Big Data System Vs Relational Data System

Large Amount of Data [on the order of TBs]

Hadoop/Hive

RDBMS

Real Time Interactive Filtering/Querying

Hadoop/Hive

RDBMS

Joins between large tables [millions x millions x millions]

Hadoop/Hive

RDBMS

Big Data System Vs Relational Data System

Access/Security Control

Hadoop/Hive

RDBMS

Resilient to Hardware failure and Auto Scaling

Hadoop/Hive

RDBMS

Fast read operations

Hadoop/Hive

RDBMS

Architecture

Data Flow

De-normalization on Hadoop/Hive

Time: 3 hrs

#Records: 230m

Data Flow

Dynamic partitioning on Hadoop/Hive

#Buckets: 15

#Records: 230m
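To make the idea of splitting records across 15 buckets concrete, here is a minimal sketch of hash-based bucket assignment. The slides do not say which column the data is split on or how Hive was configured, so the integer key (imagined here as a partner or region id) and the class name `BucketSketch` are assumptions; Hive's own hash function depends on the column type, and this only shows the general non-negative-modulo idea.

```java
/** Illustrative only: assigns integer keys to one of 15 buckets. */
final class BucketSketch {
    static final int NUM_BUCKETS = 15; // bucket count taken from the slide

    /** Maps a key to a bucket index in the range [0, NUM_BUCKETS). */
    static int bucketFor(int key) {
        // Masking with Integer.MAX_VALUE clears the sign bit so the
        // remainder is never negative, even for negative keys.
        return (Integer.hashCode(key) & Integer.MAX_VALUE) % NUM_BUCKETS;
    }

    /** Counts how many of the given keys land in each bucket. */
    static int[] countPerBucket(int[] keys) {
        int[] counts = new int[NUM_BUCKETS];
        for (int key : keys) {
            counts[bucketFor(key)]++;
        }
        return counts;
    }
}
```

Records with the same key always land in the same bucket, so a query that filters on that key only needs to read one of the 15 slices.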

Data Flow

Sqoop Export

#Records: 230m

Size: 1 TB

Data Flow

Security Control in RDBMS

Strong User authentication mechanism.

Restricted access to each user on database and table level

Each partner has specific user and associated tables

No cross-referencing of data [tables] across partners.
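The per-partner isolation described above could be mirrored in the API layer roughly as follows. This is a sketch under assumptions: the class `PartnerAccess`, the `Grant` holder, the table and user names, and the `report_date` column are all hypothetical. The point it illustrates is that the table name comes from a server-side mapping, never from the request, so one partner cannot reach another partner's table.

```java
import java.util.Map;

/** Hypothetical lookup from partner id to that partner's own DB user and table. */
final class PartnerAccess {

    /** Illustrative pairing of a database login with the table it may read. */
    static final class Grant {
        final String dbUser;
        final String table;

        Grant(String dbUser, String table) {
            this.dbUser = dbUser;
            this.table = table;
        }
    }

    private final Map<Integer, Grant> grants;

    PartnerAccess(Map<Integer, Grant> grants) {
        this.grants = Map.copyOf(grants); // defensive, immutable copy
    }

    /** Returns the grant for a partner, or fails if none is configured. */
    Grant grantFor(int partnerId) {
        Grant grant = grants.get(partnerId);
        if (grant == null) {
            throw new SecurityException("no grant for partner " + partnerId);
        }
        return grant;
    }

    /** Builds a parameterized query against the partner's own table only. */
    String selectFor(int partnerId) {
        Grant grant = grantFor(partnerId);
        // report_date is an assumed column name; values are bound as parameters.
        return "SELECT * FROM " + grant.table + " WHERE report_date BETWEEN ? AND ?";
    }
}
```

The database-level restrictions on the slide remain the real enforcement; a check like this only keeps the application from asking for something the database would refuse anyway.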

Data Flow

Java API

Common Pattern [Streaming]

• Read a bunch of records from DB.

• Process records.

• Stream back to client.
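The read, process, stream pattern above can be sketched as follows. This is not the deck's actual code: the `Iterator<String[]>` stands in for a database cursor, the "processing" is just joining fields into a CSV line, and the class name, entry name, and flush interval are assumptions. The zip output matches the response format mentioned earlier in the slides.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.util.Iterator;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

/** Illustrative streaming writer: rows go to the client as they are read. */
final class ReportStreamer {
    static final int FLUSH_EVERY = 1000; // assumed batch size

    /**
     * Writes each row as a CSV line into a single zip entry and returns the
     * number of rows written. The caller keeps ownership of the output stream.
     */
    static long streamAsZip(Iterator<String[]> rows, OutputStream out) throws IOException {
        ZipOutputStream zip = new ZipOutputStream(out);
        zip.putNextEntry(new ZipEntry("report.csv"));
        Writer writer = new OutputStreamWriter(zip, StandardCharsets.UTF_8);

        long written = 0;
        while (rows.hasNext()) {
            String[] row = rows.next();          // read
            String line = String.join(",", row); // process (no CSV escaping here)
            writer.write(line);                  // stream
            writer.write('\n');
            written++;
            if (written % FLUSH_EVERY == 0) {
                writer.flush(); // push a batch towards the client
            }
        }
        writer.flush();   // make sure buffered characters reach the zip entry
        zip.closeEntry();
        zip.finish();     // write the zip trailer without closing 'out'
        return written;
    }
}
```

Because only one row is held at a time, memory use stays roughly constant regardless of how many rows the report contains.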

Avoid creating unnecessary objects

• Java heap memory exception caused by using String in place of a char array.
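One common way to apply this advice is to move character data through a single reusable `char[]` buffer instead of building a new `String` for every chunk. The slides do not say where the Strings were being created, so the following is only a sketch of the general technique; the class name `CharBufferCopy` is hypothetical.

```java
import java.io.IOException;
import java.io.Reader;
import java.io.Writer;

/** Illustrative copy loop that reuses one char[] instead of allocating Strings. */
final class CharBufferCopy {

    /** Copies all characters from 'in' to 'out' and returns how many were copied. */
    static long copy(Reader in, Writer out, int bufferSize) throws IOException {
        char[] buffer = new char[bufferSize]; // allocated once, reused for every read
        long total = 0;
        int read;
        while ((read = in.read(buffer, 0, buffer.length)) != -1) {
            out.write(buffer, 0, read); // write only the filled part of the buffer
            total += read;
        }
        return total;
    }
}
```

With this shape the only per-call allocation is the buffer itself, so heap usage does not grow with the amount of data passing through.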

Questions?