1. if ramayana can be reduced to one shlok….then can’t i complete covering “spatial big data...
TRANSCRIPT
1
If the Ramayana can be reduced to one shloka, then can’t I finish covering
“SPATIAL BIG DATA & SECURITY” in 15 minutes?
The story of DATA comes down to two words:
one is ZERO, the other is ONE.
3
Big Data
Security
WELCOME
SPATIAL BIG DATA has been with us for ages in various forms…but pretty invisible!!
5
6
Ancient Egypt
River Nile
Engineers used data analysis to try to predict crop yields
LOGS & LEVEL
6,695 km long
Basic Intro
Concepts
Perceptions
Challenges
…the 15 min route to the THANK YOU slide
An English professor wrote the words:
“A woman without her man is nothing”
on the chalkboard and asked his students to punctuate it correctly.
“A woman, without her man, is nothing.”
“A woman: without her, man is nothing.”
DEFINING BIG SPATIAL DATA
How do we understand it?
Spatial data sets that exceed the capacity of current computing systems
to manage, process or analyze the data with reasonable effort,
due to Volume, Velocity, Variety and Veracity
Volume, Velocity, Variety, Veracity
BIG SPATIAL DATA: finding actionable information in massive volumes of both structured and unstructured geodata that are so large and complex that they are difficult to process with traditional database and software techniques.
Volume: data at rest
Velocity: data in motion
Variety: data in many forms
Veracity: data in doubt
12
90% of data in the world was created in the last 2 years
3 EB of data is created every day
U.S. drone aircraft sent back 35 years’ worth of video footage in 2012
Gigabyte (GB) - 1,024 MB
Terabyte (TB) - 1,024 GB
Petabyte (PB) - 1,024 TB
Exabyte (EB) - 1,024 PB
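To make the unit ladder concrete, here is a small sketch that converts the daily figure quoted on the slide into smaller units. The `convert` helper and the `UNITS` list are illustrative names, not part of the original deck, and the 1,024 factor is taken from the definitions above.

```python
# Unit ladder from the slide: each step up is a factor of 1,024.
UNITS = ["MB", "GB", "TB", "PB", "EB"]

def convert(value, from_unit, to_unit):
    """Convert between the units above, using 1,024 per step."""
    steps = UNITS.index(from_unit) - UNITS.index(to_unit)
    return value * (1024 ** steps)

daily_eb = 3  # figure quoted on the slide, not verified here
daily_pb = convert(daily_eb, "EB", "PB")  # 3 * 1,024
daily_tb = convert(daily_eb, "EB", "TB")  # 3 * 1,024 * 1,024
print(f"{daily_eb} EB/day = {daily_pb:,} PB/day = {daily_tb:,} TB/day")
```

Under these definitions, 3 EB per day is a little over three million terabytes per day.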
* Estimated revenue FY 2013
Growth of geospatial data is outpacing both software and services, and is set to become a major contributor to the overall growth of the industry
14
The bad things in life open your eyes to the good things you weren’t paying attention to before
SECURITY
100% security is a myth
No one has said this!!!
But it remains a fact
15
Increasing attack surface
The technology is ready….
But are we ready ?
16
DISASTER RELIEF
FINANCIAL
FRAUD DETECTION
CALL CENTER REQUESTS
DISEASE SURVEILLANCE
INSURANCE
RETAIL
TELECOMMUNICATIONS
UTILITIES
ECO-ROUTING
The other side of the story
18
Security challenges before we adopt spatial Big data
19
Distributed programming frameworks
ONE
[Diagram: MapReduce data flow. Stage labels: Input file, Map, Intermediate, Combining (Local Reduce), Shuffle, Reduce, Output file]
Mapper performs a computation and outputs key/value pairs
Reducer combines the values belonging to each distinct key and outputs the result
Utilise parallelism in computation & storage to process massive amounts of data
MAP: splits the input data set into independent chunks, which are processed in a completely parallel manner
REDUCE: aggregates results from the map phase; performs a summary operation
FRAMEWORK: schedules and re-runs tasks; splits the input; moves map outputs to reduce inputs; receives the results
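The map, shuffle and reduce roles described above can be sketched in a few lines of plain Python. This is only a single-process illustration, assuming a toy spatial job that counts points per grid cell. The function names, the `cell_size` parameter and the sample points are hypothetical, and a real framework would run the chunks on separate machines.

```python
from collections import defaultdict
from itertools import chain

def mapper(chunk, cell_size=1.0):
    """Map: emit a (grid_cell, 1) pair for every (x, y) point in a chunk."""
    for x, y in chunk:
        cell = (int(x // cell_size), int(y // cell_size))
        yield cell, 1

def shuffle(pairs):
    """Framework: move map outputs to reduce inputs, grouped by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reducer(key, values):
    """Reduce: combine the values belonging to one distinct key."""
    return key, sum(values)

def map_reduce(points, n_chunks=2):
    # Framework: split the input into independent chunks.
    chunks = [points[i::n_chunks] for i in range(n_chunks)]
    # Chunks are mapped one after another here; a cluster would do this in parallel.
    mapped = chain.from_iterable(mapper(c) for c in chunks)
    grouped = shuffle(mapped)
    return dict(reducer(k, v) for k, v in grouped.items())

# Hypothetical sample points (x, y)
points = [(0.2, 0.7), (0.9, 0.1), (1.5, 0.3), (1.1, 1.8), (0.4, 0.4)]
counts = map_reduce(points)
print(counts)
```

Because each chunk is mapped independently and the reducer only sees values for one key, the result does not depend on how the input is split.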
So the challenge is not storage but I/O speed
Time to read 1 TB:
One machine (4 I/O channels, each channel 100 MB/s): 45 min
10 machines (4 I/O channels each, each channel 100 MB/s): 4.5 min
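The read times quoted above follow from simple throughput arithmetic. The sketch below reproduces them, assuming all channels run at full speed in parallel and 1 TB = 1,024 × 1,024 MB. The function name is illustrative.

```python
def read_time_minutes(data_mb, machines, channels_per_machine=4, mb_per_s=100):
    """Minutes needed to read data_mb when every channel streams in parallel."""
    throughput_mb_per_s = machines * channels_per_machine * mb_per_s
    return data_mb / throughput_mb_per_s / 60

ONE_TB_IN_MB = 1024 * 1024  # assumption: binary units

one_machine = read_time_minutes(ONE_TB_IN_MB, machines=1)
ten_machines = read_time_minutes(ONE_TB_IN_MB, machines=10)
print(f"1 machine: {one_machine:.1f} min, 10 machines: {ten_machines:.1f} min")
```

The exact figures come out slightly under the slide's rounded 45 and 4.5 minutes, but the point is the same: ten machines read the same terabyte ten times faster.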
Untrusted Mappers
Securing the data in the presence of an untrusted mapper
NoSQL ISSUES
TWO
25
26
First off: the name
NoSQL is not “NEVER SQL”
NoSQL is not “No To SQL”
27
NoSQL is simply “Not Only SQL”!
MongoDB
Redis
NoSQL databases are still evolving with respect to security infrastructure
Data storage & transaction logs
29
STORAGE TIERS
- Multi-tiered storage media
- Necessitated by scalability, availability & exponential growth
- Different categories of data
- Different types of storage
A lower tier means reduced security and looser access controls
Keeping track of data location
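One way to picture "keeping track of data location" is an append-only log of every move between tiers. The sketch below is only an illustration under assumed names: the tier labels, the `TierCatalog` class and the dataset name are hypothetical and do not come from the deck. It flags sensitive data that moves to a lower tier.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

TIERS = ["hot", "warm", "cold"]  # hypothetical tier names, highest tier first

@dataclass
class TierCatalog:
    location: dict = field(default_factory=dict)  # dataset -> current tier
    log: list = field(default_factory=list)       # append-only move history

    def move(self, dataset, tier, sensitive=False):
        """Record a move and flag sensitive data that drops to a lower tier."""
        if tier not in TIERS:
            raise ValueError(f"unknown tier: {tier}")
        previous = self.location.get(dataset)
        demoted = previous is not None and TIERS.index(tier) > TIERS.index(previous)
        self.log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "dataset": dataset,
            "from": previous,
            "to": tier,
            "flag": sensitive and demoted,
        })
        self.location[dataset] = tier

catalog = TierCatalog()
catalog.move("parcel_boundaries", "hot", sensitive=True)
catalog.move("parcel_boundaries", "cold", sensitive=True)
flagged = [entry for entry in catalog.log if entry["flag"]]
print(catalog.location, len(flagged))
```

The log answers both questions raised on the slide: where a dataset is now, and whether it ended up somewhere with weaker controls.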
32
Input validation/filtering
How can we trust data?
Validating data when the source of input data is not reliable
Filtering malicious data @ BYOD
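As a minimal illustration of input filtering, the sketch below rejects spatial records whose coordinates are missing, non-numeric or outside the valid latitude/longitude range. The record layout (`lat`, `lon` keys) and the sample data are assumptions made for the example. Real validation would also need to consider the trustworthiness of the device that sent the record.

```python
import math

def is_valid_point(record):
    """Return True if the record carries a plausible latitude and longitude."""
    try:
        lat = float(record["lat"])
        lon = float(record["lon"])
    except (KeyError, TypeError, ValueError):
        return False  # missing or non-numeric coordinate
    if not (math.isfinite(lat) and math.isfinite(lon)):
        return False  # NaN or infinity
    return -90.0 <= lat <= 90.0 and -180.0 <= lon <= 180.0

# Hypothetical records arriving from untrusted devices
incoming = [
    {"lat": 28.61, "lon": 77.21},
    {"lat": 123.0, "lon": 77.21},                # latitude out of range
    {"lat": "nan", "lon": 10.0},                 # not a finite number
    {"lon": 10.0},                               # latitude missing
    {"lat": "28.61; DROP TABLE", "lon": 77.21},  # injected text
]
accepted = [r for r in incoming if is_valid_point(r)]
print(len(accepted), "of", len(incoming), "records accepted")
```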
REAL TIME MONITORING
34
Humongous number of alerts!!!!
False positives
Secure communication
36
End-to-end security?
Data encryption: attribute-based encryption needs to be made richer
Granular audits
38
New attacks will keep happening, and to detect them we need detailed audit logs
Missed true positives
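A granular audit log can be as simple as one structured record per data access, which can be searched later when an attack is suspected. The sketch below is an assumption-laden illustration: the field names, user names and resource paths are all hypothetical.

```python
import json
from datetime import datetime, timezone

def audit_record(user, action, resource, allowed):
    """Build one JSON audit line: who did what to which resource, and the outcome."""
    return json.dumps({
        "time": datetime.now(timezone.utc).isoformat(),
        "user": user,
        "action": action,
        "resource": resource,
        "allowed": allowed,
    }, sort_keys=True)

# Hypothetical access events
log_lines = [
    audit_record("analyst1", "read", "tiles/z12/x2345", True),
    audit_record("mapper-node-7", "read", "tiles/z12/x2345", False),
    audit_record("analyst1", "export", "tiles/z12/x2345", True),
]

def denied_events(lines):
    """Return the parsed records for every access that was refused."""
    return [event for event in map(json.loads, lines) if not event["allowed"]]

suspicious = denied_events(log_lines)
print(suspicious[0]["user"])
```

Keeping every event, not only the ones that raised an alert, is what makes it possible to go back and find true positives that monitoring missed.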
PRIVACY ISSUES
40
E.g.: how a retailer was able to identify that a teenager was pregnant before her father knew
In the world of big data, privacy invasion is a business model
And...
we also have the cloud with us?
42
At 1.4% in 2011-12, cloud was a very small percentage of total IT spend
43
The pace of big spatial data adoption has been sluggish
44
45
There is unlikely to be a day in the near future when we have a
“FIND TERRORIST” button
46
We have mostly been reactive to date…
47
USE KERBEROS FOR NODE AUTHENTICATION – (BUT WE KNOW IT’S A PAIN TO SET UP)
STRINGENT POLICIES
STANDARD TO INTRA COUNTRY LAWS
EXHAUSTIVE LOGS
SECURE COMMUNICATION
48
DISCLAIMER
This presentation reflects my personal views and opinions, in my individual capacity only. It does not represent the views and opinions of my organization or anyone else, and is not sponsored or endorsed by them in any way. This is an individual presentation.