Australian CIO Summit 2012: Big Data, New Physics, and Geospatial Super-Food by Tristan Sternson, Managing Director, InfoReady
Big Data, New Physics, and Geospatial Super-Food
© 2012 Infoready Pty Ltd
Tristan Sternson, InfoReady Managing Director
My Background – Tristan Sternson
• Past 12 years focussed purely on IM / BI Solutions
• Started InfoReady in 2008
• Prior Roles – Accenture Data Management & Architecture / IM Lead, PWC Consulting / IBM
• Personally designed, deployed and led many large IM and DW applications in Australia and the UK
• Thought leader in Information Management in Australia and APAC
• Early adopter of Data Governance, Big Data, Industry Data Models, Appliance DW solutions
Who is InfoReady?
• Pure-Play Information Management and Business Intelligence Consulting firm
• Team InfoReady: career IM and BI experts
• One of the fastest growing consulting firms in Australia.
• IM Focused Tier One Consulting capability.
• Focus - people, process and technology
• Assisting companies to turn valuable information into actionable intelligence.
• Strategy, Architecture, Solution Design & Delivery
Big Data Definition
Datasets that grow so large that they become difficult to work with, including capture, storage, search, sharing, analytics, and visualization.
The benefits of working with larger and larger datasets include allowing analysts to “spot business trends, prevent diseases, combat crime.”
We haven’t seen anything yet, as more devices come online, e.g. mobile, airborne, logs, cameras, microphones etc.
Wikipedia - 2012
The Big Data Opportunity
V3
Big Data – Why the hype?
• By 2015, nearly 3B people will be online, pushing the data created and shared to nearly 8 zettabytes.
• 30 billion pieces of content were added to Facebook this past month by 600M plus users.
• More than 2B videos were watched on YouTube … yesterday.
• In the US, mobile phone users between the ages of 18 and 24 send an incredible 110 text messages per day.
• 32B searches were performed last month … on Twitter.
• Worldwide IP traffic will quadruple by 2015.
Business Value
• 1 in 3 business leaders frequently make decisions based on information they don’t trust, or don’t have
• 1 in 2 business leaders say they don’t have access to the information they need to do their jobs
• 83% of CIOs cited “Business intelligence & Analytics” as part of their visionary plans to enhance competitiveness
• 60% of CEOs recognise they need to better understand information more rapidly in order to make swift decisions
Big Data Trends
80% / 20%
What the Industry Analysts say
Gartner predicts Big Data to be one of the top-10 strategic initiatives for 2012
What the Industry Analysts say
Key take-aways from analyst perspectives (Gartner, TDWI):
• Data will grow exponentially
• Fusion of structured and unstructured data
• The connection between big data and advanced analytics will get even stronger
• Future users will not be able to put all useful information into a single data warehouse
Enterprise Intelligence vs. Enterprise Amnesia
[Figure labels: Computing Power Growth; Available Observation Space; Context]
Trend: Organizations Are Getting Dumber
[Figure: Time vs. Computing Power Growth; labels: Available Observation Space, Sensemaking Algorithms, Context, Enterprise Amnesia]
Trend: Organizations Are Getting Dumber. WHY?
[Figure: Time vs. Computing Power Growth; label: Sensemaking Algorithms]
Algorithms at Dead End.
You Can’t Squeeze Knowledge Out of a Pixel.
Context, definition
Better understanding something by taking into account the things around it.
Information in Context … and Accumulating
[Figure labels: Top 200 Customer; Job Applicant; Identity Thief; Criminal Investigation]
The Puzzle Metaphor
• Imagine an ever-growing pile of puzzle pieces of varying sizes, shapes and colors
• What it represents is unknown – there is no picture on hand
• Is it one puzzle, 15 puzzles, or 1,500 different puzzles?
• Some pieces are duplicates, missing, incomplete, low quality, or have been misinterpreted
• Some pieces may even be professionally fabricated lies
• Until you take the pieces to the table and attempt assembly, you don’t know what you are dealing with
Puzzling
[Figure labels: 12 pieces, 100%; 1000 pieces, 100%; 12 pieces, 100%; 100 pieces, 10% (duplicates); 100%; 66 pieces, 66%; 100% (pure noise)]
First Discovery – “we found Dora?”
Sorting Algorithm
Another Puzzle …
10 Mins – Completed Dora Puzzles
Data Finds Data
Obvious Duplicates in Front Of Your Eyes
Incremental Context – Incremental Discovery
10:00am START
< 1min “I can see Dora”
1min “How many puzzles are there?”
8min “Are there 1000 pieces and 3 or 4 puzzles?”
10min 2 x Dora puzzles complete
12min “I have blue sky and an animal”
18min “The other puzzle is more colourful – maybe a red motorbike”
23min “We’ve found Jenny Sanders – can I search Google on my iPhone for the picture?”
35min “How can we have 2 pieces the same?”
Lots of Sorted Pieces
Pieces in Context
Quickly we find meaning (90mins)
66 pieces of 1190 pieces: only 5.5%
Wow 1%
11 pieces of 1190 pieces: only 1%
Koala, Possum or Monkey?
Foundation
More Data Finds Data
Out of Tablespace…
Incremental Context – Incremental Discovery
55min “Second puzzle is definitely a motorbike – I can see a wheel and seat”
65min Motorcycle coming together very quickly
70min “It’s definitely a koala”
75min “The koala has a baby”
83min “The middle piece of the bike is missing – do I really need it, I know what it is”
88min “These are both Australian puzzles”
114min One of the kids starts isolating pieces that are causing her “noise”
130min 7 chunks emerge from 7 piles of SORTED pieces
165min Pieces beginning to come together quite quickly and picture starts to really emerge
How Context Accumulates
• With each new observation … one of three assertions is made: 1) un-associated; 2) placed near like neighbors; or 3) connected
• Must favor the false negative
• New observations sometimes reverse earlier assertions
• Some observations produce novel discovery
• As the working space expands, computational effort increases
• Given sufficient observations, there can come a tipping point
• Thereafter, confidence improves while computational effort decreases!
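The three assertions can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the method used in the talk: the `Entity` class, the `assert_observation` function, the feature names and the `connect_at` threshold are all hypothetical, and "overlap" here simply means the number of identical feature values.

```python
from dataclasses import dataclass, field


@dataclass
class Entity:
    """Accumulated context: every feature value seen for one resolved thing."""
    features: set = field(default_factory=set)
    neighbors: set = field(default_factory=set)  # indexes of "near" entities


def assert_observation(entities, observation, connect_at=2):
    """Place one observation and return (assertion, entity_index).

    connect_at is a hypothetical threshold: how many identical feature
    values are required before two things are treated as the same.
    Keeping it above 1 favors the false negative over a false merge.
    """
    obs = set(observation.items())
    best_idx, best_overlap = None, 0
    for idx, ent in enumerate(entities):
        overlap = len(obs & ent.features)
        if overlap > best_overlap:
            best_idx, best_overlap = idx, overlap

    if best_overlap >= connect_at:          # 3) connected
        entities[best_idx].features |= obs
        return "connected", best_idx

    entities.append(Entity(features=obs))
    new_idx = len(entities) - 1
    if best_overlap > 0:                    # 2) placed near like neighbors
        entities[new_idx].neighbors.add(best_idx)
        entities[best_idx].neighbors.add(new_idx)
        return "near", new_idx
    return "unassociated", new_idx          # 1) un-associated


entities = []
for obs in (
    {"name": "a", "phone": "1"},
    {"name": "a", "email": "x"},
    {"name": "a", "phone": "1", "dl": "9"},
):
    print(assert_observation(entities, obs))
```

The sketch leaves out the reversal of earlier assertions: a fuller version would revisit "near" neighbors whenever an entity gains new features.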
[Figure: Unique Identities vs. Observations; labels: Overstated Population, True Population]
Counting Is Difficult
File 1: Mark Smith, 6/12/1978, 0413123123
File 2: Mark R Smith, (614) 13-123-123, DL: 00001234
The Rise and Fall of a Population
[Figure: Unique Identities vs. Observations; label: True Population]
Data Triangulation
File 1: Mark Smith, 6/12/1978, 0413123123
File 2: Mark R Smith, (614) 13-123-123, DL: 00001234
New Record: Mark Randy Smith, 0413123123, DL: 00001234
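A rough sketch of how a third record can collapse two apparent identities into one. The assignment of fields to File 1, File 2 and the new record is read off the slide text and is an assumption, as are the `count_identities` function and the choice to link records only on an exact phone or driver's licence match (a union-find structure is swapped in here for illustration).

```python
def count_identities(records, keys=("phone", "dl")):
    """Count distinct identities, linking records that share an exact key value."""
    parent = list(range(len(records)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    first_seen = {}  # (key, value) -> index of the first record carrying it
    for idx, rec in enumerate(records):
        for key in keys:
            value = rec.get(key)
            if value is None:
                continue
            owner = first_seen.setdefault((key, value), idx)
            parent[find(idx)] = find(owner)  # merge the two groups
    return len({find(i) for i in range(len(records))})


file_1 = {"name": "Mark Smith", "dob": "6/12/1978", "phone": "0413123123"}
file_2 = {"name": "Mark R Smith", "phone": "(614) 13-123-123", "dl": "00001234"}
new_record = {"name": "Mark Randy Smith", "phone": "0413123123", "dl": "00001234"}

print(count_identities([file_1, file_2]))              # before the new record
print(count_identities([file_1, file_2, new_record]))  # after the new record
```

With exact matching, File 1 and File 2 share no value, so they count as two people until the new record links to one by phone and to the other by licence number.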
Big Data [in context]. New Physics.
• More data: better the predictions
– Lower false positives
– Lower false negatives
• More data: bad data good
– Suddenly glad your data is not perfect
• More data: less compute
Big Data
Pile of ____ In Context
One Form of Context: “Expert Counting”
• Is it 5 people each with 1 account … or is it 1 person with 5 accounts?
• Is it 20 cases of H1N1 in 20 cities … or one case reported 20 times?
• If one cannot count … one cannot estimate vector or velocity (direction and speed).
• Without vector and velocity … prediction is nearly impossible.
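To see why counting comes before velocity, here is a small sketch with made-up data. It assumes each report already carries a resolved `case_id` (which is the hard part the slide is describing); the `daily_case_counts` function and the sample reports are hypothetical.

```python
from collections import defaultdict


def daily_case_counts(reports):
    """Return ({day: raw report count}, {day: newly seen distinct cases}).

    reports is an iterable of (day, case_id) pairs in day order.
    """
    raw = defaultdict(int)
    distinct = defaultdict(int)
    seen = set()
    for day, case_id in reports:
        raw[day] += 1
        if case_id not in seen:  # count each case only the first time
            seen.add(case_id)
            distinct[day] += 1
    return dict(raw), dict(distinct)


# Hypothetical reports: case "A" and case "C" are each reported more than once.
reports = [(1, "A"), (1, "A"), (1, "B"), (2, "A"), (2, "B"), (2, "C"), (2, "C")]
raw, distinct = daily_case_counts(reports)
print(raw)       # reports per day
print(distinct)  # new distinct cases per day
```

In this sample the raw report count rises from day 1 to day 2 while the number of new distinct cases falls, so the two ways of counting disagree even on the direction of the trend.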
Expert Counting: Degrees of Difficulty
• Exactly Same: Bob Jones 123455 / Bob Jones 123455
• Fuzzy: Bob Jones 123455 / Robert T Jonnes 000123455
• Incompatible Features: Bob Jones 123455 / bjones@hotmail
• Deceit: Bob Jones 123455 / Ken Wells 550119
Key Features Enable Expert Counting
People: Name, Address, Date of Birth, Phone, Passport, Nationality, Biometric, Etc.
Cars: Make, Model, Year, License Plate No., VIN, Owner, Etc.
Router: Device ID, Make, Model, Firmware Vers., Asset ID, Etc.
Consider Lying Identical Twins
[Figure: two passports, each reading “PASSPORT #123, Sue, 3/3/84, Uberstan, Exp 2011”; Fingerprint; DNA; Most Trusted Authority: “Same person – trust me.”]
• The same thing cannot be in two places … at the same time.
• Two different things cannot occupy the same space … at the same time.
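These two rules can be written down directly. The sketch below assumes each sighting is a `(when, where)` pair that is exact enough to compare for equality, which real location data rarely is; the `compare_sightings` function and its return labels are hypothetical.

```python
def compare_sightings(a, b):
    """Apply the two space-time rules to a pair of (when, where) sightings.

    Returns "different" if the sightings cannot be the same thing,
    "same" if they must be the same thing, and "undetermined" otherwise.
    """
    when_a, where_a = a
    when_b, where_b = b
    if when_a != when_b:
        return "undetermined"  # the rules only apply at the same instant
    if where_a != where_b:
        return "different"     # one thing cannot be in two places at once
    return "same"              # two things cannot share one space at once


print(compare_sightings(("2012-03-01T10:00", "Melbourne"), ("2012-03-01T10:00", "Brisbane")))
print(compare_sightings(("2012-03-01T10:00", "Melbourne"), ("2012-03-01T10:00", "Melbourne")))
print(compare_sightings(("2012-03-01T10:00", "Melbourne"), ("2012-03-02T10:00", "Brisbane")))
```

A practical version would replace the equality checks with tolerances, for example a minimum travel time between two places.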
Space & Time Enables Absolute Disambiguation
People: Name, Address, Date of Birth, Phone, Passport, Nationality, Biometric, Etc., When, Where
Cars: Make, Model, Year, License Plate No., VIN, Owner, Etc., When, Where
Router: Device ID, Make, Model, Firmware Vers., Asset ID, Etc., When, Where
“Life Arcs” Are Also Telling
Bill Smith, 13/4/67, Melbourne, Victoria
Address History:
Melbourne, Vic 2008-2008
St Kilda, Vic 2005-2008
Hampton, Vic 1996-2005
Brighton, Vic 1984-1996
Bill Smith, 13/4/67, Brisbane, Queensland
Address History:
Carina, QLD 2005-2009
Brisbane, QLD 2005-2005
Bondi, NSW 1990-2005
Carina, QLD 1982-1990
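One way to compare two such life arcs is to look for stays that overlap in time but sit in different states. The sketch below uses the address histories from the slide; the tuple layout, the `conflicting_stays` function, and the rule that a shared boundary year (a move) is not a conflict are all assumptions made for illustration.

```python
def conflicting_stays(history_a, history_b):
    """Return pairs of stays that overlap in time but are in different states.

    Each stay is (suburb, state, start_year, end_year). Overlap is strict,
    so two stays that only touch at a boundary year are not a conflict.
    """
    conflicts = []
    for stay_a in history_a:
        for stay_b in history_b:
            _, state_a, start_a, end_a = stay_a
            _, state_b, start_b, end_b = stay_b
            overlaps = start_a < end_b and start_b < end_a
            if overlaps and state_a != state_b:
                conflicts.append((stay_a, stay_b))
    return conflicts


melbourne_bill = [
    ("Melbourne", "Vic", 2008, 2008),
    ("St Kilda", "Vic", 2005, 2008),
    ("Hampton", "Vic", 1996, 2005),
    ("Brighton", "Vic", 1984, 1996),
]
brisbane_bill = [
    ("Carina", "QLD", 2005, 2009),
    ("Brisbane", "QLD", 2005, 2005),
    ("Bondi", "NSW", 1990, 2005),
    ("Carina", "QLD", 1982, 1990),
]

for stay_a, stay_b in conflicting_stays(melbourne_bill, brisbane_bill):
    print(stay_a, "vs", stay_b)
```

Any conflict suggests that the two records, despite the same name and date of birth, describe different people.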
OMG
Space-Time-Travel
• Cell phones are generating a staggering amount of geo-locational data – 600B transactions per day being created in the US alone
• This data is being “de-identified” and shared with third parties – in volume and in real-time
• Your movement quickly reveals where you spend your time (e.g., evenings vs. working hours)
• Re-identification (figuring out who is who) is somewhat trivial
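A toy sketch of how little it takes to read "evenings vs. working hours" out of de-identified location pings. Everything here is hypothetical: the `usual_places` function, the `(hour_of_day, cell_id)` record shape, the hour boundaries and the sample data.

```python
from collections import Counter


def most_common_cell(counter):
    """Return the most frequent cell id, or None if nothing was counted."""
    return counter.most_common(1)[0][0] if counter else None


def usual_places(pings, work_hours=range(9, 17)):
    """Guess where one device spends evenings and working hours.

    pings is an iterable of (hour_of_day, cell_id) with hour_of_day in 0..23.
    Hours from 20:00 to 05:59 are treated as "evenings" (an assumption).
    """
    evening, working = Counter(), Counter()
    for hour, cell in pings:
        if hour in work_hours:
            working[cell] += 1
        elif hour >= 20 or hour < 6:
            evening[cell] += 1
    return {
        "evenings": most_common_cell(evening),
        "working_hours": most_common_cell(working),
    }


pings = [
    (22, "cell-A"), (23, "cell-A"), (2, "cell-A"), (21, "cell-C"),
    (10, "cell-B"), (11, "cell-B"), (14, "cell-B"), (15, "cell-D"),
    (18, "cell-E"),
]
print(usual_places(pings))
```

Once a device has a stable evening place and a stable working-hours place, matching that pair against other records is what makes re-identification comparatively easy.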
Powerful Predictions
• Prediction with 87% certainty where you will be next Thursday at 5:35pm
• Names of the top 10 people you co-locate with, not at home and not at work
• Intelligence service preempts the next mass protest in real-time
• Robbery of a convenience store is about to happen at 10:42pm
Consequences
• Space-time-travel data is the ultimate biometric
• It will enable enormous opportunity
• It will unravel one’s secrets
• It will challenge existing notions of privacy
• And, it’s here now and more to come
Macro Trends
The Greater the Context, the Greater the Value
[Figure: Value of Data vs. Records Managed, from (Big) to (Ludicrous Big); curves labeled Data in Context and Pile of Data]
Time Is Of The Essence
The better the predictions … the faster they will be wanted.
“Why did we have to wait until the end of the day for the smart answer?”
[Figure: Willingness to Wait (labels: Day, Hour, Batch, 200ms Real-Time) vs. Relevance, from (Iffy) to (Totally)]
Enterprise Intelligence: One Plausible Journey
[Diagram, built up over several slides. Labels: Observation Space; What you know; New Observations; Sense and Respond; Explore and Reflect; Report and Manage; Data Finds Data; Relevance Finds the Sensor (<200ms); Relevance Finds You; Decide?; Directed Attention; Deep Reflection; Curated Data; Pattern Discovery; New Interests]
Closing Thoughts
The most competitive organizations are going to make sense of what they are observing fast enough to do something about it while they are observing it.
Wish This On The Enemy
[Figure: Time vs. Computing Power Growth; labels: Available Observation Space, Sensemaking Algorithms, Context, Enterprise Amnesia]
The Way Forward: Enterprise Intelligence
[Figure: Time vs. Computing Power Growth; labels: Available Observation Space, Sensemaking Algorithms, Context]
Questions?
Email: [email protected]
Twitter: http://www.twitter.com/tsternson
Blog: www.infoready.com.au
LinkedIn: http://www.linkedin.com/in/tristansternson