picture credit in noteson-demand.gputechconf.com/gtc/2017/presentation/s7611-ian-lumb... ·...


Page 1:

www.univa.com Picture credit in notes

Page 2:

Earthquake Damage

▪ Destructive tsunamis occur frequently—about one a year.

▪ There have been 94 destructive tsunamis in the last hundred years.

▪ There have been 51,000 victims (not including Dec. 26, 2004).

▪ Future tsunami disasters are inevitable.

▪ Growing human population in low-lying coastal areas.

▪ Education about tsunamis can save many lives.

Earth: Portrait of a Planet, 5th edition, by Stephen Marshak © 2015 W. W. Norton & Co.

Page 3:

Mitigating Disasters with GPU-Based Deep Learning from Twitter?

Ian Lumb

Solutions Architect

GTC 2017 – San Jose

May 9, 2017

Page 4:

Outline

Tsunamis

Earthquake-Tsunami Causality

Deep Learning from Twitter?

Deep Meaning from Twitter???

Other Disasters

Discussion

Page 5:

Shocking Differences

Geist, E.L., Titov, V.V., and Synolakis, C.E., 2006, Tsunami: wave of change: Scientific American, v. 294, p. 56-63

Page 6:

Tsunami Advisories

Page 7:

Motivation

▪ Non-deterministic cause

▪ Uncertainty inherent in any attempt to predict earthquakes

o In situ measurements may reduce uncertainty

▪ Lead times

▪ Availability of actionable observations

▪ Communication of situation - advisories, warnings, etc.

▪ Cause-effect relationship

▪ Energy transfer - inputs ... coupling ... outputs

o ‘Geometry’ - bathymetry and topography

▪ Other factors - e.g., tides

▪ Established effect

▪ Far-field estimates of tsunami propagation (pre-computed) and coastal inundation (real-time) have proven to be extremely accurate ...

o Requires a distributed array of deep-ocean tsunami detection buoys + forecasting model

http://credit.pvamu.edu/MCBDA2016/Slides/Day2_Lumb_MCBDA1_Twitter_Tsunami.pdf

Page 8:

Traditional Data Sources

http://www.gitews.org/en/concept/

Page 9:

Deep Learning from Twitter?

http://credit.pvamu.edu/MCBDA2016/Slides/Day2_Lumb_MCBDA1_Twitter_Tsunami.pdf

Page 10:

Big Data’s 6Vs

http://credit.pvamu.edu/MCBDA2016/Slides/Day2_Lumb_MCBDA1_Twitter_Tsunami.pdf

Page 11:

Perl Script Prototype

Acquires tweets with the keyword “earthquake”

use strict;
use warnings;
use Net::Twitter::Lite::WithAPIv1_1;

# Authenticate against the Twitter REST API v1.1 (credentials elided).
my $nt = Net::Twitter::Lite::WithAPIv1_1->new(
    consumer_key        => 'xxxx...xxxxxxx',
    consumer_secret     => 'xxxxxx.....xxxxxxxxxx',
    access_token        => 'xxxxx....xxxxxxxxxxx',
    access_token_secret => 'xxxxx.....xxxxxxxxxxx',
    ssl                 => 1,
);

# Search for the keyword and print the text of each returned status.
my $result = $nt->search("earthquake");
for my $status (@{ $result->{statuses} }) {
    print "$status->{text}\n";
}

http://credit.pvamu.edu/MCBDA2016/Slides/Day2_Lumb_MCBDA1_Twitter_Tsunami.pdf

Page 12:

Deep Learning Workflow

After Karau et al., Learning Spark, O’Reilly, 2015

Page 13:

Deep Learning from Twitter?

Represent data

▪ Twitter data manually curated into ‘ham’ and ‘spam’

▪ In-memory representation via Spark RDDs

Extract features

▪ Frequency-based usage via Spark MLlib HashingTF ⇒ feature vectors

Develop model object

▪ Spark MLlib LogisticRegressionWithSGD used for classification

Evaluate model

http://credit.pvamu.edu/MCBDA2016/Slides/Day2_Lumb_MCBDA1_Twitter_Tsunami.pdf
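The four workflow steps above can be sketched without a Spark cluster. The block below is a plain-Python stand-in, not the Spark prototype itself: `hashing_tf` imitates the idea behind MLlib's HashingTF (hash each token into a fixed-length count vector), and `train_logistic_sgd` imitates logistic regression fitted by stochastic gradient descent. The example tweets, the 256-bucket vector size, the learning rate and all function names are assumptions made for illustration.

```python
import math
import random
import zlib

NUM_FEATURES = 256  # assumed vector length, chosen only for this sketch


def hashing_tf(text, num_features=NUM_FEATURES):
    """Map a text to a fixed-length term-frequency vector via the hashing trick."""
    vec = [0.0] * num_features
    for token in text.lower().split():
        # crc32 is stable across runs, unlike Python's built-in hash() for strings
        vec[zlib.crc32(token.encode("utf-8")) % num_features] += 1.0
    return vec


def predict_proba(weights, bias, features):
    """Probability that the features belong to class 1 ('ham')."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic_sgd(samples, epochs=500, lr=0.1, seed=0):
    """Fit logistic regression with one gradient step per training example."""
    rng = random.Random(seed)
    data = list(samples)
    weights, bias = [0.0] * len(data[0][0]), 0.0
    for _ in range(epochs):
        rng.shuffle(data)
        for features, label in data:
            error = predict_proba(weights, bias, features) - label
            bias -= lr * error
            for i, x in enumerate(features):
                if x:
                    weights[i] -= lr * error * x
    return weights, bias


# Hypothetical hand-written examples standing in for manually curated tweets:
# label 1 = 'ham' (about a real earthquake), label 0 = 'spam'.
TRAIN = [
    ("strong earthquake felt near the coast tsunami warning issued", 1),
    ("magnitude 7 earthquake reported offshore buildings shaking", 1),
    ("earthquake movie tickets on sale now click here", 0),
    ("win a free phone click this earthquake deal", 0),
]

training_set = [(hashing_tf(text), label) for text, label in TRAIN]
weights, bias = train_logistic_sgd(training_set)

# Scoring the training set is only a sanity check, not a real model evaluation.
accuracy = sum(
    (predict_proba(weights, bias, f) >= 0.5) == bool(y) for f, y in training_set
) / len(training_set)
print(f"training accuracy: {accuracy:.2f}")
```

A real evaluation would score held-out tweets; with only four examples the sketch shows the mechanics and nothing more.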

Page 14:

Spark Prototype

Page 15:

Next Steps: Scaling …

OUT

IN

DOWN

UP

Page 16:

PyTorch

▪ Python package that provides

▪ Tensor computation – strong GPU acceleration, efficient memory usage

o Integrated with NVIDIA cuDNN and NCCL libraries

▪ Deep Neural Networks built on a tape-based autograd system

▪ Can leverage NumPy, SciPy and Cython as needed

▪ Available tutorials include Natural Language Processing (NLP)

▪ Revisited text classification via Bag-of-Words

http://pytorch.org/about/

Page 17:

PyTorch BoW Classifier

http://pytorch.org/tutorials/beginner/deep_learning_nlp_tutorial.html
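The slide's code is not preserved in this transcript, so the following is only a rough sketch of what a Bag-of-Words classifier does, written in plain Python rather than PyTorch so that it runs without a GPU stack. Each distinct word gets an index, a text becomes a vector of word counts, and an affine map followed by log-softmax turns that vector into one log-probability per label. The labelled examples, the learning rate and every name below are hypothetical; the cited tutorial uses its own data and PyTorch modules.

```python
import math

# Hypothetical labelled examples, invented for this sketch.
DATA = [
    ("earthquake felt offshore tsunami warning issued".split(), "ham"),
    ("aftershock reported near the coast".split(), "ham"),
    ("earthquake movie premiere tonight".split(), "spam"),
    ("click here for free movie tickets".split(), "spam"),
]
LABELS = ["ham", "spam"]

# Give each distinct word an index; a BoW vector holds one count per word.
word_to_ix = {}
for words, _ in DATA:
    for word in words:
        word_to_ix.setdefault(word, len(word_to_ix))


def make_bow_vector(words):
    vec = [0.0] * len(word_to_ix)
    for word in words:
        if word in word_to_ix:  # words outside the vocabulary are ignored
            vec[word_to_ix[word]] += 1.0
    return vec


def log_softmax(scores):
    top = max(scores)  # subtract the maximum for numerical stability
    log_norm = top + math.log(sum(math.exp(s - top) for s in scores))
    return [s - log_norm for s in scores]


def forward(weights, biases, bow):
    """Affine map followed by log-softmax: one log-probability per label."""
    scores = [b + sum(w * x for w, x in zip(row, bow))
              for row, b in zip(weights, biases)]
    return log_softmax(scores)


# Per-example gradient steps on the negative log-likelihood.
weights = [[0.0] * len(word_to_ix) for _ in LABELS]
biases = [0.0] * len(LABELS)
for _ in range(100):
    for words, label in DATA:
        bow = make_bow_vector(words)
        probs = [math.exp(lp) for lp in forward(weights, biases, bow)]
        for k, name in enumerate(LABELS):
            grad = probs[k] - (1.0 if name == label else 0.0)
            biases[k] -= 0.1 * grad
            for i, x in enumerate(bow):
                weights[k][i] -= 0.1 * grad * x


def classify(words):
    log_probs = forward(weights, biases, make_bow_vector(words))
    return LABELS[log_probs.index(max(log_probs))]


print(classify("tsunami warning issued".split()))
print(classify("free movie tickets".split()))
```

Unlike the hashing approach, the vocabulary here is explicit, so each weight can be traced back to a specific word.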

Page 18:

Towards Deep Meaning …

▪ A feature vector is a feature vector - it is devoid of semantics

▪ The W3C’s Web Ontology Language (OWL) accounts for domain specifics - disambiguates use of overloaded terms (e.g., “earthquake”) in different contexts (e.g., geophysics vs. movies vs. …)

Page 19:

PyTorch

▪ Python package that provides

▪ Tensor computation – strong GPU acceleration, efficient memory usage

o Integrated with NVIDIA cuDNN and NCCL libraries

▪ Deep Neural Networks built on a tape-based autograd system

▪ Can leverage NumPy, SciPy and Cython as needed

▪ Available tutorials include Natural Language Processing (NLP)

▪ Revisited text classification via Bag-of-Words

▪ Investigating word embeddings to expose semantic similarity

http://pytorch.org/about/

Page 20:

Word Embeddings for Semantic Similarity

▪ “… words appearing in similar contexts are related to each other semantically.” (Guthrie, PyTorch NLP tutorial)

▪ Could word embeddings disambiguate use of terms (e.g., “earthquake”) in different contexts (e.g., geophysics vs. movies vs. …)???

After Goodfellow et al., 2016
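The quoted idea, that words appearing in similar contexts are semantically related, can be illustrated with simple co-occurrence counts. The sketch below does not learn embeddings as a neural model would; it is a count-based stand-in that represents each word by the words seen near it and compares words by cosine similarity. The four-sentence corpus, the window size and the function names are assumptions for illustration only.

```python
import math
from collections import Counter, defaultdict

# Hypothetical mini-corpus: two geophysics sentences and two film sentences.
CORPUS = [
    "the earthquake triggered a tsunami along the coast",
    "the tremor triggered a tsunami along the coast",
    "the film opened in cinemas this weekend",
    "the movie opened in cinemas this weekend",
]


def context_vectors(sentences, window=2):
    """Count, for each word, the words seen within `window` positions of it."""
    vectors = defaultdict(Counter)
    for sentence in sentences:
        words = sentence.split()
        for i, word in enumerate(words):
            lo, hi = max(0, i - window), min(len(words), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][words[j]] += 1
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


vectors = context_vectors(CORPUS)
print(cosine(vectors["earthquake"], vectors["tremor"]))
print(cosine(vectors["earthquake"], vectors["movie"]))
```

In this toy corpus "earthquake" and "tremor" share all of their context words while "earthquake" and "movie" share only "the", so the first similarity comes out higher. Whether the same effect separates geophysical from cinematic uses of "earthquake" in real tweets is the open question the slide raises.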

Page 21:

Towards Deep Meaning (Revisited) …

▪ A feature vector is a feature vector - it is devoid of semantics

▪ Ignores inherent, overall credibility of a Tweet - e.g., as quantified by TweetCred

▪ Twitter metadata (handles, hashtags and URLs) contributes on an equal footing with Twitter data (unstructured text that comprises the body of a Tweet) in constructing feature vectors - i.e., the semantic value of Twitter metadata is also ignored by Deep Learning

▪ The W3C’s Resource Description Framework (RDF) facilitates the representation of metadata and thus exposes semantics

▪ The W3C’s Web Ontology Language (OWL) accounts for domain specifics - disambiguates use of overloaded terms (e.g., “earthquake”) in different contexts (e.g., geophysics vs. movies vs. …)

▪ Deep Learning in combination with RDF/OWL semantics has the potential to produce learned models with knowledge represented
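To make the RDF bullet concrete, here is a minimal sketch of how a tweet's metadata could be written out as RDF triples in N-Triples syntax, so that handles, hashtags and URLs become explicit statements instead of anonymous tokens in a feature vector. The `http://example.org/...` namespace, the property names and the sample tweet are all hypothetical; the talk does not specify a vocabulary, and a real system would more likely reuse an established ontology.

```python
import re

# Hypothetical namespace and property names, used only for illustration.
EX = "http://example.org/tweet-vocab#"


def literal(text):
    """Escape a Python string as an N-Triples string literal."""
    escaped = (text.replace("\\", "\\\\").replace('"', '\\"')
                   .replace("\n", "\\n").replace("\r", "\\r"))
    return f'"{escaped}"'


def tweet_to_ntriples(tweet_id, author, text):
    """Emit one triple per piece of metadata found in the tweet."""
    subject = f"<http://example.org/tweet/{tweet_id}>"
    triples = [
        (subject, f"<{EX}author>", literal(author)),
        (subject, f"<{EX}text>", literal(text)),
    ]
    for handle in re.findall(r"@(\w+)", text):
        triples.append((subject, f"<{EX}mentions>", literal(handle)))
    for tag in re.findall(r"#(\w+)", text):
        triples.append((subject, f"<{EX}hashtag>", literal(tag)))
    for url in re.findall(r"https?://\S+", text):
        triples.append((subject, f"<{EX}link>", f"<{url}>"))
    return [f"{s} {p} {o} ." for s, p, o in triples]


lines = tweet_to_ntriples(
    "1",
    "example_user",
    "Strong #earthquake felt offshore, see http://example.org/report via @example_agency",
)
print("\n".join(lines))
```

The regular expressions are deliberately naive (a `#` inside a URL fragment would be picked up as a hashtag, for instance), which is acceptable for a sketch but not for production parsing.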

Page 22:

Discussion

▪ Credible tweets could be transformative - Big Data source that can complement traditional sources (e.g., scientific instruments)

▪ Working with 6V Twitter data can be challenging, though it also presents interesting opportunities

▪ Curation of training data is extremely important, but also extremely time-consuming (as this is a manual process)

▪ Current research emphasizes Deep Learning, BUT RDF/OWL semantics will need to play a role ultimately

▪ Approach can be generalized for application to natural and anthropogenic disasters of all kinds

Page 23:

Acknowledgements

Collaborator: James Freemantle

Page 24:

THANK YOU

Ian Lumb

Solutions Architect

+1 647 478-5901 x 110

Page 25:

Accounting for Oil Spills and more …

▪ Energy exploration via reflection seismology provides the fundamental source of data that is subsequently processed and interpreted for the identification of potential petroleum reservoirs

▪ Reservoir simulation is used to engineer the extraction of petroleum reserves from reservoirs

▪ Drilling is used to ‘truth’ the results provided by interpretations and simulations prior to production extraction

▪ SOPs ensure extraction of oil from a production reservoir is routinely monitored and reported upon - e.g., to quantify rig safety and output (barrels/day)

▪ From exploration to extraction, this is a data-rich workflow

▪ Additional data sources become relevant when disasters occur (e.g., oil spills) - from re-purposed scientific instruments (e.g., weather satellites) to social media (e.g., Twitter, Instagram, Snapchat, ...)

▪ Data-rich workflows can generate problems in Big Data Analytics