solving geophysics problems with python - speaker notes


Upload: paige-bailey

Post on 21-Jan-2017


Page 1: Solving Geophysics Problems with Python - Speaker Notes

Slide 1 SOLVING

GEOPHYSICS

PROBLEMS

WITH PYTHON

PAIGE BAILEY

SEPTEMBER 29, 2015

STRATA + HADOOP WORLD 2015

Slide 2 YOUR MISSION, SHOULD YOU CHOOSE TO ACCEPT IT

So you’ll have a good idea on whether you want to stick around or not… ;)
- General overview of what Geophysics is
- Listing of some of my favorite python and geophysics libraries
- People are doing great work, and deserve to be recognized
- The final topic is going to be what I know best (I guess) – the progression of data throughout the life cycle of the oil industry

Slide 3

WARNING!

…OR DISCLAIMER, RATHER


Slide 4 PAIGE BAILEY

@DynamicWebPaige

Employed by a truly rad technology-focused O&G company by day, MS Earth Sciences graduate student at Rice University by night, founder of PyLadies-HTX (though these sometimes all bleed into one another). Background: degrees are an ABA, a BA in Sociology, and a BS in Geophysics – which is the weirdest combo anyone could ever have.

Slide 5

WHAT IS

“GEOPHYSICS”?

Slide 6

Adrian Lenardic’s first class. Magritte actually had a series of paintings of curiously-shaped rocks suspended in space, or in natural settings. Arches National Park; other curious geologic formations. How did they get there? What processes shaped them? Hydrology and the Talking Heads.


Slide 7

Slide 8

Slide 9


Slide 10

Slide 11

Slide 12

WHAT IS

“GEOPHYSICS”?


Slide 13

WHAT IS

“GEOPHYSICS”?

Slide 14 THEMES

• Gravity

• Heat flow

• Electricity

• Fluid dynamics

• Magnetism

• Radioactivity

• Mineral Physics

• Vibration

…handshakes with atmospheric sciences, geology, engineering, hydrology, planetary sciences, global positioning systems…

Huge concepts, right?

Slide 15 GRAVITY

• Bouguer anomaly
• Geoid
• Geopotential
• Gravity anomaly
• Undulation of the geoid


Slide 16

20,000 feet tall. Cathedral sized. More than a cathedral. For context, the Empire State Building is like 1300 feet.

Slide 17

And they’re all over the dang place. Mention the Lake Peigneur salt mine fiasco.

Slide 18 HEAT FLOW

• Geothermal gradients and internal heating
• Subsurface heat flow – whole earth geophysics
• Heating of hydrocarbons – if the organic material is too deeply buried, it turns into gas or “overcooks” entirely
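A rough sketch of the “overcooking” idea, assuming temperature rises linearly with burial depth. The 15 °C surface temperature, the 25 °C/km gradient, and the temperature windows (about 60 to 120 °C for oil, up to about 200 °C for gas) are illustrative assumptions, not numbers from the talk, and real basins vary a lot.

```python
def temperature_at_depth(depth_km, surface_temp_c=15.0, gradient_c_per_km=25.0):
    """Linear geotherm: temperature rises steadily with burial depth."""
    return surface_temp_c + gradient_c_per_km * depth_km


def maturity(depth_km):
    """Classify a burial depth using assumed, illustrative temperature windows."""
    temp_c = temperature_at_depth(depth_km)
    if temp_c < 60.0:
        return "immature"
    if temp_c < 120.0:
        return "oil window"
    if temp_c < 200.0:
        return "gas window"
    return "overcooked"


for depth_km in (1.0, 3.0, 5.0, 8.0):
    print(depth_km, "km:", temperature_at_depth(depth_km), "C,", maturity(depth_km))
```

Changing the gradient shifts every window up or down, which is one reason heat flow matters so much for exploration.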


Slide 19 FLUID DYNAMICS

• Isostasy
• Post-glacial rebound
• Mantle convection
• Geodynamo

Rate of lithospheric uplift due to postglacial rebound, as modelled by Paulson, A., S. Zhong, and J. Wahr. Inference of mantle viscosity from GRACE and relative sea level data, Geophys. J. Int. (2007) 171, 497–508. doi: 10.1111/j.1365-246X.2007.03556.x

Slide 20

This layered beach at Bathurst Inlet, Nunavut, Canada is an example of post-glacial rebound after the last Ice Age. Little to no tide helped to form its layer-cake look. Isostatic rebound is still underway here.

Slide 21 MAGNETISM

The Earth’s poles sometimes reverse direction – and we don’t know why. North at the bottom, south at the top. What’s interesting is that as the seafloor spreads, cools, and lithifies, certain minerals in the rock orient themselves to align with Earth’s current polarity. This means that as you check magnetism readings along the bottom of the seafloor, you see these wonderful bands


Slide 22

Whole earth perspective: Earth’s magnetic field

Slide 23 MINERAL PHYSICS

Basically materials science – researching how structures change based on differential heating, pressure, compaction. Same chemical makeup, different expressions and structures.

Slide 24 VIBRATION

(A.K.A., SEISMIC)

A great resource for this is USGS’s earthquakes website.


Slide 25 VIBRATION

(A.K.A., SEISMIC)

…WE’LL TALK ABOUT THIS MORE SOON

Slide 26 …AND UNEXPECTED USE CASES

3D-printing Geology with Python

Joe Kington’s presentation on 3D-printing cubes of geology (to get a better feel for the stratigraphy) and seismic

Slide 27 LIBRARIES / SOFTWARE

MENTIONED

Madagascar, PySIT, Segpy, segpy-py, SLIMpy, Fatiando a Terra, ObsPy, PyGMI, SimPEG, Seismic Handler, sgp4, PyGMI

SgFm, laspy, ParaView Geo, 3ptScience

Agile Geoscience:
- Bruges
- Modelr
- Pick This
- G3.js
- Striplog

ArcPy, PyQGIS …so many other geospatial libraries

- Madagascar – multi-dimensional data analysis, including seismic processing
- PySIT – imaging and inversion
- Segpy – reading and writing SEG-Y files
- segpy-py – reading SEG-Y files
- SLIMpy – processing front end
- Fatiando a Terra – geophysical modeling and inversion; extensive cookbook
- ObsPy – seismology toolbox
- PyGMI – 3D interpretation and modelling of magnetic and gravity data
- SimPEG – simulation and parameter estimation in geophysics; great learning utility
- Seismic Handler – signal processing for earthquakes
- sgp4 – tracking earth satellites
- Py-ART – python ARM radar toolkit (weather data)
- SgFm – sediment transport at geologic scale
- Laspy – LAS file conversion
- ParaView Geo – 3D geoscience visualization
- 3ptScience – Rowan Cockett’s website
- Bruges – modelling and post-processing seismic reflection data
- Modelr – seismic forward modeling on the web
- Pick This – social image interpretation
- G3.js – coming soon, a geoscience wrapper for D3.js
- Striplog – wrangling 1D data, usually core with varying sample rates
- ArcPy – geospatial processing tools for ArcGIS
- PyQGIS – the same, for the open-source mapping alternative QGIS

University of British Columbia

SEG-Y is one of the standards developed by SEG for storing geophysical data.
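To give a feel for what a SEG-Y reader has to deal with, here is a minimal sketch that uses only the Python standard library. It builds a fake 3600-byte header block in memory (a 3200-byte textual header, assumed here to be EBCDIC, followed by the 400-byte binary header) and reads back three binary-header fields. The helper names and the synthetic values are made up for illustration; for real files, one of the libraries listed above is a much better choice.

```python
import struct

TEXT_HEADER_BYTES = 3200    # textual header (EBCDIC assumed in this sketch)
BINARY_HEADER_BYTES = 400   # fixed-size binary header that follows it


def build_fake_segy_headers(sample_interval_us, samples_per_trace, format_code):
    """Build only the first 3600 bytes of a SEG-Y-like file for the demo."""
    text = "C 1 SYNTHETIC DEMO HEADER".ljust(TEXT_HEADER_BYTES).encode("cp037")
    binary = bytearray(BINARY_HEADER_BYTES)
    # Offsets are relative to the binary header: file bytes 3217-3218,
    # 3221-3222 and 3225-3226 (1-based) become offsets 16, 20 and 24.
    struct.pack_into(">h", binary, 16, sample_interval_us)
    struct.pack_into(">h", binary, 20, samples_per_trace)
    struct.pack_into(">h", binary, 24, format_code)
    return text + bytes(binary)


def read_segy_headers(raw):
    """Decode the textual header and pull three big-endian 16-bit fields."""
    text = raw[:TEXT_HEADER_BYTES].decode("cp037")
    binary = raw[TEXT_HEADER_BYTES:TEXT_HEADER_BYTES + BINARY_HEADER_BYTES]
    (interval_us,) = struct.unpack_from(">h", binary, 16)
    (n_samples,) = struct.unpack_from(">h", binary, 20)
    (format_code,) = struct.unpack_from(">h", binary, 24)
    return {
        "first_line": text[:80].rstrip(),
        "sample_interval_us": interval_us,
        "samples_per_trace": n_samples,
        "format_code": format_code,   # e.g. 1 = IBM float, 5 = IEEE float
    }


raw = build_fake_segy_headers(4000, 1500, 5)
print(read_segy_headers(raw))
```

Everything is big-endian, and the text may be in an encoding most people have never touched, which is part of why the dedicated readers exist.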

Slide 28

ALMOST ALL

OF THAT IS

OPEN-SOURCE

BUT HERE’S THE KICKER:


Slide 29

ALMOST ALL

OF THAT IS

OPEN-SOURCE

(AND SO IS THE DATA)

BUT HERE’S THE KICKER:

USGS puts out scads of data sets; so does NASA. Mention the importance of Python in geoscience research (and science research in general) because there’s a move toward reusable code and repeatable experiments. “Github for scientists is just… Github.”

Slide 30 GEOPHYSICS-FOCUSED

SCIPY TALKS

2012
- ALGES: Geostatistics and Python
- Py-ART: Python for Remote Sensing Science
- Building a Solver Based on PyClaw for the Solution of the Multi-Layer Shallow Water Equations

2013
- Modeling the Earth with Fatiando a Terra

2014
- The Road to Modelr: Building a Commercial Web Application on an Open-Source Foundation
- Measuring Rainshafts: Bringing Python to Bear on Remote Sensing Data
- The History and Design Behind the Python Geophysical Modeling and Interpretation (PyGMI) Package
- Prototyping a Geophysical Algorithm in Python

2015 (and an entire Geophysics Track)
- Using Python to Span the Gap Between Education, Research, and Industry Applications in Geophysics
- Practical Integration of Processing, Inversion, and Visualization of Magnetotelluric Geophysical Data
- Striplog: Wrangling 1D Subsurface Data
- Geodynamic Simulations in HPC with Python

SEG Hackathon – sponsored by Agile Geoscience; I believe it’s their third. It runs Saturday and Sunday, October 17th and 18th, so you can go to this without going to the SEG Conference as a whole, if you can’t get off work.

Slide 31 LET’S TALK ABOUT ENERGY

…but now for something completely different. And apologies for focusing on the oil and gas aspects of energy.


Slide 32

FIRST WELL LOG?

Slide 33

- 1927 by Conrad Schlumberger, though he’d been formulating the idea since 1919

- He sent down a sonde (sensor attached to a wire) into a 500m deep well in the Alsace region of France and started collecting information

- “Electrical resistivity log” - All measurements were made by hand

Slide 34

FIRST

SEISMOGRAPH?


Slide 35

- 1921 by J. Clarence Karcher, who was an Electrical Engineer

- This is the means by which the majority of the world’s oil reserves have been discovered

- Founded Geophysical Service Incorporated in 1930, which eventually turned into Texas Instruments

- Got the idea because his assignment in World War I, the assignment that took him out of grad school, was to locate heavy artillery batteries in France by studying the acoustic waves the guns generated in the air.

- He noticed an unexpected event in his research and switched his concentration to seismic waves in the earth

- He thought it would be possible to determine the depths of the underlying geologic strata by vibrating the earth’s surface while precisely recording and timing the waves of energy
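The timing idea in that last bullet boils down to a one-line formula, sketched here under heavy simplifications: a flat reflector, a vertical ray path, and a single constant velocity. The 2500 m/s velocity and the function name are illustrative assumptions, not anything from Karcher’s work.

```python
def reflector_depth_m(two_way_time_s, velocity_m_per_s):
    """Depth to a flat reflector for a vertical ray in a constant-velocity layer.

    The wave travels down and back up, so only half the travel time counts.
    """
    return velocity_m_per_s * two_way_time_s / 2.0


for two_way_time_s in (0.5, 1.0, 2.0):
    depth_m = reflector_depth_m(two_way_time_s, 2500.0)
    print(f"{two_way_time_s:.1f} s two-way time -> {depth_m:.0f} m")
```

Real processing has to cope with velocities that change with depth and with dipping layers, which is where most of the computing effort goes.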

Slide 36

FIRST OIL WELL?


Slide 37

- Earliest known oil wells were drilled in China, in 347 AD

- These wells had depths of up to about 790 feet, and were drilled using bits attached to bamboo poles

Asphalt was being used more than 4000 years ago, in the construction of the walls of Babylon. Ancient Persians were using petroleum for medicinal and lighting uses. The first streets of Baghdad were paved with tar. Befuddled “shoot the ground and gusher comes up” situations. Producing dozens of barrels a day, maybe hundreds, but recovery rates were exceptionally low, and you weren’t really finding anything interesting.

Slide 38

Drilling has been around for a long time, but its success is due to improved data acquisition and data analysis methods.

I guess the point that I’m trying to make is that… [read slide] Advances in technology create a marked step change in petroleum exploration. Those advances are primarily in terms of better hardware / equipment, which give explorers better data about the subsurface. The data is the key.


Slide 39

NOW

Slide 40

Now, I’m a geophysicist – so those advances are the ones I’m best at spotting.
- Point out the upticks for 2D seismic, better resolution for 3D seismic
- 80’s: 2D data acquired, pre-stack and post-stack imaging, Cray supercomputers
- 90’s: 3D narrow azimuth data, 3D post-stack and pre-stack imaging, Unix
- 00’s: 3D wide azimuth data, imaging, reverse time migration; Linux clusters
- Now: coil shooting, continuous machine-generated sensory data
- Mathematical insights – mention that last night you found out that the guy who first discovered the FFT was a Chevron employee, ain’t no thing


Slide 41

Point out fracking boom, mention that the crazy upward tick has continued, though the steepness of the slope has decreased a bit due to the drop in oil prices

Slide 42 WORLD’S LARGEST PUBLIC, STATE-OWNED,

AND PRIVATE BUSINESSES

Shamelessly stolen from Wikipedia

Slide 43 WORLD’S LARGEST PUBLIC, STATE-OWNED,

AND PRIVATE BUSINESSES

7 out of 10

7 out of 10 of the largest public, state-owned, and private businesses – and a huge proportion of the overall list. Trillions of dollars of revenue. Direct link to reserves and success of a company. We’re selling a thing; the margins on the beef jerky you buy in a gas station are higher than the margins for a barrel of oil


Slide 44

Profitability for oil companies is directly tied to reserves.

Oil companies are all in the business of getting barrels out of the ground – so characterizing the subsurface is incredibly important. Both of those bits of data that I mentioned before – that came so late in the game – were huge technological step changes for the industry, and drastically impacted oil discovery. Improved resolution within the reservoir is critical because deepwater wells cost a lot – $100 million or more – and fully exploiting assets is essential.

Slide 45 UPSTREAM BIG DATA

(Seshadri M., 2013)

Slide 46

Mapping
Reservoir Characterization
Cross-sections
Petrophysics
Reservoir Simulation
Well Planning & Drilling Simulation
Stratigraphic Modeling
Seismic Interpretation

The oil industry is a bit like an ecosystem. This particular piece is subsurface characterization – the earth science-y and engineering bits.
- Every image you see here has a data type (or more!) associated with it, and, though it’s getting better, a shortage of standards


Slide 47

Mapping
Reservoir Characterization
Cross-sections
Petrophysics
Reservoir Simulation
Well Planning & Drilling Simulation
Stratigraphic Modeling
Seismic Interpretation

So these components of the energy ecosystem, and this subsurface data workflow, can be grouped into “earth science-y bits” and “engineering bits”, with this kind of fuzzy area in between with petrophysics. Earth scientists record millions and billions of data points called “seismic”, and they don’t trust any of them unless you put them all together. Engineers trust pressure readings in the well, the stuff they can measure with sensors – and trust it everywhere, and extrapolate everywhere.

Slide 48

Mapping
Reservoir Characterization
Cross-sections
Petrophysics
Reservoir Simulation
Well Planning & Drilling Simulation
Stratigraphic Modeling
Seismic Interpretation

Something that I should also mention is that this is an iterative process. I put a loop here, but in reality, all of these steps can feed back into one another – and a change to one component of the subsurface model drastically impacts all other components. New sorts of geology: horizontal drilling and hydraulic fracturing combined have been revolutionary.

Slide 49

Data impacts the entire value chain.

All that I mentioned before was earth sciences or drilling related – impacting the “upstream” components of the oil industry. But in reality, data impacts every single component of the oil and gas value chain. And what’s more: it’s a variety of data, coming in at asynchronous rates.


Slide 50

How we get it, how we transport it, how we process it, how we use it – and all of these components have the opportunity to be honed by analytics insights. Streamlining the transport, refinement, and distribution of O&G is vital.

Slide 51

THE

FUTURE

Slide 52

2000 – 2010 :

Decade of “Big Data”

So this past decade, the first one of the 2000s, 2000 – 2010, has been the decade of “big data”. Kind of a buzzword, right? Like “in the cloud”.


Slide 53

2000 – 2010 :

Decade of “Big Data”

2010 – 2020 :

Decade of Sensing

- and if you thought there was a lot of data in this first decade, you realize there's going to be a heck of a lot more in the second.

- Mobility, infrastructure, and collaboration technologies currently are the biggest investment areas

- In the next three to five years, investments are expected to increase in big data, the industrial IoT, and automation

- In a recent study (May 2015) from Microsoft and Accenture, 86 – 90% of respondents said that increasing their analytical, mobile, and internet of things capabilities would increase the value of their business

- In the near term during the current low crude price cycle, approximately 3 out of 5 respondents said they plan to invest the same amount (32%) or more or significantly more (25%) in digital technologies

89% noted that leveraging more analytics capabilities would add business value
90% felt more mobile tech in the field would add business value
86% felt that leveraging more IIoT and automation would boost value
That’s near unanimous. I’ve never seen management be unanimous about *anything*.


Slide 54

“The oil and gas upstream sector is a complex, data-driven business with data volumes growing exponentially.”

(Feblowitz, 2012)

Structured and Unstructured Data

Slide 55

V’S

Data scientists seem to really like alliteration, for whatever reason.

Slide 56

V’S

VOLUME – VARIETY – VELOCITY – VERACITY

…and all supposedly leading up to “Value”


Slide 57 VOLUME

Seismic data acquisition (wide-azimuth)

Seismic processing

5D interpolated data sets

Fiberoptics

Slide 58

How big is “big”?

In the 80’s, seismic was gigabytes in size; some people were still hand-interpreting on paper.

Static 5D interpolation: can produce file sets that exceed 100 TB in size. Some seismic surveys I’ve seen – regional studies – can reach petabytes. This is partially due to the way that the seismic is acquired. Coil seismic has replaced lines and grids – explain why, and explain why that impacts the size of the data that you’re looking at.

Real-Time: Shell is using fiberoptic cables created in a special partnership with HP for their sensors, and this data is transferred to AWS servers – 1TB / day.

And it’s not just in the engineering realm. On the business side: Chevron’s internal IT traffic alone exceeds 1.5 TB a day – and that’s 2013 numbers.


Slide 59

CAT scanning of cores. What you’re seeing here is a subsection of the well. Pore-scale imaging (.01 to 10 microns) can generate large data sets, as well: a centimeter cubed can exceed 10GB, and when you take into account that you’re measuring 1000 meters of core, that’s 1 exabyte. Reducing the approximations, improving the equations. Images taken from Schlumberger.
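A back-of-the-envelope version of that estimate, as a sketch. The 10 GB per cubic centimeter and the 1000 meters of core come from the note above; the core diameters are hypothetical, since the notes don’t give one, and the total turns out to be very sensitive to that choice. Units are decimal (1 PB = 1,000,000 GB).

```python
import math

GB_PER_CM3 = 10.0        # from the note: "a centimeter cubed can exceed 10GB"
CORE_LENGTH_M = 1000.0   # from the note: 1000 meters of core


def core_scan_size_gb(diameter_cm, length_m=CORE_LENGTH_M, gb_per_cm3=GB_PER_CM3):
    """Data volume for imaging a whole cylindrical core at pore scale."""
    radius_cm = diameter_cm / 2.0
    volume_cm3 = math.pi * radius_cm ** 2 * (length_m * 100.0)
    return volume_cm3 * gb_per_cm3


# Hypothetical core diameters, chosen only to show the spread.
for diameter_cm in (5.0, 10.0, 36.0):
    size_pb = core_scan_size_gb(diameter_cm) / 1e6
    print(f"{diameter_cm:4.0f} cm diameter: about {size_pb:,.0f} PB")
```

With these assumptions the answer runs from tens of petabytes for a slim core up to roughly an exabyte for a very wide one. Either way it’s far more than anyone wants to store naively.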

Slide 60

STRUCTURED

Handled with specific applications used to manage surveying, processing and imaging, exploration planning, reservoir modeling, production, and other upstream activities. The structured stuff’s (mostly) easy to deal with. You might not have standard naming conventions, and it might not always be as complete as you’d like, but (for the most part) you know what you’re getting and you know what it’s intended for.

Slide 61

UNSTRUCTURED

Unstructured or semi-structured such as:
• Emails
• Word processing documents
• Spreadsheets
• Images
• Voice recordings
• Multimedia
• Data market feeds
• Pictures of well logs
• PDF’s

This all makes it difficult and costly to store in traditional data warehouses or routinely query and analyze. Enter Hadoop (or other large-scale unstructured databases).

Slide 62 VARIETY

• Structured

• Standard data models

• SEG-Y

• WITSML

• RESQML

• PRODML

• LAS

• .shp, .lyr, other GIS files

• Unstructured

• Images (maps, embedded well logs in .PDF’s)

• Audio, video

• …and more, on both fronts

And a note – even though data is structured, it can come in a variety of formats. There’s no such thing as a pristine data set, out of the box.
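As a small illustration of “no pristine data set”, here is a sketch of a reader for a tiny LAS 2.0 style well log (LAS here being the Log ASCII Standard used for well logs). The sample text, curve names, and the parser itself are made up for illustration. It handles only the bare minimum: the NULL value declared in the ~Well section, the curve names in ~Curve, and the numbers in ~A. Real files have many more quirks than this.

```python
import io

# Hypothetical three-row log with one missing gamma-ray reading.
SAMPLE_LAS = """\
~Version
 VERS.   2.0 : CWLS LOG ASCII STANDARD - VERSION 2.0
~Well
 NULL.   -999.25 : NULL VALUE
~Curve
 DEPT.M    : DEPTH
 GR  .GAPI : GAMMA RAY
~A
 1000.0   85.2
 1000.5   -999.25
 1001.0   90.1
"""


def parse_las(text):
    """Return one dict per data row, with null readings replaced by None."""
    section, null_value, curves, rows = None, None, [], []
    for line in io.StringIO(text):
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("~"):
            section = line[1].upper()   # V, W, C, A, ...
            continue
        if section in ("W", "C"):
            mnemonic, rest = line.split(".", 1)
            mnemonic = mnemonic.strip()
            if section == "C":
                curves.append(mnemonic)
            elif mnemonic == "NULL":
                # The value sits between the unit field and the colon.
                null_value = float(rest.split(":", 1)[0].split()[-1])
        elif section == "A":
            values = [float(v) for v in line.split()]
            cleaned = [None if v == null_value else v for v in values]
            rows.append(dict(zip(curves, cleaned)))
    return rows


for row in parse_las(SAMPLE_LAS):
    print(row)
```

Even in this toy case the cleaning step is unavoidable: leave the -999.25 in and every average you compute is garbage.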

Slide 63 VELOCITY

Real-time streaming data

Drilling equipment (EDR, LWD, MWD, mud logging…)

Sensors (flow, pressure, ROP, etc.)

Real-time streaming data: offshore, onshore; pipelines, refineries, in the wellbore, on machinery at the wellsite, in office buildings… But, again, it’s that variety in the velocity that’s important. We have some data that comes in immediately, and some that comes in three months later via spreadsheet. How can we consolidate and use both?
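One very simple way to think about consolidating the two, sketched with the standard library: treat both feeds as time-stamped records and merge them into a single timeline once the late data shows up. The record fields and values below are hypothetical, and this assumes each source is already sorted by time. It says nothing about how a production system would actually do it.

```python
import heapq
from datetime import datetime


def merge_by_time(*sources):
    """Merge record lists that are each already sorted by time."""
    return list(heapq.merge(*sources, key=lambda record: record["time"]))


# Hypothetical records: field names and values are made up for illustration.
realtime_sensor = [
    {"time": datetime(2015, 6, 1, 8, 0), "source": "sensor", "pressure_psi": 3010.0},
    {"time": datetime(2015, 6, 1, 8, 5), "source": "sensor", "pressure_psi": 3025.0},
    {"time": datetime(2015, 6, 1, 8, 10), "source": "sensor", "pressure_psi": 3100.0},
]
late_spreadsheet = [
    {"time": datetime(2015, 6, 1, 8, 7), "source": "spreadsheet", "note": "mud weight changed"},
]

timeline = merge_by_time(realtime_sensor, late_spreadsheet)
for record in timeline:
    print(record["time"].isoformat(), record["source"])
```

The spreadsheet note lands between the second and third sensor readings, which is exactly the context you were missing when the pressure jumped.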


Slide 64 VERACITY

…in other words, data quality.

It’s not that great: the “success rate” for exploration is very low.

Slide 65 VERACITY

…in other words, data quality.

…IT’S NOT

THAT GREAT.

It’s not that great: the “success rate” for exploration is very low.

Slide 66

VALUE

…ALL LEADING UP TO

Studies show that a gradual shift to a data and technology-driven oilfield is expected to tap into 125 billion barrels of oil, equal to the current estimated reserves of Iraq. Currently, recovery rates are only about 50%. The biggest risk is finding the oil; the second biggest risk is getting it out of the ground safely.
- Increased speed to first oil
- Enhanced production
- Reduced costs, such as non-productive time
- Reduced risks, especially in the area of health, environment, and safety


Slide 67

“Analytic advantages could help oil and gas companies improve production by 6% to 8%.”

(Bain Energy Report)

Our survey of more than 400 executives in many sectors revealed that companies with better analytics capabilities were twice as likely to be in the top quartile of financial performance in their industry, five times more likely to make decisions faster than their peers and three times more likely to execute decisions as planned. The evidence is compelling. …which leads to more alliteration.

Slide 68

C’S

Remember what I said about data scientists loving alliteration? So you’ve got all this data. How can you use it?

Slide 69

C’S

CREATING – CLEANING – CURATING DATASETS

The business of a data scientist.


Slide 70

C’S

CREATING – CLEANING – CURATING DATASETS

…CHALLENGES

And making sure that data from all sectors is integrated.

Slide 71

BIG 3

ADVANCED ANALYTICS TODAY

And there are opportunities for so many others – everything from HR Analytics, to looking at social media to detect political unrest, to machine learning on seismic to detect channels or slug models – things that geologists usually hunt for

Slide 72 UNCONVENTIONALS

Huge number of wells operating simultaneously

Operators need to make decisions very quickly, and are far removed from central business units – autonomy

• Geology interpretation – comparing geology to production

• New well delivery – improving drilling and completions, reducing lag time and minimizing the number of wells in process at any given moment in time

• Well and field optimization – well spacing and completions techniques (cluster spacing, number of stages, proppants and fluids used, etc.)

“Unconventional resources” such as shale gas and tight oil supply 20% of the gas used in the USA and are expanding rapidly around the globe. Mention the tech talk that you went to that was sponsored by the SPE – Randy LaFollette, Baker Hughes:
- flat time
- which crews are most efficient
- bit economics
- when to use different bits
- mud-motor optimization


Slide 73 CONVENTIONALS

Fewer wells in this scenario

Can still spot trends from the constant streams of information, particularly sensors – spotting where a piece of equipment might fail

Reducing the potential for environmental disasters

Not any of the fancy horizontal drilling. Deepwater wells are key here; onshore is less complex.
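A toy version of “spotting where a piece of equipment might fail”: flag any sensor reading that sits far outside the recent trailing window. This rolling z-score check is a generic technique swapped in for illustration, not something from the talk, and the vibration values, window size, and threshold are all made up.

```python
from collections import deque
from statistics import mean, stdev


def flag_anomalies(readings, window=5, threshold=3.0):
    """Return (index, value) pairs that deviate from the trailing window
    mean by more than `threshold` standard deviations."""
    history = deque(maxlen=window)
    flagged = []
    for index, value in enumerate(readings):
        if len(history) == window:
            mu, sigma = mean(history), stdev(history)
            if sigma > 0 and abs(value - mu) > threshold * sigma:
                flagged.append((index, value))
        history.append(value)
    return flagged


# Hypothetical vibration readings with one obvious spike.
vibration = [1.00, 1.02, 0.98, 1.01, 0.99, 1.00, 1.03, 2.50, 1.01, 0.99]
print(flag_anomalies(vibration))
```

Note that the spike itself inflates the window statistics for the next few readings, so a real monitor would want something more robust than a plain mean and standard deviation.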

Slide 74 MIDSTREAM /

DOWNSTREAM

Monitoring pipelines and equipment for a more predictable and precise approach to maintenance

Preventing shutdowns and launching interventions to prevent spills

Ideally, we would have as few people operating in hazardous locations as possible

Refineries have limited capacity, and fuel needs to be produced as close as possible to its point of end use to minimize transportation costs. Complex algorithms take into account the cost of producing the fuel as well as diverse data such as economic indicators and weather patterns to determine demand, allocate resources and set prices at the pumps.

Slide 75

Historically, oil companies relied on operating models that focused on functional excellence and clear hand-offs from one function to the next.

This process takes time, and it breaks down when you have to make decisions quickly.

Functional excellence isn’t something that can be sacrificed, by any means – it’s just that companies are going to have to leverage technologies in more ways to accelerate the decision making process. Consider, for example, the new well delivery process, where performance metrics such as the time from spud to hookup or the dead time between steps require visibility into activity data from each function involved. If the functions (including land, regulatory, pad construction, drilling, completions and operations) run on different systems and rely on differently constructed data models, it becomes very difficult to have a clear, integrated view of what is happening in the field.
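Once the activity data does live in one place, the metrics themselves are easy. A minimal sketch, assuming each function reports a step name with a start and end date; the dates are invented, and the start of drilling is treated as the spud date for simplicity.

```python
from datetime import date

# Hypothetical activity records: (step, start, end), one per function.
activities = [
    ("pad construction", date(2015, 1, 5), date(2015, 1, 20)),
    ("drilling", date(2015, 2, 1), date(2015, 2, 25)),
    ("completions", date(2015, 3, 15), date(2015, 4, 2)),
    ("hookup", date(2015, 4, 10), date(2015, 4, 12)),
]


def dead_time_days(records):
    """Days of idle time between the end of one step and the start of the next."""
    ordered = sorted(records, key=lambda record: record[1])
    return {
        f"{prev[0]} -> {nxt[0]}": (nxt[1] - prev[2]).days
        for prev, nxt in zip(ordered, ordered[1:])
    }


def spud_to_hookup_days(records):
    """Days from the start of drilling (assumed spud) to the end of hookup."""
    starts = {name: start for name, start, _ in records}
    ends = {name: end for name, _, end in records}
    return (ends["hookup"] - starts["drilling"]).days


print(dead_time_days(activities))
print("spud to hookup:", spud_to_hookup_days(activities), "days")
```

The arithmetic is trivial. The hard part is the one described above: getting land, drilling, and completions to agree on what a “step” and a “start date” even are.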

Slide 76

Each individual function may have a wealth of data, but unless your model can put it all in a single location, analyze it, and place that information in the right hands at the right time, it’s difficult to improve performance.

(Bain Energy Report)

(and I’m paraphrasing) Companies that build better analytics capabilities concentrate their efforts in three areas: technology architecture, interaction between IT and the business, and hiring and retaining strong analytic talent.

Slide 77

THANKS!

Any questions?