solving geophysics problems with python - speaker notes
Slide 1 SOLVING
GEOPHYSICS
PROBLEMS
WITH PYTHON
PAIGE BAILEY
SEPTEMBER 29, 2015
STRATA + HADOOP WORLD 2015
Slide 2 YOUR MISSION, SHOULD YOU CHOOSE TO ACCEPT IT
So you’ll have a good idea of whether you want to stick around or not… ;)
- General overview of what Geophysics is
- Listing of some of my favorite Python and geophysics libraries
- People are doing great work, and deserve to be recognized
- The final topic is going to be what I know best (I guess) – the progression of data throughout the life cycle of the oil industry
Slide 3
WARNING!
…OR DISCLAIMER, RATHER
Slide 4 PAIGE BAILEY
@DynamicWebPaige
Employed by a truly rad technology-focused O&G company by day, MS Earth Sciences graduate student at Rice University by night, founder of PyLadies-HTX (though these sometimes all bleed into one another). Background: degrees are ABA, BA Sociology, BS Geophysics – which is the weirdest combo anyone could ever have.
Slide 5
WHAT IS
“GEOPHYSICS”?
Slide 6
Adrian Lenardic’s first class. Magritte actually had a series of paintings of curiously-shaped rocks suspended in space, or in natural settings. Arches National Park; other curious geologic formations. How did they get there? What processes shaped them? Hydrology and the Talking Heads.
Slide 7
Slide 8
Slide 9
Slide 10
Slide 11
Slide 12
WHAT IS
“GEOPHYSICS”?
Slide 13
WHAT IS
“GEOPHYSICS”?
Slide 14 THEMES
• Gravity
• Heat flow
• Electricity
• Fluid dynamics
• Magnetism
• Radioactivity
• Mineral Physics
• Vibration
…handshakes with atmospheric sciences, geology, engineering, hydrology, planetary sciences, global positioning systems…
Huge concepts, right?
Slide 15 GRAVITY
• Bouguer anomaly
• Geoid
• Geopotential
• Gravity anomaly
• Undulation of the geoid
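To make "Bouguer anomaly" concrete, here is a minimal sketch of the simple (infinite-slab) version of the calculation. The function names, the example readings, and the default crustal density of 2670 kg/m^3 are assumptions chosen for illustration; they are not from the talk, and a real survey would also apply terrain and other corrections.

```python
import math

G = 6.674e-11        # gravitational constant, m^3 kg^-1 s^-2
M_S2_TO_MGAL = 1e5   # 1 m/s^2 = 100,000 mGal


def free_air_correction_mgal(height_m):
    """Free-air correction: roughly 0.3086 mGal per metre of elevation."""
    return 0.3086 * height_m


def bouguer_slab_correction_mgal(height_m, density_kg_m3=2670.0):
    """Attraction of an infinite flat slab of rock: 2 * pi * G * rho * h."""
    return 2.0 * math.pi * G * density_kg_m3 * height_m * M_S2_TO_MGAL


def simple_bouguer_anomaly_mgal(observed_mgal, normal_mgal, height_m,
                                density_kg_m3=2670.0):
    """Observed minus normal gravity, plus free-air, minus slab correction."""
    return (observed_mgal - normal_mgal
            + free_air_correction_mgal(height_m)
            - bouguer_slab_correction_mgal(height_m, density_kg_m3))


# Hypothetical station: 100 m above the reference surface.
anomaly = simple_bouguer_anomaly_mgal(
    observed_mgal=979700.0, normal_mgal=979720.0, height_m=100.0)
print(round(anomaly, 3))
```

A negative result like this one hints at a mass deficit below the station, which is the kind of signal a low-density salt body can produce.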
Slide 16
20,000 feet tall. Cathedral-sized. More than a cathedral. For context, the Empire State Building is like 1300 feet.
Slide 17
And they’re all over the dang place. Mention the Lake Peigneur salt mine fiasco.
Slide 18 HEAT FLOW
• Geothermal gradients and internal heating
• Subsurface heat flow – whole earth geophysics
• Heating of hydrocarbons – if the organic material is too deeply buried, it turns into gas or “overcooks” entirely
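The "too deeply buried" point can be sketched with a linear geothermal gradient. Everything numeric here is an assumption for illustration: a 25 °C/km gradient, a 15 °C surface temperature, and rough temperature cut-offs for the oil and gas windows. Real basins vary a lot, and maturation also depends on time, not just temperature.

```python
def temperature_at_depth_c(depth_km, surface_temp_c=15.0,
                           gradient_c_per_km=25.0):
    """Temperature from a constant (assumed) geothermal gradient."""
    return surface_temp_c + gradient_c_per_km * depth_km


def maturity_window(temp_c, oil_min_c=60.0, oil_max_c=120.0, gas_max_c=200.0):
    """Classify a temperature against assumed, approximate cut-offs."""
    if temp_c < oil_min_c:
        return "immature"
    if temp_c <= oil_max_c:
        return "oil window"
    if temp_c <= gas_max_c:
        return "gas window"
    return "overcooked"


for depth_km in (1.0, 3.0, 6.0, 8.0):
    temp_c = temperature_at_depth_c(depth_km)
    print(depth_km, "km:", temp_c, "C ->", maturity_window(temp_c))
```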
Slide 19 FLUID DYNAMICS
• Isostasy
• Post-glacial rebound
• Mantle convection
• Geodynamo
Rate of lithospheric uplift due to post-glacial rebound, as modelled by Paulson, A., S. Zhong, and J. Wahr, “Inference of mantle viscosity from GRACE and relative sea level data,” Geophys. J. Int. (2007) 171, 497–508. doi: 10.1111/j.1365-246X.2007.03556.x
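Isostasy is the easiest of these to put numbers on. Below is a minimal sketch of the Airy model, in which a mountain is balanced by a buoyant crustal root floating in denser mantle. The densities are typical textbook-style values assumed for illustration, not figures from the talk.

```python
def airy_root_km(topography_km, crust_density=2800.0, mantle_density=3300.0):
    """Depth of the crustal root needed to float a given topographic height.

    Balancing the extra load above against the buoyancy of the root gives
    root = height * rho_crust / (rho_mantle - rho_crust).
    """
    return topography_km * crust_density / (mantle_density - crust_density)


# A 5 km high range needs a root several times deeper than it is tall.
print(airy_root_km(5.0))
```

The same balance, run in reverse, is why land rises when an ice sheet melts: remove the load and the crust rebounds, slowly, at a rate set by mantle viscosity.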
Slide 20
This layered beach at Bathurst Inlet, Nunavut, Canada is an example of post-glacial rebound after the last Ice Age. Little to no tide helped to form its layer-cake look. Isostatic rebound is still underway here.
Slide 21 MAGNETISM
The Earth’s poles sometimes reverse direction – and we don’t know why. North at the bottom, south at the top. What’s interesting is that as the seafloor spreads, cools, and lithifies, certain minerals in the rock orient themselves to align with Earth’s current polarity. This means that as you check magnetism readings along the bottom of the seafloor, you see these wonderful bands.
Slide 22
Whole earth perspective: Earth’s magnetic field
Slide 23 MINERAL PHYSICS
Basically materials science – researching how structures change based on differential heating, pressure, compaction. Same chemical makeup, different expressions and structures.
Slide 24 VIBRATION
(A.K.A., SEISMIC)
A great resource for this is USGS’s earthquakes website.
Slide 25 VIBRATION
(A.K.A., SEISMIC)
…WE’LL TALK ABOUT THIS MORE SOON
Slide 26 …AND UNEXPECTED USE CASES
3D-printing Geology with Python
Joe Kington’s presentation on 3D-printing cubes of geology (to get a better feel for the stratigraphy) and seismic
Slide 27 LIBRARIES / SOFTWARE
MENTIONED
Madagascar, PySIT, Segpy, segpy-py, SLIMpy, Fatiando a Terra, ObsPy, PyGMI, SimPEG, Seismic Handler, sgp4
SgFm, laspy, ParaView Geo, 3ptScience
Agile Geoscience: Bruges, Modelr, Pick This, G3.js, Striplog
ArcPy, PyQGIS …so many other geospatial libraries
- Madagascar – multi-dimensional data analysis, including seismic processing
- PySIT – imaging and inversion
- Segpy – reading and writing SEG-Y files
- segpy-py – reading SEG-Y files
- SLIMpy – processing front end
- Fatiando a Terra – geophysical modeling and inversion; extensive cookbook
- ObsPy – seismology toolbox
- PyGMI – 3D interpretation and modelling of magnetic and gravity data
- SimPEG – simulation and parameter estimation in geophysics; great learning utility
- Seismic Handler – signal processing for earthquakes
- sgp4 – tracking earth satellites
- Py-ART – python ARM radar toolkit (weather data)
- SgFm – sediment transport at geologic scale
- Laspy – LAS file conversion
- ParaView Geo – 3D geoscience visualization
- 3ptScience – Rowan Cockett’s website
- Bruges – modelling and post-processing seismic reflection data
- Modelr – seismic forward modeling on the web
- Pick This – social image interpretation
- G3.js – coming soon, a geoscience wrapper for D3.js
- Striplog – wrangling 1D data, usually core with varying sample rates
- ArcPy – geospatial processing tools for ArcGIS
- PyQGIS – the same, for the open-source mapping alternative QGIS
University of British Columbia
SEG-Y is one of the standards developed by SEG for storing geophysical data
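Since SEG-Y comes up several times in that list, here is a rough sketch of what "reading SEG-Y" means at the byte level, using only the standard library. It assumes the classic layout: a 3200-byte textual header followed by a 400-byte binary header holding big-endian two-byte integers. The helper names and the synthetic header are made up for illustration; for real files, one of the libraries above is the safer route.

```python
import struct

TEXT_HEADER_BYTES = 3200
BINARY_HEADER_BYTES = 400


def parse_segy_binary_header(file_bytes):
    """Pull three common fields out of the 400-byte binary header."""
    start = TEXT_HEADER_BYTES
    binary = file_bytes[start:start + BINARY_HEADER_BYTES]
    if len(binary) < BINARY_HEADER_BYTES:
        raise ValueError("too short to contain a SEG-Y binary header")
    # Offsets are relative to the binary header:
    # file bytes 3217-3218, 3221-3222, and 3225-3226 (1-indexed).
    (sample_interval_us,) = struct.unpack_from(">H", binary, 16)
    (samples_per_trace,) = struct.unpack_from(">H", binary, 20)
    (format_code,) = struct.unpack_from(">H", binary, 24)
    return {
        "sample_interval_us": sample_interval_us,
        "samples_per_trace": samples_per_trace,
        "format_code": format_code,
    }


def make_fake_headers(sample_interval_us, samples_per_trace, format_code):
    """Build an otherwise empty, synthetic 3600-byte header block for testing."""
    headers = bytearray(TEXT_HEADER_BYTES + BINARY_HEADER_BYTES)
    struct.pack_into(">H", headers, TEXT_HEADER_BYTES + 16, sample_interval_us)
    struct.pack_into(">H", headers, TEXT_HEADER_BYTES + 20, samples_per_trace)
    struct.pack_into(">H", headers, TEXT_HEADER_BYTES + 24, format_code)
    return bytes(headers)


# 4 ms sampling, 1500 samples per trace, format code 5 (IEEE float).
fake = make_fake_headers(4000, 1500, 5)
print(parse_segy_binary_header(fake))
```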
Slide 28
ALMOST ALL
OF THAT IS
OPEN-SOURCE
BUT HERE’S THE KICKER:
Slide 29
ALMOST ALL
OF THAT IS
OPEN-SOURCE
(AND SO IS THE DATA)
BUT HERE’S THE KICKER:
USGS puts out scads of data sets; so does NASA. Mention the importance of Python in geoscience research (and science research in general) because there’s a move toward reusable code and repeatable experiments. “Github for scientists is just… Github.”
Slide 30 GEOPHYSICS-FOCUSED
SCIPY TALKS
2012
- ALGES: Geostatistics and Python
- Py-ART: Python for Remote Sensing Science
- Building a Solver Based on PyClaw for the Solution of the Multi-Layer Shallow Water Equations
2013
- Modeling the Earth with Fatiando a Terra
2014
- The Road to Modelr: Building a Commercial Web Application on an Open-Source Foundation
- Measuring Rainshafts: Bringing Python to Bear on Remote Sensing Data
- The History and Design Behind the Python Geophysical Modeling and Interpretation (PyGMI) Package
- Prototyping a Geophysical Algorithm in Python
2015 (and an entire Geophysics Track)
- Using Python to Span the Gap Between Education, Research, and Industry Applications in Geophysics
- Practical Integration of Processing, Inversion, and Visualization of Magnetotelluric Geophysical Data
- Striplog: Wrangling 1D Subsurface Data
- Geodynamic Simulations in HPC with Python
SEG Hackathon – sponsored by Agile Geoscience, I believe it’s their third Saturday and Sunday, October 17th and 18th, so you can go to this without going to the SEG Conference as a whole, if you can’t get off work.
Slide 31 LET’S TALK ABOUT ENERGY
…but now for something completely different. And apologies for focusing on the oil and gas aspects of energy.
Slide 32
FIRST WELL LOG?
Slide 33
- 1927 by Conrad Schlumberger, though he’d been formulating the idea since 1919
- He sent down a sonde (sensor attached to a wire) into a 500m deep well in the Alsace region of France and started collecting information
- “Electrical resistivity log”
- All measurements were made by hand
Slide 34
FIRST
SEISMOGRAPH?
Slide 35
- 1921 by J. Clarence Karcher, who was an Electrical Engineer
- This is the means by which the majority of the world’s oil reserves have been discovered
- Founded Geophysical Service Incorporated in 1930, which eventually turned into Texas Instruments
- Got the idea because his assignment in World War I, the assignment that took him out of grad school, was to locate heavy artillery batteries in France by studying the acoustic waves the guns generated in the air.
- He noticed an unexpected event in his research and switched his concentration to seismic waves in the earth
- He thought it would be possible to determine the depths of the underlying geologic strata by vibrating the earth’s surface while precisely recording and timing the waves of energy
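Karcher's timing idea boils down to one line of arithmetic: a wave goes down to a reflecting layer and back, so depth is velocity times half the two-way travel time. A minimal sketch, assuming a single flat reflector and one constant velocity (2000 m/s is an illustrative value, not from the talk):

```python
def reflector_depth_m(two_way_time_s, velocity_m_s=2000.0):
    """Depth to a flat reflector from two-way travel time.

    Assumes a constant velocity above the reflector and a source and
    receiver at the same surface location.
    """
    return velocity_m_s * two_way_time_s / 2.0


# An echo arriving 1.5 s after the shot puts the layer at 1500 m.
print(reflector_depth_m(1.5))
```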
Slide 36
FIRST OIL WELL?
Slide 37
- Earliest known oil wells were drilled in China, in 347 AD
- These wells had depths of up to about 790 feet, and were drilled using bits attached to bamboo poles
Asphalt was in use more than 4000 years ago, in the construction of the walls of Babylon. Ancient Persians were using petroleum for medicinal and lighting uses. The first streets of Baghdad were paved with tar. Befuddled “shoot the ground and gusher comes up” situations. Producing dozens of barrels a day, maybe hundreds, but recovery rates were exceptionally low, and you weren’t really finding anything interesting.
Slide 38
Drilling has been around for a long time, but its success is due to improved data acquisition and data analysis methods.
I guess the point that I’m trying to make is that… [read slide] Advances in technology create a marked step change in petroleum exploration. Those advances are primarily in terms of better hardware / equipment, which give explorers better data about the subsurface. The data is the key.
Slide 39
NOW
Slide 40
Now, I’m a geophysicist – so those advances are the ones I’m best at spotting.
- Point out the upticks for 2D seismic, better resolution for 3D seismic
- 80’s: 2D data acquired, pre-stack and post-stack imaging, Cray supercomputers
- 90’s: 3D narrow azimuth data, 3D post-stack and pre-stack imaging, Unix
- 00’s: 3D wide azimuth data, imaging, reverse time migration; Linux clusters
- Now: coil shooting, continuous machine-generated sensory data
- Mathematical insights – mention that last night you found out that the guy who first discovered the FFT was a Chevron employee, ain’t no thing
Slide 41
Point out fracking boom, mention that the crazy upward tick has continued, though the steepness of the slope has decreased a bit due to the drop in oil prices
Slide 42 WORLD’S LARGEST PUBLIC, STATE-OWNED,
AND PRIVATE BUSINESSES
Shamelessly stolen from Wikipedia
Slide 43 WORLD’S LARGEST PUBLIC, STATE-OWNED,
AND PRIVATE BUSINESSES
7 out of 10
7 out of 10 of the largest public, state-owned, and private businesses – and a huge proportion of the overall list. Trillions of dollars of revenue. Direct link to reserves and success of a company. We’re selling a thing; the margins on the beef jerky you buy in a gas station are higher than the margins for a barrel of oil
Slide 44
Profitability for oil companies is directly tied to reserves.
Oil companies are all in the business of getting barrels out of the ground – so characterizing the subsurface is incredibly important. Both of those bits of data that I mentioned before – that came so late in the game – were huge technological step changes for the industry, and drastically impacted oil discovery. Improved resolution within the reservoir is critical because deepwater wells cost a lot – $100 million or more – and fully exploiting assets is essential.
Slide 45 UPSTREAM BIG DATA
(Seshadri M., 2013)
Slide 46
Mapping, Reservoir Characterization, Cross-sections, Petrophysics, Reservoir Simulation, Well Planning & Drilling Simulation, Stratigraphic Modeling, Seismic Interpretation
The oil industry is a bit like an ecosystem. This particular piece is subsurface characterization – the earth science-y and engineering bits.
- Every image you see here has a data type (or more!) associated with it, and, though it’s getting better, a shortage of standards
Slide 47
Mapping, Reservoir Characterization, Cross-sections, Petrophysics, Reservoir Simulation, Well Planning & Drilling Simulation, Stratigraphic Modeling, Seismic Interpretation
So these components of the energy ecosystem, and this subsurface data workflow, can be grouped into “earth science-y bits” and “engineering bits”, with this kind of fuzzy area in between with petrophysics. Earth scientists record millions and billions of data points called “seismic”, and they don’t trust any of them unless you put them all together. Engineers trust pressure readings in the well, the stuff they can measure with sensors – and trust it everywhere, and extrapolate everywhere.
Slide 48
Mapping, Reservoir Characterization, Cross-sections, Petrophysics, Reservoir Simulation, Well Planning & Drilling Simulation, Stratigraphic Modeling, Seismic Interpretation
Something that I should also mention is that this is an iterative process. I put a loop here, but in reality, all of these steps can feed back into one another – and a change to one component of the subsurface model drastically impacts all other components. New sorts of geology: horizontal drilling and hydraulic fracturing combined have been revolutionary.
Slide 49
Data impacts the entire value chain.
All that I mentioned before was earth sciences or drilling related – impacting the “upstream” components of the oil industry. But in reality, data impacts every single component of the oil and gas value chain. And what’s more: it’s a variety of data, coming in at asynchronous rates.
Slide 50
How we get it, how we transport it, how we process it, how we use it – and all of these components have the opportunity to be honed by analytics insights. Streamlining the transport, refinement, and distribution of O&G is vital.
Slide 51
THE
FUTURE
Slide 52
2000 – 2010 :
Decade of “Big Data”
So this past decade, the first one of the thousands, 2000 – 2010, has been the decade of “big data”. Kind of a buzzword, right? Like “in the cloud”.
Slide 53
2000 – 2010 :
Decade of “Big Data”
2010 – 2020 :
Decade of Sensing
- and if you thought there was a lot of data in this first decade, you realize there's going to be a heck of a lot more in the second.
- Mobility, infrastructure, and collaboration technologies currently are the biggest investment areas
- In the next three to five years, investments are expected to increase in big data, the industrial IoT, and automation
- In a recent study (May 2015) from Microsoft and Accenture, 86 – 90% of respondents said that increasing their analytical, mobile, and internet of things capabilities would increase the value of their business
- In the near term during the current low crude price cycle, approximately 3 out of 5 respondents said they plan to invest the same amount (32%) or more or significantly more (25%) in digital technologies
- 89% noted that leveraging more analytics capabilities would add business value
- 90% felt more mobile tech in the field would add business value
- 86% felt that leveraging more IIoT and automation would boost value
That’s near unanimous. I’ve never seen management be unanimous about *anything*.
Slide 54
“The oil and gas upstream sector is a complex, data-driven business with data volumes growing exponentially.”
(Feblowitz, 2012)
Structured and Unstructured Data
Slide 55
V’S
Data scientists seem to really like alliteration, for whatever reason.
Slide 56
V’S
VOLUME – VARIETY – VELOCITY – VERACITY
…and all supposedly leading up to “Value”
Slide 57 VOLUME
Seismic data acquisition (wide-azimuth)
Seismic processing
5D interpolated data sets
Fiberoptics
Slide 58
How big is “big”?
In the 80’s, seismic was gigabytes in size; some people were still hand-interpreting on paper.
Static 5D interpolation: can produce file sets that exceed 100 TB in size. Some seismic surveys I’ve seen – regional studies – can reach petabytes. This is partially due to the way that the seismic is acquired.
Coil seismic has replaced lines and grids – explain why, and explain why that impacts the size of the data that you’re looking at.
Real-time: Shell is using fiberoptic cables created in a special partnership with HP for their sensors, and this data is transferred to AWS servers – 1 TB / day.
And it’s not just in the engineering realm. On the business side: Chevron’s internal IT traffic alone exceeds 1.5 TB a day – and that’s 2013 numbers.
Slide 59
CAT scanning of cores. What you’re seeing here is a subsection of the well. Pore-scale imaging (.01 to 10 microns) can generate large data sets, as well: a centimeter cubed can exceed 10 GB, and when you take into account that you’re measuring 1000 meters of core, that’s 1 exabyte. Reducing the approximations, improving the equations. Images taken from Schlumberger.
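The core-scanning arithmetic is worth sanity-checking with a quick sketch. The 10 cm core diameter below is an assumption for illustration (the notes do not give one), and 10 GB per cubic centimeter is taken as a flat rate even though the notes say it "can exceed" that.

```python
import math


def core_scan_bytes(core_length_m, core_diameter_cm, bytes_per_cm3):
    """Data volume for imaging a cylindrical core at a fixed data density."""
    radius_cm = core_diameter_cm / 2.0
    length_cm = core_length_m * 100.0
    volume_cm3 = math.pi * radius_cm ** 2 * length_cm
    return volume_cm3 * bytes_per_cm3


total_bytes = core_scan_bytes(
    core_length_m=1000.0,    # from the notes
    core_diameter_cm=10.0,   # assumed
    bytes_per_cm3=10e9,      # 10 GB per cubic centimeter, from the notes
)
print(total_bytes / 1e15, "PB")
```

Under these assumptions the total lands in the tens of petabytes, so the exabyte figure implies a denser scan or more rock than this sketch assumes. Either way, it is far more than anyone wants to store casually.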
Slide 60
STRUCTURED
Handled with specific applications used to manage surveying, processing and imaging, exploration planning, reservoir modeling, production, and other upstream activities.
The structured stuff’s (mostly) easy to deal with. You might not have standard naming conventions, and it might not always be as complete as you’d like, but (for the most part) you know what you’re getting and you know what it’s intended for.
Slide 61
UNSTRUCTURED
Unstructured or semi-structured such as:
• Emails
• Word processing documents
• Spreadsheets
• Images
• Voice recordings
• Multimedia
• Data market feeds
• Pictures of well logs
• PDF’s
This all makes it difficult and costly to store in traditional data warehouses or routinely query and analyze. Enter Hadoop (or other large-scale unstructured databases).
Slide 62 VARIETY
• Structured
• Standard data models
• SEG-Y
• WITSML
• RESQML
• PRODML
• LAS
• .shp, .lyr, other GIS files
• Unstructured
• Images (maps, embedded well logs in .PDF’s)
• Audio, video
• …and more, on both fronts
And a note – even though data is structured, it can come in a variety of formats. There’s no such thing as a pristine data set, out of the box.
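As a small illustration of "structured but never pristine", here is a sketch of parsing a LAS-style well log, assuming LAS here means the Log ASCII Standard used for well logs. The sample text and the parse_las helper are made up for illustration, and the parser covers only a sliver of the format: it reads curve names and data rows, and swaps the conventional null marker for None.

```python
SAMPLE_LAS = """~Version Information
 VERS.   2.0 : CWLS LOG ASCII STANDARD - VERSION 2.0
 WRAP.   NO  : ONE LINE PER DEPTH STEP
~Well Information
 NULL.   -999.25 : NULL VALUE
~Curve Information
 DEPT.M      : DEPTH
 GR  .GAPI   : GAMMA RAY
 RHOB.G/C3   : BULK DENSITY
~A  DEPT     GR      RHOB
 1000.0   45.2    2.45
 1000.5   -999.25 2.47
 1001.0   60.1    -999.25
"""


def parse_las(text, null_value=-999.25):
    """Return (curve_names, rows) from a minimal, unwrapped LAS-style file.

    The null value is passed in rather than read from the well section,
    to keep the sketch short.
    """
    section = None
    curves = []
    rows = []
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line or line.startswith("#"):
            continue
        if line.startswith("~"):
            section = line[1:2].upper()  # first letter names the section
            continue
        if section == "C":
            # Curve mnemonic is everything before the first dot.
            curves.append(line.split(".", 1)[0].strip())
        elif section == "A":
            values = [float(token) for token in line.split()]
            rows.append({
                name: (None if value == null_value else value)
                for name, value in zip(curves, values)
            })
    return curves, rows


curves, rows = parse_las(SAMPLE_LAS)
print(curves)
print(rows[1])
```

Even in this tiny file, two of the three rows have gaps, which is about what real logs look like.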
Slide 63 VELOCITY
Real-time streaming data
Drilling equipment (EDR, LWD, MWD, mud logging…)
Sensors (flow, pressure, ROP, etc.)
Real-time streaming data: offshore, onshore; pipelines, refineries, in the wellbore, on machinery at the wellsite, in office buildings… But, again, it’s that variety in the velocity that’s important. We have some data that comes in immediately, and some that comes in three months later via spreadsheet. How can we consolidate and use both?
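One hedged answer to "how can we consolidate both?" is to treat every source as a time-stamped stream and merge on the timestamp, however late the data arrived. The record layout and values below are hypothetical, and each source is assumed to be already sorted by time.

```python
import heapq
from datetime import datetime

# Hypothetical records: (timestamp, source, measurement, value)
realtime_sensor = [
    (datetime(2015, 6, 1, 8, 0), "sensor", "pressure_psi", 3150.0),
    (datetime(2015, 6, 1, 8, 5), "sensor", "pressure_psi", 3162.0),
    (datetime(2015, 6, 1, 8, 10), "sensor", "pressure_psi", 3171.0),
]
# Same day, but delivered months later in a spreadsheet.
late_spreadsheet = [
    (datetime(2015, 6, 1, 8, 2), "spreadsheet", "mud_weight_ppg", 9.8),
    (datetime(2015, 6, 1, 8, 7), "spreadsheet", "mud_weight_ppg", 9.9),
]


def consolidate(*sources):
    """Merge already time-sorted sources into one time-ordered list."""
    return list(heapq.merge(*sources, key=lambda record: record[0]))


timeline = consolidate(realtime_sensor, late_spreadsheet)
for record in timeline:
    print(record)
```

Merging on measurement time rather than arrival time means the late spreadsheet rows slot in where they belong, at the cost of having to recompute anything that was derived before they showed up.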
Slide 64 VERACITY
…in other words, data quality.
It’s not that great: the “success rate” for exploration is very low.
Slide 65 VERACITY
…in other words, data quality.
…IT’S NOT
THAT GREAT.
It’s not that great: the “success rate” for exploration is very low.
Slide 66
VALUE
…ALL LEADING UP TO
Studies show that a gradual shift to a data and technology-driven oilfield is expected to tap into 125 billion barrels of oil, equal to the current estimated reserves of Iraq. Currently, recovery rates are only about 50%. The biggest risk is finding the oil; the second biggest risk is getting it out of the ground safely.
• Increased speed to first oil
• Enhanced production
• Reduced costs, such as non-productive time
• Reduced risks, especially in the area of health, environment, and safety
Slide 67
“Analytic advantages could help oil and gas companies improve production by 6% to 8%.”
(Bain Energy Report)
Our survey of more than 400 executives in many sectors revealed that companies with better analytics capabilities were twice as likely to be in the top quartile of financial performance in their industry, five times more likely to make decisions faster than their peers and three times more likely to execute decisions as planned. The evidence is compelling. …which leads to more alliteration.
Slide 68
C’S
Remember what I said about data scientists loving alliteration? So you’ve got all this data. How can you use it?
Slide 69
C’S
CREATING – CLEANING – CURATING DATASETS
The business of a data scientist.
Slide 70
C’S
CREATING – CLEANING – CURATING DATASETS
…CHALLENGES
And making sure that data from all sectors is integrated.
Slide 71
BIG 3
ADVANCED ANALYTICS TODAY
And there are opportunities for so many others – everything from HR Analytics, to looking at social media to detect political unrest, to machine learning on seismic to detect channels or slug models – things that geologists usually hunt for
Slide 72 UNCONVENTIONALS
Huge number of wells operating simultaneously
Operators need to make decisions very quickly, and are far removed from central business units – autonomy
• Geology interpretation – comparing geology to production
• New well delivery – improving drilling and completions, reducing lag time and minimizing the number of wells in process at any given moment in time
• Well and field optimization – well spacing and completions techniques (cluster spacing, number of stages, proppants and fluids used, etc.)
“Unconventional resources” such as shale gas and tight oil supply 20% of the gas used in the USA and are expanding rapidly around the globe. Mention the tech talk that you went to that was sponsored by the SPE – Randy LaFollette, Baker Hughes:
- flat time
- which crews are most efficient
- bit economics
- when to use different bits
- mud-motor optimization
Slide 73 CONVENTIONALS
Fewer wells in this scenario
Can still spot trends from the constant streams of information, particularly sensors – spotting where a piece of equipment might fail
Reducing the potential for environmental disasters
Not any of the fancy horizontal drilling. Deepwater wells are key here; onshore is less complex.
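"Spotting where a piece of equipment might fail" can start as simply as flagging readings that jump away from their recent baseline. This is a minimal rolling z-score sketch, not what any operator actually runs; the window size, threshold, and vibration readings are all made up for illustration.

```python
from statistics import mean, stdev


def flag_anomalies(readings, window=5, threshold=3.0):
    """Return indexes whose value is far from the preceding window's mean."""
    flagged = []
    for i in range(window, len(readings)):
        baseline = readings[i - window:i]
        spread = stdev(baseline)
        if spread == 0:
            continue  # flat baseline: z-score is undefined, skip
        z_score = (readings[i] - mean(baseline)) / spread
        if abs(z_score) > threshold:
            flagged.append(i)
    return flagged


# Hypothetical pump vibration readings with one obvious spike.
vibration = [1.0, 1.1, 0.9, 1.0, 1.05, 1.0, 0.95, 3.0, 1.0]
print(flag_anomalies(vibration))
```

Note the weakness: once the spike enters the baseline window it inflates the spread, so follow-on anomalies are harder to catch. That is one reason production systems lean on more robust statistics.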
Slide 74 MIDSTREAM / DOWNSTREAM
Monitoring pipelines and equipment for a more predictable and precise approach to maintenance
Preventing shutdowns and launching interventions to prevent spills
Ideally, we would have as few people operating in hazardous locations as possible
Refineries have limited capacity, and fuel needs to be produced as close as possible to its point of end use to minimize transportation costs. Complex algorithms take into account the cost of producing the fuel as well as diverse data such as economic indicators and weather patterns to determine demand, allocate resources and set prices at the pumps.
Slide 75
Historically, oil companies relied on operating models that focused on functional excellence and clear hand-offs from one function to the next.
This process takes time, and it breaks down when you have to make decisions quickly.
Functional excellence isn’t something that can be sacrificed, by any means – it’s just that companies are going to have to leverage technologies in more ways to accelerate the decision making process. Consider, for example, the new well delivery process, where performance metrics such as the time from spud to hookup or the dead time between steps require visibility into activity data from each function involved. If the functions (including land, regulatory, pad construction, drilling, completions and operations) run on different systems and rely on differently constructed data models, it becomes very difficult to have a clear, integrated view of what is happening in the field.
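Once activity data from each function sits in one place, the two metrics named above are simple date arithmetic. The sketch below uses a hypothetical, hand-built activity list; the names, dates, and the choice to measure from drilling start (spud) to the end of hookup are assumptions for illustration.

```python
from datetime import date

# Hypothetical activity log for one well: (function, start, end)
activities = [
    ("pad construction", date(2015, 1, 5), date(2015, 1, 20)),
    ("drilling", date(2015, 2, 1), date(2015, 2, 25)),  # start = spud
    ("completions", date(2015, 3, 15), date(2015, 4, 2)),
    ("hookup", date(2015, 4, 10), date(2015, 4, 12)),
]


def dead_time_days(activity_log):
    """Idle days between the end of each step and the start of the next."""
    ordered = sorted(activity_log, key=lambda activity: activity[1])
    return {
        f"{previous[0]} -> {following[0]}": (following[1] - previous[2]).days
        for previous, following in zip(ordered, ordered[1:])
    }


def spud_to_hookup_days(activity_log):
    """Days from the start of drilling to the end of hookup."""
    by_name = {name: (start, end) for name, start, end in activity_log}
    return (by_name["hookup"][1] - by_name["drilling"][0]).days


print(dead_time_days(activities))
print(spud_to_hookup_days(activities))
```

The code is the easy part. The hard part is the one described above: getting land, drilling, and completions to agree on what a "start date" is and to write it down in the same system.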
Slide 76
Each individual function may have a wealth of data, but unless your model can put it all in a single location, analyze it, and place that information in the right hands at the right time, it’s difficult to improve performance.
(Bain Energy Report)
(and I’m paraphrasing) Companies that build better analytics capabilities concentrate their efforts in three areas: technology architecture, interaction between IT and the business, and hiring and retaining strong analytic talent.
Slide 77
THANKS!
Any questions?