arXiv:2108.09605v1 [physics.geo-ph] 22 Aug 2021



Citation O.J. Aribido, G. AlRegib, and Y. Alaudah, “Self-Supervised Delineation of Geological Structures using Orthogonal Latent Space Projection,” in Geophysics, vol. 86, no. 6, accepted Jul. 29 2021

Review Accepted for publication: 29 July 2021

Codes https://github.com/olivesgatech/Latent-Factorization

Bibtex @article{aribido2021self, author = {Aribido, Oluwaseun Joseph and AlRegib, Ghassan and Alaudah, Yazeed}, title = {Self-Supervised Delineation of Geological Structures using Orthogonal Latent Space Projection}, journal = {GEOPHYSICS}, volume = {86}, number = {6}, year = {2021}}

Copyright ©2021 Geophysics. A revised version of this manuscript has been accepted to Geophysics and is awaiting production. The copyrights for the accepted manuscript belong strictly to the Society of Exploration Geophysicists (SEG). This document may be used only for educational and other non-commercial purposes. The full citation to the accepted manuscript will be made available once the DOI has been published.

Contact [email protected]; [email protected], Website: https://ghassanalregib.info; [email protected]



Self-Supervised Delineation of Geological Structures using Orthogonal Latent Space Projection

Oluwaseun Joseph Aribido∗, Ghassan AlRegib∗, Yazeed Alaudah†

∗ Center for Energy and Geo Processing (CeGP), GA 30332-0250, U.S.A., † King Fahd University of Petroleum and Minerals (KFUPM)

ABSTRACT

We developed two machine learning frameworks that could assist in automated litho-stratigraphic interpretation of seismic volumes without any manual labeling from an experienced seismic interpreter. The first framework is an unsupervised hierarchical clustering model that divides seismic images from a volume into a number of clusters determined by the algorithm. The clustering framework uses a combination of density-based and hierarchical techniques to determine the size and homogeneity of the clusters. The second framework is a self-supervised deep learning framework that labels regions of geological interest in seismic images. It projects the latent space of an encoder-decoder architecture onto two orthogonal subspaces, from which it learns to delineate regions of interest in the seismic images. To demonstrate an application of both frameworks, a seismic volume was clustered into various contiguous clusters, from which four clusters were selected based on distinct seismic patterns: horizons, faults, salt domes, and chaotic structures. Images from the selected clusters were used to train the encoder-decoder network. The output of the encoder-decoder network is a probability map of the possibility that an amplitude reflection event belongs to an interesting geological structure. The structures are delineated using the probability map. The delineated images are further used to post-train a segmentation model to extend our results to full vertical sections. The results on vertical sections show that we can factorize a seismic volume into its corresponding structural components. Lastly, we showed that our deep learning framework can be modeled as an attribute extractor, and we compared our attribute results with various existing attributes in the literature, demonstrating competitive performance.


INTRODUCTION

Seismic interpretation requires a detailed understanding of seismic acquisition, processing, and data models to infer geological meaning. The process of seismic data preprocessing and migration involves geometric transformation and analysis to produce an accurate image of Earth’s subsurface. Prestack and poststack preprocessing can introduce artifacts or coherent noise into seismic data, leading to false-positive identification of seismic structures during interpretation. Hence, consistent, accurate interpretation of seismic data requires many years of experience.

Despite improvements in the quality of 3-D migrated seismic data, thorough interpretation of certain geologic elements remains subjective. Subjective interpretation occurs due to the presence of many ‘valid’ interpretations or weak amplitude reflections. One cause of weak amplitude reflection is the absence of bottom-simulating reflectors in the subsurface during acquisition (Bedle, 2019). In addition, interpreted seismic data are considered intellectual property by energy industries. Consequently, publicly annotated data are scarce. All these challenges necessitate a model-based framework that is objective with respect to defined constraints, requires little or no human-assisted labeling, and is powerful enough to learn deep, diverse patterns in seismic data. Recent deep learning success stories have further motivated research into automated seismic interpretation using machine learning. But the limited-label problem is a challenge when training deep learning models imported from computer vision applications, where labels number in the millions. Consequently, deep learning models designed for seismic applications must be trained with an awareness of limited data.

To address these challenges, we propose a self-supervised learning framework that does not require any labels from interpreters. Rather, the model is trained on the physics of seismic patterns, from which homogeneous patterns are separated out using constraints. Various computer-assisted frameworks have been proposed in the literature, which we divide into two broad categories: attribute-based interpretation and machine learning-based interpretation. Seismic attribute-based methods rely on mathematical computations to identify distinctive patterns of seismic amplitudes. These patterns are mapped to a database of previously successful patterns for successful interpretation (Chopra and Marfurt, 2005). Because the computation of geometric attributes can be automated, many seismic attributes have been introduced to the geoscience community (Taner et al., 1994; Barnes, 1992; Chen and Sidney, 1997; Shafiq et al., 2015, 2017, 2018a,b). However, seismic attributes are usually designed to identify specific patterns of interest to an interpreter. Hence, patterns that deviate from the specific target go unidentified. This implies that a suite of complementary attributes must be selected by an interpreter interested in identifying important events. Our proposed model, however, learns patterns without previous specification bias. In addition, it contains millions of parameters, which makes it powerful enough to learn very complex patterns that would be missed by simpler algorithms such as attributes.

Early adoption of machine learning models in seismic research began with supervised methods in which the model has access to labeled data (Di et al., 2018b; Wu et al., 2018; Xiong et al., 2018; Araya-Polo et al., 2017; Dramsch and Luthje, 2018). Several notable works have also attempted to overcome the effect of limited annotated data by employing semi-supervised and weakly supervised techniques (Alaudah and AlRegib, 2016; Alaudah et al., 2019a; Di and AlRegib, 2019; Babakhin et al., 2019; Alfarraj and AlRegib, 2019; Wu et al., 2019; Liu et al., 2019; Di et al., 2018c). In these frameworks, there is less dependence on fully labeled data. In semi-supervised frameworks, for instance, researchers use fewer annotations augmented with pseudo-labels. Weakly supervised learning models use weaker labels, such as image-level labels only (Alaudah et al., 2018). Weak labels are easier to generate in large quantities compared to the pixel-level annotations required in supervised frameworks.

In contrast, our method does not require any annotated labels from interpreters. This makes it as easy to use as attributes, with the exception that we apply a more powerful learnable model. Another relevant body of literature explores unsupervised learning, which also addresses the limited-data problem. These frameworks are mostly based on K-means clustering and Kohonen’s self-organizing maps (SOMs) (Barnes and Laughlin, 2002). The basic workflow includes extracting attributes from seismic data and using a dimension-reduction algorithm, most often principal component analysis (PCA), to identify the most important features. The principal features are then clustered to a specified number of centroids. Although these methods have produced great results, the ground-truth centroids of the attributes are unknown, and manually initialized centroids do not converge to the ground-truth centroids. Secondly, PCA leads to information loss, and the distance metric used in clustering algorithms is usually Euclidean, which affects cluster accuracy. Our proposed clustering framework does not require the number of centroids to be predefined; rather, our algorithm explores multi-scaled, directional spectral information in the images to extract high-dimensional coefficients before clustering them using a custom distance metric.

Lastly, we explore other relevant literature based on target applications. For instance, detection of faults (Araya-Polo et al., 2017; Di et al., 2019; Di and AlRegib, 2019; Shafiq et al., 2018b; Wu et al., 2019; Xiong et al., 2018), delineation of salt bodies (Di et al., 2018a,c; Shafiq et al., 2015), classification of facies (Liu et al., 2019; Dramsch and Luthje, 2018; Qian et al., 2017; Alaudah et al., 2019b,c), prediction of rock lithology from well logs (Alfarraj and AlRegib, 2019; Das et al., 2018, 2019), and segmentation of seismic layers are a few relevant application areas of deep learning in seismic interpretation. In these works, labels of various depths are used to train the network. Our proposed method eliminates this challenge by learning image labels using an unsupervised framework. Dubrovina et al. (2019) propose using the latent space of an encoder-decoder architecture to split and rearrange various parts of a 3D object. The latent space is projected onto summable orthogonal subspaces. Each orthogonal subspace of the latent variable retains low-level features of various parts of the 3D object. However, while their dataset includes ground-truth labels of 3D parts, our method does not include any ground truths for partitioned parts. Li et al. (2019) add an orthogonal penalty to latent variables and show that applying SOMs to the orthogonal features yields more separability of pixel-space features. The methodology presented in Li et al. (2019) is similar to ours in applying orthogonality to latent variables. In our self-supervised framework, we introduce projection matrices to factorize input images, thereby eliminating the need for SOMs on the features.

In this work, we use the F3 block as our dataset. The F3 block is an offshore block in the Dutch sector of the North Sea. The dataset is preprocessed using a dip-median filter to remove random noise and to enhance the edges of the seismic reflections. Seismic traces in the volume are clipped above and below 4.0 times the standard deviation of the volume. All amplitude values are normalized to the range [-1, 1]. The volume is split into 120x120 overlapping patches. We propose a new hierarchical clustering model to group these patches into K classes. Four of these classes are passed to a deep encoder-decoder model augmented with two discriminators. The latent space of the encoder-decoder model is projected onto two learned subspaces. We add constraints to guide the factorization of each input patch into two images. Each synthesized image corresponds to a learned subspace. The subspaces are further constrained to be orthogonal. Though we use discriminators, our method is self-supervised because the adversarial part of our model is used in a multi-task setting apart from learning the factorized latent space, which is done without supervision. Lastly, we conclude by evaluating our proposed method against related research. We further show that our deep encoder-decoder model learns attributes that are qualitatively better than traditional attributes for structural delineation.
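As a concrete illustration of this preprocessing, the sketch below clips, normalizes, and tiles a volume with numpy. It is a minimal sketch under stated assumptions: the paper does not specify the patch stride (a 50% overlap, stride 60, is assumed here), and the dip-median filtering step is omitted.

```python
import numpy as np

def preprocess_volume(volume, clip_sigma=4.0):
    """Clip extreme amplitudes at clip_sigma standard deviations and
    normalize the seismic volume to the range [-1, 1]."""
    sigma = volume.std()
    clipped = np.clip(volume, -clip_sigma * sigma, clip_sigma * sigma)
    return clipped / np.abs(clipped).max()

def extract_patches(section, size=120, stride=60):
    """Slide a size x size window over a 2D section; stride < size
    yields overlapping patches (stride value is an assumption)."""
    patches = []
    for i in range(0, section.shape[0] - size + 1, stride):
        for j in range(0, section.shape[1] - size + 1, stride):
            patches.append(section[i:i + size, j:j + size])
    return np.stack(patches)
```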

HIERARCHICAL CLUSTERING FRAMEWORK

We propose a hierarchical clustering framework to cluster images into K < 20 clusters, where K is the number of clusters. The volume is subdivided into subset blocks. Each subset block contains 15 vertical sections in the inline or crossline direction. We separate inline and crossline subset blocks. The clustering framework consists of several layers. The first layer is initialized using a density-based algorithm, density-based spatial clustering of applications with noise (DBSCAN) (Ester et al., 1996), to produce contiguous clusters. All other layers consist of hierarchical merges of clusters from preceding layers. Three sections are extracted from each subset block, taken at intervals of five sections apart. A set of three sections at intervals of five apart is called a category. Hence, each subset block contains three categories.

We do not randomize all the images collected from the volume to make them independent and identically distributed, to avoid losing domain correlation information between sections. However, within each category, we randomize the order of all images. Each category contains 270 images for inline and 165 images for crossline. Inline categories are clustered separately from crossline categories due to differences in feature representation between the two sets.

Clustering with DBSCAN

DBSCAN uses density-based metrics for cluster discovery. Two parameters are usually defined for DBSCAN: m and ε. m is the minimum number of points that can form an independent cluster, while ε is the maximum distance between two core points. Given a group of points, DBSCAN determines whether a point is a core or border point. The former belongs to a cluster with at least m points within ε distance of each other. The latter is within ε distance of some core point but not within ε distance of m points. At initialization, one point is randomly chosen from the dataset, and neighboring points are checked to determine whether they are core or border points. If m points are core points, a cluster is established, and other core points and border points within ε distance are added to that cluster. Points that are neither core nor border points are considered noise. In Figure 1, we show a toy example of DBSCAN with m = 3. The red points are core points because they have m ≥ 3 neighboring points within ε distance. The yellow points are border points, because they are within ε distance of fewer than m points but are in the neighborhood of a core point. All core points are considered density-reachable from other core points. The blue points are considered noise because they are not within ε distance of any point in the group.

Figure 1: A toy example of the DBSCAN algorithm with m = 3. The radius of each circle represents the ε distance from that point. The red points are core points because they are within ε distance of at least m neighbors. Yellow points are border points because they have fewer than m points in their ε neighborhoods. The blue point is considered noise because it is not within ε distance of any other point.

To specify a distance metric for DBSCAN between two seismic images, we use a similarity (SIM) metric proposed by Alfarraj et al. (2016). The authors showed that SIM compares curvelet coefficients from two images. Curvelet coefficients (Starck et al., 2002) are multi-scaled decompositions of an image obtained by tiling the frequency domain with trapezoidal-shaped tiles (Alfarraj et al., 2016). Additionally, SIM outperforms many other similarity measures for texture images such as S-SSIM (Wang et al., 2004), CW-SSIM (Gao et al., 2011), and STSIM (Zujovic et al., 2013). The tiled frequencies, collected at various scales (here 32) of the image in various directions, make the algorithm a good edge detector and hence suitable for seismic images. Singular value decomposition (SVD) is applied to the curvelet coefficients extracted from the images to select the best coefficients. The reduced coefficients are then compared using Czekanowski’s similarity to obtain a score in the range [0, 1] (Alfarraj et al., 2016):

$$\text{SIM}(I_1, I_2) = \frac{\|v_1 - v_2\|_1}{\|v_1 + v_2\|_1}. \quad (1)$$

SIM(I₁, I₂) returns a value in [0, 1] for each image pair, where I₁ and I₂ are images and v₁ and v₂ are the SVD-reduced curvelet coefficients. Hence, a score of 0 implies perfectly similar, while 1 denotes perfectly dissimilar.
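A minimal implementation of equation 1 is a one-liner; the sketch below assumes v1 and v2 are the SVD-reduced curvelet coefficient vectors already extracted from the two images.

```python
import numpy as np

def sim_distance(v1, v2):
    # Equation 1: ratio of L1 norms of the difference and sum of the
    # SVD-reduced curvelet coefficient vectors.
    # 0 = perfectly similar, 1 = perfectly dissimilar.
    return np.linalg.norm(v1 - v2, ord=1) / np.linalg.norm(v1 + v2, ord=1)
```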

Inline and crossline images are fed sequentially into the DBSCAN algorithm. By experimenting with several hyperparameter values, we set ε = 0.10 for inline images and ε = 0.09 for crosslines, while m = 3 for both inline and crossline images. Thus, for any image occurring at any location within any section, if that image is not within ε distance of any other image in the same category, it is marked as noise. We keep ε small to ensure thorough discrimination between clusters. The drawback of a small ε is that there will be many (∼30) small clusters per category. However, this remains solvable, since a hierarchical algorithm is applied to merge contiguous clusters.
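For illustration, the sketch below feeds a precomputed SIM distance matrix to scikit-learn's DBSCAN, reusing the sim_distance helper above. The helper name cluster_category and the use of scikit-learn are assumptions for this sketch, not the authors' implementation.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def cluster_category(feature_vectors, eps=0.10, min_samples=3):
    """Cluster one category of images given their SVD-reduced curvelet
    feature vectors. eps = 0.10 for inline (0.09 for crossline), m = 3."""
    n = len(feature_vectors)
    dist = np.zeros((n, n))
    for i in range(n):
        for j in range(i + 1, n):
            d = sim_distance(feature_vectors[i], feature_vectors[j])
            dist[i, j] = dist[j, i] = d
    # metric='precomputed' lets DBSCAN consume the pairwise SIM distances;
    # a label of -1 marks images flagged as noise.
    return DBSCAN(eps=eps, min_samples=min_samples,
                  metric='precomputed').fit_predict(dist)
```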

Algorithm 1

In the first step, we merge intra-category clusters. As shown in Figure 2, we arrange the clusters in order of cluster size. Each circle represents a cluster, and the blue column represents one category. We select a handful of images from the smallest cluster and compare these images to the remaining clusters inside the category. The smallest cluster is then merged with the most similar cluster, in this case cluster three. The algorithm is iterated until the number of clusters is reduced from ∼30 to a predefined small number (e.g., 8).

Figure 2: Algorithm 1 procedure for intra-category merging of clusters. The blue circles are clusters arranged in ascending order of size. For one merge instance, we compare cluster 1 with all other clusters until we find the closest match, e.g., cluster 3, in which case we merge clusters 1 and 3 and re-sort all clusters.

Algorithm 2

In the second step, we merge inter-category clusters. In Figure 3, the blue column is category A and the yellow column is category B. A few images are randomly sampled from each cluster in A and compared with all clusters in B. All clusters in A with a one-to-one mapping, in best proximity to clusters in B, are merged. If, for instance, clusters 1, 2, and 4 in A all select cluster 2 in B as their closest cluster (a many-to-one mapping), we do not merge any of these clusters. However, we merge clusters 3-to-1 and N-to-N. After merging similar clusters, the remaining clusters in B are transferred to A.


Figure 3: Algorithm 2 procedure for inter-category merging of clusters. The blue and yellow columns are two categories, A and B, whose clusters are to be merged. We compare each cluster in A with every cluster in B to find the closest match. In this illustration, clusters 1, 2, and 4 in A all select cluster 2 in B as their closest match, while cluster 3 in A maps to cluster 1 in B. Assuming we obtain a one-to-one mapping for all other clusters in A, we merge only clusters with a one-to-one match between A and B. Clusters with a many-to-one mapping (1, 2, and 4) are left unmerged and carried over to the next layer.

Figure 4 illustrates how the hierarchical clustering module works. Both algorithms are applied alternately on the data until the total number of categories is reduced to 8 and the total number of clusters across all categories is reduced to 14; a sketch of this merging loop follows below. Lastly, we inspect all 14 clusters. The total number of clustered images was 6386. There are eight visually unique clusters, including noise. Hence, we combine repetitive clusters. For instance, dipping amplitudes and non-dipping amplitudes can be combined into the ‘horizons’ cluster.
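The following sketch is a simplified rendering of algorithms 1 and 2, reusing sim_distance from above. It assumes each cluster is a list of feature vectors and that cluster similarity is measured by averaging SIM distances over a handful of sampled images; the paper does not specify the exact comparison rule, so that proxy is our assumption.

```python
import random

def cluster_distance(cluster_a, cluster_b, n_samples=5):
    """Assumed proxy: mean over sampled images in A of the closest SIM
    distance to any image in B."""
    sample = random.sample(cluster_a, min(n_samples, len(cluster_a)))
    return sum(min(sim_distance(f, g) for g in cluster_b)
               for f in sample) / len(sample)

def merge_intra(clusters, target=8):
    """Algorithm 1: repeatedly fold the smallest cluster into its
    closest match within the same category."""
    while len(clusters) > target:
        clusters.sort(key=len)                  # ascending by size
        smallest, rest = clusters[0], clusters[1:]
        closest = min(rest, key=lambda c: cluster_distance(smallest, c))
        closest.extend(smallest)
        clusters = rest
    return clusters

def merge_inter(cat_a, cat_b):
    """Algorithm 2: merge only clusters with a one-to-one closest match;
    transfer the remaining clusters of B into A."""
    match = {i: min(range(len(cat_b)),
                    key=lambda j: cluster_distance(cat_a[i], cat_b[j]))
             for i in range(len(cat_a))}
    counts = {}
    for j in match.values():
        counts[j] = counts.get(j, 0) + 1
    merged = set()
    for i, j in match.items():
        if counts[j] == 1:                      # one-to-one mapping: merge
            cat_a[i].extend(cat_b[j])
            merged.add(j)
    cat_a.extend(c for j, c in enumerate(cat_b) if j not in merged)
    return cat_a
```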



Figure 4: Hierarchical cluster merging. L0 is the first layer and contains clusters from DBSCAN. Subsequent layers are obtained by iterating algorithms 1 and 2. L1 is the second layer of clusters after the first iteration of both algorithms, L2 is the third layer, and so on.

In Figure 5, Cluster 0 is labeled ‘chaotic’ because the dominant geologic component highlighted by the clustering algorithm is the chaotic facies. Cluster 1 contains images from the salt dome region of the volume. In this cluster, it is interesting that although the patterns of the reflection amplitudes are diverse, our algorithm clusters them accurately. Cluster 2 identifies parallel reflectors, which we label ‘horizons’. The horizons cluster contains the largest number of images of all clustered images, and the clustering algorithm also attains the highest accuracy in identifying this cluster. Cluster 3 captures faulted regions of the volume, labeled ‘faults’. Notice that some discontinuities are labeled as faults. Cluster 4 captures another variant of the chaotic class with some non-conformity in seismic reflections. Cluster 5 contains images taken from the bright-spot region of the seismic volume. Interestingly, our algorithm differentiated this class of images from the horizon class in Cluster 2. Cluster 6 highlights images from the top of the salt dome; they were clustered into a different class by our algorithm because they did not fully represent structures captured inside the salt dome. Cluster 7 shows irregular seismic reflections. Two of them contain chaotic regions while others do not fit into any of the other classes. Several classes, like clusters 7 and 5, will not be used in training our deep learning framework but are left for further study in future research. Figure 5 contains the output of our clustering framework.

Four clusters were selected from Figure 5 for training the deep learning model. These clusters will be referred to as classes hereafter. From the result in Figure 5, the horizons (class 2), faults (class 3), salt domes (class 1), and chaotic (class 0) labels are representative of the selected images. The horizons cluster is the largest class with 1258 images, and the chaotic class is the smallest with 389 images. The fault and salt dome classes have 720 and 500 images, respectively. We discarded images clustered as noise. Hence, for balanced classes, we randomly select 389 images from each class, making a total of 1556 training images.

DEEP LEARNING FRAMEWORK

We have a set of clustered seismic images in four classes, each with an equal number of images. The assignment of image labels has been done by our clustering module, similar to the example in Figure 6a. We propose an encoder-decoder model to learn annotations as shown in Figure 6b. Clustered images were not pixel-annotated; Figure 6b is only for illustration. The architecture of the proposed model is shown in Figure 7. The set of all training images is designated $X : \{x_1, x_2, ..., x_N\}$. The encoder is designated $q_\phi(x_i)$ and the decoder is designated $p_\theta(z_i)$. Additionally, $z_i \in Z$ is the latent vector corresponding to $x_i$. The learning parameters of the encoder and decoder are $\phi$ and $\theta$ respectively, and $p_\theta(X)$ is the probability distribution of all seismic images over parameter $\theta$. Thus, the log-probability of the data is

$$\log p_\theta(X) = \sum_i \log p_\theta(x_i).$$

Now, $p_\theta(x_i)$ is intractable because the posterior $p_\theta(z_i|x_i)$ is intractable (Kingma and Welling, 2013). For notational convenience, we drop the subscript $i$ on $x_i$ and $z_i$, as all future references to these variables refer to a single data instance during iterative training or inference.

We introduce two projection matrices, $P_1$ and $P_2$. Both matrices are fully connected layers of size 1024x1024. The matrices project $z$ onto the orthogonal subspaces $z_1$ and $z_2$, respectively. In Figure 6b, $z_1$ and $z_2$ are mapped to the blue and red annotations in each input image. Note that we do not assume our clusters contain no images with overlapping class representations. In one forward pass, operators $P_1$ and $P_2$ are applied to $z$ thus:

$$z_1 = P_1 z, \qquad z_2 = P_2 z. \quad (2)$$

$x_1$ and $x_2$ are synthesized from the decoder $p_\theta(z)$. The decoder tries to learn some properties of each seismic image and applies that knowledge to reconstructing both images:

$$x_1 \sim p_\theta(z_1), \qquad x_2 \sim p_\theta(z_2). \quad (3)$$

We desire the reconstructed image $r$ to be as similar to the input image as possible. Note that $r$ is usually blurry in encoder-decoder networks. Hence, we use a discriminator, $D_1$, to ensure $r$ is sharp. A second discriminator, $D_2$, is applied to the encoder to ensure the latent variables $z$ are spread out. In the original variational auto-encoder architecture, Kingma and Welling (2013) derive the evidence lower bound for reconstructing $x$. Because we modify the behavior of the latent space by projecting it onto orthogonal latent subspaces, we re-derive the evidence lower bound in the context of the projected latent space. The decoder distribution $p_\theta(X)$ is intractable because the posterior is intractable. Hence, we sample from an auxiliary distribution, $q_\phi(z)$. We drop $\theta$ and $\phi$ for notational convenience, knowing $p$ and $q$ are distributions over the respective parameters. The derivation of the evidence lower bound (ELBO) on the log-likelihood of $X$ is as follows:


Figure 5: Result showing eight clusters at the final layer.

Figure 6: A sample section image from inline 400: (a) shows four samples of images clustered by our hierarchical model; (b) shows manually labeled structural components versus background. Structural components are annotated in green and the background in red.

$\mathbb{E}_{x\sim p_x}[\log(p_\theta(x))]$ is the log-likelihood of generating the data. Next, we introduce $z_1$ and $z_2$ and sum over their marginals:

$$\mathbb{E}_{x\sim p_x}\left[\log(p_\theta(x))\right] = \mathbb{E}_{x\sim p_x}\left[\log\left(\sum_{z_1}\sum_{z_2} p_\theta(x, z_1, z_2)\right)\right]. \quad (4)$$

We sample from $q(z|x)$ and take the expectation over its conditional distribution because we cannot sample from the posterior $p(z|x)$:

$$(4) \ge \mathbb{E}_{x\sim p_x}\left[\mathbb{E}_{z_1\sim q(z_1|x)}\log\left(\frac{p(z_1)}{q(z_1|x)}\right) + \mathbb{E}_{z_2\sim q(z_2|x)}\log\left(\frac{p(z_2)}{q(z_2|x)}\right)\right] + \mathbb{E}_{x\sim p_x}\left[\mathbb{E}_{q(z_1,z_2|x)}\log\left(p(x|z_1,z_2)\right)\right]. \quad (5)$$

Simplifying the expressions in (5) in terms of KL divergences gives:

$$\mathbb{E}_{x\sim p_x}\left[-KL(q(z_1|x)\,\|\,p(z_1)) - KL(q(z_2|x)\,\|\,p(z_2))\right] + \mathbb{E}_{q(z_1,z_2|x)}\log\left(p(x|z_1,z_2)\right). \quad (6)$$

We can write the KL terms in (6) in entropy terms:

$$\mathbb{E}_{x\sim p_{data}}\left[\mathbb{E}_{q_\phi(z_1,z_2|x)}\log\left(p_\theta(x|z_1,z_2)\right) + H(q(z_1|x)) + H(q(z_2|x)) - H(p(z_1)) - H(p(z_2))\right], \quad (7)$$

where $KL(a\|b)$ is the Kullback-Leibler divergence between distributions $a$ and $b$. We conclude from equation 7 that: evidence ≥ reconstruction + entropy of projected latent codes − entropy of actual latent priors. In equation 7, the entropy of the encoder is $H(q(z|x))$, and the conditional entropies of the encoder due to the projected priors are $H(q(z_1|x))$ and $H(q(z_2|x))$. The entropies of the priors in the orthogonal subspaces are $H(p(z_1))$ and $H(p(z_2))$. If the conditional entropies of the encoder after projecting the latent space match the entropies of the projected priors, i.e., $H(q(z_1|x)) - H(p(z_1)) = 0$ and $H(q(z_2|x)) - H(p(z_2)) = 0$, then the reconstructed image $r$ will be a perfect reconstruction of $x$. This would only happen if the projected priors are perfectly sampled from the actual priors of the distributions representing the foreground and background of each $x$. Two images are reconstructed by the decoder: $x_1$ and $x_2$. Here, we assume that the encoder learns not only to generate the latent codes, but understands that it needs to generate latent codes that match the two virtual distributions in the decoder. Hence, it must generate one latent code that encapsulates two virtual distributions in the decoder. The decoder learns parameters $\theta$ such that each reconstructed seismic image has two factors inherent in it. It remains that the decoder must be constrained to understand both factors.

Latent space factorization using projection matrices

A projection matrix $P$ is an $n \times n$ square matrix that defines a vector subspace $T$ as the projection of $\mathbb{R}^n$ onto $T$. For any real operator $P$ to be a valid projection matrix, it must satisfy the following properties:

1. $P = P^*$ ($P^*$ is the adjoint of $P$).

2. $P^2 = P$ (idempotence).

Figure 7 shows two projection matrices, $P_1$ and $P_2$, designed to be mutually orthogonal. However, we impose no constraint making either of them an orthogonal projection matrix, in which case we would need $P = P^T$ for a real $P$. To impose projection-matrix behavior on both matrices, we create a fully connected layer in the model and initialize it using a uniform distribution. Both matrices are then learnable, and both the projection and orthogonality constraints can be imposed on them by solving an optimization objective. We formulate the following projection loss function:

$$\mathcal{L}_{proj} = \sum_{i,j,\, i\neq j}^{2} P_i^T P_j + \sum_{i=1}^{2} \left(P_i^2 - P_i\right), \quad (8)$$

where $\mathcal{L}_{proj}$ is the projection loss, added to the adversarial losses. We discuss the discriminators and their corresponding losses below.
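As an illustration, the sketch below realizes P1 and P2 as learnable linear layers and implements a version of equation 8 in PyTorch. Summing absolute values of the penalty terms (so the loss is non-negative) is our assumption, since equation 8 leaves the matrix-to-scalar reduction implicit.

```python
import torch
import torch.nn as nn

class LatentFactorizer(nn.Module):
    """Two learnable 1024x1024 projection layers applied to the latent code z."""
    def __init__(self, dim=1024):
        super().__init__()
        self.P1 = nn.Linear(dim, dim, bias=False)
        self.P2 = nn.Linear(dim, dim, bias=False)

    def forward(self, z):
        return self.P1(z), self.P2(z)          # z1 = P1 z, z2 = P2 z (eq. 2)

    def projection_loss(self):
        """A version of equation 8: mutual orthogonality plus idempotence
        penalties on the weight matrices (absolute-sum reduction assumed)."""
        W1, W2 = self.P1.weight, self.P2.weight
        ortho = (W1.t() @ W2).abs().sum()            # P1^T P2 -> 0
        idem = ((W1 @ W1 - W1).abs().sum()
                + (W2 @ W2 - W2).abs().sum())        # P^2 -> P
        return ortho + idem
```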

Adversarial Training

Goodfellow et al. (2014) introduced generative adversarial networks (GANs). A GAN sets up an adversarial min-max game between a generator G and a discriminator D. During training, the generator attempts to generate an image that lies in the dataset distribution space such that the discriminator is unable to distinguish whether the image is real or generated, while the discriminator gets better at detecting real and generated (fake) images. This adversarial setup constrains the generator to generate sharp images, almost at the quality of the input image. GANs are superior to basic encoder-decoder models in the quality of the images they generate. Images reconstructed from encoder-decoder models are usually blurry. The blurriness comes from using the mean squared error (MSE) loss between the reconstructed and original images, which implies the error between both images is Gaussian noise; this is not the case for seismic images (Zhao et al., 2017).

For notational purposes, our decoder plays the role of a generator G in this section, while our encoder is represented as E in the adversarial setting. The two discriminators specified in Figure 7 retain their notation, $D_1$ and $D_2$. $D_1$ ensures a sharp reconstruction $r$, while $D_2$ helps spread out the latent variables $z$. As seen in Figure 7, $r = x_1 + x_2$. We introduce an L1 loss between $x_1$ and $x_2$ to enforce structural difference:

$$\mathcal{L}_{diff} = -1 \times \sum_{n=1}^{N} |x_1 - x_2|, \quad (9)$$

where $\mathcal{L}_{diff}$ is the L1 penalty. Since discriminator $D_1$ takes $r$ as input, the adversarial loss further ensures $r$ is similar to $X$. The adversarial loss for $D_1$ versus G becomes:

$$\min_G \max_{D_1} \mathbb{E}_{x\sim p_{data}}\left[\log(D_1(x))\right] + \mathbb{E}_{z\sim q(z)}\left[\log(1 - D_1(G(z)))\right]. \quad (10)$$

Similarly, we define an adversarial loss on the latent space $z$.

Training a GAN is very challenging. One experimental problem with training GANs is mode collapse. Mode collapse occurs when G reconstructs the same output image for varying latent variables $z$; $r$ then converges to a local minimum of one or a few images without generalizing properly over the whole data distribution. To address this problem, we apply discriminator $D_2$ on $z$. $D_2$ helps impose a uniform distribution on $z$ and flattens out the mixed Gaussian space of $z$:

$$\min_E \max_{D_2} \mathbb{E}_{z\sim q(z)}\left[\log(D_2(z))\right] + \mathbb{E}_{u\sim U[0,1]}\left[\log(1 - D_2(u))\right]. \quad (11)$$

Equation 11 defines the loss on $D_2$, which flattens the mode of the Gaussian distribution in the latent space. Thus, the mode-collapse problem in the latent space is diminished. The losses in equations 10 and 11 are denoted $\mathcal{L}_{D_1}$ and $\mathcal{L}_{D_2}$, respectively.
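A sketch of the two discriminator objectives in PyTorch follows, assuming $D_1$ and $D_2$ end in a sigmoid so that binary cross-entropy reproduces the log terms of equations 10 and 11; the real/fake role assignment follows equation 11 as written.

```python
import torch
import torch.nn.functional as F

def d1_loss(D1, x, r):
    """Equation 10, discriminator side: D1 should score real patches x
    high and reconstructions r low (generator side handled separately)."""
    p_real, p_fake = D1(x), D1(r.detach())
    return (F.binary_cross_entropy(p_real, torch.ones_like(p_real))
            + F.binary_cross_entropy(p_fake, torch.zeros_like(p_fake)))

def d2_loss(D2, z):
    """Equation 11 as written: D2 scores the latent code z against
    samples u drawn from a uniform distribution on [0, 1]."""
    u = torch.rand_like(z)
    p_z, p_u = D2(z.detach()), D2(u)
    return (F.binary_cross_entropy(p_z, torch.ones_like(p_z))
            + F.binary_cross_entropy(p_u, torch.zeros_like(p_u)))
```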

Reconstruction Loss

In equation 9, we impose a structural difference loss between $x_1$ and $x_2$. We need a reconstruction loss to ensure both images do not collapse to a trivial solution while minimizing $\mathcal{L}_{diff}$ in equation 9. A trivial solution to $\mathcal{L}_{diff}$ could be $x_1 = 1$, a matrix of all ones, and $x_2 = 0$, a matrix of all zeros. To avoid this, we define a reconstruction loss on $r$ as follows:

$$\mathcal{L}_{reconst.} = \frac{1}{N}\sum_{n=1}^{N} |x - r|^2. \quad (12)$$

But $r = x_1 + x_2$. Hence,

$$\mathcal{L}_{reconst.} = \frac{1}{N}\sum_{n=1}^{N} |x_n - (x_{1n} + x_{2n})|^2. \quad (13)$$
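Equations 9 and 13 translate directly into tensor operations; the sketch below assumes x, x1, and x2 are PyTorch tensors of the same shape.

```python
def diff_loss(x1, x2):
    """Equation 9: the negative L1 distance; minimizing it pushes the
    factorized images x1 and x2 apart structurally."""
    return -(x1 - x2).abs().sum()

def reconstruction_loss(x, x1, x2):
    """Equation 13: mean squared error between the input x and the
    reconstruction r = x1 + x2."""
    return ((x - (x1 + x2)) ** 2).mean()
```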


Figure 7: Diagram of the proposed deep learning module. X is the set of all input images and $q_\phi(\cdot)$ is the encoder. $z$ is the latent code, factorized into $z_1$ and $z_2$ by the projection matrices $P_1$ and $P_2$. $p_\theta(\cdot)$ is the decoder that outputs the factorized images $\hat{X}_1$ and $\hat{X}_2$, whose sum R is judged real or fake by discriminator $D_1$, while discriminator $D_2$ compares $z$ with samples from a uniform distribution.

Equation 13 ensures both $x_1$ and $x_2$ are valid seismic images and enforces cycle consistency with the input image.

Lastly, the composite model is trained with losses (8)-(13) over 300 epochs; a sketch of the loss routing follows below. After each forward pass, we backpropagate $\mathcal{L}_{reconst.}$ to update E, G, $P_1$, and $P_2$. $\mathcal{L}_{D_1}$ is backpropagated to update $D_1$ and G, while $\mathcal{L}_{D_2}$ is backpropagated to update $D_2$ and E. Finally, the $\mathcal{L}_{proj}$ and $\mathcal{L}_{diff}$ losses are used to update $P_1$ and $P_2$ simultaneously.

RESULTS

We show qualitative results on the trained images and generalize the predictions on the output images to make predictions on full vertical seismic sections. In addition, we show how our method can be applied to attribute extraction. In our attribute extraction process, we analyze two complementary attributes and compare them to six existing attributes from the literature.

First, each output image from the entire workflow has both an assigned class and a pixel label corresponding to a region of interest. In Figure 8, the input image $x$ obtained from our clustering module is passed to the latent space model to obtain $x_1$ and $x_2$. In each sub-figure of Figure 8, $x_1$ and $x_2$ are synthesized by the model to be structurally complementary to each other, such that their combination reconstructs $x$, and the L1 norm of their difference reveals a geologically meaningful region. In each sub-figure, we show sparse inference regions using confidence values. Each pixel value is a probability in [0, 1], where 0 is the probability that a pixel is not part of a geological class and 1 is the opposite. For improved performance, we threshold out low probability values to obtain the cleaner binary image at the bottom right. Evidence that both $x_1$ and $x_2$ are factorized from $x$, as we hypothesized, is given by their mean squared error (MSE) against the input image $x$. In all four sub-figures, both synthesized images ($x_1$, $x_2$) have higher MSEs than the reconstruction $r$. This implies that both images are not as similar to the input image as the reconstructed one, because they are factorized components of the input image. Further evidence is observed in $x_1$ in Figures 8(a), 8(c), and 8(d) and in $x_2$ in Figure 8(b). In all these instances, the model generates an image that is unlike any image in our training set, while the orthogonal counterparts of these images are similar to our input images with more contrast. Hence, $x_1$ and $x_2$ are equivalent representations of the orthogonal subspaces onto which their latent subsidiaries were projected.

The image label and pixel annotation obtained in Figure 8 can be generalized to label sections of the F3 block to assist with seismic interpretation. We trained a segmentation model, DeepLab V3+ (Chen et al., 2017), on all output images from our proposed pipeline and applied a soft-max classification layer to assign each reflection amplitude to one of the four pre-defined classes. Furthermore, we ran another experiment to determine whether each amplitude in the seismic section belongs to a structure class; this latter experiment fits into the attribute extraction framework discussed later. In all predicted pixel labels, we threshold out small probability values to improve our confidence in the pixel labels. Figure 9 shows four threshold levels: 0.25, 0.35, 0.45, and 0.55.

As shown in Figure 9, images with a cut-off of 0.55 performed best on structural classification of the seismic amplitudes when trained on the DeepLab V3+ model. Hence, we report results based on a threshold of 0.55.


Figure 8: Output images from our latent space factorization model trained on 120x120 patches. Each sub-figure shows six images: the input image $x$; $x_1$, synthesized from projection matrix $P_1$; $x_2$, likewise synthesized from $P_2$; the reconstruction $r$; the L1-norm sparse output $|x_1 - x_2|$ obtained from minimizing $\mathcal{L}_{diff}$ in equation 9; and the output with pixel values below 0.15 thresholded out. (a) Delineation of chaotic facies ($x_1$ MSE = 0.0796, $x_2$ MSE = 0.0628, $r$ MSE = 0.0016). (b) A patch from a salt region ($x_1$ MSE = 0.0596, $x_2$ MSE = 0.0933, $r$ MSE = 0.004); notice that the sparse pixel labels identify interesting structures in the image. (c) Parallel reflectors delineated differently from regions of smoothed amplitude reflections ($x_1$ MSE = 0.0754, $x_2$ MSE = 0.0646, $r$ MSE = 0.0014). (d) Step-wise amplitude reflectors in the faulted region ($x_1$ MSE = 0.0839, $x_2$ MSE = 0.064, $r$ MSE = 0.0014). Our method does not delineate faults along the faulting planes because it is unsupervised, but this regional delineation is sufficient to extend our patch model to fault region delineation on full sections. Lastly, $x_1$ in (a), (c), and (d) and $x_2$ in (b) were synthetically generated by our algorithm to satisfy the factorization objective, as they are dissimilar to images in our training set.

Note that the colorbars in Figure 9 do not reflect the thresholds. The thresholds are applied to the output images of the latent space model, which are then used to train the segmentation network.

Figure 10 shows the corresponding results for a test section at inline 350 of the F3 block. Figure 10a identifies parallel reflectors with dipping amplitude reflections. In Figure 10b, an approximate blob is shown on the chaotic sedimentary strip of the section. The chaotic region on the right is not as well delineated as the region on the left. A possible explanation for this observation is that most of the images clustered into the chaotic class were taken from regions on the left of the F3 block. If the chaotic facies on the right differs from the one on the left, then this partial delineation is a derived anomaly that may be corrected by including images from the right chaotic region in our training set. However, since our model training is self-supervised, we report our result as is. Figure 10c identifies the region containing fault structures on the left of the salt dome in the highlighted section. The highlighted region on the right of the salt dome is a false positive, but the region identified on the left is delineated correctly. Note that the model delineates regions with faults without tracking the fault lines. Delineating fault regions helps interpreters locate them, after which manual tracking of the faults may be done. Although salt domes are challenging to delineate, we show that our method identifies the salt dome boundary with significant accuracy in Figure 10d.


Figure 9: The effect of four cut-off thresholds on predicted sections. a) With the threshold set to 0.25, the prediction is quite noisy. b) When the threshold is set to 0.35, the prediction is less noisy. c) Further progressive improvement in prediction at a threshold of 0.45. d) Clear structural prediction when the threshold is set to 0.55.

Other non-salt regions to the left and right of the salt dome are lightly highlighted. They are also false positives, but we can dismiss them due to the low probability values (∼0.25) in those regions.

We extended our section-based structural delineation in Figure 10 to 3D volume factorization of the F3 block. Figure 11 shows four facies in four separate classification instances in the F3 block. The delineated facies are marked with white arrows. Evidence of volume factorization can be seen in the relative absence of structures other than the classified one. For instance, in Figure 11a, only parallel reflectors from the horizon class are shown, while the salt dome, fault region, and chaotic facies are factorized out. In Figure 11b, the chaotic facies is delineated and the parallel reflectors shown in Figure 11a are absent. In the fault regions shown in Figure 11c, the delineated region on the right is a false positive corresponding to the falsely identified region on the right of the salt dome in Figure 10; however, the left white arrows identify the fault region. Note that in Figure 11d, the salt dome is delineated, but the horizons and chaotic regions are factorized out. The 3D delineation applies mostly to the boundary of the salt dome structure and corresponds to the regions delineated in Figure 10d.

Attribute Extraction

We further demonstrate that our deep learning model can factorize the F3 block into foreground and background attributes, each acting as a feature to guide the seismic interpreter in making informed decisions. The DeepLab segmentation model is modified to classify each amplitude in a section into two classes, foreground and background; hence, every amplitude is mapped to a binary class. For each section, we compare the two predicted classes to six attributes from the literature for qualitative assessment. Gradient of texture (GoT) is a recent seismic attribute based on the gradients of seismic amplitudes in a moving window. In the 3D version, the gradients in the x, y, and z directions are measured and combined into one value per pixel; the 2D version calculates gradients only in the x, y coordinates. The GoT attribute has been applied to delineate the boundaries of salt domes in 2D sections and 3D seismic volumes (Shafiq et al., 2015). We compare our proposed attributes to both 2D and 3D GoT.

Phase congruency (Shafiq et al., 2017) is another recent seismic attribute that improves on GoT. Phase congruency is a dimensionless quantity, and unlike GoT it is unaffected by changes in image illumination and contrast.

In addition, we compare our method against a saliency-based seismic attribute (Shafiq et al., 2018a), which views the features in the seismic volume in a manner analogous to human visual perception. The component features of saliency attributes consist of fast Fourier transform (FFT) coefficients. Similar to 3D GoT, saliency-based attributes can be computed in 2D and 3D variants. Lastly, we compare against gray-level co-occurrence matrix (GLCM) texture attributes (Chopra and Alexeev, 2006) and the instantaneous amplitude attribute (White, 1991).

Figures 12g and 12h are our proposed attributes, while Figures 12a-12f are the attributes we compare against. Figure 12g is the proposed background attribute and Figure 12h is the proposed foreground attribute. GoT and phase congruency, in Figures 12a and 12c, show better delineation of parallel reflectors than our foreground attribute in Figure 12h. However, our method shows a less noisy delineation of the parallel reflectors and highlights the most important ones. Saliency, instantaneous amplitude, and 2D GoT in Figures 12b, 12e, and 12f delineate mostly the salt boundary and the fault regions, which implies our framework performs better in delineating parallel reflectors. GLCM in Figure 12d performed worst in delineating structural attributes among all eight attributes compared.

Figure 10: Factorized 2D sections of the F3 block showing a) horizons, b) chaotic, c) faults, and d) salt dome structures in inline 350 of the F3 block. The white arrows illustrate regions delineated correctly by our framework.

Figure 11: 3D delineation of structures in the F3 block. The z-axis is the time axis, the x-axis is the inline direction, and the y-axis points in the crossline direction. a) shows parallel reflector lines delineated, b) shows a vertical section across the chaotic strip delineated, c) shows regions with fault structures delineated, and d) shows the salt dome region delineated by our framework.

Figure 12: Eight sub-figures of various attributes extracted from the same F3 block volume, all shown for inline section 200. a) is the 3D GoT attribute. b) is the human-visual-system-based saliency map for the same section. c) shows phase congruency; note that phase congruency is an improvement on 3D GoT. d) is texture-based GLCM, which performs poorly for structural delineation in this application. e) is the instantaneous amplitude attribute, which identifies the boundary of the salt dome and highlights a few horizon lines. For completeness, we include 2D GoT in f); 2D GoT does not reveal many features aside from the edges of the salt dome. g) and h) are our proposed background- and foreground-based attributes. Notice that in h), most features we seek are represented: the salt dome boundary, the chaotic strip, and the horizon lines; the fault region is lightly delineated. g) is complementary to h): it shows the background regions that do not belong to features identified as structures.

Comparison with a Machine Learning Framework by Alaudah et al. (2019a)

Our deep learning framework can also be applied to solve a label-mapping problem similar to the one solved by Alaudah et al. (2019a). The authors applied a non-negative matrix factorization (NNMF) algorithm to predict pixel labels from image labels. The NNMF factorizes an input matrix X into features and label assignments:

X = WH,

where X is a matrix with images as column entries, W is a feature matrix, and H gives the assignment of the features in W to their respective classes. The features learned during factorization are mapped to the corresponding images to delineate geological structures. In Figure 13, we label pixels by mapping image labels learned from our clustering framework to pixel predictions made by our deep learning model, and we compare the result with Alaudah et al. (2019a)’s framework.

The authors used four classes: chaotic, salt dome, faults, and others, where the last includes images that do not belong to the previous three classes. There are two major differences between our framework and theirs. First, in the latter, an interpreter assigns image labels to a few examples and uses an image retrieval technique to obtain labels for the remaining images, which are then used in the weakly supervised framework. Secondly, Alaudah et al. (2019a)’s method is not based on a deep learning framework. In contrast, we do not use interpreter-assigned image labels; instead, we train a deep learning framework to learn both pixel and image labels.



Figure 13: We make a class-by-class comparison of our unsupervised model to Alaudah et al. (2019a)’s weakly supervised model.

Figure 13 shows four classes of images. Note that in labeling pixels in all the classes, the accuracy of our framework is highest along edges, while Alaudah et al. (2019a)’s labeling is more region-oriented. For instance, our framework assigns the highest confidence values of delineated geologic features to the peak-to-trough edges in all four classes. This is because we used an L1-norm sparsity loss in generating the probability maps for predicting structures. For the delineated chaotic features, the performance of the two frameworks is close. Because Alaudah et al. (2019a)’s method was not trained on the horizons class specifically, the delineation of structures in this class of images was left out by their algorithm; in the same class, however, we are able to delineate lines of parallel reflectors.

Alaudah et al. (2019a)’s labeling captured the fault region more elegantly than ours, but our method highlights the salt regions more elaborately than Alaudah et al. (2019a)’s method, which mostly labeled the edges or boundary of the salt structure. Salt imaging is challenging due to the rugose structure of salt domes, which leads to complex amplitude patterns in salt facies. These complex patterns can conflict with other structural patterns in the feature factorization of Alaudah et al. (2019a)’s framework, leading to relatively poorer performance. Our method attempts to separate interesting features from the surrounding background; hence, it captures complex patterns better than the previous framework.

CONCLUSION

We proposed two frameworks based on hierarchical clustering and a deep adversarial model. We showed that our hierarchical clustering model can be used for unsupervised clustering of seismic images. We also showed that our self-supervised deep learning framework can be used for segmentation of geological structures using orthogonal latent space projection. One limitation of our self-supervised model is that the delineated structures are partial, which leaves room for improvement. Furthermore, the hierarchical clustering part and the self-supervised part are separate modules. In future work, we could create a comprehensive framework that does not need a pre-clustering module. The adversarial training method could also be improved for more stable training. Lastly, the latent space factorization methodology could be extended to delineate multiple geological components in each image, compared to the current foreground- and background-based methodology.

APPENDIX - A PROOF OF EQUATION 7

Here, we provide a detailed proof of equation 7. Let X be our input data of images, $X = [x_1, x_2, ..., x_N]$. We can assume independence of each $x_i$. Let the distribution of X be $p_{data}$. Assume there exists a decoder/generator from which we can generate $x \in X$, and let the distribution of this generator over some learned parameter $\theta$ be $p_\theta(x)$. The log-likelihood of generating X is $\mathbb{E}_{x\sim p_{data}}[\log(p_\theta(x))]$. We introduce an auxiliary distribution over another learning parameter $\phi$, $q_\phi(z|x)$, to approximate the intractable posterior, because to reconstruct each $x$ from the likelihood function we must know the underlying distribution. We can rewrite the log-likelihood as:

$$\mathbb{E}_{x\sim p_{data}}\left[\log(p_\theta(x))\right] = \mathbb{E}_{x\sim p_{data}}\left[\log\left(\sum_z p_\theta(x, z)\right)\right] \quad (14)$$

$$= \mathbb{E}_{x\sim p_{data}}\left[\log\left(\sum_z p_\theta(x|z)\, p_\theta(z)\right)\right], \quad (15)$$

but the posterior $p_\theta(z|x)$ is intractable; hence we use the auxiliary distribution $q_\phi(z|x)$ to estimate it. Note that we introduce a prior $z$ into equation 15. In an encoder-decoder framework, the prior $z$ is the latent space variable passed into the decoder/generator. We know the distribution of $z$ is Gaussian with zero mean and unit variance due to batch normalization at the end of the encoder. Hence, we rewrite equation 15 as:

$$\mathbb{E}_{x\sim p_{data}}\left[\log\sum_z p_\theta(x, z)\right]; \quad (16)$$

we can condition the likelihood on $z_1$, $z_2$ without loss of generality:

$$= \mathbb{E}_{x\sim p_{data}}\left[\log\left(\sum_{z_1}\sum_{z_2} p_\theta(x|z_1, z_2)\, p_\theta(z_1, z_2)\right)\right]. \quad (17)$$

Next, we introduce $q_\phi(z_1|x)$ and $q_\phi(z_2|x)$ into (17) as follows:

$$= \mathbb{E}_{x\sim p_{data}}\left[\log\left(\sum_{z_1}\sum_{z_2} q_\phi(z_1|x)\, q_\phi(z_2|x)\, \frac{p_\theta(z_1)\, p_\theta(z_2)}{q_\phi(z_1|x)\, q_\phi(z_2|x)}\, p_\theta(x|z_1, z_2)\right)\right]. \quad (18)$$

Simplifying,

$$(18) = \mathbb{E}_{x\sim p_{data}}\left[\log \mathbb{E}_{q_\phi(z_1,z_2|x)}\left(\frac{p_\theta(z_1)}{q_\phi(z_1|x)} \cdot \frac{p_\theta(z_2)}{q_\phi(z_2|x)} \cdot p_\theta(x|z_1, z_2)\right)\right]. \quad (19)$$

Applying Jensen’s inequality,

$$(19) \ge \mathbb{E}_{x\sim p_{data}}\left[\mathbb{E}_{z_1\sim q_\phi(z_1|x)}\log\left(\frac{p_\theta(z_1)}{q_\phi(z_1|x)}\right) + \mathbb{E}_{z_2\sim q_\phi(z_2|x)}\log\left(\frac{p_\theta(z_2)}{q_\phi(z_2|x)}\right) + \mathbb{E}_{q_\phi(z_1,z_2|x)}\log\left(p_\theta(x|z_1, z_2)\right)\right]. \quad (20)$$

Now we can write each term in its KL equivalent:

$$(20) = \mathbb{E}_{x\sim p_{data}}\left[-KL(q_\phi(z_1|x)\,\|\,p_\theta(z_1)) - KL(q_\phi(z_2|x)\,\|\,p_\theta(z_2)) + \mathbb{E}_{q_\phi(z_1,z_2|x)}\log\left(p_\theta(x|z_1, z_2)\right)\right]. \quad (21)$$

We can rearrange equation 21 in entropy terms:

$$= \mathbb{E}_{x\sim p_{data}}\left[\mathbb{E}_{q_\phi(z_1,z_2|x)}\log\left(p_\theta(x|z_1, z_2)\right) + H(q_\phi(z_1|x)) + H(q_\phi(z_2|x)) + \mathbb{E}_{q_\phi(z_1|x)}\log\left(p_\theta(z_1)\right) + \mathbb{E}_{q_\phi(z_2|x)}\log\left(p_\theta(z_2)\right)\right]. \quad (22)$$

Further rearranging, we arrive at the same form as equation 7:

$$= \mathbb{E}_{x\sim p_{data}}\left[\mathbb{E}_{q_\phi(z_1,z_2|x)}\log\left(p_\theta(x|z_1, z_2)\right) + H(q_\phi(z_1|x)) + H(q_\phi(z_2|x)) - H(p_\theta(z_1)) - H(p_\theta(z_2))\right]. \quad (23)$$

This proves equation (7).

REFERENCES

Alaudah, Y., M. Alfarraj, and G. AlRegib, 2019a, Structure label prediction using similarity-based retrieval and weakly supervised label mapping: GEOPHYSICS, 84, no. 1, V67–V79.

Alaudah, Y., and G. AlRegib, 2016, Weakly-supervised labeling of seismic volumes using reference exemplars: 2016 IEEE International Conference on Image Processing (ICIP), IEEE, 4373–4377.

Alaudah, Y., S. Gao, and G. AlRegib, 2018, Learning to label seismic structures with deconvolution networks and weak labels, in SEG Technical Program Expanded Abstracts 2018: Society of Exploration Geophysicists, 2121–2125.

Alaudah, Y., P. Michałowicz, M. Alfarraj, and G. AlRegib, 2019b, A machine-learning benchmark for facies classification: Interpretation, 7, no. 3, SE175–SE187.

Alaudah, Y., M. Soliman, and G. AlRegib, 2019c, Facies classification with weak and strong supervision: A comparative study, in SEG Technical Program Expanded Abstracts 2019: Society of Exploration Geophysicists, 1868–1872.

Alfarraj, M., Y. Alaudah, and G. AlRegib, 2016, Content-adaptive non-parametric texture similarity measure: 2016 IEEE 18th International Workshop on Multimedia Signal Processing (MMSP), IEEE, 1–6.

Alfarraj, M., and G. AlRegib, 2019, Semi-supervised sequence modeling for elastic impedance inversion: Interpretation, 7, no. 3, 1–65.

Araya-Polo, M., T. Dahlke, C. Frogner, C. Zhang, T. Poggio, and D. Hohl, 2017, Automated fault detection without seismic processing: The Leading Edge, 36, no. 3, 208–214.

Babakhin, Y., A. Sanakoyeu, and H. Kitamura, 2019, Semi-supervised segmentation of salt bodies in seismic images using an ensemble of convolutional neural networks: German Conference on Pattern Recognition, Springer, 218–231.

Barnes, A. E., 1992, The calculation of instantaneous frequency and instantaneous bandwidth: Geophysics, 57, no. 11, 1520–1524.

Barnes, A. E., and K. J. Laughlin, 2002, Investigation of methods for unsupervised classification of seismic data, in SEG Technical Program Expanded Abstracts 2002: Society of Exploration Geophysicists, 2221–2224.

Bedle, H., 2019, Seismic attribute enhancement of weak and discontinuous gas hydrate bottom-simulating reflectors in the Pegasus Basin, New Zealand: Interpretation, 7, no. 3, SG11–SG22.

Chen, L.-C., G. Papandreou, I. Kokkinos, K. Murphy, and A. L. Yuille, 2017, DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected conditional random fields: IEEE Transactions on Pattern Analysis and Machine Intelligence, 40, no. 4, 834–848.

Chen, Q., and S. Sidney, 1997, Seismic attribute technology for reservoir forecasting and monitoring: The Leading Edge, 16, no. 5, 445–448.

Chopra, S., and V. Alexeev, 2006, Applications of texture attribute analysis to 3D seismic data: The Leading Edge, 25, no. 8, 934–940.

Chopra, S., and K. J. Marfurt, 2005, Seismic attributes—a historical perspective: Geophysics, 70, no. 5, 3–28.

Das, V., A. Pollack, U. Wollner, and T. Mukerji, 2018, Convolutional neural network for seismic impedance inversion, in SEG Technical Program Expanded Abstracts 2018: Society of Exploration Geophysicists, 2071–2075.

——–, 2019, Effect of rock physics modeling in impedance inversion from seismic data using convolutional neural network: The 13th SEGJ International Symposium, Tokyo, Japan, 12–14 November 2018, Society of Exploration Geophysicists, 522–525.

Di, H., and G. AlRegib, 2019, Semi-automatic fault/fracture interpretation based on seismic geometry analysis: Geophysical Prospecting, 67, no. 5, 1379–1391.

Di, H., M. Shafiq, and G. AlRegib, 2018a, Multi-attribute k-means clustering for salt-boundary delineation from three-dimensional seismic data: Geophysical Journal International, 215, no. 3, 1999–2007.

——–, 2018b, Patch-level MLP classification for improved fault detection, in SEG Technical Program Expanded Abstracts 2018: Society of Exploration Geophysicists, 2211–2215.

Di, H., M. A. Shafiq, Z. Wang, and G. AlRegib, 2019, Improving seismic fault detection by super-attribute-based classification: Interpretation, 7, no. 3, SE251–SE267.

Di, H., Z. Wang, and G. AlRegib, 2018c, Deep convolutional neural networks for seismic salt-body delineation: Presented at the AAPG Annual Convention and Exhibition.

Dramsch, J. S., and M. Luthje, 2018, Deep-learning seismic facies on state-of-the-art CNN architectures, in SEG Technical Program Expanded Abstracts 2018: Society of Exploration Geophysicists, 2036–2040.

Dubrovina, A., F. Xia, P. Achlioptas, M. Shalah, R. Groscot, and L. J. Guibas, 2019, Composite shape modeling via latent space factorization: Proceedings of the IEEE International Conference on Computer Vision, 8140–8149.

Ester, M., H.-P. Kriegel, J. Sander, X. Xu, et al., 1996, A density-based algorithm for discovering clusters in large spatial databases with noise: KDD, 226–231.

Gao, Y., A. Rehman, and Z. Wang, 2011, CW-SSIM based image classification: 2011 18th IEEE International Conference on Image Processing, IEEE, 1249–1252.

Goodfellow, I., J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio, 2014, Generative adversarial nets: Advances in Neural Information Processing Systems, 2672–2680.

Kingma, D. P., and M. Welling, 2013, Auto-encoding variational Bayes: arXiv preprint arXiv:1312.6114.

Li, K., Z. Liu, B. She, G. Hu, and C. Song, 2019, Orthogonal deep autoencoders for unsupervised seismic facies analysis, in SEG Technical Program Expanded Abstracts 2019: Society of Exploration Geophysicists, 2023–2028.

Liu, M., W. Li, M. Jervis, and P. Nivlet, 2019, 3D seismic facies classification using convolutional neural network and semi-supervised generative adversarial network, in SEG Technical Program Expanded Abstracts 2019: Society of Exploration Geophysicists, 4995–4999.

Qian, F., M. Yin, M.-J. Su, Y. Wang, and G. Hu, 2017, Seismic facies recognition based on prestack data using deep convolutional autoencoder: arXiv preprint arXiv:1704.02446.

Shafiq, M. A., Y. Alaudah, H. Di, and G. AlRegib, 2017, Salt dome detection within migrated seismic volumes using phase congruency, in SEG Technical Program Expanded Abstracts 2017: Society of Exploration Geophysicists, 2360–2365.

Shafiq, M. A., T. Alshawi, Z. Long, and G. AlRegib, 2018a, The role of visual saliency in the automation of seismic interpretation: Geophysical Prospecting, 66, no. S1, 132–143.

Shafiq, M. A., H. Di, and G. AlRegib, 2018b, A novel approach for automated detection of listric faults within migrated seismic volumes: Journal of Applied Geophysics, 155, 94–101.

Shafiq, M. A., Z. Wang, A. Amin, T. Hegazy, M. Deriche, and G. AlRegib, 2015, Detection of salt-dome boundary surfaces in migrated seismic volumes using gradient of textures, in SEG Technical Program Expanded Abstracts 2015: Society of Exploration Geophysicists, 1811–1815.

Starck, J.-L., E. J. Candes, and D. L. Donoho, 2002, The curvelet transform for image denoising: IEEE Transactions on Image Processing, 11, no. 6, 670–684.

Taner, M. T., J. S. Schuelke, R. O’Doherty, and E. Baysal, 1994, Seismic attributes revisited, in SEG Technical Program Expanded Abstracts 1994: Society of Exploration Geophysicists, 1104–1106.

Wang, Z., A. C. Bovik, H. R. Sheikh, and E. P. Simoncelli, 2004, Image quality assessment: from error visibility to structural similarity: IEEE Transactions on Image Processing, 13, no. 4, 600–612.

White, R. E., 1991, Properties of instantaneous seismic attributes: The Leading Edge, 10, no. 7, 26–32.

Wu, X., L. Liang, Y. Shi, and S. Fomel, 2019, FaultSeg3D: Using synthetic data sets to train an end-to-end convolutional neural network for 3D seismic fault segmentation: Geophysics, 84, no. 3, IM35–IM45.

Wu, X., Y. Shi, S. Fomel, and L. Liang, 2018, Convolutional neural networks for fault interpretation in seismic images, in SEG Technical Program Expanded Abstracts 2018: Society of Exploration Geophysicists, 1946–1950.

Xiong, W., X. Ji, Y. Ma, Y. Wang, N. M. AlBinHassan, M. N. Ali, and Y. Luo, 2018, Seismic fault detection with convolutional neural network: Geophysics, 83, no. 5, O97–O103.

Zhao, S., J. Song, and S. Ermon, 2017, Towards deeper understanding of variational autoencoding models: arXiv preprint arXiv:1702.08658.

Zujovic, J., T. N. Pappas, and D. L. Neuhoff, 2013, Structural texture similarity metrics for image analysis and retrieval: IEEE Transactions on Image Processing, 22, no. 7, 2545–2558.