Structural Knowledge Discovery Used to Analyze Earthquake Activity
Jesus A. Gonzalez, Lawrence B. Holder, Diane J. Cook
MOTIVATION AND GOAL
Need to analyze large amounts of information in real-world databases.
Find information that standard tools cannot detect.
Earthquake Database.
Previous knowledge: Spatio-Temporal relations.
SUBDUE KNOWLEDGE DISCOVERY SYSTEM
SUBDUE discovers patterns (substructures) in structural data sets.
SUBDUE represents data as a labeled graph.
Inputs: Vertices and Edges.
Outputs: Discovered patterns and instances.
EXAMPLE
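[Figure: example graph with vertices labeled "object", "triangle", and "square", and edges labeled "shape" and "on".]
Vertices: objects or attributes.
Edges: relationships.
4 instances of the substructure shown in the figure.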
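
To make the vertex and edge representation concrete, here is a minimal sketch in Python. It assumes the example graph consists of "triangle on square" pairs, which is an interpretation of the figure labels. The names `Edge`, `build_example_graph`, and `count_triangle_on_square` are hypothetical, and the counting function is a hand-written matcher for this one pattern, not SUBDUE's general substructure discovery.

```python
from collections import namedtuple

# A directed, labeled edge between two vertex ids (illustrative type).
Edge = namedtuple("Edge", ["src", "dst", "label"])


def build_example_graph(n_pairs=4):
    """Build a labeled graph containing n_pairs 'triangle on square' pairs."""
    vertices = {}  # vertex id -> label
    edges = []
    next_id = 1
    for _ in range(n_pairs):
        tri_obj, tri, sq_obj, sq = range(next_id, next_id + 4)
        next_id += 4
        vertices.update({tri_obj: "object", tri: "triangle",
                         sq_obj: "object", sq: "square"})
        edges += [Edge(tri_obj, tri, "shape"),
                  Edge(sq_obj, sq, "shape"),
                  Edge(tri_obj, sq_obj, "on")]
    return vertices, edges


def count_triangle_on_square(vertices, edges):
    """Count 'on' edges that go from a triangle object to a square object."""
    shape_of = {e.src: vertices[e.dst] for e in edges if e.label == "shape"}
    return sum(1 for e in edges
               if e.label == "on"
               and shape_of.get(e.src) == "triangle"
               and shape_of.get(e.dst) == "square")


vertices, edges = build_example_graph()
n_instances = count_triangle_on_square(vertices, edges)
print(n_instances)
```

Storing attributes such as "triangle" as vertices of their own, linked by a "shape" edge, follows the slide's statement that vertices can be objects or attributes.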
EVALUATION CRITERION
Minimum Encoding.
Graph Compression.
Substructure Size (Tried but did not work).
EVALUATION CRITERION: MINIMUM DESCRIPTION LENGTH
Minimum Description Length (MDL) principle: the best theory to describe a set of data is the one that minimizes the description length (DL) of the entire data set.
DL of the graph: the number of bits necessary to completely describe the graph.
Search for the substructure that results in the maximum compression.
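
The compression idea can be illustrated with a rough sketch. It assumes compression is measured as the ratio DL(G) / (DL(S) + DL(G|S)), where G is the input graph, S the candidate substructure, and G|S the graph after each instance of S is replaced by a single vertex. The bit count in `description_length` is a deliberately crude stand-in chosen for illustration, not SUBDUE's actual encoding, and all function names and example numbers are hypothetical.

```python
import math


def description_length(n_vertices, n_edges, n_vertex_labels, n_edge_labels):
    """Crude bit count for a labeled graph (an illustrative assumption).

    Each vertex costs enough bits to name its label; each edge costs
    enough bits to name its two endpoints and its label.
    """
    if n_vertices == 0:
        return 0.0
    vertex_bits = n_vertices * math.log2(max(n_vertex_labels, 2))
    edge_bits = n_edges * (2 * math.log2(max(n_vertices, 2))
                           + math.log2(max(n_edge_labels, 2)))
    return vertex_bits + edge_bits


def compression(graph, sub, n_instances):
    """Return DL(G) / (DL(S) + DL(G|S)); values above 1 mean S compresses G.

    graph and sub are tuples:
    (n_vertices, n_edges, n_vertex_labels, n_edge_labels).
    """
    gv, ge, gvl, gel = graph
    sv, se, svl, sel = sub
    # Replacing each instance by one new vertex removes the instance's
    # internal edges and all but one of its vertices.
    cv = gv - n_instances * (sv - 1)
    ce = ge - n_instances * se
    dl_g = description_length(gv, ge, gvl, gel)
    dl_s = description_length(sv, se, svl, sel)
    # The compressed graph needs one extra vertex label for the substructure.
    dl_g_given_s = description_length(cv, ce, gvl + 1, gel)
    return dl_g / (dl_s + dl_g_given_s)


# Hypothetical sizes: a 16-vertex graph and a 4-vertex substructure.
graph = (16, 15, 3, 2)
sub = (4, 3, 3, 2)
ratio_four = compression(graph, sub, 4)
ratio_one = compression(graph, sub, 1)
print(ratio_four > ratio_one)
```

Under this measure a substructure with more instances yields a larger ratio, which is why the search favors frequently repeated patterns.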
THE EARTHQUAKE DATABASE
Several catalogs.
Sources like the National Geophysical Data Center.
Each record with 35 fields describing the earthquake characteristics.
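THE EARTHQUAKE DATABASE: KNOWLEDGE REPRESENTATION
[Figure: graph representation of the database. Each earthquake (EVENT 1, EVENT 2, EVENT 3, ..., EVENT m) is a vertex with attribute edges such as Category (PDE_W), Year (1998), Month (01), and Magnitude (4.5). Events are connected to each other by Near_in_distance and Near_in_time edges.]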
THE EARTHQUAKE DATABASE: PRIOR KNOWLEDGE
Connections between events whose epicenters were close to each other in distance (<= 75 kilometers).
Connections between events that happened close to each other in time (<= 36 hours).
Spatio-Temporal relations represented with “near_in_distance” and “near_in_time” edges.
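
The two thresholds above can be turned into edges with a short sketch. It assumes distance between epicenters is measured as great-circle (haversine) distance, which the slides do not specify. The event fields (`lat`, `lon`, `time`) and the sample events are hypothetical and only serve to exercise the code.

```python
import math
from datetime import datetime, timedelta
from itertools import combinations

MAX_KM = 75.0                   # "near_in_distance" threshold from the slide
MAX_GAP = timedelta(hours=36)   # "near_in_time" threshold from the slide


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers (assumed distance measure)."""
    radius_km = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2)
    return 2 * radius_km * math.asin(math.sqrt(a))


def relation_edges(events):
    """Return (i, j, label) edges for every pair of events within a threshold."""
    edges = []
    for (i, a), (j, b) in combinations(enumerate(events), 2):
        if haversine_km(a["lat"], a["lon"], b["lat"], b["lon"]) <= MAX_KM:
            edges.append((i, j, "near_in_distance"))
        if abs(a["time"] - b["time"]) <= MAX_GAP:
            edges.append((i, j, "near_in_time"))
    return edges


# Hypothetical events: the first two are close in space and time,
# the third is far from both.
events = [
    {"lat": 17.5, "lon": -95.0, "time": datetime(1998, 1, 1, 0, 0)},
    {"lat": 17.6, "lon": -95.1, "time": datetime(1998, 1, 2, 6, 0)},
    {"lat": 18.9, "lon": -97.9, "time": datetime(1998, 1, 10, 0, 0)},
]
edges = relation_edges(events)
print(edges)
```

Comparing every pair is quadratic in the number of events, which is acceptable for the small per-area samples described later but would need spatial indexing for a full catalog.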
DETERMINING EARTHQUAKE ACTIVITY
Geologist Dr. Burke Burkart.
Study of seismic activity caused by the Orizaba Fault.
Fault: a fracture in a surface where a displacement of rocks also happened.
Selection of the area of study, two rectangles:
First: Longitude 94.0W through 101.0W and Latitude 17.0N through 18.0N.
Second: Longitude 94.0W through 98.0W and Latitude 18.0N through 19.0N.
DETERMINING EARTHQUAKE ACTIVITY
Area of Study
DETERMINING EARTHQUAKE ACTIVITY
Divide the area into 44 rectangles, each one half of a degree in both longitude and latitude.
Sample the earthquake activity in each sub-area.
Run Subdue in each sub-area.
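
The subdivision can be checked with a small sketch that splits the two study rectangles into half-degree cells. Longitudes are written as positive degrees West, and the function name `half_degree_cells` is hypothetical. The cell ordering (west to east, and south to north within each longitude band) is chosen to match the area numbers listed in the table of sub-areas.

```python
def half_degree_cells(west, east, south, north, step=0.5):
    """Split a rectangle into step-sized cells.

    Longitudes are degrees West (so west > east); latitudes are degrees North.
    Each cell is (lon_west, lon_east, lat_south, lat_north).
    """
    n_lon = round((west - east) / step)
    n_lat = round((north - south) / step)
    cells = []
    for i in range(n_lon):          # west to east
        for j in range(n_lat):      # south to north
            cells.append((west - i * step, west - (i + 1) * step,
                          south + j * step, south + (j + 1) * step))
    return cells


# The two rectangles of the study area.
cells = (half_degree_cells(101.0, 94.0, 17.0, 18.0)
         + half_degree_cells(98.0, 94.0, 18.0, 19.0))

print(len(cells))   # 28 cells from the first rectangle plus 16 from the second
print(cells[25])    # sub-area 26 (list index 25)
```

The first rectangle spans 7 degrees by 1 degree (14 x 2 = 28 cells) and the second 4 degrees by 1 degree (8 x 2 = 16 cells), which gives the 44 sub-areas.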
DETERMINING EARTHQUAKE ACTIVITY
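| Area Number | Longitude | Latitude | Area Name | Number of Events |
|---|---|---|---|---|
| 1 | 101.0W to 100.5W | 17.0N to 17.5N | Gue1 | 62 |
| 2 | 101.0W to 100.5W | 17.5N to 18.0N | Gue2 | 40 |
| 3 | 100.5W to 100.0W | 17.0N to 17.5N | Gue3 | 57 |
| 4 | 100.5W to 100.0W | 17.5N to 18.0N | Gue4 | 13 |
| 5 | 100.0W to 99.5W | 17.0N to 17.5N | Gue5 | 71 |
| 6 | 100.0W to 99.5W | 17.5N to 18.0N | Gue6 | 15 |
| 7 | 99.5W to 99.0W | 17.0N to 17.5N | Gue7 | 35 |
| 8 | 99.5W to 99.0W | 17.5N to 18.0N | Gue8 | 16 |
| 9 | 99.0W to 98.5W | 17.0N to 17.5N | Gue9 | 13 |
| 10 | 99.0W to 98.5W | 17.5N to 18.0N | Gue10 | 14 |
| ... | | | | |
| 26 | 95.0W to 94.5W | 17.5N to 18.0N | Ver1 | 43 |
| 27 | 94.5W to 94.0W | 17.0N to 17.5N | Oaxver4 | 35 |
| 28 | 94.5W to 94.0W | 17.5N to 18.0N | Ver2 | 23 |
| 29 | 98.0W to 97.5W | 18.0N to 18.5N | Pue1 | 6 |
| 30 | 98.0W to 97.5W | 18.5N to 19.0N | Pue2 | 0 |
| ... | | | | |
| 42 | 95.0W to 94.5W | 18.5N to 19.0N | Vergolf5 | 1 |
| 43 | 94.5W to 94.0W | 18.0N to 18.5N | Vergolf4 | 3 |
| 44 | 94.5W to 94.0W | 18.5N to 19.0N | Vergolf6 | 1 |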
DETERMINING EARTHQUAKE ACTIVITY
[Figure: Substructure 1, 19 instances: two Event vertices joined by a Near_in_distance edge, each with Category PDE and Region_number 61.00. Substructure 2, 8 instances: Sub_1 extended with further attributes; labels visible in the figure include Depth, Dept_ctl, Coord_qual, and the values 33.00, N, and %.]
Substructure 1 (with 19 instances) and substructure 2 (with 8 instances) found in sub-area 26.
DETERMINING EARTHQUAKE ACTIVITY
This pattern might give us information about the cause of the earthquakes.
Subduction also affects this area, but at a specific depth that depends on the distance from the Pacific Ocean.
SUBDUE’S POTENTIAL
Subdue finds not only shared characteristics of events, but also space relations between them.
Dr. Burke Burkart is studying the patterns to give direction to this research.
Expect to find patterns representing parts of the paths of the involved fault.
Time relations not considered by Subdue.
Earthquake's characteristics.
Important for other areas.
CONCLUSION
Subdue successful in real-world databases.
Subdue used prior knowledge to guide search with temporal and spatial relations.
Subdue discovered interesting patterns using these temporal and spatial relations.
Subdue is being used as the data mining tool to study the “Orizaba Fault” in Mexico.
FUTURE WORK
Concept Learning Subdue
Theoretical analysis.
Bounds on complexity (e.g. PAC learning).
Graphical user interface to visualize substructures and their instances.