indexing semistructured data

Indexing Semistructured Data J. McHugh, J. Widom, S. Abiteboul, Q. Luo, and A. Rajaraman Stanford University January 1998 http://www-db.stanford.edu/lore/ EECS 684 02/21/2000 Presented by Weiming Zhou



TRANSCRIPT

Page 1: Indexing Semistructured Data

Indexing Semistructured Data

J. McHugh, J. Widom, S. Abiteboul, Q. Luo, and A. Rajaraman

Stanford University January 1998

http://www-db.stanford.edu/lore/

EECS 684 02/21/2000 Presented by Weiming Zhou

Page 2: Indexing Semistructured Data

Outline

• Introduction
  - Data Model
  - Query Language
• Indexes in Lore
• Query plans using indexes
• Conclusions

Page 3: Indexing Semistructured Data

Data Model - Object Exchange Model (OEM)

Page 4: Indexing Semistructured Data

The Lorel Query Language (Lorel)

Example 1
select DB.Movie.Title
where DB.Movie.Actor.Name = “Harrison Ford”

Example 2
select T
from DB.Movie M, M.Title T
where exists A in M.Actor : exists N in A.Name : N = “Harrison Ford”

Page 5: Indexing Semistructured Data

Indexes In Lore

• Value index
• Text index
• Link index
• Path index
• Edge index

Page 6: Indexing Semistructured Data

Value index

Similar to attribute indexes in a relational DBMS.

Example

Suppose we create a Value Index for DB.Movie.Year.

A lookup for DB.Movie.Year = “1956” then returns object &12.
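
The lookup above can be sketched in a few lines of Python. This is only an illustration, not Lore's implementation: a plain dictionary stands in for the real index structure, the names `build_value_index` and `value_lookup` are hypothetical, and object &7 with its value is made up so that the index holds more than one entry.

```python
from collections import defaultdict

# (oid, incoming label, value). The pair &12 / "1956" is from the slide;
# &7 and its value "1977" are assumptions added for illustration.
ATOMS = [("&7", "Year", "1977"), ("&12", "Year", "1956")]

def build_value_index(atoms):
    """Map (label, value) to the OIDs of atomic objects holding that value."""
    index = defaultdict(list)
    for oid, label, value in atoms:
        index[(label, value)].append(oid)
    return index

def value_lookup(index, label, value):
    """Equality lookup only; returns an empty list when nothing matches."""
    return index.get((label, value), [])

value_index = build_value_index(ATOMS)
print(value_lookup(value_index, "Year", "1956"))
```

A dictionary only supports equality lookups; comparisons such as Year < “1960” would need an ordered structure instead.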

Page 7: Indexing Semistructured Data

Text Index

• An information-retrieval style keyword search.
• Restricted by incoming labels.
• Locates string values containing specific words.
• Useful for strings containing a significant amount of text.

Implementation: inverted lists that map a given word w and label l to a list of atomic values with incoming edge l that contain word w.

Example: Look up all objects with an atomic string value containing the word “Ford” and an incoming edge Name.
Results: {<&17, 2>, <&21, 2>}.
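
A minimal sketch of such an inverted list follows. It assumes that the second component of each result pair is the 1-based position of the word inside the string, which matches “Ford” being the second word of “Harrison Ford” but is not stated on the slide. The function names and the Title value are hypothetical.

```python
from collections import defaultdict

# (oid, incoming label, string value). The two Name objects come from the
# slides; the Title value is an assumption added for illustration.
ATOMS = [
    ("&17", "Name", "Harrison Ford"),
    ("&21", "Name", "Harrison Ford"),
    ("&5", "Title", "Star Wars"),
]

def build_text_index(atoms):
    """Map (word, label) to a list of (oid, word position) postings."""
    index = defaultdict(list)
    for oid, label, text in atoms:
        for position, word in enumerate(text.split(), start=1):
            index[(word.lower(), label)].append((oid, position))
    return index

def text_lookup(index, word, label):
    """Return postings for a word, restricted to one incoming label."""
    return index.get((word.lower(), label), [])

text_index = build_text_index(ATOMS)
print(text_lookup(text_index, "Ford", "Name"))
```

Keying the postings on the (word, label) pair is what restricts the search by incoming label: the same word under a Title edge lands in a different list.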

Page 8: Indexing Semistructured Data

Link Index

• Locates parents of a given object.
• Serves as back-pointers.

Implementation
• Extendible hashing
• One Link Index for the entire database graph

Example: The Link Index lookup for object &17 returns parent object &6, and the lookup for object &21 returns parent object &13.
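
The back-pointer idea can be sketched as a map from child to parents. Python's built-in dictionary is used here as a stand-in for extendible hashing, and the helper names are hypothetical. The Name edges follow the example above; the Actor edges are inferred from the query plan slides and should be read as assumptions.

```python
from collections import defaultdict

# (parent, label, child) edges of a small hypothetical fragment of the graph.
EDGES = [
    ("&2", "Actor", "&6"),
    ("&6", "Name", "&17"),
    ("&4", "Actor", "&13"),
    ("&13", "Name", "&21"),
]

def build_link_index(edges):
    """One index for the whole graph: child OID -> list of (parent, label)."""
    index = defaultdict(list)
    for parent, label, child in edges:
        index[child].append((parent, label))
    return index

def parents(index, oid):
    """Return the parent OIDs of an object, ignoring edge labels."""
    return [parent for parent, _label in index.get(oid, [])]

link_index = build_link_index(EDGES)
print(parents(link_index, "&17"), parents(link_index, "&21"))
```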

Page 9: Indexing Semistructured Data

Path Index

Locates all objects reachable via a given labeled path.

Provided by the DataGuide.

Example
select DB.Movie.Title
The Path Index directly locates all objects reachable via DB.Movie.Title.

Results: &5, &9, &14.
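
As a rough illustration, a path index can be precomputed by walking the graph from the root and recording which objects each label path reaches. This sketch does not model the DataGuide itself. The root OID &1, the `max_depth` cutoff, and the function name are assumptions; the Movie and Title OIDs follow the slides.

```python
from collections import defaultdict

# (parent, label, child) edges; the root object &1 is a hypothetical OID.
EDGES = [
    ("&1", "Movie", "&2"), ("&1", "Movie", "&3"), ("&1", "Movie", "&4"),
    ("&2", "Title", "&5"), ("&3", "Title", "&9"), ("&4", "Title", "&14"),
    ("&2", "Year", "&7"), ("&3", "Year", "&12"),
]

def build_path_index(edges, root_oid, root_name, max_depth=4):
    """Map each dotted label path from the root to the OIDs it reaches."""
    children = defaultdict(list)
    for parent, label, child in edges:
        children[parent].append((label, child))
    index = defaultdict(list)
    frontier = [(root_name, root_oid)]
    for _ in range(max_depth):  # depth cutoff guards against cycles
        next_frontier = []
        for path, oid in frontier:
            for label, child in children.get(oid, []):
                new_path = path + "." + label
                if child not in index[new_path]:
                    index[new_path].append(child)
                next_frontier.append((new_path, child))
        frontier = next_frontier
    return index

path_index = build_path_index(EDGES, "&1", "DB")
print(path_index.get("DB.Movie.Title", []))
```

With the index built, answering select DB.Movie.Title is a single dictionary lookup rather than a traversal.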

Page 10: Indexing Semistructured Data

Edge Index

Locates all parent-child pairs connected via a specified label.

Example

Look up label “Year” in Edge Index

Results: &2-&7, &3-&12
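
A minimal sketch of this lookup groups edges by label. The function name is hypothetical, and the Title edge is an assumption added only to show that other labels are kept apart.

```python
from collections import defaultdict

# (parent, label, child) edges. The two Year edges come from the example;
# the Title edge is an assumption added for contrast.
EDGES = [
    ("&2", "Year", "&7"),
    ("&3", "Year", "&12"),
    ("&2", "Title", "&5"),
]

def build_edge_index(edges):
    """Map each label to the list of (parent, child) pairs it connects."""
    index = defaultdict(list)
    for parent, label, child in edges:
        index[label].append((parent, child))
    return index

edge_index = build_edge_index(EDGES)
print(edge_index.get("Year", []))
```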

Page 11: Indexing Semistructured Data

Query Plans Using Indexes

• Top-Down
• Bottom-Up
• Hybrid

Example
select T
from DB.Movie M, M.Title T
where exists A in M.Actor : exists N in A.Name : N = “Harrison Ford”

Page 12: Indexing Semistructured Data

Top-Down Query Plan

Exhaustive top-down traversals
DB.Movie.Actor.Name = “Harrison Ford” → &17, &21
Link Index: &17 → &2, &21 → &4
DB.Movie.Title → &5, &14
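
One way to read "exhaustive top-down traversal" is a forward walk from the root that tests the predicate at the leaves, using no Value Index or Link Index at all. The sketch below takes that reading as an assumption. The root OID &1, the third movie &3, and the function names are hypothetical; the remaining OIDs follow the slides.

```python
from collections import defaultdict

EDGES = [
    ("&1", "Movie", "&2"), ("&1", "Movie", "&3"), ("&1", "Movie", "&4"),
    ("&2", "Title", "&5"), ("&3", "Title", "&9"), ("&4", "Title", "&14"),
    ("&2", "Actor", "&6"), ("&4", "Actor", "&13"),
    ("&6", "Name", "&17"), ("&13", "Name", "&21"),
]
VALUES = {"&17": "Harrison Ford", "&21": "Harrison Ford"}

def children_by_label(edges):
    """Forward adjacency: (parent, label) -> list of children."""
    out = defaultdict(list)
    for parent, label, child in edges:
        out[(parent, label)].append(child)
    return out

def top_down(edges, values, root="&1"):
    """Visit every Movie, test its Actor.Name values, then collect Titles."""
    step = children_by_label(edges)
    titles = []
    for movie in step.get((root, "Movie"), []):
        has_ford = any(
            values.get(name) == "Harrison Ford"
            for actor in step.get((movie, "Actor"), [])
            for name in step.get((actor, "Name"), [])
        )
        if has_ford:
            titles.extend(step.get((movie, "Title"), []))
    return titles

print(top_down(EDGES, VALUES))
```

Every Movie is visited, including &3, which has no matching actor. That wasted work is what index-based plans try to avoid.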

Page 13: Indexing Semistructured Data

Bottom-Up Query Plan

Look up Value Index
DB.Movie.Actor.Name = “Harrison Ford” → &17, &21
Link Index: &17 → &2, &21 → &4
DB.Movie.Title → &5, &14
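
The bottom-up steps can be sketched as follows, under stated assumptions: dictionaries stand in for the Value Index and Link Index, the root OID &1 and movie &3 are hypothetical, and the function name is made up for this example.

```python
from collections import defaultdict

EDGES = [
    ("&1", "Movie", "&2"), ("&1", "Movie", "&3"), ("&1", "Movie", "&4"),
    ("&2", "Title", "&5"), ("&3", "Title", "&9"), ("&4", "Title", "&14"),
    ("&2", "Actor", "&6"), ("&4", "Actor", "&13"),
    ("&6", "Name", "&17"), ("&13", "Name", "&21"),
]
VALUES = {"&17": "Harrison Ford", "&21": "Harrison Ford"}

def bottom_up(edges, values, root="&1"):
    value_index = defaultdict(list)  # (label, value) -> atomic OIDs
    link_index = defaultdict(list)   # (child, label) -> parent OIDs
    forward = defaultdict(list)      # (parent, label) -> child OIDs
    for parent, label, child in edges:
        link_index[(child, label)].append(parent)
        forward[(parent, label)].append(child)
        if child in values:
            value_index[(label, values[child])].append(child)

    # 1. Value Index: find the matching Name objects.
    names = value_index.get(("Name", "Harrison Ford"), [])
    # 2. Link Index: climb Name -> Actor -> Movie.
    actors = [p for n in names for p in link_index.get((n, "Name"), [])]
    movies = [p for a in actors for p in link_index.get((a, "Actor"), [])]
    movies = list(dict.fromkeys(movies))  # drop duplicates, keep order
    # 3. Keep only movies attached to the root through a Movie edge.
    movies = [m for m in movies if root in link_index.get((m, "Movie"), [])]
    # 4. Step down to the titles.
    return [t for m in movies for t in forward.get((m, "Title"), [])]

print(bottom_up(EDGES, VALUES))
```

Only the two matching branches are touched; movie &3 is never visited.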

Page 14: Indexing Semistructured Data

Hybrid Query Plan

select X
from A.B X
where exists Y in X.C : Y = 5

Bottom-up: Value Index A.B.C = “5”

Top-down: A.B

Intersect
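
A small sketch of the intersection step, using an entirely hypothetical graph (all OIDs, the extra D edge, and the function name are assumptions). Set comprehensions over the edge list stand in for the index lookups a real plan would use.

```python
# (parent, label, child) edges of a made-up graph rooted at &100 (named A).
EDGES = [
    ("&100", "B", "&101"), ("&100", "B", "&102"), ("&100", "D", "&105"),
    ("&101", "C", "&103"), ("&102", "C", "&104"), ("&105", "C", "&106"),
]
VALUES = {"&103": 5, "&104": 7, "&106": 5}

def hybrid(edges, values, root="&100"):
    # Bottom-up half: objects with a C child equal to 5. This stands in
    # for a Value Index lookup followed by one Link Index step.
    bottom_up = {
        parent for parent, label, child in edges
        if label == "C" and values.get(child) == 5
    }
    # Top-down half: objects reachable from the root via A.B.
    top_down = {
        child for parent, label, child in edges
        if parent == root and label == "B"
    }
    # The answer is whatever both halves agree on.
    return sorted(bottom_up & top_down)

print(hybrid(EDGES, VALUES))
```

Here &105 satisfies the value predicate but is not reachable via A.B, and &102 is reachable but fails the predicate, so only &101 survives the intersection.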

Page 15: Indexing Semistructured Data

Conclusions

• Presents Lore’s indexing structures: Value Index, Text Index, Link Index, Path Index, and Edge Index.
• Query plans using indexes
• Preliminary performance results: at least an order of magnitude improvement when indexes are used for query processing.