The Multimedia Semantic Web
![Page 1: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/1.jpg)
The Multimedia Semantic Web
Bill Grosky
Multimedia Information Systems Laboratory
University of Michigan-Dearborn
Dearborn, Michigan
![Page 2: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/2.jpg)
Contents
- Introduction
  - CBR – Where are we?
  - Multimedia annotation
  - Context-rich environments
  - Semantic web
- Our work
  - Anglograms
  - Finding latent semantics
  - Using text for improved image search
  - Using images for improved text search
  - Web page structure
  - A cross-modal theory of linked document semantics
![Page 3: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/3.jpg)
CBR – Where are We?
- Development of feature-based techniques for content-based retrieval (CBR) is a mature area, at least for images
- CBR researchers should now concentrate on extracting semantics from multimedia documents so that retrievals using concept-based queries can be tailored to individual users
  - The semantic gap
  - (Semi-)automated multimedia annotation
![Page 4: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/4.jpg)
Multimedia Annotation
- Multimedia annotations should be semantically rich
  - Multiple semantics
- A social theory based on how multimedia information is used
- This can be discovered by placing multimedia information in a natural, context-rich environment
![Page 5: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/5.jpg)
Context-Rich Environments
- Structural context – Author's contribution
  - Document's author places semantically similar pieces of information close to each other
  - User can cluster together semantically similar pieces of information
- Dynamic context – User's contribution
  - Short browsing sub-paths are semantically coherent
![Page 6: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/6.jpg)
Context-Rich Environments
- The Web is a perfect example of a context-rich environment
- Develop multimedia annotations through cross-modal techniques
  - Audio
  - Images
  - Text
  - Video
![Page 7: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/7.jpg)
Semantic Web
- This program overlaps another very important current research topic, the semantic web
  - Web page annotations are the backbone of this research effort
- We have something very important to offer to this area
  - Multimedia documents
  - Deriving multiple semantics for a single document
- Combining our efforts will enrich both communities
![Page 8: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/8.jpg)
Semantic Web
“The Semantic Web is a new initiative to transform the web into a structure that supports more intelligent querying and browsing, both by machines and by humans. This transformation is to be supported through the generation and use of metadata constructed via web annotation tools using user-defined ontologies that can be related to one another.”
Somewhere on the web
![Page 9: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/9.jpg)
Semantic Web
[Diagram of semantic web components: Web-Page Annotation Tool, Ontology Construction Tool, End User, Community Portal, Inference Engine, Metadata Repository, Annotated Web Pages, Ontology Articulation Toolkit, Ontologies, Agents]
Based on www.semanticweb.org
![Page 10: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/10.jpg)
Semantic Web
- "Plan a vacation within the next month," Bill instructed his semantic web agent through his handheld browser.
- An agent retrieved Bill's vacation profile from his travel agent, retrieved Bill's availability from his calendar, checked availability of airlines, hotels and restaurants, and made all the necessary arrangements.
![Page 11: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/11.jpg)
Semantic Web
- Multimedia semantic web
  - "Plan a vacation close to where [image] is being exhibited."
![Page 13: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/13.jpg)
Anglograms
- Image object
  - Entire image
  - Some meaningful portion of an image
    - semcon
- Point-based features
  - corner points
  - color histograms
![Page 14: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/14.jpg)
Anglograms
Point feature map for shape
![Page 15: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/15.jpg)
Anglograms
Point feature map for color
![Page 16: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/16.jpg)
Anglograms
Voronoi diagram of n = 18 sites
![Page 17: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/17.jpg)
Anglograms
Dual graph of a Voronoi diagram
Delaunay triangulation of n = 18 sites
![Page 18: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/18.jpg)
Anglograms
- Delaunay triangulation of a set of n points
  - O(n log n) algorithm
- Invariance of Delaunay triangles of a set of points to
  - translation
  - rotation
  - scaling
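The invariance claim can be checked on a single triangle. Translation, rotation and uniform scaling map circles to circles, so the empty-circumcircle property that defines the Delaunay triangulation is preserved, and the interior angles of every triangle stay the same. The sketch below is only an illustration: the sample triangle and the transform parameters are arbitrary values chosen here, not taken from the slides.

```python
import math

def interior_angles(p, q, r):
    """Return the sorted interior angles (degrees) of triangle pqr."""
    def angle_at(a, b, c):
        # Angle at vertex a between the rays a->b and a->c.
        v1 = (b[0] - a[0], b[1] - a[1])
        v2 = (c[0] - a[0], c[1] - a[1])
        dot = v1[0] * v2[0] + v1[1] * v2[1]
        cross = v1[0] * v2[1] - v1[1] * v2[0]
        return math.degrees(math.atan2(abs(cross), dot))
    return sorted([angle_at(p, q, r), angle_at(q, r, p), angle_at(r, p, q)])

def similarity_transform(pt, scale, theta_deg, shift):
    """Scale, rotate counter-clockwise, then translate a 2-D point."""
    t = math.radians(theta_deg)
    x, y = pt
    return (scale * (x * math.cos(t) - y * math.sin(t)) + shift[0],
            scale * (x * math.sin(t) + y * math.cos(t)) + shift[1])

# Arbitrary example triangle and transform (illustrative values only).
triangle = [(0.0, 0.0), (4.0, 0.0), (1.0, 3.0)]
moved = [similarity_transform(p, scale=2.5, theta_deg=40.0, shift=(7.0, -3.0))
         for p in triangle]

before = interior_angles(*triangle)
after = interior_angles(*moved)
print(before)
print(after)
```

Because the angles are unchanged, any histogram built from them is unchanged too, which is what makes the angle histogram a useful shape signature.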
![Page 19: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/19.jpg)
Anglograms
- Spatial layout of point set
- Anglogram
  - Computed by discretizing and counting the angles of the Delaunay triangles
    - Which angles are counted?
    - O(max(n, #bins)) algorithm
    - What is bin size?
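A minimal sketch of the counting step, assuming the Delaunay triangles have already been computed (for instance with a library routine such as `scipy.spatial.Delaunay`) and are passed in as vertex triples. The function names, the `which` options and the default 5° bin size are choices made for this illustration; they simply expose the two open questions on the slide as parameters.

```python
import math

def triangle_angles(p, q, r):
    """Interior angles (degrees) of the triangle with vertices p, q, r."""
    def angle_at(a, b, c):
        v1 = (b[0] - a[0], b[1] - a[1])
        v2 = (c[0] - a[0], c[1] - a[1])
        dot = v1[0] * v2[0] + v1[1] * v2[1]
        cross = v1[0] * v2[1] - v1[1] * v2[0]
        return math.degrees(math.atan2(abs(cross), dot))
    return [angle_at(p, q, r), angle_at(q, r, p), angle_at(r, p, q)]

def anglogram(triangles, bin_size_deg=5.0, which="two_largest"):
    """Histogram of triangle angles over [0, 180) degrees."""
    n_bins = math.ceil(180.0 / bin_size_deg)
    histogram = [0] * n_bins
    for p, q, r in triangles:
        angles = sorted(triangle_angles(p, q, r))
        if which == "two_largest":
            counted = angles[1:]
        elif which == "two_smallest":
            counted = angles[:2]
        else:  # count all three angles
            counted = angles
        for angle in counted:
            index = min(int(angle // bin_size_deg), n_bins - 1)
            histogram[index] += 1
    return histogram

# Two hand-made triangles standing in for a real triangulation.
triangles = [((0, 0), (6, 0), (1, 2)), ((0, 0), (7, 0), (2, 3))]
hist = anglogram(triangles)
print(hist)
```

A triangulation of n points has a number of triangles linear in n, so one pass over the triangles plus initializing the bins is consistent with the O(max(n, #bins)) bound quoted above.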
![Page 20: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/20.jpg)
A set of 26 points
Delaunay triangulations of the point set and its two transformed variants
![Page 21: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/21.jpg)
Anglograms
- Computation of color anglogram of an image
  - Divide image evenly into M × N non-overlapping blocks
  - Each individual block is abstracted as a unique feature point labeled with its spatial location and dominant colors
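The block abstraction can be sketched as follows. This is an illustration under stated assumptions: the image is a plain nested list of RGB tuples, the average hue of a block is used as a simple stand-in for its dominant color, and hue is quantized into 10 levels. The function name `block_feature_points` is hypothetical.

```python
import colorsys

def block_feature_points(pixels, m, n, levels=10):
    """Group block centres by quantized average hue.

    pixels: list of rows of (r, g, b) tuples with components in 0..255.
    m, n:   number of block rows and block columns.
    Returns {hue_level: [(x, y), ...]} with one centre point per block.
    """
    height, width = len(pixels), len(pixels[0])
    block_h, block_w = height // m, width // n
    points = {}
    for bi in range(m):
        for bj in range(n):
            hues = []
            for y in range(bi * block_h, (bi + 1) * block_h):
                for x in range(bj * block_w, (bj + 1) * block_w):
                    r, g, b = pixels[y][x]
                    h, _s, _v = colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)
                    hues.append(h)
            mean_hue = sum(hues) / len(hues)
            level = min(int(mean_hue * levels), levels - 1)
            centre = ((bj + 0.5) * block_w, (bi + 0.5) * block_h)
            points.setdefault(level, []).append(centre)
    return points

# Tiny synthetic 4 x 4 image split into 2 x 2 blocks.
RED, GREEN, BLUE = (255, 0, 0), (0, 255, 0), (0, 0, 255)
image = [
    [RED, RED, GREEN, GREEN],
    [RED, RED, GREEN, GREEN],
    [BLUE, BLUE, GREEN, GREEN],
    [BLUE, BLUE, GREEN, GREEN],
]
feature_points = block_feature_points(image, m=2, n=2)
print(feature_points)
```

Each list of points sharing a color label is what would then be triangulated on its own.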
![Page 22: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/22.jpg)
Anglograms
- Computation of color anglogram of an image
  - Point feature map
    - Normalized feature points, after adjusting any two neighboring feature points to a fixed distance
  - Construct Delaunay triangulation for each set of feature points labeled with identical color
![Page 23: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/23.jpg)
Anglograms
- Computation of color anglogram of an image
  - Compute anglogram based on each Delaunay triangulation
  - Color anglogram for image
    - Concatenating all the anglograms together
![Page 24: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/24.jpg)
Anglograms
Pyramid image
![Page 25: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/25.jpg)
Anglograms
![Page 26: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/26.jpg)
Anglograms
Hue component
![Page 27: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/27.jpg)
Anglograms
Saturation component
![Page 28: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/28.jpg)
Anglograms
Point feature map
![Page 29: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/29.jpg)
Anglograms
Feature points of hue 2
![Page 30: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/30.jpg)
Anglograms
Delaunay triangulation of hue 2
![Page 31: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/31.jpg)
Anglograms
Delaunay triangulation of saturation 5
![Page 32: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/32.jpg)
Anglograms
[Chart: anglogram of saturation 5. x-axis: bin number (1 to 19); y-axis: number of angles (0 to 60)]
![Page 34: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/34.jpg)
Finding Latent Semantics
- We want to transform low-level features to a higher level of meaning
- Used for dimension reduction in QBIC
  - Searching in high-dimensional spaces
- More importantly, it creates clusters of co-occurring features
  - So-called concepts
![Page 35: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/35.jpg)
Finding Latent Semantics
- Latent Semantic Analysis (LSA) was introduced to overcome a fundamental problem in textual information retrieval
  - Users want to retrieve on the basis of conceptual content
  - Individual words provide unreliable evidence about conceptual meanings
    - Synonymy: many ways to refer to the same object
    - Polysemy: most words have more than one distinct meaning
![Page 36: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/36.jpg)
Finding Latent Semantics
- Searching for documents concerning automobiles
  - Users tend to use the keyword automobile
- A statistical analysis determines that the keywords automobile and car tend to co-occur
- LSA will also retrieve documents in which the keyword car appears even though the keyword automobile does not
![Page 37: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/37.jpg)
Finding Latent Semantics
- Term-document association
  - It is assumed that there exists some underlying latent semantic structure in the data that is partially obscured by the randomness of term choice
  - By semantic structure we mean the correlation structure in which individual terms appear in documents
  - Semantic implies only the fact that terms in a document may be taken as referents to the document itself or to its topic
  - Statistical techniques are used to estimate this latent semantic structure, and to get rid of obscuring noise
![Page 38: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/38.jpg)
Finding Latent Semantics
- Singular-value decomposition (SVD)
  - Take a large matrix of term-document association
  - Construct a semantic space wherein terms and documents that are closely associated are placed near to each other
  - SVD allows the arrangement of space to reflect the major associative patterns and ignore smaller, less important influences
  - As a result, terms that did not actually appear in a document may still end up close to the document, if that is consistent with the major patterns of association
  - Position in the space serves as the semantic indexing
  - Retrieval proceeds by using the terms in a query to identify a point in the semantic space, and documents in its neighborhood are returned as relevant results
![Page 39: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/39.jpg)
Finding Latent Semantics
- Term-document matrix
  - d documents
  - t terms
  - Represented by a t × d term-document matrix A
    - Each document is represented by a column (document vector)
    - Each term is represented by a row (term vector)
![Page 40: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/40.jpg)
Finding Latent Semantics
- The terms (t = 6)
  - t1: bak(e,ing)
  - t2: recipes
  - t3: bread
  - t4: cake
  - t5: pastr(y,ies)
  - t6: pie
- The document titles (d = 5)
  - d1: How to Bake Bread Without Recipes
  - d2: The Classic Art of Viennese Pastry
  - d3: Numerical Recipes: The Art of Scientific Computing
  - d4: Breads, Pastries, Pies and Cakes: Quantity Baking Recipes
  - d5: Pastry: A Book of Best French Recipes
![Page 41: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/41.jpg)
Finding Latent Semantics
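Term-document matrix (rows t1 to t6, columns d1 to d5):

$$
A = \begin{bmatrix}
1 & 0 & 0 & 1 & 0 \\
1 & 0 & 1 & 1 & 1 \\
1 & 0 & 0 & 1 & 0 \\
0 & 0 & 0 & 1 & 0 \\
0 & 1 & 0 & 1 & 1 \\
0 & 0 & 0 & 1 & 0
\end{bmatrix}
$$

The same matrix with each column normalized to unit length:

$$
\hat{A} = \begin{bmatrix}
0.5774 & 0 & 0 & 0.4082 & 0 \\
0.5774 & 0 & 1.0000 & 0.4082 & 0.7071 \\
0.5774 & 0 & 0 & 0.4082 & 0 \\
0 & 0 & 0 & 0.4082 & 0 \\
0 & 1.0000 & 0 & 0.4082 & 0.7071 \\
0 & 0 & 0 & 0.4082 & 0
\end{bmatrix}
$$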
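The two matrices on this slide can be rebuilt from the term list and the five titles. The sketch below is one plausible way to do it, not necessarily how the slide's numbers were produced: it assumes a term is present when some word of the title starts with a hand-picked stem (the `stems` list is an assumption made here), records presence as 0 or 1, and then scales every column to unit Euclidean length.

```python
import math
import re

# Hypothetical stems standing in for the six terms t1..t6.
stems = ["bak", "recipe", "bread", "cake", "pastr", "pie"]
titles = [
    "How to Bake Bread Without Recipes",
    "The Classic Art of Viennese Pastry",
    "Numerical Recipes: The Art of Scientific Computing",
    "Breads, Pastries, Pies and Cakes: Quantity Baking Recipes",
    "Pastry: A Book of Best French Recipes",
]

def term_document_matrix(stems, titles):
    """Rows are terms, columns are documents, entries are 0/1 presence."""
    rows = []
    for stem in stems:
        row = []
        for title in titles:
            words = re.findall(r"[a-z]+", title.lower())
            row.append(1 if any(w.startswith(stem) for w in words) else 0)
        rows.append(row)
    return rows

def normalize_columns(matrix):
    """Scale every column to unit Euclidean length."""
    n_cols = len(matrix[0])
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_cols)]
    return [[row[j] / norms[j] for j in range(n_cols)] for row in matrix]

A = term_document_matrix(stems, titles)
A_hat = normalize_columns(A)
for row in A_hat:
    print([round(x, 4) for x in row])
```

For example, d1 contains three of the six terms, so each of its nonzero entries becomes 1/√3 ≈ 0.5774.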
![Page 42: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/42.jpg)
Finding Latent Semantics
- SVD is a dimension reduction technique
  - Reduced-rank approximation to both column space and row space
- Find a rank-k approximation to matrix A with minimal change to that matrix for a given value of k
- This decomposition exists for any matrix A
![Page 43: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/43.jpg)
Finding Latent Semantics
- SVD of a term-document matrix $A$
  - $A = U \Sigma V^T$
  - $A$ is $t \times d$
  - $U$ is a $t \times r$ orthogonal matrix, where $r$ is rank($A$)
    - The columns of $U$ are a basis for the column space of $A$
    - $U$ is the matrix of eigenvectors of the matrix $AA^T$
  - $\Sigma$ is an $r \times r$ diagonal matrix having singular values $\sigma_1 \ge \sigma_2 \ge \ldots \ge \sigma_r$ of $A$ in order along its diagonal
    - $\Sigma^2$ is the matrix of eigenvalues of $AA^T$ or $A^TA$
  - $V^T$ is an $r \times d$ orthogonal matrix
    - The rows of $V^T$ are a basis for the row space of $A$
    - $V$ is the matrix of eigenvectors of the matrix $A^TA$
![Page 44: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/44.jpg)
Finding Latent Semantics
[Diagram: $A$ ($t \times d$) = $U$ ($t \times r$) · $\Sigma$ ($r \times r$) · $V^T$ ($r \times d$)]
![Page 45: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/45.jpg)
Finding Latent Semantics
- A special rank-$k$ approximation, $A_k$
  - $A_k = U_k \Sigma_k V_k^T$
  - $U_k$: first $k$ columns of $U$
  - $\Sigma_k$: first $k$ diagonal values of $\Sigma$
  - $V_k^T$: first $k$ rows of $V^T$
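A short NumPy sketch of this truncation, applied to the column-normalized baking matrix from the earlier slide. It uses `numpy.linalg.svd`; the helper name `rank_k_approximation` is made up for this example. For the truncated SVD, the Frobenius-norm error equals the square root of the sum of the squared singular values that were dropped, which the last lines print for k = 3.

```python
import numpy as np

# 0/1 term-document matrix of the baking example (terms x documents).
A = np.array([
    [1, 0, 0, 1, 0],
    [1, 0, 1, 1, 1],
    [1, 0, 0, 1, 0],
    [0, 0, 0, 1, 0],
    [0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0],
], dtype=float)
A_hat = A / np.linalg.norm(A, axis=0)  # unit-length columns

def rank_k_approximation(matrix, k):
    """Return (A_k, singular values) using the k largest singular triplets."""
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
    return A_k, s

A3, s = rank_k_approximation(A_hat, 3)
error = np.linalg.norm(A_hat - A3, "fro")

print(np.round(s, 4))
print(np.round(A3, 4))
print(round(float(error), 4), round(float(np.sqrt(np.sum(s[3:] ** 2))), 4))
```

The matrix has rank 4, so only the smallest nonzero singular value is discarded when k = 3.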
![Page 46: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/46.jpg)
Finding Latent Semantics
[Numeric SVD of the column-normalized term-document matrix, $A = U \Sigma V^T$: the slide lists the entries of $U$, $\Sigma$ and $V$; the nonzero singular values are 1.6950, 1.1158, 0.8403 and 0.4195]
![Page 47: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/47.jpg)
Finding Latent Semantics
Reduce the rank to 3
[Slide shows the column-normalized matrix $A$ next to its rank-3 approximation $A_3$]
![Page 48: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/48.jpg)
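Finding Latent Semantics
Documents w/o SVD

| Term | Doc 1 | Doc 2 | Doc 3 | Doc 4 | Query |
| --- | --- | --- | --- | --- | --- |
| Mark | 15 | 0 | 0 | 0 | 1 |
| Twain | 15 | 0 | 20 | 0 | 1 |
| Samuel | 0 | 10 | 5 | 0 | 0 |
| Clemens | 0 | 20 | 10 | 0 | 0 |
| Purple | 0 | 0 | 0 | 20 | 0 |
| Lion | 0 | 0 | 0 | 15 | 0 |
| Score | 30 | 0 | 20 | 0 | |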
![Page 49: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/49.jpg)
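Finding Latent Semantics
Documents with SVD

| Term | Doc 1 | Doc 2 | Doc 3 | Doc 4 | Query |
| --- | --- | --- | --- | --- | --- |
| Mark | 3.7 | 3.5 | 5.5 | 0 | 1 |
| Twain | 11.0 | 10.3 | 16.1 | 0 | 1 |
| Samuel | 4.1 | 3.9 | 6.1 | 0 | 0 |
| Clemens | 8.3 | 7.8 | 12.2 | 0 | 0 |
| Purple | 0 | 0 | 0 | 20 | 0 |
| Lion | 0 | 0 | 0 | 15 | 0 |
| Score | 14.7 | 13.8 | 21.6 | 0 | |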
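The two score rows can be reproduced with a few lines of NumPy. The score of a document is taken here as the dot product of the query vector (1 for Mark and Twain, 0 elsewhere) with that document's column. The rank k = 2 is an assumption: it is the truncation that appears to match the smoothed table, where documents 1 to 3 collapse onto one concept and document 4 keeps its own.

```python
import numpy as np

terms = ["Mark", "Twain", "Samuel", "Clemens", "Purple", "Lion"]
A = np.array([
    [15, 0, 0, 0],
    [15, 0, 20, 0],
    [0, 10, 5, 0],
    [0, 20, 10, 0],
    [0, 0, 0, 20],
    [0, 0, 0, 15],
], dtype=float)
query = np.array([1, 1, 0, 0, 0, 0], dtype=float)  # "Mark Twain"

# Plain keyword matching: document 2 never mentions Mark or Twain.
raw_scores = query @ A

# LSA: keep the k largest singular triplets (k = 2 is assumed here).
U, s, Vt = np.linalg.svd(A, full_matrices=False)
k = 2
A_k = U[:, :k] @ np.diag(s[:k]) @ Vt[:k, :]
lsa_scores = query @ A_k

print(raw_scores)
print(np.round(A_k, 1))
print(np.round(lsa_scores, 1))
```

Document 2 uses only Samuel and Clemens, yet after truncation it receives a clearly nonzero score, because those terms co-occur with Twain in document 3.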
![Page 51: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/51.jpg)
Using Text for Improved Image Search
10 sets of 5 similar images
![Page 52: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/52.jpg)
Using Text for Improved Image Search
- Color anglogram
  - Each image is divided into 64 non-overlapping blocks
  - Extract average hue and average saturation values of each block
  - Hue and saturation each quantized into 10 values
  - Generate Delaunay triangles for each hue value and each saturation value
  - Count two largest angles and quantize them into 36 bins, each of 5°
  - Feature vector has 720 elements
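The 720 follows from the numbers on this slide: 10 hue values plus 10 saturation values, each contributing one 36-bin anglogram, and 36 bins of 5° cover the full 0° to 180° range. The small sketch below spells out that arithmetic; the ordering of the concatenation (all hue anglograms first, then all saturation anglograms) and the `slot` helper are assumptions made for illustration.

```python
def feature_length(hue_levels=10, sat_levels=10, bins=36):
    """Total length of the concatenated color anglogram."""
    return (hue_levels + sat_levels) * bins

def slot(channel, level, bin_index, hue_levels=10, bins=36):
    """Index of one histogram cell in the concatenated vector.

    Assumed layout: hue anglograms for levels 0..9 first,
    followed by saturation anglograms for levels 0..9.
    """
    offset = 0 if channel == "hue" else hue_levels * bins
    return offset + level * bins + bin_index

print(feature_length())            # (10 + 10) * 36
print(slot("hue", 0, 0))           # first cell
print(slot("saturation", 9, 35))   # last cell
```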
![Page 53: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/53.jpg)
Using Text for Improved Image Search
- Annotations
  - Extra 15 elements
    - Category positions: sky, sun, land, water, boat, grass, horse, rhino, bird, human, pyramid, column, tower, sphinx, snow
- Each image annotated with appropriate keywords and the area coverage of each of these keywords
  - e.g., sky (0.55), sun (0.15), water (0.30)
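One way to picture the extra 15 elements is as a fixed-order coverage vector attached to the visual feature vector. In the sketch below, the category order follows the slide; appending the annotations after the 720 visual elements, leaving them unscaled, and the function names are all assumptions made for illustration. The all-zero visual vector is a placeholder, not real data.

```python
CATEGORIES = [
    "sky", "sun", "land", "water", "boat", "grass", "horse", "rhino",
    "bird", "human", "pyramid", "column", "tower", "sphinx", "snow",
]

def annotation_vector(coverage):
    """Map {keyword: area fraction} onto the 15 fixed category positions."""
    unknown = set(coverage) - set(CATEGORIES)
    if unknown:
        raise ValueError(f"unknown categories: {sorted(unknown)}")
    return [coverage.get(name, 0.0) for name in CATEGORIES]

def annotated_feature_vector(visual, coverage):
    """Concatenate visual features with the annotation positions."""
    return list(visual) + annotation_vector(coverage)

visual = [0.0] * 720  # placeholder for a 720-element color anglogram
vec = annotated_feature_vector(visual, {"sky": 0.55, "sun": 0.15, "water": 0.30})
print(len(vec))
print(vec[720:])
```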
![Page 54: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/54.jpg)
Using Text for Improved Image Search
Raw color global histogram data
Raw color global histogram data using LSA
Annotated color global histogram data using LSA
0.3% improvement
0.5% improvement
![Page 55: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/55.jpg)
Using Text for Improved Image Search
Raw color anglogram data
Raw color anglogram data using LSA
Annotated color anglogram data using LSA
0.5% improvement
1% improvement
![Page 57: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/57.jpg)
Using Images for Improved Text Search
- Using documents collected from news Web sites
  - News headlines are often used as URL anchors and document titles
  - Topic can be represented easily and clearly by a group of keywords in the headline
  - News web sites often have extensive coverage of the same topic during a certain period of time
  - News documents often include multimedia components which are closely related to the topic
![Page 58: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/58.jpg)
Using Images for Improved Text Search
- Discover the semantic correlation between keywords and image in the same document
- A collection of 20 documents from cnn.com
  - 4 semantic categories of 5 documents each
  - 43 keywords
  - Select 1 image from each document
    - Color anglogram
![Page 59: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/59.jpg)
Using Images for Improved Text Search
1. Bush, in first address as president
2. Education, tax cuts top Bush's Washington agenda
3. Campaign promises could prove troublesome for Bush
4. Bush's to-do list: Set tone for next four years
5. George W. Bush: The 43rd President
6. Rescue mission for crippled Russian sub enters second day
7. Russian official says chances not good for rescue of trapped crew aboard sunken nuclear sub
8. Kursk salvage raises questions
9. Russia to start recovering Kursk bodies
10. Russian navy begins attempt to evacuate sailors from sunken sub
11. Clinton acquitted; president apologizes again
12. Clinton apologizes to nation
13. Clinton's evolving apology for the Lewinsky affair
14. Clinton will not address impeachment in State of the Union
15. Clinton says 'presidents are people, too'
16. MIR prepares for risky plunge
17. Mir positioned for fiery descent
18. A Mir risk
19. Mir demise causes international high anxiety
20. New Zealand issues Mir warning
![Page 60: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/60.jpg)
Using Images for Improved Text Search
![Page 61: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/61.jpg)
Using Images for Improved Text Search
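- Integrated feature vector $F = [f_1, f_2, \ldots, f_{143}]^T$
  - Textual feature vector $K = [k_1, k_2, \ldots, k_{43}]^T$
  - Image feature vector $I = [i_1, i_2, \ldots, i_{100}]^T$
- Feature-document matrix $A = [F_1, F_2, \ldots, F_{20}]$
  - $A = U \Sigma V^T$
  - $U$ is $143 \times 143$, $\Sigma$ is $143 \times 20$, and $V$ is $20 \times 20$
- $k = 12$
  - $A_k = U_k \Sigma_k V_k^T$
  - $U_k$ is $143 \times 12$, $\Sigma_k$ is $12 \times 12$, and $V_k$ is $20 \times 12$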
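The matrix sizes on this slide can be checked with a NumPy sketch. The feature values below are random placeholders, since the real keyword and image features are not part of the transcript; only the dimensions (43 textual plus 100 image features, 20 documents, k = 12) come from the slide. Stacking the textual block above the image block is an assumed ordering.

```python
import numpy as np

rng = np.random.default_rng(0)
n_docs, n_text, n_image = 20, 43, 100

K = rng.random((n_text, n_docs))    # placeholder textual features, one column per document
I = rng.random((n_image, n_docs))   # placeholder image features
A = np.vstack([K, I])               # integrated feature-document matrix, 143 x 20

# Full SVD, so that U and V are square as stated on the slide.
U, s, Vt = np.linalg.svd(A, full_matrices=True)
Sigma = np.zeros(A.shape)
Sigma[:n_docs, :n_docs] = np.diag(s)

k = 12
U_k = U[:, :k]
Sigma_k = np.diag(s[:k])
V_k = Vt[:k, :].T
A_k = U_k @ Sigma_k @ V_k.T

print(U.shape, Sigma.shape, Vt.T.shape)
print(U_k.shape, Sigma_k.shape, V_k.shape)
```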
![Page 62: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/62.jpg)
Using Images for Improved Text Search
Each image is normalized to 192 × 128, and then divided into 64 non-overlapping blocks
Extract average hue and saturation values of each block
Hue and saturation each quantized into 10 values
Generate Delaunay triangles for each hue value and each saturation value
![Page 63: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/63.jpg)
Using Images for Improved Text Search
- Count two largest angles and quantize them into 36 bins, each of 5°
- Image feature vector has 720 elements
- Feature-document matrix A is 763 × 20
- SVD
  - k = 12
![Page 64: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/64.jpg)
Using Images for Improved Text Search
Keywords only
Keywords using LSA
Image (anglogram) annotated keywords using LSA
1% improvement
21% improvement
Image (global color histogram) annotated keywords using LSA
3% improvement
![Page 66: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/66.jpg)
Web Page Structure
- Genre detection
- We do the following:
  - Display web page in the program
  - Get tag hierarchy with area co-ordinates
  - Normalize the web page to size 512 * 512
  - Divide page into 16*16 blocks
  - Calculate area covered by each tag in each block, considering the level of the tag in the tag hierarchy
  - For each feature tag, get the center coordinates of the blocks where it covers the maximum area as compared with other tags on the same level
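The last two steps can be sketched as follows. This is a simplified illustration under several assumptions: tag bounding boxes are already available on the normalized 512 × 512 page, all tags sit on one hierarchy level, and "16*16 blocks" is read as a 16 × 16 grid (blocks of 32 × 32 pixels). The function name and the two example boxes are hypothetical.

```python
def dominant_block_centres(tag_boxes, page=512, grid=16):
    """For each tag, list centres of the blocks where it covers the most area.

    tag_boxes: list of (tag, x0, y0, x1, y1) rectangles on a page x page canvas,
               all assumed to be on the same level of the tag hierarchy.
    """
    size = page / grid
    centres = {}
    for row in range(grid):
        for col in range(grid):
            bx0, by0 = col * size, row * size
            bx1, by1 = bx0 + size, by0 + size
            area = {}
            for tag, x0, y0, x1, y1 in tag_boxes:
                # Overlap of the tag rectangle with this block.
                w = min(bx1, x1) - max(bx0, x0)
                h = min(by1, y1) - max(by0, y0)
                if w > 0 and h > 0:
                    area[tag] = area.get(tag, 0.0) + w * h
            if area:
                best = max(area, key=area.get)  # ties go to the first tag seen
                centres.setdefault(best, []).append((bx0 + size / 2, by0 + size / 2))
    return centres

# Hypothetical layout: a table on the left half, an image on the right half.
boxes = [("table", 0, 0, 256, 512), ("img", 256, 0, 512, 512)]
centres = dominant_block_centres(boxes)
print(len(centres["table"]), len(centres["img"]))
print(centres["table"][0], centres["img"][0])
```

The resulting centre points per tag play the role that per-color feature points play in the image case.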
![Page 67: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/67.jpg)
Web Page Structure
![Page 68: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/68.jpg)
Web Page Structure
![Page 69: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/69.jpg)
Web Page Structure
- Histogram
  - 36 bins with the two largest angles
  - Tags independent of level
- Try approach where a tag on a lower level overrides the upper-level tag
![Page 70: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/70.jpg)
Web Page Structure
- Set of tags defined
  - Initially, a large set of feature tags (52) is defined to ensure a powerful set of independent features for the discrimination of web pages
  - A second set of tags (3) is defined based on histograms created for the initial set of tags so that these tags will better differentiate web pages
![Page 71: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/71.jpg)
Web Page Structure
- Experiment #1
  - Categories defined are
    - Detroit News
    - Times of India
    - Tribune India
    - Esakal
    - Amazon.com
    - Buy.com
![Page 72: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/72.jpg)
Web Page Structure
Cluster category based on closest page

| Tag set | Matches | Failures |
| --- | --- | --- |
| 52 tags | 26 | 10 |
| 3 tags | 27 | 9 |
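"Cluster category based on closest page" reads like a nearest-neighbor rule: a page is assigned the category of the page whose feature vector is closest to it. The sketch below shows that rule with a leave-one-out count of matches and failures. The Euclidean distance, the leave-one-out protocol, and the six toy 2-D vectors are all assumptions made for illustration; they are not the experiment's data.

```python
import math

def closest_page_category(query_vec, labelled_pages):
    """Category of the labelled page nearest to query_vec (Euclidean distance)."""
    best = min(labelled_pages, key=lambda item: math.dist(query_vec, item[1]))
    return best[0]

def evaluate(pages):
    """Leave-one-out: classify each page by its closest other page."""
    matches = 0
    for i, (category, vec) in enumerate(pages):
        others = pages[:i] + pages[i + 1:]
        if closest_page_category(vec, others) == category:
            matches += 1
    return matches, len(pages) - matches

# Toy stand-ins for page feature vectors (hypothetical values).
pages = [
    ("news", (0.0, 0.1)), ("news", (0.1, 0.0)), ("news", (0.2, 0.2)),
    ("shop", (1.0, 1.0)), ("shop", (0.9, 1.1)), ("shop", (0.5, 0.5)),
]
matches, failures = evaluate(pages)
print(matches, failures)
```

In the table above, matches and failures sum to 36 pages for each tag set.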
![Page 73: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/73.jpg)
Web Page Structure
- Experiment #2
  - Categories defined are
    - Newspaper environment
      - Detroit News
      - Times of India
      - Tribune India
      - Esakal
    - e-Commerce environment
      - Amazon.com
      - Buy.com
![Page 74: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/74.jpg)
Web Page Structure
| Tag set | Matches | Failures |
| --- | --- | --- |
| 52 tags | 33 | 3 |
| 3 tags | 33 | 3 |
![Page 76: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/76.jpg)
A Cross-Modal Theory of Linked Document Semantics
- Environment
  - Suppose one has a linked set of multimedia documents
    - Web
    - Content-based hypermedia
  - This provides a rich context for individual chunks of information
    - The structure of individual multimedia documents
    - The link structure
![Page 77: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/77.jpg)
A Cross-Modal Theory of Linked Document Semantics
- Goal
  - Derive document semantics based on user browsing behavior
  - The same document has multiple semantics
    - Different people see different meanings in the same document
  - Over short browsing paths, an individual user's wants and needs are uniform
  - The pages visited over these short paths exhibit semantics in congruence with these wants and needs
![Page 78: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/78.jpg)
A Cross-Modal Theory of Linked Document Semantics
- Questions
  - How can the semantics of a web page be derived given a set of user browsing paths that end at that page?
  - How can we characterize the semantics of a user browsing path?
  - How can web page semantics help us in navigating the web more efficiently?
  - How can our approach actually be implemented in the real web world?
![Page 79: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/79.jpg)
A Cross-Modal Theory of Linked Document Semantics
- Our approach
  - We use actual browsing paths to find the latent semantics of web pages
    - Textual features
    - Image features
    - Structural features
  - We hope to find general concepts comprising various textual and image features which frequently co-occur
![Page 80: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/80.jpg)
A Cross-Modal Theory of Linked Document Semantics
- We believe that a user's browsing path exhibits semantic coherence
  - While the user's entire path exhibits multiple semantics, especially pages far from each other on the path, neighboring pages, especially the portions close to the links taken, are semantically close to each other
![Page 81: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/81.jpg)
A Cross-Modal Theory of Linked Document Semantics
- We would like to characterize the contiguous sub-paths of a user's browsing path that exhibit similar semantics and detect the semantic break points along the path where the semantics appreciably change
  - Collect these sub-paths into a multiset
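One simple way to make "semantic break point" concrete is to compare the feature vectors of consecutive pages and cut the path wherever their similarity drops below a threshold. The sketch below does this with cosine similarity and collects the resulting sub-paths in a `collections.Counter`, which behaves as a multiset. The similarity measure, the 0.5 threshold, and the toy page vectors are assumptions for illustration; the slides do not specify how break points are detected.

```python
import math
from collections import Counter

def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def split_at_breaks(path, vectors, threshold=0.5):
    """Split a non-empty browsing path into contiguous, semantically similar sub-paths."""
    subpaths = []
    current = [path[0]]
    for prev, page in zip(path, path[1:]):
        if cosine(vectors[prev], vectors[page]) < threshold:
            subpaths.append(tuple(current))  # semantic break point
            current = []
        current.append(page)
    subpaths.append(tuple(current))
    return subpaths

# Hypothetical page feature vectors and browsing paths.
vectors = {"a": (1, 0, 0), "b": (0.9, 0.1, 0), "c": (0, 1, 0), "d": (0, 0.8, 0.2)}
paths = [["a", "b", "c", "d"], ["a", "b"], ["c", "d", "a"]]

multiset = Counter(sub for path in paths for sub in split_at_breaks(path, vectors))
print(multiset)
```

Keeping counts rather than a plain set preserves how often each coherent sub-path was actually followed.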
![Page 82: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/82.jpg)
A Cross-Modal Theory of Linked Document Semantics
- We categorize the semantics of each web page based on a history of the semantically-coherent browsing paths of all users which end at that page
- A browsing path will be represented by a high-dimensional vector
- The various positions of the vector correspond to the presence of
  - textual keywords
  - image features (visual keywords)
  - structural features (structural keywords)
![Page 83: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/83.jpg)
A Cross-Modal Theory of Linked Document Semantics
- From the complete set of web pages under consideration, we extract a set of textual, visual, and structural keywords
- For each multiset, M, of sub-paths that we are to analyze, we form three matrices
  - term-path matrix
  - image-path matrix
  - structure-path matrix
![Page 84: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/84.jpg)
A Cross-Modal Theory of Linked Document Semantics
- The (i,j)th element of these matrices is determined by the strength of the presence of the ith keyword along the jth browsing path
- This strength is determined by
  - How many times this term occurs on the pages along the path
  - How much time the user spends examining these pages
  - How close each occurrence of the ith keyword is to both the outgoing and incoming anchor positions
  - How many times this browsing path occurs in M
![Page 85: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/85.jpg)
A Cross-Modal Theory of Linked Document Semantics
These matrices may be concatenated together in various ways to produce an overall keyword-path matrix
Perform latent-semantic analysis to get concepts
A page is then represented by a set of concept classes
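A minimal NumPy sketch of the concatenate-then-LSA step. The three matrices are filled with random placeholder values because no real browsing data is given; their sizes (5 textual, 4 visual and 3 structural keywords over 8 sub-paths), plain vertical stacking as the concatenation, and k = 3 are all assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
n_paths = 8

# Placeholder keyword-strength matrices, one column per browsing sub-path.
term_path = rng.random((5, n_paths))
image_path = rng.random((4, n_paths))
structure_path = rng.random((3, n_paths))

# One possible concatenation: stack the keyword rows of all three modalities.
keyword_path = np.vstack([term_path, image_path, structure_path])

U, s, Vt = np.linalg.svd(keyword_path, full_matrices=False)
k = 3
concepts = U[:, :k]                         # each column weights all 12 keywords
path_coords = np.diag(s[:k]) @ Vt[:k, :]    # each sub-path in the k-concept space

print(keyword_path.shape, concepts.shape, path_coords.shape)
```

Each concept column mixes textual, visual and structural keywords, which is the cross-modal aspect of the approach.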
![Page 86: The Multimedia Semantic Web](https://reader036.vdocuments.site/reader036/viewer/2022062521/56816867550346895ddecb17/html5/thumbnails/86.jpg)
Conclusions
- Researchers in CBR should now be concentrating on extracting semantics from multimedia documents
- The web is a perfect testbed for studying (semi-)automated techniques for multimedia annotation due to contextual richness
- CBR + Semantic Web = The Multimedia Semantic Web
- Get Involved!!!