semantic search engines – on the way to web 3.0

Post on 19-Mar-2016


A. Frank

Semantic Search Engines – On the Way to Web 3.0

Ariel Frank

Department of Computer Science

Bar-Ilan University

ariel@cs.biu.ac.il

Contents

• Web 3.0 & Semantic Search
• General Search
• "Natural Language" Search
• Vertical Search
• "Social Networking" Search
• Personalized Search


“The good, the bad and the”…


Web 1.0, Web 2.0, Web 3.0, Web X.0…

Semantic Search

• Syntactic search can match the query against:
  – an index of the textual content of the resources
  – URIs (URLs, URNs) in the system
  – literals in the RDF metadata
  – or a combination of these, possibly using:
    • exact, prefix or substring match, stemming, minimal edit distance
• Semantic search, in addition to syntactic search, can use:
  – an index of the meaning of sentences in each resource
  – semantic information and analysis
  – the graph structure of RDF metadata
  – or a combination of these, possibly using:
    • query expansion, classification/categorization, tagging, graph traversal, microformats, RDF & OWL inferencing and reasoning
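The difference between the two kinds of matching can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the `RELATED` table is a hypothetical stand-in for a real thesaurus or ontology, the sample documents are made up, and none of the function names come from the engines discussed in this talk.

```python
from typing import Dict, List, Set


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance, computed row by row with dynamic programming."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete from a
                            curr[j - 1] + 1,      # insert into a
                            prev[j - 1] + cost))  # substitute
        prev = curr
    return prev[-1]


def syntactic_match(query: str, word: str, max_edits: int = 1) -> bool:
    """Exact, prefix, substring, or small-edit-distance match on one word."""
    q, w = query.lower(), word.lower()
    return q == w or w.startswith(q) or q in w or edit_distance(q, w) <= max_edits


# Hypothetical related-term table (assumption): stands in for an ontology.
RELATED: Dict[str, Set[str]] = {"film": {"movie", "cinema"}}


def expand_query(query: str) -> Set[str]:
    """Query expansion: the original term plus its related terms."""
    q = query.lower()
    return {q} | RELATED.get(q, set())


def search(query: str, docs: List[str], semantic: bool = False) -> List[str]:
    """Return the documents in which any word matches any search term."""
    terms = expand_query(query) if semantic else {query.lower()}
    return [doc for doc in docs
            if any(syntactic_match(t, w) for t in terms for w in doc.split())]


SAMPLE_DOCS = ["a classic movie", "film festival", "filn typo"]
```

With these sample documents, `search("film", SAMPLE_DOCS)` finds the exact match and the one-letter typo, while `search("film", SAMPLE_DOCS, semantic=True)` also finds the document that only mentions "movie".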


Can Semantic SEs answer this? :-)

Types/Examples of Semantic SEs

• General Search
  – MetaWeb Freebase, Yahoo! Microsearch, …
• "Natural Language" Search
  – Powerset, Hakia, AskMeNow AskWiki, …
• Vertical Search
  – Kango, AdaptiveBlue, ReportLinker, …
• "Social Networking" Search
  – SemantiNet, Delver, Google Social Graph API, …
• Personalized Search
  – Twine, MavinIT PSS, …


MetaWeb Technologies - Freebase

• Based in San Francisco, MetaWeb Technologies was spun out of Applied Minds in July 2005.

• Goal: build a better infrastructure for Web application developers and publishers.

Freebase Rationale

• Open, shared database of the world’s knowledge that collects data from the Web to build a massive, collaboratively-edited database of cross-linked data.
• It is built by the community, for the community.
• Free for anyone to query, contribute to, build applications on top of, or integrate into their Web sites.
• Focus is on organizing and managing complex data structures by use of Semantic Web technologies.
• Enables extraction of ordered knowledge out of the information chaos that is the current Web.


Freebase

Freebase Repository

• Covers millions of topics in hundreds of categories.
• Draws from large open repositories like Wikipedia, MusicBrainz, and the SEC archives.
• Contains structured information on many popular topics, like movies, music, people and locations – all reconciled and freely available via an open API.
• Freebase information is supplemented by the efforts of a passionate global community of users, who are working together to add structured information on everything relevant.

Domains and Types

Google Company

Freebase Help Center

Freebase Semantics

• Freebase spans domains, but requires that a particular topic exist only once, even if it might normally be found in multiple databases.
• For example, Arnold Schwarzenegger would appear in a movie database as an actor, a political database as a governor and a bodybuilder database as a Mr. Universe.
• In Freebase, there is only one topic for Arnold Schwarzenegger, with all three facets of his public persona brought together.
• The unified topic acts as an information hub, making it easy to find and contribute information about him.
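The "one topic, many facets" idea can be sketched as a small data structure. This is a hypothetical illustration, not Freebase's actual schema or API: the class names, the type names ("actor", "politician", "bodybuilder") and the property keys are assumptions chosen to mirror the example on this slide.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class Topic:
    """A single topic that accumulates facets (types) from several domains."""
    name: str
    # type name -> properties recorded under that type
    facets: Dict[str, Dict[str, str]] = field(default_factory=dict)


class TopicStore:
    """Keeps exactly one Topic per name, however many domains mention it."""

    def __init__(self) -> None:
        self._topics: Dict[str, Topic] = {}

    def add_facet(self, name: str, type_name: str,
                  properties: Dict[str, str]) -> Topic:
        # Reuse the existing topic if there is one, instead of duplicating it.
        topic = self._topics.setdefault(name, Topic(name))
        topic.facets.setdefault(type_name, {}).update(properties)
        return topic

    def get(self, name: str) -> Topic:
        return self._topics[name]

    def __len__(self) -> int:
        return len(self._topics)


# Three "databases" contribute to the same topic (values taken from the slide).
store = TopicStore()
store.add_facet("Arnold Schwarzenegger", "actor", {"domain": "movies"})
store.add_facet("Arnold Schwarzenegger", "politician", {"office": "governor"})
store.add_facet("Arnold Schwarzenegger", "bodybuilder", {"title": "Mr. Universe"})
```

After the three calls the store still holds a single topic, and that topic carries all three facets, which is what lets it act as a hub.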

Arnold Schwarzenegger (1)

Arnold Schwarzenegger (2)

Freebase Dynamics

• If the user is a developer, or just mildly technical, Freebase offers tools that make it easy to query and integrate the data into Web applications, blogs, wikis, user pages or anything else that would benefit from an injection of structured information.
• In addition to reconciling many facets of one topic, the underlying structure of Freebase lets the user run more complex queries.
• For example, if Freebase is asked for films starring Jennifer Connelly and actors who have appeared in Steven Spielberg movies, a list of 8 movies is given.
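A query of this shape amounts to a join over cross-linked data. The sketch below shows the idea on a tiny in-memory table. It does not use Freebase's query interface, and the film, actor and director names are deliberately placeholders rather than real filmography, so it will not reproduce the 8 movies mentioned above.

```python
from dataclasses import dataclass
from typing import Dict, FrozenSet, List, Set


@dataclass(frozen=True)
class Film:
    director: str
    cast: FrozenSet[str]


# Placeholder data (assumption): not real films, actors or directors.
FILMS: Dict[str, Film] = {
    "Film A": Film("Director D", frozenset({"Star S", "Actor X"})),
    "Film B": Film("Director E", frozenset({"Star S", "Actor Y"})),
    "Film C": Film("Director D", frozenset({"Actor Y"})),
    "Film D": Film("Director E", frozenset({"Star S", "Actor Z"})),
}


def actors_directed_by(director: str, films: Dict[str, Film] = FILMS) -> Set[str]:
    """Everyone who appears in at least one film by the given director."""
    return {actor
            for film in films.values() if film.director == director
            for actor in film.cast}


def films_with_costars_from(star: str, director: str,
                            films: Dict[str, Film] = FILMS) -> List[str]:
    """Films starring `star` alongside an actor who worked with `director`."""
    pool = actors_directed_by(director, films) - {star}
    return sorted(title for title, film in films.items()
                  if star in film.cast and film.cast & pool)
```

Here `films_with_costars_from("Star S", "Director D")` keeps "Film A" and "Film B" and drops "Film D", whose only co-star never worked with that director.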

…Films starring Jennifer Connelly

Freebase vs. Wikipedia

• The difference lies in the way they store information.
• Wikipedia arranges information in the form of articles.
• Freebase lists facts and statistics. Its list form is good not only for people who like to glance at facts, but also for people who want to use the data to build other Web sites and software. (Information in an article form can’t be reused in the same way.)
• Topics covered by Freebase include subjects that are too obscure for Wikipedia, which strives for notability appropriate to an encyclopedia.


Powerset

• Powerset is a Silicon Valley company.
• Goal: build a transformative consumer search engine based on Natural Language Processing (NLP).

Powerset Rationale

• Unlike conventional search engines that use keywords, Powerset reads and understands every sentence on a Web page and allows asking questions in plain English.

• Unique innovations in search are rooted in breakthrough technologies that take advantage of the structure and nuances of natural language.

• Using these advanced techniques, Powerset is building a large-scale search engine that breaks the confines of keyword search.

• By making search more natural and intuitive, Powerset is fundamentally changing how we search the Web, by delivering higher quality results.

Who proved Fermat’s last theorem?

What did Steve Jobs say about the iPod?

What did Bush say about Gore?

Powerlabs

• Powerlabs is a community where users can:
  – interact with demonstrations of Powerset’s technology before the search engine launches in 2008
  – give feedback to help improve the "Natural Language" indexing
  – suggest ideas for the ideal search engine.
• Involving users on such a scale and at such an early stage of development recognizes the potential of the wisdom of crowds to guide Powerset.

Powerlabs Sign In

Wiki Search Sneak Peek

• Access to the first open search box covering Wikipedia.
• Powerset uses linguistic analyses of both the query and Wikipedia to find the best matches.
• The Miniviewer allows users to view highlighted matches in the context of a Wikipedia article without ever having to leave the results page.
• By incorporating semantic information from Powerset’s indexing process into republished Wiki pages, internal page search enables a whole new kind of search: semantic-search-within-the-page.

Explore Wikipedia

Google acquire something

Google acquire company

Search Wikipedia

Companies acquired in 2001

Powerset PowerMouse

• PowerMouse is an application that provides a view into Powerset’s technology, letting users examine how structured information is extracted from open text.

• It is not intended as a search application per se, but allows users to search for and navigate through facts encoded in Powerset’s Wikipedia index.

• It shows in dramatic fashion how compactly large amounts of data can be organized and displayed based on a few semantic relationships.
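Queries such as "Google acquire something" can be read as patterns over subject–relation–object facts, with "something" acting as a wildcard. The following sketch shows that reading on a few hand-written facts. It is an assumption about the general technique, not a description of Powerset's index: the `Fact` type, the `match` function and the pre-normalized relation names ("acquire", "eat", "win") are all hypothetical.

```python
from typing import List, NamedTuple


class Fact(NamedTuple):
    subject: str
    relation: str  # assumed to be normalized already, e.g. "eats" -> "eat"
    obj: str


# A few hand-written sample facts (assumption), standing in for facts that
# would be extracted from open text.
FACTS: List[Fact] = [
    Fact("Google", "acquire", "YouTube"),
    Fact("rabbit", "eat", "carrot"),
    Fact("Marie Curie", "win", "Nobel Prize"),
]

WILDCARD = "something"


def _fits(pattern: str, value: str) -> bool:
    """A slot fits if the pattern is the wildcard or equals the value."""
    return pattern.lower() == WILDCARD or pattern.lower() == value.lower()


def match(subject: str, relation: str, obj: str,
          facts: List[Fact] = FACTS) -> List[Fact]:
    """Return every fact that fits the (subject, relation, object) pattern."""
    return [f for f in facts
            if _fits(subject, f.subject)
            and relation.lower() == f.relation.lower()
            and _fits(obj, f.obj)]
```

For example, `match("Google", "acquire", "something")` returns the single acquisition fact, and `match("something", "eat", "carrot")` returns the fact whose subject is "rabbit".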

PowerMouse Examples

Google acquire something

something eats carrot

person won nobel


Kango

• Vertical semantic search engine for personalized travel information.

• Goal: first step to deciding where to go, where to stay or what to do; finds the trip that is right for you.

Kango Rationale

• Kango indexes the collective wisdom on travel from the entire Web.

• Recommendations are based on a gestalt of voices heard in over 20 million reviews, ratings, blogs, journals, and articles collected from over a thousand sources such as Web sites, books and magazines.

• Organizes and presents the most relevant opinions and product details in a "federated" search display based on what’s known about travel preferences.


Kango Repository

• Kango has scoured the Web to collect all kinds of places to go, things to do and places to stay.

• It then analyzed and organized millions of travelers' opinions to enable search based on exact travel requirements and preferences.

• Kango brings together:
  – more than a thousand sites
  – 400,000 lodging, activity and destination options
  – 20 million reviews, ratings and blogs.

How Kango Works

Kango Semantics

• It provides many options for specifying a trip.
• Kango thinks about those options in terms of the "Long Tail" concept to help make the trips distinct and memorable.

• It "understands" the travel lingo, so it helps make informed decisions about what best fits specific travel preferences for each user.

• Kango is creating an ontology of global travel content that includes ranking of superlatives within review sites.

Lodging

Things to Do

Kango Dynamics

• Enables new ways of filtering through its collection to get the recommendations that are most relevant to preferences and priorities.

• Based on who the user is traveling with, the kind of destination sought, and what the user is likely to do, it sifts through its information to deliver the right getaway.
• For example, it returns:
  – one set of hotel and activity recommendations when traveling to Monterey for a romantic getaway
  – a different set when going to Monterey with the family to visit the aquarium and hang out on the beaches.
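The Monterey example can be sketched as filtering one collection by destination and ranking it by how well each entry fits the stated preferences. The listings, tags and scoring rule below are invented for illustration (assumptions), and Kango's real ranking, built from reviews and an ontology, is certainly richer than a tag overlap count.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Set


@dataclass(frozen=True)
class Listing:
    name: str
    destination: str
    tags: FrozenSet[str]  # in a real system these would come from review analysis


# Invented sample listings (assumption), not real Kango data.
LISTINGS: List[Listing] = [
    Listing("Quiet Garden Inn", "Monterey", frozenset({"romantic", "quiet"})),
    Listing("Beachside Family Suites", "Monterey", frozenset({"family", "beach"})),
    Listing("Aquarium Day Pass", "Monterey", frozenset({"family", "aquarium"})),
    Listing("City Loft", "San Francisco", frozenset({"romantic"})),
]


def recommend(destination: str, preferences: Set[str],
              listings: List[Listing] = LISTINGS) -> List[str]:
    """Rank listings at the destination by the number of matching tags."""
    scored = [(len(item.tags & preferences), item.name)
              for item in listings if item.destination == destination]
    # Drop listings with no matching tag; best score first, then by name.
    ranked = sorted((s for s in scored if s[0] > 0),
                    key=lambda s: (-s[0], s[1]))
    return [name for _, name in ranked]
```

The same destination yields different answers for different trips: `recommend("Monterey", {"romantic"})` returns only the inn, while `recommend("Monterey", {"family", "aquarium", "beach"})` returns the aquarium pass and the family suites.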

Old Monterey Inn

Campgrounds in Hawaii


SemantiNet

• SemantiNet is a startup, based in Tel Aviv, that is creating a revolutionary new technology based on Semantic Web concepts.

• Goal: leverage Web information in a meaningful way to improve the way users experience the Internet.

SemantiNet Rationale

• SemantiNet makes life easy by allowing users to take advantage of the variety and richness of information and services that exist on the Internet, but in a way that is simple, smart and intuitive.

• SemantiNet leverages Semantic Web concepts to seamlessly integrate information and services enabling users to achieve more while working less!

SemantiNet Repository

• SemantiNet collects relevant information from common social networks and established Web sites in order to provide users with an efficient, personalized and contextual browsing experience.
• Relevant personal information can be:
  – entered on their Web site
  – provided by users through use of SemantiNet
  – or extracted from "traffic data" generated by browser use.

SemantiNet Semantics

• Develops a semantic framework solution that allows for rapid deployment of Web mashups, applications and services, in a way that enhances the way people use the Internet.

• Rather than simply aggregating information, SemantiNet’s technology integrates information and mashes it up as needed.

• The idea is to bring the relevant online content to the user rather than the user to the content.

SemantiNet Demo

Example of Social Graph

Delver

• Delver (formerly Semingo) is headquartered in Herzeliya and will officially open U.S. offices in Silicon Valley in spring of 2008.

• Goal: provide a semantic search engine that allows users to search for information created and referenced by their own social graph.

Delver Rationale

• Delver provides a “connected search engine” that allows users to find content, media and people within their network via a simple search interface.

• Delver organizes and ranks content from the user’s network because social connections are critical for discovering more personally relevant information.

• It indexes the social Web (social networks, blogs, social applications, etc.), and cross-connects the data with users’ social graph.

• Improves the relevancy of Web search results by prioritizing these results based upon the specific searcher’s social network.

Delver Repository

• Delver begins by crawling the Web in order to map users’ social connections.
• It specifically indexes people's social connections on Flickr, MySpace, LinkedIn, YouTube, hi5, Facebook and Blogger, and more sites are being added all the time.
• Instead of just looking at a Web site's popularity, Delver looks at information like whether your friends have tagged the site or if it's found on their social network profiles, bookmarking sites, photo and video sharing sites, or on their blogs.
• The results are more relevant because they account for who a person is and what that person finds valuable.
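Ranking by social signals rather than popularity alone can be sketched as a sort key that puts endorsements from the searcher's own connections first. Everything here is a hypothetical illustration: the function name, the data shapes, the example URLs and people, and the tie-breaking by popularity are assumptions, not Delver's actual algorithm.

```python
from typing import Dict, List, Set


def rank_by_social_signal(results: List[str],
                          friends: Set[str],
                          tagged_by: Dict[str, Set[str]],
                          popularity: Dict[str, int]) -> List[str]:
    """Order results by how many of the searcher's friends tagged them.

    Ties fall back to global popularity, then to the URL itself so the
    order is deterministic.
    """
    def sort_key(url: str):
        endorsements = len(tagged_by.get(url, set()) & friends)
        return (-endorsements, -popularity.get(url, 0), url)

    return sorted(results, key=sort_key)


# Made-up example data (assumption).
RESULTS = ["a.example", "b.example", "c.example"]
POPULARITY = {"a.example": 90, "b.example": 50, "c.example": 10}
TAGGED_BY = {
    "a.example": {"zoe"},
    "b.example": {"dana"},
    "c.example": {"dana", "eli"},
}
```

Because the key depends on the searcher's friends, the same query produces different orderings: a user whose friends are dana and eli sees `c.example` first, while a user whose only friend is zoe sees `a.example` first.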

Liad Agmon

Venture Funding

Delver Semantics

• Delver knows who a user is and who his friends are even if users didn't import their address book or add their "Social Networking" profiles.

• Instead, Delver leverages the social graph to map out a user's social connections.

• Since everyone's social graph is unique, like a fingerprint, the same Delver query will yield significantly different results for each user – as reflected through the collective experiences of each person’s contacts.

• The results are more personal and meaningful to users than a generic search using a "normal" search engine.


Delver Dynamics

• When a user performs a query, results from all over his social Web are displayed.

• Even if a user and others are not directly related as "friends" on a social network, the plus sign beneath the picture can still be clicked to add them as a connection.

• This way, a user can view the relevant bookmarks, links, blog posts, photos, and videos of people like him even if he doesn’t know them personally... and they don't have to confirm the connection on their end.

• Alternately, a user can choose to exclude certain connections from his search results.

Roi Carthy

Visit New York


Radar Networks Twine

• Radar Networks, a pioneer of Semantic Web technology, introduced Twine.

• Goal: enable individuals and groups to organize, share and discover information and knowledge around their interests.

Twine Rationale

• Twine is a "knowledge networking" tool designated as a revolutionary Semantic Web application.
• It is a new service that helps organize, share and discover information about user interests, with networks of like-minded people.

• Twine can be used alone, with friends, groups and communities, or even in a company.

• It has aspects of social networking, wikis, blogging, knowledge management systems – but its defining feature is that it's built with Semantic Web technologies.

• It aims to bring a usable and scalable interface to the long-promised dream of the Semantic Web.

Twine Repository

• Using Twine, a user can:
  – add content via Wiki functionality (has many post types)
  – email content into the system
  – and "collect" something (as an object, e.g., a book object).
• Twine ties it all together:
  – As information is added to Twine, it is automatically tagged so that it can be easily found.
  – Users can connect with individuals and groups, gather and share content, and engage in discussions around interests.
  – Twine connects users with new people, content and products that match their interests, and also helps users discover other people and their contributions.
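Automatic tagging on ingest can be approximated, very roughly, by matching added text against a topic vocabulary and indexing the item under every tag that fires. The sketch below uses plain keyword matching as a stand-in for the semantic analysis Twine describes. The vocabulary, tag names and class are hypothetical.

```python
import re
from typing import Dict, List, Set

# Hypothetical vocabulary (assumption): tag -> words that trigger it.
VOCABULARY: Dict[str, Set[str]] = {
    "semantic-web": {"rdf", "owl", "sparql", "ontology"},
    "travel": {"hotel", "flight", "trip"},
}


def auto_tag(text: str) -> List[str]:
    """Return the sorted tags whose trigger words occur in the text."""
    words = set(re.findall(r"[a-z0-9]+", text.lower()))
    return sorted(tag for tag, triggers in VOCABULARY.items() if words & triggers)


class Library:
    """Stores items and indexes them by their automatically assigned tags."""

    def __init__(self) -> None:
        self._by_tag: Dict[str, List[str]] = {}

    def add(self, text: str) -> List[str]:
        tags = auto_tag(text)
        for tag in tags:
            self._by_tag.setdefault(tag, []).append(text)
        return tags

    def find(self, tag: str) -> List[str]:
        return list(self._by_tag.get(tag, []))
```

An item such as "Booked a hotel; reading about OWL on the flight" would be filed under both tags without the user labeling it by hand, so it can later be found from either one.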

Twine Semantics

• Twine is powered by semantic understanding.
• At first glance it is very much like Wikipedia, but there is a whole lot more smarts to the system.
• It's not based around socializing, but aims to share information and automatically organize it, learn about user interests, and make varied connections and recommendations.

• The more it is used, the better it understands the user interests and the more useful it becomes.

• It is a "Semantic Graph", which maps relationships to both people and topics.
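A graph that relates both people and topics can be sketched as a bipartite structure from which connections and recommendations fall out by traversal. This is a minimal illustration under assumptions: the class, its methods and the overlap-count scoring are hypothetical and say nothing about how Twine is actually implemented.

```python
from collections import defaultdict
from typing import DefaultDict, Dict, List, Set


class InterestGraph:
    """Bipartite graph linking people to the topics they are interested in."""

    def __init__(self) -> None:
        self._topics_of: DefaultDict[str, Set[str]] = defaultdict(set)
        self._people_of: DefaultDict[str, Set[str]] = defaultdict(set)

    def add_interest(self, person: str, topic: str) -> None:
        self._topics_of[person].add(topic)
        self._people_of[topic].add(person)

    def similar_people(self, person: str) -> List[str]:
        """Other people ordered by the number of shared topics, then by name."""
        shared: Dict[str, int] = defaultdict(int)
        for topic in self._topics_of.get(person, set()):
            for other in self._people_of[topic]:
                if other != person:
                    shared[other] += 1
        return sorted(shared, key=lambda p: (-shared[p], p))

    def recommend_topics(self, person: str) -> List[str]:
        """Topics the person lacks, scored by how many similar people have them."""
        mine = self._topics_of.get(person, set())
        scores: Dict[str, int] = defaultdict(int)
        for other in self.similar_people(person):
            for topic in self._topics_of[other] - mine:
                scores[topic] += 1
        return sorted(scores, key=lambda t: (-scores[t], t))
```

Every recorded interest adds an edge, so the more the graph is used the more paths exist between a user and new people or topics, which matches the slide's point that usefulness grows with use.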

Twine Sign In

Twine Dynamics

• Enables user commenting and viewing of related things.

• Allows sharing of tags.
• Enables import and export of users' own data.
• RSS feeds to track all kinds of things (topics, events, search, etc.).
• Semantic Web technologies are being used: RDF, OWL, SPARQL, XSL, GRDDL.
• An open platform - there will be SPARQL and REST APIs.

Welcome Steve to Twine

Explore Green Business and Investing

Steve Smith’s Twine

Explore Green Tech

Semantically up? :-)


Where does the MetaWeb fit!?


