Clustering WSDL Documents to Bootstrap the Discovery of Web Services Web Services Discovery

Upload: ronson1989

Post on 12-Nov-2014

DESCRIPTION

Reading ICWS2010 "Clustering WSDL Documents to Bootstrap the Discovery of Web Services"

TRANSCRIPT

Page 1: Clustering WSDL Documents to Bootstrap the Discovery of Web Services

Clustering WSDL Documents to Bootstrap the Discovery of Web Services

Web Services Discovery

Page 2: Clustering WSDL Documents to Bootstrap the Discovery of Web Services

Outline

Introduction Related Work Our Approach Experiments Conclusion

Page 3: Clustering WSDL Documents to Bootstrap the Discovery of Web Services

Introduction

Major providers have decided to publish their Web services (WS) through their own websites instead of public registries

[Figure: UDDI Business Registry vs. search engine, labeled 47% and 92% respectively]

Page 4: Clustering WSDL Documents to Bootstrap the Discovery of Web Services

Introduction

Problems with search engines

If the search query doesn't contain part of the service name exactly, the service may not be retrieved

Users may even miss services that use synonyms or variations of keywords (e.g. car -> vehicle)

Page 5: Clustering WSDL Documents to Bootstrap the Discovery of Web Services

Outline

Introduction Related Work Our Approach Experiments Conclusion

Page 6: Clustering WSDL Documents to Bootstrap the Discovery of Web Services

Related work

Richi Nayak (2008): uses the Jaccard coefficient to calculate the similarity between Web services, and provides the user with related search terms based on other users’ experiences with similar queries

Xin Dong (2004): Woogle, a Web services search engine that is capable of providing Web services similarity search. Does not adequately consider data types

Wei Liu (2009): applies text mining techniques to extract features such as service content, context, host name, and name from Web service description files in order to cluster Web services. The service context and service host name features offer little help in the clustering process
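As a reminder of what the Jaccard coefficient measures, here is a minimal sketch over two token sets. The token sets and the function name `jaccard` are hypothetical illustrations, not taken from the cited work.

```python
def jaccard(a, b):
    """Jaccard coefficient: size of the intersection over size of the union."""
    a, b = set(a), set(b)
    if not a and not b:
        return 0.0  # convention chosen for this sketch when both sets are empty
    return len(a & b) / len(a | b)


# Hypothetical token sets describing two services
service_a = {"weather", "forecast", "zipcode"}
service_b = {"weather", "forecast", "city"}
similarity = jaccard(service_a, service_b)  # 2 shared tokens out of 4 distinct tokens
```

The coefficient only counts exact token overlap, which is why synonym pairs such as car and vehicle contribute nothing to it.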

Page 7: Clustering WSDL Documents to Bootstrap the Discovery of Web Services

Outline

Introduction Related Work Our Approach Experiments Conclusion

Page 8: Clustering WSDL Documents to Bootstrap the Discovery of Web Services

Big picture

Page 9: Clustering WSDL Documents to Bootstrap the Discovery of Web Services

Features Extraction

Mine the WSDL documents to extract features that describe the semantics and behavior of the Web service:

1. WSDL content
2. WSDL types
3. WSDL messages
4. WSDL ports
5. Web service name

Page 10: Clustering WSDL Documents to Bootstrap the Discovery of Web Services

Features Extraction Process

Page 11: Clustering WSDL Documents to Bootstrap the Discovery of Web Services

Feature 1: WSDL Content

The token set Ti after each processing step:

Parsing WSDL: Ti = {types, message, weather, zipcode, web, forecast, forecasting, is, ...}

Tag removal: Ti = {weather, zipcode, web, forecast, forecasting, is, ...}

Word stemming: Ti = {weather, zipcode, web, forecast, is, ...}

Function word removal: Ti = {weather, zipcode, web, forecast, ...}

Content word recognition: Ti = {weather, zipcode, forecast, ...}
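The first four steps of this pipeline can be sketched as follows. This is only an illustration under several assumptions: the sample WSDL, the tag list `WSDL_TAGS`, the stop list `FUNCTION_WORDS`, and the suffix-stripping `naive_stem` (a crude stand-in for a real stemmer) are all hypothetical. Content word recognition is left out because it needs a word distance measure.

```python
import re
import xml.etree.ElementTree as ET

# Hypothetical, deliberately tiny vocabularies for illustration only
WSDL_TAGS = {"definitions", "types", "message", "part", "operation", "input",
             "output", "binding", "service", "port", "documentation"}
FUNCTION_WORDS = {"is", "a", "do", "by", "the", "of", "to"}

SAMPLE_WSDL = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/" name="WeatherForecast">
  <documentation>Weather forecasting by zipcode is a web service</documentation>
  <message name="GetWeather"/>
</definitions>"""


def unique(words):
    """Drop duplicates while keeping first-seen order."""
    return list(dict.fromkeys(words))


def parse_wsdl(wsdl_text):
    """Step 1: collect tag names, attribute values and text as lower-case words."""
    raw = []
    for elem in ET.fromstring(wsdl_text).iter():
        raw.append(elem.tag.split("}")[-1])   # strip the XML namespace prefix
        raw.extend(elem.attrib.values())
        raw.append(elem.text or "")
    # The regex also splits camelCase names such as "GetWeather"
    return unique(w.lower() for tok in raw for w in re.findall(r"[A-Za-z][a-z]*", tok))


def remove_tags(words):
    """Step 2: drop words that are WSDL tag names."""
    return [w for w in words if w not in WSDL_TAGS]


def naive_stem(word):
    """Step 3 helper: crude suffix stripping, not a real stemming algorithm."""
    for suffix in ("ing", "es", "s"):
        if word.endswith(suffix) and len(word) - len(suffix) >= 3:
            return word[: -len(suffix)]
    return word


def remove_function_words(words):
    """Step 4: drop function words such as 'is' and 'a'."""
    return [w for w in words if w not in FUNCTION_WORDS]


tokens = parse_wsdl(SAMPLE_WSDL)
tokens = remove_tags(tokens)
tokens = unique(naive_stem(w) for w in tokens)
tokens = remove_function_words(tokens)
```

On this sample, general words such as "web" and "get" survive the first four steps, which is the gap the content word recognition step is meant to close.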

Page 12: Clustering WSDL Documents to Bootstrap the Discovery of Web Services

Function word removal

Function words: is, a, do, ...

Content words: weather, zipcode, ...

Page 13: Clustering WSDL Documents to Bootstrap the Discovery of Web Services

Content word recognition

Apply the k-means clustering algorithm with k=2 on Ti

Use the Normalized Google Distance (NGD) as a featureless distance measure between words

[Figure: Ti is split into a Web-service-specific cluster and a non-Web-service-specific cluster, shown alongside a predefined cluster. Token sets shown: {weather, zip, zipcode, forecast, place}; {response, bind, data, post, port, target}; {runtime, bind, web, service, module, data, post}]
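NGD is defined from page counts as NGD(x, y) = (max(log f(x), log f(y)) - log f(x, y)) / (log N - min(log f(x), log f(y))), where f(x) is the number of pages containing x, f(x, y) the number containing both words, and N the index size. The sketch below computes it from made-up counts and then splits four words into two groups. Because NGD only gives pairwise distances, a simple two-medoid loop stands in for the k-means step here; that substitution, the counts, and all names are assumptions of this sketch.

```python
import math


def ngd(fx, fy, fxy, n):
    """Normalized Google Distance from page counts f(x), f(y), f(x, y) and index size N."""
    if fx <= 0 or fy <= 0 or fxy <= 0:
        return float("inf")  # the words never co-occur
    lx, ly, lxy = math.log(fx), math.log(fy), math.log(fxy)
    return (max(lx, ly) - lxy) / (math.log(n) - min(lx, ly))


# Hypothetical page counts; real values would come from a search engine
N = 10**6
SINGLE = {"weather": 1000, "forecast": 1000, "bind": 1000, "port": 1000}
PAIR = {
    frozenset(("weather", "forecast")): 500,
    frozenset(("bind", "port")): 500,
    frozenset(("weather", "bind")): 10,
    frozenset(("weather", "port")): 10,
    frozenset(("forecast", "bind")): 10,
    frozenset(("forecast", "port")): 10,
}


def dist(a, b):
    """Pairwise word distance backed by the hypothetical counts above."""
    if a == b:
        return 0.0
    return ngd(SINGLE[a], SINGLE[b], PAIR[frozenset((a, b))], N)


def two_medoids(words, iters=10):
    """Split words into two groups around medoids (a stand-in for k-means with k=2)."""
    # Start from the two words that are farthest apart
    medoids = list(max(((a, b) for a in words for b in words), key=lambda p: dist(*p)))
    for _ in range(iters):
        groups = [[], []]
        for w in words:
            groups[0 if dist(w, medoids[0]) <= dist(w, medoids[1]) else 1].append(w)
        # New medoid: the member with the smallest total distance to its group
        new = [min(g, key=lambda c: sum(dist(c, o) for o in g)) for g in groups]
        if new == medoids:
            break
        medoids = new
    return groups


clusters = two_medoids(["weather", "forecast", "bind", "port"])
```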

Page 14: Clustering WSDL Documents to Bootstrap the Discovery of Web Services

WSDL types, messages, ports

Page 15: Clustering WSDL Documents to Bootstrap the Discovery of Web Services

Features 2, 3, 4

Feature 2: WSDL Types (complexType)

The type attribute is a good candidate for describing the functionality of a service.

Feature 3: WSDL Messages

Feature 4: WSDL Ports
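A minimal sketch of pulling these three features out of a WSDL 1.1 document with the Python standard library. The sample document and the function name `extract_features` are hypothetical, and reading "ports" as `portType` elements is an assumption of this sketch.

```python
import xml.etree.ElementTree as ET

# Standard WSDL 1.1 and XML Schema namespaces
NS = {
    "wsdl": "http://schemas.xmlsoap.org/wsdl/",
    "xsd": "http://www.w3.org/2001/XMLSchema",
}

# Hypothetical, heavily trimmed WSDL document
SAMPLE_WSDL = """<definitions xmlns="http://schemas.xmlsoap.org/wsdl/"
    xmlns:xsd="http://www.w3.org/2001/XMLSchema" name="WeatherForecast">
  <types>
    <xsd:schema>
      <xsd:complexType name="WeatherForecasts"/>
    </xsd:schema>
  </types>
  <message name="GetWeatherByZipCodeSoapIn"/>
  <portType name="WeatherForecastSoap"/>
</definitions>"""


def extract_features(wsdl_text):
    """Return the names of complex types, messages and port types."""
    root = ET.fromstring(wsdl_text)
    return {
        # complexType elements can sit anywhere below <types>
        "types": [e.get("name") for e in root.iterfind(".//xsd:complexType", NS)
                  if e.get("name")],
        # messages and port types are direct children of <definitions>
        "messages": [e.get("name") for e in root.iterfind("wsdl:message", NS)],
        "ports": [e.get("name") for e in root.iterfind("wsdl:portType", NS)],
    }


features = extract_features(SAMPLE_WSDL)
```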

Page 16: Clustering WSDL Documents to Bootstrap the Discovery of Web Services

Feature 5: Web Service Name

We consider the Web service name used in the URI of the WSDL document

http://www.webservicex.net/WeatherForecast.Asmx?WSDL

The name of the Web service is "Weather Forecast"
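One way to derive the name from the URI is to take the last path segment, drop its extension, and split the camel-case words. The function name `service_name_from_uri` and the splitting rule are assumptions made for this sketch.

```python
import re
from pathlib import PurePosixPath
from urllib.parse import urlparse


def service_name_from_uri(uri):
    """Take the last path segment without its extension and split camel-case words."""
    stem = PurePosixPath(urlparse(uri).path).stem
    # Acronym runs, capitalized or lower-case words, and digit runs
    words = re.findall(r"[A-Z]+(?![a-z])|[A-Z]?[a-z]+|\d+", stem)
    return " ".join(words)


name = service_name_from_uri("http://www.webservicex.net/WeatherForecast.Asmx?WSDL")
```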

Page 17: Clustering WSDL Documents to Bootstrap the Discovery of Web Services

Features Integration and clustering

We use the Quality Threshold (QT) clustering algorithm to cluster similar Web services based on the five similarity features presented above.

Similarity factor between web service si and sj
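The sketch below shows how a combined similarity factor could feed QT clustering. QT repeatedly builds, for every remaining item, the largest candidate cluster whose diameter stays within a threshold, keeps the biggest candidate, and removes its members. The equal weights, the per-feature similarity scores, the service names, and the threshold are all hypothetical; the weighting actually used in the slides is not captured in this transcript.

```python
# Assumption: equal weights over the five features
WEIGHTS = [0.2, 0.2, 0.2, 0.2, 0.2]

# Hypothetical per-feature similarities: content, types, messages, ports, name
FEATURE_SIMS = {
    frozenset(("weather1", "weather2")): [0.9, 0.8, 0.7, 0.8, 1.0],
    frozenset(("weather1", "currency1")): [0.1, 0.0, 0.1, 0.0, 0.0],
    frozenset(("weather2", "currency1")): [0.1, 0.0, 0.1, 0.0, 0.0],
}


def distance(si, sj):
    """Turn the weighted similarity factor into a distance in [0, 1]."""
    sims = FEATURE_SIMS[frozenset((si, sj))]
    theta = sum(w * s for w, s in zip(WEIGHTS, sims))
    return 1.0 - theta


def qt_cluster(items, dist, max_diameter):
    """Quality Threshold clustering with a complete-linkage diameter."""
    remaining = list(items)
    clusters = []
    while remaining:
        best = []
        for seed in remaining:
            candidate = [seed]
            others = [x for x in remaining if x != seed]
            while others:
                # Pick the item that keeps the candidate's diameter smallest
                nxt = min(others, key=lambda x: max(dist(x, c) for c in candidate))
                if max(dist(nxt, c) for c in candidate) > max_diameter:
                    break
                candidate.append(nxt)
                others.remove(nxt)
            if len(candidate) > len(best):
                best = candidate
        clusters.append(best)
        remaining = [x for x in remaining if x not in best]
    return clusters


clusters = qt_cluster(["weather1", "weather2", "currency1"], distance, max_diameter=0.5)
```

Unlike k-means, QT does not need the number of clusters up front; the diameter threshold controls how tight each cluster must be.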

Page 18: Clustering WSDL Documents to Bootstrap the Discovery of Web Services

Outline

Introduction Related Work Our Approach Experiments Conclusion

Page 19: Clustering WSDL Documents to Bootstrap the Discovery of Web Services

Experiments

Two criteria

Precision: exactness

Recall: completeness
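For a single cluster compared against a manually labeled class, the two criteria can be computed as in this sketch. It uses the standard set-based definitions; the service identifiers are hypothetical, and the exact formulas used in the experiments are not shown in this transcript.

```python
def precision_recall(cluster, reference):
    """Precision: share of the cluster that belongs to the class.
    Recall: share of the class that the cluster recovered."""
    cluster, reference = set(cluster), set(reference)
    hits = len(cluster & reference)
    precision = hits / len(cluster) if cluster else 0.0
    recall = hits / len(reference) if reference else 0.0
    return precision, recall


# Hypothetical example: one misplaced service, two missed services
produced = {"w1", "w2", "w3", "c1"}
manual_weather_class = {"w1", "w2", "w3", "w4", "w5"}
precision, recall = precision_recall(produced, manual_weather_class)
```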

Page 20: Clustering WSDL Documents to Bootstrap the Discovery of Web Services

Experiments

400 online Web services

Manual classification serves as a comparison point for clustering algorithms

Categories: "Currency exchange", "Weather", "Address validation", "E-mail verification", and "Credit card services"

Page 21: Clustering WSDL Documents to Bootstrap the Discovery of Web Services

Results

High Precision and Recall

Page 22: Clustering WSDL Documents to Bootstrap the Discovery of Web Services

Outline

Introduction Related Work Our Approach Experiments Conclusion

Page 23: Clustering WSDL Documents to Bootstrap the Discovery of Web Services

Conclusion

We propose an approach to improve service discovery of non-semantic Web services by clustering similar services through mining WSDL documents

Future work: we plan to improve feature integration by choosing optimized weights for each feature using a linear programming approach

Page 24: Clustering WSDL Documents to Bootstrap the Discovery of Web Services

Thanks