daniel · matera mashups · fig. 5.3 model of a restful web service delivering representations of...
TRANSCRIPT
1
Daniel · Matera
Data-Centric Systems and ApplicationsMichael J. Carey · Stefano Ceri Series Editors
Florian Daniel · Maristella Matera
Mashups Concepts, Models and Architectures
Data-Centric Systems and Applications
MashupsFlorian DanielMaristella Matera
Computer Science
DCSA
Mashups Concepts, Models
and Architectures
Mashups have emerged as an innovative soft ware trend that re-interprets existing Web building blocks and leverages the composition of individual components in novel, value-adding ways. Additional appeal also derives from their potential to turn non-programmers into developers.
Daniel and Matera have written the fi rst comprehensive reference work for mashups. Th ey systematically cover the main concepts and techniques underlying mashup de-sign and development, the synergies among the models involved at diff erent levels of abstraction, and the way models materialize into composition paradigms and archi-tectures of corresponding development tools. Th e book deliberately takes a balanced approach, combining a scientifi c perspective on the topic with an in-depth view on relevant technologies. To this end, the fi rst part of the book introduces the theoretical and technological foundations for designing and developing mashups, as well as for designing tools that can aid mashup development. Th e second part then focuses more specifi cally on various aspects of mashups. It discusses a set of core component tech-nologies, core approaches, and architectural patterns, with a particular emphasis on tool-aided mashup development exploiting modeldriven architectures. Development processes for mashups are also discussed, and special attention is paid to composition paradigms for the end-user development of mashups and quality issues.
Overall, the book is of interest to a wide range of readers. Students, lecturers, and researchers will fi nd a comprehensive overview of core concepts and technological foundations for mashup implementation and composition. Even without low-level coding details, practitioners like soft ware architects will fi nd guidance on key imple-mentation concepts, architectural patterns, and development tools and approaches. A related website provides additional teaching material which can be used either as part of a course or for self study.
This book is timely, provides a through scientific investigation and also has prac-tical relevance in the general area of composition and mashups. It is of particular interest to researchers and professionals wishing to learn about relevant concepts and techniques in service mashups, composition, and end-user programming. From the Preface by Boualem Benatallah, University of New South Wales, Sydney
9 7 8 3 6 4 2 5 5 0 4 8 5
ISBN 978-3-642-55048-5
Chapter 5 Mashup Components Figures
© Copyright 2014 by F. Daniel and M. Matera. Reproduction for classroom use and teaching allowed if source is properly cited.
104 5 Mashup Components
the development of mashups is a component like the one modeled in Figure5.1, which supports the reuse of either data or business logic (via a dedicatedoperation), a piece of user interface, or both.
Component
Operation User interface
0..N{or}
0..1
Fig. 5.1 The most basic component model consists of either one operation that canbe invoked or of a piece of UI that can be rendered, or of both.
Whether a component supports just operations, also UIs, or only UIs, aswell as which kinds of operations (e.g., synchronous vs. asynchronous opera-tions), which UI formatting logic, etc. depends on the individual componenttechnology chosen to implement the component. We will see in the follow-ing sections how each component technology expands this basic componentmodel, implementing di↵erent logics and interaction paradigms.
The conceptual models we use in this chapter aim to highlight commonalitiesand di↵erences of the various component technologies. Part of this goal is alsothe establishment of a common terminology to describe the technologies. Weparticularly focus on the core, underlying concepts and principles and onthose aspects that a↵ect their use, especially in the light of the integrationneeds of mashups. As such, the models presented in this chapter provide aconceptual, usage-oriented view without the claim of completeness regardinglow-level technicalities or component-internal properties. The models shouldalso not be interpreted as meta-models of component description languages,such as WSDL [75] or WADL [135], a topic we also touch in this chapter.
5.2.2 Component characteristics
Given the large variety of available component technologies, it is not trivialto bring them all under one hood. This exercise requires identifying a set ofcharacteristic dimensions that allow us to compare technologies on a commonground. Next, we introduce a set of dimensions that we think allow us to ad-equately characterize component technologies, while in the following sectionswe use them to actually describe concrete technologies.
© Copyright 2014 by F. Daniel and M. Matera. Reproduction for classroom use and teaching allowed if source is properly cited.
5.3 Logic Components 113
Message
Request-response operation
Solicit-response operation
Schema
complies with
1..N
0..1 0..1 0..1 0..1
has fault
Web service
Name
Endpoint
Protocol binding
Operation
Name
has outputhas input
has input
has output
1..N
The service's
business protocol
specifies the order
in which operations
can be invoked.
For synchronous
communications
For asynchronous
communications
One-way operation
Notification operation
has input0..1 0..1 has output
Fig. 5.2 Conceptual model of a web service consisting of a set of four di↵erent typesof message-based operations.
The former two operations provide support for synchronous transmissionstyles, the latter two for asynchronous styles. The semantics of each operationis decided by the web service (by its developer) and can usually be derivedfrom an operation’s name and input/output messages. Each of the messagessent by either the web service or its client complies with a schema that isgiven by the web service and describes how data are to be formatted to becompatible with the web service. We use the term schema, as the payload ofthe messages is typically an XML dialect (such as SOAP itself). That is, themedia type of the service is discrete; streaming is not supported (if not inthe form of sequences of discrete events, i.e., multiple notifications).
Regarding the instantiation model of web services, both stateless and state-ful models are supported and only depend on the web service’s internal im-plementation. Stateless web services, though, are not able to take actionautonomously, that is, they do not support solicit-response and notificationoperations. Stateful web services support all four types of operations andmay require the addition of correlation information to the input messages ofrequest-response and one-way operations, in order to tell which instance ofthe web service the specific invocation is referring to (in the presence of state-ful components, we always must assume there are multiple instances runningin parallel on the same server).
© Copyright 2014 by F. Daniel and M. Matera. Reproduction for classroom use and teaching allowed if source is properly cited.
5.3 Logic Components 115
Get operation
Delete operation
Post operation
Put operation
RESTful web service
Name
Entry point
Representation
Media type
1..N1..N
complies with1..N
LinkSchema
Resource
NameURL
HTTP operation
HTTP status code
renders
contains
references
0..N
1..4
supports
producesconsumes
0..1
produces
0..1
0..N
reads
creates
updates
deletes
Message
complies with
0..N
0..N
The business protocol is
discovered incrementally by
navigating links to resources.
0..1
0..1
Fig. 5.3 Model of a RESTful web service delivering representations of and manipu-lating resources in response to standard HTTP requests.
The best example of RESTful web services is, interestingly, a common webapplication. It typically manages a set of resources in its internal databaseand business logic. What is visible from the outside are the HTML pages sentfrom the server to the client; these are the representations of the resources.The pages contain hyperlinks, which allow the user to navigate from one pageto another, carrying over state information either in the query of the linkitself (which corresponds to an HTTP Get operation) or in the body of therequest sent to the client (which corresponds to an HTTP Post operation). ARESTful web service is very similar to a web application, with the di↵erencethat it is not oriented toward human users but machines and, therefore, itsresponses are rather formatted in XML or JSON, instead of HTML.
Figure 5.3 illustrates our conceptual view on RESTful web services, whichwe identify by a name and an entry point (typically, a URL). As explained inthe example, a RESTful service is composed of two key ingredients: resourcesand representations. The resources are the actual assets managed and madeavailable by the service. A flight booking service manages, for example, flights,bookings, payments, and customers. Which exact resources a RESTful webservice uses internally we do not really know. What we know from the outsideare the representations of the resources that are made accessible via the
© Copyright 2014 by F. Daniel and M. Matera. Reproduction for classroom use and teaching allowed if source is properly cited.
118 5 Mashup Components
JavaScript file
NameURL
JavaScript API
NameHomepage
0..N
has
1..N
deployed as
0..N
Object
Name
Event
Function
0..N
0..N
Parameter
NameType
has 0..N
0..N consumes
declared in
1..N
produces
Event handler consumes
0..Nhas
0..N
For asynchronous
communications
produces
For synchronous
communications
User interface
0..1 0..N
controls
Fig. 5.4 Model of a typical JavaScript API. The gray shaded entity is not part of theactual component model; it tells which artifacts the developer must deploy, in orderto be able to use the library. Pure logic components do not have a user interface.
well-known JQuery library for fast DOM processing and dynamically format-ting HTML inside the browser is a good example of a library.
JavaScript libraries are also a good means to overcome lacking or non-standard browser implementations of JavaScript APIs. For instance, sup-port by the various web browsers for the new features introduced in HTML5, i.e., for some of its APIs, is still weak or non-standard across browsers.This fact has, for instance, led to the development of Modernizr (http://modernizr.com/), a very powerful JavaScript library that is able totell which HTML 5 and CSS 3 features are supported by a browser and todynamically load JavaScript libraries to backfill missing functionalities.
JavaScript APIs and libraries are components that are local to the client(the web browser) and that may require installation. Typically, browser APIsare built-in and directly available in the browser’s JavaScript runtime envi-ronment. Libraries, instead, need installation on the client, so as to be avail-able in the JavaScript runtime environment. In Figure 5.4, we highlight client-side artifacts graphically with a gray shaded entity, in order to distinguishthem from the actual component model. In practice, the the installation ofthe JavaScript library requires either downloading the respective JavaScriptfile form the libraries home page and installing it locally on the own webserver or it is enough to simply link the file via its URL. Once installed, thereis no di↵erence between APIs and libraries.
An API comes in the form of one or more JavaScript objects (a librarymay optionally also have an own UI; we treat this case when discussing userinterface components). The objects have functions that allow invocations vialocal procedure calls (actually, function calls), but, given JavaScripts nativeevent-based flavor, objects may also handle and launch events for event-basedcommunications. This means that JavaScript APIs/libraries support both
© Copyright 2014 by F. Daniel and M. Matera. Reproduction for classroom use and teaching allowed if source is properly cited.
122 5 Mashup Components
<rss version="2.0"><channel><title>Liftoff News</title><link>http://liftoff.msfc.nasa.gov/</link><description>Liftoff to Space Exploration.</description>...<item>
<title>Star City</title><link>http://liftoff.msfc.nasa.gov/news/2003/news-starcity.asp</link><description>
How do Americans get ready to work with Russians aboard theInternational Space Station? They take a crash course in culture,language and protocol at Russia’s<a href="http://howe.iki.rssi.ru/GCTC/gctc_e.htm">Star City</a>.
</description><pubDate>Tue, 03 Jun 2003 09:39:21 GMT</pubDate><guid>http://liftoff.msfc.nasa.gov/2003/06/03.html#item573</guid>
</item>...
</channel></rss>
Fig. 5.5 An excerpt of the RSS 2.0 news example by the RSS Advisory Board(http://www.rssboard.org/files/sample-rss-2.xml).
RSS feed
Name
Entry point
Representation
MediaType {="application/rss+xml"}
Schema {="RSS Specification"}
renders
Get operationsupports
produces
reads
The media type
and the schema are
fixed by the RSS
specification.
Resource
Name
URL
0..1
Fig. 5.6 The model of an RSS feed is simplified version of the model we presentedfor RESTful web services (see Figure 5.3).
description are required elements, while language, publication date, image,and similar are optional elements.
The distribution convention is that the providers of a publicly availablefeed allow consumers of their feed to freely access and redistribute it and toinclude it in the consumers’ own web sites without requiring them to ask forany permission. Commonly, RSS feeds do therefore not provide complete con-tent in their items (e.g., a full newspaper article) and rather publish the titleand an excerpt of it and link the reader to the provider’s own web site (e.g.,the newspaper web site), which may have its own content reuse and licensingrules. For example, the listing in Figure 5.5 shows a self-explaining excerptof the RSS example provided by the RSS Advisory Board, the group of peo-
© Copyright 2014 by F. Daniel and M. Matera. Reproduction for classroom use and teaching allowed if source is properly cited.
122 5 Mashup Components
<rss version="2.0"><channel><title>Liftoff News</title><link>http://liftoff.msfc.nasa.gov/</link><description>Liftoff to Space Exploration.</description>...<item>
<title>Star City</title><link>http://liftoff.msfc.nasa.gov/news/2003/news-starcity.asp</link><description>
How do Americans get ready to work with Russians aboard theInternational Space Station? They take a crash course in culture,language and protocol at Russia’s<a href="http://howe.iki.rssi.ru/GCTC/gctc_e.htm">Star City</a>.
</description><pubDate>Tue, 03 Jun 2003 09:39:21 GMT</pubDate><guid>http://liftoff.msfc.nasa.gov/2003/06/03.html#item573</guid>
</item>...
</channel></rss>
Fig. 5.5 An excerpt of the RSS 2.0 news example by the RSS Advisory Board(http://www.rssboard.org/files/sample-rss-2.xml).
RSS feed
Name
Entry point
Representation
MediaType {="application/rss+xml"}
Schema {="RSS Specification"}
renders
Get operationsupports
produces
reads
The media type
and the schema are
fixed by the RSS
specification.
Resource
Name
URL
0..1
Fig. 5.6 The model of an RSS feed is simplified version of the model we presentedfor RESTful web services (see Figure 5.3).
description are required elements, while language, publication date, image,and similar are optional elements.
The distribution convention is that the providers of a publicly availablefeed allow consumers of their feed to freely access and redistribute it and toinclude it in the consumers’ own web sites without requiring them to ask forany permission. Commonly, RSS feeds do therefore not provide complete con-tent in their items (e.g., a full newspaper article) and rather publish the titleand an excerpt of it and link the reader to the provider’s own web site (e.g.,the newspaper web site), which may have its own content reuse and licensingrules. For example, the listing in Figure 5.5 shows a self-explaining excerptof the RSS example provided by the RSS Advisory Board, the group of peo-
© Copyright 2014 by F. Daniel and M. Matera. Reproduction for classroom use and teaching allowed if source is properly cited.
5.4 Data Components 123
Get operation
Delete operation
Post operation
Put operation
Atom feed
Name
Entry point
Representation
MediaType {="application/atom+xml"}
Schema {="Atom Syndication Format"}
1..N1..N
Link
Resource
NameURL
HTTP operation
HTTP status code
renders
contains
references0..N
1..N
supportsproduces
consumes
0..1
produces
0..N
reads
creates
updates
deletes
Message
Schema {="Atom
Syndication Format"}
The media type and the schema
are fixed by the Atom Syndication
Format standard.
0..1
The business protocol is regulated by the
Atom Publishing Protocol specification
Fig. 5.7 The model of an Atom feed including the features specified in the AtomPublishing Protocol is that of a RESTful web service with schema and media typerestrictions.
ple that publishes and maintains the RSS specifications (there are di↵erentversions of the specification).
We describe the model of RSS feeds in Figure 5.6. An RSS feed is essen-tially a RESTful web service that has one resource (the channel) and thatimplements only one of the standard HTTP operations, i.e., the Get opera-tion. This operation allows the consumer of a feed to obtain the representationof the channel, which is formatted according to the RSS specification. Thereis an own media type for RSS feeds (application/rss+xml). We omit theHTTP status code entity from the model, since status codes are not managedby the RSS feed itself; the codes a consumer of the feed gets are generatedby the web server.
Therefore, all the properties that hold for RESTful web services also holdfor RSS feeds, except those that refer to the incremental discovery of the ser-vice’s protocol (in RSS the links are only links referencing external resources,not own resources of the feed), the use of operations (we only have the Getoperation), and the media type and schema (which are fixed for RSS feeds).
© Copyright 2014 by F. Daniel and M. Matera. Reproduction for classroom use and teaching allowed if source is properly cited.
126 5 Mashup Components
The interactive point-and-click data extraction canvas allowing the user to select and deselect HTML elements for data extraction
A preview of the data that can be extracted from the source web page with the user's current selection
The list of identified and named data items to be extracted
Overview of the steps of the data extraction process
Selected content to be extracted
Fig. 5.8 Screen shot of the Dapper content extraction tool in action: interactiveextraction of data from the New York Times web site and publication as RSS feed.
current selection and add it to the list of data items to be extracted (atthe right hand side in the figure). Once all data items of interest have beenidentified and named, Dapper allows the user to view a preview of the finalresult and to publish the extracted content in a variety of formats, e.g., asXML, RSS or even as Google gadget (see Section 5.5.3).
5.4.5 Micro-formats and linked data
Recognizing the value of the data that is published inside common HTML-formatted web pages, the lack of adequate data structures in HTML markup(allowing, for instance, a crawler to extract data from web pages), and thedi�culty of providing suitable data components for all data published onthe Web, a compromise emerged: micro-formats. Micro-formats [12] extend
© Copyright 2014 by F. Daniel and M. Matera. Reproduction for classroom use and teaching allowed if source is properly cited.
5.4 Data Components 127
<div class="vcard"><div class="fn n">
<span class="family-name">Daniel</span><span class="given-name">Florian</span>
</div><div class="org">University of Trento</div><span class="street-address">Via Sommarive 5</span>,<span class="postal-code">38123</span><span class="locality">Povo (TN)</span>,<span class="country-name">Italy</span>
</div>
Fig. 5.9 An excerpt of HTML markup annotated with the hCard micro-format forannotating contact details. hCard uses class names to identify its elements.
HTML with annotation constructs that superimpose data structures andmeaning on top of HTML-formatted data elements, easing the extractionof data items from web pages. That is, micro-formats extend HTML withmeta-data that can be used to guide the data extraction process.
There are at least three di↵erent techniques to define micro-formats (actu-ally, the name “micro-format” is specific only to the first of these, but we usethe term to refer to all of the specifications collectively): micro-formats, anopen community e↵ort for the definition of open standards for micro-formats(http://microformats.org/); microdata, a set of annotation tags intro-duced by the W3C witch HTML 5 [145]; and RDF annotations, an annotationtechnique based on a lightweight RDF annotation format (RDFa) [4].
Many di↵erent annotation vocabularies, i.e., reference specifications thatassign meaning (semantics) to annotations, exist, e.g., for events, organiza-tions, people, products, movies, and the like. Schema.org (http://schema.org/) is, for instance, an initiative by Bing, Google, Yahoo! and Yandex thataims to provide a set of reference vocabularies, but developers are also freeto use their very own vocabularies.
The HTML markup code in Figure 5.9, for instance, has been anno-tated with the hCard micro-format (http://microformats.org/wiki/hcard), which is tailored to the annotation of contact information embed-ded in common web pages. The micro-format is a one-to-one representationof the vCard business card format defined by the IETF [101]. As the exam-ple shows, the micro-format does not alter the actual HTML formatting ofthe content and instead uses class names to identify data items and to givemeaning to them, according to the hCard vocabulary.
Another way of publishing data on the Web that is currently gaining mo-mentum is linked data [39, 137, 44] (also called Linking Open Data, LOD).The idea of linked data is to create a network of interlinked data items, inwhich related items refer to each other via links that can be navigated andthat allow one to easily find related data, given one data item. Data items (orthings) are identified by URIs, HTTP URIs dereference things, dereferencedthings are described in RDF [236], and the RDF description contains links toother things on the Web. RDFa [4] provides for the definition of linked data
© Copyright 2014 by F. Daniel and M. Matera. Reproduction for classroom use and teaching allowed if source is properly cited.
128 5 Mashup Components
As
of S
epte
mbe
r 20
11
Mus
icBra
inz
(zitgi
st)
P20
Turism
o de
Zar
agoz
a
yovi
sto
Yaho
o!
Geo
Plan
et
YAG
O
Wor
ld
Fact
-bo
ok
El
Via
jero
Tour
ism
Wor
dNet
(W
3C)
Wor
dNet
(V
UA)
VIV
O U
F
VIV
O
Indi
ana
VIV
O
Cor
nell
VIA
F
URI
Bur
ner
Sus
sex
Rea
ding
Li
sts
Plym
outh
Rea
ding
Li
sts
Uni
Ref
Uni
Prot
UM
BEL
UK P
ost-
code
s
legi
slat
ion
data
.gov
.uk
Ube
rblic
UB
Man
n-he
im
TWC L
OG
D
Twar
ql
tran
spor
tda
ta.g
ov.
uk
Traf
fic
Sco
tlan
d
thes
es.
fr
Thes
au-
rus
W
totl.n
et
Tele
-gr
aphi
s
TCM
Gen
eD
IT
Taxo
nCon
cept
Ope
n Li
brar
y (T
alis
)
tags
2con
de
licio
us
t4gm info
Sw
edis
h O
pen
Cul
tura
l H
eritag
e
Sur
ge
Rad
io
Sud
oc
STW
RAM
EAU
SH
stat
istics
data
.gov
.uk
St.
And
rew
s Res
ourc
e Li
sts
ECS
Sou
th-
ampt
on
EPrint
s
SSW
Th
esau
rus
Sm
art
Link
Slid
esha
re2R
DF
sem
antic
web
.org
Sem
antic
Twee
t
Sem
antic
XBRL
SW
Dog
Fo
od
Sou
rce
Cod
e Ec
osys
tem
Li
nked
Dat
a
US S
EC
(rdf
abou
t)
Sea
rs
Sco
tlan
d G
eo-
grap
hy
Sco
tlan
dPu
pils
&Ex
ams
Sch
olar
o-m
eter
Wor
dNet
(R
KB
Expl
orer
)
Wik
i
UN
/LO
CO
DE
Ulm
ECS
(RKB
Expl
orer
)
Rom
a
RIS
KS
RES
EX
RAE2
001
Pisa
OS
OAI
NSF
New
-ca
stle
LAAS
KIS
TI
JISC
IRIT
IEEE
IBM
Euré
com
ERA
ePrint
sdo
tAC
DEP
LOY
DBLP
(R
KB
Expl
orer
)
Crim
e Rep
orts
U
K
Cou
rse-
war
e
CO
RD
IS
(RKB
Expl
orer
)CiteS
eer
Bud
apes
t
ACM
ries
e
Rev
yu
rese
arch
data
.gov
.uk
Ren
. En
ergy
G
ener
a-to
rs
refe
renc
eda
ta.g
ov.
uk
Rec
ht-
spra
ak.
nl
RD
Foh
loh
Last
.FM
(r
dfiz
e)
RD
F Boo
k M
ashu
p
Råd
ata
nå!
PSH
Prod
uct
Type
s O
ntol
ogy
Prod
uct
DB
PBAC
Poké
-pé
dia
pate
nts
data
.go
v.uk
Ox
Poin
ts
Ord
-na
nce
Sur
vey
Ope
nly
Loca
l
Ope
n Li
brar
y
Ope
nCyc
Ope
n Cor
po-
rate
s
Ope
nCal
ais
Ope
nEI
Ope
n El
ection
D
ata
Proj
ect
Ope
nD
ata
Thes
au-
rus
Ont
os
New
s Po
rtal
OG
OLO
D
Janu
sAM
P
Oce
an
Drilli
ng
Cod
ices
New
Yo
rk
Tim
es
NVD
ntnu
sc
NTU
Res
ourc
e Li
sts
Nor
we-
gian
M
eSH
ND
L su
bjec
ts
ndln
a
my
Expe
ri-
men
t
Ital
ian
Mus
eum
s
med
u-ca
tor
MARC
Cod
es
List
Man
-ch
este
r Rea
ding
Li
sts
Lotico
Wea
ther
Sta
tion
s
Lond
on
Gaz
ette
LOIU
S
Link
ed
Ope
n Col
ors
lobi
dRes
ourc
es
lobi
dO
rgan
i-sa
tion
s
LEM
Link
edM
DB
Link
edL
CCN
Link
edG
eoD
ata
Link
edCT
Link
edU
ser
Feed
back
LOV
Link
ed
Ope
n N
umbe
rs
LOD
E
Euro
stat
(O
ntol
ogy
Cen
tral
)
Link
ed
EDG
AR
(Ont
olog
yCen
tral
)
Link
ed
Cru
nch-
base
lingv
oj
Lich
field
Spe
n-di
ng
LIBRIS
Lexv
o
LCSH
DBLP
(L
3S)
Link
ed
Sen
sor
Dat
a (K
no.e
.sis
)
Kla
pp-
stuh
l-cl
ub
Goo
d-w
in
Fam
ily
Nat
iona
l Rad
io-
activi
ty
JP
Jam
endo
(D
Btu
ne)
Ital
ian
publ
ic
scho
ols
ISTA
T Im
mi-
grat
ion
iSer
ve
IdRef
Sud
oc
NSZL
Cat
alog
Hel
leni
c PD
Hel
leni
c FB
D
Pied
mon
tAcc
omo-
dation
s
Gov
Trac
k
Gov
WIL
D
Goo
gle
Art
w
rapp
ergnos
s
GES
IS
Geo
Wor
dN
et
Geo
Spe
cies
Geo
Nam
es
Geo
Link
edD
ata
GEM
ET
GTA
A
STI
TCH
SID
ER
Proj
ect
Gut
en-
berg
Med
iCar
e
Euro
-st
at
(FU
B)
EURES
Dru
gBan
k
Dis
ea-
som
e
DBLP
(F
U
Ber
lin)
Dai
lyM
ed
CO
RD
IS(F
UB)
Free
base
flick
r w
rapp
r
Fish
es
of T
exas
Finn
ish
Mun
ici-
palit
ies
ChE
MBL
FanH
ubz
Even
tM
edia
EUTC
Pr
oduc
-tion
s
Euro
stat
Euro
pean
a
EUN
IS
EU
Inst
i-tu
tion
s
ESD
st
an-
dard
s
EARTh
Enip
edia
Popu
la-
tion
(En
-AKTi
ng)
NH
S(E
n-AKTi
ng)
Mor
talit
y(E
n-AKTi
ng)
Ener
gy
(En-
AKTi
ng)
Crim
e(E
n-AKTi
ng)
CO
2 Em
issi
on(E
n-AKTi
ng)
EEA
SIS
VU
educ
atio
n.da
ta.g
ov.u
k
ECS
Sou
th-
ampt
on
ECCO
-TC
P
GN
D
Did
acta
lia
DD
CD
euts
che
Bio
-gr
aphi
e
data
dcs
Mus
icBra
inz
(DBTu
ne)
Mag
na-
tune
John
Pe
el
(DBTu
ne)
Cla
ssic
al
(DB
Tune
)Aud
ioScr
obbl
er
(DBTu
ne)
Last
.FM
ar
tist
s (D
BTu
ne)
DB
Trop
es
Port
u-gu
ese
DBpe
dia
dbpe
dia
lite
Gre
ek
DBpe
dia
DBpe
dia
data
-op
en-
ac-u
k
SM
CJo
urna
ls
Poke
dex
Airpo
rts
NASA
(Dat
a In
cu-
bato
r)
Mus
icBra
inz
(Dat
aIn
cuba
tor)
Mos
eley
Fo
lk
Met
offic
e W
eath
er
Fore
cast
s
Dis
cogs
(D
ata
Incu
bato
r)
Clim
bing
data
.gov
.uk
inte
rval
s
Dat
a G
ov.ie
data
bnf.fr
Cor
nett
o
reeg
le
Chr
onic
-lin
g Am
eric
a
Che
m2
Bio
2RD
FCal
ames
busi
ness
data
.gov
.uk
Brick
link
Bra
zilia
n Po
li-tici
ans
BN
B
Uni
STS
Uni
Path
way
Uni
Parc
Taxo
nom
yU
niPr
ot(B
io2R
DF)
SG
D
Rea
ctom
e
PubM
edPu
bChe
m
PRO
-SIT
EPr
oDom
Pfam
PDB
OM
IMM
GI
KEG
G
Rea
ctio
n
KEG
G
Path
way
KEG
G
Gly
can
KEG
G
Enzy
me
KEG
G
Dru
g KEG
G
Com
-po
und
Inte
rPro
Hom
olo
Gen
e
HG
NC
Gen
e O
ntol
ogy
Gen
eID
Affy-
met
rix
bibl
e on
tolo
gy
Bib
Bas
e
FTS
BBC
Wild
life
Find
er
BBC
Prog
ram
mes
BBC
Mus
ic
Alp
ine
Ski
Aus
tria
LOCAH
Am
ster
-da
m
Mus
eum
AG
RO
VO
C
AEM
ET
US C
ensu
s (r
dfab
out)
Fig. 5.10 The Linking Open Data cloud diagram by Richard Cyganiak andAnja Jentzsch (http://lod-cloud.net/) visualizing the interrelations among thedatasets that have been published in Linked Data format.
by annotating data items inside HTML markup, bridging between linked dataand micro-formats.
128
5MashupComponen
ts
As of September 2011
MusicBrainz
(zitgist)
P20
Turismo de
Zaragoza
yovisto
Yahoo! Geo
Planet
YAGO
World Fact-book
El ViajeroTourism
WordNet (W3C)
WordNet (VUA)
VIVO UF
VIVO Indiana
VIVO Cornell
VIAF
URIBurner
Sussex Reading
Lists
Plymouth Reading
Lists
UniRef
UniProt
UMBEL
UK Post-codes
legislationdata.gov.uk
Uberblic
UB Mann-heim
TWC LOGD
Twarql
transportdata.gov.
uk
Traffic Scotland
theses.fr
Thesau-rus W
totl.net
Tele-graphis
TCMGeneDIT
TaxonConcept
Open Library (Talis)
tags2con delicious
t4gminfo
Swedish Open
Cultural Heritage
Surge Radio
Sudoc
STW
RAMEAU SH
statisticsdata.gov.
uk
St. Andrews Resource
Lists
ECS South-ampton EPrints
SSW Thesaur
us
SmartLink
Slideshare2RDF
semanticweb.org
SemanticTweet
Semantic XBRL
SWDog Food
Source Code Ecosystem Linked Data
US SEC (rdfabout)
Sears
Scotland Geo-
graphy
ScotlandPupils &Exams
Scholaro-meter
WordNet (RKB
Explorer)
Wiki
UN/LOCODE
Ulm
ECS (RKB
Explorer)
Roma
RISKS
RESEX
RAE2001
Pisa
OS
OAI
NSF
New-castle
LAASKISTI
JISC
IRIT
IEEE
IBM
Eurécom
ERA
ePrints dotAC
DEPLOY
DBLP (RKB
Explorer)
Crime Reports
UK
Course-ware
CORDIS (RKB
Explorer)CiteSeer
Budapest
ACM
riese
Revyu
researchdata.gov.
ukRen. Energy Genera-
tors
referencedata.gov.
uk
Recht-spraak.
nl
RDFohloh
Last.FM (rdfize)
RDF Book
Mashup
Rådata nå!
PSH
Product Types
Ontology
ProductDB
PBAC
Poké-pédia
patentsdata.go
v.uk
OxPoints
Ord-nance Survey
Openly Local
Open Library
OpenCyc
Open Corpo-rates
OpenCalais
OpenEI
Open Election
Data Project
OpenData
Thesau-rus
Ontos News Portal
OGOLOD
JanusAMP
Ocean Drilling Codices
New York
Times
NVD
ntnusc
NTU Resource
Lists
Norwe-gian
MeSH
NDL subjects
ndlna
myExperi-ment
Italian Museums
medu-cator
MARC Codes List
Man-chester Reading
Lists
Lotico
Weather Stations
London Gazette
LOIUS
Linked Open Colors
lobidResources
lobidOrgani-sations
LEM
LinkedMDB
LinkedLCCN
LinkedGeoData
LinkedCT
LinkedUser
FeedbackLOV
Linked Open
Numbers
LODE
Eurostat (OntologyCentral)
Linked EDGAR
(OntologyCentral)
Linked Crunch-
base
lingvoj
Lichfield Spen-ding
LIBRIS
Lexvo
LCSH
DBLP (L3S)
Linked Sensor Data (Kno.e.sis)
Klapp-stuhl-club
Good-win
Family
National Radio-activity
JP
Jamendo (DBtune)
Italian public
schools
ISTAT Immi-gration
iServe
IdRef Sudoc
NSZL Catalog
Hellenic PD
Hellenic FBD
PiedmontAccomo-dations
GovTrack
GovWILD
GoogleArt
wrapper
gnoss
GESIS
GeoWordNet
GeoSpecies
GeoNames
GeoLinkedData
GEMET
GTAA
STITCH
SIDER
Project Guten-berg
MediCare
Euro-stat
(FUB)
EURES
DrugBank
Disea-some
DBLP (FU
Berlin)
DailyMed
CORDIS(FUB)
Freebase
flickr wrappr
Fishes of Texas
Finnish Munici-palities
ChEMBL
FanHubz
EventMedia
EUTC Produc-
tions
Eurostat
Europeana
EUNIS
EU Insti-
tutions
ESD stan-dards
EARTh
Enipedia
Popula-tion (En-AKTing)
NHS(En-
AKTing) Mortality(En-
AKTing)
Energy (En-
AKTing)
Crime(En-
AKTing)
CO2 Emission
(En-AKTing)
EEA
SISVU
education.data.g
ov.uk
ECS South-ampton
ECCO-TCP
GND
Didactalia
DDC Deutsche Bio-
graphie
datadcs
MusicBrainz
(DBTune)
Magna-tune
John Peel
(DBTune)
Classical (DB
Tune)
AudioScrobbler (DBTune)
Last.FM artists
(DBTune)
DBTropes
Portu-guese
DBpedia
dbpedia lite
Greek DBpedia
DBpedia
data-open-ac-uk
SMCJournals
Pokedex
Airports
NASA (Data Incu-bator)
MusicBrainz(Data
Incubator)
Moseley Folk
Metoffice Weather Forecasts
Discogs (Data
Incubator)
Climbing
data.gov.uk intervals
Data Gov.ie
databnf.fr
Cornetto
reegle
Chronic-ling
America
Chem2Bio2RDF
Calames
businessdata.gov.
uk
Bricklink
Brazilian Poli-
ticians
BNB
UniSTS
UniPathway
UniParc
Taxonomy
UniProt(Bio2RDF)
SGD
Reactome
PubMedPub
Chem
PRO-SITE
ProDom
Pfam
PDB
OMIMMGI
KEGG Reaction
KEGG Pathway
KEGG Glycan
KEGG Enzyme
KEGG Drug
KEGG Com-pound
InterPro
HomoloGene
HGNC
Gene Ontology
GeneID
Affy-metrix
bible ontology
BibBase
FTS
BBC Wildlife Finder
BBC Program
mes BBC Music
Alpine Ski
Austria
LOCAH
Amster-dam
Museum
AGROVOC
AEMET
US Census (rdfabout)
Fig.
5.10
The
Linking
Open
Data
cloud
diagram
by
Rich
ard
Cyganiak
and
Anja
Jen
tzsch(http://lod-cloud.net/)visu
alizingth
einterrela
tionsamongth
edatasets
thathave
been
publish
edin
Linked
Data
form
at.
byan
notatin
gdata
itemsinsid
eHTMLmarku
p,b
ridgin
gbetw
eenlin
keddata
andmicro-form
ats.
© Copyright 2014 by F. Daniel and M. Matera. Reproduction for classroom use and teaching allowed if source is properly cited.
5.5 User Interface Components 131
Source file
NameFormat
Java portlet
Name
Package
contains
Configuration document
1..N
deployed as
2
Operation
init
destroy
processAction
renderproduces1..N
1..N
Portlet mode
Content
User interface
encodes1..N
represents
has
1..N
implements
1..N
1..N
Fig. 5.11 Model of a Java portlet UI component according to JSR 168 [1].
5.5.2 Java portlets
Java portlets (JSR 168 [1]) are Java-based UI components that are able toprocess user requests and to generate dynamic content. They are simple,server-side web applications developed to fit into small areas of a web page(not the whole browser window), so as to facilitate placing multiple portletsnext to each other inside a single web page. Portlets are typically used byportals (we discuss them in the next chapter) as pluggable user interfacecomponents that provide a reusable presentation layer.
We provide our conceptual view on portlets in Figure 5.11. As we can seefrom the model, portlets are peculiar as UI components, in that they do notreally implement an own UI themselves. Instead, portlets are Java objectsthat expose a set of operations for life cycle management (e.g., init anddestroy) and interaction (processAction and render). The operationsimplement synchronous local procedure calls. The processAction opera-tion allows users to trigger state changes of the portlet (e.g., to store data).The render operation generates content, the so-called fragments, which arepieces of markup (e.g., HTML, XHTML, WML media types) adhering tocertain rules that can be aggregated with other fragments, but that requireintegration into a document (e.g., an HTML page), in order to be renderedto their users. Portlets generating HTML fragments must not make use ofthe HTML tags base, body, iframe, frame, frameset, head, html andtitle, in order to support their correct integration into portals. The userinterface of the portlet therefore consists of a set of markup fragments. Dif-ferent user interfaces are tailored to the portal view modes (view, edit andhelp), corresponding to the actions the users can perform on the portlet.
Portlets are similar to Java servlets [86], but they cannot be executed andaccessed independently; they depend on a so-called portlet container for the
© Copyright 2014 by F. Daniel and M. Matera. Reproduction for classroom use and teaching allowed if source is properly cited.
132 5 Mashup Components
Source file
NameFormat
Java portlet
Name
Package
contains
Configuration document
1..N
deployed as
2
Operation
init
destroy
processAction
processEvent
produces
1..N
1..N
Portlet mode
Content
User interface
encodes1..N
represents
has
1..N
implements
1..N
1..N
render
Session parameter
NameValue
Render parameter
NameValue
Event
NameObject
shares
0..N 0..N 0..N
Fig. 5.12 Extended model of a Java portlet according to JSR 286 [143] with a thepossibility to share session and render parameters, to launch events, and to processevents via the processEvent operation.
mangement of their life cycle and their execution inside another web page.The portlet container also manages the deployment of the portlet (local tothe container), which comes as package containing the source files implement-ing the portlet. Two configuration documents (web.xml for web resourcesand portlet.xml for portlet-related resources) configure the portlet. Beinginstantiated inside the portlet container, portlets are typically stateful. Theorder of invocation of the operations depends on the user interface exposedto the users, which enact them by interacting with the UI.
The original JSR 168 specification su↵ered of two main shortcomings. Firstportlets lacked any mechanism for the inter-portlet communication, i.e., forcommunication between two portlets inside a portal. Developers had thereforeto implement own extensions (e.g., using the so-called portal context) if therewas a need to have portlets interact with each other. Second, only portletsthat were installed locally in the portlet container could be used in the portalrunning on top of the container.
The JSR 286 specification [143] (portlet specification version 2.0) eventu-ally provided an answer to the first shortcoming by introducing three di↵er-ent techniques:shared session and render parameters for the sharing of simpleparameter-value pairs with user session and navigation information and port-let events for the sharing of generic Java objects (see Figure 5.12). The WebServices for Remote Portlets specification [264] provided an answer to the
© Copyright 2014 by F. Daniel and M. Matera. Reproduction for classroom use and teaching allowed if source is properly cited.
5.5 User Interface Components 133
second shortcoming by standardizing a protocol for the interaction with re-mote portlets accessed via SOAP web services. The integration of remoteportlets still occurs inside the web server running the portal.
Today, portlets are therefore described in one of two ways: either via anXML deployment descriptor for locally deployed portlets or via a WSDLdescriptor for remote portlets.
5.5.3 Widgets and gadgets
One recently standardized UI componentization technology for the clientside is W3C widgets [272], which are similar to OpenSocial gadgets [219],Google gadgets (https://developers.google.com/gadgets/) or Ya-hoo! widgets (http://widgets.yahoo.com/, discontinued since April2012). W3C widgets (short, widgets) are simple, but full-fledged, client-sideweb applications that are similar in appearance to Java portlets. Widgets areJavaScript-based and, hence, locally running applications; the use of AJAX,however, allows them to interact with own server-side application logic, pos-sibly providing them with advanced computing or data features.
Source file
Name
Format
W3C widget
NameURL
1..N
provides access to
User interface Packageimplements
defines
contains
Configuration document
Widget interface
implements
1..N
Metadata
deployed as
1..N
1..N
Fig. 5.13 Model of a W3C widget. In white the actual component model; in graythe artifacts of the component.
The component model of widgets is very simple (see Figure 5.13: A widgethas a name and a URL from which it can be downloaded, and it consists of auser interface for the human user (standard HTML pages) and of a JavaScriptAPI (the so-called widget interface) for programmatic access to widget meta-data. Contrary to portlets, widgets manage their UIs autonomously throughdynamic HTML inside the web browser. The design of a widget’s user inter-face is rather standard web development; the widget family of specifications
© Copyright 2014 by F. Daniel and M. Matera. Reproduction for classroom use and teaching allowed if source is properly cited.
134 5 Mashup Components
Source file
Name
Format
W3C widget
NameURL
1..N
provides access to
User interface Packageimplements
defines
contains
Configuration document
Widget interface
implements
1..N
Metadata
deployed as
1..N
1..N
Event Event handler
0..N 0..N0..N 0..N
Parameter
Name
Type
emits has
consumescarries
Fig. 5.14 Model of a W3C widget with inter-widget communication extension [277].
[272] rather focuses on the packaging and configuration of widgets, that is,on how widgets can be distributed over the Web and configured for localexecution. As illustrated in Figure 5.13, a widget is deployed as a package(a common ZIP archive file) that contains the source files of the widgets,such as HTML pages, CSS style sheets, JS files, images, etc. A configurationdocument (config.xml) contains widget meta-data, such as name spaces,version number, size, view modes, name, author, required features, a descrip-tion, and similar.
The Widget Packaging and XML Configuration specification [52] tells howthe configuration file is formatted and how files are structured into foldersand packaged. This is important, because a client that wants to instanti-ate a widget must know how to work with the package. The specificationalso specifies a set of nine steps for processing and locally deploying wid-get packages. Since the procedure is relatively cumbersome, widget runtimesor containers, such as Apache Rave (http://rave.apache.org/), easethe management and instantiation of widgets inside the client browser. Yet,widgets may also require some supporting features, i.e., software componentsthe widget can use at runtime (e.g., geo-localization), to be provided ei-ther by the widget container at the client side or by a so-called widget en-gine at the server side. A widget engine, such as Apache Wookie (http://incubator.apache.org/wookie/), is a server application that allowsone to upload and deploy widgets, so that they are available for their instan-tiation and use in client applications. That is, for widgets the container andthe engine are distributed over client and server, while for portlets both thesefunctionalities are located on the server.
© Copyright 2014 by F. Daniel and M. Matera. Reproduction for classroom use and teaching allowed if source is properly cited.
136 5 Mashup Components
Source file
Name
Format
mashArt UI component
Name
URL
1..N
User interface
instantiates
Event
Event handler0..N
0..N 0..N
0..N
Parameter
Name
Type
emits
has
consumes
produces
Constructor
has
consumes0..N
deployed
as
Fig. 5.15 The mashArt UI component model for UI components extracted from webapplications annotated with the mashArt Event Annotation [96].
An assisted approach to UI component extraction that makes use of micro-formats to annotate source applications is proposed in [96]. The approachuses an abstract component descriptor in the mashArt Description Language(MDL) [90] to describe an applications inter-communication capabilities, amicro-format (the mashArt Event Annotation - MEA) for the annotation ofthe application with events and operations, and a generic wrapper structureable to support the runtime componentization of the application, producingUI components according to the model illustrated in Figure 8.2, which isconceptually similar to the one of JavaScript UI components (see Figure5.4).
5.6 Real-Time Streaming Components
The last type of components we discuss are real-time streaming components,e.g., for the communication of audio and video streams. So far, all compo-nent types described featured discrete media types (for remote components)or function outputs (for local components). Some of the technologies de-scribed, for instance, SOAP web services or JavaScript APIs, could also beused for the implementation of “streams” of data chunks (e.g., sensor readingscoming form a wireless sensor network) by periodically sending data from asource component to a sink (e.g., using notification operations or events).We use quotation marks to emphasize the di↵erence of this kind of streamfrom the streams we discuss in this section, which instead are based on anapplication-layer protocol that manages the stream of data on behalf of thecomponent and that is alternative to HTTP (which is instead used by theremote components discussed so far).
© Copyright 2014 by F. Daniel and M. Matera. Reproduction for classroom use and teaching allowed if source is properly cited.
5.6 Real-Time Streaming Components 137
Media stream
NameURL
Control signal
Media resource
Streaming channel 0..N
Parameter
Name
Type
has
controls state of
1..N
1..Nprovides access to
uses
Control signals control the state of
the streaming channel. Which signal
can be used when depends on the
state of the channel.
Fig. 5.16 Model of a real-time multimedia streaming component.
5.6.1 Multimedia resources
The simplest type of streaming components we know from the Web are mul-timedia resources like audio or video files that can be embedded into webpages and reproduced inside the client browser. This is common practicetoday, YouTube being one of the best examples of application for the distri-bution and viewing of videos.
Multimedia resources are di↵erent from other web-accessible resources,such as HTML or XML files, in that they are intrinsically stateful. In fact, ifwe neglect the option of downloading the media resource before reproducing itin the browser (which is not always feasible), accessing the resource typicallystarts the respective data stream from the server hosting the resource tothe client viewing it. This stream needs control. For instance, it is commonthat we pause a video, jump forward or backward, start it again, or stop itcompletely.
From a conceptual point of view, multimedia resources therefore complywith a component model like the one illustrated in Figure 5.16, which de-scribes a media stream as composed of two core parts, the actual mediaresource and the set of control signals (operations) that allow the client tocontrol the behavior of the stream. The actual stream occurs via a dedicatedstreaming channel, e.g., implementing streaming media protocol like RTSP[251] or RTP [250]; control signals may still be sent over HTTP. The order ofwhich control signals can be used when depends on the state of the streamingchannel, e.g., we can pause a stream only after starting it.
In order to reproduce a multimedia resource inside a web page, we havedi↵erent options:
• Media player : We can use a dedicated media player implementing to nec-essary streaming protocol and taking over the management of the stream.These media players typically allow human users to control the mediaresource being reproduces and they provide programmatic access to thesame capabilities. These kinds of media players are typically implementedin technologies like Flash, Silverlight, or Java applets. The programmatic
© Copyright 2014 by F. Daniel and M. Matera. Reproduction for classroom use and teaching allowed if source is properly cited.
140 5 Mashup Components
Telco stream
ID
Control signal
Streaming channel 0..N
Parameter
Name
Type
has
controls state of
1..N
1..N
provides access to
uses
Session
ID
1..N
0..N1..N
Control signals control the state of
the streaming channel (one per
participant). Which signal can be
used when depends on the state of
the respective channel.
The session identifier must be
shared with all participants to
allow them to connect to the
right instance of conference.
Fig. 5.17 Simplified model of a telco streaming component for audio/video confer-encing with a shared session and multiple participants (streaming channels).
5.7 Summary and Bibliographic Notes
It is simply not possible to condense into fewer pages all the di↵erent com-ponent technologies and models, which are the foundation of the mashupecosystem and strongly influence the content of the chapters to follow. Inthis chapter we tried to describe the technologies as concisely as possible, yetthe number of technologies to describe just grew during the chapter writing– and we were not able to discuss all of them. However, this is exactly whatcharacterizes mashups in the first place, i.e., the huge variety and hetero-geneity of component technologies and models one may have to master andconciliate when developing a mashup.
The goal of this chapter is therefore to introduce the reader to a repre-sentative spectrum of component technologies and to highlight conceptualsimilarities and di↵erences. For this purpose, we introduced a set of compo-nent characteristics that capture the most important aspects about how touse components. Figure 5.18 recalls the characteristics.
Component
Operation User interface
0..N{or}
0..1
1. Component type
2. Runtime location
3. Invocation model
4. Transmission style
6. Instantiation model
5. Media type
7. Invocation order
Co
mp
on
ent
char
acte
rist
ics
Fig. 5.18 Summary of the component characteristics that impact on the way com-ponents are to be used and, hence, on how they can be integrated with each other.
© Copyright 2014 by F. Daniel and M. Matera. Reproduction for classroom use and teaching allowed if source is properly cited.
140 5 Mashup Components
Telco stream
ID
Control signal
Streaming channel 0..N
Parameter
Name
Type
has
controls state of
1..N
1..N
provides access to
uses
Session
ID
1..N
0..N1..N
Control signals control the state of
the streaming channel (one per
participant). Which signal can be
used when depends on the state of
the respective channel.
The session identifier must be
shared with all participants to
allow them to connect to the
right instance of conference.
Fig. 5.17 Simplified model of a telco streaming component for audio/video confer-encing with a shared session and multiple participants (streaming channels).
5.7 Summary and Bibliographic Notes
It is simply not possible to condense into fewer pages all the di↵erent com-ponent technologies and models, which are the foundation of the mashupecosystem and strongly influence the content of the chapters to follow. Inthis chapter we tried to describe the technologies as concisely as possible, yetthe number of technologies to describe just grew during the chapter writing– and we were not able to discuss all of them. However, this is exactly whatcharacterizes mashups in the first place, i.e., the huge variety and hetero-geneity of component technologies and models one may have to master andconciliate when developing a mashup.
The goal of this chapter is therefore to introduce the reader to a repre-sentative spectrum of component technologies and to highlight conceptualsimilarities and di↵erences. For this purpose, we introduced a set of compo-nent characteristics that capture the most important aspects about how touse components. Figure 5.18 recalls the characteristics.
Component
Operation User interface
0..N{or}
0..1
1. Component type
2. Runtime location
3. Invocation model
4. Transmission style
6. Instantiation model
5. Media type
7. Invocation order
Co
mp
on
ent
char
acte
rist
ics
Fig. 5.18 Summary of the component characteristics that impact on the way com-ponents are to be used and, hence, on how they can be integrated with each other.