
The Directors' Cut: a solution to collaborative multimedia management

Silvia Mirri & Ludovico A. Muratori & Marco Roccetti & Paola Salomoni

Published online: 25 May 2010
© Springer Science+Business Media, LLC 2010

Abstract Web 2.0 applications allow rich media contents to be exposed and shared by users. Nevertheless, a multimedia presentation is usually provided as a unicum, made up of synchronized media items. Sound tracks, video sequences and captions cannot be customized "on-the-fly" by users. Managing multimedia at this deeper level would meet the expectations of today's Web prosumers (i.e. producers and consumers), and it would widen the audience. Describing and synchronizing each medium, as well as specifying different alternative contents for it, are the keystones of multimedia customization and of audience widening. This paper presents a collaborative multimedia system which supports the arrangement of media into a multi-view compound multimedia. Each prosumer can add a medium by juxtaposition or by defining it as an alternative (audio, video, textual) version of an existing one. The implementation of such a system is based on the SMIL 3.0 specification but introduces a new and compact syntax to let users manipulate the original multimedia synchronization and its alternatives. The proposed approach has been put to test in two different scenarios.

Keywords Multimedia systems . Multimedia computing . Multimedia accessibility . Multimedia collaborative editing

1 Introduction

Although multimedia objects are used and shared extensively by Web communities, such objects cannot yet be managed by community users in a cooperative way.

Multimed Tools Appl (2011) 53:319–344. DOI 10.1007/s11042-010-0533-z

S. Mirri (*) · L. A. Muratori · M. Roccetti · P. Salomoni
Department of Computer Science, University of Bologna, Bologna, Italy
e-mail: [email protected]

L. A. Muratori
e-mail: [email protected]

M. Roccetti
e-mail: [email protected]

P. Salomoni
e-mail: [email protected]


Once rich media content has been uploaded onto a Web portal, only a few friendly interfaces are currently provided to make it editable. This paper presents a solution that allows collaborative editing of the synchronization of multimedia resources, while preserving the necessary simplicity and effectiveness in a Web 2.0 context.

In general, multimedia implies interaction with complex processes to acquire information content, either to make some kind of synthesis, or to obtain some sort of knowledge. Such processes are subject to users' needs; furthermore, they are affected by device capabilities as well as cultural and social imperatives [24]. The need for supporting and classifying multimedia information becomes essential for a community of users who actively share such content on the Web [18, 26].

The first step towards building a better understanding of multimedia (including the capability to acquire, classify and synthesize it by the users on the basis of some criteria) requires systems and technologies that allow the capture of multimedia content and its fragmentation into simpler synchronized media. The inverse operation is the composition of media, and its result is the original multimedia content, called compound multimedia. The second step involves the ability to provide alternatives to the original contents (or parts of them) to permit each author to comment, re-arrange or simply adjust various media parts. Improving the ability to acquire and manipulate multimedia contents within a prosumer community is becoming essential to enhance the interaction between community members. The more a set of multimedia content is manageable, classifiable and understandable, the more it will become a unique product of the community. The ability to add alternative contents (e.g. annotations entered as captions and audio descriptions, as well as textual alternatives to non-textual original media) is also required to permit the creation of fully accessible media, which can be offered to users with disabilities without any loss of meaning. Providing multiple versions of media and highly interactive media improves the level of multimedia accessibility and the capability of personalization. In this way, the audience/users can choose the version they prefer, and every time they access that multimedia presentation they can enjoy new and different content.

Besides the means to specify media and their mutual synchronization, the new version of SMIL [28] has introduced some new features (through the CustomTestAttributes module [29]) allowing the declaration of author-defined variables. These elements work as triggers that automatically induce the replacement of media contents, as specified by the authors. This new SMIL feature is defined as a general purpose set of controls on author-defined boolean values, and therefore it cannot be easily used by prosumers. To meet this goal, we have introduced a lighter and more compact syntax for defining and managing alternative contents. This SMIL extension is strongly inspired by a well-known metadata standard, the IMS AccessForAll Meta-data (ACCMD, [11]), used to define primary and alternative resources with accessibility options. Our approach also introduces the possibility of defining multi-valued triggers to simplify the (otherwise) complex code needed to manage real situations with boolean values. Multimedia contents defined by using our syntax are automatically converted into a valid (and equivalent) SMIL 3.0 file, which can be opened by using a compliant multimedia player [32].

The proposed system has been tested in two different scenarios. First, we will consider a media composition by a community which originates a multi-view multimedia as the result of cooperative re-arrangement of an original multimedia.

A second scenario presents a community collectively involved in improving the accessibility of multimedia items (audio + video), mutually providing all the alternative contents necessary to let it meet the needs of people with disabilities.


The remainder of the paper is organized as follows: Section 2 depicts the two above-mentioned scenarios of collaborative multimedia editing and introduces the aims of our model. Section 3 presents some related work. Section 4 describes the SMIL CustomTestAttributes module, details the SMIL grammar extension which provides appropriate markup and multimedia data description, and compares the two approaches. Section 5 presents some issues related to the collaborative editing system we have implemented, starting from an open source wiki platform and a multimedia player we have improved in order to support our SMIL grammar extension. Sections 6 and 7 illustrate the two above-mentioned scenarios: Section 6 describes a compound multimedia collaboratively edited by exploiting our SMIL extension, while Section 7 shows how such an extension meets accessibility issues. Finally, Section 8 concludes the paper.

2 Surfing Surf's Up

In order to introduce our system for compounding multimedia and providing alternative versions, both enjoyable, sharable and manageable by a Web community, let us consider the following hypothetical scenarios.

We take into account the movie "Surf's Up" (produced in 2007 by Sony Pictures Animation), a computer-animated film shot as a pseudo-documentary (or mockumentary) about a crew of penguin surfers. Scenes depict the (possible) backstage behind the interviews with the players, their performances among the waves and some cross-sections of their life, through a sort of uncut version of all the hypothetically filmed material. Surf's Up represents a notable example of a multimedia which looks like a sort of raw material to be cut and edited. Furthermore, the "mockumentary" form fits the work of a hypothetical community of scenarists, who can discretionally choose any plot, rather than be constrained to one subject (as it is in a movie novel).

As a first scenario, let us suppose a Web community is engaged in gathering some video sequences from the original movie, cutting off some other ones, putting into a sequence some parts of the interview audio tracks, adding a narrator and/or a sound track, as well as subtitles or captions, with the aim of building a sort of new movie, closer to a traditional one. Moreover, they would have to collaboratively rewrite the plot and scenery, by sharing on the Web raw and intermediate versions of their work. A grained version of the whole movie, composed of a sequence of scenes with video and audio tracks, represents a very first solution to let them perform some kind of editing. Some (truly trivial) tools to establish the duration of each scene, either its video track or its audio one, are needed to plan a new arrangement, as well as some (less trivial) equipment to cut and paste the selected clips. Furthermore, other tools are needed in order to add subtitles or captions, a non-original sound track and so on. Finally, all these tools and equipment must support some kind of versioning, in order to allow the rollback of every change, update or addition by the users involved in the movie-making community.

In a second, alternative scenario, the same Web community could add media or multimedia items in order to provide alternatives to pieces of the original multimedia, for instance captions as an alternative to audio with speech, or a sequence of images or an audio description as an alternative to video, etc. All these media could be properly synchronized in order to offer a new customizable multimedia, able to meet each user's needs.

Our system proposes and compares the use of the SMIL CustomTestAttributes (CTA) module and a syntax-extended SMIL compliant presentation of the original movie, manageable in a suitable way as a "syntactic sugar" by a Web community, where syntactic


sugar is syntax designed to make a language easier to read or to express for humans. In particular, a wiki platform will allow users to edit each media component that SMIL singularly describes. Every change (cut, addition and modification) will be codified as an alternative to the original media component, according to the features of CTA or of our SMIL syntax extension, and it will be embedded into the SMIL presentation. A transcoding system will be used to provide the video and audio sub-sequences which are the result of the users' rewriting. Needless to say, both approaches preserve the original version of the movie and minimize the amount of data and player processing necessary to present all the produced versions.

Each newly added medium could be an alternative to the original ones, according to a suitable identification code which is embedded in the SMIL presentation, either exploiting the CTA or the extended syntax. According to the choice of the user, the result of editing is played out as a codified new compound multimedia.

3 Related work

Recent trends in representing and managing multimedia content on the Web, as it happens on Google Video [8], MySpace [17], Facebook [5], and other Web 2.0 applications (such as blogs, wikis, etc.), raise the notable question of how to manage such content beyond mere upload and tagging. Obviously, the above-mentioned systems provide users with new opportunities for distributing contents to a wide community. Unfortunately, they do not offer cooperation features that allow other users to change the published content (e.g., by adding captions). Furthermore, they do not permit any collaborative participation during the media authoring process and the multimedia metadata life cycle. Such features could also offer a way to provide content which could be properly customized in order to meet the needs of users with disabilities, whenever it is enjoyed through a multimedia player with adaptation mechanisms and tools.

In this scenario, the unique exception is YouTube [33], which allows users to add annotations to their own videos. Such annotations could be useful to add background information to the videos, to create different stories by letting YouTube users choose scene by scene, or to provide links inside a video to other related videos on YouTube. Users can define the position of the annotations within their own video region and the synchronization between the annotations and the video. When a YouTube user enjoys a video equipped with annotations, he/she can choose whether to disable them or not, through the video player interface. Unfortunately, unlike our approach, this functionality is not designed to allow collaboration among users; in fact, a user can add annotations and synchronize media only for videos he/she has uploaded him/herself.

The CODAC Project [14] handles content and MPEG-7-based metadata in order to provide facilities for content adaptation, search and retrieval, but no collaborative editing feature is offered, contrary to our work. In [9], the authors propose a framework for managing multimedia content, which effectively deals with metadata by adding new tags and vocabularies and relating them to each other. However, the collaboration aspect of the authoring process is not taken into account, while this feature is a main goal of our proposal.

Let us now consider multimedia collaborative editing, which is the aim of [7]. The authors present a framework on which content classification and management can be implemented on the strength of their metadata, such as social relationships. Such a work aims to preserve accuracy, availability and personalization of provided contents. This


framework enables the development of multimedia authoring systems focused on multimedia creation, distributed user collaboration and content retrieval. In particular, the authors of [7] built a multimedia blogging portal (called "Online Community Life", OCL), representing an interesting example of how social networking can be coupled with multimedia authoring. However, the users' collaboration aspect is mainly limited to multimedia tagging and rating, unlike our work, which allows users to create new multimedia objects by working together.

Some (indirectly) collaborative aspects are taken into account in [20]. Here, users are allowed to elaborate existing video content and to annotate it by means of a desktop application. Such content can be uploaded onto a dedicated server, to be downloaded and re-elaborated by other users in their turn. This process clearly happens neither on line nor through a web-based interface, which, instead, is proposed in our system and in [22]. This work describes a platform for community-supported media annotation and remix. At the time its authors were writing, it supported mp3 and Flash formats, and users could browse multimedia edited by others without re-elaborating them, which limits users' collaboration, contrary to our proposal. Users' collaboration is the basis of [3], which presents an application for media content selection, re-organization and asynchronous sharing and enhancing within a users' community. Unfortunately, this application only allows authors/users to upload, describe and tag a medium, while audience/users can only recommend, comment on and rate a medium. So, unlike the system we propose, users cannot collaborate in creating new multimedia objects by working together; the collaboration actions offered by the application in [3] are limited. The paper [21] describes a web-services-based system for authoring, personalizing and sharing multimedia presentations. The main idea is to equip users with a cooperative development environment for managing the fusion of original multimedia content. In particular, the system allows users to annotate multimedia presentations and to synchronize such presentations with their comments, to enrich available multimedia presentations with additional materials, and to share their personalized media with other users. Such a system provides users with interesting sharing aspects, while collaboration is still limited, contrary to our proposal.

In [1] the authors present a user interface model and its implementation for exploiting next-generation interactive capabilities with television content. The aim of this work is enhancing users' experience by allowing them to access additional content, which can be provided by other users. In order to enrich such TV content, the authors supplied an extended SMIL grammar and suitably customized the Ambulant SMIL player. The architecture is based on a P2P network, and collaborative aspects are related to the opportunity of offering the re-elaborated content as a peer. The mechanism presented in this paper has inspired the CTA module. A comparison between such a module and our proposal will be shown in the following Section 4.

SMIL is the basis of [13] too. The authors present an analysis of adaptive time-based Web presentations and investigate technologies available to create such kinds of presentations. In this paper they suggest a mechanism which has inspired a new SMIL 3.0 module: SMIL State (described in the following Section 4). Such a mechanism can be used to allow users to define and to add state to declarative time-based languages (such as SMIL). By using such a mechanism, the presentation author can control media tracks in a better way. In Section 4, the SMIL 3.0 State module is compared with our proposal.

The authors of [15] and [2] have taken into account NCL, a different standard devoted to multimedia tracks for IPTV systems. A method for hypermedia document live editing is proposed in [15], with the aim of preserving the presentation, logical and structural semantics defined by an author. An implementation of the presented solution has been done for the


Brazilian Digital TV, which is described in the paper. Interactive TV multimedia content is the basis of [2] too. In this paper, the authors present a prototype which supports the capture of digital and voice comments over segments of video (allowing multimodal annotations) and produces a related document to describe media structure and synchronization. The proposed application takes into account both the NCL and SMIL standards. Both these works (presented in [15] and [2]) are really interesting and stimulating, but also in these cases collaborative aspects have not been taken into sufficient account, unlike our proposal.

Another interesting and fundamental aspect of our proposal is multimedia adaptation and media synchronization control. Hence, the rest of this Section is devoted to introducing some standards which offer such features, in particular MPEG-21 [16], NCL [19] and the SMIL DAISY profile [23]. Such standards are notable examples, even if they can be improved.

MPEG-21 [16] is an open standards-based framework for multimedia delivery and consumption. Its goal can thus be redefined as the technology needed to support users to exchange, access, consume, trade and otherwise manipulate multimedia. The Digital Item Adaptation (DIA) is a part of this framework which specifies the tools for the adaptation of digital items. In order to achieve interoperable transparent access to (distributed) advanced multimedia content, the adaptation of digital items is required. The DIA allows describing terminal capabilities (such as codec and input–output capabilities, and device properties), network characteristics (such as network capabilities and network conditions), user information (such as usage preferences, usage history, presentation preferences and accessibility characteristics, including visual or audio impairments) and the natural environment.

Nested Context Language (NCL) [19] is a solution for IPTV systems. NCL is a modular declarative language which provides functionalities for authoring a multimedia presentation with synchronization relationships among the media which compose it. NCL modules can be added to standard web languages, such as XHTML and SMIL, due to its XML-based characteristics.

At the time of writing, the MPEG-21 and NCL standards are not taken into account in the presented system, which is mainly devoted to SMIL presentations.

A new profile introduced in SMIL 3.0 is the SMIL 3.0 DAISY one [23]. DAISY is the international standard for digital talking books, which are fully accessible to people with disabilities which affect the reading of printed material, such as blindness, low vision, hearing impairments, deaf-blindness, motor disabilities, dyslexia and, in general, cognitive/intellectual disabilities. The SMIL 3.0 DAISY Profile has been added to the release of SMIL 3.0 in order to provide a language which is fully conforming to DAISY digital books, to allow the complete representation of DAISY books in SMIL and to also include modules which are not required by DAISY. Even if such a profile is really interesting and useful in adapting material to the needs of users with disabilities, it is not adequate to reach the aims of our proposal.

Since our proposal is mainly based on an enhancement of SMIL [28], such a standard will be detailed more deeply in the following Section 4. Section 4 will also show an example of comparison between our work and some recently released SMIL features.

4 SMIL and beyond SMIL

Synchronized Multimedia Integration Language (SMIL) is an XML-based markup language which allows the inclusion and synchronization of heterogeneous media, so as to be shown as a continuous multimedia presentation by means of suitable players [28]. Elements and


attributes describe the time and space behavior of images, text, animations, video and audio. Such a fine-grained structure allows managing multimedia presentations at the lowest level of detail, reaching down to each medium they present.

Our system proposes a way to describe the media which compose a compound multimedia, their synchronization and every cut or addition the users made, as if they were alternatives to the original medium or multimedia. The basic idea is to provide further markup describing a link between two media components, taking into account that:

- they can refer to the same medium source and they may have different durations;
- one of them can be a set or sequence of synchronized media.

The first item of the previous list will allow cutting off video or audio sequences, as well as textual contents, while the second one states that it is possible to rewrite a medium as a sequence taken from other ones or, vice versa, to implode a sequence of media into a single one. By exploiting the SMIL CustomTestAttributes module [29] it is possible to link media contents to their alternatives. Nevertheless, such a feature implies a very articulated implementation. Hence, in this paper we present an extension of the SMIL 3.0 Document Type Definition (DTD), comparing it to the already existing modules, so as to provide a mechanism binding original media and their alternatives in a lighter way. By transcoding such an extended SMIL presentation, it will be possible to offer the modified multimedia, based on the set of its alternatives.

First of all, we will show the SMIL 3.0 content control modules and how to extend the SMIL DTD in order to allow authors to explicitly link original media with their alternatives, comparing it with the original SMIL solution. Then we will prove that each medium may have an alternative described and linked to it according to the extended SMIL DTD.

Some examples will be presented to give the reader all the necessary details about different media type components and their alternatives, as they can be described by using the extended SMIL.

4.1 Multimedia adaptation mechanisms

Explicit metadata for media adaptation are provided within the main standards, so that applications and tools may apply transcoding to sequences of multimedia items. In the following, we will briefly describe the SMIL 3.0 features enabling such adaptation.

Besides the capabilities of specifying bandwidth, display limitations and the user's language for each medium, SMIL allows the declaration of author-defined variables to trigger the presentation of contents. The latter ones are a construction of the CustomTestAttributes (CTA) module [29].

Language built-in and author-defined variables could be used by SMIL compliant players to present alternative multimedia items on the strength of their (Boolean) value.

The <switch> tag provides the possibility of associating alternative contents according to the value of predefined system test attributes such as systemBitrate, systemCaptions, systemLanguage, systemScreenSize, systemAudioDesc, etc. [28], as well as according to the value of any variable declared in the custom test section of the SMIL document. Unfortunately, author-defined conditions are not supported by any multimedia player (at the time the authors are writing), except Ambulant 2.0 [30].

In other words, most SMIL compliant players support changes of content based on technical circumstances and a few fairly static user preferences, while custom attributes are not supported as triggers to switch the tracks played out. Hence, the use of these predefined system test attributes provides a customization mechanism based on a set of fixed attributes.
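As an illustration (not one of the paper's own examples), a minimal SMIL fragment relying only on a predefined system test attribute might look like the following; all file names are hypothetical:

    <par>
      <video src="surfsup_scene1.mpg"/>
      <audio src="scene1_dialogue.mp3"/>
      <switch>
        <!-- rendered only if the player is configured to show captions -->
        <textstream src="scene1_captions.rt" systemCaptions="on"/>
      </switch>
    </par>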


To enhance content adaptability, the SMIL CTA module extends this feature with the definition of author-defined custom test attributes. This module allows the definition and setting of such attributes in the header of the SMIL document, in which the author can set a default state for each custom test attribute and declare a default attribute which will be played out whenever no customization is necessary.

The customization is done by using a SMIL <switch> construct in the <body> of the SMIL document. It is used to select media for inclusion in a presentation depending on the values of the custom test attributes. The first object whose test attribute resolves to a true value will be rendered. It is possible to set the last option so that it always resolves to true; in this way it will be considered only if no other objects resolve to true. The CTA module could be used in order to offer alternative presentations to users with disabilities.

The following examples (Example 1 and Example 2) show a SMIL code fragment which offers a customizable presentation. Such a presentation could be composed of an audio track with two different kinds of alternatives: captions and a textual description.

In Example 1 the setting of the custom test attributes is shown. The <customAttributes> element allows the definition of each attribute with a default state. In particular, three custom test attributes are defined:

1. the first resolves to true whenever the user asks for captions instead of audio;
2. the second resolves to true whenever the user asks for a textual description instead of audio and captions;
3. the third always resolves to true; this will be the last child of the <switch> in the presentation (see Example 2), in order to be played out whenever the user asks for no customization.

Example 1. SMIL header with custom test attributes settings.
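A minimal sketch of such a header, assuming purely hypothetical custom test identifiers (captionsOn, textOnly, noCustomization), could be:

    <head>
      <customAttributes>
        <!-- each customTest declares an author-defined boolean with a default state -->
        <customTest id="captionsOn" defaultState="false" override="visible"/>
        <customTest id="textOnly" defaultState="false" override="visible"/>
        <customTest id="noCustomization" defaultState="true" override="visible"/>
      </customAttributes>
    </head>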


Example 2 shows a SMIL fragment in which a video is synchronized in parallel with a <switch> construct. When this SMIL presentation is played out, a SMIL 3.0 compliant player evaluates the custom test attribute declared for each <switch> branch and decides which one it has to play on the basis of the users' preferences.

In the following paragraphs we will describe how our SMIL extension offers the same features, but with an easier syntax, which has been inspired by the ACCMD DTD [11].

The same presentation shown in Examples 1 and 2 is reported in the following Example 3 with our SMIL extended syntax.

Example 2. SMIL <body> presentation with customTestAttributes.
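A corresponding body fragment, again only an illustrative sketch using the hypothetical test names above, might be:

    <body>
      <par>
        <video src="surfsup_scene1.mpg"/>
        <switch>
          <!-- the first child whose customTest evaluates to true is played -->
          <textstream src="scene1_captions.rt" customTest="captionsOn"/>
          <text src="scene1_description.txt" customTest="textOnly"/>
          <!-- default branch: the original audio, since noCustomization always resolves to true -->
          <audio src="scene1_dialogue.mp3" customTest="noCustomization"/>
        </switch>
      </par>
    </body>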

Example 3. SMIL presentation with primaryResources.
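The following sketch only conveys the idea of the more compact syntax; the attribute name alternativeTo is purely hypothetical and stands for the extension's linking attribute:

    <par>
      <video src="surfsup_scene1.mpg"/>
      <!-- the original (primary) resource, identified by its id -->
      <audio id="dialogue1" src="scene1_dialogue.mp3"/>
      <!-- alternatives simply point back to the id of the medium they replace -->
      <textstream src="scene1_captions.rt" alternativeTo="dialogue1"/>
      <text src="scene1_description.txt" alternativeTo="dialogue1"/>
    </par>

Compared with Examples 1 and 2, no header declaration and no <switch> nesting are required; the player selects among the alternatives on the basis of the user's declared preferences.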


In this example the use of a new attribute allows the declaration of alternatives for any resource (univocally identified by its id attribute). If a user declares the need of a particular alternative, then the SMIL player should be able to play out a proper set of synchronized media objects. Hence, as in the previous case, a SMIL compliant player has to allow users to declare their preferences. In particular, we are modifying an open source SMIL player (X-Smiles [32]) in order to provide enhanced users' preferences features and to put our SMIL grammar extension to work (as we will describe in the following Section 5).

Another interesting new introduction of SMIL 3.0 is the SMIL State Module [28]. This module provides mechanisms which permit the presentation author to create more complex media control than what the timing and content control modules could do. By allowing the use of variables, a document can have some explicit state along with ways to modify, use and save this state. In this way, such a module could be used to describe and/or define when alternative content should be played out. An improvement is the use of XPath expressions instead of the boolean ones used in the CTA module. Our extension provides a similar, but simpler mechanism. Moreover, as we will state in subsection 4.4, the proper use of the new attribute (which is inspired by the homonymous ACCMD element) together with the id attribute allows defining:

- a medium as an alternative to an original medium;
- a medium as an alternative to a set of original synchronized media in the multimedia presentation;
- a set of synchronized media as an alternative to an original medium;
- a set of synchronized media as an alternative to a set of original media in the multimedia presentation.

In conclusion, our SMIL extension attributes have been defined to be added to the SMIL modules. In fact, our idea is to offer an additional and simple way to provide users with multiple and accessible versions of the same content. From a functional point of view, the CTA and State elements and our attributes offer the same results. However, creating customizable presentations by using the CTA and State Modules is complex and, at the time the authors are writing, no multimedia player supports them in a feasible way. Moreover, the lack of a player interface which allows users to declare their preferences makes the adoption of such modules by multimedia presentation authors fruitless. The simplicity of our SMIL extension is its added value. In the following Sections we will present a prototype which allows authors to collaboratively create customizable multimedia presentations and users to simply declare preferences and needs.

4.2 Media components temporal dimension

A SMIL multimedia presentation provides XML elements for inserting audio, images, video, animations and captions. The SMIL 3.0 DTD supplies elements and attributes referring to the temporal duration of such media components [28].

In the following, we will summarize some features of SMIL timing with formal definitions. A SMIL presentation has an implicit duration, based on the durations of the media it is compounded by and on their synchronization. Such a characteristic is summarized in Definition 1.

Definition 1 Given a SMIL presentation M, D_M = [t_begin, t_end] is the SMIL duration, where t_begin is the beginning of the SMIL multimedia presentation and t_end is the end of it.


D_M is defined as D_M = f(c_1, ..., c_n), where f is a function accounting for the duration of each media component and where c_i is one of the media which compose the whole SMIL multimedia presentation, with i belonging to {1, 2, ..., n}.

Definition 1 is a synthesis of the above-mentioned SMIL Timing Module elements and attributes [30]. The media which compose a SMIL presentation are shown (or played out) for one or more fixed intervals of time, as summarized in Definition 2. Definition 3 addresses the rich sequence of synchronized media played out as a whole. Definitions 2 and 3 come, once again, from the SMIL Timing Module characteristics disclosed above [30].

Definition 2 Let D_ci = [t_begin_i, t_end_i] be one of the media durations, where t_begin_i is the beginning of the medium, t_end_i is its end and D_ci is a subset of D_M. For each media component c_i a SMIL multimedia presentation is compounded by, c_i is:

- played out at the time t, where t belongs to D_ci, or
- not shown (based on the SMIL Timing and Synchronization Module elements and attributes declared for c_i inside the SMIL multimedia presentation).

Figure 1 depicts the temporal dimension of a general SMIL presentation. The solid horizontal lines (labeled c_1, ..., c_6) represent media components when they are played out. At time t, M(t) is the set which contains the elements played out at time t. According to Definition 2, the components whose play-out intervals do not contain t are "not shown" at time t.

Definition 3 M(t) is defined as M(t) = {c_i(t)}, where c_i(t) addresses the component c_i at the time t and i belongs to {1, 2, ..., n}. Hence M(t) is the SMIL presentation shown at the time t, where t belongs to D_M.
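Under the notation assumed above (components c_i and time instants t), Definitions 1-3 can be summarized, for instance, as:

    D_M = [t_{begin}, t_{end}], \qquad
    D_{c_i} = [t^{i}_{begin},\, t^{i}_{end}] \subseteq D_M, \quad i \in \{1,\dots,n\}

    M(t) = \{\, c_i(t) \mid t \in D_{c_i} \,\}, \qquad t \in D_M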

4.3 Media components type and relations

In the following, we will detail some issues about adding media alternatives. We will show how it is possible to refer each media component to an element belonging to the set of SMIL media object elements from the SMIL XML binding. Definition 4 states that media alternatives can be obtained from attributes of the SMIL element describing a component c_i, or by transforming other elements which are related to c_i.

[Fig. 1 Example of SMIL presentation temporal diagram: media components c1-c6 drawn as horizontal segments over the time axis, from t_begin to t_end]


Definition 4 For each media component c_i a SMIL multimedia presentation is compounded by, and for each t belonging to D_ci, c_i is said to have an alternative if and only if an alternative function for c_i exists and it is a (possibly compound) media whose type belongs to the set of all the subsets of the SMIL media types.

Now we will show that it is possible to add one or more media which represent an alternative to any existing one, as chunks of SMIL code. In order to refer such elements to their primary ones, an extension of the SMIL grammar is needed. In fact, the SMIL 3.0 DTD does not always offer a way to explicitly link a media component and its alternative [28]. In some cases such a correspondence is implicit, and in some other ones the author is not able to declare it. In the following, we are going to describe some of the explicit and implicit correspondences between different kinds of media components (described by their alternative representations).

4.3.1 Explicit references to textual alternatives

In SMIL, textual alternatives of a medium can be explicitly declared by defining the alt and/or the longdesc attributes [27]. Example 4 shows how textual alternatives are associated as attributes of an image element. Analogously, text content can be referred to a video or audio medium.

Example 4. Textual alternatives to images on SMIL.
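An illustrative fragment (file names and texts hypothetical) showing the alt and longdesc attributes on an image:

    <img src="wave_chart.png" dur="10s"
         alt="Chart of the wave heights at the surf contest"
         longdesc="wave_chart_description.html"/>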

4.3.2 Implicit references to alternative media

Captions are textual alternatives to audio with speech; they may be added to the presentation by using a text media element, with a non-explicit correspondence (Example 5). Moreover, it has to be taken into account that the textual alternative to an audio could be a set of such text elements. Furthermore, the caption inserted in a presentation by using a single text element could be the textual alternative to a set of audio elements, which could be synchronized by using a time container element, for instance.

Audio descriptions are an equivalent alternative to video. According to well-known definitions [31], an auditory description is a recorded or synthesized voice that describes key visual elements of the video presentation, including information about actions, body


language, scene changes, etc. They have to be synchronized with the original video stream they describe and with other potential audio streams.

Example 5. Captions, alternatives to audio on SMIL.
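An illustrative sketch of such an implicit correspondence: a caption stream placed in parallel with the audio it transcribes, with nothing in the markup linking the two (sources hypothetical):

    <par>
      <audio src="interview.mp3"/>
      <!-- no attribute states that this stream is an alternative to the audio above -->
      <textstream src="interview_captions.rt" systemCaptions="on"/>
    </par>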

Usually audio descriptions are timed to play during natural pauses in dialog, but such pauses may not be enough to accommodate a proper audio description. In such cases, it will be necessary to pause the video in order to provide adequate time for an extended auditory description. At its end, the video should resume playing automatically. Audio descriptions may be added to the presentation by using an audio element, with a non-explicit correspondence with the original video: the current video is paused, the audio description of the current scene is played, and the video resumes when the audio description completes (Example 6).

Example 6. Audio descriptions, alternatives to video on SMIL.
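One common way to obtain the pause-and-resume behavior described above is an <excl> container with a priorityClass whose peers pause each other; the following fragment is only a sketch with hypothetical sources and timing:

    <excl dur="indefinite">
      <priorityClass peers="pause">
        <video src="surfsup_scene2.mpg" begin="0s"/>
        <!-- when the description starts, the video (its peer) is paused and resumes at its end -->
        <audio src="scene2_audio_description.mp3" begin="15s"/>
      </priorityClass>
    </excl>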

4.3.3 Not linkable alternatives

Let us consider a link between two media, or between a medium and a sequence of media, as the couple (media type 1, media type 2), where the first item is the original media type and the second item is its alternative media type. In SMIL, none of the following links can be explicitly defined: (video, video); (audio, video); (audio, audio); (video,


image); (audio, image); (image, video); (image, audio); (image, image); (text, text); (text, image); (text, audio); (text, video). Trivially, such a limitation can be overcome through the declaration of custom test variables (e.g., one variable to switch from image to image, or another to switch from text to video).

4.4 The alternative-linking attribute

Besides using the CTA module to provide a mechanism that associates a medium (or a set of media) with its alternative (or set of alternatives), we have extended the SMIL DTD by adding a proper linking mechanism.

This approach allows a lighter re-arrangement of the SMIL code than the previous one, and it permits creating different presentations by using simple attributes.

A feasible way is to specify the id attribute for the original medium component and to adopt a new attribute which is declared on the alternatives. The value of such a new attribute should be a space-separated list of media ids. In the DTD syntax, such a type of value is known as IDREFS, a space-separated list of tokens. The following Code 1 shows a fragment of the extended SMIL 3.0 Common Attributes DTD, with the adoption of the new attribute. With such an extension we can provide a correspondence between the alternative version of an original medium and a suitably structured set of media components in SMIL.

Code 1. Extended SMIL DTD.
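A hypothetical DTD fragment of this kind, adding an IDREFS-valued attribute (here called alternativeTo, a name chosen only for illustration) to the SMIL media object elements, would read:

    <!-- space-separated list of ids of the original media this element is an alternative to -->
    <!ATTLIST audio      alternativeTo IDREFS #IMPLIED>
    <!ATTLIST video      alternativeTo IDREFS #IMPLIED>
    <!ATTLIST img        alternativeTo IDREFS #IMPLIED>
    <!ATTLIST text       alternativeTo IDREFS #IMPLIED>
    <!ATTLIST textstream alternativeTo IDREFS #IMPLIED>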

By using the above defined attribute, the case of non-explicit correspondence between the original medium and its alternative (shown in the previous Subsection "Implicit references to alternative media") becomes as shown in Example 7. The (video, video) case of non-linkable media components listed above can be bound as shown in Example 8, which is analogously applicable to the other cases of (audio, audio), (audio, video), etc.

The new attribute is inspired by the homonymous ACCMD element. The IMS ACCMD element contains the URL of an original resource which composes a didactical material, while the proposed attribute admits as its value a set of media identifiers (with at least one id). In a multimedia presentation, by using id attributes for original media and the new attribute for alternative media, it is possible to define a medium as an alternative to an original medium or to a set of original synchronized multimedia items, and a set of synchronized media as an alternative to an original medium or to a set of original media in the multimedia presentation.


Example 7. Alternative caption to audio on SMIL with extended DTD.

Example 8. Alternative audio descriptions to video on SMIL with extended DTD.
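Illustrative sketches of both cases, using the hypothetical alternativeTo attribute introduced above (all sources hypothetical):

    <!-- Example 7 (sketch): a caption stream bound to the audio it replaces -->
    <par>
      <audio id="dialogue2" src="scene2_dialogue.mp3"/>
      <textstream src="scene2_captions.rt" alternativeTo="dialogue2"/>
    </par>

    <!-- Example 8 (sketch): an audio description bound to the video it describes -->
    <par>
      <video id="scene2" src="surfsup_scene2.mpg"/>
      <audio src="scene2_audio_description.mp3" alternativeTo="scene2"/>
    </par>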

4.5 Versioning multimedia sets

In order to identify every produced version of an original sequence of synchronized multimedia items, CTA can be used so that every custom test attribute is associated to a version and the switching is triggered by their values. Alternatively, another attribute needs to be allowed in the SMIL grammar. It has to classify the set of cuts and additions as parts of a particular plot. We introduced a further version-labeling attribute for this purpose; its task is analogous to that of an existing SMIL attribute. The new attribute, once again, is clearly shorter and more meaningful than the CTA approach.


Example 9. Usage of the versioning attribute

Example 9 shows how some alternatives can be referred to the same new version of an original compound multimedia. In particular, there are two alternatives for the original video: the first one is composed of two videos in sequence, while the second one is a different video. Let us notice that the versioning attribute is inherited by the elements inside a container (as happens for the sequence container in Example 9).
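An illustrative sketch of such a version declaration; both alternativeTo and the version-labeling attribute (here called plot) are hypothetical names:

    <par>
      <audio src="original_soundtrack.mp3"/>
      <video id="scene3" src="surfsup_scene3.mpg"/>
      <!-- first alternative: two clips in sequence; the plot label is inherited by the children -->
      <seq alternativeTo="scene3" plot="versionA">
        <video src="backstage_part1.mpg"/>
        <video src="backstage_part2.mpg"/>
      </seq>
      <!-- second alternative: a single different clip, belonging to another version -->
      <video src="interview_cut.mpg" alternativeTo="scene3" plot="versionB"/>
    </par>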

Similar mechanisms are offered by a new SMIL 3.0 attribute and by the use of metadata, which can be defined within the document body instead of only in the head element of the presentation, thanks to the improved SMIL 3.0 Metainformation module. Unfortunately, this attribute specifies the name of a SMIL file which contains a description of the element. If it is selected, then the original presentation will be paused and a new document instance will be created to display the target SMIL file. The use of this attribute affects the original media synchronization and the provision of new versions of the multimedia presentation.

The new SMIL 3.0 Metainformation module permits the declaration of the metadata element as a child of media elements inside the body, and not only in the header


of the SMIL document. Thanks to this improvement, an alternative use of our attributes could be made, by declaring them as new metadata.

5 The movie set

In order to support a community in collaboratively editing extended SMIL multimedia, a Web-centered platform has been issued by enhancing some results of the We-LCoME project [6].

Figure 2 sums up its architecture: through the client application (which is a common Web browser), a Wiki platform allows collaborative editing of SMIL documents; the adaptation system provides edited versions by taking them from a repository and by passing them to the enhanced X-Smiles player.

Besides a suitable description, as obtained with our extended SMIL grammar, re-arrangement of compound multimedia needs to be supported in terms of a usable interface, so as to let a community collaboratively edit and synchronize them and enjoy the results. Management of SMIL documents can be a very complex and awkward activity, closer to an insider's specific skills than to the actual trends of user-friendly content management systems, and very far from common prosumers' abilities. In order to surmount such a limit, the Wiki platform We-LCoME [6] has been exploited. We-LCoME was initially developed to allow collaborative annotation of SMIL video lectures.

SMIL documents are transformed into a simpler representation, where the elements of a compound multimedia can be punctually recognized and edited by users. A suitable engine rebuilds the SMIL documents so as to show the results of the rewriting processes. The syntax available in wiki systems (the so-called wikitext) typically exploits plain text with a few simple conventions so as to mark up edited contents [6]. The wiki engine, which automatically converts the wikitext into a final HTML document [4], has been extended to transcode SMIL documents. The alternative-linking and versioning attributes are automatically added by the system to the chosen media whenever users have made changes. Then, these attributes are used by the system to drive the transcoding of additions and cuts, as well as to allow new versions to be played out. Readers can find further detail about the wiki interface of We-LCoME in [6].

[Fig. 2 The whole system architecture: client Web browser, Wiki platform, Web content repository and SMIL player]

Figure 3 depicts a screenshot of the We-LCoME editing interface. In particular, the wiki syntax shown in this figure corresponds to the synchronization of a Surf's Up video sequence with a textual description of the video content. In fact, by means of the We-LCoME interface, users can add new media to the whole multimedia presentation. Moreover, they can define media synchronization by using a proper syntactic sugar,


which has been enhanced in order to describe temporal constraints and relationships among the media. Finally, users can annotate the multimedia by typing textual comments, and they can decide whether such notes have to be shown or not in the final multimedia version. Let us note that, in this way, directors/users can manage media and sets of synchronized multimedia by using a simple wiki syntax, without directly editing SMIL code, while maintaining the consistency and validity of the multimedia documents. Such a wiki also offers a versioning system, so as to track the different versions made by different directors/users. Further detail about the enhanced wiki syntax can be found in [6].

Whenever a user requires a version of a multimedia, the corresponding extended SMIL code is passed to the enhanced X-Smiles, which plays out such a presentation. The X-Smiles player has been modified in order to support the alternative-linking and versioning attributes. Moreover, we have improved the X-Smiles personalization feature: now users can define the types of media they prefer, so as to also meet the needs of users with disabilities.

Figure 4 shows a screenshot of the "X-Smiles Media Preferences" interface we have added to our X-Smiles prototype. In particular, we have added a new choice in the Edit menu: it allows users to access the "X-Smiles Media Preferences" option and to declare which alternatives to the original media they need. For instance, by means of this new functionality, a user could declare that he/she prefers audio alternatives to visual media; whenever audio is not available, such a user could indicate that images and videos should be replaced with textual descriptions.

Finally, we are still working on the player in order to improve its conformance to some of the new SMIL 3.0 modules.

Fig. 3 A screenshot of the wiki interface


6 The directors’ cut

As stated in the previous section, our model presents a new version of an extended SMIL compliant multimedia as an extension of the original one, together with a set of alternatives arranged in classes. The latter may represent additions to the original media or cuts of them. While the duration of a new media component can be discretionally chosen by the author, we can assume that the directors' cut alternatives have to refer their time length to a (begin, end) or (begin, duration) couple, with values that lie inside the duration of the original media they refer to. This constraint can be automatically verified by a system which extracts the currently edited version and allows playing it out. Such a kind of system would allow supporting a community to collaboratively edit compound multimedia and/or to enjoy the customized version. Management of SMIL documents can be a very complex and awkward activity, closer to an insider's specific skill than to user-friendly content management systems, and very far from common prosumers' abilities [6]. As stated above, an open source SMIL player (X-Smiles [32]) has been modified in order to allow the definition of users' preferences. Such a player is already able to decide which customized multimedia to play out (on the basis of users' profiles). The proposed extended grammar allows directors/users to synchronize multimedia, adding or cutting resources and defining new attribute values. In particular, the versioning attribute allows multiple choices of media, as if they were different versions, according to the directors'/users' processing of content.

As an example, let us consider the first scenario described in Section 2. A user wants to add an audio track (for instance, the speech of the director explaining the making of a particular set of scenes) to the original multimedia (Fig. 5). Such a multimedia is composed of two synchronous media with the same temporal dimensions: a video content showing the movie "Surf's Up" and an audio content reporting the characters' talk. This will enrich the

Fig. 4 A screenshot of the X-Smiles Media Preferences interface


multimedia presentation by offering a new version. The added audio track has to be properly synchronized with the other media and provided with the suitable value of the versioning attribute. This is a new version of the original multimedia: the same video track could be synchronized with this new audio, which is added to the original one (with the characters' talk). In particular, a proper synchronization has to take into account the temporal constraints of the involved media: the two audio tracks cannot be played out simultaneously, but when the new audio track is active the other one has to be paused, as well as the video track (in order to avoid de-synchronization between the video and the characters' talk). Users/audience could choose this version to enjoy different and enhanced content.
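A sketch of how such a constraint could be expressed in SMIL, pausing the original tracks while the director's commentary plays (the same excl/priorityClass pattern sketched above for audio descriptions; all sources, timings and the plot attribute are hypothetical):

    <excl dur="indefinite">
      <priorityClass peers="pause">
        <!-- original scene: video and characters' talk, kept in sync inside a par -->
        <par begin="0s">
          <video src="surfsup_scene4.mpg"/>
          <audio src="scene4_dialogue.mp3"/>
        </par>
        <!-- director's commentary: when it begins, the whole par above is paused and later resumed -->
        <audio src="scene4_directors_comment.mp3" begin="20s" plot="directorsCut"/>
      </priorityClass>
    </excl>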

In the next Section, besides this scenario, we will consider collaborative editing as a means to enhance multimedia accessibility.

7 The extended SMIL accessibility

Accessibility is a particular case in which directors' cuts and multiple versions could be useful to provide the same content to users with disabilities. Users/directors could add and manage media in a proper way, so that the final compound multimedia could meet users' needs, as we have already stated in the second scenario described in Section 2.

In the following paragraphs we are going to deal with the accessibility of a SMIL compliant multimedia as a peculiar case of media re-arrangement, in which some media objects and pieces of the multimedia presentation have to be transformed to let them meet the needs of people with disabilities. Our aim is to show the necessary and sufficient conditions that guarantee the accessibility of our extended SMIL grammar.

According to common definitions, laws and guidelines [12, 31], the textual alternative of a non-textual medium is accessible, since it covers every kind of disability. Indeed, textual alternatives are the poorest shape of a presentation; blind or deaf users might enjoy presentations richer than purely textual ones [10, 31].

The time overlapping of more than one medium (which could involve different senses), as happens with multimedia, implies the ability to respond with more than one sensorial channel. Any strategy to obtain the same content after cutting off some stimulus, due to the constraint of meeting a specific user's need, degrades the original multimedia resource. In particular, let us consider two aspects:

a) what firstly flows as a set of contemporary media has to be collapsed onto a subset of them, so as to involve a related subset of sensorial channels; in some cases this may cause a cognitive overload in users. Let us consider, for example, a blind user accessing a slice of Surf's Up, composed of the video clip, its audio track and a supplementary speech describing the scene. Two (synchronous) audio tracks are technically available:

[Fig. 5 Adding the director's cut]


(i) the main one and (ii) the one describing the scene. Obviously, the user cannot benefit from the content of the two contemporary audio tracks.

b) The process of conversion from a set of synchronized media (the original slice of the movie as a whole) to a textual version potentially involves a partial loss of synchronicity, eventually forcing into a sequence media which are contemporaneous in the original version of the multimedia presentation. On the other hand, it has to guarantee at least a correlation (even if a sequential one) among the alternatives of the original media components. The textual version of the previous example may provide a sequence of the scene descriptions and the related captions supplying the scene audio, instead of the two contemporaneous audio tracks.

By transcoding the extended SMIL presentation, it is possible to offer the richest multimedia the user is able to benefit from, based on his/her preferences and needs. The two aspects disclosed above about the side effects of multimedia degradation will be taken into account to state proper strategies.

Now we want to show sufficient conditions for the existence of a textual equivalent of a SMIL-compliant multimedia presentation M at an instant t belonging to its play-out interval [tbegin, tend]. We assume that each media type is unambiguously identified by the author, by mapping the media onto the proper SMIL element. In particular, we consider the author responsible for the meaning of the type description; authors do not use the generic ref element, because of its lack of type meaning.

Sufficient conditions for a punctual "textualization" of a SMIL multimedia, i.e., for an instant t of [tbegin, tend], are given by the existence of the textual alternatives of all the media components played out at t. For each media component c played out at t, in fact, we can obtain (according to our extended DTD) its textual equivalent (or c itself, if c is a textual component). Furthermore, we can obtain a textual equivalent which lasts over an interval containing t. Such a result derives from the duration of the textual alternative (that is, from its time attributes) and from the time container of which the textual alternative is a child.
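As a hedged illustration (invented file names and timing, and plain SMIL rather than the paper's extended DTD), a textual alternative whose time attributes, interpreted inside its parent time container, determine the interval covering the instant of interest:

    <par begin="0s" dur="30s">
      <audio src="characters_talk.mp3"/>
      <!-- textual alternative: its time attributes place it on [5s, 17s]
           within the parent par, so it covers e.g. the instant t = 10s -->
      <text src="characters_talk_transcript.txt" begin="5s" dur="12s"/>
    </par>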

In order to achieve a textual version for an arbitrary interval of time [t1, t2] within [tbegin, tend], let us consider some constraints on the "contours" of the media time dimension. It is obvious that we cannot consider an (infinite) sequence of punctual descriptions as provided by the previous assumption. Moreover, we have to choose the textual alternatives while avoiding any inaccessibility of the resulting meaning. In order to preserve such a meaning we have to establish some constraints.

In particular, we can state that, given a SMIL presentation M lasting an interval [tbegin, tend], for each medium c played out on an interval [t1, t2] (a subset of [tbegin, tend]), we can choose the textual equivalents as follows: firstly the textual alternative of c which is active at t1, and then any other textual alternative of c whose begin instant belongs to [t1, t2] and is greater than the end of the duration of the previous description of c. Such descriptions cover (without any discontinuity) the [t1, t2] interval, and such a sequence describes c on [t1, t2]. Let us note that, whenever a textual description of c does not have an explicit duration, we can always assume it lasts as long as c itself. In order to give a textual description of the whole presentation M on [t1, t2], it is necessary to choose a proper sequence of descriptions, so as to avoid any distortion of meaning and to better arrange the textual version. In particular, we have to choose a sequence of intervals whose boundaries correspond to the instants at which the set of played-out media changes. In such a way, we provide a meaningful approximation of the original media and we are able to arrange the textual contents.

Fig. 6 Example of SMIL presentation temporal diagram (media components c1–c6 along the time axis, from tbegin to tend, with the sub-interval [t1, t2] marked)

The diagram depicted in Fig. 6 shows a typical sequence of synchronized media components in a SMIL presentation. Every component is drawn as a continuous line along its play-out interval. Referring to the diagram, M has a textual equivalent (i.e., it is accessible) on [t1, t2] if, for each instant t in [t1, t2], the textual equivalent of each media component having a play-out at t exists.
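Restated compactly, with notation introduced here for convenience (Active, TextAlt, begin, end and TextEq are symbols chosen for this summary, not taken from the paper), the punctual and interval conditions read:

    \Big(\forall c \in \mathrm{Active}(M,t)\ \ \exists\, d \in \mathrm{TextAlt}(c):\ t \in [\mathrm{begin}(d), \mathrm{end}(d)]\Big) \;\Rightarrow\; \mathrm{TextEq}(M,t)

    \Big(\forall t \in [t_1, t_2]:\ \mathrm{TextEq}(M,t)\Big) \;\Rightarrow\; \mathrm{TextEq}(M,[t_1,t_2])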

As an example, let us consider the following scenario, where users with disabilities access a movie slice, as shown in Fig. 7.

The original multimedia presentation is composed of:

i) a video content showing the movie "Surf's Up" (c1), and
ii) an audio content reporting the characters' talk (c2).

In order to meet deaf users' needs, it is necessary to add a caption sequence as a textual alternative to the audio track, while the latter may not be played out. It is worth noting that the captions have to be synchronized with the audio and with the other media the presentation is composed of (see Fig. 8). Hence, the final multimedia presentation is composed of a video track and a caption sequence, which is the textual alternative to the audio track. Such a version of the presentation is accessible to users with hearing disabilities.
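A minimal sketch of this choice in plain SMIL 3.0 (file names invented; the standard systemCaptions test attribute is used here rather than the paper's extended syntax):

    <par>
      <video src="surfs_up_clip.mpg"/>
      <switch>
        <!-- caption sequence chosen when the player is configured for captions -->
        <textstream src="characters_talk_captions.rt" systemCaptions="on"/>
        <!-- otherwise the original audio track is played -->
        <audio src="characters_talk.mp3"/>
      </switch>
    </par>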

When considering a user with visual impairments, we have to take into account that he/she gains access to the content with a PC equipped with a screen reader and/or a Braille display (i.e., the assistive technologies that enable blind people to use a computer).

Fig. 7 Example of SMIL presentation temporal diagram (original media components c1 and c2 along the time axis, from tbegin to tend)

Fig. 8 Original media and final multimedia (sample caption text: "When I have seen Big Z")

Due to the user's impairments, only audio tracks can be utilized along the movie. Thus, all detailed visual information may be omitted and substituted, whenever possible, with auditory descriptions (see Fig. 9) or alternative text. In this case, the final multimedia presentation is composed of two audio tracks: the original audio track and a sequence of audio descriptions, as an auditory alternative to the video track. Such a version of the multimedia presentation is accessible to blind users.
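Analogously, a hedged sketch for this case (file names invented; the standard systemAudioDesc test attribute is used instead of the paper's extended syntax):

    <par>
      <audio src="characters_talk.mp3"/>
      <switch>
        <!-- sequence of audio descriptions chosen when the player requests them -->
        <seq systemAudioDesc="on">
          <audio src="scene_description_1.mp3"/>
          <audio src="scene_description_2.mp3"/>
        </seq>
        <!-- otherwise the original video track is rendered -->
        <video src="surfs_up_clip.mpg"/>
      </switch>
    </par>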

Figure 10 shows the final multimedia presentation, which is composed of the original audio (c2) and video (c1) tracks and of their alternatives:

i) a sequence of audio descriptions (c3, c4 and c5) as an auditory alternative to the video track, and

ii) a caption sequence (c6) as a textual alternative to the audio track.

Fig. 9 Original media and final presentation

Fig. 10 Example of SMIL presentation temporal diagram (original components c1 and c2 together with their alternatives c3, c4, c5 and c6, along the time axis from tbegin to tend)

8 Conclusions and future work

Our approach aims to make multimedia contents more manageable by a Web community, so as to overcome the closeness limits they could present on the Web. The advantages of SMIL, namely being a standard and being able to describe compound multimedia in a fine-grained way, have been exploited to provide a suitable extension which takes into account editing and synchronizing activities. A wiki prototype has been implemented, in order to provide a system to edit and synchronize multimedia through a user-friendly interface. In this way, directors/users can manage media and sequences of multimedia items by using a simple wiki syntax, without directly editing SMIL code, while maintaining the consistency and validity of the multimedia documents. Such a wiki also offers a versioning system, so as to track the different versions made by different directors/users. Two different scenarios, showing the capability to add customizable and accessible versions of multimedia contents, have been detailed in order to demonstrate the feasibility of our system.


At the moment we are still assessing the effectiveness and the simplicity of the system with user tests. A last, but not least, consideration is about copyright issues. In a collaborative system it is hard to define the authorship of a work. Moreover, in our system, a director/user could add new media and change the synchronization of the media which compose the whole multimedia presentation. This creates a new version of the multimedia, opening questions about the actual authorship of such a new multimedia presentation. At the moment the copyright issue remains an open question.

References

1. Bulterman DCA et al (2006) An architecture for viewer-side enrichment of TV content. Presented at the 14th annual ACM international conference on Multimedia (MM'06), Santa Barbara, CA, USA, pp 651–654, Oct
2. Cattelan RG et al (2008) Watch-and-comment as a paradigm toward ubiquitous interactive video editing. ACM Transactions on Multimedia Computing, Communications and Applications 4(4), Article 28, Oct
3. Cesar P et al (2008) Enhancing social sharing of videos: fragment, annotate, enrich, and share. Presented at the 16th ACM international conference on Multimedia (MM'08), Vancouver, British Columbia, Canada, pp 11–20, Oct
4. Désilets A et al (2006) Translation the Wiki way. Presented at the 2006 ACM International Symposium on Wikis (WikiSym 2006), Odense, Denmark, August 21–23
5. Facebook (2009) Available: http://www.facebook.com/
6. Ferretti S et al (2008) E-learning 2.0: you are We-LCoME! Presented at the 2008 W4A International Cross-Disciplinary Conference on Web Accessibility, Beijing, China, April 21–22
7. Foll S et al (2006) Classifying multimedia resources using social relationships. Presented at the Eighth IEEE International Symposium on Multimedia (ISM'06), San Diego, CA, USA, Dec 11–13
8. GoogleVideo (2009) Available: http://video.google.com/
9. Hossain MA et al (2006) MeTaMaF: Metadata tagging and mapping framework for managing multimedia content. Presented at the Eighth IEEE International Symposium on Multimedia (ISM'06), San Diego, CA, USA, Dec 11–13
10. IMS Global Learning Consortium (2006) Guidelines for developing accessible learning applications. Available: http://www.imsglobal.org/accessibility/
11. IMS Global Learning Consortium (2004) IMS AccessForAll Meta-Data (ACCMD). Available: http://www.imsglobal.org/specificationdownload.cfm
12. Italian parliament (2004) Law nr. 4–01/09/2004. Official Journal nr. 13–01/17/2004, Jan
13. Jansens J et al (2008) Enabling adaptive time-based web applications with SMIL state. Presented at the 8th ACM symposium on Document engineering (DocEng'08), São Paulo, Brazil, pp 18–27, Sep
14. Kosch H et al (2005) The life cycle of multimedia metadata. IEEE Multimed 12(1):80–86
15. Monteiro de Resende Costa R et al (2006) Live editing of hypermedia documents. Presented at the 6th ACM symposium on Document engineering (DocEng'06), Amsterdam, The Netherlands, pp 165–172, Oct
16. MPEG Requirements Group (2002) MPEG-21 Overview. ISO/MPEG N4991
17. MySpace (2009) Available: http://www.myspace.com/
18. Nack F et al (2004) That obscure object of desire: multimedia metadata on the web, Part 2. IEEE Multimed 12(1):54–63
19. Nested Context Language (NCL) (2009) Available: http://www.ncl.org.br/index_en.php
20. Pea R et al (2004) The diver project: interactive digital video repurposing. IEEE Multimed 11(1):54–61
21. Sgouros NM et al (2007) Towards open source authoring and presentation of multimedia content. Presented at the International workshop on Human-centered multimedia (HCM'07), Augsburg, Bavaria, Germany, pp 41–46, Sep
22. Shaw R et al (2006) Community annotation and remix: a research platform and pilot deployment. Presented at the 1st ACM international workshop on Human-centered multimedia (HCM'06), Santa Barbara, CA, USA, pp 89–98, Oct
23. SMIL 3.0 Daisy Profile Module (2009) Available: http://www.w3.org/TR/SMIL/smil-daisy-profile.html
24. Tang CW (2007) Spatiotemporal visual considerations for video coding. IEEE Trans Multimedia 7(2):231–238
25. The Ambulant Open Source SMIL player (2009) Available: http://www.ambulantplayer.org/
26. Van Ossenbruggen J et al (2004) That obscure object of desire: multimedia metadata on the web, Part 1. IEEE Multimed 11(4):38–48
27. World Wide Web Consortium (1999) Accessibility features of SMIL. Available: http://www.w3.org/TR/SMIL-access/, Sep
28. World Wide Web Consortium (2008) Synchronized multimedia integration language 3.0. Available: http://www.w3.org/TR/2008/REC-SMIL3-20081201/, Dec
29. World Wide Web Consortium (2008) The SMIL 3.0 CustomTestAttributes Module. Available: http://www.w3.org/TR/2008/REC-SMIL3-20081201/smil-content.html#ContentControlNS-UserGroups/, Dec
30. World Wide Web Consortium (2008) SMIL 2.1 Timing and Synchronization Module. Available: http://www.w3.org/TR/2008/REC-SMIL3-20081201/smil-timing.html, Dec
31. World Wide Web Consortium (2008) Web content accessibility guidelines 2.0. Available: http://www.w3.org/TR/WCAG20/, Dec
32. X-smiles (2009) Available: http://www.x-smiles.org/
33. YouTube (2009) Available: http://www.youtube.com/

Silvia Mirri is an assistant professor at the Department of Computer Science of the University of Bologna. Her research interests include multimodal interaction, accessibility and e-learning.

Ludovico A. Muratori is a research associate at the Department of Computer Science of the University of Bologna. His current research interests include Web accessibility and multimedia systems for the Web.



Marco Roccetti is a professor of Computer Science at the University of Bologna. His research interests include digital audio and video for multimedia communications, wireless and consumer multimedia, and computer-based entertainment.

Paola Salomoni is an associate professor of Computer Science at the Department of Computer Science of the University of Bologna. Her research interests include distributed multimedia systems, wireless multimedia, e-learning and accessibility.
