[ieee 2010 ieee second international workshop on software aging and rejuvenation (wosar) - san jose,...

Resilient Hypermedia Presentations

Marcio Ferreira Moreno Luiz Fernando Gomes Soares

Department of Informatics Pontifical Catholic University of Rio de Janeiro (PUC-Rio)

Rio de Janeiro/RJ, Brazil {mfmoreno, lfgs}@inf.puc-rio.br

Abstract—This paper proposes a recovery plan for Ginga-NCL, the declarative middleware environment of the Japan-Brazilian Digital TV Standard and ITU-T Recommendation for IPTV services. The proposed plan aims at providing resilience to digital TV presentations. As proof of concept, the recovery plan has been incorporated into the Ginga-NCL reference implementation. However, it can also be applied to other DTV middlewares.

Keywords: Ginga-NCL; DTV; proactive recovery; reactive recovery; middleware rejuvenation

I. INTRODUCTION

Interactive digital TV (DTV) applications are a special kind of hypermedia application in which several media objects of different content types, including the main audiovisual stream, are synchronized in time and space, making up a sequence of scenes that must be presented on the receiver side. To support these applications and make them independent of platforms, receivers provide a software layer, called middleware, that gives access to their hardware and operating system resources.

Reliability is one of the main requirements of middleware design, which commonly relies on third-party libraries to perform specific tasks, such as decoding and rendering content of different types, handling input events, etc. [1]. Third-party libraries can cause many reliability problems, since the middleware is exposed to their potential problems, such as memory leaks, unexpected behavior, or even critical faults that, if not prevented, can cause unrecoverable errors.

Tools and techniques designed to reduce the number of faults are not sufficient to ensure reliability in software systems [2]. It is necessary to build systems that recognize faults and incorporate techniques to tolerate them, or even recover from them, while still providing an acceptable service level.

None of the data structures used to schedule hypermedia presentations, such as those defined by Costa et al. [3] for the declarative environment of the Ginga middleware (called Ginga-NCL), addresses fault recovery. To cover this gap, this paper proposes a recovery plan for Ginga-NCL, the standard declarative environment of the ISDB-T (Integrated Services Digital Broadcasting, Terrestrial) [4] DTV system and ITU-T Recommendation for IPTV services [5]. The proposal aims at making resilient not only the Ginga-NCL environment itself but also the running DTV applications. As proof of concept, the recovery plan has been incorporated into the Ginga-NCL reference implementation [4] [5].

The remainder of the paper is organized as follows. Section 2 overviews some related work. Section 3 presents the Ginga-NCL architecture, including the proposed recovery plan. Section 4 discusses the integration of this plan into the Ginga-NCL reference implementation. Finally, Section 5 presents the conclusions.

II. RELATED WORK

To the best of our knowledge, this is the first work that reports the use of fault tolerance techniques to make DTV presentations resilient, as well as the DTV middleware that supports them. However, several works in the literature report resilient solutions in other domains, or even of general scope. Some of them guided our approach.

Huang et al. [6] propose a proactive recovery mechanism based on software rejuvenation, assuming that the initial state of a system is the most correct and consistent one in its entire lifecycle. Most work on software rejuvenation has focused on specifying policies to increase system availability and to reduce rejuvenation cost [7] [9].

Proactive recovery techniques are also used in reliable distributed systems [9] [10]. Castro and Liskov [9] describe a proactive recovery system focusing on Byzantine-fault tolerance. In this system, server replicas are periodically rejuvenated to eliminate the effects of malicious attacks and system faults. Sousa et al. [10] propose a complementary approach, which combines the proactive recovery techniques with services that allow for replicas in normal state to act reactively, in case of fault detection, in order to recover impaired replicas.

The solution proposed by Fugini and Mussi [11] also combines proactive and reactive recovery techniques. They present a resilient architecture for Web service provisioning, with mechanisms for detecting and recovering from faults. An interesting point in that work is that input data inconsistencies (data entered by a Web service user application) are also considered faults. The defined fault recovery mechanisms focus on Web service replacement and on the recovery of input data quality.

Table 1 presents a comparison between the solution proposed in this paper and the previously mentioned related work that has influenced our approach. Like the solutions of Sousa et al. [10] and Fugini and Mussi [11], this paper pursues a combination of proactive and reactive recovery techniques, defining not only a rejuvenation methodology for parts of the system but also mechanisms for reactive fault recovery. As mentioned before, reactive recovery techniques are important because of possible faults in third-party libraries, which could break hypermedia document presentations.

978-1-61284-346-9/11/$26.00 ©2011 IEEE

TABLE I. COMPARISON WITH RELATED WORK

                     Proactive Rec.   Reactive Rec.   Resilient Applications   Media Synch
Huang et al.         yes              -               -                        -
Castro and Liskov    yes              -               -                        -
Sousa et al.         yes              yes             -                        -
Fugini and Mussi     yes              yes             -                        -
Ginga-NCL            yes              yes             yes                      yes

To provide resilient hypermedia document presentations, the recovery plan must be able to track the presentation consistency, much as the proposal of Fugini and Mussi [11] monitors the consistency of data entered by applications. However, the solution of Fugini and Mussi [11] aims at making the services used by applications resilient, whereas the solution in this paper must control system aging and react to inconsistencies in hypermedia document presentations to make them resilient.

Unlike the other systems in Table 1, the work proposed in this paper is able to provide resilience both to the system and to applications using the system. To provide resilient hypermedia presentations, it is essential to understand the media synchronization mechanisms, since spatial and temporal relationships defined by the application author must be respected.

III. ARCHITECTURE

Figure 1 shows the modular architecture of the Ginga middleware, divided into its two logical subsystems: the Ginga-NCL presentation environment and the Ginga Common Core (Ginga-CC). Ginga-CC is responsible for providing basic services, related to the receiver platform, to the Ginga-NCL presentation environment, to resident applications, and to optional imperative environments. The Ginga-NCL presentation environment is the logical subsystem that can initiate and control NCL applications.

In this paper, a module of Ginga consists of a set of software components that together provide a specific functionality. The software components are defined as in Szyperski [12]: composite units with contractually specified interfaces and with explicit context dependencies, which can be independently deployed or grouped by third parties.

[Figure 1 diagram: the Ginga-NCL Presentation Environment (Formatter, XML Parser, Converter, Presentation Scheduler, Player Manager, Adapters, Private Base Manager, Layout Manager, NCL Context Manager, Recovery Manager) on top of Ginga-CC (Tuner, Data Processing, Transport, I/O Manager, Players, Lua Engine, Graphic Manager, Dynamic Evolution Manager, Device Manager, Context Manager), over an Abstraction Layer and the Libraries, Operating System and Hardware Drivers.]

Figure 1. Ginga Architecture.

A. Ginga-CC

In Ginga-CC, the Tuner module is responsible for receiving broadcast DTV content. DTV applications can arrive multiplexed in a stream received by this module, or through another network interface. In the first case, applications are extracted from the received stream by the Data Processing module. In the second case, a Transport module controls protocols and network interfaces in order to receive application specifications or application content.

The I/O Manager module is in charge of managing temporary storage of applications, including their media content.

The Players module provides decoders and renderers for presenting each specific content type. In particular, a Lua engine, a Lua player component, is part of Ginga-CC. Lua [13] is the scripting language of NCL.

The Graphic Manager module supports the spatial control of content rendering, including the main audiovisual DTV stream.

The functionalities provided by Ginga modules can be updated independently. Updates can be received as broadcast pushed data or obtained from repositories by using the Transport module. The Dynamic Evolution Manager module is responsible for performing the update procedure in real time, that is, without interrupting the middleware execution.

The Device Manager module controls multiple exhibition devices in a distributed presentation. Domain management, device registrations, communication between devices, and synchronism consistency are among its functions.

Finally, the Context Manager module is in charge of gathering platform and viewer profiles, feeding and controlling a context database.

B. Ginga-NCL Presentation Environment

The core of the Ginga-NCL Presentation Environment is the Formatter module, shown in Figure 1. This module is responsible for receiving and controlling NCL applications.

Upon receiving an application, the Formatter requests the XML Parser and Converter modules to translate the application to the Ginga-NCL internal data structures, which compose a Private Base.

From then on, the Presentation Scheduler module is requested to orchestrate the NCL presentation. This module is responsible for commanding the Player Manager module to instantiate specific Players, according to the media content type to be exhibited in a given moment in time. Whenever a content presentation finishes, the Presentation Scheduler is notified by the corresponding Player. If the content presentation is no longer required, the Presentation Scheduler instructs the Player Manager module to kill the corresponding Player instance.

A generic API is defined to establish the communication between Player components and the Presentation Environment (Presentation Scheduler module). Thanks to this API, the Ginga-NCL Presentation Environment and Ginga-CC are strongly coupled but independent subsystems. For example, Ginga-CC for terrestrial DTV can be replaced by a third-party implementation that supports IPTV, allowing Ginga-NCL to be used as an IPTV middleware or as an extension to existing IPTV middlewares.

Players that do not follow the generic API must use services provided by Adapter components. Any user agent or execution engine, for example an XHTML browser, may be adapted as a Ginga-NCL Player.

A Private Base Manager module is in charge of receiving NCL editing commands and maintaining the NCL documents being presented. The set of NCL live editing commands [4] is divided into three subsets.

The first one focuses on the private base activation and deactivation (openBase, activateBase, deactivateBase, saveBase, and closeBase commands). In an open private base, NCL applications can be started, paused, resumed, stopped and removed, through well defined editing commands that compose the second subset. In Ginga, a DTV application can be created or modified on the fly, using NCL editing commands. The third subset defines commands for these purposes, allowing NCL elements to be added and removed, and allowing values to be set to NCL elements’ attributes.

The Layout Manager module is responsible for mapping all media object placements, defined by an NCL application, to canvas on specific exhibition devices that compose a distributed exhibition platform, supported by Ginga-CC.

The NCL Context Manager module supports content and content presentation adaptations, according to information provided by the Ginga-CC (its Context Manager module) and directives provided by NCL applications.

The Recovery Manager module implements fault recovery procedures and is discussed in the next section.

C. Recovery Plan

To define target sets for recovery techniques, the Ginga modules are classified into four sets: Risk, Presentation, Control, and Recovery modules.

Risk modules are those that can impair the middleware reliability. Usually, this can happen due to the use of third-party libraries. The following Ginga modules make up this set: Players (including Lua Engine), Graphic Manager, Context Manager, Tuner, Transport, I/O Manager, Data Processing, Device Manager and XML Parser.

Presentation modules are those that control Risk modules and are directly involved with the content presentation. Adapters and Layout Manager modules are part of this set.

Control modules are those that control Presentation modules or Risk modules. The following modules are part of this set: Dynamic Evolution Manager, Private Base Manager, Converters, Player Manager, NCL Context Manager and Presentation Scheduler.

Finally, Recovery modules are those responsible for fault detections, control and recovery. The Recovery Manager is the single module in this set. Indeed, it is a super-module that is internally divided into several other components.
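The classification above can be written down as a simple lookup table. The following is only an illustrative sketch: the module names come from the text, while the enumeration, the dictionary layout and the set_of helper are assumptions introduced here and are not part of the reference implementation.

```python
from enum import Enum


class ModuleSet(Enum):
    RISK = "risk"
    PRESENTATION = "presentation"
    CONTROL = "control"
    RECOVERY = "recovery"


# Module names follow the text; the data layout itself is hypothetical.
MODULE_SETS = {
    ModuleSet.RISK: [
        "Players", "Lua Engine", "Graphic Manager", "Context Manager",
        "Tuner", "Transport", "I/O Manager", "Data Processing",
        "Device Manager", "XML Parser",
    ],
    ModuleSet.PRESENTATION: ["Adapters", "Layout Manager"],
    ModuleSet.CONTROL: [
        "Dynamic Evolution Manager", "Private Base Manager", "Converters",
        "Player Manager", "NCL Context Manager", "Presentation Scheduler",
    ],
    ModuleSet.RECOVERY: ["Recovery Manager"],
}


def set_of(module: str) -> ModuleSet:
    """Return the target set a module belongs to (hypothetical helper)."""
    for module_set, modules in MODULE_SETS.items():
        if module in modules:
            return module_set
    raise KeyError(module)
```

Such a table makes explicit which recovery technique may target a module: proactive rejuvenation starts from Control modules, while reactive recovery is triggered by faults in Risk modules.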

Figure 2 illustrates how Recovery modules act on the other sets to create a fault recovery plan. In what follows, the architecture presented in Figure 2 is divided according to the recovery techniques: Proactive and Reactive.

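[Figure 2 diagram: the Recovery module acting on the Control, Presentation and Risk module sets. Proactive Recovery contains the Monitor, Qualifier and Policy components, connected by arrows "a" to "e". Reactive Recovery contains the Monitor, Fault Identifier, Register, Validation, Action Identifier, Policy and Recovery Manager components, connected by arrows "1" to "11".]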

Figure 2. Architecture of the Ginga-NCL Recovery Plan.

Using proactive recovery, the Qualifier component starts the rejuvenation of Control modules. When this happens, Presentation and Risk modules ruled by Control modules are also indirectly rejuvenated.

In order to know when to start the rejuvenation process of each Control module, without impairing system availability, a Monitor component registers itself as a listener of each Control module's behavior. When a rejuvenation operation is allowed in a Control module, the Monitor is notified (in Figure 2, arrow "a"). It then passes the information to the Qualifier component, which evaluates whether there is any recovery action to be performed (in Figure 2, arrow "b"). In the evaluation process, a proactive recovery policy can be taken into account. This policy is accessed via a query to the Policy component (in Figure 2, arrow "c"). For example, receivers able to display only one interactive application at a time must have as a policy that only one Formatter may be instantiated. The Qualifier also queries the Control module state to make sure that the rejuvenation can be performed (in Figure 2, arrow "d"). Only then can the recovery action be executed (in Figure 2, arrow "e").
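The proactive sequence (arrows "a" to "e") can be sketched as a chain of listener calls. This is a minimal sketch under stated assumptions: the class names mirror the components of Figure 2, but every method name, the busy flag and the predicate form of the policy are hypothetical and do not reproduce the Ginga-NCL code.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class ControlModule:
    name: str
    busy: bool = False            # hypothetical state queried via arrow "d"
    rejuvenations: int = 0
    listeners: List["Monitor"] = field(default_factory=list)

    def notify_rejuvenation_allowed(self) -> None:
        # Arrow "a": tell every registered Monitor that rejuvenation is allowed.
        for listener in self.listeners:
            listener.on_rejuvenation_allowed(self)

    def rejuvenate(self) -> None:
        self.rejuvenations += 1


class Policy:
    """Proactive recovery policy; modelling it as a predicate is an assumption."""

    def __init__(self, allows: Callable[[ControlModule], bool] = lambda m: True):
        self.allows = allows


class Qualifier:
    def __init__(self, policy: Policy):
        self.policy = policy

    def evaluate(self, module: ControlModule) -> bool:
        if not self.policy.allows(module):   # arrow "c": query the policy
            return False
        if module.busy:                      # arrow "d": query the module state
            return False
        module.rejuvenate()                  # arrow "e": perform the action
        return True


class Monitor:
    def __init__(self, qualifier: Qualifier):
        self.qualifier = qualifier
        self.results: List[bool] = []

    def watch(self, module: ControlModule) -> None:
        module.listeners.append(self)

    def on_rejuvenation_allowed(self, module: ControlModule) -> None:
        # Arrow "b": relay the notification to the Qualifier.
        self.results.append(self.qualifier.evaluate(module))
```

For instance, a Monitor watching a ControlModule named "Presentation Scheduler" triggers one rejuvenation when the module reports that rejuvenation is allowed, and none while the module's busy flag is set.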

In reactive recovery, a Monitor receives fault notifications (in Figure 2, arrow "1") by registering itself as a listener of Risk modules faults. Upon receiving a notification, the Monitor relays it to the Fault Identifier component (Figure 2, arrow "2"). With this information, the Fault Identifier requests the Register component to log the fault (in Figure 2, arrow "3"). The Register returns to the Fault Identifier how many times the fault has already occurred within a time interval. With this information, the Fault Identifier queries the reactive recovery Policy to evaluate if the fault should be treated (in Figure 2, arrow "4"). For example, a receiver can define as a policy that a particular fault must no longer be treated after a certain number of recovery attempts. If the Fault Identifier concludes that the fault must not be treated, no recovery action is performed. Otherwise, it passes the fault information to the Action Identifier component (in Figure 2, arrow "5").

The Action Identifier evaluates the recovery actions to be performed. If an action must be started, the Action Identifier calls the Validation component to validate the state of the module to be recovered (in Figure 2, arrow "6"). To this end, the Validation component queries the Control or the Presentation module to be recovered, returning the result to the Action Identifier (in Figure 2, arrow "7"). The returned state may indicate that, in spite of the fault, the module has performed its operations properly. In this case, no recovery action is performed. Otherwise, the Action Identifier defines the recovery action, gathers the information necessary to accomplish the task, and then notifies the Recovery Manager (in Figure 2, arrow "8").

Before performing a recovery action, the Recovery Manager queries the reactive recovery Policy, to be informed of the action requirements (in Figure 2, arrow "9"). An example of requirement is the time interval in which the recovery action is valid. With all the necessary information, the Recovery Manager performs the recovery action on the Risk module in fault (in Figure 2, arrow "10"), and notifies the Presentation module that controls that Risk module (Figure 2, arrow "11").
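The reactive sequence can be condensed into one function that walks through the same decision points. The sketch below is illustrative only: the Register and ReactivePolicy classes, the handle_fault function, its return labels, and the representation of the policy as a maximum number of attempts within a time window are assumptions made for the example.

```python
from collections import defaultdict
from typing import Callable, Dict, List


class Register:
    """Logs faults and reports how many occurred within a time window."""

    def __init__(self) -> None:
        self._log: Dict[str, List[float]] = defaultdict(list)

    def log(self, fault_id: str, now: float, window: float) -> int:
        self._log[fault_id].append(now)
        return sum(1 for t in self._log[fault_id] if now - t <= window)


class ReactivePolicy:
    """Hypothetical policy: stop treating a fault after max_attempts in window."""

    def __init__(self, max_attempts: int, window: float) -> None:
        self.max_attempts = max_attempts
        self.window = window

    def should_treat(self, occurrences: int) -> bool:
        return occurrences <= self.max_attempts


def handle_fault(fault_id: str, now: float, register: Register,
                 policy: ReactivePolicy,
                 module_ok: Callable[[], bool],
                 recover: Callable[[], None]) -> str:
    occurrences = register.log(fault_id, now, policy.window)  # arrow "3"
    if not policy.should_treat(occurrences):                  # arrow "4"
        return "ignored"
    if module_ok():                                           # arrows "6" and "7"
        return "no-action"
    recover()                                                 # arrows "8" to "11"
    return "recovered"
```

With max_attempts set to 2 and a 10-second window, the third occurrence of the same fault within the window is no longer treated, which corresponds to the policy example given in the text.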

IV. IMPLEMENTATION

The recovery plan support was included in version 0.12.1 of the Ginga-NCL reference implementation (see footnote 1) [10] by means of two new components called Proactive Recovery Manager and Reactive Recovery Manager.

The Proactive Recovery Manager component, shown in Figure 3 as ProactiveRecoveryManager, embeds all functionalities required to implement the proactive recovery techniques. In this component, two types of monitors are implemented.

The first type monitors the behavior of the Tuner component, as shown in Figure 3, and implements the ITunerListener interface. When instantiated, the ProactiveRecoveryManager creates this monitor and registers it as a listener of the Tuner component, through its ITuner interface. The Tuner component is accessed by ProactiveRecoveryManager through the Dynamic Evolution Manager, shown in Figure 3 as ComponentManager.

1 The current version of the Ginga-NCL reference implementation is 0.12.1. All versions of the reference implementation are available at www.softwarepublico.gov.br.

The ComponentManager component is implemented using the Singleton design pattern. Its single instance can be accessed through the IComponentManager interface.

Once the monitor is registered, it is notified whenever a new channel is tuned. It then passes the notification to the qualifier, which evaluates what action has to be taken. If no Formatter is instantiated, the qualifier is able to rejuvenate all Ginga-CC components, again using the ComponentManager facilities through the IComponentManager interface.
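The channel-change rule can be sketched with a plain listener. This is a hypothetical illustration: only the ITunerListener role and the rule "rejuvenate Ginga-CC components if no Formatter is instantiated" come from the text; all class and method names below are invented for the example.

```python
from typing import List


class ProactiveQualifier:
    """Decides whether Ginga-CC components can be rejuvenated (sketch)."""

    def __init__(self, components: List[str]) -> None:
        self.components = components          # illustrative component names
        self.formatter_instantiated = False
        self.rejuvenated: List[str] = []

    def on_channel_tuned(self, channel: str) -> bool:
        if self.formatter_instantiated:
            # An application is being presented; do not rejuvenate now.
            return False
        self.rejuvenated.extend(self.components)
        return True


class TunerMonitor:
    """Plays the role of an ITunerListener implementation."""

    def __init__(self, qualifier: ProactiveQualifier) -> None:
        self.qualifier = qualifier

    def channel_tuned(self, channel: str) -> bool:
        return self.qualifier.on_channel_tuned(channel)


class Tuner:
    def __init__(self) -> None:
        self._listeners: List[TunerMonitor] = []

    def add_listener(self, listener: TunerMonitor) -> None:
        self._listeners.append(listener)

    def tune(self, channel: str) -> List[bool]:
        # Notify every registered listener that a new channel was tuned.
        return [listener.channel_tuned(channel) for listener in self._listeners]
```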

Figure 3. Component Diagram for Proactive Recovery.

The second monitor type is notified when interactive applications are received. To this end, it inherits both the IAITListener and the IAppListener interfaces, which allow it to receive notifications from the Data Processing component (shown in Figure 3 as DataProcessor) and from the Transport component (shown in Figure 3 as TransportManager), respectively. In the constructor method of ProactiveRecoveryManager, this monitor is created and registered as a listener of the DataProcessor component through the IDataProcessor interface, and also as a listener of the TransportManager component through the ITransportManager interface. The DataProcessor and the TransportManager components are accessed by ProactiveRecoveryManager through the IComponentManager interface of the ComponentManager component.

When an interactive DTV application is started, the ProactiveRecoveryManager qualifier component is notified by the monitor. The qualifier then evaluates the notification and instantiates a new Formatter to present the application.

When a new channel is tuned, the running application is interrupted and its associated Formatter can be destroyed. If a viewer switches back to the previous channel, Ginga-NCL allows the application to be resumed, now supported by a rejuvenated Formatter.

An interface of the ProactiveRecoveryManager component that allows proactive recovery policy queries was left for future versions of the implementation.

To implement the Reactive Recovery Manager, the Ginga-NCL Player components had to be refactored, allowing an application presentation to be supported by multiple running processes.

To allow better control of the notifications coming from the signals of these processes and to ease embedding the code in different platforms, a process abstraction was created, as illustrated by the Process component in Figure 4. This component is used to create independent Player processes (Player component in Figure 4) that, after refactoring, have to implement the IProcess interface.

To receive fault notifications from Player processes, the Reactive Recovery Manager component, shown in Figure 4 as ReactiveRecoveryManager, has a monitor, which must implement the IProcessListener interface. When a Player is created, a new monitor must be registered as its observer. To this end, ReactiveRecoveryManager has a factory that implements the monitor IPAManager interface. When instantiated, the factory registers itself as an observer of the Player Manager component (shown in Figure 4 as PlayerAdapterManager) through the IPlayerAdapterManager interface. In order to access the PlayerAdapterManager, ComponentManager features are used.

When a Player is created by the PlayerAdapterManager, the ReactiveRecoveryManager factory receives a notification. Through the IPlayer interface, implemented by the Player, the factory can register a new fault monitor as a Player observer.
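The registration chain just described is an instance of the observer pattern applied twice: the factory observes the Player Manager, and each monitor it creates observes one Player. A minimal sketch follows; the class names echo the component names in Figure 4, but all method names and signatures are assumptions and do not reproduce the actual interfaces.

```python
from typing import Dict, List, Tuple


class FaultMonitor:
    """Plays the role of an IProcessListener implementation."""

    def __init__(self) -> None:
        self.faults: List[Tuple[str, str]] = []

    def on_process_fault(self, player: "Player", fault: str) -> None:
        self.faults.append((player.name, fault))


class Player:
    def __init__(self, name: str) -> None:
        self.name = name
        self._observers: List[FaultMonitor] = []

    def add_observer(self, observer: FaultMonitor) -> None:
        self._observers.append(observer)

    def report_fault(self, fault: str) -> None:
        for observer in self._observers:
            observer.on_process_fault(self, fault)


class PlayerAdapterManager:
    def __init__(self) -> None:
        self._observers: List["MonitorFactory"] = []

    def add_observer(self, observer: "MonitorFactory") -> None:
        self._observers.append(observer)

    def create_player(self, name: str) -> Player:
        player = Player(name)
        # Notify observers (the factory) that a Player was created.
        for observer in self._observers:
            observer.on_player_created(player)
        return player


class MonitorFactory:
    """Registers one new fault monitor on every Player that is created."""

    def __init__(self, manager: PlayerAdapterManager) -> None:
        self.monitors: Dict[str, FaultMonitor] = {}
        manager.add_observer(self)

    def on_player_created(self, player: Player) -> None:
        monitor = FaultMonitor()
        player.add_observer(monitor)
        self.monitors[player.name] = monitor
```

One design consequence of this arrangement is that Players need no knowledge of the recovery components; they only expose an observer registration point.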

Figure 4. Component Diagram for Reactive Recovery.

When the monitor receives a fault notification, it forwards it to the Fault Identifier, which classifies the fault according to the types given by column "Fault" in Table 2.

TABLE II. TYPES OF FAULT AND CORRESPONDING ACTIONS

Fault                          Action
Content missing or invalid     Skip relationships specified for the object
Fatal decoding or rendering    Retrieve the specified state
Application Specification      Exception Handling

Once the fault is classified, it is registered and the reactive recovery policy is consulted. The Fault Register is implemented to uniquely identify a fault and its source. The recovery policy provides information only about how many faults can occur and within which time interval. If these values are not defined, no policy validation is performed.

After consulting the recovery policy, the Fault Identifier passes the information to the Action Identifier, which defines the action in agreement with Table 2.

The content reception functions defined by DTV systems should have fault correction mechanisms [14]. The "content missing or invalid" fault (see Table 2) occurs when there are errors in the content generation performed by content providers; for example, a DTV application referring to non-existent content or to a corrupted content. The recovery action defined for this fault (see Table 2) aims at maintaining the presentation consistency [4] [5]. In this case, it is not necessary to validate the state of the object that has reported the fault, or to consult the reactive recovery policy to perform the corresponding recovery action.
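The mapping in Table 2, together with the two rules stated above (no policy validation when the policy values are undefined, and no policy consultation for the "content missing or invalid" fault), can be sketched as a small selection function. The enumeration, the select_action function and its parameters are hypothetical; only the fault names and actions come from Table 2.

```python
from enum import Enum
from typing import Optional


class Fault(Enum):
    CONTENT_MISSING_OR_INVALID = "content missing or invalid"
    FATAL_DECODING_OR_RENDERING = "fatal decoding or rendering"
    APPLICATION_SPECIFICATION = "application specification"


# Fault-to-action mapping taken from Table 2.
ACTIONS = {
    Fault.CONTENT_MISSING_OR_INVALID: "skip relationships specified for the object",
    Fault.FATAL_DECODING_OR_RENDERING: "retrieve the specified state",
    Fault.APPLICATION_SPECIFICATION: "exception handling",
}


def select_action(fault: Fault, occurrences: int,
                  max_faults: Optional[int] = None) -> Optional[str]:
    """Return the action for a fault, or None if the policy forbids treatment.

    occurrences is assumed to be the count already restricted to the
    policy's time interval; max_faults=None means no limit is defined.
    """
    needs_policy = fault is not Fault.CONTENT_MISSING_OR_INVALID
    if needs_policy and max_faults is not None and occurrences > max_faults:
        return None
    return ACTIONS[fault]
```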

To ensure that spatiotemporal relationships bound to the object are no longer considered during the presentation, the Recovery Manager uses the IFormatterScheduler interface of the FormatterScheduler component. To access this component, the IComponentManager interface of ComponentManager is used.

When the library responsible for decoding and rendering a media object behaves unexpectedly during these operations, the “fatal decoding or rendering” fault must be notified. To perform the corresponding recovery action (see Table 2), it is necessary to query the FormatterScheduler and PlayerAdapterManager components in order to determine which presentation state has to be retrieved. As an example, assume an application that must display an audio object with a duration of 30 seconds. Assume also that the corresponding audio player sent the “fatal decoding or rendering” fault notification 15 seconds after the beginning of the presentation. In this case, the state returned by querying the FormatterScheduler and PlayerAdapterManager components is: “fault has occurred at 15 seconds”. As a consequence, the Recovery Manager creates a new player for the audio, sets the point at which it should start the exhibition, notifies the recovery action, and updates the old Player references to the new one, by using the IPlayerAdapterManager interface.
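The audio example can be traced with a few lines of code. The sketch below is an assumption-laden illustration: the Player data class, the dictionary standing in for the references kept by the Player Manager, the file name and the recover_player function are all hypothetical; only the 30-second duration and the fault at 15 seconds come from the example in the text.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class Player:
    content: str
    duration_s: float
    start_offset_s: float = 0.0   # where in the content the exhibition starts

    def remaining_s(self) -> float:
        return self.duration_s - self.start_offset_s


def recover_player(players: Dict[str, Player], key: str,
                   fault_time_s: float) -> Player:
    """Replace a faulty player by a new one that starts at the fault time."""
    old = players[key]
    new = Player(old.content, old.duration_s, start_offset_s=fault_time_s)
    players[key] = new            # update references from the old player
    return new


# The example from the text: a 30-second audio object failing at 15 seconds.
players = {"audio": Player("audio-object", 30.0)}
replacement = recover_player(players, "audio", 15.0)
```

After the call, the entry for "audio" refers to the new player, which starts at 15 seconds and has 15 seconds of content left to present.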

When the presentation of a non-conformant [4] [5] DTV application is requested, the “Application Specification” fault should be notified to the ReactiveRecoveryManager component, which should start its handling. However, this operation was not implemented in the current version of Ginga-NCL. In the current implementation, the XML Parser component (Figure 4) performs all exception handling for this kind of fault. Depending on the behavior of the library that implements the application interpretation, it may be necessary to integrate the XML Parser and the ReactiveRecoveryManager.

V. FINAL REMARKS

Reliability is one of the main requirements in middleware design. Content presentations must not be impaired by middleware system faults. With this focus in mind, the work proposed in this paper takes fault recovery mechanisms into account in a DTV middleware architecture and implementation.

The proposed recovery plan is able to provide resilience both to middleware systems and to applications that use these systems. As proof of concept, the proposal has been validated through the reference implementation of the ISDB-TB middleware.

By understanding the spatial and temporal relationships defined by NCL applications, application lifecycles, and the Ginga-NCL architecture, the recovery plan is able to control system aging, keeping the system as rejuvenated as possible and with a higher level of availability. Furthermore, the recovery plan is able to react to inconsistencies that can happen in hypermedia document presentations.

As future work, we intend to evaluate how the proposal can be applied to XHTML-based middlewares, in particular BML [15] and LIME [16] based middlewares. Further future work includes the definition of recovery policies and new recovery actions, such as using timescale algorithms to recover from synchronization losses.

REFERENCES

[1] Moreno, M. F. Um Middleware Declarativo para Sistemas de TV Digital Interativa. Master Thesis; Informatics Department, PUC-Rio, April, 2006. In Portuguese.

[2] Koren, I., Krishna, C. M. Fault-tolerant Systems. Morgan Kaufmann, 2007.

[3] Costa, R., Moreno, M., Soares, L.F. Intermedia Synchronization Management in DTV Systems. In ACM Symposium on Document Engineering (DocEng), São Paulo, Brazil, 2008.

[4] ABNT NBR 15606-2 Associação Brasileira de Normas Técnicas. Digital Terrestrial Television Standard 06: Data Codification and Transmission Specifications for Digital Broadcasting, Part 2 – GINGA-NCL: XML Application Language for Application Coding (São Paulo, SP, Brazil, November, 2007). http://www.abnt.org.br/imagens/Normalizacao_TV_Digital/ABNTNBR15606-2_2007Ing_2008.pdf.

[5] ITU-T Recommendation H.761, 2009. Nested Context Language (NCL) and Ginga-NCL for IPTV Services. Geneva, April, 2009.

[6] Huang, Y., Kintala, C., Kolettis, N., Fulton, D. Software rejuvenation: Analysis, module and applications. In International Symposium on Fault-Tolerant Computing, pages 381–390, June 1995.

[7] Garg, S., Moorsel, A., Vaidyanathan, K., Trivedi, K. A methodology for detection and estimation of software aging. In International Symposium on Software Reliability Engineering, pages 283–292, November 1998.

[8] Yujuan, B., Xiaobai, S. and Trivedi, K.S. Adaptive software rejuvenation: Degradation model and rejuvenation scheme. In International Conference on Dependable Systems and Networks, pages 241–248, June 2003.

[9] Castro, M. and Liskov, B. Practical Byzantine fault-tolerance and proactive recovery. ACM TOCS, 20(4):398–461, 2002.

[10] Sousa, P. Resilient Intrusion Tolerance Through Proactive and Reactive Recovery. The 13th Pacific Rim International Symposium on Dependable Computing. Proceedings of PRDC'07. IEEE Computer Press. Melbourne, Australia. December, 2007.

[11] Fugini, M., Mussi, E. Recovery of Faulty Web Applications through Service Discovery. 1st International Workshop on Semantic Matchmaking and Resource Retrieval: Issues and Perspectives (SMR 2006). Seoul, Korea. September 11, 2006.

[12] Szyperski, C., Gruntz, D., Murer, S., Component Software – Beyond Object-Oriented Programming. Second edition. ACM Press, 2002.

[13] Ierusalimschy R, Figueiredo LH, Celes W (2006) Lua 5.1 Reference Manual, ISBN 85-903798-3-3.

[14] Morris, S., Smith-Chaigneau, A. Interactive TV Standards: A Guide to MHP, OCAP, and JavaTV. Focal Press, 2005.

[15] ARIB STD-B24, Version 3.2, Volume 3: Data Coding and Transmission Specification for Digital Broadcasting, ARIB Standard, 2002.

[16] ITU-T Recommendation H.762, 2009. Lightweight interactive multimedia environment. Geneva, December, 2009.
