
  • SOFTWARE ENGINEERING AND DEVELOPMENT

    No part of this digital document may be reproduced, stored in a retrieval system or transmitted in any form or by any means. The publisher has taken reasonable care in the preparation of this digital document, but makes no expressed or implied warranty of any kind and assumes no responsibility for any errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of information contained herein. This digital document is sold with the clear understanding that the publisher is not engaged in rendering legal, medical or any other professional services.

  • SOFTWARE ENGINEERING AND DEVELOPMENT

    ENRIQUE A. BELINI EDITOR

    Nova Science Publishers, Inc. New York

  • Copyright 2009 by Nova Science Publishers, Inc. All rights reserved. No part of this book may be reproduced, stored in a retrieval system or transmitted in any form or by any means: electronic, electrostatic, magnetic, tape, mechanical photocopying, recording or otherwise without the written permission of the Publisher. For permission to use material from this book please contact us: Telephone 631-231-7269; Fax 631-231-8175 Web Site: http://www.novapublishers.com

    NOTICE TO THE READER

    The Publisher has taken reasonable care in the preparation of this book, but makes no expressed or implied warranty of any kind and assumes no responsibility for any errors or omissions. No liability is assumed for incidental or consequential damages in connection with or arising out of information contained in this book. The Publisher shall not be liable for any special, consequential, or exemplary damages resulting, in whole or in part, from the reader's use of, or reliance upon, this material. Any parts of this book based on government reports are so indicated and copyright is claimed for those parts to the extent applicable to compilations of such works.

    Independent verification should be sought for any data, advice or recommendations contained in this book. In addition, no responsibility is assumed by the publisher for any injury and/or damage to persons or property arising from any methods, products, instructions, ideas or otherwise contained in this publication.

    This publication is designed to provide accurate and authoritative information with regard to the subject matter covered herein. It is sold with the clear understanding that the Publisher is not engaged in rendering legal or any other professional services. If legal or any other expert assistance is required, the services of a competent person should be sought. FROM A DECLARATION OF PARTICIPANTS JOINTLY ADOPTED BY A COMMITTEE OF THE AMERICAN BAR ASSOCIATION AND A COMMITTEE OF PUBLISHERS.

    LIBRARY OF CONGRESS CATALOGING-IN-PUBLICATION DATA

    Software engineering and development / Enrique A. Belini. p. cm. Includes index. ISBN 978-1-61668-289-7 (E-Book) 1. Software engineering. 2. Computer software--Development. I. Belini, Enrique A. QA76.758.S64557 2009 005.1--dc22 2009014731

    Published by Nova Science Publishers, Inc. New York

  • CONTENTS

    Preface

    Expert Commentaries

    A Succinct Representation of Bit Vectors Supporting Efficient rank and select Queries
    Jesper Jansson and Kunihiko Sadakane

    B Heterogeneity as a Corner Stone of Software Development in Robotics
    Juan-Antonio Fernández-Madrigal, Ana Cruz-Martín, Cipriano Galindo and Javier González

    Short Communications

    A Embedding Domain-Specific Languages in General-Purpose Programming Languages
    Zoltán Ádám Mann

    B Studying Knowledge Flows in Software Processes
    Oscar M. Rodríguez-Elias, Aurora Vizcaíno, Ana I. Martínez-García, Jesús Favela and Mario Piattini

    C Software Product Line Engineering: The Future Research Directions
    Faheem Ahmed, Luiz Fernando Capretz and Muhammad Ali Babar

    D Software Development for Inverse Determination of Constitutive Model Parameters
    A. Andrade-Campos, P. Pilvin, J. Simões and F. Teixeira-Dias

    E Design of Molecular Visualization Educational Software for Chemistry Learning
    L.D. Antonoglou, N.D. Charistos and M.P. Sigalas

    F Software Components for Large Scale Super and Grid Computing Applications
    Muthu Ramachandran

    G Principles and Practical Aspects of Educational Software Evaluation
    Quynh Lê and Thao Lê

    Research and Review Studies

    Chapter 1 Testing Event-driven Software - the Next QA Challenge?
    Atif M. Memon

    Chapter 2 Debugging Concurrent Programs Using Metaheuristics
    Francisco Chicano and Enrique Alba

    Index

  • PREFACE

    Software engineering is one of the most knowledge-intensive jobs. Thus, having a good knowledge management (KM) strategy in software organizations is very important. This book examines software processes from a knowledge flow perspective, in order to identify the particular knowledge needs of such processes and so be in a better position to propose systems or strategies that address those needs. The possible benefits of this approach are illustrated through the results of a study of a software maintenance process within a small software organization. Furthermore, software product line architecture is regarded as one of the crucial elements of software product lines. The authors of this book discuss the state of the art of software product line engineering from the perspectives of business, architecture, process and organization. In recent years, domain-specific languages have been proposed for modelling applications on a high level of abstraction. Although the usage of domain-specific languages offers clear advantages, their design is a highly complex task. This book presents a pragmatic way of designing and using domain-specific languages. Other chapters in this book examine the development of numerical methodologies for inverse determination of material constitutive model parameters, discuss some of the reasons for the irrelevancy of software engineering to the robotic community, review the evolution of robotic software over time, and propose the use of Ant Colony Optimization, a kind of metaheuristic algorithm, to find general property violations in concurrent systems using an explicit state model checker.

    In the design of succinct data structures, the main objective is to represent an object compactly while still allowing a number of fundamental operations to be performed efficiently. In Expert Commentary A, the authors consider succinct data structures for storing a bit vector $B$ of length $n$. More precisely, in this setting, one needs to represent $B$ using $n + o(n)$ bits so that rank and select queries can be answered in $O(1)$ time, where for any $i \in \{1, 2, \dots, n\}$, $\mathrm{rank}_0(B, i)$ is the number of 0s in the first $i$ positions of $B$, $\mathrm{select}_0(B, i)$ is the position in $B$ of the $i$th 0 (assuming $B$ contains at least $i$ 0s), and $\mathrm{rank}_1(B, i)$ and $\mathrm{select}_1(B, i)$ are defined analogously. These operations are useful because bit vectors supporting rank and select queries are employed as a building block for many other more complex succinct data structures. The authors first describe two succinct indexing data structures for supporting rank and select queries on $B$ in which $B$ is stored explicitly together with some auxiliary information. The authors then present some matching lower bounds. Finally, the authors discuss generalizations and related open problems for supporting rank and select queries efficiently on strings over non-binary alphabets.


    In recent years the complexity of robotic applications has raised important problems, particularly in large and/or long-term robotic projects. Software engineering (SE) seems the obvious key for breaking that barrier, providing good maintenance and reuse, coping with exponential growth of programming effort, and integrating diverse components with guarantees. Surprisingly, SE has never been very relevant within the robotic community. In Expert Commentary B the authors briefly describe some causes for that, review the evolution of robotic software over time, and provide some insights from their most recent contributions. They have found that many problems arising from the conflicts raised by robotic complexity can be well addressed from an SE perspective as long as the focus is, at all levels, on the heterogeneity of components and methodologies. Therefore the authors propose heterogeneity as one of the corner stones of robotic software at present.

    In recent years, domain-specific languages have been proposed for modelling applications on a high level of abstraction. Although the usage of domain-specific languages offers clear advantages, their design is a highly complex task. Moreover, developing a compiler or interpreter for these languages that can fulfil the requirements of industrial application is hard. Existing tools for the generation of compilers or interpreters for domain-specific languages are still in an early stage and not yet appropriate for the usage in an industrial setting.

    Short Communication A presents a pragmatic way for designing and using domain-specific languages. In this approach, the domain-specific language is defined on the basis of a general-purpose programming language. Thus, general programming mechanisms such as arithmetics, string manipulations, basic data structures etc. are automatically available in the domain-specific language. Additionally, the designer of the domain-specific language can define further domain-specific constructs, both data types and operations. These are defined without breaching the syntax of the underlying general-purpose language. Finally, a library has to be created which provides the implementation of the necessary domain-specific data types and operations. This way, there is no need to create a compiler for the new language, because a program written in the domain-specific language can be compiled directly with a compiler for the underlying general-purpose programming language. Therefore, this approach leverages the advantages of domain-specific languages while minimizing the effort necessary for the design and implementation of such a language.

    The practical applicability of this methodology is demonstrated on a case study, in which test cases for testing electronic control units are developed. The test cases are written in a new domain-specific language, which in turn is defined on the basis of Java. The pros and cons of the presented approach are examined in detail on the basis of this case study. In particular, it is shown how the presented methodology automatically leads to a clean software architecture.

    Many authors have observed the importance of knowledge for software processes. As a result, more and more researchers and practitioners are initiating efforts to apply knowledge management in software processes. Unfortunately, many of these efforts are oriented only towards big software companies, and towards using existing knowledge management systems or strategies that have not been developed following the specific knowledge needs of the process in which they are included. Consequently, such efforts often do not really help the people who should benefit from them. In this chapter the authors state that one way to address this problem is to first study software processes from a knowledge flow perspective, in order to identify the particular knowledge needs of such processes and so be in a better position to propose systems or strategies that address those needs. Short Communication B presents an approach which has been used to accomplish this objective. Its possible benefits are illustrated through the results of a study of a software maintenance process within a small software organization.

    The recent trend in the software industry of switching from single software product development to lines of software products has made the software product line concept a viable and widely accepted methodology for the future. Some of the potential benefits of this approach include cost reduction, improvement in quality and a decrease in product development time. Many organizations operating in a wide range of areas, from consumer electronics, telecommunications, and avionics to information technology, are using software product line practices because they make effective use of software assets and provide numerous benefits. Software product line engineering is an inter-disciplinary concept. It spans the dimensions of business, architecture, process and organization. The business dimension of software product lines deals with managing a strong coordination between product line engineering and the business aspects of the product line. Software product line architecture is regarded as one of the crucial elements of software product lines. All the resulting products share this common architecture. Organizational theories, behavior and management play a critical role in the process of institutionalizing software product line engineering in an organization. The objective of Short Communication C is to discuss the state of the art of software product line engineering from the perspectives of business, architecture, organizational management and software engineering process. This work also highlights and discusses future research directions in this area, thus giving researchers and practitioners an opportunity to better understand future trends and requirements.

    Computer simulation software using finite element analysis (FEA) has now reached reasonable maturity. FEA software is used in such diverse fields as structural engineering, sheet metal forming, the mould industry, biomechanics, fluid dynamics, etc. This type of engineering software uses an increasingly large number of sophisticated geometrical and material models. The quality of the results relies on the input data, which are not always readily available. The aim of inverse problem software, which will be considered here, is to determine one or more of the input data relating to FEA numerical simulations. The development of numerical methodologies for inverse determination of material constitutive model parameters will be addressed in Short Communication D. Inverse problems for parameter identification involve estimating the parameters for material constitutive models, leading to more accurate results with respect to physical experiments, i.e. minimizing the difference between experimental results and simulations subject to a limited number of physical constraints. These problems can involve both hyperelastic and hypoelastic material constitutive models. The complexity of the process with which material parameters are evaluated increases with the complexity of the material model itself. In order to determine the best suited material parameter set in the least computationally expensive way, different approaches and different optimization methods can be used. The most widespread optimization methods are gradient-based methods; genetic, evolutionary and nature-inspired algorithms; immune algorithms; and methods based on neural networks and artificial intelligence. By far the best performing methods are gradient-based, but their performance is known to be highly dependent on the starting set of parameters and their results are often inconsistent. Nature-inspired techniques provide a better way to determine an optimized set of parameters (the overall minimum). Therefore, the difficulties associated with choosing a starting set of parameters for this process are minor. However, these techniques have proved to be computationally more expensive than gradient-based methods. Optimization methods present advantages and disadvantages and their performance is highly dependent on the constitutive model itself. There is no unique algorithm robust enough to deal with every possible situation, but the use of sequential multiple methods can lead to the global optimum. The aim of this strategy is to take advantage of the strength of each selected algorithm. This strategy, using gradient-based methods and evolutionary algorithms, is demonstrated for an elastic-plastic model with non-linear hardening, for seven distinct hyperelastic models (Humphrey, Martins, Mooney-Rivlin, Neo-Hookean, Ogden, Veronda-Westmann and Yeoh) and for one thermoelastic-viscoplastic hypoelastic model. The performance of the described strategy is also evaluated through an analytical approach.

    An important goal for chemical education is students' acquisition of key concepts and principles regarding molecular structure. Chemists have developed a rich symbolic language that helps them create and manipulate mental and external representations that describe spatial relations of aperceptual particles in order to investigate and communicate chemical concepts. High school and college students face significant difficulties in understanding these concepts, mastering the symbolic language and making connections and transformations between symbolic, microscopic and macroscopic representations of chemical phenomena. Over the past decade the development of molecular visualization tools has changed the nature of chemistry research and created promising prospects for their integration in chemistry education, which could help students overcome these difficulties. In Short Communication E the authors examine the case of molecular visualization in chemistry education and describe a number of educational packages that utilize new molecular visualization tools they developed to support learning of chemistry concepts in secondary and tertiary education.

    Software development for large and complex systems remains a costly affair. The complexity of supercomputing applications that require high speed and high precision systems grows exponentially. Short Communication F provides an approach to the design and development of supercomputing applications based on software components, which have the potential to minimize the cost and time for complex and high dependability systems. Software components are intended to provide a self-contained entity that can be adapted to the required environment quickly and easily. However, this definition needs to be extended for the large scale supercomputing paradigm. This will be a considerable shift for the Component Based Software Engineering (CBSE) paradigm that exists today. The main criteria for supercomputing and grid applications include flexibility, reusability, scalability, high concurrency, parallelism and multi-threading, security, distribution and data-intensiveness. This chapter defines a new paradigm for CBSE for supercomputing applications. Therefore, design for large scale software components is the major emphasis of this chapter.

    As explained in Short Communication G, software is not just a product for all-purpose use. Generally, software is produced for a specific purpose in a domain. Some software products, such as word processing, drawing, and editing tools, appeal to a wide range of users. However, software is developed to cater for the demands of targeted users. For example, the Statistical Package for the Social Sciences (SPSS) is a statistical analysis tool for analyzing quantitative data in research. In education, the main aim of software is to enhance teaching and learning. It is important to evaluate educational software to determine its effectiveness. There are a number of issues concerning the evaluation of educational software, such as users' and evaluators' perspectives on teaching and learning, and the translation of theory into practice.


    A particular class of software that is fast becoming ubiquitous is event-driven software (EDS). All EDS share a common event-driven model: they take sequences of events (e.g., messages, mouse-clicks) as input, change their state, and (sometimes) output an event sequence. Examples include web applications, graphical user interfaces (GUIs), network protocols, device drivers, and embedded software. Quality assurance tasks such as testing have become important for EDS since they are being used in critical applications. Numerous researchers have shown that existing testing techniques do not apply directly to EDS because of the new challenges that EDS offer. Chapter 1 lists some of these challenges and emphasizes the need to develop new techniques (or enhance existing ones) to test EDS.

    Model Checking is a well-known and fully automatic technique for checking software properties, usually given as temporal logic formulae on the program variables. Some examples of properties are the absence of deadlocks, the absence of starvation, the fulfilment of an invariant, etc. The use of this technique is a must when developing software that controls critical systems, such as an airplane or a spacecraft. Most model checkers found in the literature use exact deterministic algorithms to check the properties. The memory required for the verification with these algorithms usually grows exponentially with the size of the system to verify. This fact is known as the state explosion problem and limits the size of the system that a model checker can verify. When the search for errors with a low amount of computational resources (memory and time) is a priority (for example, in the first stages of the implementation of a program), non-exhaustive algorithms using heuristic information can be used. Non-exhaustive algorithms can find errors in programs using fewer computational resources than exhaustive algorithms, but they cannot be used for verifying a property: when no error is found using a non-exhaustive algorithm, one still cannot ensure that no error exists. In Chapter 2 the authors propose the use of Ant Colony Optimization, a kind of metaheuristic algorithm, to find general property violations in concurrent systems using an explicit state model checker. Metaheuristic algorithms are a well-known set of techniques used for finding near optimal solutions in NP-hard optimization problems in which exact techniques are unviable. Their proposal, called ACOhg-mc, also takes into account the structure of the property to check in order to improve the efficacy and efficiency of the search. In addition to the description of the algorithm, the authors have performed a set of experiments using the experimental model checker HSF-SPIN and a subset of models from the BEEM benchmark for explicit model checking. The results show that ACOhg-mc finds optimal or near optimal error trails in faulty concurrent systems with a reduced amount of resources, outperforming in most cases the results of algorithms that are widely used in model checking, like Nested Depth First Search or Improved Nested Depth First Search. This makes their proposal suitable for checking properties in large faulty concurrent programs, in which traditional techniques fail to find counterexamples because of the model size. In addition, the authors show that ACOhg-mc can also be combined with techniques for reducing the state explosion, such as partial order reduction, and they analyze the performance of this combination.

  • EXPERT COMMENTARIES

  • In: Software Engineering and Development
    Editor: Enrique A. Belini, pp. 3-12

    ISBN 978-1-60692-146-3
    © 2009 Nova Science Publishers, Inc.

    Expert Commentary A

    SUCCINCT REPRESENTATION OF BIT VECTORS SUPPORTING EFFICIENT rank AND select QUERIES

    Jesper Jansson¹ and Kunihiko Sadakane²

    ¹ Ochanomizu University, 2-1-1 Otsuka, Bunkyo-ku, Tokyo 112-8610, Japan

    ² Department of Computer Science and Communication Engineering, Kyushu University, 744 Motooka, Nishi-ku, Fukuoka 819-0395, Japan

    Abstract

    In the design of succinct data structures, the main objective is to represent an object compactly while still allowing a number of fundamental operations to be performed efficiently. In this commentary, we consider succinct data structures for storing a bit vector $B$ of length $n$. More precisely, in this setting, one needs to represent $B$ using $n + o(n)$ bits so that rank and select queries can be answered in $O(1)$ time, where for any $i \in \{1, 2, \dots, n\}$, $\mathrm{rank}_0(B, i)$ is the number of 0s in the first $i$ positions of $B$, $\mathrm{select}_0(B, i)$ is the position in $B$ of the $i$th 0 (assuming $B$ contains at least $i$ 0s), and $\mathrm{rank}_1(B, i)$ and $\mathrm{select}_1(B, i)$ are defined analogously. These operations are useful because bit vectors supporting rank and select queries are employed as a building block for many other more complex succinct data structures. We first describe two succinct indexing data structures for supporting rank and select queries on $B$ in which $B$ is stored explicitly together with some auxiliary information. We then present some matching lower bounds. Finally, we discuss generalizations and related open problems for supporting rank and select queries efficiently on strings over non-binary alphabets.

    E-mail address: [email protected] by the Special Coordination Funds for Promoting Science and Technology, Japan. E-mail address: [email protected]

    1. Introduction

    Let $B \in \{0, 1\}^n$ be a bit vector of length $n$. For any $i \in \{1, 2, \dots, n\}$, let $B[i]$ denote the value of $B$ at position $i$, and for any $i, j \in \{1, 2, \dots, n\}$ with $i \le j$, let $B[i..j]$ be the bit vector consisting of $B[i], B[i+1], \dots, B[j]$. (If $i > j$ then $B[i..j]$ is defined to be $\emptyset$.) Next, define the following operations:

    $\mathrm{rank}_0(B, i)$: Return the number of 0s in $B[1..i]$.

    $\mathrm{rank}_1(B, i)$: Return the number of 1s in $B[1..i]$.

    $\mathrm{select}_0(B, i)$: Return the position in $B$ of the $i$th 0.

    $\mathrm{select}_1(B, i)$: Return the position in $B$ of the $i$th 1.
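
    As a point of reference, the four operations defined above can be written down directly as linear-time scans. The following Python sketch is not taken from the commentary: the function names `rank` and `select`, the `bit` parameter, and the representation of B as a list of 0/1 integers are illustrative assumptions. Positions are 1-based to match the definitions.

```python
from typing import List


def rank(bits: List[int], bit: int, i: int) -> int:
    """Number of occurrences of `bit` in B[1..i] (1-based, inclusive)."""
    return sum(1 for b in bits[:i] if b == bit)


def select(bits: List[int], bit: int, i: int) -> int:
    """1-based position of the i-th occurrence of `bit` in B.

    Raises ValueError if B contains fewer than i occurrences of `bit`.
    """
    if i < 1:
        raise ValueError("i must be at least 1")
    seen = 0
    for pos, b in enumerate(bits, start=1):
        if b == bit:
            seen += 1
            if seen == i:
                return pos
    raise ValueError("B contains fewer than i occurrences of the bit")


# Example bit vector (hypothetical data): 1s at positions 1, 3, 4, 7.
B = [1, 0, 1, 1, 0, 0, 1, 0]
```

    Both functions take O(n) time per query, which is the baseline that the indexing data structures discussed in this commentary improve to O(1).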

    In this commentary, we consider the problem of constructing a data structure for storing any given $B$ such that $\mathrm{rank}_0(B, i)$, $\mathrm{rank}_1(B, i)$, $\mathrm{select}_0(B, i)$, and $\mathrm{select}_1(B, i)$ queries can be carried out efficiently. We focus on indexing data structures for $B$, where $B$ is stored verbatim in $n$ bits and one is allowed to use $o(n)$ extra bits of storage (called the index) to efficiently support rank and select queries on $B$.

    We assume the word-RAM model of computation with word length $w = \lceil \log n \rceil$ bits¹ in order to handle pointers to the data structure in constant time. In the word-RAM model, the CPU can perform logical operations such as AND and OR, and arithmetic operations such as addition, subtraction, multiplication, and division between two integers in the interval $[0, 2^w - 1]$ ($w$-bit integers) in constant time. The CPU can also read/write a $w$-bit integer from/to a specific memory cell in constant time; in other words, if $B$ is a stored bit vector of length $n$, then for any given $i \in \{0, 1, \dots, n - w\}$, $B[(i + 1)..(i + w)]$ can be obtained in $O(1)$ time.

    The commentary is organized as follows: In Section 2, we outline how to construct in $O(n)$ time an index for $B$ of size $O(n \log \log n / \log n) = o(n)$ bits which allows each subsequent rank or select query to be answered in $O(1)$ time. The presentation in Section 2 is based on [20] for rank and [28] for select. Next, in Section 3, we state some lower bounds from [11] and [19] which match the upper bounds given in Section 2. Then, in Section 4, we discuss generalizations to non-indexing data structures as well as generalizations to non-binary vectors, and finally, in Section 5, we provide examples of other data structures that depend on efficient rank and select data structures for bit vectors and non-binary vectors, and mention some directions for further research.

    2. Upper Bounds for Indexing Data Structures

    Jacobson [15] presented a space-efficient indexing data structure for $B$ which allows rank and select queries on $B$ to be answered in $O(1)$ and $O(\log n)$ time, respectively, while requiring only $O(n \log \log n / \log n)$ bits for the index. A series of improvements to Jacobson's data structure were made by Clark [4], Munro [20], Munro et al. [23], and Raman et al. [28], reducing the time needed to answer each select query to $O(1)$ while using an index of size $O(n \log \log n / \log n)$ bits. Below, we describe two simplified indexing data structures for $B$ based on [20] for rank and [28] for select. To make the presentation more readable, we omit $\lceil$, $\rceil$, $\lfloor$, and $\rfloor$ symbols where obvious. Also, we allow the last block in any partition into blocks to be smaller than the specified block size.

    ¹ Throughout this commentary, $\log$ denotes the base-2 logarithm and $\log_\sigma$ denotes the base-$\sigma$ logarithm.

    2.1. An Indexing Data Structure for rank Queries (based on [20])

    Conceptually divide the bit vector $B$ into blocks of length $\ell = \log^2 n$ each, and call each such block a large block. Next, divide every large block into small blocks of length $s = \frac{1}{2} \log n$ each. Create auxiliary data structures for storing the values of $\mathrm{rank}_1$ for the boundaries of these blocks as follows: Use an integer array $R_\ell[0..n/\ell]$ in which every entry $R_\ell[x]$ stores the number of 1s in $B[1..x\ell]$, and an integer array $R_s[0..n/s]$ in which every entry $R_s[y]$ stores the number of 1s in $B[(\lfloor ys/\ell \rfloor \ell + 1)..ys]$, i.e., the number of 1s in the $y$th small block plus the total number of 1s in all small blocks which belong to the same large block as small block $y$ and which occur before small block $y$. The space needed to store $R_\ell$ is $O(\frac{n}{\ell} \cdot \log n) = O(\frac{n}{\log n})$ bits because each of its entries occupies $O(\log n)$ bits, and the space needed to store $R_s$ is $O(\frac{n}{s} \cdot \log(\ell + 1)) = O(\frac{n \log \log n}{\log n})$ bits because all of its entries are between 0 and $\ell$.

    To answer the query $\mathrm{rank}_1(B, i)$ for any given $i \in \{1, 2, \dots, n\}$, compute $x = \lfloor i/\ell \rfloor$, $y = \lfloor i/s \rfloor$, and $z = i - ys$, and use the relation

    $$\mathrm{rank}_1(B, i) = \mathrm{rank}_1(B, ys) + \sum_{q=1}^{z} B[ys + q] = R_\ell[x] + R_s[y] + \sum_{q=1}^{z} B[ys + q],$$

    where the first two terms are directly available from $R_\ell$ and $R_s$. To compute the third term in constant time, the following table lookup technique can be applied: In advance, construct a table $T_r[0..(2^s - 1), 1..s]$ in which each entry $T_r[i, j]$ stores the number of 1s in the first $j$ bits of the binary representation of $i$. Then, whenever one needs to compute $\sum_{q=1}^{z} B[ys + q]$, first read the memory cell storing $B[(ys + 1)..(ys + s)]$ (because $s < w$, this can be done in constant time), interpret this $s$-bit vector as an integer $p$, where $p \in \{0, 1, \dots, 2^s - 1\}$, and find the value $T_r[p, z]$ in the table. Hence, $\mathrm{rank}_1(B, ys + z)$ can be computed in constant time. The size of the table $T_r$ is $2^s \cdot s \cdot \log(s + 1) = O(\sqrt{n} \cdot \log n \cdot \log \log n) = o(n)$ bits, and all of the auxiliary data structures $R_\ell$, $R_s$, $T_r$ may be constructed in $O(n)$ time.

    To compute $\mathrm{rank}_0(B, i)$, no additional data structures are necessary because $\mathrm{rank}_0(B, i) = i - \mathrm{rank}_1(B, i)$. Therefore we have:

    Theorem 1. Given a bit vector of length $n$, after $O(n)$ time preprocessing and using an index of size $O(n \log \log n / \log n)$ bits, each $\mathrm{rank}_1$ and $\mathrm{rank}_0$ query can be answered in $O(1)$ time.
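
    The construction in this subsection can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: the class name `RankIndex`, the attribute names, and the rounding of the block lengths (so that the large block length is an exact multiple of the small block length) are choices made here. Python lists stand in for packed arrays, so the sketch shows the query logic rather than the bit-level space bounds. For convenience the lookup table also has a column j = 0, and position i = 0 is accepted.

```python
import math
from typing import List


class RankIndex:
    """Two-level rank index with a lookup table (sketch, 1-based positions)."""

    def __init__(self, bits: List[int]) -> None:
        n = len(bits)
        log_n = max(1, math.ceil(math.log2(max(n, 2))))
        self.n = n
        self.s = max(1, log_n // 2)       # small block length, about (1/2) log n
        self.ell = self.s * 2 * log_n     # large block length, about log^2 n (assumed rounding)
        s, ell = self.s, self.ell

        # B packed into s-bit integers; reading blocks[y] stands in for
        # reading the memory cell that stores B[(ys+1)..(ys+s)].
        self.blocks: List[int] = []
        for start in range(0, n, s):
            chunk = bits[start:start + s]
            chunk = chunk + [0] * (s - len(chunk))   # pad the last block with 0s
            value = 0
            for b in chunk:
                value = (value << 1) | b
            self.blocks.append(value)

        # Temporary prefix counts, used only during construction.
        prefix = [0] * (n + 1)
        for k, b in enumerate(bits, start=1):
            prefix[k] = prefix[k - 1] + b

        # R_l[x]: number of 1s in B[1..x*ell].
        self.R_l = [prefix[x * ell] for x in range(n // ell + 1)]
        # R_s[y]: number of 1s from the start of the enclosing large block up to y*s.
        self.R_s = [prefix[y * s] - prefix[(y * s // ell) * ell]
                    for y in range(n // s + 1)]
        # T_r[p][j]: number of 1s among the first j bits of the s-bit value p.
        self.T_r = [[bin(p >> (s - j)).count("1") for j in range(s + 1)]
                    for p in range(1 << s)]

    def rank1(self, i: int) -> int:
        if not 0 <= i <= self.n:
            raise IndexError("position out of range")
        x, y = i // self.ell, i // self.s
        z = i - y * self.s
        partial = self.T_r[self.blocks[y]][z] if z > 0 else 0
        return self.R_l[x] + self.R_s[y] + partial

    def rank0(self, i: int) -> int:
        return i - self.rank1(i)


index = RankIndex([1, 0, 1, 1, 0, 0, 1, 0])
print(index.rank1(4), index.rank0(4))   # prints: 3 1
```

    Each query performs two array reads, one table read, and a constant number of arithmetic operations, mirroring the three terms of the relation above.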

    2.2. An Indexing Data Structure for select Queries (based on [28])

    Define $\ell = \log^2 n$ and construct an array storing the position of the $(i\ell)$th occurrence of a 1 in $B$ for all $i = 1, 2, \ldots, n/\ell$. Regions in $B$ between two consecutive positions stored in the array are called upper blocks. If the length of an upper block is at least $\log^4 n$, it is sparse. For every sparse upper block, store the positions of all its 1s explicitly in sorted order. Since the number of such blocks is at most $n / \log^4 n$, the space required for storing the positions of all 1s in all sparse upper blocks is at most $\frac{n}{\log^4 n} \cdot \log^2 n \cdot \log n = \frac{n}{\log n}$ bits.

    For every non-sparse upper block $U$, further divide it into lower blocks of length $s = \frac{1}{2} \log n$ each and construct a complete tree for $U$ with branching factor $\sqrt{\log n}$ whose leaves are in one-to-one correspondence with the lower blocks in $U$. The height of the tree is at most 7, i.e., at most a constant, because the number of leaves is at most $\frac{\log^4 n}{s} = 2 \log^3 n$. For each non-leaf node $v$ of the tree, let $C_v$ be an array of $\sqrt{\log n}$ integers such that $C_v[i]$ equals the number of 1s in the subtree rooted at the $i$th child of $v$. (All $C_v$-arrays can be computed in $O(n)$ time preprocessing.) The entire bit vector $B$ contains at most $\frac{n}{s} = \frac{2n}{\log n}$ lower blocks, so the total number of nodes in all trees representing all the upper blocks is $O(\frac{n}{\log n})$ and furthermore, the total number of entries in all $C_v$-arrays is at most this much. Since the number of 1s in any tree is at most $\log^2 n$, every entry in a $C_v$-array can be stored in $O(\log\log n)$ bits. Therefore, the total space needed to store all trees (including all the $C_v$-arrays) is $O(\frac{n}{\log n} \log\log n)$ bits.

    To answer the $select_1(B, i)$ query in constant time, first divide $i$ by $\ell$ to find the upper block $U$ that contains the $i$th 1, and check whether $U$ is sparse or not. If $U$ is sparse, the answer to the $select_1$ query is stored explicitly and can be retrieved directly. If $U$ is not sparse, start at the root of the tree that represents $U$ and do a search to reach the leaf that corresponds to the lower block with the $j$th 1, where $j$ equals $i$ modulo $\ell$. At each step, it is easy to determine which subtree contains the $j$th 1 in $O(1)$ time by a table lookup using the $C_v$-array for the current node $v$, and then adjust $j$ and continue to the next step. (For the lookup, use a $(\sqrt{\log n} + 1)$-dimensional table $T$ such that entry $T[c_1, c_2, \ldots, c_{\sqrt{\log n}}, j] = x$ if and only if the first subtree contains exactly $c_1$ 1s, the second subtree contains exactly $c_2$ 1s, etc., and the $j$th 1 belongs to the $x$th subtree. The space needed to store $T$ is $o(n)$ bits because the index of $T$ is encoded in $(\sqrt{\log n} + 1) \cdot 2 \log\log n \le 0.5 \log n$ bits for large enough $n$, so $T$ has $O(2^{0.5 \log n}) = O(n^{0.5})$ entries which each need $\log\log n$ bits.) Finally, after reaching a leaf and identifying the corresponding lower block, find the relative position of the $j$th 1 inside that lower block by consulting a global table of size $2^{\frac{1}{2} \log n} \cdot \frac{1}{2} \log n \cdot \log\log n = O(\sqrt{n} \log n \log\log n)$ bits which stores the relative position of the $q$th 1 inside a lower block for every possible binary string of length $\frac{1}{2} \log n$ and every possible query $q$ in $\{1, 2, \ldots, \frac{1}{2} \log n\}$.

    To answer $select_0$ queries, construct data structures analogous to those for $select_1$ described above. We obtain the following.

    Theorem 2. Given a bit vector of length $n$, after $O(n)$ time preprocessing and using an index of size $O(n \log\log n / \log n)$ bits, each $select_1$ and $select_0$ query can be answered in $O(1)$ time.
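The first level of this scheme (sampling the position of every fixed number of 1s) can be illustrated with a deliberately simplified sketch. It is an assumption-laden toy, not the structure of [28]: the names `build_select_index`, `select1`, and the parameter `ell` are hypothetical, and a linear scan inside the upper block replaces the constant-height tree and lookup tables, so queries here are not constant time.

```python
def build_select_index(bits, ell):
    """Record the 1-based position of every ell-th 1 in bits."""
    samples = []
    count = 0
    for pos, b in enumerate(bits, start=1):
        if b:
            count += 1
            if count % ell == 0:
                samples.append(pos)
    return samples


def select1(bits, samples, ell, i):
    """Return the 1-based position of the i-th 1, or None if there is none."""
    if i < 1:
        return None
    k, j = divmod(i, ell)        # k full groups of ell ones, then j more
    if k > len(samples):
        return None
    start = samples[k - 1] if k > 0 else 0
    if j == 0:
        return start             # the answer is itself a sampled position
    # Simplification: scan the upper block instead of descending a tree.
    for pos in range(start + 1, len(bits) + 1):
        if bits[pos - 1]:
            j -= 1
            if j == 0:
                return pos
    return None
```

The split of `i` into a quotient and a remainder is the same step as dividing $i$ by the sampling rate in the text; everything after that split is where the real data structure does its work.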

    3. Lower Bounds for Indexing Data Structures

    By applying two different techniques, one consisting of a reduction from a vector addition problem and the other one a direct information-theoretical argument involving reconstructing $B$ from any given indexing data structure for $B$ together with an appropriately defined binary string, Miltersen [19] proved the following theorem. (Recall that $B$ is assumed to be stored explicitly in addition to the bits used by the indexing data structure.)

    Theorem 3. [19] It holds that:

    1. Any indexing data structure for rank queries on $B$ using word size $w$, index size $r$ bits, and query time $t$ must satisfy $2(2r + \log(w + 1))\,t\,w \ge n \log(w + 1)$.

    2. Any indexing data structure for select queries on $B$ using word size $w$, index size $r$ bits, and query time $t$ must satisfy $3(r + 2)(tw + 1) \ge n$.

    In particular, for the case $t = O(1)$ and $w = O(\log n)$, Theorem 3 immediately implies the lower bounds $r = \Omega(n \log\log n / \log n)$ for rank indexing data structures and $r = \Omega(n / \log n)$ for select indexing data structures. Using a counting argument based on binary choices trees, these lower bounds were strengthened by Golynski [11] as follows:
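As a sketch of why the two bounds follow from Theorem 3, one can solve each inequality for $r$. The constants $c_1$ and $c_2$ below are introduced here only for illustration, standing for $t \le c_1$ and $w \le c_2 \log n$.

```latex
\begin{align*}
2\bigl(2r + \log(w+1)\bigr)\,t\,w \ge n \log(w+1)
  &\;\Longrightarrow\;
  r \ge \frac{n \log(w+1)}{4\,t\,w} - \frac{\log(w+1)}{2}, \\
3(r+2)(tw+1) \ge n
  &\;\Longrightarrow\;
  r \ge \frac{n}{3(tw+1)} - 2 .
\end{align*}
```

Because $\log(w+1)/w$ is decreasing in $w$, substituting $t \le c_1$ and $w \le c_2 \log n$ into the first line gives $r = \Omega(n \log\log n / \log n)$, and the same substitution in the second line gives $r = \Omega(n / \log n)$.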

    Theorem 4. [11] If there exists an algorithm for either rank or select queries on $B$ which reads $O(\log n)$ different positions of $B$, has unlimited access to an index of size $r$ bits, and is allowed to use unlimited computation power, then $r = \Omega(\frac{n \log\log n}{\log n})$.

    Hence, the upper bounds given in Theorems 1 and 2 are asymptotically optimal. Note that Theorem 4 is very general; it does not impose any restrictions on the running time or require the read positions of $B$ to be consecutive for the lower bound to hold.

    Theorem 5. [11] Suppose that $B$ has exactly $m$ positions set to 1 for some integer $m$. If there exists an algorithm for either rank or select queries on $B$ which reads at most $t$ different positions of $B$, has unlimited access to an index of size $r$ bits, and is allowed to use unlimited computation power, then $r = \Omega(\frac{m}{t} \log t)$.

    4. Generalizations

    The indexing data structures in Sections 2 and 3 assume that the bit vector $B$ is always stored explicitly. However, the space used by this type of encoding is far from optimal if the number of 1s in $B$ is much smaller than $n$, or close to $n$. This is because the number of bit vectors of length $n$ having $m$ 1s is $\binom{n}{m} \approx 2^{nH_0}$, where $H_0 = \frac{m}{n} \log \frac{n}{m} + \frac{n-m}{n} \log \frac{n}{n-m}$ is the 0th order entropy of the bit vector, which may be much less than $2^n$, the number of distinct bit vectors of length $n$. In fact, there exist data structures for rank/select using only $nH_0 + O(n \log\log n / \log n)$ bits to store $B$ such that any consecutive $O(\log n)$ bits of $B$ can still be retrieved in constant time (see footnote 2):

    Theorem 6. [28] For a bit vector $B$ of length $n$ with $m$ 1s, after $O(n)$ time preprocessing and using $nH_0 + O(n \log\log n / \log n)$ bits, where $H_0 = \frac{m}{n} \log \frac{n}{m} + \frac{n-m}{n} \log \frac{n}{n-m}$, each $rank_1$, $rank_0$, $select_1$, and $select_0$ query can be answered in $O(1)$ time. Moreover, any consecutive $O(\log n)$ bits of $B$ can be retrieved in $O(1)$ time.
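To get a feeling for how much smaller $nH_0$ can be than $n$ for a sparse bit vector, the following small sketch evaluates the entropy formula and compares it with the exact count $\log_2 \binom{n}{m}$. The function name `h0` and the sample values of `n` and `m` are chosen here for illustration only.

```python
import math


def h0(n, m):
    """0th order entropy (bits per position) of a length-n vector with m ones."""
    if m == 0 or m == n:
        return 0.0
    p = m / n
    return p * math.log2(1 / p) + (1 - p) * math.log2(1 / (1 - p))


n, m = 1_000_000, 1_000
entropy_bits = n * h0(n, m)                  # n * H0
exact_bits = math.log2(math.comb(n, m))      # log2 of the number of such vectors
print(f"n = {n}, n*H0 = {entropy_bits:.0f}, log2 C(n,m) = {exact_bits:.0f}")
```

For these sample values both quantities are a small fraction of $n$, which is the gap that the compressed representation of Theorem 6 exploits.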

    The rank/select data structures can be extended to non-binary vectors. A string $S$ of length $n$ over an alphabet $A$ is a vector $S[1..n]$ such that $S[i] \in A$ for $1 \le i \le n$. Let $\sigma$ be the alphabet size, i.e., $\sigma = |A|$. We assume that $A$ is an integer alphabet of the form $\{0, 1, \ldots, \sigma - 1\}$ and that $\sigma \le n$. (Without loss of generality, we further assume that $\sigma$ is a power of 2.) Below, we consider succinct data structures for $S$ supporting the following operations for any $i \in \{1, 2, \ldots, n\}$ and $c \in \{0, 1, \ldots, \sigma - 1\}$:

    $access(S, i)$: Return $S[i]$.
    $rank_c(S, i)$: Return the number of occurrences of $c$ in $S[1..i]$.
    $select_c(S, i)$: Return the position of the $i$th $c$ in $S$.

    $S$ may be encoded in $n \log \sigma$ bits by the obvious representation using $\log \sigma$ bits for each position $S[i]$, but there exist other encodings which improve the rank and select query time complexities at the cost of increasing the space complexity and the time needed to retrieve $S[i]$ for any given $i \in \{1, 2, \ldots, n\}$. Hence, there is a trade-off between the size of a data structure and the access/rank/select query times. Table 1 lists the performance of two straightforward data structures $D_1$ and $D_2$ (explained below) and three improved data structures proposed in [1, 12, 13].

    Table 1. The trade-off between the size (in bits) and the time needed to answer each $access$, $rank_c$, and $select_c$ query for various data structures. $|S|$ denotes the number of bits to encode $S$, $H_0$ is the 0th order entropy of $S$, and $\alpha = \log\log\sigma \log\log\log\sigma$.

    Reference          | Size of data structure                 | access time          | rank_c time                   | select_c time
    $D_1$ in Section 4 | $n(H_0 + \log e) + \sigma \cdot o(n)$  | $O(\sigma)$          | $O(1)$                        | $O(1)$
    $D_2$ in Section 4 | $|S| + (\sigma + 1) \cdot o(n)$        | $O(1)$               | $O(\log \sigma)$              | $O(\log \sigma)$
    [13]               | $nH_0 + \log \sigma \cdot o(n)$        | $O(\log \sigma)$     | $O(\log \sigma)$              | $O(\log \sigma)$
    [12]               | $n \log \sigma + n \cdot o(\log \sigma)$ | $O(\log\log \sigma)$ | $O(\log\log \sigma)$          | $O(1)$
    [1]                | $|S| + n \cdot o(\log \sigma)$         | $O(1)$               | $O(\alpha \log\log \sigma)$   | $O(\alpha)$

    The first straightforward data structure $D_1$ stores $\sigma$ bit vectors $V_0, V_1, \ldots, V_{\sigma-1}$ of length $n$ such that $V_c[i] = 1$ if and only if $S[i] = c$, along with $rank_1$ and $select_1$ indexing data structures for these bit vectors. Then $rank_c(S, i) = rank_1(V_c, i)$ and $select_c(S, i) = select_1(V_c, i)$, and therefore they can be obtained in constant time. On the other hand, access requires $O(\sigma)$ time because it must examine all of $V_0[i], V_1[i], \ldots, V_{\sigma-1}[i]$. Each bit vector $V_c$ can be encoded in $\log \binom{n}{m_c} \approx m_c(\log e + \log \frac{n}{m_c})$ bits by Theorem 6, where $m_c$ denotes the number of $c$'s in $S$. In total, the space is $\sum_c \{m_c(\log e + \log \frac{n}{m_c}) + O(n \log\log n / \log n)\} = n(H_0 + \log e) + \sigma \cdot O(n \log\log n / \log n)$ bits.

    The second straightforward data structure $D_2$ stores $S$ explicitly in $n \log \sigma$ bits. In addition, it stores a $rank_1$ and $select_1$ indexing data structure for each of the bit vectors $V_0, V_1, \ldots, V_{\sigma-1}$ of $D_1$. The bit vectors $V_0, V_1, \ldots, V_{\sigma-1}$ are not stored explicitly, so to answer $rank_c$ and $select_c$ queries, $D_2$ must have a method to compute any consecutive $\log n$ bits of $V_c$ that are required by the indexing data structure for $V_c$. This can be done in $O(\log \sigma)$ time by repeating the following steps $2 \log \sigma$ times, each time obtaining $\frac{1}{2} \log_\sigma n$ bits of $V_c$: In $O(1)$ time, read $\frac{1}{2} \log n$ consecutive bits from $S$ and put them in a bit vector $r$. To find the $\frac{1}{2} \log_\sigma n$ bits of $V_c$ that correspond to $r$, let $s$ be the bit vector of length $\frac{1}{2} \log n$ consisting of $\frac{1}{2} \log_\sigma n$ copies of the length-$(\log \sigma)$ pattern $000 \ldots 01$, let $t$ be $s$ multiplied by $c$, and let $u$ be the bitwise exclusive-or between $r$ and $t$. Note that for any nonnegative integer $i$, the length-$(\log \sigma)$ pattern of $u$ starting at position $i \log \sigma$ equals $000 \ldots 00$ if and only if the corresponding position in $S$ contains the symbol $c$. Finally, look up entry $u$ in a table having $2^{\frac{1}{2} \log n} = \sqrt{n}$ entries to obtain a bit vector of size $\frac{1}{2} \log_\sigma n$ containing a 1 in position $i$ if and only if $u[(i \log \sigma)..((i + 1) \log \sigma - 1)] = 000 \ldots 00$. Thus, $rank_c$ and $select_c$ take $O(\log \sigma)$ time. The access query takes constant time because $S$ is explicitly stored. The total space is that of storing $S$ plus $O(n \log\log n / \log n)$ bits for the $rank_1$ and $select_1$ indexing data structures, plus the size of the lookup table which is $\sqrt{n} \cdot \frac{1}{2} \log_\sigma n = o(n)$ bits.

    Footnote 2: Observe that these data structures do not store $B$ directly, so to retrieve $O(\log n)$ consecutive bits of $B$ in $O(1)$ time is no longer trivial.

    In Table 1, $|S|$ denotes the number of bits to encode $S$. It is $n \log \sigma$ if $S$ is not compressed; however, it can be reduced by applying a compression algorithm which supports instant

    Theorem 7. There exists a succinct data structure for storing a string $S[1..n]$ over an alphabet $A = \{0, 1, \ldots, \sigma - 1\}$ in

    $nH_k + O\left(\frac{n(\log\log n + k \log \sigma)}{\log_\sigma n}\right)$

    bits for any $k \ge 0$, where $H_k$ is the $k$th order empirical entropy of $S$, such that any substring of the form $S[i \ldots i + O(\log_\sigma n)]$ with $i \in \{1, 2, \ldots, n\}$ can be decoded in $O(1)$ time on the word-RAM.

    By using this theorem, we can compress $S$ into $nH_k + o(n \log \sigma)$ bits. Furthermore, we can regard the compressed data as an uncompressed string. Therefore the query time in Table 1 does not change.
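The word-level trick that $D_2$ uses to extract bits of $V_c$ from a packed chunk of $S$ (replicate the pattern $00\ldots01$, multiply by $c$, exclusive-or with the chunk) can be sketched as follows. This is an illustrative sketch under assumptions: the helper names `pack` and `match_bits` are hypothetical, Python integers stand in for machine words, and the final per-field zero test is done with a short loop where the text uses a precomputed lookup table.

```python
def pack(symbols, bits_per_symbol):
    """Pack symbols into one integer, first symbol in the most significant field."""
    r = 0
    for x in symbols:
        r = (r << bits_per_symbol) | x
    return r


def match_bits(r, c, bits_per_symbol, num_symbols):
    """Return a 0/1 list with a 1 at position i iff the i-th packed symbol equals c."""
    # s: num_symbols copies of the pattern 00...01
    s = 0
    for _ in range(num_symbols):
        s = (s << bits_per_symbol) | 1
    t = s * c     # every field now holds c (no carries, since c < 2**bits_per_symbol)
    u = r ^ t     # a field of u is all zeros exactly where the symbol equals c
    field_mask = (1 << bits_per_symbol) - 1
    result = []
    for i in range(num_symbols):
        shift = (num_symbols - 1 - i) * bits_per_symbol
        # Assumption: this loop replaces the table lookup described in the text.
        result.append(1 if ((u >> shift) & field_mask) == 0 else 0)
    return result
```

For example, with an alphabet of size 4 (2 bits per symbol), `match_bits(pack([2, 0, 3, 2, 1, 2], 2), 2, 2, 6)` marks the positions holding the symbol 2, which is the corresponding slice of the bit vector for $c = 2$.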

    5. Concluding Remarks

    Succinct data structures that support efficient rank and select queries on bit vectors and non-binary vectors are important because they form the basis of several other, more complex data structures. Some examples include succinct data structures for representing trees [2, 6, 10, 16, 17, 22, 23], graphs [3, 15], permutations and functions [21, 24], text indexes [7, 14, 29, 30], prefix or range sums [26], and polynomials and others [9]. In these data structures, a typical use of rank and select queries on bit vectors is to encode pointers to blocks of data. For example, suppose that to compress some data we partition it into blocks, compress each block independently into a variable number of bits, and concatenate the result into a bit vector $C$. Then we can use another bit vector $B[1..|C|]$ such that $B[i] = 1$ if and only if the $i$th bit of $C$ is the starting position of a block, and apply $select_1$ queries on $B$ to find the correct starting and ending positions in $C$ when decompressing the data corresponding to a particular block.

    Some directions for further research include dynamization to support operations that allow $B$ to be modified online [27], proving lower bounds on the size of succinct data structures [5, 11, 19] (note that the lower bounds shown in Section 3 hold only if the bit vector is stored explicitly using $n$ bits, and thus do not hold for bit vectors stored in a compressed form), and practical implementation [25]. Although the sizes of the known indexing data structures for bit vectors are asymptotically optimal, the $o(n)$ additional space needed by an index is often too large for real data and cannot be ignored. Therefore, for practical applications, it is crucial to develop other implementations of succinct data structures. Another open problem involves access/rank/select operations on non-binary vectors. No single data structure listed in Table 1 supports constant time access, rank and select queries. What are the best possible lower and upper bounds on the number of bits required to achieve this?

    Finally, a related topic is compressed suffix arrays [14], which are data structures for efficient substring searches. The suffix array [18] uses $n \log n$ bits for a string of length $n$ with alphabet size $\sigma$, while the compressed suffix array uses only $O(n \log \sigma)$ bits, which is linear in the string size. On the other hand, the compressed suffix array does not support constant time retrieval of an element of the suffix array. An important open problem is to establish whether there exists a data structure using linear space and supporting constant time retrieval.
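The block-pointer use of $select_1$ described in the concluding remarks can be sketched as follows. The names `select1_naive` and `block_range` are hypothetical, and the naive scan merely stands in for a constant-time select index; the point is only how two select queries delimit a variable-length block.

```python
def select1_naive(B, i):
    """1-based position of the i-th 1 in B, or None (stand-in for a real index)."""
    seen = 0
    for pos, b in enumerate(B, start=1):
        seen += b
        if b and seen == i:
            return pos
    return None


def block_range(B, k):
    """Return the 1-based inclusive (start, end) positions in C of the k-th block."""
    start = select1_naive(B, k)
    if start is None:
        raise IndexError("no such block")
    next_start = select1_naive(B, k + 1)
    end = next_start - 1 if next_start is not None else len(B)
    return start, end


# Three compressed blocks of 3, 1, and 4 bits concatenated into C.
blocks = ["101", "0", "1100"]
C = "".join(blocks)
B = [1 if j == 0 else 0 for blk in blocks for j in range(len(blk))]
start, end = block_range(B, 3)
print(C[start - 1:end])   # the bits of the third block
```

Storing $B$ with a select index avoids keeping an explicit array of block offsets, at the cost of one marker bit per bit of $C$ plus the index.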


    References

    [1] J. Barbay, M. He, J. I. Munro, and S. S. Rao. Succinct indexes for strings, binary relations and multi-labeled trees. In Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2007), pages 680-689, 2007.

    [2] D. Benoit, E. D. Demaine, J. I. Munro, R. Raman, V. Raman, and S. S. Rao. Representing Trees of Higher Degree. Algorithmica, 43(4):275-292, 2005.

    [3] D. K. Blandford, G. E. Blelloch, and I. A. Kash. Compact representations of separable graphs. In Proceedings of the 14th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2003), pages 679-688, 2003.

    [4] D. Clark. Compact Pat Trees. PhD thesis, The University of Waterloo, Canada, 1996.

    [5] E. D. Demaine and A. Lopez-Ortiz. A Linear Lower Bound on Index Size for Text Retrieval. Journal of Algorithms, 48(1):2-15, 2003.

    [6] P. Ferragina, F. Luccio, G. Manzini, and S. Muthukrishnan. Structuring labeled trees for optimal succinctness, and beyond. In Proceedings of the 46th Annual IEEE Symposium on Foundations of Computer Science (FOCS 2005), pages 184-196, 2005.

    [7] P. Ferragina and G. Manzini. Indexing compressed texts. Journal of the ACM, 52(4):552-581, 2005.

    [8] P. Ferragina and R. Venturini. A simple storage scheme for strings achieving entropy bounds. Theoretical Computer Science, 372(1):115-121, 2007.

    [9] A. Gal and P. B. Miltersen. The cell probe complexity of succinct data structures. In Proceedings of the 30th International Colloquium on Automata, Languages and Programming (ICALP 2003), volume 2719 of Lecture Notes in Computer Science, pages 332-344. Springer-Verlag, 2003.

    [10] R. F. Geary, N. Rahman, R. Raman, and V. Raman. A simple optimal representation for balanced parentheses. In Proceedings of the 15th Annual Symposium on Combinatorial Pattern Matching (CPM 2004), volume 3109 of Lecture Notes in Computer Science, pages 159-172. Springer-Verlag, 2004.

    [11] A. Golynski. Optimal lower bounds for rank and select indexes. Theoretical Computer Science, 387(3):348-359, 2007.

    [12] A. Golynski, J. I. Munro, and S. S. Rao. Rank/select operations on large alphabets: a tool for text indexing. In Proceedings of the 17th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2006), pages 368-373, 2006.

    [13] R. Grossi, A. Gupta, and J. S. Vitter. High-order entropy-compressed text indexes. In Proceedings of the 14th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2003), pages 841-850, 2003.

    [14] R. Grossi and J. S. Vitter. Compressed Suffix Arrays and Suffix Trees with Applications to Text Indexing and String Matching. SIAM Journal on Computing, 35(2):378-407, 2005.

    [15] G. Jacobson. Space-efficient static trees and graphs. In Proceedings of the 30th Annual Symposium on Foundations of Computer Science (FOCS 1989), pages 549-554, 1989.

    [16] J. Jansson, K. Sadakane, and W.-K. Sung. Ultra-succinct Representation of Ordered Trees. In Proceedings of the 18th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2007), pages 575-584, 2007.

    [17] H.-I. Lu and C.-C. Yeh. Balanced Parentheses Strike Back. To appear in ACM Transactions on Algorithms, 2008.

    [18] U. Manber and G. Myers. Suffix arrays: A New Method for On-Line String Searches. SIAM Journal on Computing, 22(5):935-948, October 1993.

    [19] P. B. Miltersen. Lower bounds on the size of selection and rank indexes. In Proceedings of the 16th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2005), pages 11-12, 2005.

    [20] J. I. Munro. Tables. In Proceedings of the 16th Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS 1996), volume 1180 of Lecture Notes in Computer Science, pages 37-42. Springer-Verlag, 1996.

    [21] J. I. Munro, R. Raman, V. Raman, and S. S. Rao. Succinct representations of permutations. In Proceedings of the 30th International Colloquium on Automata, Languages and Programming (ICALP 2003), volume 2719 of Lecture Notes in Computer Science, pages 345-356. Springer-Verlag, 2003.

    [22] J. I. Munro and V. Raman. Succinct representation of balanced parentheses and static trees. SIAM Journal on Computing, 31(3):762-776, 2001.

    [23] J. I. Munro, V. Raman, and S. S. Rao. Space efficient suffix trees. Journal of Algorithms, 39(2):205-222, 2001.

    [24] J. I. Munro and S. S. Rao. Succinct Representations of Functions. In Proceedings of the 31st International Colloquium on Automata, Languages and Programming (ICALP 2004), volume 3142 of Lecture Notes in Computer Science, pages 1006-1015. Springer-Verlag, 2004.

    [25] D. Okanohara and K. Sadakane. Practical Entropy-Compressed Rank/Select Dictionary. In Proceedings of the Workshop on Algorithm Engineering and Experiments (ALENEX 2007), 2007.

    [26] C. K. Poon and W. K. Yiu. Opportunistic Data Structures for Range Queries. In Proceedings of Computing and Combinatorics, 11th Annual International Conference (COCOON 2005), volume 3595 of Lecture Notes in Computer Science, pages 560-569. Springer-Verlag, 2005.


    [27] R. Raman, V. Raman, and S. S. Rao. Succinct dynamic data structures. In Proceedings of Algorithms and Data Structures, 7th International Workshop (WADS 2001), volume 2125 of Lecture Notes in Computer Science, pages 426-437. Springer-Verlag, 2001.

    [28] R. Raman, V. Raman, and S. S. Rao. Succinct indexable dictionaries with applications to encoding k-ary trees, prefix sums and multisets. ACM Transactions on Algorithms, 3(4):Article 43, 2007.

    [29] K. Sadakane. New Text Indexing Functionalities of the Compressed Suffix Arrays. Journal of Algorithms, 48(2):294-313, 2003.

    [30] K. Sadakane. Compressed Suffix Trees with Full Functionality. Theory of Computing Systems, 41(4):589-607, 2007.

  • In: Software Engineering and Development ISBN: 978-1-60692-146-3 Editor: Enrique A. Belini, pp. 13-22 2009 Nova Science Publishers, Inc.

    Expert Commentary B

    HETEROGENEITY AS A CORNER STONE OF SOFTWARE DEVELOPMENT IN ROBOTICS

    Juan-Antonio Fernández-Madrigal (a), Ana Cruz-Martín (b), Cipriano Galindo (c) and Javier González (d)

    System Engineering and Automation Department, University of Málaga (Spain)

    Abstract

    In recent years the complexity of robotic applications has raised important problems, particularly in large and/or long-term robotic projects. Software engineering (SE) seems the obvious key to breaking that barrier: it provides good maintenance and reuse, copes with the exponential growth of programming effort, and integrates diverse components with guarantees. Surprisingly, SE has never been very relevant within the robotics community. In this text we briefly describe some causes for that, review the evolution of robotic software over time, and provide some insights from our most recent contributions. We have found that many problems arising from the conflicts raised by robotic complexity can be well addressed from an SE perspective as long as the focus is, at all levels, on the heterogeneity of components and methodologies. Therefore we propose heterogeneity as one of the corner stones of robotic software at present.

    1. Introduction

    Robots are mechatronic systems [2]; they therefore integrate electromechanical devices and software. Although software has usually contributed to robotics with its plasticity when compared to implementations on bare hardware, some of its limits, mostly computational complexity [19] and its intrinsic nature as a manipulator of pre-existing symbols, have also limited the finding of practical solutions to some robotic problems+.

    a E-mail address: [email protected] b E-mail address: [email protected] c E-mail address: [email protected] d E-mail address: [email protected]

    In the last decades, a new limit of robotic software has become more and more evident as robotic projects have become large (for example, the development of complete robot architectures [18] or multirobot systems [3]) and/or long-term. Projects of this kind necessarily have to cope with the integration of very diverse components, ontologies, and methods in a guaranteed, maintainable and extensible way. It is clear at present that, under those requirements, sophisticated software methodologies are unavoidable for breaking the barrier of complexity without sacrificing robotic dependability.

    In particular, software engineering (SE) seems the obvious key for helping in these issues, namely by providing good maintenance and reuse and by integrating diverse components with guarantees. Surprisingly, SE has never been considered very relevant within the robotics community, as demonstrated by the lack of specialized journals in the area and the reduced number of workshops on SE organized by robotics people. A few reasons could be: the small scale of most robotic applications until roughly one decade ago, the strongly reductionist methodology used in typical robotic problems, and, in some cases, the improper belief that SE has nothing to do with "pure" robotics.

    Only a few research groups within robotics have proposed tools more or less inspired by SE methodologies during the last twenty years (see section 2). These tools have set the basis for a deeper knowledge of the limits and characteristics of robotic software, but they have not yet formed a consistent solution that covers all aspects of such software.

    Recently, we have proposed a characteristic that serves to differentiate robotic software: heterogeneity ([10], [12], [13]). We have found that many problems arising from the conflicts between robotic complexity and dependability can be well addressed from a SE perspective as long as heterogeneity is included, as a core feature, at all levels of development, instead of forcing the use of a common solution and set of methodologies. Therefore, we have constructed a minimalistic framework for covering the main stages of heterogeneous software development: design, implementation, validation, debugging, and maintenance. This framework, called BABEL, has demonstrated that allowing a high level of heterogeneity at all the levels of development facilitates the achievement of modern robotic goals.

    In the following we explore the evolution of robotic software (section 2) and show, through BABEL, the role of heterogeneity as one of the corner stones of robotic software (sections 3 and 4). Since particular and detailed examples and results have been reported elsewhere during the last years, we focus here on the essential aspects of our approach.

    2. The Main Stages in the Evolution of Robotic Software

    Like the robotics realm itself, robotic software has continually evolved, in parallel with the hardware and software technologies available at each moment. To summarize the main trends, we identify here three different stages in time, each of them characterized by a particular software issue that received certain effort from the robotics research community in order to solve the robotic problems of the time. Nevertheless, these stages should be understood as a convenient discretization of a continuous process; thus it is not rare that the works mentioned here can be considered to belong to more than one stage.

    + For instance, efficient solutions to the Simultaneous Localization and Mapping problem suffer from software complexity [17]. Also, the autonomous acquisition of symbols from sub-symbolic information [8] is still an open issue.

    The first stage in the evolution of robotic software, which we could call raw robotic programming, ran from the late sixties until the late eighties of the twentieth century. In that period robotics programming was limited to solutions based on direct hardware implementation of algorithms [4] or ad-hoc programming of concrete hardware platforms [23], [32]. Most of the development of programming languages for robots during that period was focused on industrial manipulators [25], although those languages were of a very low level (close to assembler).

    In a second stage, which we could call middleware robotic programming and which extended around the early nineties of the twentieth century, the goal of robotic software developers shifted to providing pre-defined software platforms to control the physical devices of the robot (actuators and sensors). Although this software was tightly coupled to specific hardware, it alleviated the until then heavy task of programming complex robots. In this period, some robotic software was in fact a real-time operating system, like ALBATROSS [36], Harmony [20], or Chimera II [33]. But this stage did not stop there: these platforms enabled more complex processing, and, accordingly, the notion of robotic control architecture (a set of software elements or modules that work together in order to achieve a robotic task) also received attention from the robotics community+. So, the first years of the nineties also offered interesting architectures like TCA [29] or NASREM [1]. Since then, architecture solutions have been continuously released to the robotics arena: e.g., new robotics fields (for example, multirobots) demanded their own architectural approaches ([24]).

    Finally, we can distinguish a last stage of robotics software that spans from the mid nineties to the present and can be called robotics software engineering. The key point at this stage is that some SE aspects are considered when programming robots, mainly due to the still increasing complexity of robotic applications. Now, the goal is not to produce a closed or static architecture, but a framework that allows developers to produce the architectural solution they may need in their particular situation. Examples of free, commercial, and/or academic frameworks are ORCCAD [30], Cimetrix's CODE [7], RTI's ControlShell [28], GeNoM [16], NEXUS [9] (a previous instance of our current BABEL development system), OSACA [31], OROCOS [35], Player/Stage [26], CARMEN [37], MARIE [38], RobotFlow [25], CLARAty [13], or Microsoft Robotics Studio [22]. Different SE concepts (such as object-oriented programming, software lifecycle, software validation, reusability, CASE tools, or automatic code generation) are being progressively included in these frameworks. However, not all of them focus on SE in the same manner or with the same intensity. In particular, it is very common that they are not able to deal with heterogeneity in a desirable way, which is our aim with BABEL.

    + Notice that a robot software architecture can be conceptually seen nowadays as a robotic middleware. For more details on the current state of the art of robotic software, you can consult [5].


    3. Towards a Heterogeneity-Based Robotic Software Development System

    Currently, large and/or long-term robotic projects involve many different researchers with very different programming needs and areas of research, using a variety of hardware and software that must be integrated efficiently (i.e., with a low development cost) in order to construct applications that satisfy not only classic robotic requirements (fault-tolerance, real-time specifications, intensive access to hardware, etc.) but also software engineering aspects (reusability, maintainability, etc.). This indicates three main sources of heterogeneity: hardware, software, and methodological. They appear with different strength at each stage of the robotic application lifecycle: analysis, design, verification, implementation, validation, debugging, maintenance, and so on.

    Our aim with the identification and inclusion of heterogeneity as one of the pervasive features of robotic applications is to set the basis for a comprehensive software development framework, in the sense that it covers all the stages of the robotic software lifecycle. Up to now, we have reported, in the context of our BABEL development system, tools and methodologies for stages that are present in the most common models for software development (Waterfall, Iterative, etc. [27]), which are described in the following.

    3.1. Robotic Software Design

    Software design consists of finding a conceptual solution to a given problem in terms of software components and methodologies. From the heterogeneity perspective, the design of a robotic application should be the foundation for integrating diverse elements while guaranteeing certain requirements (produced by a previous stage of analysis, not covered here). The problem in robotic systems is that there is no wide standardization for components, and thus, forcing the use of one standard or of some unique framework is difficult to achieve.

    Our philosophy is the opposite: we consider heterogeneity in components as a core feature of the design framework, one that is to be preserved. Thus, we keep the framework to a minimum, establishing the smallest structural and behavioral ontologies of design that allow us to express the most important requirements in a robotic application without sacrificing diversity.

    BABEL provides a heterogeneity-based specification for the design of robotic applications, currently called Aracne. Aracne is based on three design ontologies: structural, behavioral, and the structural-behavioral link. Each of these contains a minimalistic set of specifications where most of the heterogeneity present in a complex robotic application can fit. Recently, Aracne has been extended into a specification language, called H [12].

    The main features of Aracne/H are: Clear identification of both the software components that are portable and those that

    are tied to some platform. The former have a low heterogeneity level, while the latter comprise most of the heterogeneity present in the application.

    The structural ontology is based on active objects or components [34], called modules, that provide certain services to other modules and maintain an internal

  • Heterogeneity as a Corner Stone of Software Development in Robotics 17

    status. This ontology is different from the one of the object oriented paradigm, since ours includes execution and intercommunication models explicitly (concurrency, synchronization, client-server/subscriber-publisher behavior, etc.).

    The inclusion of an execution model into the structural ontology links it with the behavioral ontology, where the logic of services is specified. Aracne/H permits us to design the logic (code) of the modules in different programming languages or visual methodologies, in a way that isolates and highlights heterogeneity.

    The structural-behavioral link makes up a complete design for the application. We currently include in the ontology for this link basic fault-tolerance mechanisms in the form of software replication [21].

    The three ontologies of Aracne/H include the explicit specification of the most important robotic requirements, namely: hardware dependencies, real-time, and fault-tolerance.

    Aracne/H allows us to design, with the same specification, both distributed and non-networked applications, hard, soft, and mixed real-time systems, modules with different programming paradigms, etc.

    Finally, due to its intrinsic minimalistic nature, the specification is open to cope with most off-the-shelf components and with their evolution in time, important concerns in the component-based software engineering field (CBSE) [34].
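
    The features listed above can be illustrated with a minimal sketch. It is not the Aracne/H notation, which this chapter does not reproduce; the names Service, Module, wcet_ms, and the platform label "hypothetical-sonar-board" are assumptions made only for illustration. The sketch shows a module that keeps its structural declaration (services, timing, platform dependency) apart from its behavioral logic, so that portable and platform-bound parts can be told apart.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Optional


@dataclass
class Service:
    """Structural part: a named service with its declared requirements."""
    name: str
    wcet_ms: float                    # declared worst-case execution time
    platform: Optional[str] = None    # None means portable; otherwise tied to a platform


@dataclass
class Module:
    """An active component that offers services and keeps an internal status."""
    name: str
    services: Dict[str, Service] = field(default_factory=dict)
    logic: Dict[str, Callable[[dict, float], float]] = field(default_factory=dict)
    status: dict = field(default_factory=dict)

    def add_service(self, service: Service, logic: Callable[[dict, float], float]) -> None:
        # The structural declaration and the behavioral logic are stored separately.
        self.services[service.name] = service
        self.logic[service.name] = logic

    def portable_services(self) -> List[str]:
        return sorted(s.name for s in self.services.values() if s.platform is None)

    def platform_bound_services(self) -> List[str]:
        return sorted(s.name for s in self.services.values() if s.platform is not None)

    def call(self, service_name: str, argument: float) -> float:
        return self.logic[service_name](self.status, argument)


def smooth(status: dict, reading: float) -> float:
    """Portable logic: average the new reading with the previous result."""
    previous = status.get("last", reading)
    status["last"] = 0.5 * previous + 0.5 * reading
    return status["last"]


def read_sonar(status: dict, channel: float) -> float:
    """Stand-in for a driver call that would only work on one platform."""
    return 1.0 + channel


sonar = Module("sonar")
sonar.add_service(Service("smooth", wcet_ms=0.2), smooth)
sonar.add_service(Service("read", wcet_ms=2.0, platform="hypothetical-sonar-board"), read_sonar)
```

    In this sketch, porting the module to a new robot would only require replacing the logic registered for the services that portable_services() does not list.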

    Currently we have a visual design tool, called the Module Designer, that implements the Aracne specification and allows us to build the design of the application easily, integrating different programming languages and platforms. We have developed a number of robotic applications over the last decade using Aracne (see for example [14], [15], [13], [12]), obtaining important benefits, as summarized in Section 4.

    3.2. Robotic Software Implementation

    One of the most relevant differences between BABEL and other approaches is that it brings all the elements of the application into the design, including behavior, i.e., code. That allows us to develop tools for generating implementations almost directly from the design.

    The Module Designer of the BABEL development system is a CASE tool that not only facilitates heterogeneous design, but is also able to automatically transform that design into an implementation, which can be supported by heterogeneous execution platforms. Its main features are:

    It provides a user-friendly integrated development environment for visually designing modules according to the Aracne/H specification, specifying their public interfaces, services, codifications and dependencies.

    The tool automatically generates the software for converting that design into a complete executable program and for the integration of this program into a (possibly distributed) robotic application composed of other modules.

    It includes the possibility of generating implementations for a given set of particular platforms, and is extensible to platforms not covered yet. Examples of the platforms and languages supported are reported elsewhere [13].

    The Module Designer also includes logging and debugging facilities that can be placed at critical paths in the logic (code) of modules. This links the implementation to the verification and validation stages.

    3.3. Robotic Software Verification and Validation

    The goal of software verification is to guarantee that a given design/implementation satisfies all its requirements. In a design made with the Aracne specification it is easy to check for a number of possible pitfalls during design, and for conflicts between dependencies during implementation, regardless of the highly heterogeneous nature of the components that make up the application. This includes:

    Checking the possibility of satisfying the real-time requirements of the different components (currently simplified to WCETs, Worst-Case Execution Times).

    Checking if all the platforms needed for the satisfaction of the requirements are present in the deployment (otherwise, the Module Designer adapts the implementation for reduced requirement satisfaction).

    Checking for some limited kinds of dependency cycles that could end in deadlocks.

    Also, the information present in the design is enough to carry out scheduling analysis when the application needs hard real-time.

    On the other hand, the goal of software validation is to check if a given design/implementation satisfies the intended application in practice. A debugger tool is included in BABEL that retrieves execution information for carrying out off-line analysis of the real-time performance for the cases where that information reflects robotic goals (for example, in navigation modules that need to react to the environment at predefined times). Sometimes, this tool also serves for verification, discovering errors or faults in programming.
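
    The design-time checks named in this section can be sketched in a few lines. This is a hedged illustration, not the Module Designer's implementation: the function names and the data layout are hypothetical, and since the chapter does not say which scheduling analysis BABEL applies, the classic Liu and Layland utilization bound for rate-monotonic scheduling is used here purely as a stand-in for a WCET-based test.

```python
from typing import Dict, List, Optional, Set, Tuple


def utilization_test(tasks: List[Tuple[float, float]]) -> bool:
    """Sufficient rate-monotonic test: sum(WCET / period) <= n * (2**(1/n) - 1)."""
    n = len(tasks)
    if n == 0:
        return True
    utilization = sum(wcet / period for wcet, period in tasks)
    return utilization <= n * (2 ** (1 / n) - 1)


def missing_platforms(required: Dict[str, Set[str]], deployed: Set[str]) -> Dict[str, Set[str]]:
    """Return, per module, the required platforms that the deployment lacks."""
    return {m: needs - deployed for m, needs in required.items() if needs - deployed}


def find_dependency_cycle(depends_on: Dict[str, List[str]]) -> Optional[List[str]]:
    """Depth-first search; returns one cycle as a list of module names, or None."""
    WHITE, GREY, BLACK = 0, 1, 2          # unvisited, on the current path, finished
    colour = {module: WHITE for module in depends_on}
    path: List[str] = []

    def visit(module: str) -> Optional[List[str]]:
        colour[module] = GREY
        path.append(module)
        for dep in depends_on.get(module, []):
            state = colour.get(dep, WHITE)
            if state == GREY:             # back edge: dep is already on the path
                return path[path.index(dep):] + [dep]
            if state == WHITE:
                cycle = visit(dep)
                if cycle:
                    return cycle
        path.pop()
        colour[module] = BLACK
        return None

    for module in list(depends_on):
        if colour[module] == WHITE:
            cycle = visit(module)
            if cycle:
                return cycle
    return None


# Hypothetical design data for three modules.
dependencies = {
    "navigator": ["localizer"],
    "localizer": ["odometry", "navigator"],
    "odometry": [],
}
cycle = find_dependency_cycle(dependencies)
```

    A cycle in the service dependencies does not always mean a deadlock, which is consistent with the text's wording that only some limited kinds of cycles are checked; a real tool would also need the synchronization model of each call.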

    3.4. Robotic Software Maintenance

    Large and/or long-term robotic projects cannot be carried out efficiently without software bootstrapping, that is, without reuse. However, the trend in robotics of repeating the same programming effort time after time for different platforms or control architectures is still quite evident. As long as this way of working persists, the development of complex robotic applications will be severely handicapped.

    The Aracne design specification, and in particular its most recent extension H, is aimed at reuse by including some characteristics of the object-oriented paradigm. For instance, in H inheritance has been adapted to the heterogeneity of the design: the structural design of a module can be inherited separately from the behavioral design, allowing us to maintain a repository of logical interfaces and a set of behavioral logics (code) that fit into them. In summary, inheritance allows us to specialize previous developments to new needs, while the separation between the structural and the codification design isolates the changes that have to be made due to the evolution of hardware or software.
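
    The idea of inheriting structure and behavior along separate lines can be sketched with ordinary classes. This is only an analogy under stated assumptions: H has its own inheritance mechanism, which is not shown in this chapter, and every class name below is hypothetical. Abstract classes play the role of the repository of logical interfaces, and plain classes play the role of the behavioral logics that fit into them.

```python
from abc import ABC, abstractmethod


class RangeSensor(ABC):
    """Structural design: a logical interface, free of platform code."""

    @abstractmethod
    def read_range_m(self) -> float: ...


class FilteredRangeSensor(RangeSensor):
    """Structural specialization: inherits the interface and adds a service."""

    @abstractmethod
    def set_window(self, size: int) -> None: ...


class SimulatedRangeLogic:
    """Behavioral design: a reusable logic that fits the RangeSensor interface."""

    def __init__(self, samples):
        self._samples = list(samples)
        self._index = 0

    def read_range_m(self) -> float:
        value = self._samples[self._index % len(self._samples)]
        self._index += 1
        return value


class AveragingLogic(SimulatedRangeLogic):
    """Behavioral specialization: inherits the logic and adds averaging."""

    def __init__(self, samples):
        super().__init__(samples)
        self._window = 1

    def set_window(self, size: int) -> None:
        self._window = max(1, size)

    def read_range_m(self) -> float:
        total = 0.0
        for _ in range(self._window):
            total += SimulatedRangeLogic.read_range_m(self)
        return total / self._window


class SimulatedFilteredSensor(AveragingLogic, FilteredRangeSensor):
    """A concrete module: one behavioral logic combined with one interface."""


sensor = SimulatedFilteredSensor([1.0, 3.0])
first = sensor.read_range_m()     # window of 1: the first raw sample
sensor.set_window(2)
second = sensor.read_range_m()    # average of the next two raw samples
```

    When the hardware changes, only a new behavioral class would be needed in this sketch; the two interface classes, and any code written against them, stay untouched.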

    To cope with the large amount of information that this approach generates (which must be appropriately stored in any large robotic project), BABEL also includes a tool for maintenance. This tool is a web site under development [11] that holds all the designs produced up to now, classified through a simple versioning system. It is accessible to the members of our group and their collaborators, since the data held there belongs to particular research projects. Nevertheless, some of the modules and results, all the documentation, and the most tested configuration of BABEL are freely available from the site.

    4. Results, Conclusions, and Future Work

    Throughout this text we have identified one of the most important characteristics of robotic software, one that differentiates it from other kinds of software applications: heterogeneity. This heterogeneity is to be understood as a pervasive characteristic: every stage and every level of detail in the development of a robotic application should deal appropriately with diverse components and methodologies.

    Based on this idea we have described our approach to the treatment of heterogeneity from a software engineering perspective. Our solution is a minimalistic software development system, called BABEL, that aims to cover the most important stages of the robotic software lifecycle. We have used BABEL for our robotic projects during the last decade. The benefits have been evident:

    It has made it possible to break the complexity barrier in current robotic applications, mainly by enforcing the reuse of software and eliminating the complete re-programming of algorithms for each new platform.

    It has allowed us to guarantee the most relevant requirements of robotic applications, mostly regarding dependability.

    The time devoted to robotic programming has been, in general, transformed from exponential into linear [13].

    BABEL has allowed us to develop very different research projects with very different requirements, hardware components (we have a number of different robots in our laboratories), and software (operating systems, libraries, etc.), without slowing down the development due to component evolution and diversity.

    Our system has enabled the integration of people with very different skills into interdisciplinary research groups (from undergraduate students to professors).

    Part of the effort that was not spent in re-programming existing applications for new robots is now devoted to maintaining the BABEL system. Thus, we are working on including new hardware and software components as we acquire new equipment, and also on continuously adjusting the ontologies of the Aracne/H specification to cover as much diversity as we can while keeping them as minimalistic as possible.

    However, this kind of effort has important drawbacks for us as robotic researchers. In an interesting article, Bruyninckx [6] mentions some of them:

    The small interest that good software practice holds for the robotics researcher, since it cannot be translated into citation index or other tangible results.

    Funding is currently centered more on fundamental research than on middleware projects that, though they do not offer new or original results, are the basis for well-supported robotics research in the long term.

    Robotics experts are often not really interested in those advances that software engineering could offer to the developers of robotic software.

    From an optimistic point of view, it is clear that interest in software engineering in robotics has been increasing in recent years. Although there is much work to do to consolidate this trend, the unavoidable necessity of breaking the complexity barrier in software development in order to build the robotic applications of the present and future should be enough to change the status of robotic software within the community.

    References

    [1] Albus J.S., Quintero R., Lumia R. An overview of Nasrem - The NASA/NBS standard reference model for telerobot control system architecture. NISTIR 5412, National Institute of Standards and Technology, Gaithersburg, Md. 1994.

    [2] Appukutam K.K., Introduction to Mechatronics, Oxford University Press, ISBN 978-0195687811, 2007.

    [3] Balch T., Parker L.E. (eds), Robot Teams: from Diversity to Polymorphism, AK Peters, ISBN 1-56881-155-1, 2002.

    [4] Brooks R.A. A Robust Layered Control System for a Mobile Robot. IEEE Journal of Robotics and Automation, Vol. RA-2, no. 1. 1986.

    [5] Brugali D. Software Engineering for Experimental Robotics. Springer STAR. 2007.

    [6] Bruyninckx H. Robotics Software: The Future Should Be Open. IEEE Robotics and Automation Magazine, March 2008, pp. 9-11. 2008.

    [7] Cimetrix CODE. http://www.cimetrix.com/code.cfm. 2008.

    [8] Coradeschi S., Saffiotti A., An Introduction to the Anchoring Problem, Robotics and Autonomous Systems, vol. 43, no. 2-3, pp. 85-96, 2003.

    [9] Fernández J.A., González J. The NEXUS Open System for Integrating Robotic Software. Robotics and Computer-Integrated Manufacturing, Vol. 15(6). 1999.

    [10] Fernández-Madrigal J.A., Galindo C., González J., Integrating Heterogeneous Robotic Software, 13th IEEE Mediterranean Electrotechnical Conference (MELECON), Benalmádena-Málaga (Spain), May 16-19, 2006.

    [11] Fernández-Madrigal J.A., Cruz-Martín E., The BABEL Development Site, http://babel.isa.uma.es/babel2, 2008.

    [12] Fernández-Madrigal J.A., Galindo C., Cruz A., González J., A Software Framework for Coping with Heterogeneity in the Shop-Floor, Assembly Automation no. 4, vol. 27, pp. 333-342, ISSN 0144-5154, 2007.

    [13] Fernández-Madrigal J.A., Galindo C., González J., Cruz E., and Cruz A., A Software Engineering Approach for the Development of Heterogeneous Robotic Applications, Robotics and Computer-Integrated Manufacturing, vol. 24, no. 1, pp. 150-166, ISSN 0736-5845, 2008.

    [14] Fernández-Madrigal J.A., González J., A Visual Tool for Robot Programming, 15th IFAC World Congress on Automatic Control, Barcelona, Spain, July 2002.

    [15] Fernández-Madrigal J.A., González J., NEXUS: A Flexible, Efficient and Robust Framework for Integrating the Software Components of a Robotic System, IEEE International Conference on Robotics and Automation (ICRA'98), Leuven, Belgium, May 1998.

    [16] Fleury S., Herrb M., Chatila R. GenoM: A Tool for the Specification and the Implementation of Operating Modules in a Distributed Robot Architecture. Proceedings of the IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS'97). 1997.

    [17] Frese U., Larsson P., Duckett T., A Multilevel Relaxation Algorithm for Simultaneous Localization and Mapping, IEEE Transactions on Robotics, vol. 21, no. 2, pp. 196-207, 2005.

    [18] Galindo C., González J., Fernández-Madrigal J.A., Control Architecture for Human-Robot Integration. Application to a Robotic Wheelchair, IEEE Transactions on Systems, Man, and Cybernetics part B, vol. 36, no. 5, pp. 1053-1067, 2006.

    [19] Garey M.R., Johnson D.S., Computers and Intractability: A Guide to the Theory of NP-Completeness, Freeman (ed.), ISBN 978-0716710455, 1979.

    [20] Gentleman W.M., MacKay S.A., Stewart D.A., Wein M. An Introduction to the Harmony Realtime Operating System. Newsletter of the IEEE Computer Society Technical Committee on Operating Systems. 1988.

    [21] Guerraoui R., Schiper A., Software-Based Replication for Fault Tolerance, IEEE Computer, vol. 30, no. 4, 1997.

    [22] Microsoft Robotics Studio. http://msdn.microsoft.com/en-us/robotics/default.aspx. 2008.

    [23] Mitchell T.M. Becoming Increasingly Reactive. Proceedings of the AAAI Conference. 1990.

    [24] Parker L.E. ALLIANCE: An Architecture for Fault Tolerant Multi-Robot Cooperation. IEEE Transactions on Robotics and Automation, 14 (2). 1998.

    [25] Paul R.P., Robot Manipulators: Mathematics, Programming and Control, MIT Press, ISBN 0-262-16082-X, 1981.

    [26] Player Project. http://playerstage.sourceforge.net/. 2008.

    [27] Pressman R.S., Software Engineering. A Practitioner's Approach, 6th edition, McGraw-Hill, ISBN 978-0073019338, 2004.

    [28] Schneider S.A., Ullman M.A., Chen V.W. ControlShell: a real-time software framework. IEEE International Conference on Systems Engineering. 1991.

    [29] Simmons R., Lin L.-J., Fedor C. Autonomous Task Control for Mobile Robots (TCA). Fifth IEEE International Symposium on Intelligent Control. 1990.

    [30] Simon D., Espiau B., Castillo E., Kapellos K. Computer-Aided Design of a Generic Robot Controller Handling Reactivity and Real-Time Control Issues. Rapports de Recherche n° 1801, Programme 4: Robotique, Image et Vision, INRIA. 1992.

    [31] Sperling W., Lutz P. Enabling Open Control Systems: An Introduction to the OSACA System Platform. Robotics and Manufacturing, Vol. 6, ASME Press New York. 1996.

    [32] SRI: Shakey the Robot. http://www.sri.com/about/timeline/shakey.html. 2008.

    [33] Stewart D.B., Schmitz D.E., Khosla P.K. The Chimera II Real-Time Operating System for Advanced Sensor-Based Control Applications. IEEE Transactions on Systems, Man and Cybernetics, vol. 22, no. 6, Nov/Dec. 1992.

    [34] Szyperski C., Gruntz D., Murer S., Component Software: Beyond Object-Oriented Programming. Boston, Ma., 2nd edition, Addison-Wesley, ISBN 0201745720, 2002.

    [35] The Orocos Project. http://www.orocos.org/. 2008.

    [36] Von Puttkamer E., Zimmer U.R. ALBATROSS: An Operating-System under Realtime-Constraints. Real-Time magazine, Diepenbemmd 5 1650 Beersel Belgium, Vol. 5, no. 3, 91/3. 1991.

    [37] http://carmen.sourceforge.net, 2007

    [38] http://marie.sourceforge.net, 2007

    [39] http://robotflow.sourceforge.net, 2005

    [40] http://claraty.jpl.nasa.gov, 2008

  • SHORT COMMUNICATIONS

  • In: Software Engineering and Development ISBN: 978-1-60692-146-3 Editor: Enrique A. Belini, pp. 25-35 © 2009 Nova Science Publishers, Inc.

    Short Communication A

    EMBEDDING DOMAIN-SPECIFIC LANGUAGES IN GENERAL-PURPOSE PROGRAMMING LANGUAGES

    Zoltán Ádám Mann
    AAM Consulting Ltd.; Budapest University of Technology and Economics

    Abstract

    In recent years, domain-specific languages have been proposed for modelling applications on a high level of abstraction. Although the usage of domain-specific languages offers clear advantages, their design is a highly complex task. Moreover, developing a compiler or interpreter for these languages that can fulfil the requirements of industrial application is hard. Existing tools for the generation of compilers or interpreters for domain-specific languages are still in an early stage and not yet appropriate for usage in an industrial setting.

    This paper presents a pragmatic way for designing and using domain-specific languages. In this approach, the domain-specific language is defined on the basis of a general-purpose programming language. Thus, general programming mechanisms such as arithmetics, string manipulations, basic data structures etc. are automatically available in the domain-specific language. Additionally, the designer of the domain-specific language can define further domain-specific constructs, both data types and operations. These are defined without breaching the syntax of the underlying general-purpose language. Finally, a library has to be created which provides the implementation of the necessary domain-specific data types and operations. This way, there is no need to create a compiler for the new language, because a program written in the domain-specific language can be compiled directly with a compiler for the underlying general-purpose programming language. Therefore, this approach leverages the advantages of domain-specific languages while minimizing the effort necessary for the design and implementation of such a language.

    The practical applicability of this methodology is demonstrated on a case study, in which test cases for testing electronic control units are developed. The test cases are written in a new domain-specific language, which in turn is defined on the basis of Java. The pros and cons of the presented approach are examined in detail on the basis of this case study. In particular, it is shown how the presented methodology automatically leads to a clean software architecture.

    1. Introduction

    In the last decades, the requirements toward software have become tougher and tougher. The complexity of the problems that are solved by software is growing, while at the same time the expectations concerning numerous other, non-functional, aspects (for instance, maintainability, usability, fault-tolerance, parallelism, throughput etc.) have also increased significantly. Moreover, in today's highly competitive software market, it is crucial to minimize time-to-market for software, to be able to quickly add fixes or new features to products.

    Since the human brain has not evolved significantly in this time, the only way to create more complex software more quickly is to raise the level of abstraction for software development. Just imagine how it would be to develop software that should fulfil today's requirements, if you had to keep in mind which piece of data is in which register of the processor!

    In order to cope with increasing complexity, the profession moved from machine code to assembler, from assembler to high-level programming languages, then to object orientation, to component orientation etc. Today, we think in terms of high-level programming abstractions, such as components, threads, GUI elements etc., and not in terms of what the hardware can offer (registers, memory addresses, interrupts).

    Despite all this development, the requirements are still ahead of what we can deliver safely with our current software development practices. So, what will be the next quantum leap in increasing the level of abstraction?

    Many researchers agree that the destination of this journey will be some kind of model orientation [6]. Software development will mean creating an abstract, logical model of what the software is supposed to do, without technical details on how it will fulfil those aims. As formulated by Brooks in his seminal paper "No Silver Bullet", the essence of software development is the construction of abstract, conceptual structures; the difficulties arising from the representation of these structures within the framework of a programming language are just accidental and are decreasing with scientific progress [1].

    There are some debates in the research community on what the future model-oriented software development process will look like (also, there are minor differences in the terminology, e.g. model-based vs. model-driven vs. model-oriented):

    One possibility is to define a universal modelling language that can be used for the development of any software application. Most notably, the Object Management Group (OMG) follows this path with the Unified Modelling Language (UML) [12]. In contrast, others argue that modelling at a really high level of abstraction is only possible with domain-specific concepts, which can be best accomplished by a domain-specific language (DSL). In recent years, this latter approach has gained tremendously in popularity [2] and is also the topic of this paper. More on DSLs can be found in Section 2.

    Another question is how to bridge the gap between the abstract model and the real features of the available platform. Two main approaches can be distinguished, similar to compiled vs. interpreted programming languages. The first approach consists of generating (possibly in more than one step) program code from the model, after which the code can be executed using traditional mechanisms. For instance, the OMG's Model-Driven Architecture (MDA) paradigm falls into this category [11]. The other approach consists of executing the model itself with a suitable model interpreter. As an example, the Executable UML (xUML) approach belongs to this category [9].

    When hearing the word "model", one tends to think of a graphical representation, like a UML model. However, graphical modelling has its limitations. Not only is a graphical representation less appropriate for machine processing, but for the human reader, too, it is quite hard to understand hundreds (or more) of pages of graphical models. Usually, a textual model is more concise and can therefore scale better in model size where readability is concerned. Thus, textual modelling languages have become more popular in recent years [5].
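
    The difference between generating code from a model and interpreting the model itself can be made concrete with a toy example. The sketch below is an illustration under stated assumptions, not MDA or xUML: the "model" is a hypothetical list of arithmetic steps, and the names MODEL, interpret, and generate are invented for this purpose. Both routes start from the same model and should yield the same result.

```python
# A tiny textual "model": a sequence of (operation, operand) steps on one accumulator.
MODEL = [("set", 2), ("add", 3), ("mul", 4)]

TEMPLATES = {"set": "value = {}", "add": "value += {}", "mul": "value *= {}"}


def interpret(model):
    """Second approach: execute the model directly with a model interpreter."""
    value = 0
    for op, operand in model:
        if op == "set":
            value = operand
        elif op == "add":
            value += operand
        elif op == "mul":
            value *= operand
        else:
            raise ValueError(f"unknown operation: {op}")
    return value


def generate(model, function_name="run_model"):
    """First approach: translate the model into program code (here, Python source)."""
    lines = [f"def {function_name}():", "    value = 0"]
    for op, operand in model:
        if op not in TEMPLATES:
            raise ValueError(f"unknown operation: {op}")
        lines.append("    " + TEMPLATES[op].format(repr(operand)))
    lines.append("    return value")
    return "\n".join(lines) + "\n"


source = generate(MODEL)
namespace = {}
exec(source, namespace)                 # stands in for the usual compile-and-run step
generated_result = namespace["run_model"]()
interpreted_result = interpret(MODEL)
```

    The generated source can be inspected, stored, and compiled ahead of time, whereas the interpreter needs the model at run time but can react to model changes without a rebuild; this mirrors the compiled vs. interpreted trade-off mentioned in the text.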

    In the rest of the paper, textual domain-specific languages are considered. The issue of generating code from the model vs. interpreting the model itself will be discussed in more detail.

    1.1. Paper Organization

    The rest of the paper is organized as follows. In Section 2, the concept of DSLs is described in more detail, with special emphasis on the challenges associated with the development of a DSL. Section 3 contains a case study, introducing the domain of testing electronic control units. In this domain, there is a need for a DSL for the specification of test cases. Section 4 describes, in principle, the proposed pragmatic way of defining a DSL based on a general-purpose language, followed by the second part of the case study in Section 5, in which the practical applicability of the proposed approach is presented for specifying test cases for electronic control units. Section 6 contains a discussion of the lessons learned in the application of the proposed methodology, while Section 7 concludes the paper.

    2. DSLs

    2.1. General Properties of DSLs

    A DSL is a language for the description of programs, or of models of programs, on a specific field of application (called a domain). Since the language is tailored to one domain, complex constructs and abstractions of the domain can be supported directly by the language. A number of benefits are expected from this clear focus on one domain, such as:

    Concise representation of complex issues; Gain in productivity;

    From a theoretical point of view, the distinction between a program and a model of the program is artificial, since

    a model can be defined as an abstract