
  • OTTO-FRIEDRICH-UNIVERSITY BAMBERG
    Cognitive Systems Group

    Shall we build a tower together?

    A study of human-robot interaction with the humanoid robot NAO

    Bachelor Thesis

    in the degree course Applied Computer Science
    Faculty of Information Systems and Applied Computer Science

    Author: Ioulia Kalpakoula

    Supervisor: Prof. Dr. Ute Schmid

  • Abstract

    One main aspect of robot research is the use of robots in service duty for
    humans. Be it a service robot or a robot designed for educational purposes,
    they all have one main aspect in common: the foundations of artificial
    intelligence and thus the research on human-robot interaction in robotics.

    The goal of this thesis is to present a possible implementation solution for a
    tower-building game with simple blocks, played by a human player and NAO,
    under consideration of non-verbal communication aspects. Therefore it is
    necessary to explore the possible interaction strategies that achieve a
    successful communication session between the human player and NAO.


  • Table of Contents

    Abstract
    List of Figures

    1. Introduction
       1.1. Motivation
       1.2. Objectives
       1.3. Project structure

    2. Building a tower together
       2.1. Game strategy
       2.2. Object recognition
       2.3. Goal

    3. Human-Robot interaction
       3.1. Foundations of HRI
            3.1.1. Robotics
            3.1.2. Application area of robotics/human-robot interaction
            3.1.3. Autonomous agents vs. humanoid robots
       3.2. Components of Human-Robot Interaction
            3.2.1. Intelligence and Consciousness
            3.2.2. Perception and Expression
            3.2.3. Expressions
            3.2.4. Manipulation and Locomotion
       3.3. State of the Art
       3.4. Goal of HRI

    4. The NAO Architecture
       4.1. Robot Specifications

    5. Software
       5.1. Embedded Software
       5.2. Desktop Software
       5.3. SDK/IDE

    6. Realization of tower building with NAO
       6.1. Implementation
            6.1.1. Object structure in Choregraphe
       6.2. Improvements
       6.3. Issues with NAO's components
            6.3.1. NAO camera
            6.3.2. Joints overheating

    7. Conclusion

    Bibliography

    Appendix
    A. Content of CD

  • List of Figures

    2.1. Optimal course of the game
    2.2. Simple diagram of playing one round
    3.1. Interaction of a robot with its environment
    3.2. Interaction of a robot with its environment
    3.3. Interaction of a robot with its environment
    4.1. NAO Parts - NAO H25
    4.2. NAO body parts
    5.1. NAO's software components
    5.2. NAOqi Process
    5.3. Choregraphe User Interface
    6.1. Init and Main
    6.2. The Init box including all initial behaviors
    6.3. The Main box contains the core modules
    6.4. NAO perceives the human player's actions
    6.5. Landmark Detection triggers Action Grabbing Block
    6.6. NAO Head joints

  • 1. Introduction

    1.1. Motivation

    Nowadays there are already various application domains in which humanoid
    robots are used as service assistants to support humans. The use of robots is
    becoming more and more essential, as they provide a variety of skills that
    cover a large field of duties. Humanoids assist people not only in
    governmental tasks but also as service robots in home environments, e.g.
    taking care of elderly or handicapped people, whether as a caretaker
    supervising medication intake or as a housekeeper. Demographic change in
    society requires more qualified interaction solutions to cover upcoming
    deficits such as the labor shortage of nurse practitioners; hence humanoids
    are already assisting in medical centers.1

    The challenging task in human-robot interaction is enabling robots to explore
    and utilize interaction strategies. Key components are the perception of
    verbal and non-verbal expressions of the interaction partner, followed by the
    robot's own expressions within the given context of interaction.

    1.2. Objectives

    The aim of this thesis is to develop and validate a possible solution for
    human-robot interaction by implementing a tower-building game with the
    humanoid robot NAO on the basis of the standard animation software and
    modules.

    The focus is put on NAO's ability to recognize and react to non-verbal cues
    as well as on his performance in grabbing a block and putting it on top of
    another one. Therefore a recognition module has to be implemented which
    allows NAO to distinguish his own blocks from the blocks of his game partner
    and also from the blocks that have already been placed. NAO also has to be
    able to recognize whether it is his turn or not by monitoring his game
    partner's non-verbal signals and the change of state of the play area.

    1 http://www.atp.nist.gov/eao/sp950-1/helpmate.htm (last viewed: 01.05.2014)


    1.3. Project structure

    The first part will give an overview of the general aspects of human-robot
    interaction and related work in this field, with the main focus on modes of
    interaction. A short introduction to the architecture of the NAO robot will
    especially focus on issues of NAO's components that have made the
    implementation more difficult. The project has been conducted using solely
    software and hardware provided with the NAO distribution.

    The basis of the realization is given by an introduction to the game
    strategy, including its relation to human-robot interaction, in particular
    the planning of the object recognition and turn detection as well as their
    possible implementation up to a certain point. The evaluation part presents
    the analysis results of the implementation (with reference to efficiency and
    effectiveness), points out issues faced, and suggests improvements referring
    to future work related to this tower-building game.


  • 2. Building a tower together

    2.1. Game strategy

    Given the focus on communication strategy and also on communication pathways,
    the rules of the tower-building game are kept deliberately simple. In this
    thesis the rules are intended for two participants who alternately build a
    tower together by stacking blocks one on top of the other. The images in
    figure 2.1 illustrate how a successful game shall work.

    • Each player gets an equal number of small styrofoam blocks; in this case
      two blocks per player were intended.

    • Landmarks are fixed only on the blocks of the human player, so NAO can
      detect where the current block is.

    • The human player starts the game by placing one of his blocks.
      Note: To create an equal basis, the starting order may be determined on a
      rotating basis.

    • In the next turn NAO has to grab one of his blocks and place it on top of
      the block that was placed by the human participant in the previous round.
      Now there shall be a tower of two blocks: the block placed by the human
      participant and, on top of it, the block placed by NAO.

    • The second round begins as the human player places another of his blocks on
      top of the tower. The game goes on in turn-based mode until one of the
      conditions set out below terminates the game:

      1. The game terminates successfully if all blocks of both participants
         have been placed on the tower without the tower falling over.

      2. The game terminates unsuccessfully if the tower gets shaky and falls
         over onto the playground.
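    The turn-based flow above can be sketched as a small state machine. This is
    an illustrative sketch only: the function name, the fixed block counts and
    the collapse check are assumptions, not the thesis implementation.

    ```python
    # Minimal sketch of the turn-based tower game: players alternate placing
    # blocks until everyone is out of blocks (success) or the tower falls over
    # (failure). The `collapses_at` parameter is a stand-in for the physical
    # stability of the real tower.

    def play_game(blocks_per_player=2, collapses_at=None):
        """Alternate turns between HUMAN and NAO; return (outcome, tower)."""
        players = {"HUMAN": blocks_per_player, "NAO": blocks_per_player}
        tower = []                       # blocks placed so far, bottom to top
        turn = "HUMAN"                   # the human player starts the game

        while players["HUMAN"] or players["NAO"]:
            players[turn] -= 1
            tower.append(turn)
            if collapses_at is not None and len(tower) >= collapses_at:
                return "FAILURE", tower  # tower got shaky and fell over
            turn = "NAO" if turn == "HUMAN" else "HUMAN"

        return "SUCCESS", tower

    result, tower = play_game()
    print(result, tower)
    ```

    With two blocks per player this yields the optimal course of figure 2.1:
    human, NAO, human, NAO, then a successful termination.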


    (a) Initial game set up
    (b) Human player begins and puts his block on the table
    (c) NAO continues, putting its block on top of the other
    (d) Human puts his second and last block
    (e) NAO finishes the game by putting its last block

    Figure 2.1.: Optimal course of the game


    2.2. Object recognition

    For the realization of the object recognition part, NAO's qualities in
    grabbing objects also had to be taken into consideration concerning block
    attributes like size, material and the color of the object's surface. A
    number of solution approaches came up and were tested:

    • Wooden blocks, which children play with, had the perfect size but were not
      usable due to their flat surface: NAO could not hold them, as they slipped
      out of his hand.

    • Blocks of foam material were too soft; the joints clenched the foam block
      when grabbing it, and the blocks snapped out as NAO was about to open his
      hand.

    • Blocks of styrofoam seemed to fit NAO perfectly, as they can be held
      thanks to their rough surface and low weight.

    After the shape and the material were chosen, a solution had to be found for
    how NAO would be able to recognize and distinguish the blocks. For this task
    landmarks were printed, cut out to the block size, and one was tacked onto
    every block of the human player. It actually does not matter whether the
    same landmark number is used or not, as NAO only recognizes the current
    block on top of the tower, which is placed by the human player.
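    Landmark detection of this kind can be consumed as a nested list read from
    the robot's memory. The sketch below parses such a value into usable
    (mark ID, camera angle) tuples; the nested layout follows the documented
    NAOqi "LandmarkDetected" structure, but the helper name and the sample
    values are made up for illustration.

    ```python
    # Hedged sketch: extracting landmark IDs and camera angles from a
    # "LandmarkDetected"-style value ([timestamp, [mark_info, ...], ...] with
    # mark_info = [[shape, alpha, beta, size_x, size_y, heading], [mark_id]]).

    def extract_landmarks(memory_value):
        """Return (mark_id, alpha, beta) tuples; alpha/beta are the camera
        angles (radians) pointing at each landmark's center."""
        if not memory_value or len(memory_value) < 2:
            return []                    # no landmark currently in view
        marks = []
        for shape_info, extra_info in memory_value[1]:
            alpha, beta = shape_info[1], shape_info[2]
            marks.append((extra_info[0], alpha, beta))
        return marks

    # Made-up sample: one landmark with ID 68, slightly left of and above center.
    sample = [[12345, 67890], [[[1, -0.10, 0.05, 0.2, 0.2, 0.0], [68]]]]
    print(extract_landmarks(sample))
    ```

    Since the game only needs the topmost human block, even a single recognized
    landmark per observation is sufficient, which matches the note above that
    the landmark number itself does not matter.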

    2.3. Goal

    As has become clear, the game has no competitive part as games usually do;
    the significant and necessary factor is the communication between NAO and
    the human co-player. The game ends successfully if both players were able to
    build up the tower using all of their blocks. This in turn depends on how
    well the interaction strategies have been explored. Diagram 2.2 below gives
    a simple overview of NAO's events of perceiving the game area.
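    The perception events of one round boil down to waiting for the play area to
    change state. The sketch below simulates that idea; the function name and
    the snapshot representation are illustrative assumptions, not the thesis
    implementation.

    ```python
    # Hedged sketch of turn detection: NAO repeatedly observes the play area
    # and treats any change of state (e.g. a newly visible landmark on top of
    # the tower) as the human player's completed move. `observe` stands in for
    # the robot's perception; here it is simulated with canned snapshots.

    def wait_for_human_move(observe, previous_state):
        """Poll the play area until its state differs from previous_state,
        then return the new state (it is now NAO's turn)."""
        while True:
            state = observe()
            if state != previous_state:
                return state

    # Simulated perception: the play area is unchanged twice, then the human
    # places a landmark-tagged block, raising the tower from 1 to 2 blocks.
    snapshots = iter([("tower", 1), ("tower", 1), ("tower", 2)])
    new_state = wait_for_human_move(lambda: next(snapshots), ("tower", 1))
    print(new_state)
    ```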


    Figure 2.2.: Simple diagram of playing one round


  • 3. Human-Robot interaction

    3.1. Foundations of HRI

    The research on autonomous robots and agents is based on the fundamentals of
    artificial intelligence, which is the core discipline in computer science
    for the research and creation of intelligent systems. Besides the
    improvement and development of the technical components of robots, such as
    sensors for perception and effectors for movement, one should also consider
    the psychological aspect. Artificial intelligence is therefore the key to
    understanding human cognition and implementing this knowledge in a robot,
    thus enabling autonomous human-robot interaction. Russell and Norvig's
    Artificial Intelligence [1] is the standard reference that documents the
    field of artificial intelligence and correlated subfields such as
    intelligent agents and robotics.

    3.1.1. Robotics

    According to the definition of Russell and Norvig:

    "Robots are physical agents that perform tasks by manipulating the physical
    world." [1, 971]

    Sensors and Effectors

    A robot can be equipped with a variety of sensors that allow him to perceive
    his environment. [1, 973-975]

    • Cameras evaluate visual stimuli of a certain environment, e.g. detecting
      objects and movements or providing coordinate information of the
      environment for calculating distances to a certain subject.

    • Microphones scan the robot's environment for acoustic input such as a
      speech command that triggers a specified action, e.g. greeting the user as
      he says "hello".

    • Through tactile sensors a robot is able to evaluate physical contact.
      Through them the robot recognizes whether an obstacle like a wall hinders
      a movement action, or whether a touch by a human initiates an action.


    Figure 3.1.: Interaction of a robot with its environment

    The effectors of a robot are the equivalent of human limbs and provide
    flexibility of physical movement and therefore a wider action space in which
    the robot can manipulate the environment. [1, 975-978] A robot in industrial
    production usually does not need any leg effectors, since these robots are
    typically installed in a fixed position and therefore only able to use their
    arm and hand effectors. Robots that are used for exploration tasks clearly
    need leg and foot effectors in order to move and explore the environment
    more efficiently. The effectors of a robot are essential for providing a
    higher level of autonomy and increase the complexity of realizable
    implementations. Figure 3.1 shows a simple interaction sequence of a robot
    in its environment.


    Types of robots

    The technical construction of robots depends on the field of application and
    thus on the tasks a robot has to perform. Russell and Norvig refer to this
    topic and define the following three categories of robot types [1, 971-973]:

    • The mobility of manipulators is limited to a small space within the
      workplace they have been firmly fixed to. Their main focus lies on
      perceiving an assigned fixed environment and responding to certain
      conditions which trigger the manipulator to perform a particular
      processing task on objects. Due to their simple technical construction in
      comparison to other robots, manipulators are used in industrial production
      such as the automobile industry or electronics manufacturing. As
      manipulators usually execute only a predefined program sequence for a
      certain task, they are quite simple and therefore cannot really be called
      intelligent agents. The reason for this is straightforward: manipulators
      are incapable of learning behavior. The implemented code specifies the
      condition that triggers a certain executable task of the robot, for
      instance grabbing an object off the conveyor belt for further processing
      as soon as the manipulator senses the specified object inside its defined
      workspace. It can be concluded that manipulators also cannot be called
      autonomous agents, because they lack the capability to independently gain
      knowledge of the environment; the knowledge is given by the programmer.
      [1, 39]

    • The second category of robot types is defined as mobile robots. In
      contrast to manipulators, these robots have a much larger action space due
      to mobility features such as legs or wheels. As a result mobile robots can
      move and perform actions autonomously, which allows for a much wider
      application range than that of the manipulators. Possible application
      areas for mobile robots are as transport assistants, either in hospitals
      for food delivery or for containerized cargo. Mobile robots are not only
      used in the business sector but also as assistants in domestic use, for
      instance as vacuum cleaners.

    • The combination of manipulators and mobile robots leads to the third
      category: mobile manipulators, of which humanoid robots are a part. The
      abilities of mobile manipulators extend to the perception of the
      environment and its situational manipulation by applying their effectors
      to reach requested goal states of the environment, triggered by a certain
      action. The crucial factor here is the higher flexibility of moving the
      effectors: as there is no fixed workplace, the robot's environment gains a
      higher complexity. That implies the need for more complex algorithms in
      order to ensure the best possible interaction between the robot and the
      environment. Challenging technical factors in humanoid robot research are
      the development of the best possible degree of motion of the robot's
      effectors and the perception and production of facial expressions. The
      tough part is realizing the psychological aspects, which are inevitable
      for human-robot interaction since most interaction between humans is based
      on non-verbal cues.

    3.1.2. Application area of robotics/human-robot interaction

    Robots as service assistants already fill many roles, and the application
    area is continually growing with the technical progress in robotics. Patrick
    Lin describes the three task attributes "dull, dirty and dangerous" as the
    key attributes that determine the application area of a robot. As the key
    advantage of robots over humans he names the lack of emotional expressions,
    which makes some tasks easier for robots to handle than for humans. As
    examples he mentions the use of robots as volcano explorers, bomb squads or
    assistants in difficult surgeries. (cf. [2, 4])

    Some of the fields in which humans take advantage of the benefits of robot
    assistants are:

    Autism therapy

    There have been many studies with humanoid robots in autism therapy.
    Especially the NAO robot seems, in contrast to other humanoids, to be more
    suitable as an interaction partner for children, surely also based on its
    cute and childlike appearance. The team around Syamimi Shamsuddin has
    published many studies about human-robot interaction between NAO and
    children with autism. One of their studies focused on human-robot
    interaction where NAO teaches emotions to children with autism, as a
    significant deficit of autistic people is the inability to recognize and
    express emotions. The study demonstrates a high acceptance of the robot with
    reference to NAO's human-lookalike body shape. The acceptance of NAO as an
    equal communication partner expresses itself in the children's high level of
    attention and highly motivated cooperation towards NAO. (cf. [3]) This study
    ideally represents the successful integration of humanoid robots, not only
    as simple assistants for tasks humans can't or don't want to do, but also as
    assistants in tasks of direct communication with people or, as in this case,
    with children in psychological therapy.

    Personal Care and Home help

    The most common home service robots are the Roomba vacuum cleaning robots by
    the company iRobot,1 which also produces other varieties of floor cleaners
    such as floor mopping robots. Confronted with an ageing population resulting
    from decreasing fertility rates and increasing life expectancy [4], as well
    as a lack of workers in the health care sector, the question arises about
    care services for elderly people. So there is an increasing tendency to use
    robots as assistants for elderly people, not only supporting them in their
    activities but also monitoring and maintaining the household in which the
    person lives. [5] Such a robot is the nursebot Pearl, developed at Carnegie
    Mellon University. Pearl is able to move autonomously and provides many
    interaction features such as speech recognition and facial detection. As
    interaction with humans is the key feature of the robot, the developers also
    paid attention to communication skills. Pearl plans and coordinates
    activities and schedules, for instance for taking medicine, and is able to
    intervene if any irregularities occur. (cf. [6])

    Military

    Maybe the most controversial and challenging task area is the use of robots
    for military purposes. Basically, military robots cover a wide range of
    fields of application, all with the objective of taking over tasks which are
    too dangerous for humans or of providing safety functions for them, such as
    unmanned exploration of dangerous or impassable areas, bomb squad
    assistance, or monitoring a certain territory for enemies and, if necessary,
    attacking them. Despite the obvious advantages of using military robots,
    heavy failures have shown the weaknesses of such complex constructions. [7]
    The authors Lin, Bekey and Abney refer in their reference book [2, 7] to an
    incident that happened in 2007, where a semi-autonomous robot cannon fired
    at and killed nine fellow soldiers. This and other similar incidents raise
    criticism towards malfunctions of robots, which must be reliable, especially
    when it comes to protecting civilians or being a fighting comrade, as the
    consequences in these cases are much more fatal.

    1 http://www.irobot.com/us/learn/home/roomba.aspx (last viewed: 16.04.2014)


    3.1.3. Autonomous agents vs. humanoid robots

    As defined by Russell and Norvig:

    "An agent is anything that can be viewed as perceiving its environment
    through sensors and acting upon that environment through actuators." [1, 4]

    The interaction of humans with a software agent and the interaction with a
    physical agent, i.e. a humanoid robot, have different effects depending on
    how well and with what kind of reservations they can be performed.

    A significant study about the differences between a software agent and a
    robot is given by the publication of Shinozawa. [8] The comparison is based
    on an experiment in which a user had to select a color name that was
    recommended either by a software agent or by a physical robot. To avoid any
    distortion by the humans' subjective rating, the appearance was set up
    similarly for both. By taking the robot's three-dimensional appearance and
    the software agent's two-dimensional space into consideration, the results
    of the experiment pointed out the importance of conformity of dimension
    between the robot or agent and its interaction environment: the
    three-dimensional robot had greater influence with its recommendations if
    the experiment was set up in a three-dimensional environment, and vice versa
    for the experiment with the two-dimensional software agent.

    In the study of Powers [9] the comparison was based on health interviews of
    a test person with a software agent, with a robot projected on a computer
    monitor, and with a robot being present. The emphasis of this study was to
    research the influence of each of the agents on the behavior and attitude of
    the user. The results showed that the social influence of robots on users
    was higher than that of software agents. The test persons classified the
    robots as more helpful and spent more time with them than with software
    agents. On the other hand, the test persons revealed much less information
    to the robot located in the same room than to the software agents, and could
    also remember more details about the interview when it was performed with
    software agents rather than with robots.


    3.2. Components of Human-Robot-Interaction

    3.2.1. Intelligence and Consciousness

    The principle of human-robot interaction is based on the ability of a robot
    to establish a successful interaction with a human. Therefore the robot
    should be capable of recognizing non-verbal cues correctly to provide the
    most natural approach to human-robot interaction. Accordingly, isn't it
    primarily necessary to be aware of one's own existence in order to have an
    internal knowledge base of one's own cognitive processes as well as of the
    environment?

    For humans, the cognitive state of subconsciousness provides significant
    support in everyday life. The main challenge for the human mind lies in the
    coping strategies that reduce the daily information overload down to the
    important matters. Subconscious perception unburdens the senses and protects
    the mind from the collapse that would result from processing all information
    in a fully conscious state of mind. In the context of human-human
    interaction this usually means: subconscious perception of the non-verbal
    cues expressed by the interaction partner. Nevertheless, those non-verbal
    cues also get processed and affect the behavior towards the interaction
    partner. It should be noted, though, that the classification of information
    into subconsciousness and consciousness is an individual matter and depends
    on individual experience.

    Inevitably this leads to the question of a robot's self-consciousness and
    the role of consciousness in the context of the ability to perceive
    non-verbal cues. In the publication Creation of a Conscious Robot [10],
    Junichi Takeno addresses in detail the understanding and development of
    conscious robots. His work not only covers principles of human consciousness
    and approaches to the development of conscious robots but also inspires
    fundamental questions regarding the psychological aspects of consciousness
    and its implementation in a robot. Research on intelligent robots and their
    effect on human-robot interaction has shown that the development of
    intelligent robots has become important: humans tend to accept and empathize
    with robots with social skills rather than with those without. [11], [12]


    3.2.2. Perception and Expression

    To be able to interact with its environment a robot needs to gather and
    process information from it; thus a robot needs the ability of perception
    over the application domain. This can be realized using different sensors,
    whereby for autonomous robots the focus is put on acoustic, visual and
    tactile perception.

    Acoustic perception provides the robot with the ability to filter, process
    and respond to audio stimuli from its environment, but also to record and
    play back sounds triggered by a certain task. Besides the perception of
    simple stimuli, acoustic cognition must also include the perception of
    complex voice signals, such as speech, in order to achieve the most
    authentic possible communication basis between a humanoid robot and a human.
    Even in a limited way, spoken words can trigger a certain action in the
    robot or help the robot recognize and locate a known interaction partner
    based on individual voice attributes, such as loudness and tone.

    Through the use of visual sensors, such as cameras, a robot is able to
    visually perceive its application environment. The accuracy of a robot's
    visual detection not only depends on various camera qualities, such as
    resolution, color range and lighting compensation, but also on the robot's
    internal representation of the environment. Following Russell and Norvig,
    this internal representation should not only have a clear structure that
    enables the robot to react to environmental changes in a quick, efficient
    way, but also include meaningful and sufficient information as a basis for
    decision-making. Furthermore, the internal representation values should be
    modeled with regard to ensuring consistency between the internal state and
    the values present in the real world. [1, 978] As the natural environment
    changes continuously and unforeseen events can create problems, the robot
    has to be able to react within a certain time frame, for instance avoiding
    previously absent obstacles in time.

    The challenging part of robotics surely is the realization of sensors that
    keep working reliably even if characteristics of the environment change
    constantly, for example lighting conditions. Siegwart and Nourbakhsh
    [13, 93-94] deal in detail with attributes that influence sensor
    performance:

    • A sensor's sensitivity rate specifies the degree to which a change of
      input values affects the output values, for instance the level of light
      sensitivity of the robot's cameras.

    • An error of a sensor produces inconsistency between the real values of an
      environment and the output values measured by the sensor. There are two
      kinds of errors: predictable and unpredictable ones. Predictable errors,
      also called systematic errors, are measurable errors triggered by
      processes that can be modeled. In contrast, unpredictable or random errors
      cannot be calculated in advance, as they occur irregularly. Such random
      errors include the color levels of the camera as well as hue errors
      concerning the level of brightness and contrast.

    • Precision defines the level of reproducibility of a sensor's output
      values, whereas accuracy measures the level of agreement between the true
      and recorded values of a sensor's output. Hence a high degree of accuracy
      is equivalent to low error rates.
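    The distinction between systematic error (bias, affecting accuracy) and
    random error (scatter, affecting precision) can be illustrated numerically.
    The readings below are made up for illustration and do not come from the
    thesis.

    ```python
    import statistics

    # Made-up readings from a hypothetical distance sensor observing a target
    # that is truly 100.0 cm away. A constant +2 cm offset models a systematic
    # (predictable) error; the scatter around it models random error.
    true_value = 100.0
    readings = [102.1, 101.9, 102.0, 102.2, 101.8]

    mean = statistics.mean(readings)
    accuracy_error = abs(mean - true_value)        # systematic bias
    precision_spread = statistics.stdev(readings)  # reproducibility of readings

    print(f"mean reading: {mean:.2f} cm")
    print(f"accuracy error (bias): {accuracy_error:.2f} cm")
    print(f"precision (std dev): {precision_spread:.3f} cm")
    ```

    A sensor can thus be precise (small spread) and still inaccurate (large
    bias); only the bias can be calibrated away, because it comes from a
    process that can be modeled.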

    3.2.3. Expressions

    Another essential part of human-robot communication tasks is the perception
    of mimic, gesture and speech expressions. They support the reinforcement of
    expressing emotions and thoughts. When humans communicate, they pass their
    expressions to their dialogue partner either explicitly or implicitly. The
    direct path, also called verbal communication, means expressing something by
    speaking to each other. Non-verbal communication on the other hand consists
    of mimic and gesture expressions and is therefore not always easily visible
    to the dialogue partner. The realization of non-verbal cues in humanoid
    robots is therefore challenging, as there is no explicit information content
    available that could be predefined right away. [14]

    3.2.4. Manipulation and Locomotion

Another important aspect of human-robot interaction is a robot's degree of freedom of motion, which specifies its active physical participation in the interaction.

Motion is the hypernym for two terms, manipulation and locomotion.

Manipulator robots are a category of robots that are fixed in a certain workplace (see 3.1.1) and are only able to move objects from one point to another, usually through manipulation with hand joints. As shown in figure 3.2, the robot's position within the defined application environment is fixed, while objects can be moved to various points, though only within the robot's workspace area.

Locomotion on the other hand means the ability of a robot to move itself from one point to any other within a defined environment (figure 3.3).

The movement types are similar to the human ones, for instance walking, running2 or even swimming3, though the NAO robot is limited to walking.

    Figure 3.2.: Interaction of a robot with its environment

    Figure 3.3.: Interaction of a robot with its environment

2 http://www.bostondynamics.com/robot_bigdog.html (last viewed: 23.04.2014)
3 http://groups.csail.mit.edu/drl/underwater_robotics/amour/amour.html (last viewed: 23.04.2014)


Besides technical factors, environmental aspects also have to be taken into consideration during development as a possible source of influence on the functionality of manipulation and locomotion.

Environmental aspects are, for example, the composition and structure of the ground, the inclination level, or the range of contact points on the robot's action path (cf. [13, 17]).

There are different types of motion mechanisms; the NASA Mars Rover4 or four-legged robots such as Sony's AIBO5 are two examples. However, this thesis focuses on two-legged robots, such as the NAO, and their ability to keep balance.

Especially humanoid service robots in housekeeping must be able to master more difficult motion sequences, such as walking up and down stairs without losing their balance. Honda's ASIMO6 is an example of a successful movement implementation for two-legged robots.

3.3. State of the Art

Research on improvements in the field of human-robot interaction focuses on the communication skills of robots, as they are intended to interact with humans on a much more complex level of cognition. As a consequence, a robot should be conscious of its own behavior as well as of its interaction partner's behavior. To do so, the robot needs to process and interpret non-verbal cues in a logical manner so that it can act or react in reasonable ways.

Many studies have investigated techniques in human-robot interaction that use non-verbal cues [15], [16].

Bakker and Kuniyoshi [17] as well as Chen Yu and Dana H. Ballard [18] described in their publications approaches for robots learning to recognize human behavior.

One method is the explicit implementation of default behavior modes, another is reinforcement learning.

In addition to those methods they introduce a third learning method, based on imitating human behavior: the robot learns, much like a child, by imitating behaviors that are demonstrated by humans.

4 http://marsrover.nasa.gov/home/index.html (last viewed: 23.04.2014)
5 http://www.sony-aibo.co.uk/ (last viewed: 23.04.2014)
6 http://world.honda.com/ASIMO/technology/2011/physical/index.html (last viewed: 23.04.2014)


    3.4. Goal of HRI

The aim of the human-robot interaction research area is certainly to achieve successful communication between humanoid robots and humans on a much more complex level of cognition on the part of the robots.

One objective is to improve or invent implementations for representing interaction modes on robots, considering the robot's ability to perceive and express verbal and non-verbal cues.

With the current state of the art in human-robot interaction, humanoids are already used in therapy tasks, as described in 3.1.2. Nevertheless, robots are not fully accepted members of society at present, as progress is still lacking both in the outer appearance of robots and in their ability to interact with humans fully autonomously and consciously.

Creating conscious robots raises new issues regarding ethics and the rights of robots in society, and should thus raise awareness of all resulting social and ethical consequences.

This in turn leads to further questions: how can moral thinking be realized, and what or who will be the source of the rules that define right thinking and correct behavior?

What rules shall be the underlying principles for human-robot interaction considering cultural and social differences, or shall humanoid robots be constructed in a culture-specific manner?

Shall robots be able to execute their interaction skills fully autonomously, or shall humans be able to intervene in the robot's autonomy, and if so, up to what state and under what circumstances are humans allowed to take control over the robot?

In sum, human-robot interaction research should aim not only at successful achievements in the communication skills of robots but also deal with the series of ethical consequences resulting from them.

  • 4. The NAO Architecture

    4.1. Robot Specifications

NAO is a 58 cm tall humanoid robot developed by the French company Aldebaran Robotics. The model used for this thesis is the NAO H25.1

Regarding the impact of NAO's appearance on its human interaction partner, there have been studies [19] confirming that humans' attention towards a robot rises as the robot's appearance becomes more human-like. The most significant feature of NAO's appearance is certainly its cute, round and childlike face design, which makes NAO a likeable interaction partner.

    Figure 4.1.: NAO Parts - NAO H25

1 https://community.aldebaran-robotics.com/doc/1-14/ (last viewed: 25.03.2014)


Motion
NAO's movement can be controlled in various ways, as its body is divided into single joints, which allows a more specific motion implementation, as well as into groups of joints for each body part, depending on the method used. Figure 4.2 gives an overview of the partition of NAO's body.

    Figure 4.2.: NAO body parts 2

Interaction
NAO's interaction modules allow it to interact on a human-like level. Four microphones and two loudspeakers on its head allow NAO either to recognize specific words inside a sentence or to recognize and react autonomously to a complete sentence.

For visual perception NAO is equipped with two VGA cameras that provide a resolution of 640x480 at a performance of over 30 frames per second.

NAO's cameras revealed some issues that complicated the implementation of the object recognition; these will be described in detail in section 6.3.

Infrared support allows NAO to communicate with any other infrared-supporting device, which means that it is possible to use NAO as a remote controller or to control NAO via remote control.

Furthermore NAO can connect to other NAO robots and communicate with them.

    2https://community.aldebaran-robotics.com/doc/1-14/naoqi/motion/index.html


Sensors and Bumpers
Bumpers and tactile sensors on NAO's head, chest, hands and feet allow perception and communication via tactile cues.

These components can be associated with a specific predefined behavior that gets triggered by touching the sensors. Furthermore NAO is equipped with sonar rangefinders, so it can estimate the distance of objects placed up to a maximum of 70 cm away.


  • 5. Software

Figure 5.1.: NAO's software components1

    5.1. Embedded Software

    NAOqi

The main software that runs on the robot is NAOqi. At startup it loads a list of modules with default behavior methods on the robot.2 Figure 5.2 gives an overview of the structure of the NAOqi process, which is called a broker when it runs on the robot. It provides a directory containing all modules and their bound methods.

1 https://community.aldebaran-robotics.com/doc/1-14/getting_started/software_in_and_out.html?highlight=camera
2 https://community.aldebaran-robotics.com/doc/1-14/dev/naoqi/index.html?highlight=naoqi (last viewed: 25.03.2014)


    Figure 5.2.: NAOqi Process

    5.2. Desktop software

    Choregraphe

The desktop software Choregraphe (figure 5.3) offers an easy way to interact with and control the real or a simulated robot and to create behaviors in less time than with NAOqi alone.

It already includes behaviors as predefined template boxes, which can be extended with one's own Python code; thus Choregraphe makes it possible to create complex behaviors in a convenient way.

An advantage is certainly the possibility to use a simulated robot, since movements can be tested without the risk that NAO falls or damages its joints if wrong parameters for movement and rotation have been passed.

Choregraphe also includes the application Monitor, which gives access to further settings of NAO's memory and the camera module.

The camera module includes a small widget that displays what NAO currently sees. It is also possible to record the retrieved images or to take pictures.

The Simulation Helper Tool, kindly offered by community members,3 was very helpful for testing behaviors without the real robot.

It comes with a graphical interface that simulates all NAOqi modules, thus making it possible to test behaviors without necessarily needing the real robot.

3 https://community.aldebaran-robotics.com/ by Philippe Capdepuy and Manon Picard, HumaRobotics


Figure 5.3.: Choregraphe User Interface

The tool provides all modules needed for speech recognition, simulates a fake module for visual recognition, and offers the sensor modules to simulate tactile communication options.

    5.3. SDK/IDE

The programming with NAO was done in Python, as Python supports access to the robot and is already used in Choregraphe's implemented behaviors. The advantages of Python lie in its minimalistic code structure, which results in clear and short code, while it still provides flexibility, as it can be combined with software components of the C++ API.

For structuring, testing and modifying the Python modules, the IDE PyCharm Free Community Edition4 was used, as it comes with a well-designed code editor and useful features that allow writing code in a comfortable and therefore productive way.

4 http://www.jetbrains.com/pycharm/ (last viewed: 03.05.2014)


  • 6. Realization of tower building

    with NAO

    6.1. Implementation

The implementation was done partially in Choregraphe and partially directly with the Python SDK. For the initialization part Choregraphe was used, as it provides many predefined template boxes, which can be used for standard positions and easily modified by importing one's own Python code modules.

For the part where NAO has to put its block on top of the other, it was more convenient to write and test the code directly in the Python IDE described in 5.3. The main advantage was the clearer structure, especially when it comes to identifying bugs in the code.

    6.1.1. Object structure in Choregraphe

Figure 6.1 shows the main object structure of the project. For this part two flow diagram boxes were used. Flow diagram boxes contain a varying number of script boxes and were useful for splitting the script boxes according to the part of the game in which they have to be executed.

    Figure 6.1.: Init and Main


    Init flow diagram box

The initialization flow diagram box includes the basics needed before starting the game. The basic modules for the initialization part are:

The first module in the row sets the stiffness of NAO's joints to on. In this state the joints can be moved with their full power, which is needed here, as the main demand on NAO's hardware is the movement of its arm joints within a small game area.

The second box makes NAO sit down, as it is not necessary for it to stand during this game.

The third box starts right afterwards and sets NAO into its initial position. It actually modifies the sitting position so that NAO has a wider range of action during the game.

Finally NAO invites the human player to join the game.

    Figure 6.2.: The Init box including all initial behaviors
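Outside Choregraphe, the same initialization sequence can be sketched in plain Python. The proxy objects below stand in for NAOqi modules (ALMotion, ALRobotPosture, ALTextToSpeech); the joint name, angles and spoken sentence are illustrative placeholders, not the thesis values, and the function takes duck-typed arguments so the sequence can also be exercised without the robot.

```python
def initialize_game(motion, posture, tts):
    """Sketch of the Init box sequence; motion/posture/tts stand in for
    NAOqi proxies such as ALMotion, ALRobotPosture and ALTextToSpeech."""
    # 1. Enable full stiffness so the arm joints can move with full power.
    motion.setStiffnesses("Body", 1.0)
    # 2. Sit down; standing is not needed for the tower game.
    posture.goToPosture("Sit", 0.5)
    # 3. Adjust the sitting position into the initial game pose
    #    (joint name and angle are placeholders, not the thesis values).
    motion.setAngles("LShoulderPitch", 0.5, 0.2)
    # 4. Invite the human player to join the game.
    tts.say("Shall we build a tower together?")

# A minimal stand-in records every call so the sequence can be checked
class Recorder:
    def __init__(self, log, name):
        self.log, self.name = log, name
    def __getattr__(self, method):
        return lambda *args: self.log.append((self.name, method, args))

log = []
initialize_game(Recorder(log, "motion"), Recorder(log, "posture"), Recorder(log, "tts"))
print([entry[0] for entry in log])  # ['motion', 'posture', 'motion', 'tts']
```

Keeping the proxies as parameters mirrors the separation of the Choregraphe boxes: each step maps to one box, and the order of calls corresponds to the flow of the Init diagram.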

    Main flow diagram box

This flow diagram box includes the implementation of the core behaviors of the game. These are split into another two flow diagram boxes: one includes NAO's perception part and the other one the manipulation method.

    Figure 6.3.: The Main box contains the core modules


Human player's turn

The human player is always considered to start the game. He puts his block, with a landmark attached to it, on the game field.

This box contains the modules for NAO's perception of its co-player's actions within the game area. At the beginning, NAO has to switch to the bottom camera as its visual component to be able to look for the landmark attached to the block that has been put on the ground.

It is then NAO's turn to perceive the game area and detect the landmark, which implies that the human co-player has placed his block and finished the round.

The landmark detection triggers the behavior on NAO to open its left hand and ask for the block. As soon as the co-player places the block into its hand, NAO waits for the signal to close its hand and start its turn. The behavior box Tactile L. Hand passes the signal to NAO to close its hand when the back of the left hand is touched.

    Figure 6.4.: NAO perceives human players actions


NAO's turn

NAO's turn recognition was realized by landmark detection. The landmark detection could be implemented quite easily, as the modules for this task are predefined Python modules that were taken from the Aldebaran site1 and modified to the particular requirements of the task.
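The essence of those landmark examples is a subscribe-and-poll loop on the robot's shared memory. The sketch below follows that pattern with the NAOqi memory key "LandmarkDetected", but it is written against a duck-typed memory object and omits the actual proxy setup, so it is an illustration of the mechanism rather than the thesis code.

```python
import time

def wait_for_landmark(memory, timeout=10.0, poll=0.5):
    """Poll the shared memory until landmark data appears or the timeout expires.
    Returns the detection data, or None if nothing was seen in time."""
    waited = 0.0
    while waited < timeout:
        data = memory.getData("LandmarkDetected")
        # NAOqi reports an empty value while no landmark is visible
        if data:
            return data
        time.sleep(poll)
        waited += poll
    return None

# Stand-in memory that "sees" a landmark on the third poll
class FakeMemory:
    def __init__(self):
        self.calls = 0
    def getData(self, key):
        self.calls += 1
        return ["mark_info"] if self.calls >= 3 else []

print(wait_for_landmark(FakeMemory(), timeout=5.0, poll=0.01))  # ['mark_info']
```

In the game, a successful detection is what signals that the human has finished his round, so the return value of such a loop is the natural trigger for the grabbing behavior shown in figure 6.5.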

    Figure 6.5.: Landmark Detection triggers Action Grabbing Block

1 https://community.aldebaran-robotics.com/doc/1-14/dev/python/examples.html#python-examples (last viewed: 23.04.2014)


    6.2. Improvements

My turn - recognition

The recognition of when it is NAO's turn should be improved by basing it on non-verbal cues such as eye contact or gestures.

Grabbing a block

Currently NAO does not grab the block by itself: the block is placed in its hand, and by touching the tactile sensors on the hand it closes the hand and holds the block that was given to it.

Placement of a block

This was the part that caused the most difficulties, owing to NAO's gross joint motion skills.

6.3. Issues with NAO's components

During the work with NAO, two major issues with NAO's hardware were experienced.

6.3.1. NAO camera

The bad camera quality resulted in difficulties with object recognition when using the software provided by Aldebaran. Specific problems were the following:

The low camera resolution of 640x480 renders monitored objects pixelated, thus objects with many details are difficult to recognize accurately.

High sensitivity to lighting conditions: for example, light cloud cover coming up or soft shadows covering small areas around the block led to distortions in object recognition, including the need to replace the block.

Objects must be placed within the range of motion of NAO's head (figure 6.6); preferably an object should be placed directly in front of the cameras.

2 https://community.aldebaran-robotics.com/doc/1-12/nao/hardware/kinematics/nao-joints-40.html


    Figure 6.6.: NAO Head joints2

    6.3.2. Joints overheating

The fast overheating of NAO's hand joints restricted the practical execution of the implemented motion sequences.

Testing the grabbing of a block and putting it on top of another could not be carried out for more than half an hour at a time.

Sometimes NAO issued an overheating warning after 15 minutes and had to be turned off for a quarter of an hour to cool down before it could be turned on again.

  • 7. Conclusion

The aim of this bachelor thesis was to implement a round-based tower-building game with the NAO robot in terms of human-robot interaction. The tasks were to explore the best recognition and interaction strategies with the aim of communicating on a non-verbal level.

The implementation of the game under consideration of the above-mentioned aspects was not successful. Reasons for that were, among other things, problems caused by NAO's hardware components, such as the bad resolution of the camera as well as the gross joint functionality. It was not possible for NAO to put its block exactly on top of the other. The reason is the inaccuracy that arises when code is tested on the real NAO: the real robot's body makes small movements even while sitting or just standing, which is an attribute of the motor joints.

Possible improvements are to implement object recognition via the OpenCV framework and to calculate an error tolerance for the joint movements. So although some tasks could not be completed, a successful implementation can be realized by using other frameworks.
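The proposed error tolerance can be sketched as a simple acceptance check. The coordinates, block offset and tolerance value below are illustrative assumptions, not measurements taken on NAO: a placement is accepted when the horizontal offset between the released block and the tower top stays within a tolerance band that absorbs the small involuntary joint movements.

```python
def within_tolerance(target_xy, actual_xy, tolerance_cm):
    """True if the placed block's horizontal offset from the tower top
    is small enough for the tower to remain stable."""
    dx = actual_xy[0] - target_xy[0]
    dy = actual_xy[1] - target_xy[1]
    return (dx * dx + dy * dy) ** 0.5 <= tolerance_cm

# Illustrative values: tower top at (10.0, 5.0) cm in the game area,
# NAO releases the block 0.4 cm off target, tolerance of 0.5 cm.
print(within_tolerance((10.0, 5.0), (10.4, 5.0), 0.5))  # True
print(within_tolerance((10.0, 5.0), (10.0, 5.8), 0.5))  # False
```

Such a check could let the robot retry or re-aim a placement instead of silently stacking a block that would topple the tower.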


  • Bibliography

[1] Stuart Russell and Peter Norvig. Artificial Intelligence: A Modern Approach. Pearson, Boston, 3rd edition, 2010.

[2] Patrick Lin, Keith Abney, and George A. Bekey, editors. Robot Ethics: The Ethical and Social Implications of Robotics. The MIT Press, 2012.

    [3] Syamimi Shamsuddin, Hanafiah Yussof, Mohd Azfar Miskam, A Che Hamid, Nor-

    jasween Abdul Malik, and Hafizan Hashim. Humanoid robot nao as hri mediator

    to teach emotions using game-centered approach for children with autism. In HRI

    2013 Workshop on Applications for Emotional Robots, Tokyo, Japan, 2013.

    [4] Global health and ageing, 2011.

    [5] J. Broekens, M. Heerink, and H. Rosendal. Assistive social robots in elderly care:

    a review. Gerontechnology, 8(2), 2009.

    [6] Martha E Pollack, Laura Brown, Dirk Colbry, Cheryl Orosz, Bart Peintner, Sailesh

    Ramakrishnan, Sandra Engberg, Judith T Matthews, Jacqueline Dunbar-Jacob,

    Colleen E McCarthy, et al. Pearl: A mobile robotic assistant for the elderly.

[7] Michael Goodrich and A. C. Schultz. Human-robot interaction: a survey. Foundations and Trends in Human-Computer Interaction, 1(3):203-275, 2007.

[8] Kazuhiko Shinozawa, Futoshi Naya, Junji Yamato, and Kiyoshi Kogure. Differences in effect of robot and screen agent recommendations on human decision-making. Int. J. Hum.-Comput. Stud., 62(2):267-279, 2005.

[9] Aaron Powers, Sara B. Kiesler, Susan R. Fussell, and Cristen Torrey. Comparing a computer agent with a humanoid robot. In Cynthia Breazeal, Alan C. Schultz, Terry Fong, and Sara B. Kiesler, editors, HRI, pages 145-152. ACM, 2007.

    [10] Junichi Takeno. Creation of a Conscious Robot: Mirror Image Cognition and

    Self-Awareness. Pan Stanford Publishing, 1st edition, 2012.

[11] Kerstin Dautenhahn. Socially intelligent robots: dimensions of human-robot interaction. Philosophical Transactions of the Royal Society B: Biological Sciences, 362(1480):679-704, 2007.


    [12] Allison Bruce, Illah Nourbakhsh, and Reid Simmons. The role of expressiveness

    and attention in human-robot interaction, 2002.

    [13] Roland Siegwart and Illah R. Nourbakhsh. Introduction to Autonomous Mobile

    Robots. Bradford Company, Scituate, MA, USA, 2004.

[14] Anthony L. Threatt, Keith Evan Green, Johnell O. Brooks, Jessica Merino, Ian D. Walker, and Paul Yanik. Design and evaluation of a nonverbal communication platform between assistive robots and their users. In Norbert Streitz and Constantine Stephanidis, editors, Distributed, Ambient, and Pervasive Interactions, volume 8028 of Lecture Notes in Computer Science, pages 505-513. Springer Berlin Heidelberg, 2013.

[15] Jingguang Han, Nick Campbell, Kristiina Jokinen, and Graham Wilcock. Investigating the use of non-verbal cues in human-robot interaction with a Nao robot. In Proceedings of the 3rd IEEE International Conference on Cognitive Infocommunications (CogInfoCom 2012), Kosice, 2012.

[16] Chrystopher L. Nehaniv. Classifying types of gesture and inferring intent. In Procs of the AISB 05 Symposium on Robot Companions, pages 74-81. AISB, 2005.

[17] Paul Bakker and Yasuo Kuniyoshi. Robot see, robot do: An overview of robot imitation. In AISB'96 Workshop on Learning in Robots and Animals, pages 3-11, 1996.

[18] Chen Yu and Dana H. Ballard. Learning to recognize human action sequences. In International Conference on Development and Learning, 2002.

[19] Guido Schillaci, Sasa Bodiroza, and Verena Vanessa Hafner. Evaluating the effect of saliency detection and attention manipulation in human-robot interaction. I. J. Social Robotics, 5(1):139-152, 2013.


  • Appendix


  • A. Content of CD

Bachelor Thesis as PDF: BAThesisKalpakoulaIoulia.pdf

Folder: Towerbuilding, including the Python files and Choregraphe behaviors


  • Disclaimer

I hereby declare, in accordance with §17 (2) APO, that I have written the above bachelor thesis independently and have used no sources or aids other than those indicated.

Place, date                    Signature
