2012 IEEE Conference on Computational Intelligence and Games (CIG), Granada, Spain, 11-14 September 2012

Personality Profiles for Generating Believable Bot Behaviors

Casey Rosenthal and Clare Bates Congdon, Senior Member, IEEE

Abstract—In this work, personality profiles are used to develop parameterized bot behaviors. While the personality profile structure was originally designed as a descriptive tool for human behavior, as used here it is a generative tool, allowing a plurality of different behaviors to result from a single rule set. This paper describes our use of the Five-Factor Model of personality to develop a bot that plays Unreal Tournament 2004, as an entry in the 2K BotPrize competition at the 2010 IEEE Computational Intelligence and Games conference.

Keywords— personality profile, five-factor model, Unreal Tournament 2004, 2K BotPrize, rule-based system

I. INTRODUCTION

Personality profiles are descriptive tools designed to understand and categorize human personalities. For example, the Myers-Briggs Type Indicator [1] is a well-known tool that assesses individuals along four dimensions and sorts them into 16 broad personality types. In this work, we use the personality profile approach to steer the personalities of bots, so that by adjusting parameters, different bot personalities can be observed with the same core architecture. This paper describes personality profiles and illustrates their use for believable bot design, both in a sales training simulation and in controlling a bot in a video game.

II. PERSONALITY PROFILES

Psychological scientists grapple with the problem of creating models that accurately explain human behavior. The current state of psychology offers many models [2] describing human interactions. These models tend to be descriptive, reducing an observed behavior to a synthesis of independent factors.

It is difficult to carry out experiments that test the validity

of psychological models [3]. The hypothesized factors are

abstractions, and generally do not correspond to any physical

feature such as brain anatomy. In humans, these factors can

never be tested independently, because the only observable

result is the synthesis of the behavior, even though it is

conceptually necessary in most models for the factors to

operate independently of each other.

Humans are also moving targets for this line of study

because they adapt and change over time. The extent to which

a personality remains consistent over time is a subject for

debate [4], and many psychological models are criticized for

producing inconsistent results [5] when the same human is

tested at different times for the same factors.

Descriptive models of human behavior are often passed over by software engineers who are modeling behavior because the engineer wants a model that generates behavior. Descriptions of the behavior after the fact are useless to the engineer who wants to create the behavior first. This is especially true of psychological models that require a human to interpret the behavior contextually. To our knowledge, personality profiles have not previously been used in game agent design.

Casey Rosenthal and Clare Bates Congdon are with the Department of Computer Science, University of Southern Maine, Portland, ME 04104 USA ([email protected] and [email protected]).

A. The Myers-Briggs personality profile

The most widely known model of personality profiles is Myers-Briggs. The Myers-Briggs Type Indicator (MBTI) [1] is a questionnaire that isolates four factors in the personality of the subject. These four factors, called “preferences” in the Myers-Briggs framework, constitute the personality profile of that individual. A model then describes, based on the factors, how personalities interact within teams and with each other.

The four factors in the Myers-Briggs framework are: extroversion/introversion, sensing/intuition, thinking/feeling, and judgement/perception. The value of each of the four factors is binary. Since there are four factors, each with one of two possible values, there are sixteen possible personality profile types in this model. The purpose of a Myers-Briggs Type Indicator or related tool is to determine which of these sixteen possibilities best describes the subject.
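The count of sixteen follows directly from four binary choices (2^4 = 16). As a minimal sketch, the enumeration can be written out as below; the one-letter codes are the conventional MBTI abbreviations and the function name `all_profile_types` is hypothetical, neither being defined in this paper.

```python
from itertools import product

# The four binary preferences, in the order the text lists them. The
# one-letter codes are the conventional MBTI abbreviations; they are an
# assumption for illustration and are not defined in this paper.
PREFERENCES = [
    ("E", "I"),  # extroversion / introversion
    ("S", "N"),  # sensing / intuition
    ("T", "F"),  # thinking / feeling
    ("J", "P"),  # judgement / perception
]


def all_profile_types():
    """Return every combination of one value per preference."""
    return ["".join(choice) for choice in product(*PREFERENCES)]


if __name__ == "__main__":
    types = all_profile_types()
    # Four factors with two values each give 2 ** 4 combinations.
    print(len(types), types)
```

Because each factor contributes exactly one of two letters, any tool built on this framework only has to select one entry from this fixed list.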

The Myers-Briggs framework is often implemented in corporate environments for the benefit of human resources tasks. The predictions of the model are intended to encourage self-awareness in the subject of his or her interactions with others, as well as to inform the corporation’s decision when placing the subject in a setting with other people who have presumably also been profiled by the MBTI.

B. The Five-Factor Model

Since the early 1990s, psychologists have converged [6] on one taxonomy for describing personality, the Big Five. The Big Five identifies five personality factors that constitute the Five-Factor Model (FFM) [7]. These five factors are: Extroversion, Agreeableness, Conscientiousness, Neuroticism, and Openness. The labels have varied a little over time and in different contexts, but are generally stable in how they are interpreted.

The FFM is an academic framework unencumbered by the intellectual property restrictions that surround the Myers-Briggs framework. This alone may account for the increasing use of the FFM, but proponents also cite research indicating that it has better validity in describing a human personality, showing little variance for the same person over time. Several other frameworks have shown high correspondence between their factors and the factors within the FFM, indicating that these frameworks are measuring a real effect.

978-1-4673-1194-6/12/$31.00 ©2012 IEEE

Instead of the binary selection in Myers-Briggs, the factors in the FFM can be measured along a sliding scale. This allows for greater nuance in personality description and, if each factor is given an analog measurement, yields an effectively unlimited number of personality types instead of the 16 in the Myers-Briggs framework. This also means that models built upon the FFM may require more effort to construct, because they have to take more potential input values into consideration.

III. USING PERSONALITY PROFILES TO GENERATE BOT BEHAVIOR

When a software engineer undertakes the task of designing

an algorithm to drive the behavior of a believable bot, he or

she is already in the domain of behavior analysis in that human

behavior is analyzed in order to be reproduced.

This section describes the general approach of using personality profiles to generate bot behavior.

A. Goals for creating believable human-like behavior patterns

Language issues aside, bots are typically distinguished by

behavior that is overly repetitive and overly precise. Both of

these issues must be addressed to create a believable bot.

A bot without any patterns to its behavior would be chaotic,

which is the other extreme from repetitive. This is obviously

not the desired result. To create a believable bot, the behaviors

must be structured but flexible, so that there is not excessive

or unnatural repetition in the behaviors.

A bot that was entirely imprecise might be thought to be behaving as a young child. This might be believable in some contexts, but in most instances, we will want our bot to be plausibly adult, but not superhuman. A degree of imprecision in behavior is needed for the bot to be plausible.

We have two criteria for our bot. The algorithm generating the bot behavior must be sufficiently complex that it is opaque to the subject, or else it will betray a pattern which is artificial or indicative of weaker cognitive ability. It also has to produce results that are always plausible, or else it will simply seem inhuman.

One way to skirt predictability is to introduce a stochastic model. An element of randomness is often useful during development of personality profile-based algorithms, but in practice we have found that random values are not needed in the final production. The FFM-based algorithms described below have enough layers and consider enough variables to quickly exceed a human subject’s ability to accurately predict the output.

B. Observations on the use of personality profiles for bot behavior

Personality profile models like the FFM don’t purport to have any correlation with a physical process or brain anatomy; however, studies that show validity and that verify the model encourage the conclusion that something real is being measured in human behavior. For our purposes, we can ignore whether or not the FFM accurately models an underlying system. It is good enough for our purposes that the descriptions and predictions of the FFM are accurate. If the human brain is a black box to us, this is acceptable since we have the FFM, which is known to us and can produce corresponding results.

C. Strengths in the use of personality profiles for bot behavior

In a rule-based system that attempts to reproduce behavior,

the interaction of variables would quickly become an obstacle

for continued development. A fine line is walked between

having an algorithm complex enough that no human can

simulate it, and yet organized well enough that software

engineers can manipulate it. This is one of the strengths

of using a personality profile as the basis of a behavioral

algorithm. The FFM is the basis for our framework, and it

provides code structure as well as semantic context.

The task then becomes implementing an algorithm based on the FFM which also satisfies our second criterion for believability: that the behavior generated is also plausibly human. Here again we rely on the construction of the FFM. The reduction of a personality to five independent factors allows us to load data into each factor from the domain expert in a manageable way. We do not have to consider how a human will respond to a certain stimulus; rather, we only have to consider how the Extroversion factor will respond to that stimulus, and then the next factor and so on.

Consider a hypothetical bot that has a high level of the Extroversion factor. We want to model a believable response to a given stimulus. Rather than treat the behavior of the bot in its totality, we simply have to decide whether the stimulus would lead the Extroversion factor to favor one of the possible behaviors. Since extroversion implies a preference for being around others, it is a simple and manageable judgment to decide whether the stimulus would increase, decrease, or not affect the likelihood that the bot will choose a behavior that allows it to be closer to other characters. If the possible behaviors are “stay in this room” and “leave this room,” and the stimulus is another character entering the room, then the Extroversion factor will encourage the bot to stay in this room where it has company. The meaning of extroversion is culturally and contextually loaded, and practically generates the behavior rule for us. We label the result of this interaction, the choice of the behavior rule to stay in the room, as an intention.
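The room example can be sketched as a single per-factor rule. This is an illustrative sketch only: the `Intention` type, the stimulus label `"character_entered_room"`, and the choice to use the factor value itself as the weight are assumptions made here, not the representation used in our implementation.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Intention:
    """A factor's vote for or against one candidate behavior (hypothetical type)."""
    behavior: str   # e.g. "stay in this room"
    weight: float   # positive encourages the behavior, negative discourages it


def extroversion_intentions(extroversion: float, stimulus: str) -> List[Intention]:
    """Rule for the Extroversion factor only.

    `extroversion` is assumed to be positive for extroverted personalities and
    negative for introverted ones. The rule considers one factor in isolation;
    the other four factors would each get their own rule.
    """
    if stimulus == "character_entered_room":
        # Company is now available here: an extrovert is pulled toward
        # staying, an introvert toward leaving.
        return [
            Intention("stay in this room", extroversion),
            Intention("leave this room", -extroversion),
        ]
    # This factor has no opinion about other stimuli in this sketch.
    return []


if __name__ == "__main__":
    for intention in extroversion_intentions(0.8, "character_entered_room"):
        print(intention)
```

The point of the sketch is the narrow scope of the judgment: the rule never reasons about the bot as a whole, only about what one factor implies for one stimulus.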

There are many ways to use personality profiles such as

the FFM to generate intentions, and many ways to convert

intentions into behaviors. In the algorithms we explore below,

we found it beneficial to generate many intentions and then

use a separate facility to funnel the intentions into a specific

behavior or action.
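One simple way to funnel many intentions into a single behavior is to sum the weights per behavior and pick the largest total. The sketch below assumes intentions are plain `(behavior, weight)` pairs and that ties are broken alphabetically; both are assumptions for illustration, and other funnelling schemes are equally possible.

```python
from collections import defaultdict


def funnel(intentions):
    """Collapse (behavior, weight) pairs into one chosen behavior.

    Weights for the same behavior are summed, and the behavior with the
    largest total wins. Returns None when there are no intentions.
    """
    totals = defaultdict(float)
    for behavior, weight in intentions:
        totals[behavior] += weight
    if not totals:
        return None
    # Iterating over sorted keys makes ties resolve deterministically
    # (alphabetically) rather than by insertion order.
    return max(sorted(totals), key=totals.__getitem__)


if __name__ == "__main__":
    votes = [
        ("stay in this room", 0.8),    # e.g. from one factor
        ("leave this room", -0.8),
        ("leave this room", 0.3),      # e.g. from a second factor
    ]
    print(funnel(votes))
```

Keeping this step separate from the per-factor rules means the rules can be edited one factor at a time without touching the arbitration logic.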

D. Limitations on the use of personality profiles for bot behavior

Personality profiles are not appropriate for all algorithms where the intent is to create a believable bot. We found that we are restricted to applications where human personality is visible in actions taken. In a game of tic-tac-toe, for example, it would be very difficult to distinguish human behavior from an inhuman algorithm since the available actions are so few.

We have not defined the boundary of when a personality

profile is appropriate and when it is not. We assume that

personality profiles are appropriate for some use scenarios and

not others, and this calculation is left to the architect. We do

have a preference for implementations in strategy or turn-based

games, since we feel these allow more room for abstractions of

behavior, and hence more room for behavior that can be seen

as guided by intention. Scenarios that rely purely on reaction

time or spatial analysis would probably not benefit as much

from a personality profile algorithm, since those functions are

associated with brain systems independent of personality.

In cases where personality profile-based algorithms are appropriate, using a psychological model for an algorithm is limited by the fact that these models were created with the intention of being primarily descriptive and only weakly predictive. The predicted outcome of putting two known Myers-Briggs personality types together in a situation may be that one of the subjects is uncomfortable with the interaction. The predicted result “uncomfortable” is inherently ambiguous, as are the subject’s own responses to the MBTI and the assessment made by the Master Practitioner. This subjectivity can be uncomfortable for the software engineer, particularly when designing an algorithm that must generate behavior.

We use the FFM as a framework to build our bot’s generated

intentions, but we are still susceptible to the criticism that

our interpretation of the FFM is culturally and contextually

dependent. A different team of engineers would most likely

implement different intentions based on different semantic and

experiential interpretations of the five factors. We posit that

this is not critical to our goal of creating a believable bot.

As long as the contextual interpretations of the factors are

consistent, then that internal consistency will provide us with

the incomprehensibility and feasibility that we require in the

aggregate behavior.

E. Evaluating believable bot behaviors

Subjective bias can be mitigated with empirical testing and

refinement in cases where humans and non-human bots can

be swapped. Humans can be tested for their FFM profiles,

and then interact with the program. Profile factors can then be

correlated with actions taken. Bots can then be programmed to

have the same profile factors as the humans, and the algorithm

can be adjusted until the same correlation exists between the

FFM factors and the actions taken.
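The tuning loop described above depends on a correlation measure between a profile factor and an observed action rate. A minimal sketch using the Pearson coefficient follows; the choice of Pearson correlation, the function names, and the sample numbers are all assumptions for illustration and are not measurements from our experiments.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


def correlation_gap(human_factor, human_action, bot_factor, bot_action):
    """How far the bots' factor/action correlation is from the humans'.

    The algorithm would be adjusted until this gap is acceptably small.
    """
    return abs(pearson(human_factor, human_action) - pearson(bot_factor, bot_action))


if __name__ == "__main__":
    # Hypothetical data: Extroversion scores, and the fraction of turns on
    # which each player chose to stay near other characters.
    humans = ([-0.6, -0.2, 0.1, 0.5, 0.9], [0.20, 0.35, 0.50, 0.70, 0.90])
    bots = ([-0.6, -0.2, 0.1, 0.5, 0.9], [0.40, 0.45, 0.50, 0.50, 0.60])
    print(correlation_gap(*humans, *bots))
```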

In cases where personality profile-based algorithms are

appropriate, robust analysis and development leads to a robust

algorithm for generating human-like behavior.

F. Complementary algorithms to personality profiles

Implementing a personality profile-based algorithm may

not be sufficient to create a believable bot, just as having

a personality is not sufficient to constitute a human. In the

algorithms below, the personality profile-based algorithm acts

as an engine to create intentions. Other engines then convert

the intentions into actions or results. In other cases, the

personality profile-based algorithm can be used as a corollary

to another heuristic, or as a subset of a rules-based engine.

The FFM can represent a personality as five real numbers corresponding to the five factors. These numbers can be modified using evolutionary algorithms to generate personalities that gravitate towards exhibiting the desired behavior. Likewise, the intentions or rules generated by an algorithm based on the FFM can be modified by an evolutionary process to make the algorithm more realistic, increasing correspondence with real human personalities and behavior.
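As a sketch of how five real numbers might be evolved, the following uses a very simple mutate-and-keep-if-not-worse loop. The loop structure, the Gaussian mutation, the clamping of each factor to the range -1 to 1 (the range used for SMRTS profiles in Section V), and the stand-in fitness function are all assumptions; a real fitness function would score how closely the bot's behavior matches the desired behavior.

```python
import random


def mutate(profile, sigma, rng):
    """Perturb each factor with Gaussian noise, clamped to [-1, 1]."""
    return [max(-1.0, min(1.0, value + rng.gauss(0.0, sigma))) for value in profile]


def evolve(profile, fitness, generations=200, sigma=0.1, seed=0):
    """Keep a single best profile and replace it whenever a mutant scores
    at least as well. `fitness` maps a five-number profile to a score,
    where higher is better."""
    rng = random.Random(seed)  # seeded so runs are repeatable
    best = list(profile)
    best_score = fitness(best)
    for _ in range(generations):
        candidate = mutate(best, sigma, rng)
        score = fitness(candidate)
        if score >= best_score:
            best, best_score = candidate, score
    return best


if __name__ == "__main__":
    # Stand-in fitness: closeness to a hypothetical target personality.
    target = [0.5, -0.2, 0.0, 0.9, -1.0]

    def closeness(p):
        return -sum((a - b) ** 2 for a, b in zip(p, target))

    print(evolve([0.0] * 5, closeness))
```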

In consideration of the overall performance of the bot, the

personality profile-based algorithm can play the foremost role

but is often not sufficient for the entire implementation of

human-like behavior.

IV. AN EXAMPLE OF A STAND-ALONE PERSONALITY PROFILE ALGORITHM: ALMA

In pursuit of this work, we evaluated ALMA [3], a Java engine that models the emotional state of an agent based on the FFM. The ALMA model has three layers: Personality, Mood, and Emotion. Personality is a constant for a given agent, and is defined by the five factors of the FFM. Mood is a medium-term representation of the agent’s state, and Emotions are short-term state changes. When information comes in from the environment in the form of an event, action, or object, it interacts with the Personality to trigger an Emotion, the short-term state change. Emotions decay quickly, but they affect the Mood, causing it to migrate over time as an agglomeration of the emotional past. A separate algorithm can then generate actions based on the Mood state.
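The three-layer idea can be illustrated with a toy model. The sketch below is not ALMA's API or its actual equations; it is written in Python rather than Java, collapses emotion and mood to single numbers, and uses a Neuroticism-scaled reaction purely as a hypothetical example of Personality shaping an Emotion. All names and constants are assumptions.

```python
class LayeredAgent:
    """Toy three-layer agent: constant personality, slow mood, fast emotion.

    Illustrative only. Emotion and mood are single valence values
    (negative = unpleasant), which is a simplification made here.
    """

    def __init__(self, neuroticism, emotion_decay=0.5, mood_rate=0.1):
        self.neuroticism = neuroticism      # Personality: never changes
        self.emotion_decay = emotion_decay  # fraction of emotion kept per tick
        self.mood_rate = mood_rate          # how fast mood follows emotion
        self.emotion = 0.0                  # short-term state
        self.mood = 0.0                     # medium-term state

    def perceive(self, event_valence):
        """An event triggers an emotion whose size depends on personality.
        Scaling by (1 + neuroticism) is a hypothetical rule."""
        self.emotion += event_valence * (1.0 + self.neuroticism)

    def tick(self):
        """Advance one time step: mood drifts toward the current emotion,
        then the emotion decays."""
        self.mood += self.mood_rate * (self.emotion - self.mood)
        self.emotion *= self.emotion_decay


if __name__ == "__main__":
    agent = LayeredAgent(neuroticism=0.5)
    agent.perceive(-1.0)  # an unpleasant event
    for step in range(5):
        agent.tick()
        print(step, round(agent.emotion, 4), round(agent.mood, 4))
```

In this toy, the emotion halves every tick while the mood moves only a tenth of the way toward it, which is one way to picture the momentum of the Mood layer.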

ALMA has many positive attributes, not the least of which

is that it is an existing engine that can quickly implement an

algorithm based on the FFM. This assists in algorithm design

and provides a nice separation of responsibilities in the code.

In situations where supporting code can translate a mood into

action, perhaps using a poll on mood that generates behavior

accordingly, ALMA is a good choice; however, the Mood

produced by ALMA can be a limitation in situations where

the bot needs to simultaneously consider multiple inclinations,

and possibly switch between two strategies very quickly. The

Mood in the ALMA engine has momentum that prevents this

type of abrupt switch that is often necessary for human-like

behavior.

Let us imagine a scenario where a timid player in a first-person shooter game with sub-optimal health points is suddenly confronted with another character. The timid player has conflicting inclinations to attack the new character, and to retreat and find more health points. These inclinations cannot be carried out simultaneously, so some mechanism has to perform conflict resolution and choose one strategy. The timid player’s inclination to attack is greater than his inclination to retreat, so he chooses to attack. Unfortunately, the return fire signifies to the timid player that he is outmatched. The inclination to retreat, which did not diminish, is now greater than his inclination to attack, which also has not diminished. The timid player now abruptly changes behavior and breaks off the attack to retreat and find additional health points.
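The scenario can be traced with two persistent inclinations and a resolver that always picks the stronger one. The numeric values and the size of the boost from return fire are hypothetical; the sketch only shows that holding both inclinations at once, with no momentum, allows the abrupt switch.

```python
def choose_strategy(inclinations):
    """Conflict resolution: act on the strongest inclination.
    Sorting the keys makes ties resolve deterministically."""
    return max(sorted(inclinations), key=inclinations.get)


def timid_player_encounter():
    """Replay the scenario and return the strategy chosen before and
    after the return fire. All numbers are illustrative assumptions."""
    inclinations = {"attack": 0.6, "retreat": 0.5}
    before = choose_strategy(inclinations)

    # Return fire signals that the player is outmatched. The inclination
    # to retreat grows; the inclination to attack is left untouched.
    inclinations["retreat"] += 0.3
    after = choose_strategy(inclinations)
    return before, after


if __name__ == "__main__":
    print(timid_player_encounter())
```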

The above illustrates a scenario that we do not believe ALMA would be sufficient for modeling, although ALMA does go a long way towards generating a useful framework abstraction.

V. THE SMRTS TRAINING SIMULATION

The Sales Management Research and Training Simulation (SMRTS) is an online training simulation used to teach sales management. SMRTS implements a personality profile algorithm based on the FFM that satisfies our criteria of creating believable bots. This project was informative for developing our Unreal Tournament 2004 bot, so the work is described briefly here.

A. SMRTS overview

SMRTS was designed to deploy in a wide variety of training scenarios, but the most common deployment environment is a pharmaceutical sales management training session usually spanning a few days. About twenty people who either need management training, have just been promoted to management from being a sales representative, or show potential for management are brought together and evaluated while playing the simulation. The cohort is broken up into teams of four or five people, and each team plays simultaneously in one virtual world.

The simulation is divided into periods that correspond to business quarters. Each team plays the role of a district manager who starts off with five sales representatives under his or her control. The teams represent different companies with competing products in the same district. The sales representatives are of course virtual, and these bots can be fired or hired from an application pool or even hired away from other teams.

Every quarter, each team reviews the results from the

previous quarter, makes decisions for the current quarter,

and then the facilitator compiles the results and moves the

simulation into a new quarter. There are approximately 30

decisions that each team can make, and they are the same

decisions each round. Most decisions revolve around where

to invest the district’s effort to gain a strategic advantage; for

example, the team gets to decide as a manager how much

time they spend out in the field with each representative, and

what topic to focus on during that time. The SMRTS decision

screen is illustrated in Figure 1.

Each team gets to decide how much time sales representatives should spend in the field making new calls versus catching up on paperwork, and other decisions along these lines. At the end of each quarter, the districts are ranked against each other by how much net income they have as well as market share.

This application requires a sophisticated engine to drive the virtual sales representative behavior. In this case, the simulation participants are aware that the representatives are not real humans, but the bots have to behave in a very realistic fashion. The simulation is a framework for managers to study and practice management skills like coaching, leadership, and motivation. The bots have to respond to the manager in a realistic but complex way. The FFM provided a framework for developing an algorithm that is sufficiently complex and realistic.

Fig. 1. Decision Screen from SMRTS

B. SMRTS details

Each bot in SMRTS represents a sales representative. Each

bot is given a description, resume, career history, and other

interview results typical of what a manager would see for an

applicant who had already been screened by a human resources

department. A person with a background in psychology and

understanding of our goal then defined an FFM profile for

each bot.

The FFM profile is a set of five real numbers between -1 and 1, one for each of the factors; for example, a profile might be [-0.3, 0.1, 0.7, 0.1, -0.1] where the first number represents Extroversion, indicating that this personality is more introverted than the average person. The other four numbers would represent slightly above average Agreeableness, very high Conscientiousness, slightly above average Neuroticism, and slightly below average Openness. We don’t worry about values being exact, because there is no perfectly quantifiable correlate in real humans. Instead, we focus on making sure that we are consistent with our interpretation, since that internal consistency is the key to having an accurate bot.
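A profile of this shape can be held in a small immutable record. The sketch below is a hypothetical representation, not the SMRTS data structure; the class name `FFMProfile` and the range check are assumptions, while the field order and the example values follow the text above.

```python
from dataclasses import dataclass, astuple


@dataclass(frozen=True)
class FFMProfile:
    """Five factor values, each between -1 and 1, where 0 is taken to
    mean the average person (hypothetical representation)."""
    extroversion: float
    agreeableness: float
    conscientiousness: float
    neuroticism: float
    openness: float

    def __post_init__(self):
        # Reject values outside the range the text describes.
        for name, value in vars(self).items():
            if not -1.0 <= value <= 1.0:
                raise ValueError(f"{name} must be between -1 and 1, got {value}")


if __name__ == "__main__":
    # The example profile from the text.
    profile = FFMProfile(-0.3, 0.1, 0.7, 0.1, -0.1)
    print(astuple(profile))
    print("more introverted than average:", profile.extroversion < 0)
```

Making the record frozen matches the idea that a personality is a constant for a given bot, while the outputs it produces change from quarter to quarter.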

The bot is affected by its environment primarily through the inputs of the decisions made by the manager. One of the decisions is how much time the manager spends with the sales representative. The available values of the decision can be scaled from 0 (no time) to 100 (50 hours or so per week). At its simplest, our algorithm needs to translate the decision through the personality onto some form of output. We construct a matrix of functions for this translation.

C. Mapping from personalities and situations to moods

Our output contains three values which are averaged for all of the decisions: Mood, Skill, and Effort. All three outputs range in value from 0 to 1. We now construct sentences similar to the following: “If a person spends no time with his or her boss, what level of Extroversion would maximize his or her Mood at work?” Since Extroversion implies a desire to be with other people, clearly a person with very low Extroversion is going to have the highest Mood given this decision. In a function where the decision is on a scale of 0 to 100 on the x-axis, and Extroversion is on a scale of 0 to 1 on the y-axis, we plot (0,0) to indicate the maximum Mood value.

Similarly, we can ask the question at the other end of the scale: “If a person spends 100% of his or her time with the boss, what level of Extroversion would maximize his or her Mood at work?” We plot (100,1) to indicate that at 100% time we have a maximum Mood value. If we feel that this relationship is simple and linear, we can connect the two points with a straight line and we now have a function that maps the Extroversion factor through the decision of “time spent with representative” onto a Mood value. For our purposes, we found that linear functions were limiting, and so we implemented Bezier curves to provide a smooth function that allows for non-linear mapping, as illustrated in Figure 2.

Fig. 2. Example Bezier Mapping of Decision Value onto Mood
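A curve of this kind can be sketched with a cubic Bezier running from (0,0) to (100,1). The control points below are hypothetical, and the rule that Mood falls off linearly with distance from the curve is an assumption added for illustration; the text only fixes the two endpoints and the 0-to-1 Extroversion scale used in this passage.

```python
def bezier(p0, p1, p2, p3, t):
    """Evaluate one coordinate of a cubic Bezier curve at parameter t."""
    u = 1.0 - t
    return u ** 3 * p0 + 3 * u * u * t * p1 + 3 * u * t * t * p2 + t ** 3 * p3


def ideal_extroversion(decision, ctrl1=(25.0, 0.6), ctrl2=(75.0, 0.9)):
    """Extroversion level (0..1) that maximizes Mood for a decision (0..100).

    The curve runs from (0, 0) to (100, 1). Both control points are
    hypothetical. Their x values lie between 0 and 100, which keeps x(t)
    non-decreasing, so t can be found by bisection.
    """
    lo, hi = 0.0, 1.0
    for _ in range(50):
        mid = (lo + hi) / 2.0
        if bezier(0.0, ctrl1[0], ctrl2[0], 100.0, mid) < decision:
            lo = mid
        else:
            hi = mid
    t = (lo + hi) / 2.0
    return bezier(0.0, ctrl1[1], ctrl2[1], 1.0, t)


def mood(extroversion, decision):
    """Assumed falloff: Mood is 1 on the curve and drops linearly with the
    distance between the bot's Extroversion and the ideal level."""
    return max(0.0, 1.0 - abs(extroversion - ideal_extroversion(decision)))


if __name__ == "__main__":
    for d in (0, 25, 50, 75, 100):
        print(d, round(ideal_extroversion(d), 3), round(mood(0.2, d), 3))
```

Moving the two control points changes the shape of the curve without moving its endpoints, which is the flexibility a straight line lacks.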

Creating a matrix of functions to translate between all of the

factors (5), decisions (30), and outputs (3) can be time consum-

ing. In this example, we have 5 x 30 x 3 = 450 mappings. Some

combinations simply map to 0 and don’t require a function.

The decision “time with representatives” might not have any

bearing on the personality factor Neuroticism mapping to Skill.

Some mappings will then be much easier to generate, and in

practice there will be fewer than the full number of possible

mappings. An example mapping shows the five factors with

the alternative labels Stability [Neuroticism], Extroversion,

Originality [Openness], Accomodation [Agreeableness], and

Consolidation [Conscientiousness] as illustrated in Figure 3.
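One way to hold such a sparse matrix is a lookup table in which missing entries contribute 0. The sketch below is an assumption about how this could be organized, not the SMRTS code; the class name, the key layout, and the example Mood function are all hypothetical.

```python
from typing import Callable, Dict, Tuple

# A mapping takes (factor_value, decision_value) and returns a contribution
# to one output. The signature is an assumption made for this sketch.
Mapping = Callable[[float, float], float]
Key = Tuple[str, str, str]  # (factor, decision, output)

FULL_MATRIX_SIZE = 5 * 30 * 3  # factors x decisions x outputs, as in the text


class MappingTable:
    """Sparse table: only non-trivial combinations are registered."""

    def __init__(self) -> None:
        self._mappings: Dict[Key, Mapping] = {}

    def register(self, factor: str, decision: str, output: str, fn: Mapping) -> None:
        self._mappings[(factor, decision, output)] = fn

    def evaluate(self, factor: str, decision: str, output: str,
                 factor_value: float, decision_value: float) -> float:
        fn = self._mappings.get((factor, decision, output))
        # Unregistered combinations simply map to 0.
        return 0.0 if fn is None else fn(factor_value, decision_value)

    def __len__(self) -> int:
        return len(self._mappings)


def mood_from_time_with_rep(extroversion: float, decision_pct: float) -> float:
    """Hypothetical Mood function: Mood peaks when the share of time spent
    together matches the Extroversion level (0 to 1) and falls off linearly."""
    return 1.0 - abs(decision_pct / 100.0 - extroversion)


table = MappingTable()
table.register("Extroversion", "time with representatives", "Mood",
               mood_from_time_with_rep)
```

With this layout, the Neuroticism-to-Skill entry for "time with representatives" is never registered, so it costs nothing to author and evaluates to 0.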

After the decisions are mapped onto the outputs, another

algorithm takes the outputs and averages them, considers a

decayed momentum for previous values, and considers the

outputs of competing bots in the same district areas as well

as other environmental factors to determine how many units

of product the virtual sales representative sells. Those other

algorithms are not the subject of consideration here, but the

end result is a rich variety of results that demonstrate a feasible

human-like pattern of behavior.

Fig. 3. Example Mapping of Decision to Mood (Motivation) Factors

D. Review of feedback from the humans who played the simulation

When we ran the simulation with a pilot group of 20 district

managers and experienced sales representatives at a large phar-

maceutical company, we were immediately impressed with

the willingness of humans to anthropomorphize the virtual

sales representatives. The participants obviously knew that the

representatives were not real people. In the first round, the teams

would question the algorithm behind the simulation, and try to

get a feel for how sophisticated and nuanced it could be. By

the second round, humans would refer to the bots by name and

begin to project a relationship with them. During conversations

about who to hire and fire, team members could be overheard

saying things along the lines of “I really like Erika [a bot]”

or “Allen [another bot] doesn’t work well with the team; he

has to go.”

By the end of the first day, after playing the simulation for

four rounds, humans were ascribing intentions to the bots. A

notable exclamation might be “Allen [a bot] is driving me nuts;

he’s too stubborn to get the job done.” Since the simulation is

designed to provide a framework for addressing management

issues in a safe environment, this is the perfect opportunity

for a facilitator to step in and dig deeper into the reasons why

the district manager cannot get the sales representative to do

what he wants him to do.

As a program, we found the simulation to be very ef-

fective. As an algorithm, we found the FFM framework to

be instrumental in creating an experience that was complex

enough to be unpredictable, but consistent enough to be

feasible human behavior. Dozens of runs of the simulation

at other pharmaceutical companies and in higher education at

the undergraduate, graduate, and Executive MBA level show

similar results to our first pilot.

VI. DISCORDIA AND THE 2010 2K BOTPRIZE

The 2K BotPrize is a competition in the style of a Turing test

whereby computer agents attempt to masquerade as humans

in the multi-player video game Unreal Tournament 2004

(UT2004) [8]. The bot that most often passes for human

according to several judges wins the competition, so creating a

bot that models human behavior most accurately is the highest

priority. We created an agent based on the FFM to compete

as a bot in the 2010 2K BotPrize competition.

The competition takes place inside the game UT2004, which

is a multi-player first-person shooter game. Bots and human

judges play simultaneously, and the humans try to determine

which of the characters they see are bots by shooting them with

a special weapon. All of the bots are ranked by their human-

ness for four rounds. Our bot, Discordia, came in fourth place

among five that made it to the final round of the competition

in 2010.

During the competition, we were able to witness the judge-

ments that the humans made in real time. We were also given

game scripts of each play session that we could play back

later. We are able to play the scripts back from the point of

view of any bot or human agent. From this we are able to

examine the interaction prior to a judgement, and surmise the

clues that identified Discordia as a bot to the human judges.

A. Overview of Discordia and the Hysteria Engine

Several tools were already available to us when we began

designing the bot for the 2K BotPrize 2010 competition.

UT2004 launches a server which coordinates the environment

and player state for a given game. UT2004 clients then connect

to the server to play. In our case, a library called GameBots

connects to the UT2004 server as a client and offers an API for

sending messages to create and control the movements of a

player. Another library called Pogamut (version 3) connects

to GameBots and offers a layer of abstraction for simple

behaviors, like path detection and following other characters,

picking up medical kits, and switching weapons. Discordia

implements a class from the Pogamut library to connect to

the game server and run our bot.

Our bot responds to a very small set of inputs, which map

onto a very small set of outputs, but the interactions in the

mapping lead to a rich complexity of behaviors. As in the

SMRTS example, we set out to use the FFM as a basis for

the algorithm that controls our bot’s behavior. We call this

algorithm the Hysteria Engine. The Hysteria Engine generates

intentions by translating the inputs through a personality

profile, which are then culled into non-conflicting behaviors.

The behaviors are sent to Pogamut and define the behavior of

the bot.

We divided the inputs into the following environment and

internal state events:

Summary of Inputs for the Discordia Bot

Name         Type            Description
SeeEnemy     environment     Has line-of-sight to another character.
UnderAttack  environment     Being attacked.
SeeItem      environment     Has line-of-sight to an item that can be picked up.
Attacking    internal state  Firing an attack.
Pursuing     internal state  Moving toward another character.
HasWeapon    internal state  Has a firing weapon.
HighHealth   internal state  Has above 70 percent health.
LowHealth    internal state  Has below 70 percent health, but above 30 percent.
AlmostDead   internal state  Has below 30 percent health.

Pogamut is event-driven, so our class is

repeatedly called with either new events from the environment

or ongoing events from internal state. We do not consider all

of the events that are possible to intercept from Pogamut or

from GameBots, but this is only because of limited developer

resources. Notice that the inputs are boolean values, the last

three of which are mutually exclusive. This simplified our

mapping functions.
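As a sketch of how the three health inputs could be derived so that exactly one of them holds at any time, consider the following. The function name is hypothetical, and the text does not say which state applies at exactly 70 or 30 percent; here those boundary values are assigned to the lower bucket as an assumption.

```python
from typing import Dict


def health_flags(health_pct: float) -> Dict[str, bool]:
    """Derive the three mutually exclusive health inputs from a percentage.

    Boundary handling (exactly 70 or exactly 30) is an assumption; the
    thresholds themselves come from the input table.
    """
    return {
        "HighHealth": health_pct > 70,
        "LowHealth": 30 < health_pct <= 70,
        "AlmostDead": health_pct <= 30,
    }
```

Because the three flags partition the health range, a mapping function never has to reason about two health states being active at once.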

B. The Hysteria Engine algorithm

Hysteria Engine maps the events and current state onto many
intentions. A separate algorithm then resolves conflicts among
the competing intentions and sends the result to the game
server. The possible intentions are:

Summary of Intentions for the Discordia Bot

Name           Description
Attack         Fire the current weapon.
UpgradeWeapon  Change weapon to the next-highest ranked by damage.
StopToThink    Stop running.
Dodge          Jump while moving.
Crouch         Crouch down to make a smaller target.
Uncrouch       Stop crouching to move around faster.
LookAround     Move the head around to see different objects.
Pursue         Move towards another visible character.
RunAway        Move away from an attacking character.
MedKit         Move toward an item to increase health.

Most of these intentions are carried out by Pogamut,

although some require additional information; for example,

Attack requires a parameter defining another character, either

the character currently being attacked or a new one. That

additional information and the processing of the intentions

are handled outside the Hysteria Engine.

Hysteria Engine processes a personality profile with the five

factors of the FFM: Extroversion, Agreeableness, Conscien-

tiousness, Neuroticism, and Openness. One of the advantages

of using the FFM is that we can quickly generate new bots

with consistently human-like behavior for the small cost of

changing the five values of the personality profile. Each factor

accepts a real number value between -1 and 1. In the case

of Discordia, we chose the following values for our bot’s

personality profile:

Hysteria Default Personality Profile

Factor             Value  Interpretation
Extroversion       0.2    Outgoing, interested in interacting with others.
Agreeableness      0.0    Neither cooperative nor antagonistic.
Conscientiousness  0.1    Slightly goal-oriented, motivated to finish what is started.
Neuroticism        0.0    Neither optimistic nor pessimistic.
Openness           0.1    Slightly imaginative, willing to explore.

Notice that these values are close to the origin, 0 being

an “average” value; in this case, we are trying to simulate a

close-to-average human.

1) Generating Intentions: Each one of the nine possible
inputs has five possible mappings, one for each of the personality
factors, for a total of forty-five possible mappings. Each
mapping may produce any of the intentions to varying
strength, but most of the possible mappings are null: they generate
no intentions and simply return nothing. Depending on the

complexity of the mapping function, building this matrix of

translations can be labor-intensive.

A mapping is best illustrated by an example. Given that

we are creating the mapping between the input SeeEnemy

and Extroversion, we start by considering Extroversion at its

extremes. A personality of -1 Extroversion signifies someone

who does not want to be in the presence of others, so this

combination would generate the strongest RunAway intention.

On the other side of the scale, a personality of +1 Extroversion

wants to be near other people, so this combination would

generate the strongest Pursue action. We could use a simple

linear equation to establish the degree to which these intentions

are generated, with the origin generating neither intention.

Several input-factor combinations obviously require no mapping. Consider

the combination of the input SeeItem and the factor Extrover-

sion. No value of Extroversion is going to generate an intention

with regard to seeing an object, so we simply return nothing.
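The two examples above can be written as small functions. This is a minimal sketch of the simple linear form described in the text; the function names and the dictionary representation of intentions are assumptions, not the Hysteria Engine source.

```python
from typing import Dict

Intentions = Dict[str, float]  # intention name -> intensity (assumed layout)


def see_enemy_extroversion(extroversion: float) -> Intentions:
    """Linear mapping for the input SeeEnemy through Extroversion.

    -1 yields the strongest RunAway, +1 the strongest Pursue, and the
    origin yields neither intention.
    """
    if not -1.0 <= extroversion <= 1.0:
        raise ValueError("factor values must lie in [-1, 1]")
    if extroversion < 0:
        return {"RunAway": -extroversion}
    if extroversion > 0:
        return {"Pursue": extroversion}
    return {}


def see_item_extroversion(extroversion: float) -> Intentions:
    """Null mapping: Extroversion has no bearing on seeing an item."""
    return {}
```

For Discordia's Extroversion of 0.2, the first function yields a weak Pursue intention of intensity 0.2 whenever an enemy is visible.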

As more scenarios are defined and experienced, we can

continue to modify the mappings in more detail. We can

use smooth equations instead of linear ones, or create Bezier

curves to establish our mappings to an arbitrary level of detail.

In practice we see that even simple linear mappings create a

rich set of intentions and that more detail than this is not

necessary in most cases.

2) Resolving Intentions: As intentions are generated, they

are added to a stack. If the intention already exists in the

stack, the intensity is accumulated. The stack is initialized by

the previous set of decayed intentions. Once the intentions are

generated, we may have several conflicting ones; for example,

StopToThink cannot be fired in tandem with any other action.

We simply choose the intention with the greatest intensity. If

more than one intention has the top value, we choose randomly

among those intentions. Our chosen intention is then emitted

as the action we wish our agent to perform. This management

of intentions is not part of Hysteria Engine, but the personality

profiles do make another appearance in our bot after this

algorithm.
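The accumulate-then-choose step can be sketched as follows. The function names and the dictionary used for the stack are assumptions made for illustration; the behavior (accumulating intensity, picking the greatest, breaking ties at random) follows the description above.

```python
import random
from typing import Dict, Optional

Stack = Dict[str, float]  # intention name -> accumulated intensity


def accumulate(stack: Stack, generated: Dict[str, float]) -> None:
    """Add newly generated intentions; repeated names accumulate intensity."""
    for name, intensity in generated.items():
        stack[name] = stack.get(name, 0.0) + intensity


def resolve(stack: Stack, rng: Optional[random.Random] = None) -> Optional[str]:
    """Return the intention with the greatest intensity.

    Ties for the top value are broken uniformly at random.
    Returns None when no intention has been generated.
    """
    if not stack:
        return None
    top = max(stack.values())
    candidates = [name for name, value in stack.items() if value == top]
    return (rng or random).choice(candidates)
```

A usage pattern would be to seed the stack with the decayed intentions from the previous cycle, call `accumulate` once per mapping result, and then call `resolve` to obtain the single action to emit.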

3) Intentional Decay: We don’t want intentions to disap-

pear once they are generated, because humans are creatures

of habit and some are better at staying focused than others.

In order to take into account previous intentions, we apply

a decay function to the stack of generated intentions after

the resolution stage. The decay is a function of the Con-

scientiousness factor. A personality with -1 Conscientiousness

would carry over 0 intensity for each intention, and would

basically start with a clean slate every time. On the other

end of the scale, a personality with +1 Conscientiousness

would carry over nearly all of the intensity of the previous

intentions. This essentially replicates the goal-oriented nature

of the Conscientiousness personality factor. While not part of

Hysteria Engine, it illustrates the usefulness of an internally

consistent psychological model for understanding the behavior

of a bot in its totality.
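A linear decay that satisfies both endpoints described above could look like this. The text fixes only the behavior at -1 (nothing carried over) and +1 (nearly everything carried over); the linear interpolation and the 0.95 ceiling are assumptions made for this sketch.

```python
from typing import Dict

MAX_RETENTION = 0.95  # assumed stand-in for "nearly all" of the intensity


def retention(conscientiousness: float) -> float:
    """Fraction of each intensity carried into the next cycle.

    Maps Conscientiousness in [-1, 1] linearly onto [0, MAX_RETENTION].
    """
    if not -1.0 <= conscientiousness <= 1.0:
        raise ValueError("factor values must lie in [-1, 1]")
    return MAX_RETENTION * (conscientiousness + 1.0) / 2.0


def decay(stack: Dict[str, float], conscientiousness: float) -> Dict[str, float]:
    """Return the decayed stack that initializes the next cycle.

    Intentions whose intensity decays to zero are dropped.
    """
    keep = retention(conscientiousness)
    decayed = {name: value * keep for name, value in stack.items()}
    return {name: value for name, value in decayed.items() if value > 0.0}
```

Under these assumptions, Discordia's Conscientiousness of 0.1 carries a little over half of each intensity forward, so a strong intention fades over a few cycles unless new inputs reinforce it.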

C. Assessment of strengths and weaknesses of the engine in

this context.

By utilizing the FFM factors as the constituent parts of

the personality profile, creating a new bot without changing

the algorithm is nearly free. We simply change five values.

The 2010 2K BotPrize Discordia entry implemented only

one personality profile, but we can imagine scenarios where

personality profiles are randomly generated, perhaps weighted

toward a normal curve centered on zero, and run through

Hysteria Engine to produce interesting and entertaining game

play.
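Generating such profiles takes only a few lines. The sketch below draws each factor from a normal distribution centered on zero and clips it to the valid range; the standard deviation of 0.35 and the function name are assumptions, since the text does not specify how the weighting would be done.

```python
import random
from typing import Dict

FACTORS = ("Extroversion", "Agreeableness", "Conscientiousness",
           "Neuroticism", "Openness")


def random_profile(rng: random.Random, sigma: float = 0.35) -> Dict[str, float]:
    """Draw a personality profile weighted toward the average (zero).

    sigma is an assumed spread; values are clipped to [-1, 1].
    """
    return {
        factor: max(-1.0, min(1.0, rng.gauss(0.0, sigma)))
        for factor in FACTORS
    }


# A small population of bots that share one rule set but differ in profile.
population = [random_profile(random.Random(seed)) for seed in range(8)]
```

Each profile in `population` could then be handed to the same engine, which is the sense in which a new bot is "nearly free".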

There are disadvantages to the bot as it is described here.

The first is superficial: because UT2004 is a fast-paced action

game, human-like behavior is often recognized in motion.

Hysteria Engine generates intentions, but it leaves the imple-

mentation of behavior to other code or to other classes in

Pogamut. Discordia will always walk down the middle of a

tunnel and try to enter the exact center of a doorway and

fire exactly at the target. The first two can be dealt with by

implementing new path-finding algorithms, but the third

requires a fuzziness that implies a skill level, which is not

accounted for in the bot as we have described.

On a more fundamental level, Hysteria Engine addresses

strategic, high-level behaviors. Reactionary algorithms that fo-

cus on mechanics and specifically on character movement and

game strategies might be more successful in competitions like

the 2K BotPrize, especially during early development. Hysteria

Engine is certainly capable of interfacing with another engine

that does a better job of implementing the behaviors that it

emits, but building that engine first might be a priority over

building a personality profile-based algorithm in a visually

turbulent action game like UT2004.

Finally, there is the disadvantage that Hysteria Engine re-

quires a human to build the heuristics. The functional mapping

between inputs through personality factors into intentions is

time consuming. An evolved algorithm may save significant

developer resources, and might also allow rules to be generated

on-the-fly to give the bot the appearance of learning.

D. Quantitative Improvements to Hysteria Engine

Once the Hysteria Engine is fleshed out for all ranges of

personality along the five factors, we can easily modify the per-

sonality profile of our agent and tune the results. Because the

FFM originally came from psychology, we should not forget

that it applies to humans as well. This correspondence between

the algorithm and people raises some interesting possibilities.

Our interpretation of the FFM is always going to be subjective.

It is rooted in contextual interpretation of the meaning of

the factors; however, experiments can be constructed to make

our algorithms quantitatively more accurate, as long as our

contextual interpretation remains internally consistent.

Consider the personality profile of Discordia, which we

designed to be very close to an average person. If we want to

qualitatively improve the ability of this particular personality

to generate human-like behavior, we can set up repeatable

experiments to do so. In a computer laboratory setting, we

can recreate the entire competition using subjects of our

choosing, or even hijack (with permission) a LAN party. Using

the results of the experiment, we can adjust the algorithms

to perform better in successive rounds. This method would

primarily improve the aspects of the bot outside of Hysteria

Engine, but some improvements could be made to our mapping

if we discover that intentions are being generated too strongly

in a given circumstance, for example.

In another setup, we can directly improve the mapping to

intentions in Hysteria Engine. A psychologist who understands

our contextual interpretation of the FFM can generate per-

sonality profiles of human players. Those humans then play,

and we record data regarding the actions produced by their

characters during play. That recorded data can be translated

to our outputs, the intentions. We can then set our bots

to have personality profiles identical to those of the humans, and

compare the actions of the human players to the actions

produced by the Hysteria Engine. A strong correlation between

the two would imply realistic play by our agent; a weak

correlation would indicate areas in which to improve our engine.
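One simple way to quantify that comparison is a Pearson correlation between how often a human and a matched bot produced each intention. The sketch below is only an illustration of the idea: the choice of Pearson correlation, the function name, and the counts in the example are assumptions, and the counts are made-up numbers, not experimental data.

```python
from math import sqrt
from typing import Sequence


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length samples."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two samples of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return cov / sqrt(var_x * var_y)


# Hypothetical per-intention counts over one session, in a fixed order:
# Attack, Pursue, RunAway, Dodge, MedKit (illustrative values only).
human_counts = [40, 25, 10, 15, 5]
bot_counts = [38, 30, 6, 18, 4]
similarity = pearson(human_counts, bot_counts)
```

A value of `similarity` near 1 would suggest that the bot distributes its actions much as the matched human does; lower values would point to the intentions whose mappings deserve attention.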

With additional effort, we could put the functional mapping

of inputs through personality profiles into intentions into an

evolutionary algorithm. That algorithm could then evolve the

rule set to bring the behavior of our bots in line with humans

who have the same personality profiles. The validity of our

engine could then be quantitatively tested against a new set

of player data, which would test the robustness of Hysteria
Engine in a way that complements the 2K BotPrize
competition itself.

VII. CONCLUSIONS

Personality profiles are a logical choice to incorporate

into algorithms that need to generate human-like behavior.

Algorithms that can be evolved and those that can learn have a
natural place in these applications, as do those that mimic
real anatomical brain features. Personality profiles stand out

for engineers interested in a cross-disciplinary approach that

does not necessarily mimic real human functions but provides

realistic results.

Strategy games are a more natural fit for the FFM algorithms

described here than action games, but we have demonstrated

that it is possible to implement FFM in action game bot

behavior. In our review of the competition scripts, it appears

that signals leading to a judgement against Discordia were

mostly related to spatial interactions between the bot and

its environment, not higher-level intentions that the bot was

attempting to carry out. A combination of personality profiles

and low-level action algorithms may ultimately provide the

most human-like bot in action games.

ACKNOWLEDGMENTS

Thanks to Dr. Glenn Rosenthal for assistance with the

assessment of SMRTS, and to Philip Hingston for running

the 2K BotPrize Contest.

REFERENCES

[1] I. B. Myers, Introduction to Type: A Description of the Theory and Applications of the Myers-Briggs Type Indicator. U.S.: Centre for Applications of Psychological Type Inc., 1990.
[2] H. J. Eysenck, “Dimensions of personality: 16, 5 or 3? Criteria for a taxonomic paradigm,” Personality and Individual Differences, vol. 12, pp. 773–790, 1991.
[3] P. Hingston, “Mind games: Psychological warfare between therapists and scientists,” The Chronicle Review, vol. 49, no. 25, p. B7, February 2003.
[4] D. P. McAdams, “The five factor model in personality: A critical appraisal,” Journal of Personality, vol. 60, pp. 328–361, 1992.
[5] J. McKenzie, “Fundamental flaws in the five factor model: A re-analysis of the seminal correlation matrix from which the openness-to-experience factor was extracted,” Personality and Individual Differences, vol. 24, pp. 475–480, April 1998.
[6] L. P. N. O. P. John and C. J. Soto, Paradigm Shift to the Integrative Big-Five Trait Taxonomy: History, Measurement, and Conceptual Issues. New York, NY: Guilford Press, 2008.
[7] J. M. Digman, “Personality structure: Emergence of the five-factor model,” Annual Review of Psychology, vol. 41, pp. 417–440, 1990.
[8] P. Hingston, “A Turing test for computer game bots,” IEEE Transactions on Computational Intelligence and AI in Games, vol. 1, no. 3, pp. 169–186, September 2009.
