

Modeling Social Influence via Combined Centralized and Distributed Planning Control

James Vaccaro, Staff Research Scientist; Clark Guest, Associate Professor
Lockheed Martin Corporation; University of California San Diego

jim. [email protected]; [email protected]

Abstract. Real world events are driven by a mixture of both centralized and distributed control of individual agents based on their situational context and internal makeup. For example, some people have partial allegiances to multiple, contradictory authorities, as well as to their own goals and principles. This can create a cognitive dissonance that can be exploited by an appropriately directed psychological influence operation (PSYOP). An Autonomous Dynamic Planning and Execution (ADP&E) approach is proposed for modeling both the unperturbed context as well as its reaction to various PSYOP interventions. As an illustrative example, the unrest surrounding the Iranian elections in the summer of 2009 is described in terms applicable to an ADP&E modeling approach. Aspects of the ADP&E modeling process are discussed to illustrate its application and advantages for this example.

Introduction

We propose using an Autonomous Dynamic Planning and Execution (ADP&E) approach that integrates both a centralized and distributed planning control capability to more realistically model complex social group interactions. In our recent survey, implemented models within social science do not successfully model future influence operations because they do not integrate enough cognitive realism in each automated-human (agent) to represent real world conditions and events. This makes the current models unsuitable for large-scale, complex problem domains. More specifically, implemented models fail to capture several aspects of human behavior because these models do not include the ability to adjust to very large, partially observable, and uncertain environments, nor do they use human abilities in dynamic planning to maintain agility in these ever-changing environments.

In addition, many techniques assume a completely distributed (decentralized) approach that uses simplified cognitive agents with common goals to create swarm-like behavior [1]. This leads to emergent events when the cumulative cognitive state reaches a tipping point. In the same context, other techniques rely on completely centralized control of agents to optimize their coordination and lead to more optimal strategies of cooperative event behavior, which can suspend reactions of discontent and generate strong unified positions [2]. Both of these approaches are goal-directed, but the centralized approach relies more on reputational or social utility, while the distributed approach relies more on intrinsic or expressive (i.e., individual or psychological) utility.

Real world events are actually driven by a mixture of both centralized and distributed control of individuals (agents) based on their situational context and internal makeup. Given the level and type of education, age, interests, experiences, religious affiliation, economic status, etc., individuals have varying degrees of both centralized and distributed behavioral influences that either enhance or detract from their current environmental status or cross-cut their current environmental circumstances. For example, some people may have partial allegiances to multiple contradictory authorities (e.g., religion vs. science, dictatorship vs. democracy, etc.), which could create a cognitive dissonance within these people.

This further could create an opportunity for change, given their uncertainty in their future, and their willingness to seek change from their current conditions. Does this form an opportunity for external forces to intervene and pursue a psychological influence operation (PSYOP) to redirect the event toward a change beneficial to their interests, or does meddling at such a time backfire and strengthen the opposition's claims and perhaps tip the balance in our adversaries' favor? An autonomous dynamic planning and execution (ADP&E) framework has been built that includes variability in searching, selecting, and rewarding plans based on both individual and group behavior. Difficult questions such as the PSYOP question above can be addressed in modeling and simulation if centralized and distributed planning are successfully integrated within the model via this ADP&E framework. Such simulations will thus better model the balance of using both centralized and distributed planning-influence control and further clarify its sensitivity through simulating interactions among similar and differing social groups with differing parameter sets.

Background

Currently implemented cognitive approaches can be analyzed from a game theory perspective to determine their problem domain footprint. On the one hand, reactive planning algorithms, such as temporal difference reinforcement learning, can learn two-player stochastic games, such as Backgammon [3]. On the other hand, deep search algorithms, such as decision-tree search using alpha-beta pruning, can plan many moves ahead for a two-player deterministic game, such as chess [4]. However, note that these games are both two-player and fully observable, while the real world has many players and is partially observable. Further, hybrid solutions have been proposed to handle more complex real world and game problems [5]. We propose using a more powerful hybrid approach that integrates more realistic features of social interaction by extending an ADP&E approach with both a centralized and distributed planning capability.

https://ntrs.nasa.gov/search.jsp?R=20100012854 2018-09-08T04:13:40+00:00Z

An illustrative example will be investigated to better model and predict cumulative behavior amongst more cognitively realistic agents based on their interaction. The analyzed example will be akin to the situation surrounding the 2009 Iranian elections, where there was a ruling faction and a dissenting faction in conflict. The ruling faction has some centralized authority for control of individuals and the dissenting faction also has some centralized authority for control of individuals. In addition, the individuals have some intrinsic freedom to choose the centralized control or act more independently among themselves. There are pressures from both sides (rulers or dissenters) and in both directions (centralized and distributed).

We can enhance a current city simulation with some new features to better realize the behavior portrayed by the media. A small city has already been implemented for game playing multi-agent scenarios that includes movement models and line-of-sight. Agents can move based on prescribed waypoints and connections and observe based on proximity and line-of-sight. Communication connectivity can be added to the model for simulating the short-range (e.g., talking, signaling), mid-range (e.g., megaphone, video recording) and long-range (e.g., internet, cell phone) communication channels. The ruling authority can cut some communication as they did in Iran, but the dissenting faction can adapt their behavior by using alternative forms of communication. Also, peaceful and violent behavior can be exhibited from both sides, and scaling of confrontations can be investigated. However, individual and group behaviors and communications will be limited to both simplify and exemplify the approach.
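The three communication ranges and the effect of an authority cutting a channel can be illustrated with a small sketch. It is only an illustration under stated assumptions: the distance thresholds, the `Agent` class, and the `usable_channels` helper are hypothetical and do not come from the existing city simulation, and line-of-sight checks are left out.

```python
import math
from dataclasses import dataclass, field

# Hypothetical ranges in metres; the paper names the channel classes
# (short, mid, long) but gives no distances.
CHANNEL_RANGE_M = {"short": 20.0, "mid": 200.0, "long": math.inf}


@dataclass
class Agent:
    name: str
    x: float
    y: float
    # Channels this agent is equipped to use (assumed: all three by default).
    channels: set[str] = field(default_factory=lambda: {"short", "mid", "long"})


def usable_channels(sender: Agent, receiver: Agent, blocked: set[str]) -> set[str]:
    """Channels both agents share, that are not blocked, and that reach far enough."""
    dist = math.hypot(sender.x - receiver.x, sender.y - receiver.y)
    shared = sender.channels & receiver.channels
    return {c for c in shared - blocked if dist <= CHANNEL_RANGE_M[c]}


a = Agent("protester_a", 0.0, 0.0)
b = Agent("protester_b", 150.0, 0.0)

before = usable_channels(a, b, blocked=set())     # nothing cut yet
after = usable_channels(a, b, blocked={"long"})   # authority cuts long-range service
```

In this toy case the two agents are 150 m apart, so cutting the long-range channel leaves only the mid-range one, which is the kind of fallback the dissenting faction would have to plan around.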

A design and implementation strategy has been studied on the election defiance scenario in Iran. This paper describes an approach to implementing such a simulation and describes the benefits of such a system.

Approach

We describe here a five-step approach to designing, implementing, and demonstrating a social science simulation to study the causal precursors that drive the effects in the current situation in Tehran, where protests continue sporadically against the conservative regime.

1. A baseline is necessary to allow interaction among actors. This has been accomplished using technologies that form urban environments into game models [6]. Figure 1 provides a simple viewpoint of a small city model with a variety of connected waypoints (not illustrated).

2. The players of the simulation or game need to be identified. In the case of the Iranian situation, eight player types are identified and described.

3. Each player must have enough planning ability to interact with the other players in a similar environment and illustrate realism in thought processes and ability to reassess and change strategies. This can be accomplished by integrating intrinsic-, extrinsic-, and expressive-utility in each player, and this is described from each player's point of view. These utilities are implemented via a value function that is an integral part of the ADP&E system.

4. The interactions must be identified according to the current power structure and number of agents under each authoritarian player. The interactions are identified in Figure 2 and each interactive link will be described in detail.

5. Each player is identifiable as a planner in an ADP&E system, where their plans and perceptions impact all players involved simultaneously, and where higher-order effects are plausible and likely. In other words, within each planner, their parameters dictate their behavior and interaction in an attempt to maximize their own utility, while readjusting their plans to counter other planners' activities. Once implemented, parameters can be tuned to illustrate social behavior on a more complex scale.
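The value function mentioned in step 3 could take many forms, and the paper does not specify one. A minimal sketch, assuming a simple weighted sum of intrinsic, expressive, and reputational scores, is given below; the weights, plan names, and scores are hypothetical values chosen only to show how a player's parameters could drive plan selection.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class UtilityWeights:
    """Hypothetical per-player weights; the paper gives no numeric values."""
    intrinsic: float
    expressive: float
    reputational: float


def plan_value(scores: dict[str, float], w: UtilityWeights) -> float:
    """Weighted sum of the three utility scores for one candidate plan (assumed form)."""
    return (w.intrinsic * scores["intrinsic"]
            + w.expressive * scores["expressive"]
            + w.reputational * scores["reputational"])


def best_plan(candidates: dict[str, dict[str, float]], w: UtilityWeights) -> str:
    """Return the name of the candidate plan with the highest value."""
    return max(candidates, key=lambda name: plan_value(candidates[name], w))


# Illustrative numbers only: a guard-like player that weights reputation heavily.
guard = UtilityWeights(intrinsic=0.5, expressive=0.2, reputational=0.3)
candidates = {
    "hold_position": {"intrinsic": 0.6, "expressive": 0.5, "reputational": 0.9},
    "retreat": {"intrinsic": 0.4, "expressive": 0.1, "reputational": 0.0},
}
chosen = best_plan(candidates, guard)
```

Changing the weights changes which plan wins, which is the sense in which a planner's parameters dictate its behavior.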

Step 1: Urban Environmental Game Models

In previous work, an automated technique has been developed to generate an urban terrain movement model for computer gaming from a Compact Terrain DataBase (CTDB) and to increase the simulation speed of operations to allow much faster than real time operations. A programming interface for planning algorithms has also been defined to integrate multiple planners into the model. An example city model is shown in Figure 1.

Figure 1. Example City Game Board

To better understand the order of magnitude of this city model, Figure 1 shows a top-down picture of the terrain model used. The model is a small city of approximately 4 km x 5 km. More specifically, there are 3649 buildings with over 12,000 floor locations. There were over 31,000 waypoints generated for this terrain model.

Step 2: Major Game Players

There are five major players in the election situation in Iran, where the people are protesting against the election results, which appear to be drastically different from what prior polls indicated. The five major players in this conflict are: the supreme leader Ayatollah Ali Khamenei, who backs the government-declared incumbent President Mahmoud Ahmadinejad; the leading challenger Mir Hossein Mousavi; the general in charge of Iran's Revolutionary Guard, Mohammad Ali Jafari; the religious hierarchy; and the people.

The supreme leader is a 70-year-old cleric. He reigns over Iran's Islamic system as part pope, part commander in chief and as a one-man supreme court. President Mahmoud Ahmadinejad was the winner of the June 12, 2009 election. He is an ultra-conservative who has isolated Iran from the rest of the world through condemnations of the United States, Israel, and United Nations. The president is backed by the supreme leader and is a puppet, so he is not considered a player here. Mohammad Ali Jafari oversees the 125,000 members of Iran's military. This revolutionary guard (RG) takes direct orders and is considered the strong arm of the supreme leader. The religious hierarchy is under direction of the supreme leader as well, but some clerics are asking for reform and a recount of the election. Thus, we have broken this group into two groups, a clerical reform player and a clerical conservative player. The people are by far the largest player in this conflict. This group can be divided into three camps: the conservatives that side with the incumbent, the reformists that side with the reform party, and the people that want to remain neutral.

As an assumption, some players are considered as single agent planners, such as the supreme leader, the reform leader, and the religious clerics. The remaining two planners are the revolutionary guard and the people. These planners require many agents in order to show the escalation of the conflict. The proper ratio is not known but there are over 7 million people living in Tehran and only 125 thousand guards in the entire country. However, the guards are well trained and armed. There are more players in the Iranian election situation than the ones described here, but these eight should be enough to sufficiently simulate the conflict.

Step 3: Utility

To appreciate the escalation of the conflict in Iran, three measures of utility can be used for each player: intrinsic, expressive, and reputational utility. Intrinsic utility is the measure of what that player thinks is important and wants to accomplish. Expressive utility is the measure of how a player will deliver their message. Reputational utility is how the player perceives other players' opinion of their actions.

These players' metrics are shown in Table 1. This table is a qualitative description of the utility metrics. In an implementation, these metrics must be translated into some quantitative form that is reflected in their agents' actuators and sensors. For instance, the revolutionary guard's reputational utility is not to show fear, so they will never retreat when confronted, in order to maintain fear in the people.

Table 1. Players and Their Utility Metrics

Columns: Players \ Metrics; Intrinsic Utility; Expressive Utility; Reputation Utility

Players: Reform Party; Supreme Leader; Revolutionary Guard; Religious Hierarchy Conservatives; Religious Hierarchy Reformists; People Conservatives; People Neutral; People Reformists

Metric entries: Treated As God/Can Do Little Wrong; Adjust to People's Needs; Never Show Fear; Back Religious Beliefs; Defend Women/Debate/Dialogue; Zero Tolerance/Block Some Media; Keep Reform Movement Alive; Use Force; Teach Religious Obedience; Demand Recount/Reject Violence; Empathize/Gain People's Favor; Demand Others to Follow; Hard Working/Poorer Class; Avoid Areas of Conflict/Be Safe; Maintain Respect/Peace; Instigate Protests/Free Speech; Make People Subservient; Suppress Protests; Ignite Protests/Avoid Violence; Take Orders; Gain Power; Follow Religion Verbatim; Follow Leader and Keep Low Profile; Believe Reform Will Help Economy

Step 4: Interactions

Player interactions are too numerous to build a real model of the Iranian election conflict. However, a simplified interactive model can be created if assumptions are made. Figure 2 shows such a simplified representation. The interactions are labeled one to thirteen with interactions six and seven expanded for the multiple religious hierarchy players and people players, respectively.

Figure 2. Players' Interactions

Connection 1 in Figure 2 is the supreme leader contemplating plans to suppress the protests, his intrinsic utility goal. Connection 2 is the supreme leader giving direction to the religious hierarchy, especially Ayatollah Ahmad Jannati Massah who heads Iran's 12-member Guardian Council, which certifies election results and is closely allied with Khamenei. Connection 3 is the limitations imposed on the reform party by the supreme leader. Many times these directions are ignored, such as not attending a religious rally to honor the dead. Connection 4 is the interaction between the people and the supreme leader. The supreme leader demands no protests and many people defy him by attending rallies. Connection 5 is the supreme leader's use of the revolutionary guard (RG) to forcibly take to the streets and break up protests. Also, the RG acts as an agent, which attempts to cut communication by confiscating cell phones and detaining people. Connections 6a-c are the religious hierarchy contemplating plans to either gain power (reformist group) or maintain allegiance to the supreme leader (conservative group). Connections 7a-f are the interactions among the people. The conflict among the people escalated into violence in the first few days of protests. Connection 8 is the reform party contemplating plans as things unfold. For instance, the reform party decided to have large events centered on honoring the dead, which appealed to many people and created large crowds. Connection 9 was the interaction between the people and the reform party. They worked together to create large peaceful protests that further aggravated the supreme leader. Connection 10 is the mixed messages received from the clerics, some of whom sided with the supreme leader while others demanded a vote recount or void election. Connection 11 exemplifies the conflict between the protesters and the RG. Many people have been killed and arrested in this conflict, which is triggered by the unwillingness of both sides to back down. Connection 12 represents the RG contemplating maneuvers to break up protests, raid reformists' homes, confiscate communication devices, and detain uncooperative people. Finally, connection 13 is the RG's attempt to subdue the reform party, such as detaining them from going to rallies.

Step 5: ADP&E System

The proven approach used here has five tiers, from the inner cycle of dynamic planning, executing, and assessing plans for players and agents, through the highest level, adapting players' strategies using tournament play through multiple games. Figure 3 illustrates this ADP&E implementation framework.

This system concept was built from the ground up to be an efficient and modular approach. This approach has already been applied to two applications, the game RISK [7], and an urban search and rescue operation [8].


• First, the core cycle was developed as an action and response system, where individual action sequences are planned, executed, and assessed in various model environments, with varying projected expectations, over many cycles, and for all agents in the correct time sequence.

• At the second level, agents execute a particular plan, and each agent's action set is stored separately for modularity.

• Third, the player is the conceiver and conductor of a plan that encompasses all agent activities. A player has a set of parameters that determine its choice of planned actions, and how often to re-plan those actions.

• Fourth, a game is the domain where action sequences are executed in the model environments, which will always lead to a final goal state. The final goal state must be achievable, because human intervention is prohibited in this framework and a game only completes when the final goal is achieved.

• Fifth, tournaments of games are arranged, so that players can improve their parameter settings over the course of many tournaments. Through evaluating each player's progress, and modifying the best players' parameters, players can improve their play.
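The tournament tier in the last bullet can be sketched as a simple loop that ranks players by game score and refills the population with perturbed copies of the best ones. This is not the learning method of [7] or [8]; the parameter names, the stand-in `play_game` scoring function, and the Gaussian perturbation are all assumptions made for illustration.

```python
import random


def play_game(params: dict[str, float]) -> float:
    """Stand-in for a full game: the score peaks at hypothetical 'ideal' settings."""
    return -((params["replan_rate"] - 0.7) ** 2 + (params["search_depth"] - 4.0) ** 2)


def run_tournaments(population, rounds, rng, keep=2, step=0.5):
    """Repeat tournaments, keeping the best players and perturbing their parameters."""
    for _ in range(rounds):
        ranked = sorted(population, key=play_game, reverse=True)
        elite = ranked[:keep]
        # Refill the population with perturbed copies of the best players.
        children = [
            {k: v + rng.gauss(0.0, step) for k, v in rng.choice(elite).items()}
            for _ in range(len(population) - keep)
        ]
        population = elite + children
    return sorted(population, key=play_game, reverse=True)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
start = [
    {"replan_rate": rng.uniform(0, 1), "search_depth": rng.uniform(1, 8)}
    for _ in range(6)
]
initial_best = max(play_game(p) for p in start)
final = run_tournaments(start, rounds=20, rng=rng)
final_best = play_game(final[0])
```

Because the best players are carried over unchanged between tournaments, the top score in this sketch can never get worse from one round to the next.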

[Figure 3 diagram: core cycle of (1) planning, (2) execution, and (3) assessment, returning to (1)]

Figure 3. ADP&E Framework

At the heart of this approach is a core planning cycle for each of the eight players of the game. Figure 3 shows an illustration of this cycle. The core cycle has three components: (1) plan-generator (PG); (2) plan-executor (PE); and (3) plan-assessor (PA). The plan-generator is considered the search engine for contemplating plans for each player. PG strings together individual actions to form plans for each agent based on current perception of the situation. The utility metrics described above can be used to evaluate plans and choose the better ones. Formulations as to how to generate and choose plans have been examined on two very large planning problems and are described in two previous papers [7] [8]. The plan-executor executes the plans in time-sequential order. The plan-assessor estimates how well the remaining plan will execute given new observed information acquired from the environment while executing the plan. This cycle can be run after each executed action.

Figure 4. Planning Core Cycle

The three components use three objects that are manipulated and shared among the components. These three objects are the (1) plans, (2) models, and (3) expectations. Plans are generated by PG, executed by PE and assessed by PA. All players can be run in separate threads and execute independently. City models are used in PG to predict future states, are used in PE to observe the real states, and are used in PA to observe whether expectations will be met. The models used in PG and PA are virtual-state city models, which approximate the real-state model used in PE. The real-state model is a real-world model, where a plan is executed. Virtual-state models do not know the real states until observed and are initialized to reasonable expectations. Thus, there are nine perceptions of the city model based on which planner is under consideration. There is one virtual model for each planner and a real-world model where all planners can execute their actions. Expectations are the measure of how well a plan achieves a desired goal (utility metrics), such as breaking up a protest. Expectations are projected both by the generated plan in PG and by the plan used in PA. The two expectations are compared to see if the expectations projected in PA still meet or exceed the originally generated plan expectations projected in PG. Each agent has an expectation for its plan. If expectations are met to a prescribed degree, a plan is retained; otherwise a plan is reformulated in PG.
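The interplay of PG, PE, and PA can be sketched for a single agent as follows. This is a toy illustration under stated assumptions, not the authors' implementation: the waypoint names, the two candidate routes, the `expectation` formula, and the `tolerance` threshold (standing in for the "prescribed degree") are all hypothetical, and the sketch assumes at least one route stays open.

```python
REAL_BLOCKED = {"square"}  # real-state model: ground truth, unknown to the planner
ROUTES = {                 # hypothetical candidate plans from start to the rally
    "direct": ["street", "square", "rally"],
    "detour": ["street", "alley", "park", "rally"],
}


def expectation(steps, believed_blocked):
    """Expected payoff of a step sequence under the planner's virtual-state model."""
    if any(s in believed_blocked for s in steps):
        return 0.0
    return 1.0 / (1 + len(steps))  # assumed form: shorter plans score higher


def plan_generator(visited, believed_blocked):
    """PG: choose the remaining route with the best expectation."""
    options = [r[len(visited):] for r in ROUTES.values() if r[:len(visited)] == visited]
    best = max(options, key=lambda steps: expectation(steps, believed_blocked))
    return best, expectation(best, believed_blocked)


def run_core_cycle(tolerance=0.9):
    visited, believed_blocked, log = [], set(), []
    plan, expected = plan_generator(visited, believed_blocked)
    while plan:
        nxt = plan[0]
        # Observation of the real state updates the virtual-state model.
        if nxt in REAL_BLOCKED:
            believed_blocked.add(nxt)
        # PA: re-estimate the remaining plan and compare with PG's expectation.
        assessed = expectation(plan, believed_blocked)
        if assessed < tolerance * expected:
            plan, expected = plan_generator(visited, believed_blocked)
            log.append("replan")
            continue
        # PE: execute the next action in time-sequential order.
        visited.append(nxt)
        plan = plan[1:]
        expected = expectation(plan, believed_blocked) if plan else 0.0
        log.append(nxt)
    return visited, log


visited, log = run_core_cycle()
```

Here the agent starts on the direct route, observes that the square is blocked, and reformulates its plan in PG, so the cycle runs after each executed action as described above.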

If implemented, such a simulation tool can provide three major advantages. First, tuning parameters is crucial to matching historical records. The versatility in choosing alternative actions under uncertainty (e.g., reformist people were younger and more educated, using high tech devices for communications, something the leaders did not consider in initial plans), the timing of actions/responses (e.g., the government lost credibility when it declared the election result valid without taking any time to investigate), and the amount of reassessment and replanning (e.g., people switched to alternative forms of communication when services were cut, such as Twitter and cell phones) of each of the eight players are critical. These are just three instances where agile planning is used in real world social events, and there are many other areas to investigate. Thus, tuning planner parameters in key aspects is essential to matching real world scenarios. The tuning of parameters can be learned via developed techniques already established for two other applications [7] [8].

The second advantage is the use of an ADP&E system to predict how real-time events will unfold. When a model has been developed that accurately predicts the evolution of historical events for a culture as described above, it can be tuned to follow the course of current events and could predict their future development with less uncertainty. These predictions can be further fine-tuned to account for shifting alliances and priorities. Once a baseline of activity has been established, the ability to identify underlying causes such as those that lead to unexpected results is valuable information in itself.

The third advantage of such a simulation tool is the ability to inject possible outside influences into the model and see if and how they alter the course of events. Models such as these could self-train to produce the most desirable effects with the smallest perturbations. Further, trained models may be examined to determine which observations of the evolving environment are most useful for determining whether plan expectations are being met.

Summary

This paper has proposed the application of ADP&E to modeling social influence in a combined centralized and distributed context. Individual agents have partial allegiances to one or more, potentially conflicting, central authorities, as well as their own internal goals and principles. Agents are not simply reactive, but proactively plan and execute action sequences in these contexts. ADP&E can provide a means of modeling the social forces at work within an individual agent, as well as the shifting allegiances and conflicts among agents. Into this complex, dynamic hierarchy, various PSYOP interventions can be injected, and the micro and macro reactions of the system observed.

The unrest surrounding the Iranian elections in the summer of 2009 has been used as an illustrative example of ADP&E modeling. The defining elements of that situation have been deconstructed into items and relationships prerequisite for the formation of a model. Application of ADP&E to that model has served to explain the features of ADP&E, and describe its benefits for such social influence models.

References:

1. Eric Bonabeau, Marco Dorigo and Guy Theraulaz. Swarm Intelligence: From Natural to Artificial Systems, 1999.

2. Ghallab, Malik; Nau, Dana S.; Traverso, Paolo. Automated Planning: Theory and Practice, Morgan Kaufmann, 2004.

3. G. Tesauro, "Programming backgammon using self-teaching neural nets," Artificial Intelligence, V. 134, 2002, pp. 181-199.

4. R. Levinson, F. H. Hsu, J. Schaeffer, T. A. Marsland, & D. E. Wilkins, "The role of chess in artificial-intelligence research," ICCA Journal, V. 14, N. 3, 1991, pp. 153-161.

5. J. Vaccaro, C. Guest, "Planning an Endgame Move Set for the Game RISK: A Comparison of Search Algorithms," IEEE Transactions on Evolutionary Computation, Vol. 9, No. 6, December 2005, pp. 641-652.

6. J. Vaccaro, C. Guest, "Modeling Urban Terrain for Simulating Search and Rescue Operations to Train Artificial Planners," The 17th IASTED International Conference on Applied Simulation and Modelling (ASM'08), Corfu, Greece, June 2008.

7. J. Vaccaro, C. Guest, "Learning Multiple Search, Utility and Goal Parameters for the Game RISK," IEEE World Congress on Computational Intelligence (WCCI'06), Vancouver, Canada, July 2006.

8. J. Vaccaro, C. Guest, "Automated Dynamic Planning and Execution for a Partially Observable Game Model: Tsunami City Search and Rescue," IEEE World Congress on Computational Intelligence (WCCI'08), Hong Kong, China, June 2008.
