# Friston Action and Behavior a Free-Energy Formulation

Post on 23-Oct-2014

24 views

Embed Size (px)

TRANSCRIPT

Biol CybernDOI 10.1007/s00422-010-0364-zORIGINAL PAPERAction and behavior: a free-energy formulationKarl J. Friston Jean Daunizeau James Kilner Stefan J. KiebelReceived: 30 October 2009 / Accepted: 19 January 2010 The Author(s) 2010. This article is published with open access at Springerlink.comAbstract We have previously tried to explain perceptualinference and learning under a free-energy principle that pur-sues Helmholtzs agenda to understand the brain in terms ofenergy minimization. It is fairly easy to show that makinginferences about the causes of sensory data can be cast asthe minimization of a free-energy bound on the likelihoodof sensory inputs, given an internal model of how they werecaused. In this article, we consider what would happen ifthe data themselves were sampled to minimize this bound.It transpires that the ensuing active sampling or inference ismandated by ergodic arguments based on the very existenceof adaptive agents. Furthermore, it accounts for many aspectsof motor behavior; fromretinal stabilization to goal-seeking.In particular, it suggests that motor control can be under-stood as fullling prior expectations about proprioceptiveThe free-energy principle is an attempt to explain the structure andfunction of the brain, starting from the fact that we exist: This factplaces constraints on our interactions with the world, which have beenstudied for years in evolutionary biology and systems theory. However,recent advances in statistical physics and machine learning point to asimple scheme that enables biological systems to comply with theseconstraints. If one looks at the brain as implementing this scheme(minimizing a free-energy bound on disorder), then many aspects ofits anatomy and physiology start to make sense. In this article, weshow that free-energy can be reduced by selectively sampling sensoryinputs. This leads to adaptive responses and provides a new view ofhow movement control might work in the brain. The main conclusionis that we only need to have expectations about the sensoryconsequences of moving in order to elicit movement. This means wethat can replace the notion of desired movements with expectedmovements and understand action in terms of perceptual expectations.K. J. Friston (B) J. Daunizeau J. Kilner S. J. KiebelThe Wellcome Trust Centre for Neuroimaging, Institute of Neurology,University College London, 12 Queen Square, London,WC1N 3BG, UKe-mail: k.friston@l.ion.ucl.ac.uksensations. This formulation can explain why adaptivebehavior emerges in biological agents and suggests a sim-ple alternative to optimal control theory. We illustrate thesepoints using simulations of oculomotor control and thenapply to same principles to cued and goal-directed move-ments. In short, the free-energy formulation may provide analternative perspective on the motor control that places it inan intimate relationship with perception.Keywords Computational Motor Control Bayesian Hierarchical PriorsList of symbols { x, v, , }, Unknown causes of sensory { x, v, , } input; variables in bold denotetrue values and those in italicsdenote variables assumed bythe agent or model x(t ) = [x, x

, x

, . . .]T, Generalised hidden-states that x(t ) = f ( x, v, ) + w act on an agent. These aretime-varying quantities thatinclude all high-order temporalderivatives; they represent apoint in generalised coordi-nates of motion that encodesa path or trajectory v(t ) = [v, v

, v

, . . .]TGeneralised forces or causalstates that act on hiddenstates s(t ) = g( x, v, ) + z Generalised sensory statescaused by hidden states {1, 2, . . .} Parameters of the equa-tions of motion and sen-sory mapping123Biol Cybern {s, x, v} Parameters of the preci-sion of random uctuations(i) : i s, x, v w(t ) = [w, w

, w

, . . .]TGeneralised random uctua-tions of the motion of hid-den states z(t ) = [z, z

, z

, . . .]TGeneralised random uctua-tions of sensory states n(t ) = [n, n

, n

, . . .]TGeneralised random uctua-tions of causal states

i:= (i) = (i)1Precisions or inverse covar-iances of generalised ran-dom uctuationsg( x, v, ), f( x, v, a, ) Sensory mapping and equa-tions of motion generatingsensory statesg( x, v, ), f ( x, v, ) Sensory mapping and equa-tions of motion modelingsensory statesa(t ) Policy: a scalar functionof generalised sensory andinternal statesp( x |m), p( s |m) Ensemble densities; the den-sity of the hidden and sen-sory states of agents at equi-librium with their environ-ment.D(q||p) = _ln(q_p)_q Kullback-Leibler divergenceor cross-entropy betweentwo densities

q Expectation or mean of underthe density qm Model or agent; entailing theform of a generative modelH(X) = ln p( x|m)p Entropy of generalised hid-den and sensory states H(S) = ln p( s|m)pln p( s|m) Surprise or self-informationof generalised sensorystatesF( s, ) ln p( s|m) Free-energy bound on sur-priseq(|) Recognition density oncauses with sufcientstatistics = { (t ), , } Conditional or posterior = { x, v} expectation of the causes; these are the sufcientstatistics of the Gaussianrecognition density (t ) = [,

,

, . . .]TPrior expectation of general-ised causal statesi = i i : i s, x, v Precision-weighted general-ised prediction errors = s = s g() x = D x f () v = v Generalised prediction erroron sensory states, themotion of hidden states andforces or causal states.1 IntroductionThis article looks at motor control from the point of view ofperception; namely, the tting or inversion of internal mod-els of sensory data by the brain. Critically, the nature of thisinversion lends itself to a relatively simple neural networkimplementationthat shares manyformal similarities withrealcortical hierarchies in the brain. The idea that the brain useshierarchical inference has been established for years (Mum-ford 1992; Rao and Ballard 1998; Friston 2005; Friston etal. 2006) and provides a nice explanation for the hierarchi-cal organization of cortical systems. Critically, hierarchicalinference canbe formulatedas a minimizationof free-energy;where free-energy bounds the surprise inherent in sensorydata, under a model of how those data were caused. Thisleads to the free-energy principle, which says that everythingin the brain should change to minimize free-energy. We willsee below that free-energy can be minimized by changingperceptual representations so that they approximate a poster-ior or conditional densityonthe causes of sensations. Inshort,the free-energy principle entails the Bayesian brain hypoth-esis (Knill and Pouget 2004; Ballard et al. 1983; Dayan et al.1995; Lee and Mumford 2003; Rao and Ballard 1998; Friston2005; Friston and Stephan 2007). However, the free-energyprinciple goes further than this. It suggests that our actionsshould also minimize free-energy (Friston et al. 2006): Weare open systems in exchange with the environment; the envi-ronment acts on us to produce sensory impressions, and weact on the environment to change its states. This exchangerests upon sensory and effector organs (like photoreceptorsand oculomotor muscles). If we change the environment orour relationship to it, then sensory input changes. Therefore,action can reduce free-energy by changing the sensory inputpredicted, while perception reduces free-energy by changingpredictions. In this article, we focus in the implications ofsuppressing free-energy through action or behavior.Traditionally, the optimization of behavior is formulatedas maximizing value or expected reward (Rescorla and Wag-ner 1972; Sutton and Barto 1981). This theme is seenin cognitive psychology, in reinforcement learning models(Rescorla and Wagner 1972); in computational neuroscienceand machine-learning as variants of dynamic programming,such as temporal difference learning (Sutton and Barto 1981;123Biol CybernWatkins and Dayan 1992; Friston et al. 1994; Daw and Doya2006), and in behavioral economics as expected utility theory(Camerer 2003). In computational motor control (Wolpertand Miall 1996; Todorov and Jordan 2002; Todorov 2006;Shadmehr and Krakauer 2008), it appears in the form ofoptimal control theory. In all these treatments, the problemofoptimizing behavior is reduced to optimizing value (or, con-versely, minimizing expected loss or cost). Effectively, thisprescribes an optimal control policy in terms of the value thatwould be expected by pursuing that policy.Our studies suggest that maximizing value may representa slight misdirectioninexplainingadaptive behavior, becausethe same behaviors emerge in agents that minimize free-energy (Friston et al. 2009). In brief, the minimization of free-energy provides a principled basis for understanding bothaction and perception, which replaces the optimal polices ofcontrol theory with prior expectations about the trajectory ofan agents states. In Sect. 2, we reviewthe free-energy princi-ple and active inference. In Sect. 3, we showhowactive infer-ence can be used to model reexive and intentional behavior.This section deals with visual and proprioceptive models todemonstrate the key role of prior expectations in prescribingmovement. In Sect. 4, we consider the integration of visualand proprioceptive signals in nessing the control of cuedreaching movements. Section5 addresses how these priorexpectations could be learned and illustrates the acquisitionof goal-directed movements using the mountain-car problem(Sutton 1996; see also Friston et al. 2009). Section6 revisitsthe learning of priors to prescribe autonomous behavior. Weconclude by discussing the relationship between active infer-ence and conventional treatments of computational motorcontrol.2 The free-energy principleIn this section, we try to establish the basic motivation forminimizing free-energy. This section rehearses material thatwe have used previously to understand perception. It is pre-sented here with a special focus how action maintains a sta-tionary relationship with the environment; and is developedmore formally than in previous descriptions (e.g., Friston andStephan 2007). The arguments for how perception decreasesfree-energy can be found in the neurobiological (Friston et al.2006; Friston 2008) and technical (Friston et al. 2008) litera-ture. These arguments are reviewed briey but only to a depththat is sufcient to understand the simulations in subsequentsections.What is free-energy? In statistics and machine learning,free-energy is an information theory quantity that bounds theevidence for a model of data (Hinton and von Camp 1993;MacKay 1995; Neal and Hinton 1998). Here, the data aresensory inputs, and the model is encoded by the brain. Moreprecisely, free-energy is greater than the surprise (negativelog-probability) of some data, given a model of how thosedata were generated. In fact, under simplifying assumptions(see below), it is just the amount of prediction error. It iscalled free-energy because of formal similarities with ther-modynamic free-energy in statistical physics; where energiesare just negative log-probabilities (surprise) and free-energyis a boundonsurprise. Inwhat follows, we describe the natureof free-energy, and show why it is minimized by adaptiveagents.We start with the premise that adaptive agents or pheno-types must occupy a limited repertoire of physical states. Fora phenotype to exist, it must possess dening characteristicsor traits; both in terms of its morphology and exchange withthe environment. These traits essentially limit the agent toa bounded region in the space of all states it could be in.Once outside these bounds, it ceases to possess that trait (cf.,a sh out of water). This speaks to self-organized autopoi-etic interactions with the world that ensure these bounds arerespected (cf., Maturana and Varela 1972). Later, we for-malize this notion in terms of the entropy or average sur-prise associated with a probability distribution on the statesan agent experiences. The basic idea is that adaptive agentsmust occupy a compact and bounded part of statespace and,therefore, avoid surprising states (cf., a sh out of watersic). In terms of dynamical system theory, this set of statesis a random attractor (Crauel and Flandoli 1994). Given thisdening attribute of adaptive agents, we will look at howagents might minimize surprise and then consider what thismeans, in terms of their action and perception.The free-energy principle rests on an ensemble densityp( x|m) on generalized states, x(t ) = [x, x

, x

, . . .]T, whichaffect an agent, m. Generalized states cover position, veloc-ity, acceleration, jerk, and so on (Friston 2008). This meansthat states include the position or conguration of the agent,its motion, and all inuences acting on the agent: i.e., phys-ical forces like gravity; thermodynamic states like ambienttemperature, or physiological states such as hypoglycemia.Strictly speaking; the dimensionality of generalized states isinnite because the generalized motion of each state existsto innite order. However, in practice one can ignore high-order temporal derivatives because their precision vanishesand they contain no useful information (i.e., their disper-sion gets very large; see Friston 2008; Friston et al. 2008 fordetails). In the simulations below, we only used generalizedstates up to sixth order. In what follows, x(t ) refers to a statevector and x(t ) denotes the corresponding generalized state(i.e., the state and its generalized motion). Note that the den-sity p( x|m) is conditioned on the agent or model. We will seelater that the model entails formal constraints on the motionof an agents states (i.e., its state-transitions or policy). Thismeans the ensemble density is specic to each class of agent.The ensemble density can be regarded as the probabil-ity of nding an agent in a particular state, when observed123Biol Cybernon multiple occasions or, equivalently, the density of a largeensemble of agents at equilibrium with their environment.Critically, for an agent to exist, the ensemble density shouldhave low entropy. This ensures that agents occupy a lim-ited repertoire of states because a density with low entropyconcentrates its mass in a small subset of statespace (i.e.,its attractor). This places an important constraint on thestates sampled by an agent; it means agents must some-how counter the dispersive effects of random forces, whichincrease entropy. This increase is a consequence of the uc-tuation theorem (Evans 2003), which generalizes the sec-ond law of thermodynamics and says that the probability ofentr...