Action and Behavior: A Free-Energy Formulation (Friston et al.)

Biol Cybern
DOI 10.1007/s00422-010-0364-z

ORIGINAL PAPER

Action and behavior: a free-energy formulation

Karl J. Friston · Jean Daunizeau · James Kilner · Stefan J. Kiebel

Received: 30 October 2009 / Accepted: 19 January 2010
© The Author(s) 2010. This article is published with open access at Springerlink.com

Abstract: We have previously tried to explain perceptual inference and learning under a free-energy principle that pursues Helmholtz's agenda to understand the brain in terms of energy minimization. It is fairly easy to show that making inferences about the causes of sensory data can be cast as the minimization of a free-energy bound on the likelihood of sensory inputs, given an internal model of how they were caused. In this article, we consider what would happen if the data themselves were sampled to minimize this bound. It transpires that the ensuing active sampling or inference is mandated by ergodic arguments based on the very existence of adaptive agents. Furthermore, it accounts for many aspects of motor behavior; from retinal stabilization to goal-seeking. In particular, it suggests that motor control can be understood as fulfilling prior expectations about proprioceptive sensations. This formulation can explain why adaptive behavior emerges in biological agents and suggests a simple alternative to optimal control theory. We illustrate these points using simulations of oculomotor control and then apply the same principles to cued and goal-directed movements. In short, the free-energy formulation may provide an alternative perspective on motor control that places it in an intimate relationship with perception.

The free-energy principle is an attempt to explain the structure and function of the brain, starting from the fact that we exist: this fact places constraints on our interactions with the world, which have been studied for years in evolutionary biology and systems theory. However, recent advances in statistical physics and machine learning point to a simple scheme that enables biological systems to comply with these constraints. If one looks at the brain as implementing this scheme (minimizing a free-energy bound on disorder), then many aspects of its anatomy and physiology start to make sense. In this article, we show that free-energy can be reduced by selectively sampling sensory inputs. This leads to adaptive responses and provides a new view of how movement control might work in the brain. The main conclusion is that we only need to have expectations about the sensory consequences of moving in order to elicit movement. This means that we can replace the notion of desired movements with expected movements and understand action in terms of perceptual expectations.

K. J. Friston · J. Daunizeau · J. Kilner · S. J. Kiebel
The Wellcome Trust Centre for Neuroimaging, Institute of Neurology, University College London, 12 Queen Square, London, WC1N 3BG, UK
e-mail:

Keywords: Computational · Motor · Control · Bayesian · Hierarchical · Priors

List of symbols

ϑ = {x̃, ṽ, θ, γ}: Unknown causes of sensory input; variables in bold denote true values and those in italics denote variables assumed by the agent or model
x̃(t) = [x, x′, x″, ...]ᵀ, with dx̃/dt = f(x̃, ṽ, θ) + w̃: Generalised hidden states that act on an agent. These are time-varying quantities that include all high-order temporal derivatives; they represent a point in generalised coordinates of motion that encodes a path or trajectory
ṽ(t) = [v, v′, v″, ...]ᵀ: Generalised forces or causal states that act on hidden states
s̃(t) = g(x̃, ṽ, θ) + z̃: Generalised sensory states caused by hidden states
θ ⊃ {θ1, θ2, ...}: Parameters of the equations of motion and sensory mapping
γ ⊃ {γs, γx, γv}: Parameters of the precision Π(γi) : i ∈ {s, x, v} of random fluctuations
w̃(t) = [w, w′, w″, ...]ᵀ: Generalised random fluctuations of the motion of hidden states
z̃(t) = [z, z′, z″, ...]ᵀ: Generalised random fluctuations of sensory states
ñ(t) = [n, n′, n″, ...]ᵀ: Generalised random fluctuations of causal states
Πi := Π(γi) = Σ(γi)⁻¹: Precisions or inverse covariances of generalised random fluctuations
g(x̃, ṽ, θ), f(x̃, ṽ, a, θ): Sensory mapping and equations of motion generating sensory states
g(x̃, ṽ, θ), f(x̃, ṽ, θ): Sensory mapping and equations of motion modelling sensory states
a(t): Policy: a scalar function of generalised sensory and internal states
p(x̃|m), p(s̃|m): Ensemble densities; the density of the hidden and sensory states of agents at equilibrium with their environment
D(q||p) = ⟨ln(q/p)⟩q: Kullback-Leibler divergence or cross-entropy between two densities
⟨·⟩q: Expectation or mean of · under the density q
m: Model or agent; entailing the form of a generative model
H(X) = −⟨ln p(x̃|m)⟩p, H(S) = −⟨ln p(s̃|m)⟩p: Entropy of generalised hidden and sensory states
−ln p(s̃|m): Surprise or self-information of generalised sensory states
F(s̃, μ) ≥ −ln p(s̃|m): Free-energy bound on surprise
q(ϑ|μ): Recognition density on causes ϑ with sufficient statistics μ
μ = {μ̃(t), μθ, μγ}, μ̃ = {μ̃x, μ̃v}: Conditional or posterior expectation of the causes; these are the sufficient statistics of the Gaussian recognition density
η̃(t) = [η, η′, η″, ...]ᵀ: Prior expectation of generalised causal states
ξi = Π(γi) ε̃i : i ∈ {s, x, v}: Precision-weighted generalised prediction errors
ε̃s = s̃ − g(μ̃); ε̃x = Dμ̃x − f(μ̃); ε̃v = μ̃v − η̃: Generalised prediction errors on sensory states, the motion of hidden states, and forces or causal states

1 Introduction

This article looks at motor control from the point of view of perception; namely, the fitting or inversion of internal models of sensory data by the brain. Critically, the nature of this inversion lends itself to a relatively simple neural network implementation that shares many formal similarities with real cortical hierarchies in the brain. The idea that the brain uses hierarchical inference has been established for years (Mumford 1992; Rao and Ballard 1998; Friston 2005; Friston et al. 2006) and provides a nice explanation for the hierarchical organization of cortical systems. Critically, hierarchical inference can be formulated as a minimization of free-energy, where free-energy bounds the surprise inherent in sensory data, under a model of how those data were caused. This leads to the free-energy principle, which says that everything in the brain should change to minimize free-energy. We will see below that free-energy can be minimized by changing perceptual representations so that they approximate a posterior or conditional density on the causes of sensations. In short, the free-energy principle entails the Bayesian brain hypothesis (Knill and Pouget 2004; Ballard et al. 1983; Dayan et al. 1995; Lee and Mumford 2003; Rao and Ballard 1998; Friston 2005; Friston and Stephan 2007). However, the free-energy principle goes further than this. It suggests that our actions should also minimize free-energy (Friston et al. 2006): we are open systems in exchange with the environment; the environment acts on us to produce sensory impressions, and we act on the environment to change its states. This exchange rests upon sensory and effector organs (like photoreceptors and oculomotor muscles).
If we change the environment or our relationship to it, then sensory input changes. Therefore, action can reduce free-energy by changing the sensory input predicted, while perception reduces free-energy by changing predictions. In this article, we focus on the implications of suppressing free-energy through action or behavior.

Traditionally, the optimization of behavior is formulated as maximizing value or expected reward (Rescorla and Wagner 1972; Sutton and Barto 1981). This theme is seen in cognitive psychology, in reinforcement learning models (Rescorla and Wagner 1972); in computational neuroscience and machine learning as variants of dynamic programming, such as temporal difference learning (Sutton and Barto 1981; Watkins and Dayan 1992; Friston et al. 1994; Daw and Doya 2006); and in behavioral economics as expected utility theory (Camerer 2003). In computational motor control (Wolpert and Miall 1996; Todorov and Jordan 2002; Todorov 2006; Shadmehr and Krakauer 2008), it appears in the form of optimal control theory. In all these treatments, the problem of optimizing behavior is reduced to optimizing value (or, conversely, minimizing expected loss or cost). Effectively, this prescribes an optimal control policy in terms of the value that would be expected by pursuing that policy.

Our studies suggest that maximizing value may represent a slight misdirection in explaining adaptive behavior, because the same behaviors emerge in agents that minimize free-energy (Friston et al. 2009). In brief, the minimization of free-energy provides a principled basis for understanding both action and perception, which replaces the optimal policies of control theory with prior expectations about the trajectory of an agent's states. In Sect. 2, we review the free-energy principle and active inference. In Sect. 3, we show how active inference can be used to model reflexive and intentional behavior. This section deals with visual and proprioceptive models to demonstrate the key role of prior expectations in prescribing movement. In Sect. 4, we consider the integration of visual and proprioceptive signals in finessing the control of cued reaching movements. Section 5 addresses how these prior expectations could be learned and illustrates the acquisition of goal-directed movements using the mountain-car problem (Sutton 1996; see also Friston et al. 2009). Section 6 revisits the learning of priors to prescribe autonomous behavior. We conclude by discussing the relationship between active inference and conventional
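The action-perception cycle sketched in the introduction (environment produces sensations; perception updates expectations; action changes the environment to match them) can be caricatured in a few lines. This is a deliberately minimal, scalar sketch, not the paper's scheme: the paper works in generalised coordinates with hierarchical models, whereas here a single hidden state `x` (think retinal displacement), one expectation `mu`, one action `a`, and fixed unit precisions are assumed, and all variable names are illustrative.

```python
import numpy as np

# Toy "retinal stabilization" under a free-energy gradient descent.
# Generative process: sensation s = x + noise. The agent predicts s = mu
# and holds a prior expectation eta = 0 ("the target should be foveated").
# Free energy is approximated by precision-weighted squared prediction
# errors; perception and action each descend its gradient.

rng = np.random.default_rng(0)

x = 1.0                  # true hidden state (environment)
mu = 0.0                 # conditional expectation (perception)
a = 0.0                  # action (e.g., oculomotor velocity)
eta = 0.0                # prior expectation on the cause
pi_s, pi_v = 1.0, 1.0    # precisions of sensory and prior errors
k = 0.1                  # integration step

for _ in range(1000):
    s = x + 0.01 * rng.standard_normal()  # noisy sensory sample
    eps_s = s - mu                        # sensory prediction error
    eps_v = mu - eta                      # prior prediction error
    # Perception: descend dF/dmu = -pi_s*eps_s + pi_v*eps_v
    mu += k * (pi_s * eps_s - pi_v * eps_v)
    # Action: descend dF/da = pi_s*eps_s * ds/da (ds/da = 1 in this toy setup)
    a -= k * pi_s * eps_s
    # Environment: action moves the hidden state
    x += k * a

print(f"final x = {x:.3f}, mu = {mu:.3f}")  # both settle near the prior (zero)
```

Note that the agent never represents a "desired movement": the trajectory is induced entirely by the prior expectation `eta`, with action suppressing the sensory prediction error that the prior creates. This mirrors, in miniature, the paper's replacement of desired movements with expected ones.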

