the free-energy principle: a rough guide to the brain? karl friston presented by : gokrna poudel

The free-energy principle: a rough guide to the brain?

Karl Friston

Presented by: Gokrna Poudel

Guiding questions

Q1: Explain the following terms: KL divergence, entropy, ergodic, free energy, Bayesian surprise, generative model, recognition density, sufficient statistics.

Q2: Explain the free-energy principle of the brain, i.e. the fact that self-organizing biological agents resist a tendency to disorder and therefore minimize the entropy of their sensory states. Give various forms of free energy.

Q3: How can action reduce free energy? How can perception reduce free energy? How can active sampling of the sensorium contribute to the free energy reduction?

Q4: Explain the neurobiological architecture for implementing the free-energy principle in Figure 1 in Box 1. Describe each of the modules in the figure and their functions as well as the quantities that define the free energy.

Q5: Describe the sufficient statistics representing a hierarchical dynamic model of the world in the brain in Figure 1 in Box 2. How are they related with each other? How are the changes in synaptic activity, connectivity, and gain involved with perceptual inference, learning and attention?

Q6: Formulate and describe the neuronal architecture for the hierarchical dynamic model in Figure 1 in Box 3. How are the forward prediction errors computed? How are the backward predictions made? What are the sources of the forward and backward connections in terms of brain anatomy?

Q7: A key implementational issue is how the brain encodes the recognition density. There are two forms of probabilistic neuronal codes: free forms and fixed forms. Give examples of each form and explain them.

Q8: What kinds of optimization schemes does the brain use? Does it use deterministic search on free energy to optimize action and perception? Or, does it use stochastic search? What is your opinion?


KL (Kullback-Leibler) divergence:

Also known as information divergence, information gain or relative entropy (in some literature, cross entropy), the KL divergence is a non-commutative, i.e. non-symmetric, measure of the difference between two probability distributions P and Q with densities p(x) and q(x):

$$D_{KL}(P \,\|\, Q) = \int p(x) \log \frac{p(x)}{q(x)} \, dx$$
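A minimal sketch of the discrete analogue of the KL divergence, to make the asymmetry concrete. The two coin distributions and the function name `kl_divergence` are illustrative assumptions, not taken from the paper.

```python
import math

def kl_divergence(p, q):
    """Discrete KL divergence D_KL(P || Q) in nats.

    p and q are probabilities over the same outcomes.
    Outcomes with p[i] == 0 contribute nothing; q[i] must be > 0 wherever p[i] > 0.
    """
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

p = [0.5, 0.5]  # fair coin (illustrative)
q = [0.9, 0.1]  # biased coin (illustrative)

print("D_KL(P || Q) =", kl_divergence(p, q))
print("D_KL(Q || P) =", kl_divergence(q, p))  # differs: the measure is not symmetric
print("D_KL(P || P) =", kl_divergence(p, p))  # zero when the distributions agree
```

Swapping the arguments changes the value, which is why the KL divergence is not a distance in the usual sense.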


Ergodic: A process is ergodic if its long-term time average converges to its ensemble average. Ergodic processes that evolve for a long time forget their initial states.

Entropy: The average surprise of outcomes sampled from a probability density. A density with low entropy means that, on average, the outcome is relatively predictable.

The second law of thermodynamics states that the entropy of closed systems increases with time. Entropy is a measure of disorder or, more simply, the number of ways the elements of a system can be rearranged.
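A small sketch of both definitions. The first part computes entropy as average surprise for a predictable and an unpredictable density. The second part uses a symmetric two-state Markov chain (an assumed toy process, not from the paper) whose ensemble average occupancy of state 1 is 0.5, and shows that the time average of one long trajectory approaches that value whatever the initial state.

```python
import math
import random

def entropy(p):
    """Shannon entropy in nats: the average surprise -ln p(x)."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0)

predictable = [0.99, 0.01]    # illustrative low-entropy density
unpredictable = [0.5, 0.5]    # illustrative high-entropy density
print("low entropy: ", entropy(predictable))
print("high entropy:", entropy(unpredictable))

def time_average(start, steps, rng, stay=0.9):
    """Fraction of time spent in state 1 along a single long trajectory.

    `stay` is the (assumed) probability of remaining in the current state.
    """
    state, visits = start, 0
    for _ in range(steps):
        if rng.random() > stay:
            state = 1 - state
        visits += state
    return visits / steps

rng = random.Random(0)
# Both runs approach the ensemble average of 0.5: the start state is forgotten.
print("started in state 0:", time_average(0, 200_000, rng))
print("started in state 1:", time_average(1, 200_000, rng))
```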


Generative model: A forward model, i.e. a probabilistic mapping from causes to observed consequences (data). It is usually specified in terms of the likelihood of getting some data given their causes (the parameters of a model) and priors on those parameters.

Recognition density: Also called the approximating conditional density, it is an approximate probability distribution of the causes of data. It is the product of inference, i.e. of inverting a generative model.

Sufficient statistics: Quantities that are sufficient to parameterize a probability density (e.g. the mean and covariance of a Gaussian density).
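The three terms can be tied together with a one-dimensional Gaussian sketch. Everything here is an assumption for illustration: a Gaussian prior on a single cause `v`, Gaussian sensory noise, and the chosen precisions. In this special case the model can be inverted exactly, so the recognition density equals the true posterior; in general it is only an approximation.

```python
import random

PRIOR_MEAN, PRIOR_PRECISION = 0.0, 1.0  # prior on the cause v (assumed values)
SENSORY_PRECISION = 4.0                 # precision of sensory noise (assumed value)

def generate(rng):
    """Generative (forward) model: sample a cause, then data given that cause."""
    v = rng.gauss(PRIOR_MEAN, PRIOR_PRECISION ** -0.5)
    s = rng.gauss(v, SENSORY_PRECISION ** -0.5)
    return v, s

def recognize(s):
    """Invert the model: sufficient statistics (mean, variance) of p(v | s)."""
    precision = PRIOR_PRECISION + SENSORY_PRECISION
    mean = (PRIOR_PRECISION * PRIOR_MEAN + SENSORY_PRECISION * s) / precision
    return mean, 1.0 / precision

rng = random.Random(0)
v, s = generate(rng)
mean, variance = recognize(s)
print(f"hidden cause v={v:.3f}, data s={s:.3f}")
print(f"recognition density: mean={mean:.3f}, variance={variance:.3f}")
```

Two numbers, the mean and the variance, fully describe the recognition density here, which is what is meant by sufficient statistics.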


Bayesian theory: In Bayesian probability theory, one of the two “events” is the hypothesis, H, and the other is the data, D, and we wish to judge the relative truth of the hypothesis given the data. According to Bayes’ rule, we do this via the relation

$$P(H \mid D) = \frac{P(D \mid H) \, P(H)}{P(D)}$$

Bayesian surprise: A measure of salience based on the divergence between the recognition and prior densities. It measures the information in the data that can be recognized.
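A sketch of Bayes' rule over two discrete hypotheses, with Bayesian surprise computed as the KL divergence of the posterior from the prior. The prior and likelihood values are invented for illustration, and the exact posterior stands in for the recognition density.

```python
import math

def posterior(prior, likelihood):
    """Bayes' rule over discrete hypotheses: P(H|D) = P(D|H) P(H) / P(D)."""
    joint = [l * p for l, p in zip(likelihood, prior)]
    evidence = sum(joint)  # P(D)
    return [j / evidence for j in joint]

def bayesian_surprise(prior, post):
    """KL divergence of the posterior from the prior, in nats."""
    return sum(q * math.log(q / p) for q, p in zip(post, prior) if q > 0)

prior = [0.5, 0.5]
informative = [0.9, 0.1]    # P(D | H): the data favour the first hypothesis
uninformative = [0.3, 0.3]  # P(D | H): equally likely under both hypotheses

for name, likelihood in [("informative", informative), ("uninformative", uninformative)]:
    post = posterior(prior, likelihood)
    print(name, post, bayesian_surprise(prior, post))
```

Data that leave beliefs unchanged carry no Bayesian surprise, however improbable they are.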


Free energy: The free-energy principle is an attempt to explain the structure and function of the brain, starting from the fact that we exist.

Free energy is an information-theoretic quantity that bounds the evidence for a model of data.

Free energy is greater than or equal to the negative log-evidence, or ‘surprise’, in sensory data, given a model of how they were generated.
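The bound can be checked numerically in a toy setting. The sketch below assumes a single observed datum and a cause `v` with two discrete values; the prior and likelihood numbers are illustrative. Free energy is computed for an arbitrary recognition density `q` and compared with surprise.

```python
import math

prior = [0.5, 0.5]       # p(v) over two discrete causes (assumed)
likelihood = [0.9, 0.1]  # p(s | v) for the one observed datum s (assumed)

def free_energy(q):
    """F = E_q[ln q(v) - ln p(s, v)] for a discrete cause v."""
    return sum(qv * (math.log(qv) - math.log(l * p))
               for qv, l, p in zip(q, likelihood, prior) if qv > 0)

evidence = sum(l * p for l, p in zip(likelihood, prior))  # p(s)
surprise = -math.log(evidence)                            # negative log-evidence
exact_posterior = [l * p / evidence for l, p in zip(likelihood, prior)]

print("surprise:               ", surprise)
print("F with q = prior:       ", free_energy(prior))            # above the bound
print("F with q = posterior:   ", free_energy(exact_posterior))  # touches the bound
```

The gap between free energy and surprise is the KL divergence between `q` and the true posterior, so minimizing free energy with respect to `q` makes the recognition density a better approximation.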


Action, perception and active sampling of the sensorium in free-energy reduction

We are open systems in exchange with the environment; the environment acts on us to produce sensory impressions, and we act on the environment to change its states.

If we change the environment or our relationship to it, sensory input changes. Therefore, action can reduce free energy by changing the sensory input so that it matches what is predicted.

Perception reduces free energy by changing predictions.

We sample the world to ensure that our predictions become a self-fulfilling prophecy and that surprises are avoided. In this view, perception is enslaved by action to provide veridical predictions that guide active sampling of the sensorium.
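A toy sketch of the two routes to lower free energy. It assumes a scalar sensory input `s = world + action`, a point estimate `mu` with a prior expectation, unit precisions, and a quadratic free energy; all names and values are hypothetical. Perception moves `mu` towards the data, action moves the data towards `mu`.

```python
def free_energy(s, mu, prior=1.0):
    """Quadratic free energy: sensory prediction error plus prior prediction error."""
    return 0.5 * (s - mu) ** 2 + 0.5 * (mu - prior) ** 2

def simulate(use_perception, use_action, steps=500, rate=0.1):
    """Gradient descent on free energy via perception, action, or both."""
    world, action, mu, prior = 3.0, 0.0, 0.0, 1.0  # assumed starting values
    for _ in range(steps):
        s = world + action          # sensory input depends on action
        error = s - mu              # sensory prediction error
        if use_perception:
            mu += rate * (error - (mu - prior))  # -dF/dmu: change predictions
        if use_action:
            action -= rate * error               # -dF/da with ds/da = 1: change input
    s = world + action
    return s, mu, free_energy(s, mu, prior)

for label, flags in [("perception only", (True, False)),
                     ("action only", (False, True)),
                     ("perception and action", (True, True))]:
    s, mu, f = simulate(*flags)
    print(f"{label}: s={s:.3f}, mu={mu:.3f}, F={f:.6f}")
```

With both enabled, input and prediction converge on the prior expectation, which is the sense in which predictions become a self-fulfilling prophecy.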


No. 4: Explain the neurobiological architecture for implementing the free-energy principle in Figure 1 in Box 1. Describe each of the modules in the figure and their functions as well as the quantities that define the free energy.

Neurobiological architecture for implementing the free-energy principle


Upper panel: schematic detailing the quantities that define free-energy.

Lower panel: alternative expressions for the free-energy that show what its minimization entails. For action, free-energy can only be suppressed by increasing the accuracy of sensory data (i.e. selectively sampling data that are predicted by the representation).
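For reference, the alternative expressions can be written out as follows. This is a sketch in assumed notation: s denotes sensory data, ϑ their causes, m the model, and q(ϑ | μ) the recognition density with sufficient statistics μ; angle brackets denote expectation under q.

```latex
\begin{align}
F &= -\big\langle \ln p(s, \vartheta \mid m) \big\rangle_q
     + \big\langle \ln q(\vartheta \mid \mu) \big\rangle_q
  && \text{(energy minus entropy)} \\
F &= D_{KL}\big(q(\vartheta \mid \mu) \,\|\, p(\vartheta \mid s, m)\big)
     - \ln p(s \mid m)
  && \text{(divergence plus surprise)} \\
F &= D_{KL}\big(q(\vartheta \mid \mu) \,\|\, p(\vartheta \mid m)\big)
     - \big\langle \ln p(s \mid \vartheta, m) \big\rangle_q
  && \text{(complexity minus accuracy)}
\end{align}
```

The second form shows why perception, which changes q, tightens the bound on surprise. The third shows why action, which changes s but not q, can only lower free energy through the accuracy term.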


No. 5: Describe the sufficient statistics representing a hierarchical dynamic model of the world in the brain in Figure 1 in Box 2. How are they related with each other? How are the changes in synaptic activity, connectivity, and gain involved with perceptual inference, learning and attention?

Hierarchical dynamic model of the brain


The key architectural feature is the hierarchy. The recognition density is encoded in terms of its sufficient statistics.

The figure shows three sorts of representation, pertaining to the states {x, v}, the parameters θ and the precisions λ of a hierarchical dynamic model. These are encoded by neural activity, synaptic connectivity and gain, respectively. Crucially, the optimization of any one representation depends on the others.

The equations associated with this partition represent a gradient descent on free energy and correspond to:

(i) Perceptual inference on states of the world (i.e. optimizing synaptic activity);

(ii) Perceptual learning of the parameters underlying causal regularities (i.e. optimizing synaptic efficacy); and

(iii) Attention, or optimizing the expected precision of states in the face of random fluctuations and uncertainty (i.e. optimizing synaptic gain).
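The partition above can be sketched with a deliberately small model. It assumes a static, single-level model `s = theta * v + noise` with sensory precision `lam` and a unit-precision prior on `v`, so it is far simpler than the hierarchical dynamic model in the figure, and all names are hypothetical. Precision is held fixed rather than optimized; the last two lines only show how it weights the data against the prior.

```python
V_PRIOR = 1.0  # prior expectation of the cause v, with unit prior precision (assumed)

def infer(s, theta, lam, steps=100, rate=0.1):
    """Perceptual inference: descend F with respect to the state estimate mu.

    F = 0.5 * lam * (s - theta * mu)**2 + 0.5 * (mu - V_PRIOR)**2
    """
    mu = V_PRIOR
    for _ in range(steps):
        error = s - theta * mu  # sensory prediction error
        mu += rate * (lam * theta * error - (mu - V_PRIOR))  # -dF/dmu
    return mu

def learn(s, theta, lam, epochs=200, rate=0.5):
    """Perceptual learning: descend F with respect to the parameter theta."""
    for _ in range(epochs):
        mu = infer(s, theta, lam)  # learning depends on the inferred state
        theta += rate * lam * mu * (s - theta * mu)  # -dF/dtheta
    return theta

s = 2.0
print("learned theta:", learn(s, theta=1.0, lam=1.0))
# Role of precision (gain): high precision trusts the data, low precision the prior.
print("mu with high sensory precision:", infer(s, theta=1.0, lam=4.0))
print("mu with low sensory precision: ", infer(s, theta=1.0, lam=0.25))
```

The learning update calls the inference routine on every pass, which illustrates the point that optimizing one representation depends on the others.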

No. 6: Formulate and describe the neuronal architecture for the hierarchical dynamic model in Figure 1 in Box 3. How are the forward prediction errors computed? How are the backward predictions made? What are the sources of the forward and backward connections in terms of brain anatomy?

Neuronal architecture for the hierarchical dynamic model


Schematic detailing the neuronal architectures that might encode a density on the states of a hierarchical dynamic model.

This shows the speculative cells of origin of forward driving connections, which convey prediction error from a lower area to a higher area, and of backward connections, which construct predictions.

These predictions try to explain away prediction error in lower levels. In this scheme, the sources of forward and backward connections are superficial and deep pyramidal cells, respectively.
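The message passing can be sketched for a two-level linear hierarchy. This is a static toy with unit precisions and scalar states, assumed for illustration only; the weights `w1`, `w2` and the top-level prior `eta` are hypothetical. Predictions flow backward (downward) and prediction errors flow forward (upward).

```python
def predictive_coding(s, eta=0.0, w1=1.0, w2=1.0, steps=300, rate=0.1):
    """Two-level linear predictive coding by gradient descent on squared errors."""
    mu1 = mu2 = 0.0
    for _ in range(steps):
        # Backward connections: each level predicts the activity of the level below.
        pred_s = w1 * mu1
        pred_mu1 = w2 * mu2
        # Forward connections: prediction errors are passed up the hierarchy.
        err0 = s - pred_s        # error at the sensory level
        err1 = mu1 - pred_mu1    # error at level 1
        err2 = mu2 - eta         # error against the top-level prior
        # Each state is driven by the error from below and suppressed by its own error.
        mu1 += rate * (w1 * err0 - err1)
        mu2 += rate * (w2 * err1 - err2)
    return mu1, mu2

mu1, mu2 = predictive_coding(3.0)
print("level 1 estimate:", mu1)
print("level 2 estimate:", mu2)
```

The estimates settle where the errors at all levels balance, a compromise between the sensory input and the top-level prior.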

No. 7: A key implementational issue is how the brain encodes the recognition density. There are two forms of probabilistic neuronal codes: free forms and fixed forms. Give examples of each form and explain them.

Brain encoding recognition density

The free-energy principle induces the recognition density, which has to be represented by its sufficient statistics. It is therefore a given that the brain represents probability distributions over sensory causes.

Probabilistic neuronal codes

There are two forms: free form and fixed form.

o Free form, particle filtering: the recognition density is represented by the sample density of neuronal ensembles, whose activity encodes the location of particles in state-space.

o Free form, probabilistic population code: a method of representing stimuli by the joint activity of a number of neurons. Each neuron has a distribution of responses over some set of inputs, and the responses of many neurons can be combined to determine some value about the inputs.

o Fixed form, multinomial or Gaussian: multinomial forms assume the world is in one of several discrete states and are usually associated with hidden Markov models. The Gaussian or Laplace assumption allows for continuous and correlated states.
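The contrast between the two forms can be sketched in a few lines. The density, the particle count and the query threshold below are illustrative assumptions. The free-form code keeps a cloud of samples; the fixed-form code keeps only the two sufficient statistics of a Gaussian.

```python
import random
from statistics import NormalDist, fmean, stdev

rng = random.Random(0)

# Free form: the density is carried by the particles themselves.
# (Illustrative density over a hidden cause: Gaussian, mean 2.0, sd 0.5.)
particles = [rng.gauss(2.0, 0.5) for _ in range(5000)]

# Fixed form: a Gaussian assumption reduces the density to mean and sd.
mean, sd = fmean(particles), stdev(particles)
fixed_form = NormalDist(mean, sd)

def prob_greater_than(threshold):
    """Free-form query: answered directly from the particle cloud."""
    return sum(x > threshold for x in particles) / len(particles)

print("fixed-form sufficient statistics:", mean, sd)
print("P(cause > 2.5), free form: ", prob_greater_than(2.5))
print("P(cause > 2.5), fixed form:", 1.0 - fixed_form.cdf(2.5))
```

The fixed form is far more compact but is only as good as its distributional assumption; the free form can follow any shape at the cost of many more numbers.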


No. 8: What kinds of optimization schemes does the brain use? Does it use deterministic search on free energy to optimize action and perception? Or, does it use stochastic search? What is your opinion?

Optimization schemes by the brain

According to the free-energy principle, the sufficient statistics representing the recognition density will change to minimize free energy.

This provides a principled explanation for perception, memory and attention. It accounts for:

perceptual inference (optimization of synaptic activity to encode the states of the environment),

perceptual learning and memory (optimization of synaptic connections that encode contingencies and causal regularities) and

attention (neuromodulatory optimization of synaptic gain that encodes the precision of states).

The usual assumption is that the brain uses a deterministic gradient descent on free energy to optimize action and perception.

It might also use stochastic searches, sampling the sensorium randomly for a percept with low free energy.

There is evidence that our eye movements implement an optimal stochastic strategy. This raises interesting questions about the role of stochastic searches, from visual search to foraging, in both perception and action.
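The difference between the two search strategies can be illustrated on a made-up landscape. The function below is not a real free energy; it is an assumed one-dimensional curve with a shallow minimum near x = 1 and a deeper one near x = -1, chosen so that a deterministic descent can get stuck.

```python
import random

def landscape(x):
    """Toy 'free energy' with a local minimum near +1 and a global one near -1."""
    return (x * x - 1.0) ** 2 + 0.3 * x

def gradient(x):
    """Analytic derivative of the toy landscape."""
    return 4.0 * x * (x * x - 1.0) + 0.3

def gradient_descent(x, steps=2000, rate=0.01):
    """Deterministic search: follow the local slope downhill."""
    for _ in range(steps):
        x -= rate * gradient(x)
    return x

def stochastic_search(rng, samples=2000, low=-2.0, high=2.0):
    """Stochastic search: sample at random and keep the lowest value found."""
    return min((rng.uniform(low, high) for _ in range(samples)), key=landscape)

x_det = gradient_descent(1.5)
x_sto = stochastic_search(random.Random(0))
print(f"gradient descent:  x={x_det:.3f}, F={landscape(x_det):.3f}")
print(f"stochastic search: x={x_sto:.3f}, F={landscape(x_sto):.3f}")
```

Started on the right-hand side, gradient descent settles in the nearer, shallower minimum, while random sampling finds the deeper one. A gradient scheme is cheaper per step when the landscape has a single minimum, so the two strategies need not be mutually exclusive.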

Summary

The free-energy principle provides a comprehensive measure of how individuals represent the world and come to sample it adaptively.

The goal is to minimize prediction error (i.e. to suppress free energy).

Changes in synaptic activity, connectivity and gain can be understood as perceptual inference, learning and attention, respectively.

References

Friston. The free-energy principle: a unified brain theory? Nat Rev Neurosci (2010) vol. 11 (2) pp. 127-138

Friston. The free-energy principle: a rough guide to the brain? Trends Cogn Sci (Regul Ed) (2009) vol. 13 (7) pp. 293-301

Friston et al. A free energy principle for the brain. Journal of Physiology - Paris (2006) vol. 100 pp. 70-87


Thank You
