Brain Regions Involved in USCBP Reaching Models
A High Level Overview
Brain Regions
• Cheol’s Models
  – Motor cortex (M1)
  – Spinal cord
  – Basal Ganglia (BG)
  – Dorsal Premotor (PMd, providing input)
• Jimmy’s Models
  – Parieto-occipital area (V6a)
  – Lateral intraparietal area (LIP)
  – BG
  – PMd (specifically F2)
Issues In Model Integration
• Unified View of M1
• Interactions between PMd and M1
• Role of the BG
• Involvement of the Cerebellum
M1 Modeling
• Cheol
  – Top-down model: directional tuning with supervised and unsupervised learning
  – Bottom-up model: input and output maps with controlling muscle synergies
• Jimmy
  – Robotic control model: trajectory generator, inverse kinematics, PD controllers (probably not all in M1)
Cheol’s Top-Down M1 Model
• Directional tuning of M1 neurons acquired using supervised and unsupervised learning
• Arm choice learned with reinforcement learning
  – Jimmy: equivalent to noisy WTA based on executability
  – Cheol: connecting to unified view of motor learning
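The "noisy WTA based on executability" reading of arm choice can be sketched as follows. This is a minimal illustration, not the code of either model: the function names, the Gaussian noise, the learning rate, and the success probabilities are all assumptions made for the example.

```python
import random

def noisy_wta(executability, noise_sd=0.1, rng=random):
    """Pick the option whose executability plus Gaussian noise is largest."""
    noisy = {arm: value + rng.gauss(0.0, noise_sd)
             for arm, value in executability.items()}
    return max(noisy, key=noisy.get)

def update_executability(executability, arm, success, rate=0.2):
    """Move the chosen arm's estimate toward the outcome (1 = success, 0 = failure)."""
    executability[arm] += rate * (float(success) - executability[arm])

# Hypothetical scenario: both arms start with equal estimates, but reaches
# with the left arm succeed more often (assumed probabilities).
executability = {"left": 0.5, "right": 0.5}
true_success = {"left": 0.9, "right": 0.3}
rng = random.Random(0)
for _ in range(500):
    arm = noisy_wta(executability, noise_sd=0.1, rng=rng)
    success = rng.random() < true_success[arm]
    update_executability(executability, arm, success)
```

The noise keeps the less-used arm from being ignored entirely, so its estimate can still be revised if its performance changes.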
Possible motor procedures in the motor cortex
• Inverse dynamics and muscle models learned using temporal difference learning in an actor-critic architecture
• The actor may correspond to the motor cortex.
[Diagram: actor-critic architecture. ACTOR blocks: Trajectory Generator, Inverse Dynamics, Inverse Muscle Model, Motoneurons (spinal cord), Arm. Planning levels: joint static level, joint “force” level, muscle level. CRITIC: Evaluator of Movement. Signal between critic and actor: TD error.]
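For readers unfamiliar with the actor-critic arrangement on this slide, a generic tabular version is sketched below. It is a textbook-style illustration on a made-up corridor task, not the inverse-dynamics or muscle model itself: the task, the state and action encoding, and all parameter values are assumptions.

```python
import math
import random

def softmax_choice(prefs, rng):
    """Sample an action index with probability proportional to exp(preference)."""
    top = max(prefs)
    weights = [math.exp(p - top) for p in prefs]
    return rng.choices(range(len(prefs)), weights=weights)[0]

def td_error(reward, value_next, value_now, gamma=0.9):
    """Temporal difference error: delta = r + gamma * V(s') - V(s)."""
    return reward + gamma * value_next - value_now

def train(n_states=5, episodes=300, alpha=0.1, beta=0.1, gamma=0.9, seed=0):
    """Actor-critic on a toy corridor: reward 1 on reaching the rightmost state."""
    rng = random.Random(seed)
    values = [0.0] * n_states                      # critic: state values
    prefs = [[0.0, 0.0] for _ in range(n_states)]  # actor: 0 = left, 1 = right
    goal = n_states - 1
    for _ in range(episodes):
        state = 0
        for _ in range(200):                       # cap on episode length
            action = softmax_choice(prefs[state], rng)
            next_state = max(0, state - 1) if action == 0 else state + 1
            done = next_state == goal
            reward = 1.0 if done else 0.0
            value_next = 0.0 if done else values[next_state]
            delta = td_error(reward, value_next, values[state], gamma)
            values[state] += alpha * delta         # critic update
            prefs[state][action] += beta * delta   # actor update, driven by the same TD error
            if done:
                break
            state = next_state
    return values, prefs

values, prefs = train()
```

The same TD error trains both parts: the critic uses it to correct its value estimate, and the actor uses it as the teaching signal for the action just taken.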
Cheol’s Bottom-up M1 model (based on feedback signal)
[Diagram labels: Motor Cortex Model (map); Target location (premotor); Muscle Synergy; ISM; Feedback signal (premotor); Forward model; summing junctions (+, +, -)]
• IDM: mapping the error direction to muscle synergy (directly related to directional tuning), with optimal feedback controller
• Pesaran et al. (2006) indicated that PMd neurons encoded both target location and feedback signal.
ILGA Motor Controller
• Input - reach target in wrist-centered coordinates
• Dynamic Motor Primitives – generate reach trajectory
• Inverse Kinematics – pseudo-inverse of Jacobian matrix
• PD controllers – one for each DOF
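The inverse kinematics and PD stages listed above can be sketched for a planar two-link arm. This is an illustrative sketch, not ILGA's implementation: the link lengths, gains, and function names are assumptions, a small damping term is added to the pseudo-inverse for numerical safety, and the Dynamic Motor Primitive trajectory generator is not shown.

```python
import math

LINK1, LINK2 = 0.3, 0.25  # assumed link lengths in meters

def forward_kinematics(q):
    """Wrist position (x, y) for shoulder and elbow angles q = [q0, q1]."""
    x = LINK1 * math.cos(q[0]) + LINK2 * math.cos(q[0] + q[1])
    y = LINK1 * math.sin(q[0]) + LINK2 * math.sin(q[0] + q[1])
    return x, y

def jacobian(q):
    """2x2 Jacobian of the wrist position with respect to the joint angles."""
    s1, c1 = math.sin(q[0]), math.cos(q[0])
    s12, c12 = math.sin(q[0] + q[1]), math.cos(q[0] + q[1])
    return [[-LINK1 * s1 - LINK2 * s12, -LINK2 * s12],
            [LINK1 * c1 + LINK2 * c12, LINK2 * c12]]

def damped_pseudo_inverse(J, damping=1e-6):
    """J^T (J J^T + damping * I)^-1, written out for the 2x2 case."""
    (a, b), (c, d) = J
    a11 = a * a + b * b + damping
    a12 = a * c + b * d
    a22 = c * c + d * d + damping
    det = a11 * a22 - a12 * a12
    inv = [[a22 / det, -a12 / det], [-a12 / det, a11 / det]]
    return [[a * inv[0][0] + c * inv[1][0], a * inv[0][1] + c * inv[1][1]],
            [b * inv[0][0] + d * inv[1][0], b * inv[0][1] + d * inv[1][1]]]

def ik_step(q, target, gain=0.5):
    """One update: map the Cartesian error to a joint-angle change."""
    x, y = forward_kinematics(q)
    ex, ey = target[0] - x, target[1] - y
    Jp = damped_pseudo_inverse(jacobian(q))
    return [q[0] + gain * (Jp[0][0] * ex + Jp[0][1] * ey),
            q[1] + gain * (Jp[1][0] * ex + Jp[1][1] * ey)]

def solve_ik(q, target, steps=100):
    """Iterate ik_step from the starting posture q."""
    for _ in range(steps):
        q = ik_step(q, target)
    return q

def pd_torque(q_des, q, q_dot, kp=20.0, kd=2.0):
    """One PD controller per degree of freedom."""
    return [kp * (qd - qi) - kd * vi for qd, qi, vi in zip(q_des, q, q_dot)]
```

A possible use: compute desired joint angles with `solve_ik([0.5, 1.0], target)` and pass them, together with the current angles and velocities, to `pd_torque`.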
Interactions Between PMd and M1
• Our views of the role of PMd are very similar
• Jimmy
  – PMd (F2) provides M1 with target location in wrist-centered coordinates
• Cheol
  – Supra-motor-cortex coding in PMd may be feedback error (target location in hand-centered reference frame) and/or target location in the fixation point coordinates.
ILGA: F2 Integrates Bottom-Up and Top-Down Reach Target Signals
• Rostral F2 performs target selection based on parietal and prefrontal input
• Caudal F2 encodes selected target and initiates reach
  – F6 detects go signal and disinhibits via BG
Tanne et al. (1995)
Reconciliation with FARS view of PMd
• FARS implicated F2 in conditional action selection and F4 in reach target selection
• However, many studies show F2 to contain directionally tuned neurons that discharge prior to reaching
• F4 contains bimodal (visual / somatosensory) neurons that respond when objects approach their somatosensory receptive field on the arm or hand
F2 vs. F4: Experimental Data
• Neurons in F2 are broadly tuned to multidimensional direction in a reaching task (Caminiti et al., 1991; Fu et al., 1993)
• Pesaran, Nelson & Andersen (2006): PMd neurons encode relative positions of eye, hand, and target
  – PMd contains combined signals.
  – MIP contains more (target-eye) coding, i.e. fixation point coordinates
• F4 bimodal visual-tactile neurons have very large visual and somatosensory receptive fields, and the visual field is anchored to the somatosensory field
  – But most don’t fire for stimuli farther than 25 cm away (Graziano et al., 1997). Not suitable for encoding reach target!
  – May be involved in feedback control of reach-grasp coordination: tactile RFs may contribute to the transition from visual- to haptic-based control
Role of the BG
• Cheol
  – Adaptive critic in actor-critic architecture
• Jimmy
  – Adaptive critic gated by internal state
  – Action disinhibition
• Role in previous USCBP models
  – DA / DAJ: action disinhibition
  – ILGM: reward signal
  – Extended TD: adaptive critic
  – Bischoff BG model: next-state prediction
BG Disinhibition of Action
• ILGA’s use of the basal ganglia to disinhibit actions is largely consistent with its role in the Dominey-Arbib and Dominey-Arbib-Joseph Models
• The cortical targets of the context-dependent biases are different
BG as an Adaptive Critic
• The basal ganglia’s role as an adaptive critic is not very controversial
• However, each of our models uses it to learn different parameters
  – Cheol’s top-down model: to modify arm selection
  – Cheol’s bottom-up model: to learn inverse models
  – ILGA: kinematic parameters and contextual bias
  – ACQ: executability and internal state-dependent desirability
• Does this imply several actor/critic combinations (1:1, N:1, 1:N, N:N)?
  – Cheol’s top-down model: actor / critic
  – Cheol’s bottom-up model: actor / critic
  – ILGA: actor / critic
  – ACQ: actor / multiple critics
M1 & BG roles in Cheol’s unified view
[Diagram labels: Representation of the actuator; Target-related signal; Critic (motor-task-related); Any motor actuators]
It represents the current maximum capability of the motor actuators.
So, if the motor actuators are based on muscles, it will be the muscle synergies and the limitations of the muscle-based actuators.
If a stroke affects it, the maximum capability changes and the limitation of the world increases.
With rehabilitation, the maximum capability changes again and the limitation decreases.
The reinforcement learning framework will replace “optimization of a task-related cost function” with “maximization of a task-related reward function”, which also accounts for the actuators’ limitations.
The critic encodes the current task-related reward function. The reward or an action value is defined only when we have an “objective”.
So, the critic will try to encode which action might be the best action in terms of reward (action value) to achieve a certain objective.
It will monitor the current movement’s performance. If the performance changes, the critic will provide information about the next best action, and it will facilitate changing the actor accordingly.
If there are multiple tasks, there should be multiple critics.
What is now the critic’s role? It will encode the objective function and provide the “teaching” signal to the actor through the TD error: if the TD error is zero, we don’t need to change the actor, and so on.
[Diagram labels: “This arrow is the actor”; TD error; “Send limitation of the actuators via unsupervised learning”; “Send limitation of the actuators via TD error”; Visual signal (world representation); Critic X (vision-task-related); Action-oriented perception?]
M1 & BG roles in Cheol’s unified view
[Diagram labels: Critic; Actor; Representation of the actuators]
Because of a stroke in the motor cortex, the limitation (performance) of the corresponding actuator changes. The action choice module will encode which arm is better in a certain direction. So when the performance of the affected arm decreases, it will say that the best action is to use the unaffected arm (i.e., behavioral compensation).
Can we connect these ideas with the terms executability and desirability? In general, I think the objective function contains both concepts.
[Diagram labels: Reaching module; Grasping module; Coordination manager; Critic; PLoS model, Jool’s variability data]
In this coordination problem, we may have an objective for the coordination. For example, we can put more weight on faster movement, on accurate movement, or on accurate grasping.
So, depending on the objective, we may have variability in coordination.
However, this coordination is not free from the actuators. First, if there is signal-dependent noise, we cannot move too fast. (This limitation is already in the Hoff-Arbib model.)
Second, too large an initial aperture can assure more accurate grasping but will impose a limitation on the reaching module (slower reaching).
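The first constraint, that signal-dependent noise penalizes fast movement, can be illustrated with a toy calculation. This is not the Hoff-Arbib model: the point mass, the bang-bang acceleration profile, the independent per-step noise, and the `noise_gain` value are all assumptions chosen to keep the arithmetic short.

```python
import math

def endpoint_sd(distance, duration, noise_gain=0.1, dt=0.001):
    """Endpoint position SD for a point mass under a bang-bang command.

    Assumed noise model: at every time step the acceleration carries
    independent noise whose SD is noise_gain * |command|.
    """
    # Accelerate for the first half, decelerate for the second half.
    accel = 4.0 * distance / duration ** 2
    steps = int(round(duration / dt))
    sigma = noise_gain * accel
    variance = 0.0
    for i in range(steps):
        t = i * dt
        # A perturbation at time t changes velocity by (noise * dt), which
        # then displaces the endpoint over the remaining movement time.
        variance += (sigma * dt * (duration - t)) ** 2
    return math.sqrt(variance)

fast = endpoint_sd(distance=0.3, duration=0.3)
slow = endpoint_sd(distance=0.3, duration=0.6)
```

Under these assumptions the faster reach needs a larger command, so `fast` comes out larger than `slow`: halving the movement time costs endpoint accuracy.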
Hierarchical Optimal Feedback Controller
Todorov et al. (2005) proposed a similar idea on hierarchical optimization of the plants. But the reinforcement learning framework will provide a more general framework for motor system learning and may be more applicable.
Maybe those two modules are obtained separately (early learning)
Motor cortex model
Kambara et al. (2008) showed the possibility, and I would also implement it with map reorganization!
Involvement of the Cerebellum
• Schweighofer’s Modeling – corrects for nonlinearities in arm control
• Cheol – what about learning projections from cerebellum to M1?
References
• Caminiti, R., Johnson, P.B., Galli, C., Ferraina, S., Burnod, Y. (1991) Making Arm Movements within Different Parts of Space: The Premotor and Motor Cortical Representation of a Coordinate System for Reaching to Visual Targets. The Journal of Neuroscience, 11(5): 1182-1197.
• Fu, Q.G., Suarez, J.I., Ebner, T.J. (1993) Neuronal Specification of Direction and Distance During Reaching Movements in the Superior Precentral Premotor Area and Primary Motor Cortex of Monkeys. Journal of Neurophysiology, 70(5): 2097-2116.
• Graziano, M.S.A., Hu, X.T., Gross, C.G. (1997) Visuospatial Properties of Ventral Premotor Cortex. Journal of Neurophysiology, 77: 2268-2292.
• Tanne, J., Boussaoud, D., Boyer-Zeller, N., Rouiller, E.M. (1995) Direct visual pathways for reaching movements in the macaque monkey. NeuroReport, 7: 267-272.
• Pesaran, B., Nelson, M.J., Andersen, R.A. (2006) Dorsal premotor neurons encode the relative position of the hand, eye, and goal during reach planning. Neuron, 51: 125-134.
• Buneo, C.A., Jarvis, M.R., Batista, A.P., Andersen, R.A. (2002) Direct visuo-motor transformation for reaching. Nature, 416: 632-636.
• Todorov, E., Li, W., Pan, X. (2005) From task parameters to motor synergies: A hierarchical framework for approximately optimal control of redundant manipulators. J Robot Syst, 22(11): 691-710.
• Kambara, H., Kim, K., Shin, D., Sato, M., Koike, Y. (2006) Motor control-learning model for reaching movements. IJCNN 2006.