Post on 17-Apr-2020

  • 1

    INTEGRATING ROBOTIC ACTION WITH BIOLOGIC PERCEPTION: A BRAIN-MACHINE SYMBIOSIS THEORY

    By

    BABAK MAHMOUDI

    A DISSERTATION PRESENTED TO THE GRADUATE SCHOOL OF THE UNIVERSITY OF FLORIDA IN PARTIAL FULFILLMENT

    OF THE REQUIREMENTS FOR THE DEGREE OF DOCTOR OF PHILOSOPHY

    UNIVERSITY OF FLORIDA

    2010

  • 2

    © 2010 Babak Mahmoudi

  • 3

    To my family who have always inspired me to reach higher goals in my life.

    None of this would have been possible without their unconditional love and support.

  • 4

    ACKNOWLEDGMENTS

    Earning a PhD is all about exploring unknown territories. All along this journey

    many people helped me, and because of them my graduate experience has been one

    that I will cherish forever. I am indebted to all of them, but I have the chance to

    thank only a few here.

    My deepest gratitude goes to my advisor, Dr. Justin Sanchez, who has been both a

    professional mentor and a strong supporter throughout all stages of this adventure.

    Many discussions and long hours with Dr. Sanchez served to elevate this research to a

    higher level. Throughout it all he has developed my abilities as a researcher. Dr. Jose

    Principe got me to understand how a machine could learn – even to think like an

    adaptive filter. All of our hard work enabled a major contribution to the field; I don’t know

    if it would have been possible any other way. Dr. Jeff Kleim’s expertise in motor

    cortex organization and Dr. Tom DeMarse’s comments on learning were helpful in

    BMI design. I would like to thank Dr. Van Oostrom and Dr. Harris for their support,

    especially during the last months of my PhD.

    I owe much of my success to Dr. John DiGiovanna, who is a great collaborator and

    whom I have the privilege of calling one of my best friends. Much of the success of this

    work was due to his contributions. I am grateful to April-Lane Derfyniak and Tifiny

    McDonald in the Department of Biomedical Engineering for always being helpful, all the

    way from admission to graduation.

    I would like to thank all of my friends, especially those members of the

    Computational NeuroEngineering Laboratory who helped me during my research.

    Finally, I would like to send my deepest thanks to my family, who over the last four years

    supported me from thousands of miles away with their endless love.

  • 5

    TABLE OF CONTENTS

    page

    ACKNOWLEDGMENTS ........ 4

    LIST OF FIGURES ........ 8

    LIST OF ABBREVIATIONS ........ 11

    ABSTRACT ........ 12

    CHAPTER

    1 INTRODUCTION ........ 14
      Brain-Machine Interface (BMI) ........ 14
      Trajectory BMIs ........ 15
      Goal-Driven BMIs ........ 18
      Limitations of the Current BMI Design ........ 19
      Brain-Machine Symbiosis (BMS) Theory ........ 21
      Organization of the Dissertation ........ 23

    2 A THEORETICAL FOUNDATION FOR THE BMS THEORY ........ 25
      Introduction ........ 25
      RL Methods for BMS ........ 29
        Q-Learning ........ 30
        Actor-Critic Learning ........ 31
      Perception-Action Reward Cycle ........ 33
      Reward Processing in the Brain ........ 35

    3 MOTOR STATE REPRESENTATION AND PLASTICITY DURING REINFORCEMENT LEARNING BASED BMI ........ 41
      Introduction ........ 41
      Reinforcement Learning based BMI ........ 41
      Experiment Setup ........ 42
      Neuronal Shaping As a Measure of Plasticity in MI ........ 46
      Neuronal Tuning As a Measure of Robustness in MI States ........ 48

    4 REPRESENTATION OF REWARD EXPECTATION IN THE NUCLEUS ACCUMBENS AND MODELING OF EVALUATIVE FEEDBACK ........ 52
      Introduction ........ 52
      Experiment Setup ........ 52
      Temporal Properties of NAcc Activity Leading up to Reward ........ 55

  • 6

      Extracting a Scalar Reward Predictor from NAcc ........ 57

    5 ACTOR-CRITIC REALIZATION OF THE BMS THEORY ........ 60
      Introduction ........ 60
      BMI Control Architecture ........ 60
        Critic Structure ........ 64
        Actor Structure ........ 65
      Closed-Loop Simulator ........ 67
        Convergence of the Actor-Critic During Environmental Changes ........ 70
        Reorganization of Neural Representation ........ 77
        Effect of Noise in the States and the Evaluative Feedback ........ 81

    6 CLOSED-LOOP IMPLEMENTATION OF THE ACTOR-CRITIC ARCHITECTURE ........ 85
      Introduction ........ 85
      Experiment Setup ........ 85
        Training Paradigm ........ 86
        Electrophysiology ........ 88
        Closed-Loop Experiment Paradigm ........ 89
      Critic Learning ........ 91
        Neurophysiology of NAcc under Rewarding and Non-Rewarding Conditions ........ 91
        State Estimation from NAcc Neural Activity ........ 94
          Desired response ........ 94
          Linear vs. Non-linear regression ........ 97
          Classification vs. regression ........ 98
          Time segmentation ........ 100
      Actor Learning ........ 102
        Preliminary Simulations Using Sign and Magnitude of the Evaluative Feedback for Training the Actor ........ 103
        Inaccuracy in State Estimation and its Influence on the Actor Learning ........ 106
        Actor Learning Based on Real MI Neural States and NAcc Evaluative Feedback ........ 107

    7 CONCLUSIONS ........ 113
      Overview ........ 113
      Broader Impact and Future Works ........ 116

    APPENDIX: DUAL MICRO-ARRAY DESIGN ........ 120

    LIST OF REFERENCES ........