

LETTER Communicated by Liam Paninski

Dynamic Analysis of Naive Adaptive Brain-Machine Interfaces

Kevin C. Kowalski
Bryan D. He
Neural Signal Processing Laboratory, Department of Radiology, University of California, Los Angeles, CA 90095, U.S.A., and Department of Computer Science, California Institute of Technology, Pasadena, CA 91125, U.S.A.

Lakshminarayan Srinivasan
Neural Signal Processing Laboratory, Department of Radiology, University of California, Los Angeles, CA 90095, U.S.A.

The closed-loop operation of brain-machine interfaces (BMI) provides a context to discover foundational principles behind human-computer interaction, with emerging clinical applications to stroke, neuromuscular diseases, and trauma. In the canonical BMI, a user controls a prosthetic limb through neural signals that are recorded by electrodes and processed by a decoder into limb movements. In laboratory demonstrations with able-bodied test subjects, parameters of the decoder are commonly tuned using training data that include neural signals and corresponding overt arm movements. In the application of BMI to paralysis or amputation, arm movements are not feasible, and imagined movements create weaker, partially unrelated patterns of neural activity. BMI training must begin naive, without access to these prototypical methods for parameter initialization used in most laboratory BMI demonstrations.

Naive adaptive BMI refer to a class of methods recently introduced to address this problem. We first identify the basic elements of existing approaches based on adaptive filtering and define a decoder, ReFIT-PPF, to represent these existing approaches. We then present Joint RSE, a novel approach that logically extends prior approaches. Using recently developed human- and synthetic-subjects closed-loop BMI simulation platforms, we show that Joint RSE significantly outperforms ReFIT-PPF and nonadaptive (static) decoders. Control experiments demonstrate the critical role of jointly estimating neural parameters and user intent. In addition, we show that nonzero sensorimotor delay in the user significantly degrades ReFIT-PPF but not Joint RSE, owing to differences in the prior on intended velocity. Paradoxically, substantial differences in the nature of sensory feedback between these methods do not contribute to differences in performance between Joint RSE and ReFIT-PPF. Instead, BMI performance improvement is driven by machine learning, which outpaces rates of human learning in the human-subjects simulation platform. In this regime, nuances of error-related feedback to the human user are less relevant to rapid BMI mastery.

K.K. and L.S. conceived of and designed the research. K.K. and B.H. performed the experiments. K.K., B.H., and L.S. analyzed the results of the experiments and prepared the figures. K.K. and L.S. drafted the manuscript. K.K., B.H., and L.S. edited and revised the manuscript and approved the final version of the manuscript. L.S. is the corresponding author.

Neural Computation 25, 2373–2420 (2013) © 2013 Massachusetts Institute of Technology doi:10.1162/NECO_a_00484

1 Introduction

Recent demonstrations illustrate the remarkable ability for electronics to bypass damaged neural circuits, allowing paralyzed and amputee users to control anthropomorphic robotic limbs and other assistive devices (Hochberg et al., 2012; McFarland, Sarnacki, & Wolpaw, 2010; Schalk et al., 2008). These developments represent the earliest stages of neuroscience and engineering research in brain-machine interfaces (BMI), with applications to stroke, trauma, degeneration, and other neuromuscular disease mechanisms. Significant breakthroughs in the understanding of BMI algorithm design are still needed to facilitate the elementary level of performance required for routine activities like eating, bathing, and interacting with loved ones. In this letter, we study the way in which human subjects and neural signal processing algorithms learn the basic mapping from neural signals to assistive movement, with the goal of better understanding the sensorimotor and algorithmic basis for this process.

1.1 Definition and Categorization of Naive Adaptive Brain-Machine Interfaces. Many BMI training paradigms involve an initial period of parameter tuning. In this period, parameters of a neural signal model are adjusted to relate observed neural signals with overt movements (Santhanam, Ryu, Yu, Afshar, & Shenoy, 2006; Serruya, Hatsopoulos, Paninski, Fellows, & Donoghue, 2002) or instructed motor imagery (Bradberry, Gentili, & Contreras-Vidal, 2011; Hochberg et al., 2006; Kim, Simeral, Hochberg, Donoghue, & Black, 2008). During this initial period, the user is not directly operating the BMI.

In contradistinction, naive adaptive control refers to algorithms that immediately engage the user in BMI operation during parameter tuning. The word naive, defined previously for the BMI literature (Gage, Ludwig, Otto, Ionides, & Kipke, 2005), indicates that BMI algorithm parameters are randomized when subjects begin operating the BMI. The word adaptive indicates that these parameters are adjusted from this random initialization in an attempt to improve overall BMI performance. The BMI parameters presented in this letter are the magnitude and preferred direction in cosine-tuned motor cortical neuron point-process models (Truccolo, Eden, Fellows, Donoghue, & Brown, 2005).

Naive adaptive control is a key concept in the analysis and design of BMI because it addresses four potential barriers to clinical viability. First, actual movements are not available for BMI training in amputation or paralysis. Second, instructed motor imagery may generate patterns of neural activity that differ from patterns elicited at output neurons during closed-loop BMI control, resulting in performance degradation (Shenoy, Krauledat, Blankertz, Rao, & Muller, 2006; Taylor, Tillery, & Schwartz, 2002). Third, artificial and natural somatosensory feedback may further distort these observed neural signal patterns relative to instructed motor imagery, such as in the difference between sensorimotor potentials evoked during imagined versus overt arm movements (Miller et al., 2010) or word repetition (Leuthardt et al., 2012) that drive sensory feedback (touch, pressure, proprioception, audition, vision) from the arm, mouth, larynx, eye, and ear. Fourth, learning proficient BMI operation with nonadaptive (static) filters is slow, requiring weeks to months for basic cursor control alone (Ganguly & Carmena, 2009; Wolpaw, McFarland, Neat, & Forneris, 1991). Naive adaptive control could substantially accelerate the learning process for patients. All of these potential barriers to clinical viability are topics of ongoing research and active debate.

We divide existing naive adaptive approaches into two groups. Category 1 algorithms are based on adaptive filters (Dangi et al., 2011; Gage et al., 2005; Orsborn, Dangi, Moorman, & Carmena, 2012). The user’s goals are explicitly defined by a training exercise, and these goals are related to the observed neural activity to infer neural signal parameters. Category 2 algorithms are inspired by reinforcement learning. Here, the user’s goal is represented implicitly through error signals that are recorded from the brain (Gurel & Mehring, 2012; Mahmoudi & Sanchez, 2011). BMI parameters are tuned to minimize future occurrences of error signals.

Our focus is category 1 naive adaptive BMI. Existing category 1 algorithms typically use Kalman filters (Dangi et al., 2011; Gage et al., 2005; Orsborn et al., 2012).

Previously employed in a rat model of reaching movement using auditory tones (Gage et al., 2005), the category 1 approach to naive adaptive BMI was subsequently adopted in a primate model using nonnaive parameter initialization based on overt arm movements (Gilja et al., 2010, 2012), where a monkey controls an on-screen cursor with neural activity that relates to the intended cursor velocity. Related adaptive recursive Bayesian filters, not originally described for use in naive adaptive training, were previously developed for tracking neural parameters in BMI and scientific applications (Eden, Frank, Barbieri, Solo, & Brown, 2004; Li, O’Doherty, Lebedev, & Nicolelis, 2011; Srinivasan, Eden, Mitter, & Brown, 2007).

In the category 1 method, variously named cursorGoal (Gilja et al., 2010) or ReFIT-KF (Gilja et al., 2012), the monkey is assumed to have perfect knowledge of the current position of the cursor and the on-screen target in forming its intended velocity. The ReFIT-KF algorithm rotates its estimate of intended velocity (based on the monkey’s neural activity) toward the target when adjusting its parameters. This rotation explicitly assumes that intentions manifested in motor neural activity reflect zero-effective sensorimotor delay. Zero-effective delay might approximately occur as a result of predictive internal models that attempt to compensate for intrinsic delays in neural systems (Golub, Yu, & Chase, 2012) or delays in the machine such as algorithms that bin neural data (Lagang & Srinivasan, 2013).
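The rotation step described above can be illustrated with a minimal Python sketch. It is not the published ReFIT-KF implementation: the function name `rotate_toward_target` is hypothetical, and the sketch assumes that the decoded speed is preserved while only the direction is redirected at the target, and that the intended velocity is zero when the cursor sits exactly on the target.

```python
import math

def rotate_toward_target(v_est, cursor_pos, target_pos):
    """Redirect a decoded 2D velocity at the target, keeping its speed.

    Assumption for illustration: the speed of the estimate is preserved and
    only its direction is replaced by the cursor-to-target direction.
    """
    speed = math.hypot(v_est[0], v_est[1])
    dx = target_pos[0] - cursor_pos[0]
    dy = target_pos[1] - cursor_pos[1]
    distance = math.hypot(dx, dy)
    if distance == 0.0:
        # Assumption: on the target, the intended velocity is taken to be zero.
        return (0.0, 0.0)
    return (speed * dx / distance, speed * dy / distance)

# Decoded velocity points up and to the right; the target lies along +x.
v_rotated = rotate_toward_target((3.0, 4.0), (0.0, 0.0), (10.0, 0.0))
```

Because the rotated velocity is computed from the current cursor and target positions, using it as the training signal implicitly treats the user's intent as reacting to the display without delay, which is the zero-effective-delay assumption discussed in the text.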

Variants of the ReFIT-KF approach were subsequently proposed by others as a solution for naive adaptive BMI in primates (Dangi et al., 2011; Orsborn et al., 2012). While ReFIT-KF adjusted parameters only intermittently (Gilja et al., 2010, 2012), these related studies (Dangi et al., 2011; Orsborn et al., 2012) examined variants of ReFIT-KF to study how the frequency of parameter updates affected learning. For example, Dangi et al. (2011) applied the ReFIT-KF approach at every time step, which we call continuous-ReFIT-KF. In performance comparisons presented in this letter, we use a point-process version of continuous-ReFIT-KF as a representative benchmark for existing category 1 naive adaptive BMI. We call this benchmark method ReFIT-PPF, where PPF indicates the use of an approximate discrete-time point-process filter (Eden et al., 2004) instead of the Kalman filter (KF) variants used by Gilja et al. (2010, 2012), Dangi et al. (2011), and Orsborn et al. (2012).

1.2 Contributions of This Letter. We now summarize the contributions of this letter, which are focused on understanding and improving category 1 naive adaptive BMI (Dangi et al., 2011; Gage et al., 2005; Orsborn et al., 2012). We first deconstruct the basic elements of ReFIT variants within a Bayesian framework (see Figure 2) to reveal three implicit design choices made in their construction. These design choices are the row labels in Figures 2 and 3. We then logically extend these design choices to create Joint RSE, a new method for naive adaptive BMI (see Figure 2). We also implement additional methods (see Figures 2 and 3) that are specifically constructed to probe the relative importance of these design choices in any category 1 naive adaptive BMI model system. The analysis demonstrates that Joint RSE outperforms ReFIT-PPF in the rate of target acquisition.

To compare these methods, we employ a model system based on healthy human volunteers, previously validated in comparison with nonhuman primate experiments (Cunningham et al., 2011). This model system translates arm movements from the subject into simulated primary motor cortical spiking activity to reproduce closed-loop behavior in moving a cursor to a target in a two-dimensional on-screen work space (see Figure 1). We also modify this model system in two ways. First, we substantially decrease the cost of implementation by using the Microsoft Kinect 3D camera (see Figure 1B) for hand tracking (currently US$100) instead of the Northern Digital Polaris tracking system (currently estimated at US$60,000). Second, we demonstrate that human motor learning can be permitted by initializing neural parameters to effect a visuomotor rotation (see Figures 7–9 and 10B).

We show that Joint RSE significantly outperforms ReFIT-PPF, random walk, and static decoders (see Figures 5A and 8). Control experiments with human subjects demonstrate that Joint RSE outperforms ReFIT-PPF by jointly estimating neural parameters and user intent (see Figure 5C). In that experiment, Lockstep RSE/RSE is constructed as a lockstep version of Joint RSE to isolate the contribution of joint estimation to performance in Joint RSE. We perform further analysis using a simplified variant of our recently described stochastic control model of humans in closed-loop BMI (Lagang & Srinivasan, 2013). This analysis suggests that nonzero sensorimotor delay in the human subject significantly degrades ReFIT-PPF, while Joint RSE is robust under various levels of delay (see Figure 11). This may occur as a result of differences in the prior on intended velocity (see Figure 2).

Paradoxically, substantial differences in sensory feedback between these methods do not contribute to differences between Joint RSE and ReFIT-PPF (see Figure 5B), even under experimental conditions that permit human learning (see Figures 7–9). Further analysis reveals that the timescale of BMI performance improvement in the naive adaptive methods closely matches rates of machine learning, where human learning is undetected (see Figures 6B and 6C) or more gradual (see Figures 8B and 8C) in these experiments. The relatively slow rate of human learning helps to explain why overall BMI performance was insensitive to sensory feedback sent to the user. In this model system, machine learning far outpaced the human’s ability to learn from error signals, regardless of sensory feedback.

For neurophysiologists and clinicians, this work provides testable category 1 naive adaptive decoders (see Figures 2 and 3), explaining why Joint RSE is expected to dominate ReFIT variants in the final clinical application. This letter also explicitly identifies major category 1 design choices like joint estimation, prior on intention, and sensory feedback, offering experimentally verifiable predictions on their relative importance, as well as explicit algorithm formulations to use in experimental testing.

For BMI algorithmists, this work clarifies implicit assumptions of existing category 1 naive adaptive BMI. We illustrate a new method, Joint RSE, based on the logical relaxation of these assumptions. Our analysis contributes to a growing body of work that seeks to uncover the design principles of naive adaptive BMI for the benefit of patients limited by stroke, neuromuscular diseases, and trauma.

2 Methods

2.1 Human-Subjects Closed-Loop BMI Simulator. The bulk of our analysis in this letter (see Figures 4–10) is based on studying able-bodied human subjects engaged in operating a closed-loop BMI simulator. This human-subjects closed-loop simulator was previously developed elsewhere with detailed comparison to primate-based BMI (Cunningham et al., 2011). In this simulator, the role of a neural control network in a target patient (see Figure 1A) that ultimately determines motor-cortical output is played by the healthy human subject in this model system (see Figure 1B). This simulator provides a viable laboratory platform for BMI design that engages actual human sensorimotor behavior (Cunningham et al., 2011). The underlying model of primary-motor-cortical activity draws on empirically derived point-process models (Moran & Schwartz, 1999; Truccolo et al., 2005) to simulate the output layer of neurons recorded by the BMI system. The human-in-the-loop aspect of this model system provides a realistic biological simulation of sensorimotor learning and online correction.

Our implementation advances this prior work in two important ways. First, we make the system affordable and accessible by using a Microsoft Kinect (currently US$100) for markerless arm tracking instead of the Northern Digital Polaris optical tracking system (currently estimated at US$60,000) employed previously (Cunningham et al., 2011). We have also released our MATLAB wrappers for the open-source Kinect drivers to help readers implement this simulation platform (Kowalski, 2012). Implementation is discussed in section 2.1.1.

Second, we modify the initial conditions of the human-subjects closed-loop simulator to allow human sensorimotor learning (see Figures 7–9 and 10B). Our approach is based on the visuomotor rotation task, previously employed in a study of motor learning (Krakauer & Mazzoni, 2011). In this task, visual feedback about movement is rotated by a fixed angle. For example, in a task involving point-to-point two-dimensional reaching movements with an on-screen cursor, the velocity of the cursor can be rotated by 70 degrees clockwise. In attempting reaching movements under this visual rotation, subjects can learn to adjust their arm movements to compensate for this rotation based on errors they observe through the visual feedback of a computer display (Krakauer & Mazzoni, 2011). Our modification based on visuomotor rotation is discussed in section 2.1.2.

2.1.1 How Our Kinect-Based Human-Subjects Closed-Loop Simulator Works. In our version of the previously described human-based model system (Cunningham et al., 2011), we use recently developed open-source motion-capture code (OpenNI and PrimeSense NITE) for the Kinect 3D camera to digitize arm movements made by healthy human subjects. Although the Kinect specification describes a motion capture rate of 30 Hz, we observed that the Kinect-based software wrapper for MATLAB occasionally caused the first time step of every trial to hang for 150 ms, which was generally imperceptible to the user. This event was detected and discarded in calculating arm velocities. Our code for interfacing MATLAB to the Kinect for motion capture is freely available online, together with a brief tutorial (Kowalski, 2012).

[Figure 1 diagram. Panel A, Complete Brain-Machine Interface: neural control network, primary motor cortex, ensemble action potentials (n), neuroprosthetic device, device kinematics (x). Panel B, Human Subjects Closed-Loop Simulator: simulated neural control network, user arm movements, Kinect, intended velocity (u), sim. primary motor cortex, ensemble action potentials (n), neuroprosthetic device, device kinematics (x). Panel C, Synthetic Subjects Closed-Loop Simulator: simulated neural control network (L-Q controller), sim. primary motor cortex, ensemble action potentials (n), neuroprosthetic device, device kinematics (x).]

Figure 1: Closed-loop brain-machine interface (BMI) operation in practice and with two models. (A) Actual BMI system. The subject controls the BMI through an output layer with tens of primary motor cortical neurons, driven by inputs from a larger neural control network, with various recurrent connections. (B) Model system for closed-loop BMI operation based on human subjects. Here, the neural control network is represented by a healthy human subject, observing on-screen cursor kinematics, and adjusting arm movements captured by the Kinect, where arm velocity in the plane orthogonal to the camera represents intended velocity ($u_k$). An empirically derived cosine-tuned point-process model of motor cortical neurons converts intended velocity into spiking events from 25 neurons. Actual and decoder estimated neural parameter values are redrawn at the beginning of every learning period. (C) Model system for closed-loop BMI operation based on a synthetic subject implemented by a linear-quadratic controller, modified from the recently described original stochastic optimal control model for closed-loop BMI operation (Lagang & Srinivasan, 2013).

The human subject’s arm movements specify an internal representation of intended velocity. For simplicity, this velocity is captured in the 2D plane that is orthogonal to the Kinect, where $v_x$ and $v_y$ are velocities in orthogonal directions. The user’s 2D arm velocity drives a standard empirically derived cosine-tuned point-process model of motor-cortical activity (Moran & Schwartz, 1999; Truccolo et al., 2005). The conditional intensity $\lambda(k \mid v_x, v_y)$ of this point-process model defines the probability with which a neuron generates a spike for the intended arm movement at time step $k$ in terms of $v_x$ and $v_y$:

$$\lambda(k \mid v_x, v_y) = \exp\left(\beta_0 + \beta_1 (v_x^2 + v_y^2)^{1/2} \cos(\theta - \theta_p)\right) \tag{2.1}$$

$$= \exp(a v_x + b v_y + c). \tag{2.2}$$

This relationship can be expressed equivalently in polar or Cartesian form, where $\theta$ is the direction of the intended velocity, $\theta_p$ is the neuron's preferred direction, $c = \beta_0$, $a = \beta_1 \cos\theta_p$, and $b = \beta_1 \sin\theta_p$. History dependence in spiking patterns (Truccolo et al., 2005) can be readily accommodated, as illustrated previously (Srinivasan et al., 2007).
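The equivalence of the polar form (equation 2.1) and the Cartesian form (equation 2.2) can be checked numerically. The Python sketch below is illustrative only; the function names and the example parameter values are assumptions chosen for the demonstration, not values from the experiments.

```python
import math

def intensity_polar(vx, vy, beta0, beta1, theta_p):
    """Equation 2.1: rate from speed and angle relative to preferred direction."""
    speed = math.hypot(vx, vy)
    theta = math.atan2(vy, vx)  # direction of the intended velocity
    return math.exp(beta0 + beta1 * speed * math.cos(theta - theta_p))

def intensity_cartesian(vx, vy, a, b, c):
    """Equation 2.2: the same rate, log-linear in the velocity components."""
    return math.exp(a * vx + b * vy + c)

def polar_to_cartesian(beta0, beta1, theta_p):
    """Map (beta0, beta1, theta_p) to (a, b, c) as defined after equation 2.2."""
    return beta1 * math.cos(theta_p), beta1 * math.sin(theta_p), beta0

# Illustrative parameters (hypothetical): velocities in cm/sec, rates in spikes/sec.
beta0, beta1, theta_p = 2.5, 0.04, math.radians(30.0)
a, b, c = polar_to_cartesian(beta0, beta1, theta_p)
rate_polar = intensity_polar(12.0, -5.0, beta0, beta1, theta_p)
rate_cartesian = intensity_cartesian(12.0, -5.0, a, b, c)
```

The two rates agree because $\beta_1 |v| \cos(\theta - \theta_p) = \beta_1 \cos\theta_p \, v_x + \beta_1 \sin\theta_p \, v_y$. The Cartesian form is convenient for filtering because the log intensity is linear in both the velocity and the parameters.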

All experiments include an ensemble size of 25 neurons, with tuning curve parameters drawn at random with every new learning session. In our purely randomized initial conditions (see Figures 4–6), parameters were chosen to result in a baseline firing rate drawn uniformly from 10 to 20 spikes per second, and a maximum firing rate drawn uniformly from 25 to 40 spikes per second at a speed of 20 cm/sec. This corresponds to $\beta_0 \in [2.3, 3]$, $\beta_1 \in [0.0112, 0.0693]$, and $\theta_p \in [0^\circ, 360^\circ]$, where units on these parameters are concordant with the use of cm/sec for velocity and spikes/sec for firing rate.
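The stated parameter ranges are consistent with reading the baseline rate as $\exp(\beta_0)$ and the maximum rate as $\exp(\beta_0 + 20\beta_1)$, since $\ln 10 \approx 2.3$, $\ln 20 \approx 3.0$, $\ln(25/20)/20 \approx 0.0112$, and $\ln(40/10)/20 \approx 0.0693$. The Python sketch below draws one ensemble under that reading. The function name, the seed, and the independence of the three draws are assumptions for illustration.

```python
import math
import random

REFERENCE_SPEED = 20.0  # cm/sec, the speed at which the maximum rate is specified
N_NEURONS = 25

def draw_tuning_parameters(rng):
    """Draw (beta0, beta1, theta_p) for one neuron from the stated rate ranges.

    Assumption: baseline rate, maximum rate, and preferred direction are
    drawn independently of one another.
    """
    base_rate = rng.uniform(10.0, 20.0)  # spikes/sec at zero velocity
    max_rate = rng.uniform(25.0, 40.0)   # spikes/sec at 20 cm/sec in the preferred direction
    theta_p = rng.uniform(0.0, 2.0 * math.pi)
    beta0 = math.log(base_rate)
    beta1 = math.log(max_rate / base_rate) / REFERENCE_SPEED
    return beta0, beta1, theta_p

rng = random.Random(0)  # fixed seed only so the sketch is repeatable
ensemble = [draw_tuning_parameters(rng) for _ in range(N_NEURONS)]
```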

Because this model does not specify a maximum firing rate, fast arm movements can drive neurons to fire at unrealistically high rates. To counteract this problem, Cunningham et al. (2011) set a maximum firing rate. Individual time bins are sized to match the maximum firing rate so that they most likely contain either 0 or 1 spikes. Consequently, spike count is reasonably simulated as a Bernoulli random variable with event probability modulated by intended velocity. Here, we choose a maximum allowed firing rate of 30 spikes per second, a reasonable approximation for primary-motor-cortical neurons (Richardson, Borghi, & Bizzi, 2012; Truccolo et al., 2005), although this is not an actual hard upper bound in the brain. This maximum rate also matches the temporal resolution of the Kinect system, which acquires arm coordinates at approximately 30 Hz.
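A minimal Python sketch of the Bernoulli spike simulation follows. It assumes that the maximum rate is enforced by clipping the conditional intensity at 30 spikes per second and that the bin width is 1/30 sec, so the per-bin spike probability never exceeds 1. The clipping rule and the function names are assumptions; the text does not specify how the cap is applied.

```python
import math
import random

MAX_RATE = 30.0         # spikes/sec, maximum allowed firing rate
BIN_WIDTH = 1.0 / 30.0  # sec, one bin per Kinect sample

def spike_probability(vx, vy, a, b, c):
    """Per-bin spike probability from the capped conditional intensity."""
    rate = min(math.exp(a * vx + b * vy + c), MAX_RATE)  # assumed clipping rule
    return rate * BIN_WIDTH

def simulate_spike(vx, vy, a, b, c, rng):
    """Draw a single Bernoulli spike (0 or 1) for one neuron in one time bin."""
    return 1 if rng.random() < spike_probability(vx, vy, a, b, c) else 0

rng = random.Random(0)
# A neuron with a 15 spikes/sec baseline fires in about half of the bins at rest.
p_rest = spike_probability(0.0, 0.0, 0.05, 0.0, math.log(15.0))
# A very fast movement saturates at the cap rather than at an unrealistic rate.
p_fast = spike_probability(100.0, 0.0, 0.05, 0.0, math.log(15.0))
spikes = [simulate_spike(5.0, 0.0, 0.05, 0.0, math.log(15.0), rng) for _ in range(300)]
```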

Spike simulation and decoding were performed on a desktop computer (3.4 GHz Intel Quad Core, 16 GB RAM), with a total latency less than 30 ms, accommodating real-time performance with the 30 Hz Kinect refresh rate. Decoded cursor movements were displayed to the user on a standard LCD monitor with 60 Hz refresh. Visual feedback was rudimentary, depicting two-dimensional cursor movements rendered in MATLAB.

2.1.2 Simulator Modification Based on Visuomotor Rotation to Permit Human Learning. We modified the human-subjects simulator to permit human learning by adapting the visuomotor rotation task, previously employed in the study of motor learning (Krakauer & Mazzoni, 2011). To achieve this, we changed the BMI parameter initial conditions as follows.

Recall that in Figures 4 to 6 and 10A, BMI neuron parameters are drawn at random for 25 neurons with $\beta_0 \in [2.3, 3]$, $\beta_1 \in [0.0112, 0.0693]$, and $\theta_p \in [0^\circ, 360^\circ]$, as described in section 2.1.1. In our modification (see Figures 7–9 and 10B), decoder estimates of preferred direction in a subset $R$ of the 25 neurons are rotated by a single angle from their true values rather than randomly assigned. This single angle of rotation is uniformly drawn at the beginning of each learning session from $[-75^\circ, -60^\circ] \cup [60^\circ, 75^\circ]$. For these neurons in $R$, the decoder $\beta_0$ and $\beta_1$ parameters are fixed at their true values. Decoder parameters for all other neurons are generated as before.

In our human-subjects closed-loop simulator experiments with partially rotated initialization described above (see Figures 7–9 and 10B), we chose $R$ to contain 8 of the 25 neurons (32% of neurons rotated). Based on preliminary testing, this strikes a balance between a trivial task (100% rotated) and an unreasonably hard task (0% rotated) over 26 trials within a learning session where machine learning is frozen.
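The partially rotated initialization can be sketched in Python as follows. In the Cartesian parameterization of equation 2.2, rotating the preferred direction by an angle while keeping $\beta_0$ and $\beta_1$ fixed amounts to rotating the pair $(a, b)$ and leaving $c$ unchanged. The function names are hypothetical, and two details are assumptions not stated in the text: the members of $R$ are chosen at random, and the randomly generated decoder parameters for the remaining neurons are supplied by the caller.

```python
import math
import random

def draw_rotation_angle(rng):
    """Uniform draw from [-75, -60] U [60, 75] degrees, returned in radians.

    The two intervals have equal length, so a fair sign flip combined with a
    uniform magnitude is uniform on their union.
    """
    magnitude = rng.uniform(60.0, 75.0)
    sign = rng.choice([-1.0, 1.0])
    return math.radians(sign * magnitude)

def rotate_preferred_direction(a, b, angle):
    """Rotate (a, b) = beta1 * (cos theta_p, sin theta_p); beta1 is preserved."""
    cos_t, sin_t = math.cos(angle), math.sin(angle)
    return a * cos_t - b * sin_t, a * sin_t + b * cos_t

def initialize_decoder(true_params, random_params, n_rotated, rng):
    """Decoder starts rotated on a subset R and randomized on all other neurons."""
    angle = draw_rotation_angle(rng)  # one angle shared by every neuron in R
    rotated = set(rng.sample(range(len(true_params)), n_rotated))  # assumed random choice of R
    decoder = []
    for i, (a, b, c) in enumerate(true_params):
        if i in rotated:
            a_rot, b_rot = rotate_preferred_direction(a, b, angle)
            decoder.append((a_rot, b_rot, c))  # c = beta0 stays at its true value
        else:
            decoder.append(random_params[i])
    return decoder, rotated, angle

rng = random.Random(1)
true_params = [(0.03 * math.cos(i), 0.03 * math.sin(i), 2.5) for i in range(25)]
random_params = [(0.0, 0.0, 2.5)] * 25  # stand-in for randomly drawn decoder values
decoder, rotated, angle = initialize_decoder(true_params, random_params, 8, rng)
```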

2.2 Adaptive Point-Process Filter. This section is purely a review of an approximate adaptive point-process filter, originally described elsewhere (Eden et al., 2004), which we provide for readers’ convenience. This review also introduces a consistent set of variables and filter equations that will subsequently appear in our unified perspective (see Figures 2 and 3) on various naive adaptive BMI training methods, including previously described variants (Dangi et al., 2011; Orsborn et al., 2012) of ReFIT (Gilja et al., 2010, 2012) and our proposed method, Joint RSE.

The point-process filter translates spiking neural activity into estimates of user intent that drive changes in the state of the assistive device, such as cursor velocity. It also uses this neural activity to estimate parameters of neuron tuning curves. Although experiments performed in this letter refer to spiking neural activity, the concepts introduced here apply directly to any biological signals that reflect user intent, including electroencephalography (EEG) and electromyography (EMG).

2.2.1 State Equations and Observation Models. The point-process filter is a type of recursive Bayesian estimation that requires a latent (hidden) variable model (also called a state equation) and an observation model. The latent variable is a random vector $x_k$ that includes either user intention or neural parameters, or both. The state equation describes how the latent variable is expected to evolve one time step into the future. The observation model describes the relationship between the latent variable and neural activity. In particular, the observation model specifies the probability of observing a pattern of spiking across the neural ensemble at time step $k$, which is determined by each neuron’s tuning curve, embodied in the conditional intensity function introduced in equation 2.2.

One example of a state equation is the discrete-time linear gaussian process, indexed by time step $k$:

$$x_{k+1} = F_k x_k + w_k, \tag{2.3}$$

where $F_k$ is a state evolution matrix and $w_k$ is zero-mean gaussian noise with covariance matrix $Q_k$. Each training method uses a different state equation or set of state equations, which we discuss in section 2.4.

2.2.2 Filter Equations. Filter equations specify how observations are used to compute estimates of the latent variable. In our example, these equations determine how spiking activity results in a cursor movement and how neural tuning curve parameters are learned. A more expansive introduction to the approximate adaptive point-process filter equations is provided elsewhere (Eden et al., 2004). In this section, we review only the essential equations, using the random walk state equation as a basic example.

The neural activity observed at time step k is denoted nk, and the la-tent variable is denoted xk. The history of neural activity is denoted Hk =[nk−1 nk−2, . . . , n1]. Define the prediction density p(xk+1|Hk+1) with meanxk+1|k and variance Wk+1|k. Define the posterior density p(xk+1|nk+1, Hk+1)

with mean xk+1|k+1 and variance Wk+1|k+1.The filter equations are divided into a prediction step, equations 2.4 and

2.5, to calculate xk+1|k and Wk+1|k and a subsequent update step, equations2.6 and 2.7, to calculate xk+1|k+1 and Wk+1|k+1:

xk+1|k = Fkxk|k, (2.4)

Wk+1|k = FkWk|kFTk + Qk, (2.5)

(Wk+1|k+1)−1 = (Wk+1|k)

−1 +N∑

j=1

[ (∂ log λ

jk

∂xk

)[λ j

k�]

(∂ log λ

jk

∂xk

)T

− (njk − λ

jk�)

∂2 log λjk

∂xk∂xTk

]xk+1|k

, (2.6)

xk+1|k+1 = xk+1|k + Wk+1|k+1

N∑j=1

[(∂ log λ

jk

∂xk

)(nj

k − λjk�)

]xk+1|k

. (2.7)

Page 11: Dynamic Analysis of Naive Adaptive Brain-Machine …ai.stanford.edu/~bryanhe/publications/bci_analysis_nec… ·  · 2016-06-20design are still needed to facilitate the elementary

Dynamic Analysis of Naive Adaptive BMI 2383

In equations 2.6 and 2.7, $\lambda_k^j$ is the conditional intensity function in equation 2.2, corresponding to neuron $j$ evaluated at $x_{k+1|k}$. $n_k^j$ is the number of spikes (0 or 1) produced by neuron $j$ at time step $k$, and $\Delta$ is the time step duration, set at 33 ms in our experiments.
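The prediction and update steps of equations 2.4 to 2.7 can be sketched for a scalar latent variable. This is a minimal illustration rather than the implementation used in the experiments: it assumes a log-linear intensity $\lambda^j = \exp(a_j + b_j x)$ as a stand-in for equation 2.2, so that $\partial \log \lambda^j / \partial x = b_j$ and the second-derivative term in equation 2.6 vanishes. The function name `ppf_step`, its defaults `f` and `q`, and the two example neurons are hypothetical.

```python
import math

DT = 0.033  # time step duration Delta in seconds (33 ms, as stated in the text)


def ppf_step(x_post, w_post, spikes, a, b, f=1.0, q=1e-3):
    """One predict/update cycle of equations 2.4 to 2.7 for a scalar latent x.

    Assumption (illustration only): neuron j has the log-linear intensity
    lambda_j = exp(a[j] + b[j] * x), so d(log lambda_j)/dx = b[j] and the
    second derivative of log lambda_j is zero.
    """
    # Prediction step, equations 2.4 and 2.5 in scalar form.
    x_pred = f * x_post
    w_pred = f * w_post * f + q

    # Update step, equation 2.6: add the information carried by each neuron.
    info = 1.0 / w_pred
    innovation = 0.0
    for n_j, a_j, b_j in zip(spikes, a, b):
        lam = math.exp(a_j + b_j * x_pred)  # intensity evaluated at x_pred
        info += b_j * (lam * DT) * b_j      # curvature term is zero here
        innovation += b_j * (n_j - lam * DT)

    # Update step, equation 2.7.
    w_new = 1.0 / info
    x_new = x_pred + w_new * innovation
    return x_new, w_new, w_pred


# Two hypothetical neurons with opposite tuning; only the first one spikes.
x_new, w_new, w_pred = ppf_step(
    0.0, 0.01, spikes=[1, 0], a=[1.0, 1.0], b=[2.0, -2.0]
)
```

In this toy case, the spike from the positively tuned neuron moves the estimate in the positive direction, and the posterior variance falls below the prediction variance.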

2.2.3 Joint Estimation versus Lockstep Estimation. When neural signal parameters and user intention are both unknown, these latent variables can be estimated using equations 2.4 to 2.7. In joint estimation, parameters and intention are simultaneously estimated by including both quantities in the same latent variable vector $x_k$. This has the beneficial property of allowing uncertainty in neural parameters to inform estimates of user intent. For example, if neural parameters are poorly learned, this is reflected in a large covariance on the posterior density of the neural parameters, and user intention is estimated with greater reliance on the state equation. Conversely, if user intention is poorly known, neural parameters are adjusted more cautiously.

Under joint estimation, the latent variable (state vector) that includes both parameters and kinematic intention in the 2D work space is

$$x_k = [a_{1k} \;\; b_{1k} \;\; c_{1k} \;\; a_{2k} \;\; \ldots \;\; c_{Nk} \;\; p_k^x \;\; p_k^y \;\; v_k^x \;\; v_k^y]^T, \tag{2.8}$$

where $a_{ik}$, $b_{ik}$, and $c_{ik}$ refer to the three neural signal parameters (see equation 2.2) for cosine-tuned motor neuron $i$ at time step $k$. The last four entries of the state vector, $p_k^x$, $p_k^y$, $v_k^x$, and $v_k^y$, represent the intended 2D position and velocity, respectively.

In lockstep estimation, these quantities are sequentially estimated at each time step, using equations 2.4 to 2.7 separately for estimating intent and parameters. In the first stage, user intention corresponding to latent variable $x_k = [p_k^x \;\; p_k^y \;\; v_k^x \;\; v_k^y]^T$ is estimated by assuming that the current estimate of neural parameters is exact. In the second stage, neural parameter estimates corresponding to latent variable $x_k = [a_{1k} \;\; b_{1k} \;\; c_{1k} \;\; a_{2k} \;\; \ldots \;\; c_{Nk}]^T$ are updated by assuming that the current estimate of user intention is exact. These stages can be reversed. The design philosophy underlying lockstep estimation, called certainty equivalence, is the pervasive approach in BMI algorithm design, including all methods based on training data that use overt movements (Santhanam et al., 2006; Serruya et al., 2002) or instructed motor imagery (Bradberry et al., 2011; Hochberg et al., 2006; Kim et al., 2008), as well as previously developed naive adaptive control methods (Dangi et al., 2011; Gage et al., 2005; Orsborn et al., 2012). Certainty equivalence means that when parameters are estimated, the current estimate of intent is assumed to be the true intent (i.e., equivalent to being known with certainty), and vice versa (Bertsekas, 2005).
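The difference between the joint and lockstep latent variables is largely a matter of how the state vector is laid out. The sketch below shows one possible layout that follows the ordering of equation 2.8. The helper names `pack_joint_state` and `unpack_joint_state` and the numerical values are hypothetical and serve only as illustration.

```python
def pack_joint_state(neural_params, kinematics):
    """Stack per-neuron (a, b, c) triples, then [px, py, vx, vy] (equation 2.8)."""
    state = []
    for a, b, c in neural_params:
        state.extend([a, b, c])
    state.extend(kinematics)
    return state


def unpack_joint_state(state, n_neurons):
    """Split a joint state vector into the two lockstep latent variables."""
    flat = state[:3 * n_neurons]
    params = [tuple(flat[3 * i:3 * i + 3]) for i in range(n_neurons)]
    kinematics = state[3 * n_neurons:]
    return params, kinematics


# Hypothetical values for N = 2 neurons.
example_params = [(0.1, 0.2, 0.3), (0.4, 0.5, 0.6)]
example_kinematics = [1.0, 2.0, 3.0, 4.0]  # [px, py, vx, vy]
joint_state = pack_joint_state(example_params, example_kinematics)
```

Under joint estimation, a single filter operates on `joint_state` (length 3N + 4). Under lockstep estimation, one filter operates on the four kinematic entries and a second filter on the 3N parameter entries, each treating the other's current estimate as exact.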


2.3 Closed-Loop Filtering. The language of the state equation can be adjusted slightly to reflect conditions under closed-loop BMI operation. This realization is a more recent advance (Dangi et al., 2011; Gilja et al., 2010, 2012) that followed earlier uses of the recursive Bayesian framework (Srinivasan et al., 2007; Srinivasan, Eden, Willsky, & Brown, 2005, 2006; Yu et al., 2007). At time step k, the on-screen cursor reflects $x_{k|k}$. Under the assumption that sensory feedback is adequate to provide the user a faithful representation of the on-screen cursor state, we can assume that the user’s intention for time step k + 1 is based on the cursor state in time step k. As such, the state equation, 2.3, should actually express $x_{k+1}$, the user’s intention at time step k + 1, in terms of the actual on-screen cursor state $x_{k|k}$. This subtle adjustment now incorporates sensory feedback into the recursive Bayesian framework for BMI design. Although this adjustment (Dangi et al., 2011; Gilja et al., 2010, 2012) represents a new BMI design strategy, practically speaking, this change simply results in reducing $W_{k|k}$ to zero in equation 2.5. In practice, we found that a small, nonzero assigned value for $W_{k|k}$ (such as with entries on the diagonal measuring $10^{-5}$ cm$^2$ or $10^{-3}$ cm$^2$/s$^2$) was needed to ensure numerical stability in our MATLAB implementation.
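The practical effect of the closed-loop adjustment on equation 2.5 can be sketched as follows, using the cursor kinematic matrices of equations 2.10 and 2.11. This is an illustrative Python sketch and not the MATLAB implementation mentioned above. The function name `closed_loop_prediction_cov` is hypothetical, and converting the quoted floor values from cm$^2$ to m$^2$ so that they match the units of $\delta$ is an assumption.

```python
DT = 0.033        # time step Delta (s)
DELTA = 1e-3      # velocity noise variance delta from equation 2.11 (m^2/s^2)
CM2_TO_M2 = 1e-4  # unit conversion; the floor values are quoted in cm^2

F = [[1.0, 0.0, DT, 0.0],
     [0.0, 1.0, 0.0, DT],
     [0.0, 0.0, 1.0, 0.0],
     [0.0, 0.0, 0.0, 1.0]]
Q = [[0.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, DELTA, 0.0],
     [0.0, 0.0, 0.0, DELTA]]


def matmul(a, b):
    """Plain-Python matrix product of two nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def closed_loop_prediction_cov(pos_floor_cm2=1e-5, vel_floor_cm2=1e-3):
    """Equation 2.5 with W_{k|k} replaced by a small diagonal floor."""
    p = pos_floor_cm2 * CM2_TO_M2
    v = vel_floor_cm2 * CM2_TO_M2
    w_floor = [[p, 0.0, 0.0, 0.0],
               [0.0, p, 0.0, 0.0],
               [0.0, 0.0, v, 0.0],
               [0.0, 0.0, 0.0, v]]
    fwf = matmul(matmul(F, w_floor), transpose(F))
    return [[fwf[i][j] + Q[i][j] for j in range(4)] for i in range(4)]


W_PRED = closed_loop_prediction_cov()
```

With the posterior covariance held at this floor, the prediction covariance is dominated by $Q_k$, so the prior on the next intention is centered on the displayed cursor state rather than carrying forward accumulated uncertainty.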

2.4 Naive Adaptive BMI Training Paradigms. In this section, we discuss five active BMI paradigms that we construct to understand why naive adaptive control works and dissect the relative importance of joint estimation versus sensory feedback in this process. Each training method varies in three respects: joint estimation versus lockstep estimation, state equation used to perform neural parameter estimation, and state equation used to determine on-screen cursor position (see Figures 2 and 3). All methods update neural signal parameter estimates at the resolution of neural signal acquisition, which is 33 ms in our simulation framework.

2.4.1 ReFIT-PPF: Representative for Existing Category 1 Naive Adaptive BMI. The ReFIT (also called cursorGoal) training method (Gilja et al., 2010, 2012) and its variants (Dangi et al., 2011; Orsborn et al., 2012) use lockstep estimation. In this two-step process, a cursor estimation filter first determines the on-screen cursor position. Subsequently, a parameter estimation filter updates neural parameter estimates. The frequency of alternation between cursor estimation and parameter estimation may determine rates of learning (Orsborn et al., 2012). For simplicity of exposition, this letter focuses on the implementation where cursor and parameter estimates are updated at each time step (Dangi et al., 2011).

The cursor estimation filter produces an estimate $[v^x_{k|k}, v^y_{k|k}]$ for the velocity at time step k from neural activity $n_k$ by assuming that the current estimate of the neural parameters is correct. This estimate is used to determine the on-screen cursor movement. The parameter estimation filter updates the neural parameters. To accomplish this, $[v^x_{k|k}, v^y_{k|k}]$ is rotated to point toward the target, retaining its magnitude. This estimate of intended velocity is then used in the lockstep filter component that updates neural parameters.

[Figure 2 panels. Columns: ReFIT Variants, Joint RSE, Random Walk, Static. Rows: Estimation Procedure (Lockstep: 1. estimate x, 2. estimate Θ; Joint; Joint; Partial: 1. estimate x), Prior on Intended Velocity, Feedback to User. Legend: Target, Previous Decoded Velocity, Velocity Prediction Density, Decoded Velocity.]

Figure 2: Naive adaptive control variants with directed and undirected priors. The ReFIT variants, Joint RSE, RW, and static training methods differ in three elemental ways, as listed in the row labels: joint versus lockstep estimation, prior on intended velocity (also called the state equation or latent variable model), and the control of visual feedback to the user (cursor movement). The RW uses an undirected prior, where ReFIT-PPF and Joint RSE use different directed priors, as defined in section 3. The various training paradigms are explained in detail in section 2.

A rationale for using the original unrotated $[v^x_{k|k}, v^y_{k|k}]$ to determine on-screen cursor movement is that it provides error feedback that informs the user about the BMI algorithm parameters. It is suggested that error feedback could potentially accelerate human learning. Note that running the cursor estimation filter alone without parameter updates is equivalent to static training (Ganguly & Carmena, 2010). The rationale for rotating $[v^x_{k|k}, v^y_{k|k}]$ in the parameter estimation filter is that neural firing reflects the user’s intention to move from the current cursor position toward the known target during training, so neural parameters should be tuned to reflect this known task constraint.

[Figure 3 panels. Columns: Joint RSE, Lockstep RSE/RSE, Lockstep RSE/RW. Rows: Estimation Procedure (Joint; Lockstep: 1. estimate $x_{k+1|k+1}$, 2. estimate $\Theta_{k+1|k+1}$; Lockstep: 1. estimate $x_{k+1|k+1}$, 2. estimate $\Theta_{k+1|k+1}$), Prior on Intended Velocity, Feedback to User. Legend: Target, Previous Decoded Velocity, Velocity Prediction Density, Decoded Velocity.]

Figure 3: Naive adaptive control variants to dissect the relative importance of joint estimation versus sensory feedback. To understand the relative contribution of joint estimation and feedback to improved naive adaptive control with Joint RSE, we constructed two control methods. Lockstep RSE/RSE is nearly identical to Joint RSE except that lockstep estimation is used. Lockstep RSE/RW differs from Joint RSE in the use of lockstep estimation and the determination of cursor movement by a random walk prior (rather than the reach state equation). In its control of feedback (cursor movement), the Lockstep RSE/RW is identical to ReFIT-PPF.

Implicit in this logic is zero effective sensorimotor delay: the assumption that neural signals representing user-intended velocity instantly reflect the displacement vector between cursor position and target. Despite intrinsic brain and machine hardware delays, the user might attempt to achieve zero effective sensorimotor delay in adopting control strategies that exploit internal models of closed-loop dynamics (Golub et al., 2012). Figure 11 in our analysis will illustrate that the zero effective sensorimotor delay assumption is a major vulnerability in ReFIT that is mitigated by our proposed method, Joint RSE.

For the cursor estimation filter, the state vector takes the form

$$x_k = [p_k^x \;\; p_k^y \;\; v_k^x \;\; v_k^y]^T. \tag{2.9}$$

The matrices $F_k$ and $Q_k$ in equations 2.4 and 2.5 are

$$F_k = \begin{bmatrix} 1 & 0 & \Delta & 0 \\ 0 & 1 & 0 & \Delta \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}, \tag{2.10}$$

$$Q_k = \begin{bmatrix} 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 \\ 0 & 0 & \delta & 0 \\ 0 & 0 & 0 & \delta \end{bmatrix}, \tag{2.11}$$

for all k, with $\Delta = 33$ ms and $\delta = 1 \times 10^{-3}$ m$^2$/s$^2$. During the update step, equations 2.6 and 2.7, the neural parameter estimates $a_{1k}, b_{1k}, \ldots$ are assumed to be the correct values of the neural parameters when evaluating the partial derivatives of $\lambda$. The resulting velocity estimate $[v^x_{k|k}, v^y_{k|k}]^T$ determines cursor movement.

For the parameter estimation filter, the state vector takes the form

$$x_k = [a_{1k} \;\; b_{1k} \;\; c_{1k} \;\; a_{2k} \;\; \ldots \;\; c_{Nk}]^T. \tag{2.12}$$

In this filter, $F_k$ in equation 2.4 is the $3N \times 3N$ identity matrix for all k, and $Q_k$ in equation 2.5 is zero for all k. During the update step, the velocity $[v^x_{k|k}, v^y_{k|k}]^T$ decoded by the cursor estimation filter is rotated to point in the direction of the target (while preserving decoded magnitude). This rotated velocity is used in place of the decoded velocity when evaluating $\lambda_k^j$ and its derivatives in equations 2.6 and 2.7 for the parameter estimation filter.
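The rotation step used by the parameter estimation filter can be sketched in a few lines. This is a minimal illustration under stated assumptions: the function name `rotate_toward_target` is hypothetical, the direction is taken from the current cursor position to the target, and leaving the velocity unchanged when the cursor sits exactly on the target is a choice made here and not one specified in the text.

```python
import math


def rotate_toward_target(velocity, cursor_pos, target_pos):
    """Point the decoded velocity at the target while keeping its magnitude."""
    vx, vy = velocity
    speed = math.hypot(vx, vy)
    dx = target_pos[0] - cursor_pos[0]
    dy = target_pos[1] - cursor_pos[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        # Direction is undefined on the target; assumption: leave unchanged.
        return (vx, vy)
    return (speed * dx / dist, speed * dy / dist)


# Cursor at (10, 0) with the target at the origin; decoded velocity (3, 4).
rotated = rotate_toward_target((3.0, 4.0), (10.0, 0.0), (0.0, 0.0))
```

The unrotated velocity still drives the on-screen cursor; only the parameter estimation filter sees the rotated vector.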

2.4.2 Joint RSE: Our Proposed Generalization of Category 1 Naive Adaptive BMI. In this section, we propose a novel method for naive adaptive BMI that we call Joint RSE, a method that represents a combination of two prior innovations. We previously introduced the reach state equation (RSE) as a minimalistic state-space description of reaching movements (Srinivasan et al., 2006). The RSE is equivalent to a discrete-time directed random walk. Alternatively, it can be viewed as the conditional distribution on random walks given observations of the target, computed using a Riccati equation, as seen in the Kalman filter. Elsewhere, approximate discrete-time joint estimation was developed to adaptively track neural signal parameters (Eden et al., 2004); the basic framework for this method is reviewed in section 2.2. We had previously combined joint estimation and the RSE in illustrating the capability of a novel approximate point-process filter for adapting to changing neural signal parameters (Srinivasan et al., 2007), which is different from the problem of naive BMI training. As such, the mathematical development presented in this section is essentially contained in this prior work (Eden et al., 2004; Srinivasan et al., 2006, 2007); familiarity with these papers is recommended for analyzing this letter in detail. The novel methodological insight in our work is the realization that Joint RSE represents a generalized solution of existing category 1 naive adaptive BMI. Our extensive analysis (see Figures 5–11) is directed at uncovering the basic design principles that allow Joint RSE to outperform existing category 1 naive adaptive BMI algorithms in our model systems.

Joint RSE uses joint estimation (as with the RW method) and a reach state equation (RSE) prior. The RSE provides a loosely constrained probabilistic model for how the trajectory of a reaching movement evolves over time, given partial or complete knowledge of the target location (Srinivasan et al., 2006). The RSE is the result of constraining a random walk by observations on its future state. The resulting filter simultaneously updates neural parameters and cursor kinematics using the decoded velocity to determine on-screen cursor movements. This differs from the ReFIT-PPF method in all three cardinal ways described in Figure 2: estimation procedure, prior on intended velocity, and feedback to the user.

As with the random walk method, the state vector in Joint RSE takes the form

$$x_k = [a_{1k} \;\; b_{1k} \;\; c_{1k} \;\; a_{2k} \;\; \ldots \;\; c_{Nk} \;\; p_k^x \;\; p_k^y \;\; v_k^x \;\; v_k^y]^T, \tag{2.13}$$

but the state equation is different. To define the state equation succinctly, we first define a number of quantities, discussed in greater detail elsewhere (Cajigas & Srinivasan, 2012; Srinivasan et al., 2006).

Let $\tilde{F}$ be the matrix in equation 2.10, and let $\tilde{Q}$ be the matrix in equation 2.11 so that $\tilde{F}$ and $\tilde{Q}$ are the state evolution and noise covariance matrices, respectively, that characterize the RW prior on cursor kinematics.

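Let $\phi(t, s)$ be the state transition matrix that advances the mean of the distribution on the state at time step $s$ to its mean at time step $t$:

$$\phi(t, s) = \begin{cases} \displaystyle\prod_{i=1+\min(t,s)}^{\max(t,s)} \tilde{F}^{\,\operatorname{sgn}(t-s)} & \text{if } t \neq s, \\ I & \text{if } t = s. \end{cases} \tag{2.14}$$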


Define the matrices $\Pi(t_f, t_f)$ and $\Pi(t, t_f)$ for any $t < t_f$, where $t_f$ is the total number of discrete time steps in the movement:

$$\Pi(t_f, t_f) = \Pi_f + \tilde{Q}, \tag{2.15}$$

$$\Pi(t, t_f) = \phi(t, t_f)\, \Pi_f\, [\phi(t, t_f)]^T + \sum_{i=t}^{t_f} \phi(t, i)\, \tilde{Q}\, [\phi(t, i)]^T. \tag{2.16}$$

For ease of implementation, $\Pi(t, t_f)$ can be computed recursively, where the following recursion begins with $t = t_f$:

$$\Pi(t-1, t_f) = \phi(t-1, t)\, \Pi(t, t_f)\, [\phi(t-1, t)]^T + \tilde{Q}. \tag{2.17}$$

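Using equation 2.17, we see that the final matrices of the Joint RSE state equation are written in terms of submatrices that correspond to neural parameter dimensions ($F_k^{\text{neural}}$, $Q_k^{\text{neural}}$) and cursor kinematic dimensions ($F_k^{\text{cursor}}$, $Q_k^{\text{cursor}}$):

$$F_k = \begin{bmatrix} F_k^{\text{neural}} & 0 \\ 0 & F_k^{\text{cursor}} \end{bmatrix}, \tag{2.18}$$

$$Q_k = \begin{bmatrix} Q_k^{\text{neural}} & 0 \\ 0 & Q_k^{\text{cursor}} \end{bmatrix}. \tag{2.19}$$

The neural parameter submatrices are the same as those in the RW:

$$F_k^{\text{neural}} = I_{3N}, \tag{2.20}$$

$$Q_k^{\text{neural}} = 0, \tag{2.21}$$

where $I_{3N}$ is the $3N \times 3N$ identity matrix. The cursor kinematic submatrices are

$$F_k^{\text{cursor}} = [I - \tilde{Q}\, \Pi^{-1}(k-1, t_{\text{reach}})]\, \tilde{F}, \tag{2.22}$$

$$Q_k^{\text{cursor}} = \tilde{Q} - \tilde{Q}\, \Pi^{-1}(k-1, t_{\text{reach}})\, \tilde{Q}^T, \tag{2.23}$$

for $k \le t_{\text{reach}}$, where

$$\Pi_f = \begin{bmatrix} \alpha & 0 & 0 & 0 \\ 0 & \alpha & 0 & 0 \\ 0 & 0 & \beta & 0 \\ 0 & 0 & 0 & \beta \end{bmatrix}, \tag{2.24}$$

with $\alpha = 1 \times 10^{-6}$ m$^2$ and $\beta = 1 \times 10^{-8}$ m$^2$/s$^2$.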


In our experiments, $t_{\text{reach}} = 60$ time steps, which corresponds to a reach duration of 60 time steps × 0.033 s/time step ≈ 2.0 s. For $k > t_{\text{reach}}$, the state is assumed to evolve as a random walk, with

$$F_k^{\text{cursor}} = \tilde{F}, \tag{2.25}$$

$$Q_k^{\text{cursor}} = \tilde{Q}. \tag{2.26}$$

In successful human trials, targets were typically acquired within 1.5 seconds, so the $k > t_{\text{reach}}$ regime was not typically entered.

As a side note, recall that the original formulation of the reach state equation implies a prediction mean $x_{k+1|k}$ given by

$$x^{\text{cursor}}_{k+1|k} = F_k^{\text{cursor}}\, x^{\text{cursor}}_{k|k} + f_k, \tag{2.27}$$

where

$$f_k = \tilde{Q}\, \Pi^{-1}(k-1, t_{\text{reach}})\, \phi(k-1, t_{\text{reach}})\, x_{\text{reach}}, \tag{2.28}$$

and $x_{\text{reach}}$ is the target kinematic vector at time $t_{\text{reach}}$. Because the target in our case is a static cursor position at the origin, equation 2.28 reduces to $f_k = 0$ for all k, resulting in the linear state equation implied by equation 2.18 rather than the original affine form in equation 2.27.
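The backward recursion of equation 2.17 and the cursor submatrices of equations 2.22 and 2.23 can be sketched for a single spatial dimension, with state [position, velocity]. This is a simplified illustration and not the implementation used in the experiments: the 2 × 2 matrices are one-dimensional analogues of $\tilde{F}$, $\tilde{Q}$, and the terminal covariance of equation 2.24, written $\Pi_f$ here. The helper names are hypothetical.

```python
DT = 0.033     # time step Delta (s)
DELTA = 1e-3   # random-walk velocity noise delta (m^2/s^2), equation 2.11
ALPHA = 1e-6   # target position variance alpha (m^2), equation 2.24
BETA = 1e-8    # target velocity variance beta (m^2/s^2), equation 2.24
T_REACH = 60   # reach duration in time steps

# One-dimensional analogues (state = [position, velocity]).
F_RW = [[1.0, DT], [0.0, 1.0]]
F_RW_INV = [[1.0, -DT], [0.0, 1.0]]  # phi(t - 1, t): one step backward
Q_RW = [[0.0, 0.0], [0.0, DELTA]]
PI_F = [[ALPHA, 0.0], [0.0, BETA]]
IDENTITY = [[1.0, 0.0], [0.0, 1.0]]


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]


def add(a, b, sign=1.0):
    return [[a[i][j] + sign * b[i][j] for j in range(2)] for i in range(2)]


def inv2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def pi_matrices(t_reach):
    """Return {t: Pi(t, t_reach)} for t = 0..t_reach."""
    pi = {t_reach: add(PI_F, Q_RW)}        # equation 2.15
    for t in range(t_reach, 0, -1):        # equation 2.17
        back = matmul(matmul(F_RW_INV, pi[t]), transpose(F_RW_INV))
        pi[t - 1] = add(back, Q_RW)
    return pi


def rse_cursor_matrices(k, pi):
    """Equations 2.22 and 2.23 for 1 <= k <= t_reach."""
    gain = matmul(Q_RW, inv2(pi[k - 1]))
    f_cursor = matmul(add(IDENTITY, gain, sign=-1.0), F_RW)
    q_cursor = add(Q_RW, matmul(gain, transpose(Q_RW)), sign=-1.0)
    return f_cursor, q_cursor


PI = pi_matrices(T_REACH)
F_CURSOR_1, Q_CURSOR_1 = rse_cursor_matrices(1, PI)
```

In this one-dimensional sketch, the velocity row of `F_CURSOR_1` has a negative entry in the position column, which is how the prior directs predicted velocity toward a target at the origin, and the velocity noise in `Q_CURSOR_1` is no larger than the random-walk value.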

2.4.3 Random Walk: Testing the Effect of Undirected State Equations. We define this training method as a control to examine the effect of directed versus undirected state equations in supporting parameter learning (see section 3.1). This method is a negative control and is expected to fail. The RW training method uses joint estimation, with a random walk state equation, 2.3, where this single-state equation determines both neural parameter estimation and on-screen cursor position. When we use the joint latent variable defined in equation 2.8, the matrices $F_k$ and $Q_k$ in this state equation are

$$F_k = \begin{bmatrix} I_{3N} & \cdots & 0 & 0 & 0 & 0 \\ \vdots & \ddots & \vdots & \vdots & \vdots & \vdots \\ 0 & \cdots & 1 & 0 & \Delta & 0 \\ 0 & \cdots & 0 & 1 & 0 & \Delta \\ 0 & \cdots & 0 & 0 & 1 & 0 \\ 0 & \cdots & 0 & 0 & 0 & 1 \end{bmatrix}, \tag{2.29}$$

$$Q_k = \begin{bmatrix} 0 & \cdots & 0 & 0 & 0 & 0 \\ \vdots & \ddots & \vdots & \vdots & \vdots & \vdots \\ 0 & \cdots & 0 & 0 & 0 & 0 \\ 0 & \cdots & 0 & 0 & 0 & 0 \\ 0 & \cdots & 0 & 0 & \delta & 0 \\ 0 & \cdots & 0 & 0 & 0 & \delta \end{bmatrix}, \tag{2.30}$$

for all k, where $I_{3N}$ is the $3N \times 3N$ identity matrix, with $\Delta = 33$ ms and $\delta = 1 \times 10^{-3}$ m$^2$/s$^2$.
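The block structure of equations 2.29 and 2.30 can be assembled programmatically, as in the following minimal sketch. The helper names (`identity`, `zeros`, `block_diag`, `rw_joint_matrices`) are hypothetical, and the example uses N = 2 neurons only to keep the matrices small.

```python
def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]


def zeros(n):
    return [[0.0] * n for _ in range(n)]


def block_diag(top, bottom):
    """Place two square matrices on the diagonal of a larger zero matrix."""
    n_top, n_bot = len(top), len(bottom)
    rows = [list(row) + [0.0] * n_bot for row in top]
    rows += [[0.0] * n_top + list(row) for row in bottom]
    return rows


def rw_joint_matrices(n_neurons, dt=0.033, delta=1e-3):
    """Build F_k and Q_k of equations 2.29 and 2.30 for the joint RW filter."""
    f_cursor = [[1.0, 0.0, dt, 0.0],
                [0.0, 1.0, 0.0, dt],
                [0.0, 0.0, 1.0, 0.0],
                [0.0, 0.0, 0.0, 1.0]]
    q_cursor = [[0.0, 0.0, 0.0, 0.0],
                [0.0, 0.0, 0.0, 0.0],
                [0.0, 0.0, delta, 0.0],
                [0.0, 0.0, 0.0, delta]]
    f_k = block_diag(identity(3 * n_neurons), f_cursor)
    q_k = block_diag(zeros(3 * n_neurons), q_cursor)
    return f_k, q_k


F_K, Q_K = rw_joint_matrices(n_neurons=2)
```

The zero block in `Q_K` encodes that the neural parameters are modeled as constant, while the cursor block is the undirected random walk of equations 2.10 and 2.11.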

2.4.4 Static Decoder: Testing the Capability for Pure Human Learning. We define this training method (see Figure 2) as a control to confirm the capacity for human learning in the human subjects closed-loop simulator (see Figures 7 and 8) over the span of a single learning session. Because the static decoder involves no machine learning, any performance improvement can be attributed to pure human learning. The filter equations implement a nonadaptive point-process filter that uses randomly assigned, fixed estimates of neural signal parameters. The static decoder is identical to the RW decoder with decoder neural parameters removed from the state vector. In our example, the static decoder state vector is a 4 × 1 matrix of intended position and velocity for the cursor in a two-dimensional work space.

2.4.5 Lockstep RSE/RSE: Testing the Importance of Joint Estimation and Sensory Feedback. We define this training method as a control to examine the contributions of joint estimation and sensory feedback in naive adaptive training. The Lockstep RSE/RSE training method is identical to the implementation of Joint RSE, except that lockstep estimation is used instead of joint estimation. Alternatively, Lockstep RSE/RSE can be viewed as identical to ReFIT-PPF, except that the reach state equation is used as the prior on intended cursor velocity, where $F_k$ and $Q_k$ are given by equations 2.22 and 2.23 for $k \le t_{\text{reach}}$ and equations 2.25 and 2.26 for $k > t_{\text{reach}}$. If Lockstep RSE/RSE performed identically to Joint RSE, we would conclude that joint estimation was noncontributory. If Lockstep RSE/RSE performed identically to Lockstep RSE/RW (defined next), we would conclude that sensory feedback was noncontributory.

2.4.6 Lockstep RSE/RW: Testing the Role of Sensory Feedback. This training method serves as a control to examine the effect of using sensory feedback (cursor kinematics) in naive adaptive training. This method is identical to Lockstep RSE/RSE, except that cursor movement is determined using the RW state equation. This substantially modifies the quality of feedback provided to the user, as illustrated in our online movie demonstration (Kowalski & Srinivasan, 2012). If these two lockstep methods performed identically, we would conclude that the nature of sensory feedback was noncontributory. If we had also directly compared Lockstep RSE/RW with ReFIT-PPF in the same experimental session, we could have also directly assessed the importance of prior on intended velocity. This final comparison was not performed, mainly because of limited experimentation time, although section 3.9 indirectly addresses this point.

2.5 Experimental Conditions

2.5.1 Subject Recruitment. Ten healthy male and female volunteers, ages 18 to 22, participated in experiments with the closed-loop model system. This experimental protocol was approved by the Institutional Review Board of the University of California, Los Angeles.

2.5.2 Task Description. Subjects engaged in a basic reach-and-hold task. While the subject’s arm movements were unconstrained in 3D, only velocity in the 2D plane orthogonal to the Kinect camera was used to drive simulated neural activity. Additionally, the on-screen cursor was displayed in a 2D virtual work space. At the start of each trial, decoded cursor movements began from a random location on the starting circle. The subject would then control their arm movements to drive the cursor to a fixed circular target, centered within the starting circle. Successful trial completion required the subject to hold the cursor within the target circle for 0.5 seconds, before the trial expired at 3 seconds into the trial. Our 2D virtual work space dimensions were 20 cm for the starting circle radius, 5 cm for the target circle radius, and a cursor of negligible size. In comparison, the previously validated human-based model system (Cunningham et al., 2011) required 3D virtual movements from the origin to spherical targets of radius 2 cm, located 8 cm from the origin, using a spherical cursor of radius 2 cm.

Each subject was studied in a single session (lasting 3 to 4 hours for experiments in Figures 4 to 6 and 1 to 2 hours for experiments in Figures 7 to 9) that tested a subset of the various naive adaptive paradigms already introduced: ReFIT-PPF, Joint RSE, RW, static, Lockstep RSE/RSE, or Lockstep RSE/RW. Experiment 1 (four subjects; Figure 5A) compared RW, ReFIT-PPF, and Joint RSE. Experiment 2 (three additional, separate subjects; Figure 5B) compared Lockstep RSE/RSE and Lockstep RSE/RW. Experiment 3 (three additional, separate subjects; Figure 5C) compared Joint RSE and Lockstep RSE/RSE. Experiment 4 (two additional, separate subjects; see Figures 7 to 9 and 10B) compared all filters except RW.

Each reaching movement was performed as either a test trial or a training trial. The test trial provided an opportunity to equitably and intermittently compare performance across naive adaptive methods. During test trials, parameter learning was fixed, and all methods used the RW state equation to decode cursor movement. As a result, all methods were identical in their implementation during the test trial, except that they used different neural signal parameter values, determined by the learning process during training trials. In the training trials, naive adaptive decoding was performed as described in the previous sections.

Each subject completed multiple learning sessions for each BMI training paradigm, sequenced in random fashion, while ensuring equal representation among learning paradigms. For example, in experiment 1, every set of three learning sessions contained at least one of each type of learning paradigm, although the ordering within this set was randomized. Each learning session began with a new, randomized selection of simulated motor cortical neurons and randomly configured, untrained BMI decoder parameters.

Experiments 1 to 3 (Figures 4 to 6 and 10A) involved 12 learning sessions per training paradigm per subject. Within each learning session of experiments 1 to 3, a sequence of 50 reach-and-hold trials was performed beginning with a training trial. These trials alternated in nonrandom fashion between 4 training trials and 1 test trial, for a total of 10 test trials interspersed among 40 training trials. Experiment 4 (Figures 7 to 9 and 10B) used the same organization of learning sessions, except that for brevity of experimentation, only 6 learning sessions were performed per training paradigm per subject. Also, 6 test trials were interspersed between 20 training trials, and each learning session began with a test trial. Data in Figures 8 and 9 were collected from the same two subjects but on separate days.
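The trial sequence for experiments 1 to 3 can be written out explicitly. The sketch below assumes that each block consists of 4 training trials followed by 1 test trial, which matches a session that begins with a training trial and contains 10 test trials among 40 training trials. The function name `session_schedule` is hypothetical.

```python
def session_schedule(n_blocks=10, train_per_block=4):
    """Trial labels for one learning session in experiments 1 to 3."""
    schedule = []
    for _ in range(n_blocks):
        schedule.extend(["train"] * train_per_block)  # 4 training trials
        schedule.append("test")                        # then 1 test trial
    return schedule


SCHEDULE = session_schedule()
```

Under this layout, the second test trial is trial 10 and follows eight training trials.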

There was no explicit attempt to control the visual or acoustic environment beyond the presented on-screen visual feedback. In an effort to keep them alert and engaged, subjects were allowed to listen to music during the experiment. Because learning sessions alternated BMI training conditions frequently (roughly every six minutes for experiments 1 to 3 and every three minutes for experiment 4), systematic correlations between the ambient sensory environment and BMI training condition would be highly unlikely. Moreover, balanced randomization in the sequence of learning sessions would have mitigated effects of task familiarity or arousal that might artificially introduce performance differences between BMI training paradigms.

2.6 Synthetic-Subjects Closed-Loop BMI Simulator. In order to systematically assess the effect of sensorimotor delay on decoder performance, we used a synthetic controller based on stochastic control theory (Bertsekas, 2005) to replace the human in the loop (see Figure 1C), recalling and adapting the modeling strategy introduced in our recently published work (Lagang & Srinivasan, 2013). This controller is the standard solution to a discrete-time finite-horizon linear quadratic control problem (Bertsekas, 2005). The synthetic simulator allows us to test the case of perfectly zero effective sensorimotor delay, which is difficult to achieve with live subjects (Golub et al., 2012). The controller in the synthetic simulator (see Figure 1C) substitutes for both the human and Kinect arm tracking system (see Figure 1B). As depicted in the diagram, the controller receives the state of the cursor as input and computes a new intended cursor velocity as output. As with the human-subjects simulator, the intended velocity in the synthetic-subjects simulator is sent to the motor-cortical neuron layer.

In our simulation, each trial lasts a maximum of 3 seconds. Time steps are simulated in 33 ms intervals. Accordingly, each trial contains a maximum of $M = 90$ discrete time steps. In each time step $1 \le k \le M$, the synthetic controller receives a vector $y_k$ of the current position and velocity of the cursor and outputs a 2 × 1 vector of intended velocity $u_k$, according to the equation

$$u_k = L_k y_k. \tag{2.31}$$

In particular, $y_k$ is the 5 × 1 vector,

$$y_k = \begin{bmatrix} x_{k-1|k-1} \\ 1 \end{bmatrix}, \tag{2.32}$$

where $x_{k-1|k-1}$ is the 4 × 1 vector containing the most recent position and velocity of the cursor (or, equivalently, the decoded kinematics of the cursor from the time step $k - 1$).

For each time step k, the matrix $L_k$ is given by the equation

$$L_k = -(R_k + B_k^T K_{k+1} B_k)^{-1} B_k^T K_{k+1} A_k, \tag{2.33}$$

where $K_k$ is given by the recursion

$$K_M = S_M, \tag{2.34}$$

$$K_k = S_k + A_k^T \left( K_{k+1} - K_{k+1} B_k (R_k + B_k^T K_{k+1} B_k)^{-1} B_k^T K_{k+1} \right) A_k, \tag{2.35}$$

and the remaining matrices are given by

$$A_k = \begin{bmatrix} 1 & 0 & \Delta & 0 & 0 \\ 0 & 1 & 0 & \Delta & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 \end{bmatrix}, \quad B_k = \begin{bmatrix} 0 & 0 \\ 0 & 0 \\ 1 & 0 \\ 0 & 1 \\ 0 & 0 \end{bmatrix}, \quad S_k = \begin{bmatrix} \alpha_k & 0 & 0 & 0 & 0 \\ 0 & \alpha_k & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 \end{bmatrix},$$

$$R_k = \begin{bmatrix} \beta & 0 \\ 0 & \beta \end{bmatrix},$$

for $1 \le k \le M$. The constants were assigned as follows: $\Delta = 0.033$ s, $\alpha_k = 0.067$ for $1 \le k \le 75$, $\alpha_k = 0.33$ for $75 < k \le M$, and $\beta = 0.0083$. Note that equations 2.33 to 2.35 are obtained directly from the classical solution to finite-horizon linear-quadratic control problems discussed elsewhere (Bertsekas, 2005).
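The backward recursion of equations 2.33 to 2.35 can be sketched in plain Python using the matrices and constants given above. This is an illustrative sketch and not the simulator code: the helper names are hypothetical, and gains are computed only for $k = 1, \ldots, M - 1$ because $L_M$ would require $K_{M+1}$, which the recursion as stated does not define.

```python
DT = 0.033     # time step Delta (s)
M = 90         # maximum number of time steps per trial
BETA = 0.0083  # control cost weight beta

A = [[1.0, 0.0, DT, 0.0, 0.0],
     [0.0, 1.0, 0.0, DT, 0.0],
     [0.0, 0.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.0, 0.0],
     [0.0, 0.0, 0.0, 0.0, 1.0]]
B = [[0.0, 0.0], [0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
R = [[BETA, 0.0], [0.0, BETA]]


def state_cost(k):
    """S_k: position cost alpha_k, which increases after time step 75."""
    alpha_k = 0.067 if k <= 75 else 0.33
    s = [[0.0] * 5 for _ in range(5)]
    s[0][0] = s[1][1] = alpha_k
    return s


def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def transpose(a):
    return [list(col) for col in zip(*a)]


def add(a, b, sign=1.0):
    return [[x + sign * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def inv2(m):
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det, m[0][0] / det]]


def lq_gains():
    """Return {k: L_k} for k = 1..M-1 via equations 2.33 to 2.35."""
    a_t, b_t = transpose(A), transpose(B)
    k_next = state_cost(M)                       # equation 2.34
    gains = {}
    for k in range(M - 1, 0, -1):
        bt_k = matmul(b_t, k_next)               # B^T K_{k+1}
        inner = inv2(add(R, matmul(bt_k, B)))    # (R + B^T K_{k+1} B)^{-1}
        gain = matmul(matmul(inner, bt_k), A)
        gains[k] = [[-v for v in row] for row in gain]          # equation 2.33
        shrink = matmul(matmul(matmul(k_next, B), inner), bt_k)
        reduced = add(k_next, shrink, sign=-1.0)
        k_next = add(state_cost(k), matmul(matmul(a_t, reduced), A))  # 2.35
    return gains


GAINS = lq_gains()
# Intended velocity u_k = L_k y_k for a cursor displaced along x, at rest.
y_example = [0.2, 0.0, 0.0, 0.0, 1.0]
u_example = [sum(GAINS[1][i][j] * y_example[j] for j in range(5))
             for i in range(2)]
```

For a cursor displaced along the positive x axis, the resulting intended x velocity is negative, that is, directed back toward the target at the origin, and the y component is zero because the two axes are decoupled in these matrices.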

As a technical aside, the implementation in this letter differs from Lagang and Srinivasan (2013) in two ways, each of which reflects differences in the specific task conditions between our letter and Lagang and Srinivasan (2013). First, the task for this letter requires the reach be completed within 3 seconds to be successful, represented in the finite-horizon cost function. In contrast, Lagang and Srinivasan (2013) do not impose a completion time, reflected in the infinite-horizon cost function. Second, our task for this letter involves a naive user at the start of a learning session, represented in a control policy that assumes generic plant dynamics. In contrast, Lagang and Srinivasan (2013) simulate an approximately optimal user. There, the control policy accounts for precise dynamics resulting from the composite effect of output neural activity and the decoder in order to model closed-loop performance at the completion of training.

3 Results

3.1 Category 1 Naive Adaptive Training Requires Directed Priors. Naive adaptive control methods like ReFIT-PPF and Joint RSE learn neural signal parameters without instructed motor imagery or explicit arm movements that serve as labeled data. This capability is explained by the use of goal-directed priors. To illustrate analytically, we provide a simplified example in the appendix.

Intuitively, the directed prior serves to “probabilistically” label neural data during BMI operation. This problem formulation relates to semisupervised learning, a broad class of techniques where partially labeled data are used to infer relationships and trends (Barber, 2011). The directed prior embodies the knowledge that intention for the limb state is more likely to orient toward the target state during reaching exercises and less likely to orient elsewhere. In contrast, using the undirected prior, this probabilistic label is agnostic to the orientation of this intention, so neural signals alone cannot drive convergence of neural signal parameter estimates.

We experimentally verified whether directed priors were essential in ac-tive BMI training using the human-based model system (see section 2).Single-trial example trajectories (see Figures 4A and 4B) illustrate thatnaive adaptive control with undirected priors is essentially unproductive,whereas directed priors support movements that acquire the target follow-ing training. These performance differences are reflected in whether neuralsignal parameters representing preferred directions converge to the truepreferred direction (see Figure 4C). Both ReFIT-PPF and Joint RSE decoderparameters converge appropriately, while random walk decoder param-eters do not. Success rates for these various methods was examined (seeFigure 5A) by aggregating across 12 learning sessions per method in eachof four subjects (with each data point representing a total of 48 learning


[Figure 4 graphic. Sample training sessions in the human subjects closed-loop simulator with 25/25 random neurons. Panels A (before training) and B (after training): sample trajectories from starting points to the reach target (scale bars 20 cm and 5 cm), with sample trajectory x and y velocity (cm/s) versus time (s), for Joint RSE, ReFIT-PPF, and random walk. Panel C: preferred direction estimates (initial, final, and true values) for Joint RSE, ReFIT-PPF, and random walk.]


sessions per method). This analysis confirms that performance differences exist between directed and undirected priors, where random walk success rates remain essentially flat across learning sessions. These findings were also consistent with analysis at the single-subject level (not shown).

3.2 Joint RSE Outperforms ReFIT-PPF in Naive Adaptive Control. In addition, we observed that Joint RSE outperformed ReFIT-PPF in success rates, both in aggregate (see Figure 5A) and at the single-subject level (not shown). Differences in success rate emerged by the second test trial, following eight training trials. Moreover, these differences persisted to the end of the learning session, where Joint RSE success rates averaged 94%, versus 59% for ReFIT-PPF and 9% for RW. While these improvements were encouraging, we performed additional control experiments to isolate the source of these improvements.

3.3 Major Differences in Sensory Feedback Do Not Significantly Affect Naive Adaptive Control. In experiment 2, we examined whether substantial differences in sensory feedback between ReFIT-PPF and Joint RSE accounted for differences in performance. Accordingly, we constructed a version of Joint RSE that mimicked the ReFIT-PPF procedure (Lockstep RSE/RW) and compared it to another procedure (Lockstep RSE/RSE), which was nearly identical, except that it incorporated feedback as handled by the Joint RSE. These algorithms are described in greater detail in section 2.

Early in each training session, users experienced haphazard trajectories under the Lockstep RSE/RW procedure versus goal-directed trajectories under the Lockstep RSE/RSE procedure. In later test sessions, cursor movements in both methods would appear increasingly goal directed as a consequence of BMI training. Surprisingly, success rates were statistically indistinguishable at every test trial in the learning session (see Figure 5B). These aggregate success rates were compiled over four subjects and 12 learning sessions per method in each subject.

3.4 Use of Joint Estimation Drives Performance Improvement over ReFIT-PPF. Joint estimation represents a key procedural difference between Joint RSE and ReFIT-PPF (see Figure 2). In ReFIT-PPF, cursor movement and neural signal parameters are estimated in sequence, a process

Figure 4: Single-learning-session examples of performance under naive adaptive control with directed and undirected priors. Sample trajectories and corresponding velocity profiles (A) early in the training session and (B) late in the training session. (C) Estimates of neuron preferred direction converge to true values with directed priors (ReFIT-PPF, Joint RSE), but not with undirected priors (RW) on this single learning session. Trajectories result from 25 simulated neurons and 33 ms bin width.


[Figure 5 graphic. Three panels plot success rate (0 to 1) against test trial number (0 to 10) in the human subjects closed-loop simulator with 25/25 random neurons. (A) Changes in Success Rate with Different Types of Naive Adaptive Control: Joint RSE, ReFIT-PPF, random walk. (B) Effect of Feedback on Naive Adaptive Control: Lockstep RSE/RSE, Lockstep RSE/RW. (C) Effect of Joint Estimation on Naive Adaptive Control: Joint RSE, Lockstep RSE/RSE.]


called lockstep estimation. In Joint RSE, both cursor position and neural signal parameters are estimated simultaneously, a process called joint estimation. Joint estimation allows for uncertainty in neural signal parameters to influence cursor movements, and vice versa. To determine the relevance of joint estimation to naive adaptive control, we compared versions of RSE-based naive adaptive control that were nearly identical, except that one used joint estimation (Joint RSE) and the other used lockstep estimation (Lockstep RSE/RSE). Aggregate results demonstrate significant and substantial differences in success rate over the entire learning session (see Figure 5C). In these subjects, joint estimation resulted in 90% success, while lockstep estimation resulted in 70% success using the same RSE prior.

3.5 Naive Adaptive Control Is Dominated by Fast Timescale of Machine over Human Learning. We next investigated whether human sensorimotor learning or machine learning predominantly accounted for rapid gains in BMI performance over these 6-minute training sessions. For each naive adaptive control method, we graphed unsigned heading deviation as a measure of human adaptation over the learning session (see Figure 6A). Unsigned heading deviation is the absolute value of the minimum

Figure 5: Success rate and effects of modifications on naive adaptive control. (A) Changes in success rate with naive adaptive control. Success rates and 95% confidence intervals on success rate were determined for the RW, ReFIT-PPF, and Joint RSE methods using a Bayesian procedure designed for the specific purpose of estimating learning curves (Smith et al., 2004). Four subjects participated in 12 learning sessions per method, so each data point is determined by the pooled successes and failures of 48 trials. Black and brown bars drawn near the x-axis represent alternating segments of four training trials (black) and one test trial (brown). The test trial 0 point is extrapolated from RW performance at test trial 1. (B) Effect of feedback on naive adaptive control. Success rates were not significantly different in comparison between Lockstep RSE/RSE and Lockstep RSE/RW methods, which are nearly identical methods, except in the way they apply feedback (cursor control). Lockstep RSE/RSE and Joint RSE feedback methods are identical. Lockstep RSE/RW and ReFIT-PPF feedback methods are identical. Success rates and error bars were determined as in (A). Three new subjects (different from panel A) each participated in 12 learning sessions per method, so each point is determined by the pooled successes and failures of 36 trials. Conventions are unchanged from panel A. (C) Effect of joint estimation on naive adaptive control. Success rates were significantly different in comparison between Joint RSE and Lockstep RSE/RSE methods, which are nearly identical methods except in the use of joint versus lockstep estimation, respectively. Success rates and error bars were determined as in panel A. Three new subjects (different from panels A or B) each participated in 12 learning sessions per method, so each point is determined by the pooled successes and failures of 36 trials. Conventions are unchanged from panel A.


[Figure 6 graphic. Three panels plotted against test trial number (0 to 10) for Joint RSE, ReFIT-PPF, and random walk in the human subjects closed-loop simulator with 25/25 random neurons: Timescale of Human Adaptation (heading deviation in initial arm movement, deg), Timescale of Machine Adaptation (deviation from initial estimated direction parameter, deg), and Timescale of Performance Improvements (success rate, 0 to 1).]


subtended angle between the subject’s intended velocity and the straight-line trajectory to target. In addition, we graphed changes in BMI parameters as a function of distance to the final parameter value, a measure of machine adaptation (see Figure 6B). For convenience, we reproduced the success rate graph, which tracks how system performance changes with time (see Figure 6C). A comparison of these graphs shows that nearly 80% of performance change (see Figure 6C) and machine adaptation (see Figure 6B) is complete by the second test trial, while the human control strategy (see Figure 6A) is essentially unchanged during this time. This suggests that machine learning operates at a far more rapid timescale than human learning in this model system.
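The unsigned heading deviation defined in this section can be written as a short function. This is a minimal Python sketch, assuming 2D vectors given as (x, y) tuples; the function name and argument layout are hypothetical and are not taken from the study's analysis code.

```python
import math


def unsigned_heading_deviation_deg(intended_velocity, cursor, target):
    """Unsigned angle in degrees (0 to 180) between the intended velocity
    and the straight line from the cursor to the target."""
    vx, vy = intended_velocity
    tx, ty = target[0] - cursor[0], target[1] - cursor[1]
    if math.hypot(vx, vy) == 0.0 or math.hypot(tx, ty) == 0.0:
        raise ValueError("heading is undefined for a zero-length vector")
    cross = vx * ty - vy * tx
    dot = vx * tx + vy * ty
    # atan2 gives the signed minimum angle between the two vectors
    # (between -180 and 180 degrees); abs() discards the sign.
    return abs(math.degrees(math.atan2(cross, dot)))
```

Using `atan2` on the cross and dot products avoids the loss of precision that `acos` of a normalized dot product shows for nearly parallel vectors.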

3.6 Visuomotor Rotation Variant Permits Human Learning in Modified Human-Subjects Closed-Loop Simulator. One possible objection to our results in sections 3.1 to 3.5 is that our experiments did not engage the capacity for human learning over the timescale of a single learning session in any method. This might limit the relevance of these results to performance in clinical applications. In response to this concern, we modified the parameter initialization procedure described in the original human subjects closed loop simulator (Cunningham et al., 2011), where BMI parameters in 32% of these neurons (8 of 25 total neurons) were initialized to a rotation from the true preferred direction. This modification is a variant on the visuomotor rotation task used elsewhere in the study of motor learning (Krakauer & Mazzoni, 2011; see section 2.1.3 for methodological details).

We employed a static filter (no machine learning; see section 2.4.4) to assay whether our subjects could exhibit learning over the timescale of a single learning session. Statistically significant human learning was exhibited in aggregate results with the static filter over two subjects (see Figure 8) as well as on a single-subject basis (not shown).

3.7 Joint RSE Dominates ReFIT-PPF Even Under Modified Conditions When Human Learning Is Permitted. We first plotted sample cursor

Figure 6: Timescales of human learning, machine learning, and BMI performance. (A) Heading deviation as a surrogate for human sensorimotor learning. Heading deviation is the minimum subtended angle between the subject’s intended velocity and the straight-line trajectory to target. (B) Changes in estimated preferred direction as a surrogate for machine adaptation. Deviation from initial estimated preferred direction parameter is the minimum subtended angle between the initial estimated preferred direction and the current estimated preferred direction, averaged over all neurons. (C) Success rate, as plotted in Figure 5A, reprinted here for comparison, with timescales of (A) human and (B) machine adaptation. Subjects, trial numbers, and other conventions are unchanged from Figure 5A.


trajectories from one subject using the modified human simulator before training (test trial 0; see Figure 7A) and after training (test trial 5; see Figure 7B). For ease of visual comparison, we rotated all trajectories based on the cursor initial position so that all trajectories were presented in these graphs as movements from top to bottom. The trajectories qualitatively suggest that Joint RSE produces better control than ReFIT-PPF or static decoders. The static decoder trajectory after learning still appears qualitatively curved, although further training with the static filter might have produced qualitatively straighter trajectories.

Aggregate results over two new subjects under our modified human simulator (see Figure 8) confirmed our earlier analysis that Joint RSE dominates ReFIT-PPF (see Figure 6). The differences between Joint RSE and ReFIT-PPF may have decreased, but this assertion is limited because different subjects were assayed between Figures 6 and 8. Moreover, ReFIT-PPF appears to degrade gradually over time from test trials 2 through 10 (see Figure 6), whereas our second analysis involved only five test trials (see Figure 8), which may have curtailed this gradual performance degradation.

Substantially diminished differences were seen between joint and lockstep filters (see Figure 9). Because the modified and original simulators differ only in BMI parameter initialization, these results show that parameter initialization can affect the relative importance of joint versus lockstep estimation. To understand this effect, recall that certainty equivalence means that when parameters are estimated, the current estimate of intent is assumed to be the true intent, and vice versa (Bertsekas, 2005). In our case, the modified initial conditions, chosen to allow for human learning, also inadvertently made the certainty equivalence assumption behind the lockstep filters more valid. For neurons with BMI parameters initialized to a pure rotation of their true value, the innovation term $(n_k^j - \lambda_k^j \Delta)$ in equation 2.6 was diminished, reducing uncertainty in those neuron parameters as well as in intended velocity. This effect would not be expected in the clinical scenario because it is difficult to imagine systematically rotating estimates of neuron preferred directions without knowing their true values.

The overlap between Lockstep RSE/RSE and Lockstep RSE/RW in Figure 9 also confirmed that feedback still conferred no specific benefit under these modified conditions that permitted human learning. Where the static filter showed a steady change in the user’s unsigned heading deviation (see Figure 8B), the remaining filter types exhibited no statistically detectable change in this measure (see Figures 8B and 9B).

3.8 Other Measures of Performance. We briefly discuss other measures of performance. The reach task explicitly and exclusively asks users to bring the cursor to the target. As such, the relevant figure of merit is target acquisition success rate, as applied in Figures 5 to 9. In Figure 10, we plot results from the various experimental conditions using mean integrated distance to target (MID), trajectory inaccuracy, and time to target (TT). MID was introduced elsewhere (Cunningham et al., 2011).

The user does not explicitly attempt to optimize any of these metrics. Ordering of performance described using a success rate is generally preserved, although trends do not reproducibly achieve significance. The notable exception is time to target. Because time to target is sensibly defined only for trials where the target is acquired (in contrast to MID and trajectory inaccuracy), the RW method appears most proficient. Although RW achieves only an approximately 10% success rate, the time to target on those trials is less than with the other methods. The case of time to target highlights the intricacy of interpreting measures that are computed on specially selected subsets of all trials.
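The selection effect described for time to target can be reproduced with a few lines of Python. The trial records below are made up for illustration and are not data from the study; the function name and the record layout are likewise hypothetical.

```python
def summarize(trials):
    """Return (success_rate, mean_time_to_target) for a list of trials.

    Each trial is (acquired, time_s); time_s is None when the target was missed.
    Mean time to target is computed over acquired trials only.
    """
    times = [time_s for acquired, time_s in trials if acquired]
    success_rate = len(times) / len(trials)
    mean_tt = sum(times) / len(times) if times else None
    return success_rate, mean_tt


# Made-up trial records for illustration; these are not data from the study.
method_a = [(True, 0.8)] + [(False, None)] * 9   # rare but fast successes
method_b = [(True, 1.2)] * 9 + [(False, None)]   # frequent, slower successes

rate_a, tt_a = summarize(method_a)
rate_b, tt_b = summarize(method_b)
print(rate_a, tt_a, rate_b, tt_b)
```

In this toy, the method that almost never acquires the target reports the shorter mean time to target, because the average is taken over a different and much smaller subset of trials.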

3.9 Influence of Sensorimotor Delay on Differences Between Joint RSE and ReFIT-PPF Illustrated with a Synthetic Closed-Loop Simulator. In section 3.7, we noted that a modified human subjects simulator showed that Joint RSE dominated over ReFIT-PPF (see Figure 8) even when joint and both lockstep filters showed no differences (see Figure 9). What then could explain the residual improvement in Joint RSE over ReFIT-PPF? One possibility is the prior on intended velocity, which is distinct from the estimation procedure or choice of feedback to the user (see Figure 2). ReFIT-PPF assumes zero sensorimotor delay in its prior on intended velocity. Recall from section 2.4.1 that zero sensorimotor delay is the assumption by ReFIT-PPF that user-intended velocity manifests instantly in output neurons and immediately reflects the current displacement vector between cursor position and target. This would require zero delay in the brain processes that involve sensory representation and motor control. Although Joint RSE also assumes zero sensorimotor delay, it does so with uncertainty, represented in its prior on intended velocity (depicted in Figure 2) through its use of the reach state equation (Srinivasan et al., 2006).

To confirm this explanation, we would need to compare Joint RSE and ReFIT-PPF under zero sensorimotor delay. Because the human inherently has nonzero sensorimotor delay, even with the help of predictive control strategies (Golub et al., 2012), it is difficult to entirely null the sensorimotor delay in an experimental system involving human or animal subjects. Instead, we leveraged a control-theoretic model for the human operating a BMI, along the lines of our recently published work on stochastic optimal control as a theory of BMI operation (Lagang & Srinivasan, 2013). This approach is described in section 2.6. Using this purely synthetic approach to modeling BMI performance in closed loop, we systematically increased the sensorimotor delay from zero to 267 ms and 333 ms (see Figure 11). More specifically, this sensorimotor delay was introduced in the simulation purely as a sensory delay. This testing was performed with 100% randomly initialized neuron parameters rather than rotated preferred directions.
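One way to picture a pure sensory delay in a binned simulation is a fixed-length buffer that hands the controller a stale cursor position. The Python sketch below is an assumption-laden illustration, not the synthetic simulator of section 2.6: the bin width of 1/30 s (about 33 ms), the class `DelayedSensor`, and the helper names are all hypothetical.

```python
from collections import deque
import math

BIN_WIDTH_S = 1.0 / 30.0  # assumed bin width (about 33 ms)


def delay_in_bins(delay_s, bin_width_s=BIN_WIDTH_S):
    """Round a sensory delay in seconds to a whole number of bins."""
    return round(delay_s / bin_width_s)


class DelayedSensor:
    """Returns the cursor position observed `delay_bins` time steps ago."""

    def __init__(self, delay_bins, initial_position):
        size = delay_bins + 1
        self._buffer = deque([initial_position] * size, maxlen=size)

    def observe(self, current_position):
        self._buffer.append(current_position)  # oldest entry drops off the left
        return self._buffer[0]


def aim_at_target(position, target, speed):
    """Intended velocity of magnitude `speed` aimed from `position` at `target`."""
    dx, dy = target[0] - position[0], target[1] - position[1]
    dist = math.hypot(dx, dy)
    if dist == 0.0:
        return (0.0, 0.0)
    return (speed * dx / dist, speed * dy / dist)


# Toy run: the cursor moves along x while the controller sees it two bins late.
sensor = DelayedSensor(delay_bins=2, initial_position=(0.0, 0.0))
seen = [sensor.observe((float(k), 0.0)) for k in (1, 2, 3)]
print(seen)
```

Under the assumed bin width, the 267 ms and 333 ms conditions correspond to roughly 8 and 10 bins. A controller that calls `aim_at_target` on the delayed position produces an intended velocity that no longer matches the current cursor-to-target displacement, which is the mismatch a zero-delay prior cannot represent.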


[Figure 7 graphic. Sample training sessions in the human subjects closed-loop simulator with 8/25 random neurons: trajectories from starting points to the reach target for static, ReFIT-PPF, and Joint RSE decoders, (A) before training and (B) after training. Scale bars: 20 cm and 5 cm.]


At zero delay, ReFIT-PPF performs well because the synthetic controller points its intended velocity toward the target without delay. The zero delay assumption of ReFIT-PPF is perfectly satisfied. As sensorimotor delay increases, this assumption is increasingly violated, and ReFIT-PPF degrades rapidly. In contrast, Joint RSE does not degrade with increasing sensorimotor delay. Although the Joint RSE prior on intended velocity also points toward the target, this rotation is not asserted with certainty when estimating neural signal parameters, because the reach state equation prior inherently communicates uncertainty (see Figure 2). This representation of uncertainty accommodates the presence of sensorimotor delay and allows Joint RSE to perform well regardless of the precise value of this sensorimotor delay.

As a technical aside, note that Joint RSE and ReFIT-PPF perform equally at the zero delay condition (see Figure 11A), where the synthetic model for the human conforms perfectly to the zero sensorimotor delay assumption. Why would this be? A likely explanation is that zero sensorimotor delay permits rapid parameter learning in the first four training trials. By test trial 1, there is no uncertainty in neural signal parameters and kinematics, conforming to the certainty equivalence assumption in lockstep filters. This results in equivalence between joint and lockstep performance reflected in a convergence between Joint RSE and ReFIT-PPF in this condition. Differences in feedback between these two techniques are not a separate contributor to differences between Joint RSE and ReFIT-PPF in Figure 11 because our synthetic model for the human does not accommodate learning.

4 Discussion

Section 1.2 provides a comprehensive overview of the major findings in this letter. This section proposes significant opportunities for revising category 1 naive adaptive BMI based on our findings.

4.1 Joint Estimation Is a Key Unexploited Opportunity for Category 1 Naive Adaptive BMI. We showed that Joint RSE primarily outperforms the ReFIT-PPF method because it jointly estimates neural signal parameters and user intentions. In contrast, prior naive adaptive methods, including ReFIT-PPF, use lockstep estimation, which sequentially updates parameters

Figure 7: Sample trajectories with the modified human simulator using visuomotor rotation. Position trajectories for Joint RSE, ReFIT-PPF, and static are plotted (A) before and (B) after training in one new subject. Qualitatively, Joint RSE trajectories appear smoother and more directed than ReFIT-PPF and static BMI following training. Trials begin at random positions on the outer perimeter with the target at the center. Plotted trajectories have been rotated to start at the top of the perimeter for ease of visual comparison.


[Figure 8 graphic. Three panels plotted against test trial number (0 to 5) for Joint RSE, ReFIT-PPF, and static decoders in the human subjects closed-loop simulator with 8/25 random neurons: Timescale of Performance Improvements (success rate, 0 to 1), Timescale of Human Adaptation (heading deviation in initial arm movement, deg), and Timescale of Machine Adaptation (deviation from initial estimated direction parameter, deg).]


and intention (Dangi et al., 2011; Gage et al., 2005; Gilja et al., 2010; Orsborn et al., 2012). Joint estimation allows uncertainty in parameter estimates to inform the interpretation of neural signals in generating cursor movements. Early in each training session, Joint RSE relies more heavily on the RSE prior to inform cursor movements than on the neural signals themselves. During this time, neural signals are used to refine parameter estimates. As uncertainty about parameter estimates decreases, Joint RSE increasingly “trusts” neural observations when steering cursor movements away from the RSE prior.
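The shift of trust from prior to neural observations can be illustrated with a scalar linear-Gaussian analogy. This Python sketch is not the point process filter used by Joint RSE; it assumes, purely for illustration, that parameter uncertainty behaves like extra observation noise, and every name and number in it is hypothetical.

```python
def observation_weight(prior_var, obs_var, param_var):
    """Weight on the neural observation in a scalar Gaussian update.

    Assumption: parameter uncertainty acts like additional observation noise,
    so the effective observation variance is obs_var + param_var.
    """
    return prior_var / (prior_var + obs_var + param_var)


def fuse(prior_mean, observation, prior_var, obs_var, param_var):
    """Posterior mean: the prior mean pulled toward the observation by the weight."""
    w = observation_weight(prior_var, obs_var, param_var)
    return prior_mean + w * (observation - prior_mean)


# Illustrative numbers only: parameter uncertainty shrinks as training proceeds.
early = observation_weight(prior_var=1.0, obs_var=1.0, param_var=8.0)  # 0.1
late = observation_weight(prior_var=1.0, obs_var=1.0, param_var=0.0)   # 0.5
print(early, late)
```

In this analogy no weighting schedule is set by hand: the weight on the observation rises only because the assumed parameter variance falls.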

Arguably, methods that use joint estimation could result in the user feeling completely disengaged from cursor movements early in the training period when the Bayesian prior strongly determines cursor movement. This point requires further investigation. In our testing, we found Joint RSE responsive to basic user intentions even at the outset, because the baseline firing rate parameter can be estimated before any training session begins. In other words, the user could not trivially stop attending to the BMI training regimen unbeknown to the training algorithm. We preliminarily verified this (results not shown) by implementing a detector for stasis, using a mixture model based on the general purpose filter design BMI framework (Srinivasan et al., 2007). Implementing such a detector is likely to be harder with live neural recordings, where patterns of activity could be more irregular.

4.2 Differences Between Joint RSE and Recent Naive Adaptive Methods. How is Joint RSE different from other recently proposed naive adaptive control methods (Taylor et al., 2002; Velliste, Perel, Spalding, Whitford, & Schwartz, 2008)? Our main goal in explicitly highlighting these differences is to emphasize that performance of Joint RSE versus these other methods is not expected to be equivalent in the final clinical system. These two approaches originate from distinct conceptual foundations. The above methods represent an ad hoc (heuristic) approach, while Joint RSE is a principled (derived) approach based on Bayesian theory. In the consequent training sessions, both methods begin with decoded cursor movements that are heavily guided by the computer. Both methods also progressively decrease this guidance in transferring control to the user. However, the extent of machine versus human control in the above methods is determined through an ad hoc weighting rule. In Joint RSE, this transfer of control

Figure 8: Timescales of learning for Joint RSE, ReFIT-PPF, and static under the modified simulator using visuomotor rotation. These curves recapitulate the analysis in Figure 6 under the modified conditions to permit human learning over single learning sessions. Decoder neural parameters were initialized randomly for 8 of 25 neurons and to pure rotation of preferred direction in the rest. Data are aggregated from two new human subjects in the simulator, for a total of 16 learning sessions per technique.


[Figure 9 graphic. Three panels plotted against test trial number (0 to 5) for Joint RSE, Lockstep RSE/RSE, and Lockstep RSE/RW in the human subjects closed-loop simulator with 8/25 random neurons: Timescale of Performance Improvements (success rate, 0 to 1), Timescale of Human Adaptation (heading deviation in initial arm movement, deg), and Timescale of Machine Adaptation (deviation from initial estimated direction parameter, deg).]


occurs as a natural consequence of joint estimation, in proportion to decreasing uncertainty in estimates of the neural signal parameters. There is no single weighting parameter in the Joint RSE method.

There are innumerable other mathematical differences between the above naive adaptive methods and Joint RSE. Recent preliminary work examines a different Bayesian approach for tuning the weighting parameter in the aforementioned methods (Zhang, Schwartz, Chase, & Kass, 2012). In contrast to this new work (Zhang et al., 2012), Joint RSE does not begin from the ad hoc weighting-parameter premise of the recently developed methods. The resulting algorithms are mathematically distinct.

4.3 Effective Sensorimotor Delay and BMI Algorithm Delay Are Major Design Considerations. The delay between sensory input and response from output neurons in the patient is an intrinsic physical constraint. Preliminary work suggests that subjects appear to implement predictive control to compensate for this delay but that this compensation is imperfect (Golub et al., 2012). Methods that ignore this delay could result in significant performance losses. For example, our synthetic closed-loop simulator analysis (see Figure 11) demonstrated that sensorimotor delays dramatically eroded performance in the ReFIT-PPF, a method that assumes zero sensorimotor delay. Performance with the ReFIT-PPF was decimated by a delay of 330 ms. In contrast, Joint RSE was entirely immune to this effect over this range of delays, because the prior on intended velocity accommodated an uncertain sensorimotor delay.

A second source of delay not addressed in this letter results from the neural signal algorithm itself. In existing systems, the delay can be on the order of tens to hundreds of milliseconds. A related concept is the effect of spike binning on closed-loop performance, illustrated elsewhere (Cunningham et al., 2011), and subsequently explained in our previous work using a control-theoretic model for the human in closed-loop BMI operation (Lagang & Srinivasan, 2013).

4.4 Motivation and Reward During Training Are Important Design Considerations. Although this letter focuses on quantifiable performance differences between Joint RSE and ReFIT-PPF, user experience is likely to be equally important in clinical practice. Our figures show that substantial

Figure 9: Timescales of learning for joint versus lockstep RSE methods under the modified simulator using visuomotor rotation. These curves recapitulate the analysis in Figure 8 under conditions that permit human learning within a single learning session. As with Figure 8, decoder neural parameters were initialized randomly for 8 of 25 neurons and to pure rotation of preferred direction in the rest. Data are from the same two human subjects used in Figure 8, for a total of 16 learning sessions per technique.


[Figure 10 graphic. (A) Human subjects closed-loop simulator with 25/25 random neurons: MID to target (i, iii, v), trajectory inaccuracy (ii, iv, vi), and time to target (vii), each plotted against test trial number (0 to 10). (B) Human subjects closed-loop simulator with 8/25 random neurons: MID to target (i, iii) and trajectory inaccuracy (ii, iv), each plotted against test trial number (0 to 5). Y-axis labels: MID (cm), final distance to target (cm), and time to target (s). Legend entries: Joint RSE, ReFIT-PPF, random walk, static, Lockstep RSE/RSE, Lockstep RSE/RW.]


differences in visual feedback did not cause differences in overall performance. However, many of our subjects retrospectively described training sessions with ReFIT-PPF as frustrating because cursor movements were necessarily haphazard during training, resulting in many failed training trials. In contrast, training with Joint RSE was a more pleasant user experience because training trials involved smooth cursor trajectories that most often succeeded. These aesthetics are best illustrated in our online video demonstration of ReFIT-PPF versus Joint RSE (Kowalski & Srinivasan, 2012). For both experimentalists using animal models and clinicians working with patients over several hours or days, algorithms that cause high rates of failure early in training could destroy user motivation and consequently undermine the human and machine learning processes required for BMI mastery.

4.5 Details of Sensory Feedback May Be Irrelevant to BMI Learning at Short Timescales. Online feedback to the user regarding task performance is believed to be vital to the learning process. Adaptive controller models of BMI skill acquisition (DiGiovanna, Mahmoudi, Fortes, Principe, & Sanchez, 2009; Heliot, Ganguly, Jimenez, & Carmena, 2010) suggest that feedback may be useful even in naive adaptive control. Our results illustrate that efforts to optimize the precise choice of feedback could be irrelevant during periods of training where the timescale of machine learning far outpaces that of human learning. Conversely, when machine learning has flattened, choice of feedback could be vital to driving subsequent performance improvements that rely on human learning. Connections to the robotic stroke rehabilitation literature could be illuminating in this regard, including cautionary insights on counterproductive rehabilitation strategies (Marchal-Crespo & Reinkensmeyer, 2009).

Figure 10: Other metrics of performance. (A) Mean integrated distance to target (i, iii, v), trajectory inaccuracy (ii, iv, vi), and time to target (vii) as defined in the text, using four subjects and conditions from Figure 5. (B) These measures using subjects and conditions from Figures 8 and 9.

[Figure 11 graphic: Synthetic Subjects Closed-Loop Simulator, 25/25 Random Neurons. Three panels plot Success Rate (0 to 1) against Test Trial Number (0 to 10) for Sensorimotor Delay = 0 ms, 267 ms, and 330 ms. Legend: Joint RSE, ReFIT-PPF, Random Walk.]

4.6 Masking Errors from the User During Training Could Better Coordinate Human and Machine Learning. While both ReFIT-PPF and Joint RSE represent adaptive methods, it has been recently suggested that machine adaptation may actually disrupt the training process in some implementations (Judy, 2011; Orsborn, Dangi, Moorman, & Carmena, 2011). The intuition behind this argument is that continually changing properties of an adaptive BMI could be difficult for the user to learn. While closed-loop simulation of coexistent human and machine learning suggests that BMI training can successfully converge (DiGiovanna et al., 2009; Heliot et al., 2010), initial experimental results show that training improves by alternating between static and adaptive BMI sessions (Orsborn et al., 2011).

The fact that Joint RSE masks training errors during adaptive BMI sessions could be a favorable trait. Because training errors are not available to drive human learning, the adaptive BMI session under Joint RSE represents a more purely machine adaptation block than adaptive BMI under ReFIT-PPF. Future work could use this property of Joint RSE to explore the possibility that coadaptive learning might achieve minimum training time by alternating between sessions that halt human learning and sessions that halt machine learning. For readers familiar with control theory, this notion will be reminiscent of bang-bang control, which is provably optimal in similar minimum-time control problems (Stengel, 1994).

Toward this concept of optimal control in training regimens, our analysis (see Figure 6) also illustrates the use of surrogate quantities for tracking human and machine learning rates in the experimental setting. These surrogate quantities are observable in practice for use in developing a principled rule (control policy) to switch between static and adaptive BMI training sessions or to continuously tune rates of machine learning in order to ensure that a coadaptive training regimen is productive rather than disruptive.
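As a purely illustrative sketch of what such a switching rule might look like, the following Python fragment picks the next session type from two surrogate learning rates. The function names, the `margin` parameter, and the specific threshold rule are hypothetical and are not proposed or evaluated in this letter; the rates are assumed to be nonnegative per-session improvements computed from whatever surrogate quantities are available.

```python
def choose_session(human_rate, machine_rate, current_mode, margin=0.0):
    """Pick the next session type from surrogate learning rates.

    'adaptive' means the decoder keeps updating (machine learning block);
    'static' means the decoder is frozen (human learning block).
    The threshold rule is an assumption made for illustration only.
    """
    if current_mode not in ("static", "adaptive"):
        raise ValueError("current_mode must be 'static' or 'adaptive'")
    if machine_rate > human_rate + margin:
        return "adaptive"   # machine is currently improving faster
    if human_rate > machine_rate + margin:
        return "static"     # human is currently improving faster
    return current_mode     # within the margin: avoid needless switching


def plan_sessions(rate_pairs, initial_mode="adaptive", margin=0.0):
    """Apply choose_session to a sequence of (human_rate, machine_rate) pairs."""
    mode = initial_mode
    schedule = []
    for human_rate, machine_rate in rate_pairs:
        mode = choose_session(human_rate, machine_rate, mode, margin)
        schedule.append(mode)
    return schedule


if __name__ == "__main__":
    # Hypothetical rates: machine learning dominates early, human learning later.
    print(plan_sessions([(0.1, 0.9), (0.2, 0.25), (0.6, 0.1)], margin=0.1))
```

The margin acts as simple hysteresis, so the regimen does not flip between session types when the two rates are nearly equal; whether any such rule shortens training is an open experimental question.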

Figure 11: Effect of sensorimotor delay assessed with synthetic subject closed-loop simulator. In contrast to prior analyses, this analysis uses a linear quadratic controller in place of the human subjects, adapted from prior theoretical work (Lagang & Srinivasan, 2013). Performances for the Joint RSE, ReFIT-PPF, and RW are compared under (A) zero delay, (B) 267 ms delay, and (C) 330 ms somatosensory delay. Specifically, output neural activity reflects on-screen cursor state from time into the past equal to the specified delay. Sensorimotor delay is the counterpart to delay studied elsewhere (Golub et al., 2012) introduced by the BMI algorithm itself.

4.7 Is the Out-to-Center Task a Trivial Version of the Classical Center-Out Task? One possible concern regarding the out-to-center reaching task (see Figures 4 and 7) is that algorithms and performance analysis presented here might not generalize to arbitrary starting and ending positions. To the contrary, the out-to-center reaching task is equivalent to the center-to-out reaching task in the distribution of cursor velocities needed to achieve successful trajectories. This is because these two task types are essentially equivalent except for a change in frame of reference. When the “center” of the screen is redefined as the target location, a center-to-out reaching task becomes an out-to-center reaching task. For this reason, the various coadaptive BMI algorithms presented here extend readily to reach training paradigms with arbitrary starting and ending positions. Our out-to-center reaching task is also more difficult than the classical center-to-out reaching task that involves only eight discrete, circumferentially placed target locations. Because the out-to-center reaching task initializes the cursor position to a random location on the starting circle, this is equivalent to an infinite number of possible target locations in the classical center-to-out reaching task.
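The frame-of-reference argument can be checked numerically. The sketch below, with hypothetical helper names and an arbitrary straight-line reach chosen only for illustration, re-expresses a center-to-out trajectory in a frame whose origin is the target and confirms that the cursor velocities are unchanged while the start and end points become those of an out-to-center reach.

```python
def straight_reach(start, end, n_steps):
    """Evenly spaced 2D positions from start to end (inclusive)."""
    (sx, sy), (ex, ey) = start, end
    return [(sx + (ex - sx) * i / n_steps, sy + (ey - sy) * i / n_steps)
            for i in range(n_steps + 1)]


def shift_frame(positions, new_origin):
    """Re-express positions in a frame whose origin is new_origin."""
    ox, oy = new_origin
    return [(x - ox, y - oy) for x, y in positions]


def velocities(positions, dt):
    """Finite-difference velocities between consecutive positions."""
    return [((x1 - x0) / dt, (y1 - y0) / dt)
            for (x0, y0), (x1, y1) in zip(positions, positions[1:])]


if __name__ == "__main__":
    target = (3.0, 4.0)
    center_out = straight_reach((0.0, 0.0), target, n_steps=10)
    out_to_center = shift_frame(center_out, target)  # target becomes the "center"
    print(out_to_center[0], out_to_center[-1])       # start on the circle, end at the origin
    same = all(abs(a - b) < 1e-9
               for va, vb in zip(velocities(center_out, 0.1),
                                 velocities(out_to_center, 0.1))
               for a, b in zip(va, vb))
    print("velocities identical:", same)
```

A translation leaves every position difference unchanged, which is why a decoder that operates on velocities sees the same distribution of required commands in either task.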

Another possible concern regarding the out-to-center reaching task is that a trivial decoder with knowledge of the target location might entirely disregard neural activity and guide the cursor toward the center of the workspace to achieve perfect performance. Such a decoder might not generalize to other tasks with multiple possible target locations. However, arbitrary reaching tasks also have equivalent trivial decoders. When the starting position and the final position are known, neural activity is not needed to perfectly drive the cursor from start to finish. The existence of trivial decoders is not specific to the out-to-center task. In our training sessions, we specifically avoided trivial decoders. In other words, all algorithms investigated in this study were uninformed about target location during testing periods.

4.8 Closed-Loop Simulation Widens the Development Pipeline for Naive Adaptive BMI Design. This letter demonstrates the application of two recently introduced closed-loop models (Cunningham et al., 2011; Lagang & Srinivasan, 2013) based on simulated neural activity for BMI analysis and design. We first used a human-subjects closed-loop simulator, adapted from previous work that established the validity of this approach in comparison with nonhuman primate models (Cunningham et al., 2011). We demonstrated an implementation based on the Microsoft Kinect (currently US$100), which could widen access to this model. We also showed how to modify initial conditions within this model to elicit human learning based on visuomotor rotation (Krakauer & Mazzoni, 2011).

The computer-based simulation component of our analysis (see Figure 11) adapted our recent work on modeling the patient as a stochastic controller in the closed-loop operation of BMI (Lagang & Srinivasan, 2013). This approach was essential to probe zero effective sensorimotor delay, which is difficult to study in live subjects due to intrinsic neural delays, even despite predictive control strategies (Golub et al., 2012). As such, this work adds to a small but growing body of literature that demonstrates the utility of patients who are entirely simulated by computer to understand and improve the dynamics of closed-loop BMI operation (Dangi et al., 2011; Gurel & Mehring, 2012; Heliot et al., 2010; Mahmoudi & Sanchez, 2011).
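The caption of Figure 11 states that, under a given sensorimotor delay, output neural activity reflects the on-screen cursor state from that far in the past. A minimal sketch of this mechanism is a fixed-length buffer, shown below. The class and function names are hypothetical, the simulation time step is an assumed parameter that the text here does not specify, and the buffer is not claimed to be how the simulator in this letter was implemented.

```python
from collections import deque


def delay_to_steps(delay_ms, step_ms):
    """Convert a delay in milliseconds to a whole number of simulation steps.

    step_ms is an assumed simulation time step, not a value from the letter.
    """
    if delay_ms < 0 or step_ms <= 0:
        raise ValueError("delay_ms must be >= 0 and step_ms must be > 0")
    return round(delay_ms / step_ms)


class DelayedCursorFeedback:
    """Return the cursor state observed delay_steps steps in the past."""

    def __init__(self, delay_steps, initial_state):
        if delay_steps < 0:
            raise ValueError("delay_steps must be >= 0")
        # The buffer holds the current state plus delay_steps past states.
        self._buffer = deque([initial_state] * (delay_steps + 1),
                             maxlen=delay_steps + 1)

    def observe(self, current_state):
        """Record the current state and return the delayed one."""
        self._buffer.append(current_state)  # the oldest entry drops out
        return self._buffer[0]


if __name__ == "__main__":
    feedback = DelayedCursorFeedback(delay_steps=2, initial_state=(0.0, 0.0))
    for i in range(1, 6):
        print(feedback.observe((float(i), 0.0)))
```

With `delay_steps=0` the buffer has length one and the simulated controller sees the current cursor state, which corresponds to the zero-delay condition that cannot be produced in live subjects.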

A core challenge of technology development in BMI research is that all model systems, including simulation platforms, animal models, and epilepsy patients with subdural grids, ultimately lack the complexity of disease pathogenesis in some major subset of patients with paralysis. This gap limits conclusions about asymptotic performance or learning dynamics in comparisons of various algorithms.


Modeling paralysis in any of these systems (simulated, animal, or human) is especially challenging because the various pathways to paralysis are so heterogeneous, including various types of stroke, various mechanisms of trauma, and multiple pathways of neurodegeneration. As a specific example, spinal cord injury alone is heterogeneous in neurologic manifestation, affecting sensory and motor function to varying degrees depending on mechanism and anatomic extent. While the BMI literature commonly assumes complete loss of motor and proprioceptive function, anterior cord syndrome is a classically described manifestation of motor paralysis that retains proprioception by sparing the dorsal column. Common mechanisms for anterior cord syndrome include trauma, myelitis, and anterior spinal artery infarct (Blumenfeld, 2002). Ultimately, model systems (simulated, animal, or human) provide a starting point for technology development that disregards these nuances of clinical presentation; more expensive but essential testing in target patients through randomized controlled trials will be the ultimate gold standard.

4.9 Limitations of the Study. In carrying forward these insights, we acknowledge the full spectrum of fundamental limitations in using our human-based model system to predict behavior in the final clinical system and target patient population. First, our model requires the user to control natural arm movements, tracked by the Kinect system. In contrast, both invasive and noninvasive BMIs require the user to control a subset of neural signals measured by a specific recording modality. Because the model system and clinical system engage different user outputs, mechanisms and constraints of learning in the neural substrate may differ; this is an open question.

Second, our measure for human learning, unsigned heading deviation, is not a comprehensive characterization of the human. Some motor behaviors that could qualify as learning may not be captured by this measure. A more comprehensive approach would involve modeling the human as a control policy. For example, our recent theoretical work on stochastic control as a model for BMI (Lagang & Srinivasan, 2013) could be extended by identifying parameters of the control policy executed by the human using experimental data. This policy represents a mapping from sensory feedback to neural signals. Parameter convergence with this modeling approach is nontrivial in this letter because of limited data: every learning session involves different parameter initial conditions and consequently requires a different human control policy. A similar approach to empirical modeling of the user was briefly suggested in related work, where longer periods of nonstationarity might facilitate identification of the control policy (Golub et al., 2012).

Third, our model system involves visual feedback, whereas the target clinical application could potentially allow richer sensory feedback. Possible examples include intact native somatosensory feedback from the paralyzed limb or artificial feedback through vibrotactile displays. This richer feedback could potentially accelerate human learning to a timescale that is sufficiently fast as to be relevant to dynamic adjustments in machine learning.

Fourth, the neural substrates of our healthy young volunteers are different from those of target patients due to innumerable disease-related effects, including cortical reorganization following trauma, cerebrovascular regulation following stroke, and metabolic changes associated with neurodegenerative diseases, which could affect the capacity for sensorimotor learning and control. There are likely other limitations of our experimental system, as all model systems are imperfect by construction, and testing in the final clinical setting with a defined target patient population remains the ultimate gold standard in BMI design.

Appendix: Basic Example of Necessity for Directed Priors in Category 1 Naive Adaptive BMI

In this appendix, we provide a simplified example that illustrates why directed priors are needed in category 1 naive adaptive BMI. Consider neural observations $n_k$ at time step $k$ that are related to intended 1D arm state $x_k$ by a neural signal parameter $\alpha$ in a simplified relationship as follows:

$$n_k = \alpha x_k. \tag{A.1}$$

With $w_k \sim N(0, \sigma^2)$, an example of an undirected prior on intended arm states is the random walk on $x_k$:

$$x_{k+1} = x_k + w_k. \tag{A.2}$$

An example of a directed prior is a random walk on $x_k$ with known drift $b$:

$$x_{k+1} = x_k + b + w_k. \tag{A.3}$$

Define increments of the observation process $z_{k+1} = n_{k+1} - n_k$. These $z_k$ are independent and identically distributed gaussian random variables. For the undirected prior,

$$z_k \sim N(0, \alpha^2 \sigma^2). \tag{A.4}$$

For the directed prior,

$$z_k \sim N(\alpha b, \alpha^2 \sigma^2). \tag{A.5}$$
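The distributions in equations A.4 and A.5 follow in one step from equations A.1 to A.3. The short derivation below spells this out using only the symbols already defined above.

```latex
\begin{align*}
z_{k+1} &= n_{k+1} - n_k = \alpha\,(x_{k+1} - x_k), \\
\text{undirected prior (A.2):}\quad z_{k+1} &= \alpha\, w_k \sim N(0,\ \alpha^2\sigma^2), \\
\text{directed prior (A.3):}\quad z_{k+1} &= \alpha\,(b + w_k) \sim N(\alpha b,\ \alpha^2\sigma^2).
\end{align*}
```

Only the directed case carries $\alpha$ itself in the mean; in the undirected case $\alpha$ enters solely through $\alpha^2$.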


In the context of the training regimen, a goal is defined to the user, effectively constraining $b$. For the directed prior, estimating the neural signal parameter $\alpha$ involves computing the mean of samples $z_k$ and dividing by $b$:

$$\hat{\alpha}_{ML} = \frac{1}{Nb} \sum_{k=1}^{N} z_k. \tag{A.6}$$

For the undirected prior, $\alpha$ is essentially undetermined in sign, because the sample variance of $z_k$ depends only on $\alpha^2$ and the sample mean has expectation 0.
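The contrast between the two priors can be seen in a small simulation. The sketch below is illustrative only: the function names, parameter values, and the variance-based magnitude estimate for the undirected case are assumptions introduced here, not part of the letter. It recovers the signed parameter under the directed prior by taking the sample mean of the increments divided by the drift, and shows that under the undirected prior the same data summary is produced by both signs of the parameter.

```python
import random
import statistics


def simulate_increments(alpha, sigma, drift, n_steps, seed):
    """Simulate x_{k+1} = x_k + drift + w_k with n_k = alpha * x_k,
    and return the observation increments z_{k+1} = n_{k+1} - n_k."""
    rng = random.Random(seed)
    x = 0.0
    n_prev = alpha * x
    increments = []
    for _ in range(n_steps):
        x = x + drift + rng.gauss(0.0, sigma)  # drift = 0 gives the undirected prior
        n_curr = alpha * x
        increments.append(n_curr - n_prev)
        n_prev = n_curr
    return increments


def estimate_alpha_directed(increments, drift):
    """Sample mean of the increments divided by the known drift b."""
    return statistics.fmean(increments) / drift


def estimate_abs_alpha_undirected(increments, sigma):
    """Magnitude only: the variance about zero estimates alpha^2 * sigma^2."""
    return statistics.pvariance(increments, mu=0.0) ** 0.5 / sigma


if __name__ == "__main__":
    z = simulate_increments(alpha=-2.0, sigma=0.5, drift=1.0, n_steps=5000, seed=1)
    print("directed estimate of alpha:", estimate_alpha_directed(z, 1.0))
    for a in (2.0, -2.0):
        z0 = simulate_increments(alpha=a, sigma=0.5, drift=0.0, n_steps=5000, seed=2)
        print("undirected, true alpha", a, "-> |alpha| estimate:",
              estimate_abs_alpha_undirected(z0, 0.5))
```

The undirected estimate additionally assumes that the noise scale is known, which is generous; even so, it cannot distinguish a parameter from its negative.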

Acknowledgments

K.K. was supported by funding from the UCLA Amgen Scholars Program. L.S. was supported by funding from the American Heart Association Scientist Development grant (11SDG7550015), the DARPA Reliable Neural-Interface Technology (RE-NET) Program, and the UCLA Radiology Exploratory Development Grant. We thank Alexander Wein for his help in organizing experiments and Theodore Koenig, Luis Armendariz, and Siamak Yousefi for their help in beta testing the MATLAB-Kinect interface installation procedure described in our online tutorial. We declare no competing financial interests.

References

Barber, D. (2011). Bayesian reasoning and machine learning. Cambridge, New York: Cambridge University Press.

Bertsekas, D. P. (2005). Dynamic programming and optimal control (3rd ed.). Belmont, MA: Athena Scientific.

Blumenfeld, H. (2002). Neuroanatomy through clinical cases. Sunderland, MA: Sinauer.

Bradberry, T. J., Gentili, R. J., & Contreras-Vidal, J. L. (2011). Fast attainment of computer cursor control with noninvasively acquired brain signals. J. Neural Eng., 8(3), 036010.

Cajigas, I., & Srinivasan, L. (2012). Erratum: “A State-Space Analysis for Reconstruction of Goal-Directed Movements Using Neural Signals.” Neural Computation, 24(4), 1106–1107.

Cunningham, J. P., Nuyujukian, P., Gilja, V., Chestek, C. A., Ryu, S. I., & Shenoy, K. V. (2011). A closed-loop human simulator for investigating the role of feedback control in brain-machine interfaces. J. Neurophysiol., 105(4), 1932–1949.

Dangi, S., Gowda, S., Heliot, R., & Carmena, J. M. (2011). Adaptive Kalman filtering for closed-loop brain-machine interface systems. In Proceedings of the 5th IEEE EMBS Conference on Neural Engineering (pp. 609–612). Piscataway, NJ: IEEE.

DiGiovanna, J., Mahmoudi, B., Fortes, J., Principe, J. C., & Sanchez, J. C. (2009). Coadaptive brain-machine interface via reinforcement learning. IEEE Trans. Biomed. Eng., 56(1), 54–64.


Eden, U. T., Frank, L. M., Barbieri, R., Solo, V., & Brown, E. N. (2004). Dynamic analysis of neural encoding by point process adaptive filtering. Neural Computation, 16, 971–998.

Gage, G. J., Ludwig, K. A., Otto, K. J., Ionides, E. L., & Kipke, D. R. (2005). Naive coadaptive cortical control. J. Neural Eng., 2(2), 52–63.

Ganguly, K., & Carmena, J. M. (2009). Emergence of a stable cortical map for neuroprosthetic control. PLoS Biol, 7(7), e1000153.

Ganguly, K., & Carmena, J. M. (2010). Neural correlates of skill acquisition with a cortical brain-machine interface. J. Mot. Behav., 42(6), 355–360. doi:10.1080/00222895.2010.526457

Gilja, V., Nuyujukian, P., Chestek, C. A., Cunningham, J. P., Yu, B. M., Fan, J. M., et al. (2012). A high-performance neural prosthesis enabled by control algorithm design. Nat. Neurosci., 15(12), 1752–1757. doi:10.1038/nn.3265

Gilja, V., Nuyujukian, P., Chestek, C. A., Cunningham, J. P., Yu, B. M., Ryu, S. I., et al. (2010). A high-performance continuous cortically-controlled prosthesis enabled by feedback control design. SfN Abstract Viewer/Itinerary Planner, Program/Poster 20.26.

Golub, M. D., Yu, B. M., & Chase, S. M. (2012, August 28–September 1). Internal models engaged by brain-computer interface control. Paper presented at the Engineering in Medicine and Biology Society, 2012 Annual International Conference of the IEEE.

Gurel, T., & Mehring, C. (2012). Unsupervised adaptation of brain-machine interface decoders. Front. Neurosci., 6, 164. doi:10.3389/fnins.2012.00164

Heliot, R., Ganguly, K., Jimenez, J., & Carmena, J. M. (2010). Learning in closed-loop brain-machine interfaces: Modeling and experimental validation. IEEE Trans. Syst. Man Cybern. B Cybern., 40(5), 1387–1397.

Hochberg, L. R., Bacher, D., Jarosiewicz, B., Masse, N. Y., Simeral, J. D., Vogel, J., et al. (2012). Reach and grasp by people with tetraplegia using a neurally controlled robotic arm. Nature, 485(7398), 372–375. doi:10.1038/nature11076

Hochberg, L. R., Serruya, M. D., Friehs, G. M., Mukand, J. A., Saleh, M., Caplan, A. H., et al. (2006). Neuronal ensemble control of prosthetic devices by a human with tetraplegia. Nature, 442(7099), 164–171.

Judy, J. (2011). Reliable Central-nervous-system interfaces (RCI): A BAA in the program on Reliable Neural Technology (RE-NET). Defense Advanced Research Projects Agency. https://safe.sysplan.com/renet/files/rci/2_Judy.pdf

Kim, S. P., Simeral, J. D., Hochberg, L. R., Donoghue, J. P., & Black, M. J. (2008). Neural control of computer cursor velocity by decoding motor cortical spiking activity in humans with tetraplegia. J. Neural Eng., 5(4), 455–476. doi:10.1088/1741-2560/5/4/010

Kowalski, K. (2012). MATLAB-Kinect interface code.

Kowalski, K., & Srinivasan, L. (2012). Simulation of two methods in co-adaptive control for brain-machine interfaces. http://hdl.handle.net/1721.1/70975

Krakauer, J. W., & Mazzoni, P. (2011). Human sensorimotor learning: adaptation, skill, and beyond. Curr. Opin. Neurobiol., 21(4), 636–644. doi:10.1016/j.conb.2011.06.012

Lagang, M., & Srinivasan, L. (2013). Stochastic optimal control as a theory of brain-machine interface operation. Neural Comput., 25(2), 374–417. doi:10.1162/NECO_a_00394


Leuthardt, E. C., Pei, X. M., Breshears, J., Gaona, C., Sharma, M., Freudenberg, Z., et al. (2012). Temporal evolution of gamma activity in human cortex during an overt and covert word repetition task. Front. Hum. Neurosci., 6, 99. doi:10.3389/fnhum.2012.00099

Li, Z., O’Doherty, J. E., Lebedev, M. A., & Nicolelis, M. A. (2011). Adaptive decoding for brain-machine interfaces through Bayesian parameter updates. Neural Comput., 23(12), 3162–3204. doi:10.1162/NECO_a_00207

Mahmoudi, B., & Sanchez, J. C. (2011). A symbiotic brain-machine interface through value-based decision making. PLoS One, 6(3), e14760. doi:10.1371/journal.pone.0014760

Marchal-Crespo, L., & Reinkensmeyer, D. J. (2009). Review of control strategies for robotic movement training after neurologic injury. J. Neuroeng. Rehabil., 6, 20. doi:10.1186/1743-0003-6-20

McFarland, D. J., Sarnacki, W. A., & Wolpaw, J. R. (2010). Electroencephalographic (EEG) control of three-dimensional movement. J. Neural Eng., 7(3), 036007. doi:10.1088/1741-2560/7/3/036007

Miller, K. J., Schalk, G., Fetz, E. E., den Nijs, M., Ojemann, J. G., & Rao, R. P. (2010). Cortical activity during motor execution, motor imagery, and imagery-based online feedback. Proc. Natl. Acad. Sci. USA, 107(9), 4430–4435. doi:10.1073/pnas.0913697107

Moran, D. W., & Schwartz, A. B. (1999). Motor cortical representation of speed and direction during reaching. J. Neurophysiol., 82(5), 2676–2692.

Orsborn, A. L., Dangi, S., Moorman, H. G., & Carmena, J. M. (2011). Exploring timescales of closed-loop decoder adaptation in brain-machine interfaces. In Conf. Proc. IEEE Eng. Med. Biol. Soc., 2011 (pp. 5436–5439). Piscataway, NJ: IEEE. doi:10.1109/IEMBS.2011.6091387

Orsborn, A. L., Dangi, S., Moorman, H. G., & Carmena, J. M. (2012). Closed-loop decoder adaptation on intermediate time-scales facilitates rapid BMI performance improvements independent of decoder initialization conditions. IEEE Transactions on Neural Systems and Rehabilitation Engineering, 20, 468–477.

Richardson, A. G., Borghi, T., & Bizzi, E. (2012). Activity of the same motor cortex neurons during repeated experience with perturbed movement dynamics. J. Neurophysiol., 107, 3144–3154. doi:10.1152/jn.00477.2011

Santhanam, G., Ryu, S. I., Yu, B. M., Afshar, A., & Shenoy, K. V. (2006). A high-performance brain-computer interface. Nature, 442(7099), 195–198.

Schalk, G., Miller, K. J., Anderson, N. R., Wilson, J. A., Smyth, M. D., Ojemann, J. G., et al. (2008). Two-dimensional movement control using electrocorticographic signals in humans. J. Neural Eng., 5(1), 75–84. doi:10.1088/1741-2560/5/1/008

Serruya, M. D., Hatsopoulos, N. G., Paninski, L., Fellows, M. R., & Donoghue, J. P. (2002). Instant neural control of a movement signal. Nature, 416(6877), 141–142.

Shenoy, P., Krauledat, M., Blankertz, B., Rao, R. P., & Muller, K. R. (2006). Towards adaptive classification for BCI. J. Neural Eng., 3(1), R13–R23. doi:10.1088/1741-2560/3/1/R02

Smith, A. C., Frank, L. M., Wirth, S., Yanike, M., Hu, D., Kubota, Y., et al. (2004). Dynamic analysis of learning in behavioral experiments. J. Neurosci., 24(2), 447–461. doi:10.1523/JNEUROSCI.2908-03.2004

Srinivasan, L., Eden, U. T., Mitter, S. K., & Brown, E. N. (2007). General purpose filter design for neural prosthetic devices. J. Neurophysiol., 98, 2456–2475.


Srinivasan, L., Eden, U. T., Willsky, A. S., & Brown, E. N. (2005). Goal-directed state equation for tracking reaching movements using neural signals. In Proc. 2nd Internat. IEEE EMBS Conf. on Neural Engineering (pp. 352–355). Piscataway, NJ: IEEE.

Srinivasan, L., Eden, U. T., Willsky, A. S., & Brown, E. N. (2006). A state-space analysis for reconstruction of goal-directed movements using neural signals. Neural Computation, 18, 465–494.

Stengel, R. F. (1994). Optimal control and estimation. New York: Dover.

Taylor, D. M., Tillery, S. I., & Schwartz, A. B. (2002). Direct cortical control of 3D neuroprosthetic devices. Science, 296(5574), 1829–1832.

Truccolo, W., Eden, U. T., Fellows, M. R., Donoghue, J. P., & Brown, E. N. (2005). A point process framework for relating neural spiking activity to spiking history, neural ensemble, and extrinsic covariate effects. J. Neurophysiol., 93(2), 1074–1089. doi:10.1152/jn.00697.2004

Velliste, M., Perel, S., Spalding, M. C., Whitford, A. S., & Schwartz, A. B. (2008). Cortical control of a prosthetic arm for self-feeding. Nature, 453(7198), 1098–1101. doi:10.1038/nature06996

Wolpaw, J. R., McFarland, D. J., Neat, G. W., & Forneris, C. A. (1991). An EEG-based brain-computer interface for cursor control. Electroencephalogr. Clin. Neurophysiol., 78(3), 252–259.

Yu, B. M., Kemere, C., Santhanam, G., Afshar, A., Ryu, S. I., Meng, T. H., et al. (2007). Mixture of trajectory models for neural decoding of goal-directed movements. J. Neurophysiol., 97, 3763–3780.

Zhang, Y., Schwartz, A. B., Chase, S. M., & Kass, R. E. (2012). Bayesian learning in assisted brain-computer interface tasks. In Proceedings of the 34th Annual International Conference of the IEEE EMBS (p. 2740). Piscataway, NJ: IEEE.

Received October 1, 2012; accepted February 28, 2013.