A Presentation of Doctoral Dissertation
TRANSCRIPT
20-May-21 | 1
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter: Cong Phat Vo
Department of Mechanical Engineering, University of Ulsan, Korea
Ph.D. candidate: Cong Phat Vo
Advisor: Professor Kyoung Kwan Ahn
Title:
A Sensorless Reflecting Control for
Bilateral Haptic Teleoperation System based on
Pneumatic Artificial Muscle Actuators
Fluid Power Control and Machine Intelligence Laboratory
CONTENTS
1. Introduction
2. Modeling and platform setup based on an APAM configuration
3. Adaptive Gain ITSMC with TDE scheme for the position control
4. Adaptive Finite-Time Force Sensorless Control scheme with TDE
5. Model-free control for Optimal contact force in Unknown Environment
6. Optimized Human-Robot Interaction control using IRL
7. Force Sensorless Reflecting Control for BHTS
8. Conclusion and Future works
1 INTRODUCTION
Overview of the BHTS structure
Figure 1.1 The bilateral haptic teleoperation architecture (the operator at the master haptic interface, with a monitor, is connected through the communication channel to the slave teleoperator in the remote environment with a vision sensor; the master and slave controllers exchange the positions x_m, x_s and the forces f_h, f_e).
Robustness: the controllers are required to be robustly stable with respect to a specified set of uncertainties introduced by the different components (the operator, the environment, the communication channel, and the sensors).
Performance: the main purpose is to provide the technical means to successfully perform a desired task in the environment (achieving high task performance). It is evaluated with physically accessible quantities (task completion time, measurement error, applied forces, or dissipated energies).
Perception: the operator's feeling of being there (presence). Presence is correlated with task performance in a positive causal way, implying that improving the feeling of presence improves task performance as well.
Transparency: compared to perception, it is a quantitative objective. Transparency and robustness are conflicting objectives (a trade-off). A measure for transparency is fidelity, which describes the ability to exactly display the environment to the operator.
1 INTRODUCTION
Overview of the BHTS structure
Variable Impedance Control [5, 6]
Model-mediated Teleoperation [40]
Movement Estimation [78, 79]
Passivity [27], Prediction [39], …
The objectives (robustness, performance, perception, transparency) are classified, evaluated, and achieved with respect to the environment condition, the operator's characteristics, and the communication channel.
1 INTRODUCTION
PAM actuator
Fig. 1.2 (a) Structure and (b) working principle of the PAM: a rubber hose wrapped in loose-weave fibers with fitting connections; going from the un-pressurized to the pressurized state produces a contraction stroke.
Applications: KNEXO robot [22], Wearable Haptic Device [23], 2-joint manipulator [16, 17], forceps manipulators [20].
1 INTRODUCTION
Research Objectives
01 Propose a position tracking controller: an integral terminal-style SMC is designed with the combination of an adaptive gain and the TDE technique.
02 Propose a finite-time force controller: an adaptive-gain fast ITSMC-TDE scheme is established with a friction-free disturbance technique in finite time.
03 Propose model-free control for the slave-environment interaction: an RL approximation method is applied to learn the optimal (minimal) contact force online.
04 Propose an optimal impedance for the human-master interaction: IRL is used to optimize the prescribed impedance model parameters to assist the human with minimum effort.
05 Propose force sensorless reflecting control for the bilateral haptic teleoperation: a new AFOB is proposed to estimate the force signals, achieving great transparency (position and force feedback).
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Mathematical modeling
Fig. 2.1 A cross-sectional view of the simplified geometric model: a thread of length b makes n turns around the hose of diameter D, so one turn spans πD and b² = L² + (nπD)².
The tensile force of each PAM [7]:
  F = −P dV/dL = P (3L² − b²) / (4πn²)
The motion equation of the antagonistic system:
  M ẍ + B ẋ = −F₁ + F₂
Table 2.1 Parameters of the studied PAM
- Ps, supply pressure: 6.5 bar
- B, viscous damping coefficient: 50 N·s/mm
- M, mass of the plate: 0.5 kg
- L0, original length of the PAM: 175 mm
- n, number of turns of the thread: 104
- b, thread length: 200 mm
In state-space form, the motion equation of the system becomes
  ẍ = f₀(x, ẋ) + g₀(x) u + d
where f₀(x, ẋ) and g₀(x) are the nominal values of f(x, ẋ) and g(x): f₀ collects the nominal pressure and viscous terms (built from P₀, (L₀ − x)², b², B ẋ, and the factors M and n²M), g₀ is the nominal input gain proportional to 3(L₀ − x)²/(n²M), and d is the total of the unknown disturbances,
  d = (f(x, ẋ) − f₀(x, ẋ)) + (g(x) − g₀(x)) u + d̄
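As a quick sanity check, the force formula and the antagonistic motion equation above can be simulated directly. The sketch below uses illustrative values (supply pressure, differential-pressure step, and n = 3 turns), not the identified parameters of Table 2.1:

```python
import numpy as np

def pam_force(P, L, b, n):
    """Tensile force of a McKibben-type PAM, F = P*(3L^2 - b^2)/(4*pi*n^2)."""
    return P * (3.0 * L**2 - b**2) / (4.0 * np.pi * n**2)

def simulate_plate(dt=1e-3, T=0.02, M=0.5, B=50.0, L0=0.175, b=0.2, n=3.0,
                   P0=2.0e5, dP=0.3e5):
    """Euler integration of the antagonistic pair: M*x'' + B*x' = -F1 + F2,
    with PAM 1 at length L0 - x and PAM 2 at L0 + x (small-stroke idealization)."""
    x, xd = 0.0, 0.0
    for _ in range(int(T / dt)):
        F1 = pam_force(P0 + dP, L0 - x, b, n)  # pressurized muscle
        F2 = pam_force(P0 - dP, L0 + x, b, n)  # antagonist muscle
        xdd = (-F1 + F2 - B * xd) / M
        xd += dt * xdd
        x += dt * xd
    return x
```

With a positive differential pressure, F1 exceeds F2 and the plate moves in the negative direction, consistent with the sign convention of the motion equation.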
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Experimental modeling
The approximated PAM model based on its pressure and contraction strain:
  M ẍ_k + b_k(P_k) ẋ_k + k_k(P_k) x_k = f_k(P_k),  k = 1, 2
with pressure-dependent coefficients
  b_k(P_k) = b_k1 P_k + b_k0
  k_k(P_k) = k_k1 P_k + k_k0
  f_k(P_k) = f_k1 P_k + f_k0
- Spring elements: k_k1 = 11732 N/m/bar, k_k0 = 68 N/m
- Damping elements: b_k1 = 962 N·s/m/bar, b_k0 = 0.85 N·s/m
- Force elements: f_k1 = 9122 N/bar, f_k0 = 473 N
Fig. 2.3 Experimental model for certain separate pressures (force [N], 0–500, versus contraction length [mm], 0–100, at pressures from 1 to 6 bar).
Table 2.2 List of APAM model parameters
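Because the three-element model above is linear in the pressure, it is straightforward to evaluate. The sketch below uses placeholder coefficients (not the identified values of Table 2.2) to illustrate the structure:

```python
def pam_coeffs(P, k1=117.32, k0=6.8, b1=9.62, b0=0.85, f1=91.22, f0=4.73):
    """Pressure-dependent coefficients of the three-element model
    M*x'' + b(P)*x' + k(P)*x = f(P); numeric defaults are placeholders,
    not the dissertation's identified values."""
    k = k1 * P + k0      # spring element
    b = b1 * P + b0      # damping element
    f = f1 * P + f0      # force element
    return k, b, f

def static_contraction(P):
    """Steady-state contraction x_ss = f(P)/k(P) at a constant pressure P."""
    k, _, f = pam_coeffs(P)
    return f / k
```

At steady state the contraction follows the pressure monotonically, which is the qualitative behavior shown in Fig. 2.3.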
Fig. 2.2 Block diagram of the experimental system: PAM 1 and PAM 2 act on a plate over a pulley under load; a linear encoder returns the position signal through a PCI-QUAD04 card, while the proportional pressure regulator valves receive the control signal and report the pressure signal through a PCI-1711 card.
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Platform setup
Fig. 2.5 The APAM-based (a) master and (b) slave subsystem configurations.
(a) Master (front and back views): a control stick on a pulley driven by pneumatic muscle actuators, with proportional pressure regulator valves, an optical encoder, loadcells, and an interactive loadcell.
(b) Slave: an APAM system whose PAM and pulley contact a spring-damper environment through a loadcell and proportional valve; the position, control, stiffness/damping, and contact force signals are exchanged with the PC through PCI cards and a control box.
2 MODELING OF THE BILATERAL TELEOPERATION SYSTEM
Table 2.3 Specifications of the experimental devices
- PAM: Festo MAS-10-200N-AA-MC-O; nominal length 200 mm; inside diameter 10 mm; max pressure 8 bar
- PPR valve: Festo VPPM-8L-L-1-G14-0L8H; max pressure 8 bar
- DAQ cards: ADVANTECH PCI-1711 (AI/AO, 12-bit resolution); MEASUREMENT PCI-QUAD04
- Linear encoder: type Rational WTB5-0500MM; resolution 0.005 mm
Fig. 2.6 Photograph of the experimental apparatus of the overall BHTS (master robot with control stick, slave robot, and environment).
Platform setup
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Proposed control idea
1. For the first time, the AITSMC-TDE scheme is proposed for a PAM system.
2. The stability of the controlled system is theoretically analyzed.
3. The control effectiveness is verified by experimental trials in different challenging working conditions (compared against several controllers).
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
A terminal sliding surface is chosen:
  s = e₂ + k₁ e₁ + k₂ sig^α(e₁),  with sig^α(·) = |·|^α sign(·)
A proposed integral sliding surface is designed:
  σ(t) = s(t) − s(t₀) − ∫_{t₀}^{t} [ f₀(x₁, x₂) + g₀(x₁) uₙ − ẍ_d + k₁ e₂ + k₂ α |e₁|^{α−1} e₂ ] dτ
A robust control signal can be redesigned:
  u_s(t) = −g₀⁻¹(x₁) (η̂₃ + η₄) sign(σ)
An adaptive law for the robust gain η̂₃ increases it at a rate set by μ and |σ| outside a boundary layer δ, and holds (or resets to a small value σ̄) inside it.
Proof: see Appendix 3.2 and 3.3.
1 2d dx x
Time-delay
Estimator
The PAM system
Control law
Robust
control signal
Adaptation
law
Nominal
control signal
Transfer
Coordinate Integral Terminal
Sliding Surface
Terminal
Sliding Surface
Reference
Trajectory
d
The proposed
control approach
u
1 2x x
su
3ˆ
1 2x x
s
1
2
nu
Control design
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
The auxiliary variables are defined as
  e₁ = x₁ − x_{1d},  e₂ = ė₁ = x₂ − ẋ_{1d}
The nominal control signal can be designed as
  uₙ = g₀⁻¹(x₁) [ ẍ_{1d} − f₀(x₁, x₂) − k₁ e₂ − k₂ α |e₁|^{α−1} e₂ − κ₁ σ − κ₂ sig^{γ₁}(σ) ]
The time-delay estimator approximates the lumped disturbance from signals delayed by t_d:
  d̂(t) = ẋ₂(t − t_d) − f₀(x(t − t_d)) − g₀(x(t − t_d)) u(t − t_d)
The control law can then be rewritten as
  u(t) = uₙ + u_s − g₀⁻¹(x₁) d̂
Proof: see Appendix 3.1 and 3.4–3.5.
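The TDE step can be illustrated on a toy second-order plant: the lumped disturbance is estimated from one-step-delayed measurements and cancelled in the control law. The plant, gains, and disturbance below are illustrative, not the dissertation's PAM dynamics:

```python
import numpy as np

def track(use_tde=True, dt=1e-3, T=10.0, kp=100.0, kd=20.0):
    """Track x_d = sin(t) on xdd = f(x, xd) + u + d with a mismatched nominal
    model f0; with TDE, d_hat is the one-step-delayed lumped disturbance."""
    x, xd = 0.0, 0.0
    prev_xdd, prev_f0, prev_u = 0.0, 0.0, 0.0
    sq_err = 0.0
    n = int(T / dt)
    for k in range(n):
        t = k * dt
        r, rd, rdd = np.sin(t), np.cos(t), -np.sin(t)
        f0 = -4.0 * x                                   # nominal model
        d_hat = (prev_xdd - prev_f0 - prev_u) if use_tde else 0.0
        u = rdd - f0 - kp * (x - r) - kd * (xd - rd) - d_hat
        # true dynamics: extra stiffness/damping plus an external disturbance
        lumped = (-1.0 * x - 0.5 * xd) + 2.0 * np.sin(2.0 * t)
        xdd = f0 + lumped + u
        prev_xdd, prev_f0, prev_u = xdd, f0, u
        xd += dt * xdd
        x += dt * xd
        sq_err += (x - r) ** 2
    return np.sqrt(sq_err / n)                          # RMS tracking error
```

Running the loop with and without the TDE term shows the residual error shrinking by orders of magnitude, since only the one-sample change of the lumped disturbance survives the cancellation.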
Parameters
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Experimental validation
for the ITSMC: η₁ = 3, η₂ = 15, η₃ = 25, κ₁ = κ₃ = 10, κ₂ = κ₄ = 2, γ₁ = γ₂ = 5/7, k₁ = 2, k₂ = 55, α = 9/11
for the adaptive gain: μ = 10, δ = 0.05, σ = 0.1
for PID, case study 3: KP = 15, KI = 18, KD = 0.05; other cases: KP = 35, KI = 28, KD = 0.1
Proposed control
PID control
Fig. 3.1 Performance response in case study 1
Fig. 3.2 Performance response in case study 2
Case study 1 (no load): x_d (mm) = 8 sin(0.4πt)
Case study 2 (no load): x_d (mm) = 8 sin(πt)
Case study 3 (no load): a multistep trajectory
Case study 4 (with load): x_d (mm) = 8 sin(πt), with a 2-kg and a 7-kg load in turn
Fig. 3.3 Performance response in case study 3
Fig. 3.4 Performance response in case study 4
Table 3.1 A quantitative comparison between the tracking performances (PID / ITSMC / Proposed)
- Sine 0.2 Hz – 8 mm: L2[e] 0.4102 / 0.1737 / 0.1574; L2[u] 0.9961 / 1.0270 / 0.9602
- Sine 1 Hz – 8 mm: L2[e] 1.1845 / 0.4996 / 0.3788; L2[u] 1.1284 / 1.1152 / 1.0900
- Multistep: L2[e] 0.8587 / 0.6846 / 0.6368; L2[u] 0.7670 / 0.7333 / 0.7247
- Sine 1 Hz – 8 mm with 2-kg load: L2[e] 1.2578 / 0.8314 / 0.4779; L2[u] 1.1414 / 1.1506 / 1.1287
- Sine 1 Hz – 8 mm with 7-kg load: L2[e] 1.5937 / 1.0827 / 0.5067; L2[u] 1.1613 / 1.1729 / 1.1584
Experimental validation
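The L2[e] and L2[u] entries in Table 3.1 are scalar measures of the sampled error and control signals. A common definition (assumed here; the table's exact normalization may differ) is the discrete L2 norm:

```python
import numpy as np

def l2_measure(signal, dt):
    """Approximate sqrt( integral of s(t)^2 dt ) from uniformly sampled data."""
    s = np.asarray(signal, dtype=float)
    return float(np.sqrt(np.sum(s * s) * dt))
```

For example, a unit sine sampled over one full period has an L2 measure of sqrt(π), which is the kind of scalar that lets different controllers be ranked on the same trajectory.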
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Summary
- Robust performance against uncertainties and disturbances
- Finite-time convergence of the tracking errors
- Reducing the chattering phenomenon
- Eliminating the reaching phase
- Fast transient response
Future work will be directed to control exploiting the force properties.
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Proposed control idea
1. The proposed FSOB scheme obtains force information with the noise and static friction eliminated.
2. The FITSMC guarantees finite-time convergence; a new adaptive gain and the TDE are deployed online to handle the uncertainties and the chattering phenomenon.
3. The reliability and superiority of the proposed control are demonstrated in different challenging working conditions.
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
The FITSM surface is defined on the force tracking error e_f:
  σ(t) = e_f(t) + ∫₀ᵗ [ ρ₁ e_f + ρ₂ |e_f|^v sign(e_f) ] dτ
The control law combines the disturbance estimate with proportional and terminal terms on e_f and an adaptive switching term on σ:
  u_c(t) = d̂(t) + K_d e_f + ρ₁ e_f + ρ₂ sig^v(e_f) + (η̂₄ + η₅) sig^v(σ)
An adaptive switching gain law drives η̂₄ from |σ| so that the switching amplitude tracks the lumped uncertainty online.
The estimated lumped disturbance is propagated by the force-based time-delay estimator from its delayed value, the delayed control input, and the delayed estimated contact torque; the contact force itself is obtained by a force observer based on the LeD [40].
Proof: see Appendix 4.1–4.3.
(Block diagram: the reference trajectory and environment configuration feed the integral terminal sliding surface; the control law with its adaptive law and the force-based time-delay estimator d̂ drives the APAM system, while the force observer on the plant-environment side returns the estimated contact torque τ̂_e.)
Control design
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
for the FITSMC: η₁ = 3, η₂ = 15, η₃ = 25, κ₁ = κ₃ = 10, κ₂ = κ₄ = 2, γ₁ = γ₂ = 5/7, k₁ = 2, k₂ = 55, v = 9/11
for the adaptive gain: μ = 10, δ = 0.05, σ = 0.1
for the FSOB: ω = 500 rad/s, ξ = 0.001, α₁ = 20, α₂ = 12, α₃ = 5, α₄ = 15
for PID, scenario 3: KP = 15, KI = 18, KD = 0.05; other scenarios: KP = 35, KI = 28, KD = 0.1
Scenario 1: τ_d (N) = 7.5 sin(0.2πt − π/2) + 6.25
Scenario 2: τ_d (N) = 7.5 sin(0.8πt − π/2) + 6.25
Scenario 3: a multistep trajectory with an LPF bandwidth of 0.2 rad/s
Fig. 4.1 Performance response in Scenario 1
Fig. 4.2 Performance response in Scenario 2
Parameters
Experimental validation
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Fig. 4.3 Performance response in Scenario 3
Fig. 4.4 Control gain
Table 4.1 An RMSE comparison between the tracking performances (PID / ISMC / FITSMC-TDE / Proposed)
- Sine 0.1 Hz: 2.1050 / 1.2152 / 0.9440 / 0.5158
- Sine 0.4 Hz: 3.2054 / 1.8045 / 2.0632 / 1.9807
- Multistep: 2.6508 / 2.3548 / 2.0618 / 1.7685
Experimental validation
Discussion:
- Improved control performance (fast response, high accuracy, and robustness)
- Guaranteed finite-time convergence
- Cancelled the lumped uncertainties online via the TDE solution and the switching gain in the FITSMC
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Proposed control idea
1. The RL method is used to learn the optimal desired force online.
2. The stability of the closed-loop control is theoretically demonstrated by the Lyapunov approach.
3. Verified effectiveness for robot interaction control in unknown environments.
The environment is modeled as a spring-damper system [81]:
  F_e(t) = −C_e ẋ(t) + K_e (x_e − x(t))
The optimal (prescribed) impedance model maps the position error e_r to the desired force, Z_r(e_r(t)) = F_d(t), realized as a second-order relation M ë_r + C ė_r + K e_r driven by the force feedback terms, or compactly as the error dynamics
  ė_r(t) = A e_r(t) + B F_d(t)
The position control in task space is a PI law on the force error e_f:
  x_f(t) = K_Pf e_f(t) + K_If ∫₀ᵗ e_f(τ) dτ
(Block diagram: the task-specific outer loop — prescribed impedance model, RL-based approximation generating F_d, and force control producing x_r — wraps the robot-specific inner loop of precise position control of the APAM system against the environment.)
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
LQR solution (offline)
The main objective of this work is to determine the optimal desired force F_d, the minimum force that minimizes the position error e_r:
  F_d(t) = κ_e e_r(t)
The discounted cost function:
  J_s(e_r(t), F_d(t)) = ∫_t^∞ e^{−γ(τ−t)} [ e_rᵀ S e_r + F_dᵀ R F_d ] dτ
The algebraic Riccati equation:
  AᵀPA + S − P + AᵀPB κ_e = 0
The feedback control gain:
  κ_e = −(BᵀPB + R)⁻¹ BᵀPA
This is an offline solution requiring the model; the RL solution replaces it with an online IRL approximation.
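The offline LQR branch can be sketched by iterating the Riccati recursion above to its fixed point (the discount factor is omitted here for simplicity, and A, B, S, R are placeholders):

```python
import numpy as np

def dlqr_gain(A, B, S, R, iters=500):
    """Iterate P <- A'PA + S + A'PB*K with K = -(B'PB + R)^-1 B'PA until the
    fixed point, returning the feedback gain K and cost matrix P."""
    P = S.copy()
    for _ in range(iters):
        K = -np.linalg.solve(B.T @ P @ B + R, B.T @ P @ A)
        P = A.T @ P @ A + S + A.T @ P @ B @ K
    return K, P
```

At convergence, P satisfies the slide's ARE and K is the optimal feedback gain κ_e; this is exactly the quantity the online IRL scheme must recover without knowing A and B.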
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
RL solution (online)
The offline ARE solution is impossible to recompute for all time without a model, so the value function is posed directly. The discounted cost (value) function:
  V_s(e_r(t)) = ∫_t^∞ e^{−γ(τ−t)} r(e_r(τ), F_d(τ)) dτ,  where r(e_r, F_d) = e_rᵀ S e_r + F_dᵀ R F_d
The Hamilton-Jacobi-Bellman equation:
  min_{F_d} [ r(t) − γ V_s(e_r(t)) + V̇_s(e_r(t)) ] = 0
The new algebraic Riccati equation:
  e_rᵀ(t) (AᵀP + PA + S − γP) e_r(t) + 2 e_rᵀ(t) P B F_d(t) + F_dᵀ(t) R F_d(t) = 0
The optimal desired force:
  F_d(t) = −H₂₂⁻¹ H₂₁ e_r(t),  where Q(e_r(t), F_d(t)) = V_s(e_r(t)) = Xᵀ H X with X = [e_rᵀ  F_dᵀ]ᵀ
This is an online solution: off-policy RL finds the solution of the given new ARE.
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
IRL approximation (minimizing the temporal-difference error)
The Q-value function approximator uses N normalized Gaussian bases:
  Q̂(e_r, F_d) = Σ_{i=1}^{N} θ_i φ_i(e_r, F_d)
  φ_i(e_r, F_d) = μ_i(e_r, F_d) / Σ_{j=1}^{N} μ_j(e_r, F_d)
  μ_i(e_r, F_d) = exp( −½ [e_r − c_i^e, F_d − c_i^F] Λ_i⁻¹ [e_r − c_i^e, F_d − c_i^F]ᵀ )
The parameter update law adjusts θ by normalized gradient descent on the temporal-difference error between the two sides of the Q-function update law.
The Q-function update law based on IRL:
  Q(e_r(t), F_d(t)) = ∫_t^{t+Δt} e^{−γ(τ−t)} r(e_r(τ), F_d(τ)) dτ + e^{−γΔt} Q(e_r(t+Δt), F_d(t+Δt))
The reward (utility) function:
  r(t) = e_rᵀ(t) S e_r(t) + F_dᵀ(t) R F_d(t)
The optimal desired force is then extracted as F_d(t) = κ_e e_r(t).
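The normalized-Gaussian Q-approximator and its TD update can be sketched as follows; the centers, width, and learning rate are illustrative choices, and the update shown is a plain gradient step rather than the dissertation's normalized law:

```python
import numpy as np

class RBFQApprox:
    """Q(e_r, F_d) ~= theta . phi(e_r, F_d) with normalized Gaussian bases."""
    def __init__(self, centers, width=1.0, beta=0.5):
        self.c = np.asarray(centers, dtype=float)   # rows: (e_r, F_d) centers
        self.inv2w2 = 1.0 / (2.0 * width**2)
        self.theta = np.zeros(len(self.c))
        self.beta = beta                            # learning rate

    def phi(self, x):
        mu = np.exp(-np.sum((self.c - x) ** 2, axis=1) * self.inv2w2)
        return mu / np.sum(mu)                      # normalized basis vector

    def q(self, e_r, f_d):
        return float(self.theta @ self.phi(np.array([e_r, f_d])))

    def td_update(self, e_r, f_d, target):
        """One gradient step shrinking the TD error (target - Q)."""
        p = self.phi(np.array([e_r, f_d]))
        err = target - self.theta @ p
        self.theta += self.beta * err * p
        return err
```

Repeated updates with targets built from the IRL Bellman equation (integrated reward plus the discounted next-state Q) drive the weights toward the quadratic value XᵀHX from which κ_e is extracted.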
Fig. 5.1 RL in unknown environments during the learning process
Fig. 5.2 Learning curves
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
RL approximation: υ = 0.9, γ = 0.9, δ_Δ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
Position/force control: K_Pf = 0.001, K_If = 0.02, λ = 0.2, η_Δ = 0.1, ρ₁ = 0.5, ρ₂ = 20, v = 9/11, α = 5/7
Environment: C_e = 0.5 N·s·mm⁻¹, K_e = 2 N·mm⁻¹, r₀ = 16.5 mm, x_d = 39.917 mm, x_e = 35 mm
Parameters
Validation results
Fig. 5.3 Desired force/position tracking response
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Validation results
Three steps are required to complete the experiment:
1. identification of the environment;
2. obtaining the ARE solution using the environmental estimation (LQR solution);
3. the complete experiment for the position/force controller (IRL approximation).
The IRL approximation achieves near-optimal performance without knowledge of the environment dynamics.
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Proposed control idea
1. Find the optimal parameters of the impedance model to assure motion tracking and to assist the human with minimum effort to perform the task in real time.
2. The stability of the closed-loop control is theoretically demonstrated.
3. The unknown human dynamics as well as the optimal overall human-robot system efficiency are verified in experimental trials.
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design: a robot-specific inner loop and a task-specific outer loop.
The control torque in task space combines a proportional-integral action on the tracking error e_h (gain K_s), a neural-network functional approximation of the unknown human/robot nonlinearities, and bounding terms (gains K_B, K_q).
The continuous unknown function is approximated as f̂_h(q) = Ŵᵀφ(q), with adaptation laws of σ-modification form, e.g. Ŵ̇ = F φ e_hᵀ − k F ‖e_h‖ Ŵ (and analogously with gain G for the second weight set).
The prescribed impedance model:
  θ_m(s) / τ_h(s) = K_h / (M_d s² + C_d s + K_d)
(Block diagram: the human operator exerts τ_h on the APAM-based haptic interface under the torque control law with NN functional approximation; the prescribed impedance model produces θ_d, the IRL-based optimal law tunes it, and the error e_d = θ_d − θ_m with gain K_h closes the outer loop.)
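The prescribed impedance model above is just a second-order filter from the human torque to the model trajectory. A discrete simulation sketch (parameter values illustrative, loosely following the simulation parameters quoted later) is:

```python
import numpy as np

def impedance_response(tau_h, dt, M_d=1.0, C_d=7.18, K_d=15.76, K_h=0.046):
    """Euler simulation of M_d*th'' + C_d*th' + K_d*th = K_h*tau_h,
    i.e. theta_m / tau_h = K_h / (M_d s^2 + C_d s + K_d)."""
    th, thd = 0.0, 0.0
    out = np.zeros(len(tau_h))
    for k, tau in enumerate(tau_h):
        thdd = (K_h * tau - C_d * thd - K_d * th) / M_d
        thd += dt * thdd
        th += dt * thd
        out[k] = th
    return out
```

For a constant human torque the model settles at θ = K_h τ_h / K_d, which shows directly how the IRL-tuned parameters (M_d, C_d, K_d, K_h) shape the assistance felt by the operator.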
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
IRL based on the LQR solution.
The human transfer function [99] maps the tracking error to the human torque with a first-order lead model, G(s) = (κ_p + κ_d s)/(k_e s + 1); the purpose is to shape the response from τ_h to e_d.
A new augmented state X stacks the impedance-error dynamics (matrices A_q, B_q) and the human model (matrices A_h, B_h, built from k_e, κ_p, κ_d), giving
  Ẋ = A X + B u_e,  with state feedback u_e = −K X,  and e_d = θ_d − θ_m.
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
The performance index (infinite-horizon quadratic cost) to minimize:
  J_m = ∫_t^∞ ( e_dᵀ Q_d e_d + τ_hᵀ Q_h τ_h + u_eᵀ R u_e ) dτ
Solving it through the algebraic Riccati equation [83]:
  0 = AᵀP + PA + Q − P B R⁻¹ BᵀP
gives the optimal feedback control
  u_e = arg min_{u_e} V(t, X(t)) = −R⁻¹ BᵀP X
With the augmented state Ẋ = AX + Bu_e and u_e = −KX, the value function is quadratic, V(X(t)) = XᵀPX, where P is the real symmetric positive-definite solution of the Lyapunov equation
  (A − BK)ᵀP + P(A − BK) = −(Q + KᵀRK)
The IRL Bellman equation for a time interval ΔT > 0:
  V(X(t)) = ∫_t^{t+ΔT} Xᵀ (Q + KᵀRK) X dτ + V(X(t + ΔT))
which can therefore be written as
  Xᵀ(t) P X(t) = ∫_t^{t+ΔT} Xᵀ (Q + KᵀRK) X dτ + Xᵀ(t+ΔT) P X(t+ΔT)
Proof of convergence: see Appendix 6.2.
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig. 6.1 Continuous-time linear policy iteration algorithm (online implementation):
1. Initialization: start with an admissible control input u_e⁰ = −K¹X (P⁰ = 0, i = 1, K¹ chosen so that A − BK¹ is stable).
2. Policy evaluation: given the control policy uⁱ, find Pⁱ from the off-policy IRL Bellman equation
   Xᵀ(t) Pⁱ X(t) = ∫_t^{t+ΔT} Xᵀ (Q + (Kⁱ)ᵀ R Kⁱ) X dτ + Xᵀ(t+ΔT) Pⁱ X(t+ΔT),
   solving for the cost using least squares over measured state samples.
3. Policy improvement: update the control policy u_e^{i+1} = −R⁻¹ Bᵀ Pⁱ X.
4. If ‖pⁱ − p^{i−1}‖ ≥ ε, set i → i + 1 and return to step 2; otherwise stop.
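When the model is available, each policy-evaluation step of Fig. 6.1 reduces to the Lyapunov equation that the IRL Bellman equation estimates from data (Kleinman's algorithm). A model-based sketch with illustrative matrices:

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, solve_continuous_are

def policy_iteration(A, B, Q, R, K0, tol=1e-9, max_iter=50):
    """Alternate policy evaluation, (A-BK)'P + P(A-BK) = -(Q + K'RK),
    with policy improvement, K <- R^-1 B'P, starting from a stabilizing K0."""
    K = K0
    P_prev = np.zeros_like(A)
    for _ in range(max_iter):
        Acl = A - B @ K
        P = solve_continuous_lyapunov(Acl.T, -(Q + K.T @ R @ K))  # evaluation
        K = np.linalg.solve(R, B.T @ P)                           # improvement
        if np.linalg.norm(P - P_prev) < tol:
            break
        P_prev = P
    return P, K
```

The iterates converge to the ARE solution, which is exactly what the data-driven least-squares evaluation in Fig. 6.1 recovers without knowing A.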
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig. 6.2 The tracking error between the trajectory of the robot system and the prescribed impedance model
Fig. 6.3 Reference trajectory versus the state of the robot system during and after learning
Parameters
LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
RL approximation: υ = 0.9, γ = 0.9, δ_Δ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
Position control: K_s = 20, F = 10, G = 1, μ₁ = μ₂ = 5, k = 0.1, Z̄ = 100, κ_p = 20, κ_d = 10, and k_e = 0.18 s
Prescribed model: θ_d = 10 sin(0.2πt), M_d = 1 kg, B_d = 7.18 N·s·mm⁻¹, K_d = 15.76 N·mm⁻¹, K_h = 0.046
Simulation
Validation results
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig. 6.4 The output trajectory and impedance model in the inner loop
Fig. 6.5 The desired trajectory and the output trajectory in the outer loop
Fig. 6.6 Interaction force
- Near-perfect performance of the human-robot cooperation
- A small deviation between the desired and actual trajectories
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller
Experiment
Validation results
Discussion
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Proposed control idea
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed.
2. The stability and finite-time convergence characteristics of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified in different working conditions.
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Control design (for each side i = m, s):
The external torque estimation (AFOB) reconstructs τ̂_id from the delayed robot dynamics with an adaptive observer gain.
An adaptive FSOB gain switches between growth driven by the sliding variable and a decay term z_i, depending on whether the trajectory is outside the boundary layer ε_i.
The modified FNTSM surface:
  s_i = ė_i + λ_i sig^{p_i/q_i}(e_i)
The torque control design (FNTSMC style):
  τ_ir = M_i(·) + C_i(·) + K_i(·) − ρ_{1i} s_i − ρ_{2i} sig^{ν_i}(s_i) − τ̂_id
(Block diagram: the operator drives the APAM master system through the master controller, with an estimator providing τ̂_md; an impedance filter links the master and slave references; the slave controller and APAM slave system, with estimator τ̂_sd, interact with the environment.)
Control design
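The finite-time behavior enforced by the ρ₁ s + ρ₂ sig^ν(s) reaching terms can be seen on a scalar sliding variable; the gains below mirror the slide's parameter ranges but are illustrative:

```python
def sig(x, a):
    """Signed power |x|^a * sign(x), used throughout terminal SMC designs."""
    return abs(x) ** a * (1.0 if x > 0 else -1.0 if x < 0 else 0.0)

def reach(s0=1.0, dt=1e-5, t_max=2.0, rho1=10.0, rho2=5.0, nu=9 / 11):
    """Integrate the fast reaching law s' = -rho1*s - rho2*sig(s, nu) and
    return (elapsed time, final s); the sig term with nu < 1 makes the
    convergence finite-time instead of merely exponential."""
    s, t = s0, 0.0
    while abs(s) > 1e-12 and t < t_max:
        s += dt * (-rho1 * s - rho2 * sig(s, nu))
        t += dt
    return t, s
```

With ρ₂ = 0 the decay is purely exponential and never actually reaches the threshold within the time cap, while the terminal term drives s to (numerical) zero in finite time.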
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig. 7.1 Torque estimation performances
Parameters
- Proposed control: λ_i = 8, p_i = 5, q_i = 7, ς_i = 10⁻⁴, ν_i = 9/11, ε_i = 0.1, ρ_{1i} = 10, ρ_{2i} = 2
- PID control: KP = 38, KI = 25, KD = 0.15
- Observers: for the NDO, L_h = L_e = 10; for the RTOB, β = 350 rad/s; for the AFOB, κ_i(0) = 0.1, λ_i = 10, η_i = 0, m = 0.1, a = 10
The RMSEs associated with the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.
Scenarios:
- Free space: θ_d(t) = 5π(1 − 0.18t) sin(0.72πt)
- Contact and recovery: soft environment K_e = 500 N·mm⁻¹, B_e = 4 N·s·mm⁻¹; stiff environment K_e = 2800 N·mm⁻¹, B_e = 20 N·s·mm⁻¹
Validation results
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig. 7.2 Position tracking performances
Fig. 7.3 Position tracking error performances
Fig. 7.4 Transparency performance in the "soft" contact
Fig. 7.5 Transparency error in the soft contact
Fig. 7.6 Transparency performance in the "stiff" contact
Fig. 7.7 Transparency error in the stiff contact
Validation results
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
- Finite-time convergence of the tracking errors
- Eliminating the uncertainties and the noise effect in the obtained force sensing via the new AFOB
- Robust performance against uncertainties and disturbances in the system dynamics
- Stability of the overall system and the transparency performance attained simultaneously
- Fast transient response
Summary
The great transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and across different working conditions.
8 CONCLUSION AND FUTURE WORKS
Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions (its uncertainties and disturbances are handled).
- Validated the significant effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding a highly effective position control.
- Achieved good transparency with both force feedback and position tracking simultaneously by the new adaptive force estimation combined with the separate FNTSMC schemes.
20-May-21 | 52 Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Future works
- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both the master and slave sides; the computational complexity should also be reduced with respect to the data size.
- Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device for the rehabilitation field with a safer solution.
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering, University of Ulsan, Korea
Thank you for attending
1 2 3 4 5 6 7 8
20-May-21 | 2Presenter Cong Phat Vo
CONTENTS
Modeling and platform setup based on a APAM configuration
Adaptive Gain ITSMC with TDE scheme for the position control
Adaptive Finite-Time Force Sensorless Control scheme with TDE
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction control using IRL
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
2
7
3
4
5
6
C
H
A
P
T
E
R
1
7
8
20-May-21 | 3Presenter Cong Phat Vo
CONTENTS
Modeling and platform setup based on the APAM configuration
Adaptive Gain ITSMC with TDE scheme
Adaptive Finite-Time Force Sensorless Control scheme with TDE
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force using IRL
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
2
7
3
4
5
6
C
H
A
P
T
E
R 7
8
Introduction1
1 2 3 4 5 6 7 8
20-May-21 | 4Presenter Cong Phat Vo
hellip
Performence
Perception
Transparency
1 INTRODUCTION
Overview of the BHTS structure
Slave
control
Master
control
Co
mm
un
icatio
n c
ha
nn
el
Monitor
Operator
Master
Haptic interface
Slave
teleoperator
Remote
environment
Vision
sensor
hf
m
m
x
x
sdw
s
s
x
x
efmdwm
m
x
x
Figure 11 The bilateral haptic teleoperation architecture
1 2 3 4 5 6 7 8
RobustnessThe controllers are required to be robustly stable with respect to a
specified set of uncertainties introduced by the different components
(operator environment communication channel as well as sensors
introduce uncertainties)
20-May-21 | 5Presenter Cong Phat Vo
hellip
Performence
Perception
Transparency
1 INTRODUCTION
Overview of the BHTS structure
Slave
control
Master
control
Co
mm
un
icatio
n c
ha
nn
el
Monitor
Operator
Master
Haptic interface
Slave
teleoperator
Remote
environment
Vision
sensor
hf
m
m
x
x
sdw
s
s
x
x
efmdwm
m
x
x
Figure 11 The bilateral haptic teleoperation architecture
1 2 3 4 5 6 7 8
RobustnessThe main purpose is to provide technical means to successfully perform a
desired task in the environment (achieved a high task performance)
For evaluation physically accessible quantities (task completion time
measurement error applied forces or dissipated energies)
20-May-21 | 6Presenter Cong Phat Vo
hellip
Transparency
1 INTRODUCTION
Overview of the BHTS structure
Slave
control
Master
control
Co
mm
un
icatio
n c
ha
nn
el
Monitor
Operator
Master
Haptic interface
Slave
teleoperator
Remote
environment
Vision
sensor
hf
m
m
x
x
sdw
s
s
x
x
efmdwm
m
x
x
Figure 11 The bilateral haptic teleoperation architecture
1 2 3 4 5 6 7 8
Robustness
PerformenceIt refers to the operatorrsquos feeling of being there Ideally the operator can
distinguish a remote environment that presence is correlated with task
performance in a positive causal way It implies that by improving the feeling
of presence task performance is improved as well
Perception
20-May-21 | 7Presenter Cong Phat Vo
hellip
Performence
Perception
Transparency
1 INTRODUCTION
Overview of the BHTS structure
Slave
control
Master
control
Co
mm
un
icatio
n c
ha
nn
el
Monitor
Operator
Master
Haptic interface
Slave
teleoperator
Remote
environment
Vision
sensor
hf
m
m
x
x
sdw
s
s
x
x
efmdwm
m
x
x
Figure 11 The bilateral haptic teleoperation architecture
1 2 3 4 5 6 7 8
RobustnessCompared to perception it is a quantitative objective Transparency and
robustness are conflicting objectives (trade-off) A measure for
transparency is fidelity It describes the ability to exactly display the
environment to the operator
20-May-21 | 8Presenter Cong Phat Vo
1 INTRODUCTION
Overview of the BHTS structure
Slave
control
Master
control
Co
mm
un
icatio
n c
ha
nn
el
Monitor
Operator
Master
Haptic interface
Slave
teleoperator
Remote
environment
Vision
sensor
hf
m
m
x
x
sdw
s
s
x
x
efmdwm
m
x
x
Figure 11 The bilateral haptic teleoperation architecture
Variable Impedance Control [5 6]
Model-mediated Teleoperation [40]
Movement Estimation [78 79]
Passivity [27] Prediction [39] hellip
Robustness
Performence
Perception
Transparency
Environment condition
Operatorrsquos characteristics
Communication channel
1 2 3 4 5 6 7 8
classify
evaluate
achieve
20-May-21 | 9Presenter Cong Phat Vo
1 INTRODUCTION
PAM actuator
Fig 12 (a) Structure and (b) working principle of the PAM
KNEXO robot [22] Wearable Haptic Device [23]2-joint manipulator [16 17] Forcepsrsquo manipulators [20]
Rubber hose Fitting connectionLoose-weave
fibersUn-Pressured state
Pressured state
Stroke
(a) (b)
Rubber hoseLoose-weave
fibers
Fitting
connection
1 2 3 4 5 6 7 8
20-May-21 | 10Presenter Cong Phat Vo
1 INTRODUCTION
Research Objectives
Propose Position
Tracking Controller
An integral terminal-style SMC is designed with the combination of an adaptive gain and TDE technique
0102
Propose Model-free Control for
slave-environment interaction03
Propose Force Sensorless refleting
control for bilateral haptic teleoperation
Propose Finite-Time
Force Controller
An adaptive gain Fast ITSMC-TDE scheme is established with a friction-free disturbance technique in finite-time
Using IRL to optimize the prescribed impedance model parameters to assist minimum human effort
Propose optimal impedance for
human-master interaction
05
04Applying RL approximation method to learning the optimal contact force (minimal) online
The new AFOB is proposed to estimate the force signalsAchieving great transparency (position force feedback)
1 2 3 4 5 6 7 8
20-May-21 | 11Presenter Cong Phat Vo
CONTENTS
Adaptive Gain ITSMC with TDE scheme
Adaptive Finite-Time Force Sensorless Control scheme with TDE
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction control using IRL
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
3
4
5
6
C
H
A
P
T
E
R
1
7
8
Modeling and platform setup based on the APAM configuration2
1 2 3 4 5 6 7 8
20-May-21 | 12Presenter Cong Phat Vo
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Mathematical modeling
n turns
nπD
bL
1 t
urn
s
D
2 2
2
3
4
dV dV d b LF P P P
dL dL d n
minus= minus = minus = minus
The tensile force of each PAM [7]
Fig 21 A cross-sectional view of the simplified geometric model
The motion equation of the system
1 2Mx Bx F F+ = minus +
Parameter Description Value UnitPs Supply pressure 65 bar
B Viscous damping coefficient 50 Nsmm
M Mass of the plate 05 Kg
L0 Original length of the PAM 175 mm
n Number of turns of the thread 104
b Thread length 200 mm
Table 21 Parameters of the studied PAM
The motion equation of the system
0 0( ) ( )x f x x g x u d= + +
where f0(x) g0(x) are the nominal value of f(x)
g(x) d is a total of unknown disturbances
( )2 2 2
00 0
2 2
0 0
33 ( ) ( )
2
( ( ) ( )) ( ( ) ( ))
L x bP L x Bxf x g x
n M M n M
d f x x f x x g x g x u
+ minus = minus minus = = minus + minus +
1 2 3 4 5 6 7 8
20-May-21 | 13Presenter Cong Phat Vo
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Experimental modeling
The approximated PAM model based on its
pressure and contraction strain
1 0
1 0
1 0
( ) ( ) ( )
( )
( ) ( 1 2)
( )
k k k k k k k k k k
k k k k k
k k k k k
k k k k k
M b P k P f P
b P b P b
k P k P k k
f P f P f
+ + =
= +
= + = = +
Parameter Value Units
Spring
elements
kk1 11732 Nmbar
kk0 68 Nm
Damping
elements
bk1 962 Nmsbar
bk0 085 Nms
Force
elements
fk1 9122 Nbar
fk0 473 N
0-100
6
0
100
55
200
Fo
rce
[N
]
Contraction length
[mm]
300
400
4
Pressure [bar]
500
103 2 151
Fig 23 Experimental model for certain separate pressure
Table 22 List of APAM model parameters
Load
PAM 1 PAM 2Plate
Linear encoder
Proportional pressure
regulator valves
Card
PCI-1711
Card
PCI-QUAD04
Pressure signal
Control signal
Position signal
Pulley
Fig 22 Block diagram of the experimental system
1 2 3 4 5 6 7 8
20-May-21 | 14Presenter Cong Phat Vo
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Platform setup
Fig 25 The APAM-based (a) master and (b) slave subsystem configuration
Proportional
pressure
regulator
valves
Optical
encoder
Control stick
Loadcells
Pneumatic
Muscle
Actuators Interactive
Loadcell
Pulley
Front viewBack view(a)
Proportional valve
Loadcell DamperSpring
Environment
APAM system
PAMPulley
PositionSignal
ControlSignal
StiffnessDamping
Contact Force Signal
PCPCI Cards
Control box
(b)
1 2 3 4 5 6 7 8
20-May-21 | 15Presenter Cong Phat Vo
2 MODELING OF THE BILATERAL TELEOPERATION SYSTEM
Slave robot
Environment
Master robot
Control stick
Device Description
PAM
Festo MAS-10-200N-AA-MC-O
Nominal length 200 mm
Inside diameter 10 mm
Max pressure 8 bar
PPR ValveFesto VPPM-8L-L-1-G14-0L8H
Max pressure 8 bar
DAQ Card
ADVANTECH PCI-1711
AIAO 12 bits (resolution)
MEASUREMENT PCI-QUAD04
Linear
encoder
Type Rational WTB5-0500MM
Resolution 0005 mm
Table 23 Specifications of the exp devices
Fig 26 Photograph of the experimental apparatus of the overview BHTS
Platform setup
1 2 3 4 5 6 7 8
20-May-21 | 16Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
4
5
6
C
H
A
P
T
E
R
1
7
8
Adaptive Gain ITSMC with TDE scheme for the position control3
Modeling and platform setup based on the APAM configuration2
1 2 3 4 5 6 7 8
20-May-21 | 17Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Proposed control idea
3The control effectiveness is verified by experimental trials in
different challenging work conditions (compare several controllers)
2
The stability of the controlled system is theoretically analyzed
The first time the AITSMC-TDE scheme
is proposed for a PAM system1
1 2 3 4 5 6 7 8
20-May-21 | 18Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
2 1 1 2 1s e k e k e= + +
( ) ( ) (( ) ( ) )
0
1
0 1 2 2 1 2
0 1 2 0 1 1
t
t
n d
s t s t k e k e e
f x x g x u x d
minus= minus minus +
+ + minus
( ) ( )( ) ( ) 31
0 1 3 3 4ˆ
su t g x sign
minus = minus + +
( )( ) ( )
( )3 3 0
3
ˆ ˆ1ˆ or
( )mt t
tsign t
= +=
+ =
A terminal sliding surface is chosen
A proposed integral sliding surface is designed
An adaptive law for the robust gain
A robust control signal can be redesigned
Proof See more detail in Appendix 32 33
1 2d dx x
Time-delay
Estimator
The PAM system
Control law
Robust
control signal
Adaptation
law
Nominal
control signal
Transfer
Coordinate Integral Terminal
Sliding Surface
Terminal
Sliding Surface
Reference
Trajectory
d
The proposed
control approach
u
1 2x x
su
3ˆ
1 2x x
s
1
2
nu
Control design
1 2 3 4 5 6 7 8
20-May-21 | 19Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
( ) ( ) ( )( )( )( ) ( )
2 0 1 2
0 1
ˆ
d d d
d d
d x t t f x t t x t t
g x t t u t t
= minus minus minus minus
minus minus minus
( ) ( )1
1ˆ
n su t u u g x dminus
= + minus
The control law can be rewritten
where
Proof See more detail in Appendix 31 34-35
Control design
1 1 1 1
2 2 1
de x x
x
= = minus = minus
( )( ) ( )1 2
1
0 1 1 1 0 1 2
1
1 1 1 1 1 2 2 2 2
n du g x x f x x
minus
minus
= minus minus
minus minus minus minus
The auxiliary variables are defined
The nominal control signal can be designed as
1 2d dx x
Time-delay
Estimator
The PAM system
Control law
Robust
control signal
Adaptation
law
Nominal
control signal
Transfer
Coordinate Integral Terminal
Sliding Surface
Terminal
Sliding Surface
Reference
Trajectory
d
The proposed
control approach
u
1 2x x
su
3ˆ
1 2x x
s
1
2
nu
1 2 3 4 5 6 7 8
20-May-21 | 20Presenter Cong Phat Vo
Parameters
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Experimental validation
for ITSMC η1 = 3 η2 = 15 η3 = 25 κ1 = κ3 = 10
κ2 = κ4 = 2 γ1 = γ2 = 57 k1 = 2 k2 = 55 α = 911
for adaptative gain μ = 10 δ = 005 σ = 01
for case study 3 KP = 15 KI = 18 KD = 005
for other case KP = 35 KI = 28 KD = 01
Proposed control
PID control
Fig 31 Performance response in case study 1
Fig 32 Performance response in case study 2
Case study 1 (no load) xd (mm) = 8sin(04πt)
Case study 2 (no load) xd (mm) = 8sin(πt)
Case study 3 (no load) A multistep trajectory
Case study 4 (with load) xd (mm) = 8sin(πt)
in turn 2-kg-load and 7-kg-load
1 2 3 4 5 6 7 8
20-May-21 | 21Presenter Cong Phat Vo
Fig 33 Performance response in case study 3
Fig 34 Performance response in case study 4
Controller PID ITSMC Proposed
Sine 02 Hz ndash 8 mmL2[e] 04102 01737 01574
L2[u] 09961 10270 09602
Sine 1 Hz ndash 8 mmL2[e] 11845 04996 03788
L2[u] 11284 11152 10900
MultistepL2[e] 08587 06846 06368
L2[u] 07670 07333 07247
Sine 1 Hz ndash 8 mm
with 2-kg-load
L2[e] 12578 08314 04779
L2[u] 11414 11506 11287
Sine 1 Hz ndash 8 mm
with 7-kg-load
L2[e] 15937 10827 05067
L2[u] 11613 11729 11584
Table 31 A quantitative comparison between the tracking performances
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Case study 1 (no load) xd (mm) = 8sin(04πt)
Case study 2 (no load) xd (mm) = 8sin(πt)
Case study 3 (no load) A multistep trajectory
Case study 4 (with load) xd (mm) = 8sin(πt)
in turn 2-kg-load and 7-kg-load
Experimental validation
1 2 3 4 5 6 7 8
20-May-21 | 22Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Summary
Robust performance against uncertainties and disturbances1 2d dx x
Time-delay
Estimator
The PAM system
Control law
Robust
control signal
Adaptation
law
Nominal
control signal
Transfer
Coordinate Integral Terminal
Sliding Surface
Terminal
Sliding Surface
Reference
Trajectory
d
The proposed
control approach
u
1 2x x
su
3ˆ
1 2x x
s
1
2
nu
Finite-time convergence of the tracking errors
Reducing the chattering phenomenon
Eliminating the reaching phase
Fast transient response
Next works will be directed to the control using the force properties
1 2 3 4 5 6 7 8
20-May-21 | 23Presenter Cong Phat Vo
CONTENTS
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
5
6
C
H
A
P
T
E
R
1
7
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Adaptive Finite-Time Force Sensorless Control scheme4
1 2 3 4 5 6 7 8
20-May-21 | 24Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
3 Verified the reliability and superiority of the proposed control
demonstrated in different challenging work conditions
2
The FITSMC guarantees the finite convergence new adaptive gain and TDE
are deployed to handle online (uncertainties and chattering phenomenon)
The proposed FSOB scheme obtains
force information eliminated noise and
static friction
1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 25Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
( )[ ]
1 20
( ) ( ( )) = + +t
v
f f fe e sign e d
( ) ( )[ 1]5
5
4
ˆ
+
minus= minus
v
signsign
1 1 [ ]
1 2
[ ]
4 5
ˆ( ) ( ) ( )
ˆ ( )
v
c d f f
v
u t d t K e sign e
sign
minus minus= + + +
+ +
1ˆ ˆ ˆ( ) ( ) ( ) ( )minus minus = minus minus minusd d c e dd t d t t u t t K t t
The FITSM surface is defined
The control law can be rewritten
2
2
ˆ ˆ ˆ2ˆ
+ += e e e e e e
e
e
An adaptive switching gain law An estimated lumped disturbance Force observer based-LeD [40]
APAM system
Force based
Time-delay
Estimator
Integral Terminal
Sliding SurfaceControl law
Adaptive law
Environment
Configuration
Reference
Trajectoryu
d
ˆe
d
+
+
The proposed control
approach
2k
Plant + Environment
Force Observer
e
Proof See in Appendix 41 ndash 43
Control design
1 2 3 4 5 6 7 8
20-May-21 | 26Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
for ITSMC η1 = 3 η2 = 15 η3 = 25 κ1 = κ3 = 10
κ2 = κ4 = 2 γ1 = γ2 = 57 k1 = 2 k2 = 55 v = 911
for adaptative gain μ = 10 δ = 005 σ = 01
for case study 3 KP = 15 KI = 18 KD = 005
for other case KP = 35 KI = 28 KD = 01
Proposed control
PID control
Fig 42 Performance response in Scenario 2
Scenario 1 τd (N) = 75sin(02πt- π2)+625
Scenario 2 τd (N) = 75sin(08πt- π2)+625
Scenario 3 A multistep trajectory with a LPFrsquos bandwidth
of 02(rads)
Observersfor FSOB ω = 500 rads ξ = 0001 α1 = 20 α2 = 12
α3 = 5 α4 = 15
Fig 42 Performance response in Scenario 1
Parameters
Experimental validation
1 2 3 4 5 6 7 8
20-May-21 | 27Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Fig 43 Performance response in Scenario 3
Fig 44 Control gain
Controller PID ISMC FITSMC-TDE Proposed
Sine 01 Hz 21050 12152 09440 05158
Sine 04 Hz 32054 18045 20632 19807
Multistep 26508 23548 20618 17685
Table 41 A RMSE comparison between the tracking error performances
Experimental validation
- Improved the control performances (fast response high
accuracy and robustness)
- Guaranteed the finite convergence
- Cancelling the lumped uncertainties via TDE solution and
the switching gain in the FITSMC online
Discussion
1 2 3 4 5 6 7 8
20-May-21 | 28Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
4
6
C
H
A
P
T
E
R
1
7
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Model-free control for Optimal contact force in Unknown Environment5
1 2 3 4 5 6 7 8
20-May-21 | 29Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
3 Verified effective for robot interaction control in
unknown environments
2 The stability of the closed-loop control is theoretically demonstrated
by the Lyapunov approach
The RL method is used to learn on-line
the optimal desired force1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 30Presenter Cong Phat Vo
( ) ( ) ( ( ))= minus + minuse e e eF t C x t K x x t
( ( )) ( )r dZ e t F t= [ ]
1 2
( ) ( ) ( )
( ) ( )
= + +
minus minus minus +
r r r
v
e
Me t Ce t Ke t
sign F t
The force control
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The environment is modeled as
a spring-damper system [81]
The optimal impedance model The position control in task space
( ) ( ) ( )
= + f Pf f If f
t
x s K e t K e d
Task-specific Outer-loopRobot-specific Inner-loop
Precise position
control
Prescribed
Impedance Model
Environment
Configuration
APAM system
xx
d
Force controlx
rx
fe
r
Fe
Fd
ef
-
+ -+ -
+
RL-based
approximation
ϕ
Control design
1 2 3 4 5 6 7 8
( ) ( ) ( )r r de t Ae t BF t= +
20-May-21 | 31Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
LQR solution
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
( ) ( )=d e rF t e t
The discounted cost function
( ) ( )( ) ( ) ( ) ( ) ( ) ( )
= minus +T T
s r d r r d dt
J e t F t e Se F RF d
The Algebraic Riccati equation
0+ + + =T T
eA PA S P A PB
The feedback control gain
( )1
1 minus
minus= minus +T T
e B PB R B PA
IRL approximationRL solution
Implementation procedure
1 2 3 4 5 6 7 8
offline soliton
20-May-21 | 32Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
LQR solution RL solution
impossible for all time
The Hamilton-Jacobi-Bellman equation
( ) ( ) ( )min ( ) ( ) ( ) 0
minus minus + minus =
d
t
s rtF
V e t r t r e dt
The new Algebraic Riccati equation
( )( ) ( )
2 ( ) ( ) ( ) ( ) 0
T T
r r
T T
r d d d
e t A P PA S P e t
e t PBF t F t RF t
+ + minus
+ + =
The optimal desired force 1
22 21( ) ( )d rF t H H e tminus= minus
( ) ( ) ( ) ( ) ( )= = T
r d s rQ e t F t V e t X HX
IRL approximation
The discounted cost function
( ) ( )
( )
( )( ) ( ) ( )
where ( ) ( ) ( ) ( ) ( ) ( )
t
s r r dt
T T
r d r r d d
V e t r e F e d
r e F e Se F RF
minus minus=
= +
Implementation procedure
1 2 3 4 5 6 7 8
( ) ( )=d e rF t e t
online soliton using the off-policy RL to
find the solution of the given new ARE
20-May-21 | 33Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
LQR solution RL solution
minimize temporal difference error
The approximator of the Q-value function
( )( ) ( )1
ˆ N
r d i r d i
i
Q e F e F =
=
The parameter update law
11( ) exp
2
T
i r d r i d i i r i d ie F e c F c e c F c minus = minus minus minus minus minus
( )
( ) ( ) ( ) ( )
ˆ ( ) ( ) ( )1( ) ( )
( )
=
= minus +
r d
t t t t
Q e t F t tt t
t t
IRL approximation
Q-function update law based on IRL
( )
( )
( )( ) ( ) ( )
( ) ( )
t tt
r dt
t
r d
Q e t F t r e d
e Q e t t F t t
+
minus minus
minus
=
+ + +
( ) ( ) ( ) ( ) ( )T T
r r d dr t e t Se t F t RF t= +
The reward function of the utility function
Implementation procedure
1 2 3 4 5 6 7 8
1
( )( )
( )
i r di r d N
i r di
e Fe F
e F
=
=
( ) ( )=d e rF t e t
20-May-21 | 34Presenter Cong Phat Vo
Fig 51 RL in unknown environments during the
learning process
Fig 52 Learning curves
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
κe = 00025 S = 1 R = 10 T = 0001 A = minus4006
B = minus0002
υ = 09 γ = 09 δΔ = 065 ε = 01 ε-decay = 099
K = 10 β = 01
LQR solution
RL aproximation
PositionForce controlKPf = 0001 KIf = 002 λ = 02 ηΔ = 01 ρ1 = 05
ρ2 = 20 v = 911 α = 57
EnvironmentCe = 05 Nsmm-1 Ke = 2 N mm-1 r0 = 165 mm
xd = 39917 mm xe = 35 mm
Parameters
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 35Presenter Cong Phat Vo
Fig 53 Desired forceposition tracking response
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Three steps are required to complete
the experiment
1-the identification of the environment
2-obtain the ARE solution using the
environmental estimation
3-the complete experiment for the
positionforce controller
IRL approximation
Achieving near-optimal performance without
knowledge of the environment dynamics
LQR solution
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 36Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
4
5
C
H
A
P
T
E
R
1
7
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Optimized Human-Robot Interaction force control using IRL6
1 2 3 4 5 6 7 8
20-May-21 | 37Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
3 The unknown human dynamics as well as the optimal overall
human-robot system efficiency is verified in experimental trials
2
Find the optimal parameters of the impedance model to
assure motion tracking and also assist the human with
minimum effort to perform the task in real time
The stability of the closed-loop control is theoretically
demonstrated
1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 38Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Task-specific Outer-loopRobot-specific Inner-loop
( )1 20
ˆ ( )
ˆ (|| || )
= + + +
+ + minus
t
s h h h
q B h h
K e e e d
K Z Z K
The control torque in task space
1 2
1 2 1
2 1 2
ˆ ˆˆ ( ) ( )
ˆ ˆˆ ˆ ˆ || ||
ˆˆ ˆ ˆ( ) || ||
=
= minus minus = minus
T T
T T T
h h h
T
h h
q W W q
W F F W q kF W
W Gq W kG W
2
h hm
d d d
K
M s C s K
=
+ +
An approximation of the continuous functionThe prescribed
impedance model
Prescribed
Impedance Model
Haptic interface
based-APAM
Torque
control lawHuman
Operator
Optimal Law-based
IRL
Kh+-
NN functional
approximation
1
s
θ +
-
τ θ d θ m eh
τ h
ξ
ed
Control design
1 2 3 4 5 6 7 8
20-May-21 | 39Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Task-specific Outer-loopRobot-specific Inner-loop
1
1
[ ]
minus
minus
= +
= minus
=
=
h h h h d
h e
h e p d
T T T
d d d
A B e
A k
B k
e e e
( )1
+=
+
p d
e
sG s
k s
The human transfer function [99] Purpose τh rarr ed
0
0
= + = +
= minus
q d qd e
e
h h hh e
A e Be X AX Buu
B A u KX
A new augmented state
Prescribed
Impedance Model
Haptic interface
based-APAM
Torque
control lawHuman
Operator
Optimal Law-based
IRL
Kh+-
NN functional
approximation
1
s
θ +
-
τ θ d θ m eh
τ h
ξ
ed
IRL based on LQR solution
Control design
1 2 3 4 5 6 7 8
20-May-21 | 40Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
The performance index
( )T T T
m d d d h h h e et
J e Q e Q u Ru d
= + +
10 T TA P PA Q PB R BPminus= + + minus
The algebraic Riccati equation [83]
0
0
q d qd
e
h h hh
e
A e BeX u
B A
u KX
= = +
= minus
A new augmented state
IRL solutionRL solution
An optimal feeback control
To minimize
solve
The infinite-horizon quadratic cost
Control design
The IRL Bellman equation for time interval ΔT gt 0
Proof of convergence See in Appendix 62
1 2 3 4 5 6 7 8
( ) 1
( )
arg min ( ) ( )e
T
e eu t
u V t X t u R B PX
minus
= = minus
( )7 7( ( )) ( ) ( ) ( ( ))t T
T T T
tV X t X Q K RK X d V X t T
+
= + minus +
( )7 ( ( )) ( ) ( )T T T T
tV X t X Q K RK X d X PX
= + =
( ) ( ) ( )T TA BK P P A BK Q K RKminus + minus = minus +
where P is the real symmetric positive definite solution
( )( ) ( ) = ( ) ( )
( ) ( )
t tT T T T
t
T
X t PX t X Q K RK X d
X t t PX t t
+
+
+ + +
The IRL Bellman equation can be written as
20-May-21 | 41Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 61 Continuous-time linear policy iteration algorithm
1 Start with an admissible control input
0 0
1= minuseu K X
2 Policy evaluation given a control policy ui find the
Pi using off-policy Bellman equation
1 1
1
+ minus= minusi T i
eu R B P X
3 Policy Improvement update the control policy
Online implementation
Control design
1 2 3 4 5 6 7 8
( )( )( ) ( )= ( ) ( )
( ) ( )
t t TT i T i i T
t
T i
X t P X t X Q K RK X d
X t t P X t t
+
+
+ + +
Begin
1 2
1 2
1
( ) ( )
( ) ( ) ( )
( )
N i i i
Ti i N i
i T
t t t
K K K
p
minus
= = minus +
=
=
Solving for the cost using least squares
1|| ||i ip p minusminus
0
1 10 1 ( ) 0P i K A BK= = minus
Initialization
1 1i T i
eu R B P Xminus minus= minus
Policy update
End
No
Yes
1i irarr +
20-May-21 | 42Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 62 The tracking error of the trajectory of the robot system and
the prescribed impedance model
Fig 63 Reference trajectory versus the state of the robot systems
during and after learning
κe = 00025 S = 1 R = 10 T = 0001
A = minus4006 B = minus0002
υ = 09 γ = 09 δΔ = 065 ε = 01
ε-decay = 099 K = 10 β = 01
LQR solution
RL aproximation
Position controlKs = 20 F = 10 G = 1 μ1 = μ2 = 5 k = 01
= 100 κp = 20 κd = 10 and ke = 018 s
θd = 10sin(02πt) Md = 1 Kg Bd = 718
Nsmm-1 Kd =1576 Nmm-1 Kh = 0046
Parameters
Z
Simulation
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 43Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 64 The output trajectory and impedance model in the inner loop
Fig 65 The desired trajectory and output trajectory the outer loop
Fig 66 Interaction force
Perfectly performance of the human-
robot cooperation
A little deviation between the desired
and actual trajectory
The human interaction force is reduced
after learning and the optimal set of the
prescribed impedance model is found by
the outer-loop controller
Experiment
Validation results
Discussion
1 2 3 4 5 6 7 8
20-May-21 | 44Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Conclusion and Future works
Introduction
7
4
5
6
C
H
A
P
T
E
R
1
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Force Sensorless Reflecting Control for BHTS7
1 2 3 4 5 6 7 8
20-May-21 | 45Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
3 The effectiveness of the proposed control method is verified in
the different working conditions
2The stability and finite-time convergence characteristic of the closed-
loop control are theoretically analyzed by the Lyapunov approach
The first time the hybrid control based
on a fast finite-time NTSMC and AFOB1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 46Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
( )1 2id = minus +
1 1
2
= minus + = minus +
id
T
i i id i
T
i ia = minus
The external torque estimation
[ ]
1 2ˆ( ) = + + minus minus minus +iv
i i ir i i i i i i i i i i idM C K sign s s s
( )i i i is = +
[ ]
2
1 2
if 0 or 0 and ( )
( ) if 0 and
i
i i i i i i
i i
i i i i i i i i
s s
z z sign s
= =
+
An adaptive FSOB gain
where
where
The torque control design by style-FNTSMC
The modified FNTSM surface design
Master Slave
Environment
APAM Slave
system
Slave
Controller
Estimator
Oper
ator
Master
Controller
APAM Master
system
Estimator
Impendence
filter
+
++
-
+
-
+
+md
m s
s
sd
ˆsdˆmd
mf
m
d
Control design
1 2 3 4 5 6 7 8
20-May-21 | 47Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 71 Torque estimation performances
λi = 8 pi = 5 qi = 7 ςi = 10-4 νi = 911
εi = 01 ρ1i = 10 ρ2i = 2
Proposed control
PID control KP = 38 KI = 25 KD = 015
Observersfor NDO Lh = Le = 10 for RTOB β = 350 rads
for AFOB κi(0) = 01 λi = 10 ηi = 0 m = 01
a = 10
RMSEs associated with NDO RTOB and proposed algorithm
are calculated of 0211 0192 and 0076 respectively
Parameters
Scenarios- The free space situation θd (t) = 5π(1 - 018t)sin(072πt)
- The contact and recovery situation soft environment Ke = 500 Nmm-1 Be = 4 Nsmm-1
stiff environment Ke = 2800 Nmm-1 Be = 20 Nsmm-1
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 48Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 72 Position tracking performances
Fig 72 Position tracking error performances
Fig 72 Transparency performance in ldquosoftrdquo
Fig 72 Transparency error in soft contract
Fig 72 Transparency performance in ldquostiffrdquo
Fig 72 Transparency error in stiff contract
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 49Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Summary
- Finite-time convergence of the tracking errors
- Fast transient response
- Robust performance against uncertainties and disturbances in the system dynamics
- Elimination of the uncertainties and noise effects in the obtained force sensing via the new AFOB
- Stability of the overall system and transparency performance are attained simultaneously
High transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and in different working conditions.
8 CONCLUSION AND FUTURE WORKS
Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions; its uncertainties and disturbances are handled.
- Validated the effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures accurate position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding highly effective position control.
- Achieved good transparency, with force feedback and position tracking simultaneously, by the new adaptive force estimation combined with separate FNTSMC schemes.
8 CONCLUSION AND FUTURE WORKS
Future works
- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both master and slave sides; the computational complexity should also be reduced with respect to the data size.
- Based on the prototype developed in this thesis, a high-quality commercial teleoperation device for the rehabilitation field can be developed with a safer solution.
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
1 INTRODUCTION
Overview of the BHTS structure
Figure 1.1 The bilateral haptic teleoperation architecture (operator and master haptic interface, communication channel, slave teleoperator in the remote environment, with monitor and vision sensor)
Performance: the main purpose is to provide the technical means to successfully perform a desired task in the environment (achieve high task performance). For evaluation, physically accessible quantities are used: task completion time, measurement error, applied forces, or dissipated energies.
1 INTRODUCTION
Overview of the BHTS structure
Figure 1.1 The bilateral haptic teleoperation architecture
Perception: it refers to the operator's feeling of "being there". Ideally, the operator perceives the remote environment as if present in it; presence is correlated with task performance in a positive causal way, implying that by improving the feeling of presence, task performance is improved as well.
1 INTRODUCTION
Overview of the BHTS structure
Figure 1.1 The bilateral haptic teleoperation architecture
Transparency: compared to perception, it is a quantitative objective. Transparency and robustness are conflicting objectives (a trade-off). A measure for transparency is fidelity, which describes the ability to display the environment to the operator exactly.
1 INTRODUCTION
Overview of the BHTS structure
Figure 1.1 The bilateral haptic teleoperation architecture
Classified by: environment condition, operator's characteristics, communication channel.
Evaluated by: robustness, performance, perception, transparency.
Achieved by: Variable Impedance Control [5, 6]; Model-mediated Teleoperation [40]; Movement Estimation [78, 79]; Passivity [27]; Prediction [39]; …
1 INTRODUCTION
PAM actuator
Fig 1.2 (a) Structure and (b) working principle of the PAM: a rubber hose covered by loose-weave fibers with fitting connections; moving from the un-pressured to the pressured state produces a contraction stroke.
Example applications: 2-joint manipulator [16, 17], forceps manipulators [20], KNEXO robot [22], wearable haptic device [23].
1 INTRODUCTION
Research Objectives
01 Propose a position tracking controller: an integral terminal-style SMC is designed with the combination of an adaptive gain and the TDE technique.
02 Propose a finite-time force controller: an adaptive-gain fast ITSMC-TDE scheme is established with a friction-free disturbance technique in finite time.
03 Propose model-free control for slave-environment interaction: an RL approximation method learns the optimal (minimal) contact force online.
04 Propose an optimal impedance for human-master interaction: IRL optimizes the prescribed impedance model parameters to assist with minimum human effort.
05 Propose force sensorless reflecting control for bilateral haptic teleoperation: a new AFOB estimates the force signals, achieving high transparency (position and force feedback).
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Mathematical modeling
Fig 2.1 A cross-sectional view of the simplified geometric model (a thread of length b wound in n turns, nπD, around a hose of diameter D and length L)
The tensile force of each PAM [7]:
    F = −P·dV/dL = P(3L² − b²)/(4πn²)
The motion equation of the antagonistic pair:
    Mẍ + Bẋ = −F1 + F2
Table 2.1 Parameters of the studied PAM
    Ps  Supply pressure                6.5  bar
    B   Viscous damping coefficient    50   N·s/mm
    M   Mass of the plate              0.5  kg
    L0  Original length of the PAM     175  mm
    n   Number of turns of the thread  104
    b   Thread length                  200  mm
The motion equation of the system in state-space form:
    ẋ2 = f0(x) + g0(x)u + d
where f0(x), g0(x) are the nominal values of f(x), g(x), d is the total of unknown disturbances, and
    f0(x) = 3P0·L0·x1/(πn²M) − B·x2/M
    g0(x) = −(3(L0² + x1²) − b²)/(2πn²M)
    d = (f(x1, x2) − f0(x1, x2)) + (g(x1) − g0(x1))u + d0
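The tensile-force formula and the antagonistic balance can be checked numerically. The sketch below evaluates F = P(3L² − b²)/(4πn²) and the net force of the pair; the values are dimensionless and illustrative, not the identified parameters.

```python
import math

def pam_force(P, L, b, n):
    """Tensile force of one PAM: F = P * (3L^2 - b^2) / (4*pi*n^2)."""
    return P * (3 * L**2 - b**2) / (4 * math.pi * n**2)

def net_force(P0, u, x, L0, b, n):
    """Antagonistic pair: PAM 1 at L0 - x with P0 + u, PAM 2 at L0 + x with P0 - u."""
    return -pam_force(P0 + u, L0 - x, b, n) + pam_force(P0 - u, L0 + x, b, n)

# At x = 0 and u = 0 the pair is balanced:
print(net_force(P0=1.0, u=0.0, x=0.0, L0=1.75, b=2.0, n=1.0))  # 0.0
```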
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Experimental modeling
Fig 2.2 Block diagram of the experimental system: PAM 1 and PAM 2 act on a plate and pulley with a load; proportional pressure regulator valves receive the control signal via a PCI-1711 card, and a linear encoder returns the position signal via a PCI-QUAD04 card.
The approximated PAM model, based on its pressure and contraction strain:
    Mẍk + bk(Pk)·ẋk + kk(Pk)·xk = fk(Pk),  k = 1, 2
where
    bk(Pk) = bk1·Pk + bk0
    kk(Pk) = kk1·Pk + kk0
    fk(Pk) = fk1·Pk + fk0
Table 2.2 List of APAM model parameters
    Spring elements:  kk1 = 117.32 N/m/bar, kk0 = 6.8 N/m
    Damping elements: bk1 = 9.62 N·s/m/bar, bk0 = 0.85 N·s/m
    Force elements:   fk1 = 91.22 N/bar, fk0 = 4.73 N
Fig 2.3 Experimental model for certain separate pressures: force [N] versus contraction length [mm] and pressure [bar]
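The pressure-dependent model can be evaluated directly. The coefficient defaults below are the Table 2.2 values with decimal points restored, so treat them as assumptions for illustration only.

```python
def affine(P, c1, c0):
    """Pressure-dependent coefficient c(P) = c1 * P + c0."""
    return c1 * P + c0

def static_contraction(P, k1=117.32, k0=6.8, f1=91.22, f0=4.73):
    """Steady state of M*x'' + b(P)*x' + k(P)*x = f(P): x_ss = f(P) / k(P)."""
    return affine(P, f1, f0) / affine(P, k1, k0)

print(round(static_contraction(3.0), 4))  # static deflection at 3 bar
```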
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Platform setup
Fig 2.5 The APAM-based (a) master and (b) slave subsystem configurations: (a) a control stick on a pulley driven by pneumatic muscle actuators, with proportional pressure regulator valves, an optical encoder, loadcells, and an interactive loadcell (front and back views); (b) an APAM system with PAM and pulley, proportional valve, loadcell, and a spring-damper environment with adjustable stiffness/damping, connected to the PC/PCI cards and control box via position, control, and contact-force signals.
2 MODELING OF THE BILATERAL TELEOPERATION SYSTEM
Platform setup
Fig 2.6 Photograph of the experimental apparatus of the overall BHTS: master robot with control stick, slave robot, and environment
Table 2.3 Specifications of the experimental devices
    PAM: Festo MAS-10-200N-AA-MC-O; nominal length 200 mm; inside diameter 10 mm; max pressure 8 bar
    PPR valve: Festo VPPM-8L-L-1-G14-0L8H; max pressure 8 bar
    DAQ cards: ADVANTECH PCI-1711 (AI/AO, 12-bit resolution); MEASUREMENT PCI-QUAD04
    Linear encoder: type Rational WTB5-0500MM; resolution 0.005 mm
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Proposed control idea
1. For the first time, the AITSMC-TDE scheme is proposed for a PAM system.
2. The stability of the controlled system is theoretically analyzed.
3. The control effectiveness is verified by experimental trials in different challenging working conditions (comparison with several controllers).
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Control design
A terminal sliding surface is chosen:
    s2 = e2 + k1·e1 + k2·sig^(γ1)(e1)
A proposed integral sliding surface is designed on top of s2, which eliminates the reaching phase:
    s(t) = s2(t) − s2(t0) − ∫[t0,t] (ẋ2d − f0(x1, x2) − g0(x1)·un − k1·e2 − k2·(d/dτ)sig^(γ1)(e1)) dτ
A robust control signal is redesigned with the adaptive robust gain η̂3:
    us(t) = −g0(x1)^-1·η̂3·sign(s)
An adaptive law drives η̂3 at rate μ while ‖s‖ lies outside the dead-zone σ, and holds it at its lower bound otherwise.
Proof: see Appendix 3.2, 3.3.
[Block diagram of the proposed control approach: reference trajectory xd → transfer coordinate → terminal and integral terminal sliding surfaces → nominal control signal un plus robust control signal us with adaptation law → control law u → the PAM system; a time-delay estimator feeds the disturbance estimate back.]
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Control design
The auxiliary error variables are defined as
    e1 = x1 − x1d,  e2 = ė1 = x2 − ẋ1d
The nominal control signal can be designed as
    un = g0(x1)^-1·(ẋ2d − f0(x1, x2) − k1·e2 − k2·γ1·|e1|^(γ1−1)·e2)
The lumped disturbance is estimated by time-delay estimation (TDE) with delay td:
    d̂(t) = ẋ2(t − td) − f0(x1(t − td), x2(t − td)) − g0(x1(t − td))·u(t − td)
The control law can then be rewritten as
    u(t) = un + us − g0(x1)^-1·d̂
Proof: see Appendix 3.1, 3.4–3.5.
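The TDE step above can be sketched in one loop: the lumped disturbance is recovered from the one-step-delayed acceleration, nominal dynamics, and input. The dynamics and the constant disturbance below are illustrative; in practice the acceleration is obtained by filtered numerical differentiation.

```python
# TDE sketch: d_hat(t) = x2_dot(t - td) - f0(t - td) - g0 * u(t - td)
dt = 1e-3
x2, u, d_true = 0.0, 0.3, 1.5        # state, input, unknown disturbance
g0 = 1.0
prev = None                          # delayed (accel, f0, u) sample
d_hat = 0.0

for _ in range(100):
    f0 = -2.0 * x2                   # nominal dynamics
    accel = f0 + g0 * u + d_true     # true x2_dot
    if prev is not None:
        a_d, f0_d, u_d = prev
        d_hat = a_d - f0_d - g0 * u_d  # one-step-delayed reconstruction
    prev = (accel, f0, u)
    x2 += dt * accel

print(d_hat)  # recovers d_true = 1.5
```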
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Experimental validation
Parameters
- Proposed control: for the ITSMC η1 = 3, η2 = 15, η3 = 25, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1 = γ2 = 5/7, k1 = 2, k2 = 55, α = 9/11; for the adaptive gain μ = 10, δ = 0.05, σ = 0.1
- PID control: for case study 3 KP = 15, KI = 18, KD = 0.05; for the other cases KP = 35, KI = 28, KD = 0.1
Case studies
- Case study 1 (no load): xd (mm) = 8sin(0.4πt)
- Case study 2 (no load): xd (mm) = 8sin(πt)
- Case study 3 (no load): a multistep trajectory
- Case study 4 (with load): xd (mm) = 8sin(πt), in turn with a 2-kg load and a 7-kg load
Fig 3.1 Performance response in case study 1
Fig 3.2 Performance response in case study 2
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Experimental validation
Fig 3.3 Performance response in case study 3
Fig 3.4 Performance response in case study 4
Table 3.1 A quantitative comparison between the tracking performances
    Trajectory                       Index  PID     ITSMC   Proposed
    Sine 0.2 Hz – 8 mm               L2[e]  0.4102  0.1737  0.1574
                                     L2[u]  0.9961  1.0270  0.9602
    Sine 1 Hz – 8 mm                 L2[e]  1.1845  0.4996  0.3788
                                     L2[u]  1.1284  1.1152  1.0900
    Multistep                        L2[e]  0.8587  0.6846  0.6368
                                     L2[u]  0.7670  0.7333  0.7247
    Sine 1 Hz – 8 mm, 2-kg load      L2[e]  1.2578  0.8314  0.4779
                                     L2[u]  1.1414  1.1506  1.1287
    Sine 1 Hz – 8 mm, 7-kg load      L2[e]  1.5937  1.0827  0.5067
                                     L2[u]  1.1613  1.1729  1.1584
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Summary
- Robust performance against uncertainties and disturbances
- Finite-time convergence of the tracking errors
- Reduction of the chattering phenomenon
- Elimination of the reaching phase
- Fast transient response
Next works will be directed to control using the force properties.
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Proposed control idea
1. The proposed FSOB scheme obtains the force information with noise and static friction eliminated.
2. The FITSMC guarantees finite-time convergence; a new adaptive gain and TDE are deployed to handle the uncertainties and the chattering phenomenon online.
3. The reliability and superiority of the proposed control are demonstrated in different challenging working conditions.
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Control design
The FITSM surface is defined on the force tracking error ef:
    σ(t) = ef(t) + ∫[0,t] (λ1·ef(τ) + λ2·sig^(v)(ef(τ))) dτ
The control law combines the TDE term, the surface feedback, and an adaptive switching gain:
    uc(t) = d̂d(t) + Kf·ef + λ1·ef + λ2·sig^(v)(ef) + η4·σ + η̂5·sign(σ)
The estimated lumped disturbance d̂d(t) is obtained by TDE from the one-sample-delayed control input and the measured force rate (force observer based on LeD [40]); the adaptive switching gain η̂5 grows with the surface magnitude and decays near the surface, avoiding gain overestimation.
Proof: see Appendix 4.1–4.3.
[Block diagram: reference trajectory and environment configuration → integral terminal sliding surface → control law with adaptive law → APAM system; a force-based time-delay estimator and the force observer close the loop.]
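The fractional-power terms sig^v(e) = |e|^v·sign(e) used throughout the surface and control law (v = 9/11 in the parameter list) can be written as a small helper:

```python
def sig(x, v):
    """Fractional-power signum term |x|**v * sign(x), continuous at 0."""
    if x == 0:
        return 0.0
    return abs(x) ** v * (1 if x > 0 else -1)

print(sig(4.0, 0.5))   # 2.0
print(sig(-4.0, 0.5))  # -2.0
```

For 0 < v < 1 the term gives stronger correction than a linear one near the origin while staying continuous, which is what yields the finite-time convergence.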
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Experimental validation
Parameters
- Proposed control: for the ITSMC η1 = 3, η2 = 15, η3 = 25, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1 = γ2 = 5/7, k1 = 2, k2 = 55, v = 9/11; for the adaptive gain μ = 10, δ = 0.05, σ = 0.1
- PID control: for scenario 3 KP = 15, KI = 18, KD = 0.05; for the other scenarios KP = 35, KI = 28, KD = 0.1
- Observers: for the FSOB ω = 500 rad/s, ξ = 0.001, α1 = 20, α2 = 12, α3 = 5, α4 = 15
Scenarios
- Scenario 1: τd (N) = 7.5sin(0.2πt − π/2) + 6.25
- Scenario 2: τd (N) = 7.5sin(0.8πt − π/2) + 6.25
- Scenario 3: a multistep trajectory with an LPF bandwidth of 0.2 rad/s
Fig 4.1 Performance response in Scenario 1
Fig 4.2 Performance response in Scenario 2
20-May-21 | 27Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Fig 43 Performance response in Scenario 3
Fig 44 Control gain
Controller PID ISMC FITSMC-TDE Proposed
Sine 01 Hz 21050 12152 09440 05158
Sine 04 Hz 32054 18045 20632 19807
Multistep 26508 23548 20618 17685
Table 41 A RMSE comparison between the tracking error performances
Experimental validation
- Improved the control performances (fast response high
accuracy and robustness)
- Guaranteed the finite convergence
- Cancelling the lumped uncertainties via TDE solution and
the switching gain in the FITSMC online
Discussion
1 2 3 4 5 6 7 8
20-May-21 | 28Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Proposed control idea
1. The RL method is used to learn the optimal desired force online.
2. The stability of the closed-loop control is theoretically demonstrated by the Lyapunov approach.
3. The approach is verified to be effective for robot interaction control in unknown environments.
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Control design
The environment is modeled as a spring-damper system [81]:
    Fe(t) = −(Ce·ẋ(t) + Ke·(x(t) − xe))
The optimal impedance model shapes the reference error er so that Zr(er(t)) = Fd(t):
    M·ër(t) + C·ėr(t) + K·er(t) = −λ1·σ − λ2·sig^(v)(σ) + Fe(t)
and the reference error dynamics used for learning are
    ėr(t) = A·er(t) + B·Fd(t)
The position control in task space is a PI force compensator (the task-specific outer loop on top of the robot-specific inner loop for precise position control):
    xf(s) = KPf·ef(t) + KIf·∫[0,t] ef(τ) dτ
[Block diagram: the prescribed impedance model and the RL-based approximation form the outer loop; the force control and precise position control of the APAM system against the environment configuration form the inner loop; signals xd, xr, x, ef, er, Fe, Fd.]
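The spring-damper environment can be sketched as a one-sided contact: force appears only when the tool reaches the surface. The Ke, Ce, and contact-location defaults follow this chapter's parameter list but are treated here as illustrative.

```python
def env_force(x, x_dot, K_e=2.0, C_e=0.5, x_e=35.0):
    """Spring-damper contact: F_e = K_e*(x - x_e) + C_e*x_dot when x >= x_e, else 0."""
    if x < x_e:
        return 0.0  # free space, no contact
    return K_e * (x - x_e) + C_e * x_dot

print(env_force(34.0, 0.0))  # 0.0 (free space)
print(env_force(36.0, 0.0))  # 2.0 (1 mm penetration)
```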
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Implementation procedure: LQR solution (offline) → RL solution → IRL approximation
The main objective of this work is to determine the optimal desired force Fd, i.e., the minimum force that minimizes the position error er:
    Fd(t) = −κe·er(t)
The discounted cost function:
    Js(er(t), Fd(t)) = ∫[t,∞) e^(−γ(τ−t))·(erᵀ·S·er + Fdᵀ·R·Fd) dτ
The algebraic Riccati equation:
    AᵀPA + S − P − AᵀPB(BᵀPB + R)^-1·BᵀPA = 0
The feedback control gain:
    κe = (BᵀPB + R)^-1·BᵀPA
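For a scalar system the Riccati equation above can be solved offline by simple fixed-point (value) iteration. The (a, b, s, r) values here are illustrative, not the identified environment model.

```python
# Scalar discrete-time LQR: P = a*P*a + s - (a*P*b)**2 / (b*P*b + r)
a, b, s, r = 0.9, 0.5, 1.0, 1.0
P = 0.0
for _ in range(500):  # value iteration on the Riccati map
    P = a * P * a + s - (a * P * b) ** 2 / (b * P * b + r)

kappa = b * P * a / (b * P * b + r)  # feedback gain, F_d = -kappa * e_r
print(round(P, 4), round(kappa, 4))
```

The closed loop a − b·kappa is stable (magnitude below one), which is what the RL/IRL steps recover without ever forming a and b explicitly.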
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Implementation procedure: LQR solution → RL solution → IRL approximation
The LQR solution is impossible for all time, so the RL solution works with the discounted value function
    Vs(er(t)) = ∫[t,∞) e^(−γ(τ−t))·r(er(τ), Fd(τ)) dτ,  where r(er, Fd) = erᵀ·S·er + Fdᵀ·R·Fd
The Hamilton-Jacobi-Bellman equation:
    min over Fd of ( V̇s(er(t)) − γ·Vs + r(t) ) = 0
The new algebraic Riccati equation:
    erᵀ(t)·(AᵀP + PA + S − γP)·er(t) + 2erᵀ(t)·PB·Fd(t) + Fdᵀ(t)·R·Fd(t) = 0
The value is written as a quadratic Q-function in X = [erᵀ, Fdᵀ]ᵀ:
    Q(er(t), Fd(t)) = Vs(er(t)) = Xᵀ·H·X
The optimal desired force:
    Fd(t) = −H22^-1·H21·er(t)
This is an online solution, using off-policy RL to find the solution of the given new ARE.
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Implementation procedure: LQR solution → RL solution → IRL approximation
The RL solution minimizes the temporal-difference error. The Q-value function is approximated with N normalized Gaussian basis functions:
    Q̂(er, Fd) = Σ[i=1,N] θi·ϕi(er, Fd)
    ψi(er, Fd) = exp(−(1/2)·‖(er − ci,r, Fd − ci,d)‖²·Λ^-1),  ϕi = ψi / Σ[j=1,N] ψj
The reward is the utility function
    r(t) = erᵀ(t)·S·er(t) + Fdᵀ(t)·R·Fd(t)
The Q-function update law based on IRL over an interval Δt:
    Q(er(t), Fd(t)) = ∫[t,t+Δt] e^(−γ(τ−t))·r(er(τ), Fd(τ)) dτ + e^(−γΔt)·Q(er(t+Δt), Fd(t+Δt))
The parameters θ are updated by normalized gradient descent on the temporal-difference error between the two sides of this equation.
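For sampled data the interval Bellman equation above reduces to a discounted-integral TD target. This sketch uses a left Riemann sum for the integral and illustrative numbers.

```python
import math

def irl_td_target(rewards, dt, gamma, q_next):
    """Target for Q(t): integral over [t, t+dT] of e^(-gamma*(tau-t)) * r, plus e^(-gamma*dT) * Q(t+dT)."""
    dT = len(rewards) * dt
    integral = sum(r * math.exp(-gamma * k * dt) * dt for k, r in enumerate(rewards))
    return integral + math.exp(-gamma * dT) * q_next

# With gamma = 0 this is just (sum of rewards * dt) + the next Q-value:
print(irl_td_target([1.0, 1.0], dt=0.5, gamma=0.0, q_next=3.0))  # 4.0
```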
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Validation results
Parameters
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position/force control: KPf = 0.001, KIf = 0.02, λ = 0.2, ηΔ = 0.1, ρ1 = 0.5, ρ2 = 20, v = 9/11, α = 5/7
- Environment: Ce = 0.5 N·s·mm^-1, Ke = 2 N·mm^-1, r0 = 16.5 mm, xd = 39.917 mm, xe = 35 mm
Fig 5.1 RL in unknown environments during the learning process
Fig 5.2 Learning curves
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Validation results
Fig 5.3 Desired force/position tracking response
With the LQR solution, three steps are required to complete the experiment: (1) identification of the environment, (2) obtaining the ARE solution using the environmental estimate, and (3) the complete experiment with the position/force controller.
With the IRL approximation, near-optimal performance is achieved without knowledge of the environment dynamics.
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Proposed control idea
1. Find the optimal parameters of the impedance model to assure motion tracking and also assist the human with minimum effort to perform the task in real time.
2. The stability of the closed-loop control is theoretically demonstrated.
3. The unknown human dynamics, as well as the optimal overall human-robot system efficiency, are verified in experimental trials.
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design
Robot-specific inner loop and task-specific outer loop.
The prescribed impedance model maps the human torque τh to the desired motion θd:
    θd(s)/τh(s) = Kh/(Md·s² + Cd·s + Kd)
The control torque in task space combines a sliding feedback term Ks·(ėh + σ·eh), an NN functional approximation Ŵᵀξ(q) of the unknown continuous robot/human dynamics, and robust compensation terms (gains KB, Kq); the NN weights are updated online, e.g. Ŵ̇ = F·ξ(q)·s − k·F·‖s‖·Ŵ.
[Block diagram: human operator → prescribed impedance model → torque control law → APAM-based haptic interface; the NN functional approximation and the IRL-based optimal law close the outer loop; signals τh, θd, θm, eh, ed, ξ.]
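The prescribed impedance model acts as a second-order filter from human torque to reference motion. The sketch integrates Md·ë + Cd·ė + Kd·e = Kh·τh with the parameter values reconstructed from this chapter's list; treat them as assumptions for illustration.

```python
# Prescribed impedance model: Md*e'' + Cd*e' + Kd*e = Kh * tau_h
Md, Cd, Kd, Kh = 1.0, 7.18, 15.76, 0.046
dt, tau_h = 1e-3, 1.0          # constant human torque input
e, e_dot = 0.0, 0.0

for _ in range(10000):         # 10 s: well past the transient
    e_ddot = (Kh * tau_h - Cd * e_dot - Kd * e) / Md
    e_dot += dt * e_ddot       # semi-implicit Euler step
    e += dt * e_dot

print(round(e, 5))             # settles near Kh*tau_h/Kd
```

The steady-state response Kh·τh/Kd shows how the IRL-tuned stiffness Kd trades off compliance against tracking.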
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design — IRL based on the LQR solution
The human operator is modeled by a transfer function G(s) [99] from the tracking error ed to the applied torque τh (purpose: relate τh to ed), with parameters ke, κp, κd:
    τh = G(s)·ed,  G(s) = (κp·s + κd)/(ke·s + 1)
Stacking the impedance error dynamics and the human model yields a new augmented state:
    Ẋ = A·X + B·ue,  with feedback ue = −K·X
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design
The infinite-horizon quadratic performance index to minimize:
    Jm = ∫[t,∞) (edᵀ·Qd·ed + θhᵀ·Qh·θh + ueᵀ·R·ue) dτ
For the augmented state Ẋ = A·X + B·ue with ue = −K·X, the optimal feedback control is
    ue = argmin V(t, X(t)) = −R^-1·Bᵀ·P·X
where P is the real symmetric positive definite solution of the algebraic Riccati equation [83]:
    0 = Aᵀ·P + P·A + Q − P·B·R^-1·Bᵀ·P
The value function satisfies, for any time interval ΔT > 0, the IRL Bellman equation
    V(X(t)) = ∫[t,t+ΔT] Xᵀ·(Q + Kᵀ·R·K)·X dτ + V(X(t + ΔT))
and with V(X(t)) = Xᵀ·P·X and the Lyapunov equation (A − BK)ᵀ·P + P·(A − BK) = −(Q + Kᵀ·R·K), the IRL Bellman equation can be written as
    Xᵀ(t)·P·X(t) = ∫[t,t+ΔT] Xᵀ·(Q + Kᵀ·R·K)·X dτ + Xᵀ(t + ΔT)·P·X(t + ΔT)
Proof of convergence: see Appendix 6.2.
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design — online implementation
Fig 6.1 Continuous-time linear policy iteration algorithm
Initialization: P⁰ = 0, i = 1, K¹ chosen such that (A − BK¹) is stable.
1. Start with the admissible control input ue⁰ = −K¹·X.
2. Policy evaluation: given the control policy uⁱ, find Pⁱ from the off-policy Bellman equation
       Xᵀ(t)·Pⁱ·X(t) = ∫[t,t+ΔT] Xᵀ·(Q + (Kⁱ)ᵀ·R·Kⁱ)·X dτ + Xᵀ(t + ΔT)·Pⁱ·X(t + ΔT)
   solving for the cost by least squares over N(i) samples.
3. Policy improvement: update the control policy
       ue^(i+1) = −R^-1·Bᵀ·Pⁱ·X
Repeat with i → i + 1 until ‖pⁱ − pⁱ⁻¹‖ ≤ ε.
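The policy-iteration loop of Fig 6.1 can be checked on a scalar model where every step is explicit: policy evaluation solves a scalar Lyapunov equation and policy improvement updates the gain. The IRL version replaces the model-based evaluation with the least-squares data fit; the (a, b, q, r) values here are illustrative.

```python
import math

# Scalar continuous-time LQR: x' = a*x + b*u, cost = integral of q*x^2 + r*u^2.
a, b, q, r = 1.0, 1.0, 1.0, 1.0
k = 2.0  # admissible initial gain: a - b*k < 0

for _ in range(20):
    # Policy evaluation: 2*(a - b*k)*P = -(q + r*k**2)  (scalar Lyapunov equation)
    P = (q + r * k ** 2) / (2 * (b * k - a))
    # Policy improvement: u = -k*x with k = b*P/r
    k = b * P / r

print(round(k, 6))  # converges to the ARE solution 1 + sqrt(2)
```

Each iteration keeps the closed loop stable and the gain converges monotonically, mirroring the convergence proof cited in Appendix 6.2 for the data-driven version.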
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Validation results — simulation
Parameters
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: Ks = 20, F = 10, G = 1, μ1 = μ2 = 5, k = 0.1, Z̄ = 100, κp = 20, κd = 10, ke = 0.18 s
- Impedance model: θd = 10sin(0.2πt), Md = 1 kg, Bd = 7.18 N·s·mm^-1, Kd = 15.76 N·mm^-1, Kh = 0.046
Fig 6.2 The tracking error between the trajectory of the robot system and the prescribed impedance model
Fig 6.3 Reference trajectory versus the state of the robot system during and after learning
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Validation results — experiment
Fig 6.4 The output trajectory and the impedance model in the inner loop
Fig 6.5 The desired trajectory and the output trajectory in the outer loop
Fig 6.6 Interaction force
Discussion
- Near-perfect performance of the human-robot cooperation
- Only a small deviation between the desired and actual trajectories
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Proposed control idea
1. For the first time, a hybrid control based on a fast finite-time NTSMC and the AFOB is proposed.
2. The stability and finite-time convergence characteristics of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified in different working conditions.
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
( )1 2id = minus +
1 1
2
= minus + = minus +
id
T
i i id i
T
i ia = minus
The external torque estimation
[ ]
1 2ˆ( ) = + + minus minus minus +iv
i i ir i i i i i i i i i i idM C K sign s s s
( )i i i is = +
[ ]
2
1 2
if 0 or 0 and ( )
( ) if 0 and
i
i i i i i i
i i
i i i i i i i i
s s
z z sign s
= =
+
An adaptive FSOB gain
where
where
The torque control design by style-FNTSMC
The modified FNTSM surface design
Master Slave
Environment
APAM Slave
system
Slave
Controller
Estimator
Oper
ator
Master
Controller
APAM Master
system
Estimator
Impendence
filter
+
++
-
+
-
+
+md
m s
s
sd
ˆsdˆmd
mf
m
d
Control design
1 2 3 4 5 6 7 8
20-May-21 | 47Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 71 Torque estimation performances
λi = 8 pi = 5 qi = 7 ςi = 10-4 νi = 911
εi = 01 ρ1i = 10 ρ2i = 2
Proposed control
PID control KP = 38 KI = 25 KD = 015
Observersfor NDO Lh = Le = 10 for RTOB β = 350 rads
for AFOB κi(0) = 01 λi = 10 ηi = 0 m = 01
a = 10
RMSEs associated with NDO RTOB and proposed algorithm
are calculated of 0211 0192 and 0076 respectively
Parameters
Scenarios- The free space situation θd (t) = 5π(1 - 018t)sin(072πt)
- The contact and recovery situation soft environment Ke = 500 Nmm-1 Be = 4 Nsmm-1
stiff environment Ke = 2800 Nmm-1 Be = 20 Nsmm-1
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 48Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Validation results
Fig. 7.2 Position tracking performances
Fig. 7.2 Position tracking error performances
Fig. 7.2 Transparency performance in "soft" contact
Fig. 7.2 Transparency error in soft contact
Fig. 7.2 Transparency performance in "stiff" contact
Fig. 7.2 Transparency error in stiff contact
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Summary
- Finite-time convergence of the tracking errors
- Elimination of the uncertainties and the noise effect in the obtained force sensing via the new AFOB
- Robust performance against uncertainties and disturbances in the system dynamics
- Stability of the overall system and transparency performance are attained simultaneously
- Fast transient response
Great transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and across different working conditions.
8 CONCLUSION AND FUTURE WORKS
Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions (uncertainties and disturbances are handled).
- Validated the significant effectiveness on finite-time force tracking problems of the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding a highly effective position control.
- Achieved good transparency with both force feedback and position tracking simultaneously by the new adaptive force estimation combined with the separate FNTSMC schemes.
8 CONCLUSION AND FUTURE WORKS
Future works
- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both the master and slave sides; the computational complexity should also be reduced with respect to the data size.
- Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device for the rehabilitation field with a safer solution.
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
1 INTRODUCTION
Overview of the BHTS structure
Figure 1.1 The bilateral haptic teleoperation architecture
Performance: the main purpose is to provide the technical means to successfully perform a desired task in the environment (achieve high task performance). For evaluation, physically accessible quantities are used (task completion time, measurement error, applied forces, or dissipated energies).
1 INTRODUCTION
Overview of the BHTS structure
Figure 1.1 The bilateral haptic teleoperation architecture
Perception: it refers to the operator's feeling of being there. The sense of presence in the remote environment is correlated with task performance in a positive causal way: by improving the feeling of presence, task performance is improved as well.
1 INTRODUCTION
Overview of the BHTS structure
Figure 1.1 The bilateral haptic teleoperation architecture
Transparency: compared to perception, it is a quantitative objective. Transparency and robustness are conflicting objectives (a trade-off). A measure for transparency is fidelity, which describes the ability to exactly display the environment to the operator.
1 INTRODUCTION
Overview of the BHTS structure
Figure 1.1 The bilateral haptic teleoperation architecture
The criteria — Robustness, Performance, Perception, Transparency — are used to classify, evaluate, and achieve the design goals under the environment condition, the operator's characteristics, and the communication channel.
Representative approaches: Variable Impedance Control [5, 6]; Model-mediated Teleoperation [40]; Movement Estimation [78, 79]; Passivity [27], Prediction [39], ...
1 INTRODUCTION
PAM actuator
Fig. 1.2 (a) Structure and (b) working principle of the PAM: a rubber hose in loose-weave fibers with fitting connections; pressurizing the hose contracts it, producing the stroke.
Application examples: 2-joint manipulator [16, 17], forceps manipulators [20], KNEXO robot [22], wearable haptic device [23].
1 INTRODUCTION
Research Objectives
01 Propose a position tracking controller: an integral terminal-style SMC is designed with the combination of an adaptive gain and the TDE technique.
02 Propose a finite-time force controller: an adaptive-gain fast ITSMC-TDE scheme is established with a friction-free disturbance technique in finite time.
03 Propose model-free control for the slave-environment interaction: the RL approximation method is applied to learn the optimal (minimal) contact force online.
04 Propose an optimal impedance for the human-master interaction: IRL is used to optimize the prescribed impedance model parameters to assist the operator with minimum human effort.
05 Propose force sensorless reflecting control for bilateral haptic teleoperation: a new AFOB is proposed to estimate the force signals, achieving great transparency (position and force feedback).
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Mathematical modeling
Fig. 2.1 A cross-sectional view of the simplified geometric model (a braid thread of length b makes n turns; nπD and L are its circumferential and axial projections)
The tensile force of each PAM [7]:
F = −P dV/dL = P(3L² − b²)/(4πn²)
The motion equation of the antagonistic pair:
Mẍ + Bẋ = −F₁ + F₂
which is rewritten in state-space form as
ẍ = f₀(x, ẋ) + g₀(x)u + d
where f₀(x, ẋ) and g₀(x) are the nominal values of f(x, ẋ) and g(x), obtained by substituting the PAM force model with nominal pressures into the motion equation, and d = (f − f₀) + (g − g₀)u + d_ex collects the total unknown disturbances.
Table 2.1 Parameters of the studied PAM
Parameter | Description | Value | Unit
P_s | Supply pressure | 6.5 | bar
B | Viscous damping coefficient | 50 | N·s/mm
M | Mass of the plate | 0.5 | kg
L₀ | Original length of the PAM | 175 | mm
n | Number of turns of the thread | 1.04 | —
b | Thread length | 200 | mm
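The tensile-force formula above can be checked numerically. A minimal sketch in SI units (the pressure and length values are illustrative; b and n follow Table 2.1 under an assumed decimal placement, since the transcript drops decimal points):

```python
import math

def pam_force(P, L, b=0.200, n=1.04):
    """Tensile force of a McKibben PAM: F = P*(3L^2 - b^2)/(4*pi*n^2).

    P [Pa], L and b [m]; b and n taken from Table 2.1 (decimals assumed).
    """
    return P * (3.0 * L**2 - b**2) / (4.0 * math.pi * n**2)

# The force grows with pressure and with length, and crosses zero at the
# theoretical full contraction L = b/sqrt(3) (neutral braid angle).
```

This zero crossing at L = b/√3 is why a PAM can only pull over a limited stroke, which motivates the antagonistic pair in the motion equation.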
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Experimental modeling
The approximated PAM model, based on its pressure and contraction:
M_k ẍ + b_k(P_k) ẋ + k_k(P_k) x = f_k(P_k),   k = 1, 2
b_k(P_k) = b_{k1} P_k + b_{k0}
k_k(P_k) = k_{k1} P_k + k_{k0}
f_k(P_k) = f_{k1} P_k + f_{k0}
Table 2.2 List of APAM model parameters
Parameter | Value | Units
Spring elements: k_{k1} | 11732 | N·m⁻¹·bar⁻¹
Spring elements: k_{k0} | 68 | N·m⁻¹
Damping elements: b_{k1} | 962 | N·s·m⁻¹·bar⁻¹
Damping elements: b_{k0} | 0.85 | N·s·m⁻¹
Force elements: f_{k1} | 9122 | N·bar⁻¹
Force elements: f_{k0} | 473 | N
Fig. 2.3 Experimental model at separate pressures (force [N] versus contraction length [mm] for pressures of 1–6 bar)
Fig. 2.2 Block diagram of the experimental system: two PAMs acting through a pulley on a plate with a load; a linear encoder returns the position signal, proportional pressure regulator valves receive the control signal, and PCI-1711 / PCI-QUAD04 cards interface the pressure, control, and position signals.
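The pressure-dependent three-element model above can be simulated directly. A minimal explicit-Euler sketch for a single muscle at constant pressure (the coefficient digits follow Table 2.2, but their decimal placement is an assumption, since the transcript drops decimal points):

```python
def simulate_pam_step(P, M=0.5, dt=1e-4, T=2.0):
    """Step response of M*x'' + b(P)*x' + k(P)*x = f(P) at constant pressure P [bar]."""
    b = 9.62 * P + 0.85      # damping element  b_k(P) = b_k1*P + b_k0  (decimals assumed)
    k = 1173.2 * P + 68.0    # spring element   k_k(P) = k_k1*P + k_k0  (decimals assumed)
    f = 91.22 * P + 4.73     # force element    f_k(P) = f_k1*P + f_k0  (decimals assumed)
    x = v = 0.0
    for _ in range(int(T / dt)):
        a = (f - b * v - k * x) / M
        v += a * dt
        x += v * dt
    return x, f / k          # settled position vs. static prediction f(P)/k(P)
```

The simulated contraction settles at the static value f(P)/k(P), which is how the spring, damping, and force elements can be separated during identification: steady-state data fixes f/k, transient data fixes b.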
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Platform setup
Fig. 2.5 The APAM-based (a) master and (b) slave subsystem configurations
(a) Master: pneumatic muscle actuators acting through a pulley on a control stick, proportional pressure regulator valves, an optical encoder, loadcells, and an interactive loadcell (front and back views)
(b) Slave: an APAM system (PAM and pulley) contacting a spring-damper environment (stiffness/damping) through a loadcell and a proportional valve; position, control, and contact force signals run through PCI cards in the PC and a control box
2 MODELING OF THE BILATERAL TELEOPERATION SYSTEM
Platform setup
Fig. 2.6 Photograph of the experimental apparatus of the overall BHTS (master robot with control stick, slave robot, and environment)
Table 2.3 Specifications of the experimental devices
- PAM: Festo MAS-10-200N-AA-MC-O; nominal length 200 mm; inside diameter 10 mm; max pressure 8 bar
- PPR valve: Festo VPPM-8L-L-1-G14-0L8H; max pressure 8 bar
- DAQ cards: ADVANTECH PCI-1711 (AI/AO, 12-bit resolution); MEASUREMENT PCI-QUAD04
- Linear encoder: type Rational WTB5-0500MM; resolution 0.005 mm
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Proposed control idea
1. For the first time, the AITSMC-TDE scheme is proposed for a PAM system.
2. The stability of the controlled system is theoretically analyzed.
3. The control effectiveness is verified by experimental trials in different challenging working conditions (comparing several controllers).
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Control design
The auxiliary variables are defined as
ξ₁ = e₁ = x₁ − x_{1d},  ξ₂ = x₂ − ẋ_{1d}
A terminal sliding surface s₂ is chosen on (ξ₁, ξ₂) with gains k₁, k₂, and the proposed integral terminal sliding surface s(t) accumulates s₂ from the initial time t₀, so the reaching phase is eliminated.
A robust control signal u_s with an adaptive law for the robust gain η̂₃ is then redesigned and added to the nominal control signal.
Proof: see Appendix 3.2, 3.3.
[Block diagram: reference trajectory x_{1d}, x_{2d} → transfer coordinate → terminal and integral terminal sliding surfaces → nominal control signal u_n and robust control signal u_s with the adaptation law → control law u → the PAM system, with a time-delay estimator feeding the lumped disturbance estimate back]
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Control design
The nominal control signal is designed from the nominal model as
u_n = g₀⁻¹(x₁)[ẍ_{1d} − f₀(x₁, x₂) − stabilizing feedback of the sliding variables (gains η₁, η₂, κ₁–κ₄, exponents γ₁, γ₂)]
The lumped disturbance is estimated by the time-delay estimation, i.e., the one-sample-delayed model residual:
d̂(t) = ẋ₂(t − t_d) − f₀(x₁(t − t_d), x₂(t − t_d)) − g₀(x₁(t − t_d)) u(t − t_d)
so the control law can be rewritten as
u(t) = u_n + u_s − g₀⁻¹(x₁) d̂
Proof: see Appendix 3.1, 3.4–3.5.
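The TDE idea — estimate the lumped disturbance from the one-sample-delayed model residual — can be sketched on a toy second-order plant (the plant numbers here are illustrative, not the PAM model):

```python
def tde_demo(steps=2000, dt=1e-3):
    """Time-delay estimation: d_hat(t) = xdd(t-td) - f0(t-td) - g0*u(t-td)."""
    f0 = lambda x, v: -2.0 * x - 0.5 * v    # nominal model (illustrative)
    g0 = 1.0
    d_true = 1.5                            # unknown constant disturbance
    x = v = dhat = 0.0
    prev = None                             # (xdd, f0, u) one sample ago
    for _ in range(steps):
        u = -dhat / g0                      # cancel the estimated disturbance
        xdd = f0(x, v) + g0 * u + d_true    # true plant acceleration
        if prev is not None:
            p_xdd, p_f0, p_u = prev
            dhat = p_xdd - p_f0 - g0 * p_u  # model residual = delayed estimate of d
        prev = (xdd, f0(x, v), u)
        v += xdd * dt
        x += v * dt
    return dhat
```

In the actual controller the residual is formed from the delayed measured acceleration, so the estimate always lags by one sample; that lag is the price of needing no disturbance model at all.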
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Experimental validation
Parameters
- Proposed control: for the ITSMC, η₁ = 3, η₂ = 15, η₃ = 25, κ₁ = κ₃ = 10, κ₂ = κ₄ = 2, γ₁ = γ₂ = 5/7, k₁ = 2, k₂ = 55, α = 9/11; for the adaptive gain, μ = 10, δ = 0.05, σ = 0.1
- PID control: for case study 3, K_P = 15, K_I = 18, K_D = 0.05; for the other cases, K_P = 35, K_I = 28, K_D = 0.1
Case studies
- Case study 1 (no load): x_d (mm) = 8sin(0.4πt)
- Case study 2 (no load): x_d (mm) = 8sin(πt)
- Case study 3 (no load): a multistep trajectory
- Case study 4 (with load): x_d (mm) = 8sin(πt), in turn with a 2-kg load and a 7-kg load
Fig. 3.1 Performance response in case study 1
Fig. 3.2 Performance response in case study 2
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Experimental validation
Fig. 3.3 Performance response in case study 3
Fig. 3.4 Performance response in case study 4
Table 3.1 A quantitative comparison between the tracking performances
Controller                   |       | PID    | ITSMC  | Proposed
Sine 0.2 Hz – 8 mm           | L2[e] | 0.4102 | 0.1737 | 0.1574
                             | L2[u] | 0.9961 | 1.0270 | 0.9602
Sine 1 Hz – 8 mm             | L2[e] | 1.1845 | 0.4996 | 0.3788
                             | L2[u] | 1.1284 | 1.1152 | 1.0900
Multistep                    | L2[e] | 0.8587 | 0.6846 | 0.6368
                             | L2[u] | 0.7670 | 0.7333 | 0.7247
Sine 1 Hz – 8 mm, 2-kg load  | L2[e] | 1.2578 | 0.8314 | 0.4779
                             | L2[u] | 1.1414 | 1.1506 | 1.1287
Sine 1 Hz – 8 mm, 7-kg load  | L2[e] | 1.5937 | 1.0827 | 0.5067
                             | L2[u] | 1.1613 | 1.1729 | 1.1584
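Assuming L2[·] in Table 3.1 denotes the discrete RMS value of the error and control signals (the table itself does not define it), the metric is one line:

```python
import math

def l2_metric(signal):
    """RMS value used for the L2[e] and L2[u] comparison metrics (assumed definition)."""
    return math.sqrt(sum(s * s for s in signal) / len(signal))
```

Lower L2[e] means tighter tracking; comparable L2[u] values across the columns show the improvement is not bought with extra control effort.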
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Summary
- Robust performance against uncertainties and disturbances
- Finite-time convergence of the tracking errors
- Reduction of the chattering phenomenon
- Elimination of the reaching phase
- Fast transient response
The next works are directed to control using the force properties.
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Proposed control idea
1. The proposed FSOB scheme obtains force information with noise and static friction eliminated.
2. The FITSMC guarantees finite-time convergence; a new adaptive gain and the TDE are deployed online to handle the uncertainties and the chattering phenomenon.
3. The reliability and superiority of the proposed control are demonstrated in different challenging working conditions.
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Control design
The FITSM surface is defined as
σ(t) = e_f + ∫₀ᵗ [λ₁ e_f(τ) + λ₂ |e_f(τ)|^v sign(e_f(τ))] dτ
An adaptive switching gain law raises the robust gain while |σ| grows and lets it decay otherwise; the lumped disturbance d̂ is obtained from the one-sample-delayed residual of the force dynamics (force-based TDE), and the contact force feedback comes from a force observer based on the LeD [40].
The control law adds d̂(t), feedback of the force error e_f, and fractional-power switching terms on σ to enforce finite-time convergence.
Proof: see Appendix 4.1–4.3.
[Block diagram: reference trajectory → integral terminal sliding surface with the adaptive law → control law u → APAM system in contact with the environment; the force observer provides τ̂_e and the force-based time-delay estimator provides d̂, both fed back to the controller]
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Parameters
- Proposed control: for the ITSMC, η₁ = 3, η₂ = 15, η₃ = 25, κ₁ = κ₃ = 10, κ₂ = κ₄ = 2, γ₁ = γ₂ = 5/7, k₁ = 2, k₂ = 55, v = 9/11; for the adaptive gain, μ = 10, δ = 0.05, σ = 0.1
- PID control: for scenario 3, K_P = 15, K_I = 18, K_D = 0.05; for the other scenarios, K_P = 35, K_I = 28, K_D = 0.1
- Observers: for the FSOB, ω = 500 rad/s, ξ = 0.001, α₁ = 20, α₂ = 12, α₃ = 5, α₄ = 15
Scenarios
- Scenario 1: τ_d (N) = 7.5sin(0.2πt − π/2) + 6.25
- Scenario 2: τ_d (N) = 7.5sin(0.8πt − π/2) + 6.25
- Scenario 3: a multistep trajectory with an LPF bandwidth of 0.2 rad/s
Experimental validation
Fig. 4.1 Performance response in Scenario 1
Fig. 4.2 Performance response in Scenario 2
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Experimental validation
Fig. 4.3 Performance response in Scenario 3
Fig. 4.4 Control gain
Table 4.1 An RMSE comparison between the tracking error performances
Controller   | PID    | ISMC   | FITSMC-TDE | Proposed
Sine 0.1 Hz  | 2.1050 | 1.2152 | 0.9440     | 0.5158
Sine 0.4 Hz  | 3.2054 | 1.8045 | 2.0632     | 1.9807
Multistep    | 2.6508 | 2.3548 | 2.0618     | 1.7685
Discussion
- Improved control performances (fast response, high accuracy, and robustness)
- Guaranteed finite-time convergence
- Cancelled the lumped uncertainties online via the TDE solution and the switching gain in the FITSMC
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Proposed control idea
1. The RL method is used to learn the optimal desired force online.
2. The stability of the closed-loop control is theoretically demonstrated by the Lyapunov approach.
3. The approach is verified to be effective for robot interaction control in unknown environments.
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Control design
The environment is modeled as a spring-damper system [81]:
F_e(t) = −C_e ẋ(t) + K_e (x_e − x(t))
The optimal impedance model Z(e_r(t)) = F_d(t) imposes the prescribed dynamics M ë_r + C ė_r + K e_r driven by the contact-force error, with linearized error dynamics
ė_r(t) = A e_r(t) + B F_d(t)
The position control in task space closes a PI force loop:
x_f(t) = K_{Pf} e_f(t) + K_{If} ∫₀ᵗ e_f(τ) dτ
[Loop structure: a task-specific outer loop (prescribed impedance model plus RL-based approximation of F_d) commands the reference x_r of a robot-specific inner loop that performs precise position control of the APAM system against the environment]
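The outer force loop above can be sketched against a pure-spring environment. A minimal discrete-time sketch (the PI gains, spring constant K_e = 2 N/mm, and contact point x_e = 35 mm follow the Chapter 5 parameter list; the step size and the incremental position update are illustrative):

```python
def force_tracking_demo(Fd=5.0, steps=20000):
    """PI force loop x_f = KPf*e_f + KIf*int(e_f) driving a spring contact Fe = Ke*(x-xe)."""
    Ke, xe = 2.0, 35.0            # environment spring [N/mm] and contact point [mm]
    KPf, KIf = 0.001, 0.02        # force-loop PI gains (Chapter 5 parameter list)
    dt = 1e-3
    x, integ = xe, 0.0
    for _ in range(steps):
        Fe = Ke * max(x - xe, 0.0)   # contact force (zero when out of contact)
        ef = Fd - Fe                  # force tracking error
        integ += ef * dt
        x += KPf * ef + KIf * integ   # incremental position reference update
    return Ke * (x - xe)              # settled contact force
```

The integral term drives the steady-state force error to zero against the unknown spring, which is exactly the role the task-specific outer loop plays before the RL stage replaces the fixed desired force F_d with a learned optimal one.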
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
LQR solution (offline solution)
The main objective of this work is to determine the optimal desired force F_d, the minimum force that minimizes the position error e_r:
F_d(t) = κ_e e_r(t)
The discounted cost function:
J_s(e_r(t), F_d(t)) = ∫ₜ^∞ e^{−γ(τ−t)} [e_rᵀ(τ) S e_r(τ) + F_dᵀ(τ) R F_d(τ)] dτ
The algebraic Riccati equation:
Aᵀ P A + S − P + Aᵀ P B κ_e = 0
The feedback control gain:
κ_e = −(Bᵀ P B + R)⁻¹ Bᵀ P A
Implementation procedure: LQR solution → RL solution → IRL approximation
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
RL solution (online solution, using off-policy RL to find the solution of the new ARE)
Solving the ARE exactly is impossible for all time in an unknown environment, so the Hamilton–Jacobi–Bellman equation is posed over the discounted cost. The discounted value function
V_s(e_r(t)) = ∫ₜ^∞ e^{−γ(τ−t)} r(e_r(τ), F_d(τ)) dτ,  where r(e_r, F_d) = e_rᵀ S e_r + F_dᵀ R F_d
leads to the new algebraic Riccati equation
e_rᵀ(t)(Aᵀ P + P A + S − γP) e_r(t) + 2 e_rᵀ(t) P B F_d(t) + F_dᵀ(t) R F_d(t) = 0
Defining the Q-function Q(e_r(t), F_d(t)) = V_s(e_r(t)) = Xᵀ H X with X = [e_rᵀ, F_dᵀ]ᵀ, the optimal desired force is
F_d(t) = −H₂₂⁻¹ H₂₁ e_r(t)
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
IRL approximation (minimizing the temporal-difference error)
The Q-value function is approximated by N normalized Gaussian radial basis functions:
Q̂(e_r, F_d) = Σᵢ₌₁ᴺ θᵢ φᵢ(e_r, F_d)
φᵢ(e_r, F_d) = μᵢ(e_r, F_d) / Σⱼ₌₁ᴺ μⱼ(e_r, F_d),  μᵢ(e_r, F_d) = exp(−½ ‖[e_r − c_{r,i}; F_d − c_{d,i}]‖²/σᵢ²)
The Q-function update law based on IRL (Bellman equation over an interval Δt):
Q(e_r(t), F_d(t)) = ∫ₜ^{t+Δt} e^{−γ(τ−t)} r(τ) dτ + e^{−γΔt} Q(e_r(t+Δt), F_d(t+Δt))
with the reward built from the utility function r(t) = e_rᵀ(t) S e_r(t) + F_dᵀ(t) R F_d(t)
The parameters θ are updated by gradient descent on the temporal-difference error between the two sides of this equation.
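The normalized Gaussian features behind Q̂ can be sketched directly (the centers and width below are illustrative placeholders, not the trained values):

```python
import math

def rbf_features(er, Fd, centers, sigma=1.0):
    """Normalized Gaussian features phi_i([er, Fd]) used to approximate the Q-function."""
    raw = [math.exp(-((er - ce) ** 2 + (Fd - cf) ** 2) / (2.0 * sigma ** 2))
           for ce, cf in centers]
    s = sum(raw)
    return [r / s for r in raw]
```

Q̂ is then the weighted sum Σ θᵢφᵢ over these features, and only the weights θ are adapted by the TD update; the normalization keeps the features summing to one, so the approximation stays bounded between the local θᵢ values.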
Validation results
Fig. 5.1 RL in unknown environments during the learning process
Fig. 5.2 Learning curves
Parameters
- LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δ_Δ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position/force control: K_{Pf} = 0.001, K_{If} = 0.02, λ = 0.2, η_Δ = 0.1, ρ₁ = 0.5, ρ₂ = 20, v = 9/11, α = 5/7
- Environment: C_e = 0.5 N·s·mm⁻¹, K_e = 2 N·mm⁻¹, r₀ = 165 mm, x_d = 39.917 mm, x_e = 35 mm
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Validation results
Fig. 5.3 Desired force/position tracking response
Three steps are required to complete the experiment:
1. identification of the environment
2. obtaining the ARE solution using the environmental estimate (LQR solution)
3. the complete experiment with the position/force controller (IRL approximation)
The IRL approximation achieves near-optimal performance without knowledge of the environment dynamics.
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Proposed control idea
- Find the optimal parameters of the impedance model to assure motion tracking and assist the human with minimum effort to perform the task in real time.
- The stability of the closed-loop control is theoretically demonstrated.
- The unknown human dynamics as well as the optimal overall human-robot system efficiency are verified in experimental trials.
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design
Robot-specific inner loop: the control torque in task space combines tracking-error feedback (gain K_s), robust bounding terms (gains K_q, K_B), and an NN functional approximation of the unknown continuous human-robot dynamics, with adaptive update laws for the NN weights Ŵ.
Task-specific outer loop: the prescribed impedance model shapes the master motion from the human torque,
θ_m(s) = K_h τ_h(s) / (M_d s² + C_d s + K_d)
and an optimal law based on IRL tunes its parameters.
[Block diagram: human operator → torque τ_h → prescribed impedance model → desired trajectory θ_d → torque control law with the NN functional approximation → haptic interface based on APAM → output θ_m; the IRL-based optimal law and the gain K_h close the outer loop on the errors e_h and e_d]
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design (IRL based on the LQR solution)
The human transfer function G(s) [99] relates the human torque τ_h to the tracking error e_d; its purpose is to model how the operator reacts to e_d (a first-order lag with proportional-derivative action built from k_e, κ_p, and κ_d).
In state form the human model reads ė_h = A_h e_h + B_h e_d, with A_h and B_h assembled from k_e, κ_p, and κ_d, and e_d = [e_dᵀ, ė_dᵀ]ᵀ. Stacking it with the impedance error gives a new augmented state X obeying
Ẋ = A X + B u,  u = −K X
which is handed to the IRL based on the LQR solution.
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design
The infinite-horizon quadratic performance index to minimize:
J_m = ∫ₜ^∞ (e_dᵀ Q_d e_d + τ_hᵀ Q_h τ_h + u_eᵀ R u_e) dτ
For the augmented state Ẋ = A X + B u_e with u_e = −K X, the optimal feedback control is
u_e = argmin_{u_e} V(t, X(t)) = −R⁻¹ Bᵀ P X
where P is the real symmetric positive-definite solution of the algebraic Riccati equation [83]
0 = Aᵀ P + P A + Q − P B R⁻¹ Bᵀ P
Policy evaluation solves the Lyapunov equation
(A − BK)ᵀ P + P (A − BK) = −(Q + Kᵀ R K)
with the value function V(X(t)) = ∫ₜ^∞ Xᵀ(Q + Kᵀ R K)X dτ = Xᵀ P X.
The IRL Bellman equation for a time interval ΔT > 0 can then be written as
Xᵀ(t) P X(t) = ∫ₜ^{t+ΔT} Xᵀ(Q + Kᵀ R K)X dτ + Xᵀ(t+ΔT) P X(t+ΔT)
Proof of convergence: see Appendix 6.2.
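The interval Bellman identity above can be verified numerically on a scalar closed loop: integrating the stage cost over [t, t+ΔT] and adding the terminal value reproduces XᵀPX at time t. All numbers below are illustrative:

```python
def bellman_identity_check(a=-1.0, b=0.5, k=2.0, Q=1.0, R=1.0, x0=1.0, dT=0.1, dt=1e-5):
    """Check x'Px(t) = int_t^{t+dT} x'(Q + k'Rk)x dtau + x'Px(t+dT) for xdot = (a-bk)x."""
    acl = a - b * k
    p = -(Q + R * k * k) / (2.0 * acl)   # scalar Lyapunov eq: 2*acl*p = -(Q + R*k^2)
    x, cost = x0, 0.0
    for _ in range(int(dT / dt)):        # Euler-integrate the stage cost
        cost += (Q + R * k * k) * x * x * dt
        x += acl * x * dt
    return p * x0 * x0, cost + p * x * x  # (left side, right side) of the Bellman identity
```

Crucially, the right-hand side uses only measured trajectory data and the candidate P, never the system matrix A directly, which is what makes the IRL evaluation step model-free.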
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Online implementation
Fig. 6.1 Continuous-time linear policy iteration algorithm
1. Initialization: start with an admissible control input u⁰ = −K¹X (P⁰ = 0, i = 1, K¹ chosen so that A − BK¹ is stable).
2. Policy evaluation: given the control policy uⁱ, find Pⁱ from the off-policy IRL Bellman equation
   Xᵀ(t) Pⁱ X(t) = ∫ₜ^{t+ΔT} Xᵀ(Q + (Kⁱ)ᵀ R Kⁱ)X dτ + Xᵀ(t+ΔT) Pⁱ X(t+ΔT)
   solving for the cost by least squares over measured state trajectories.
3. Policy improvement: update the control policy u^{i+1} = −R⁻¹ Bᵀ Pⁱ X.
4. Repeat with i → i + 1 until ‖pⁱ − p^{i−1}‖ ≤ ε.
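The policy-iteration loop in Fig. 6.1 has a model-based counterpart (Kleinman's algorithm) that is easy to test offline: each policy-evaluation step solves the Lyapunov equation directly instead of fitting it to trajectory data. A sketch with illustrative system matrices:

```python
import numpy as np

def lyap(acl, m):
    """Solve acl^T P + P acl = -m for P via the Kronecker-vectorized linear system."""
    n = acl.shape[0]
    eye = np.eye(n)
    lhs = np.kron(acl.T, eye) + np.kron(eye, acl.T)  # row-major vec of A'P + PA
    p = np.linalg.solve(lhs, -m.flatten()).reshape(n, n)
    return 0.5 * (p + p.T)                           # symmetrize against round-off

def policy_iteration(A, B, Q, R, K, iters=15):
    """Kleinman's algorithm: evaluate K via a Lyapunov equation, improve K = R^-1 B^T P."""
    for _ in range(iters):
        P = lyap(A - B @ K, Q + K.T @ R @ K)  # policy evaluation
        K = np.linalg.solve(R, B.T @ P)       # policy improvement
    return P, K
```

Each iterate Pⁱ is stabilizing and the sequence converges monotonically to the ARE solution, which is the convergence property the slide's off-policy version inherits while replacing the Lyapunov solve with least squares on data.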
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Validation results (simulation)
Fig. 6.2 The tracking error between the trajectory of the robot system and the prescribed impedance model
Fig. 6.3 Reference trajectory versus the state of the robot system during and after learning
Parameters
- LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δ_Δ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: K_s = 20, F = 10, G = 1, μ₁ = μ₂ = 5, k = 0.1, Z̄ = 100, κ_p = 20, κ_d = 10, and k_e = 0.018 s
- Impedance model: θ_d = 10sin(0.2πt), M_d = 1 kg, B_d = 7.18 N·s·mm⁻¹, K_d = 15.76 N·mm⁻¹, K_h = 0.046
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Validation results (experiment)
Fig. 6.4 The output trajectory and the impedance model in the inner loop
Fig. 6.5 The desired trajectory and the output trajectory in the outer loop
Fig. 6.6 Interaction force
Discussion
- Near-perfect performance of the human-robot cooperation
- A small deviation between the desired and actual trajectories
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Conclusion and Future works
Introduction
7
4
5
6
C
H
A
P
T
E
R
1
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Force Sensorless Reflecting Control for BHTS7
1 2 3 4 5 6 7 8
20-May-21 | 45Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
3 The effectiveness of the proposed control method is verified in
the different working conditions
2The stability and finite-time convergence characteristic of the closed-
loop control are theoretically analyzed by the Lyapunov approach
The first time the hybrid control based
on a fast finite-time NTSMC and AFOB1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 46Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
( )1 2id = minus +
1 1
2
= minus + = minus +
id
T
i i id i
T
i ia = minus
The external torque estimation
[ ]
1 2ˆ( ) = + + minus minus minus +iv
i i ir i i i i i i i i i i idM C K sign s s s
( )i i i is = +
[ ]
2
1 2
if 0 or 0 and ( )
( ) if 0 and
i
i i i i i i
i i
i i i i i i i i
s s
z z sign s
= =
+
An adaptive FSOB gain
where
where
The torque control design by style-FNTSMC
The modified FNTSM surface design
Master Slave
Environment
APAM Slave
system
Slave
Controller
Estimator
Oper
ator
Master
Controller
APAM Master
system
Estimator
Impendence
filter
+
++
-
+
-
+
+md
m s
s
sd
ˆsdˆmd
mf
m
d
Control design
1 2 3 4 5 6 7 8
20-May-21 | 47Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 71 Torque estimation performances
λi = 8 pi = 5 qi = 7 ςi = 10-4 νi = 911
εi = 01 ρ1i = 10 ρ2i = 2
Proposed control
PID control KP = 38 KI = 25 KD = 015
Observersfor NDO Lh = Le = 10 for RTOB β = 350 rads
for AFOB κi(0) = 01 λi = 10 ηi = 0 m = 01
a = 10
RMSEs associated with NDO RTOB and proposed algorithm
are calculated of 0211 0192 and 0076 respectively
Parameters
Scenarios- The free space situation θd (t) = 5π(1 - 018t)sin(072πt)
- The contact and recovery situation soft environment Ke = 500 Nmm-1 Be = 4 Nsmm-1
stiff environment Ke = 2800 Nmm-1 Be = 20 Nsmm-1
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 48Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 72 Position tracking performances
Fig 72 Position tracking error performances
Fig 72 Transparency performance in ldquosoftrdquo
Fig 72 Transparency error in soft contract
Fig 72 Transparency performance in ldquostiffrdquo
Fig 72 Transparency error in stiff contract
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 49Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Summary
- Finite-time convergence of the tracking errors
- Elimination of the uncertainties and noise effects in the obtained force sensing via the new AFOB
- Robust performance against uncertainties and disturbances in the system dynamics
- Stability of the overall system and transparency performance attained simultaneously
- Fast transient response
High transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and across different working conditions.
1 2 3 4 5 6 7 8
20-May-21 | 50Presenter Cong Phat Vo
CONTENTS — next: Chapter 8, Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 51Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions (uncertainties and disturbances are handled).
- Validated the effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures good position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding highly effective position control.
- Achieved good transparency with both force feedback and position tracking simultaneously by the new adaptive force estimation combined with separate FNTSMC schemes.
1 2 3 4 5 6 7 8
20-May-21 | 52Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Future works
- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both the master and slave sides; the computational complexity should also be reduced with respect to the data size.
- Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device for the rehabilitation field with a safer solution.
1 2 3 4 5 6 7 8
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
1 2 3 4 5 6 7 8
20-May-21 | 5Presenter Cong Phat Vo
1 INTRODUCTION
Overview of the BHTS structure
Figure 1.1 The bilateral haptic teleoperation architecture
Performance: The main purpose is to provide the technical means to successfully perform a desired task in the environment (achieving high task performance). For evaluation: physically accessible quantities (task completion time, measurement error, applied forces, or dissipated energies).
20-May-21 | 6Presenter Cong Phat Vo
1 INTRODUCTION
Overview of the BHTS structure
Figure 1.1 The bilateral haptic teleoperation architecture
Perception: It refers to the operator's feeling of being there. Ideally, the operator can distinguish a remote environment whose presence is correlated with task performance in a positive causal way: by improving the feeling of presence, task performance is improved as well.
20-May-21 | 7Presenter Cong Phat Vo
1 INTRODUCTION
Overview of the BHTS structure
Figure 1.1 The bilateral haptic teleoperation architecture
Transparency: Compared to perception, it is a quantitative objective. Transparency and robustness are conflicting objectives (a trade-off). A measure of transparency is fidelity: the ability to exactly display the environment to the operator.
20-May-21 | 8Presenter Cong Phat Vo
1 INTRODUCTION
Overview of the BHTS structure
Figure 1.1 The bilateral haptic teleoperation architecture
- Classify by the sources of difficulty: environment condition, operator's characteristics, communication channel.
- Evaluate by the criteria: robustness, performance, perception, transparency.
- Achieve via control approaches: Variable Impedance Control [5, 6]; Model-mediated Teleoperation [40]; Movement Estimation [78, 79]; Passivity [27]; Prediction [39]; etc.
20-May-21 | 9Presenter Cong Phat Vo
1 INTRODUCTION
PAM actuator
Fig. 1.2 (a) Structure and (b) working principle of the PAM: a rubber hose in a loose-weave fiber sleeve with fitting connections; pressurizing the hose contracts the actuator, producing the stroke.
Applications: 2-joint manipulator [16, 17]; forceps' manipulators [20]; KNEXO robot [22]; wearable haptic device [23].
1 2 3 4 5 6 7 8
20-May-21 | 10Presenter Cong Phat Vo
1 INTRODUCTION
Research Objectives
01 Propose a position tracking controller: an integral terminal-style SMC is designed with the combination of an adaptive gain and the TDE technique.
02 Propose a finite-time force controller: an adaptive-gain fast ITSMC-TDE scheme is established with a friction-free disturbance technique in finite time.
03 Propose model-free control for slave-environment interaction: applying an RL approximation method to learn the optimal (minimal) contact force online.
04 Propose an optimal impedance for human-master interaction: using IRL to optimize the prescribed impedance model parameters so as to assist with minimum human effort.
05 Propose force sensorless reflecting control for bilateral haptic teleoperation: a new AFOB is proposed to estimate the force signals, achieving high transparency (position and force feedback).
1 2 3 4 5 6 7 8
20-May-21 | 11Presenter Cong Phat Vo
CONTENTS — next: Chapter 2, Modeling and platform setup based on the APAM configuration
1 2 3 4 5 6 7 8
20-May-21 | 12Presenter Cong Phat Vo
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Mathematical modeling
The tensile force of each PAM [7]:
F = −P dV/dL = P (3L² − b²) / (4π n²)
where b is the thread length and n is the number of turns of the thread.
Fig. 2.1 A cross-sectional view of the simplified geometric model (one turn of thread of length b wraps a cylinder of diameter D, relating b to nπD and the length L).
The motion equation of the system:
M ẍ + B ẋ = −F₁ + F₂
Table 2.1 Parameters of the studied PAM
Parameter | Description                    | Value | Unit
P_s       | Supply pressure                | 6.5   | bar
B         | Viscous damping coefficient    | 50    | N·s/mm
M         | Mass of the plate              | 0.5   | kg
L_0       | Original length of the PAM     | 175   | mm
n         | Number of turns of the thread  | 104   | -
b         | Thread length                  | 200   | mm
In state-space form:
ẍ = f₀(x, ẋ) + g₀(x)u + d
where f₀(x, ẋ) and g₀(x) are the nominal values of f(x, ẋ) and g(x) derived from the PAM force model, and
d = (f(x, ẋ) − f₀(x, ẋ)) + (g(x) − g₀(x))u + (external disturbances)
is the lumped total of the unknown disturbances.
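The static force model above can be evaluated directly. A minimal sketch, assuming SI units and treating the Table 2.1 values purely as illustrative inputs (the thesis works in mm and bar):

```python
import math

def pam_force(P, L, b, n):
    """McKibben-type PAM tensile force F = -P dV/dL = P (3 L^2 - b^2) / (4 pi n^2).

    P: internal pressure, L: current muscle length, b: thread length,
    n: number of thread turns. Force vanishes at L = b / sqrt(3) and is
    linear in the pressure P.
    """
    return P * (3.0 * L ** 2 - b ** 2) / (4.0 * math.pi * n ** 2)
```

This matches two sanity checks of the model: zero force at the theoretical maximum contraction L = b/√3, and proportionality of force to pressure at a fixed length, which is why antagonistic pairs of PAMs can be driven by a pressure difference.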
1 2 3 4 5 6 7 8
20-May-21 | 13Presenter Cong Phat Vo
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Experimental modeling
The approximated PAM model, based on its pressure and contraction strain:
M ẍ + b_k(P_k) ẋ + k_k(P_k) x = f_k(P_k),   k = 1, 2
with pressure-affine coefficients
b_k(P_k) = b_k1 P_k + b_k0,  k_k(P_k) = k_k1 P_k + k_k0,  f_k(P_k) = f_k1 P_k + f_k0.
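The identified model is a second-order system whose damping, stiffness, and force terms are affine in pressure, so its response at a fixed pressure is easy to simulate. A minimal sketch with explicit Euler integration; the coefficient values are placeholders in the spirit of Table 2.2, not the thesis' identified numbers:

```python
def simulate_pam(P, T=5.0, dt=1e-4, M=0.5,
                 bk=(9.62, 0.85), kk=(117.32, 6.8), fk=(91.22, 4.73)):
    """Simulate M*x'' + b(P)*x' + k(P)*x = f(P) at constant pressure P.

    Returns (final position, analytic steady state f(P)/k(P)).
    Coefficients are pressure-affine: c(P) = c1*P + c0.
    """
    b = bk[0] * P + bk[1]
    k = kk[0] * P + kk[1]
    f = fk[0] * P + fk[1]
    x = v = 0.0
    for _ in range(int(T / dt)):      # explicit Euler integration
        a = (f - b * v - k * x) / M
        x += v * dt
        v += a * dt
    return x, f / k
```

At steady state the position converges to f(P)/k(P), which is the static force balance the identification procedure exploits: sweeping P and recording the settled contraction recovers the affine coefficient maps.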
Table 2.2 List of APAM model parameters
- Spring elements: k_k1 = 11732 N/m/bar, k_k0 = 68 N/m
- Damping elements: b_k1 = 962 N·s/m/bar, b_k0 = 0.85 N·s/m
- Force elements: f_k1 = 9122 N/bar, f_k0 = 473 N
Fig. 2.3 Experimental model for certain separate pressures (force [N] versus contraction length [mm] at several pressures [bar]).
Fig. 2.2 Block diagram of the experimental system: two PAMs (PAM 1, PAM 2) drive a plate over a pulley under load; a linear encoder returns the position signal through a PCI-QUAD04 card, while a PCI-1711 card sends the control signal to the proportional pressure regulator valves (pressure signal).
1 2 3 4 5 6 7 8
20-May-21 | 14Presenter Cong Phat Vo
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Platform setup
Fig. 2.5 The APAM-based (a) master and (b) slave subsystem configurations.
(a) Master: a control stick on a pulley driven by pneumatic muscle actuators, with an optical encoder, loadcells, an interactive loadcell, and proportional pressure regulator valves (front and back views).
(b) Slave: the APAM system (PAM and pulley) in contact with an environment emulated by a spring, damper, and loadcell with a proportional valve; a PC with PCI cards in a control box exchanges the position, control, stiffness/damping, and contact-force signals.
1 2 3 4 5 6 7 8
20-May-21 | 15Presenter Cong Phat Vo
2 MODELING OF THE BILATERAL TELEOPERATION SYSTEM
Platform setup
Fig. 2.6 Photograph of the experimental apparatus of the overall BHTS (master robot with control stick, slave robot, environment).
Table 2.3 Specifications of the experimental devices
- PAM: Festo MAS-10-200N-AA-MC-O; nominal length 200 mm; inside diameter 10 mm; max pressure 8 bar
- PPR valve: Festo VPPM-8L-L-1-G14-0L8H; max pressure 8 bar
- DAQ cards: ADVANTECH PCI-1711 (12-bit AI/AO resolution); MEASUREMENT PCI-QUAD04
- Linear encoder: Rational WTB5-0500MM; resolution 0.005 mm
1 2 3 4 5 6 7 8
20-May-21 | 16Presenter Cong Phat Vo
CONTENTS — next: Chapter 3, Adaptive Gain ITSMC with TDE scheme for the position control
1 2 3 4 5 6 7 8
20-May-21 | 17Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Proposed control idea
1. For the first time, the AITSMC-TDE scheme is proposed for a PAM system.
2. The stability of the controlled system is theoretically analyzed.
3. The control effectiveness is verified by experimental trials in different challenging working conditions (comparison with several controllers).
1 2 3 4 5 6 7 8
20-May-21 | 18Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Control design
- A terminal sliding surface is chosen: s = e₂ + k₁e₁ + k₂ sig(e₁)^γ, with gains k₁, k₂ and fractional exponent γ listed with the parameters.
- A proposed integral sliding surface σ(t) is designed by integrating the terminal-surface dynamics together with the nominal model terms f₀, g₀ and the desired trajectory.
- A robust control signal is redesigned with an adaptively tuned switching gain acting on σ, and an adaptive law updates that robust gain online.
- Proof: see Appendix 3.2 and 3.3.
[Control block diagram: reference trajectory → transfer coordinate → terminal and integral terminal sliding surfaces → nominal control signal u_n plus robust control signal u_s (with adaptation law) → control law u → the PAM system, with a time-delay estimator feeding back the disturbance estimate d̂.]
1 2 3 4 5 6 7 8
20-May-21 | 19Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Control design
- The auxiliary (error) variables are defined: e₁ = x₁ − x_d, e₂ = ė₁ = x₂ − ẋ_d.
- The lumped disturbance is estimated by TDE from one-sample-delayed signals:
d̂(t) = ẋ₂(t − t_d) − f₀(x₁(t − t_d), x₂(t − t_d)) − g₀(x₁(t − t_d)) u(t − t_d)
- The nominal control signal is designed from the inverse nominal dynamics, u_n = g₀(x₁)⁻¹ [ẍ_d − f₀(x₁, x₂) − (surface-stabilizing terms in e₁, e₂)].
- The control law can then be rewritten as u(t) = u_n + u_s − g₀(x₁)⁻¹ d̂.
- Proof: see Appendix 3.1 and 3.4-3.5.
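The TDE idea is simple enough to demonstrate numerically: the disturbance estimate is whatever part of the most recent measured acceleration the nominal model and input do not explain. A minimal sketch on an illustrative second-order system (the dynamics, gains, and disturbance here are assumptions, not the thesis values):

```python
import math

def run_tde(steps=2000, dt=1e-3):
    """Track a slowly varying disturbance with one-sample-delayed TDE.

    Returns the final estimation error |d_hat - d_true|.
    """
    f0 = lambda x, v: -2.0 * x - 1.0 * v           # nominal dynamics (assumed)
    g0 = 1.0
    d_true = lambda t: 0.5 + 0.2 * math.sin(2 * math.pi * t)
    x = v = 0.0
    u_prev = f0_prev = a_prev = 0.0
    d_hat = 0.0
    for i in range(steps):
        t = i * dt
        # TDE: delayed acceleration minus delayed nominal model and input
        d_hat = a_prev - f0_prev - g0 * u_prev
        u = (-d_hat - 5.0 * x - 3.0 * v) / g0      # feedback + disturbance cancellation
        f0_now = f0(x, v)
        a = f0_now + g0 * u + d_true(t)            # true plant acceleration
        f0_prev, u_prev, a_prev = f0_now, u, a
        x += v * dt
        v += a * dt
    return abs(d_hat - d_true((steps - 1) * dt))
```

Because the estimate lags by one sample, its error is bounded by how much the disturbance can change over one period, which is why TDE works well for slowly varying lumped uncertainties and motivates the adaptive gain handling the residual.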
1 2 3 4 5 6 7 8
20-May-21 | 20Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Experimental validation
Parameters
- Proposed control (ITSMC): η₁ = 3, η₂ = 1.5, η₃ = 2.5, κ₁ = κ₃ = 10, κ₂ = κ₄ = 2, γ₁ = γ₂ = 5/7, k₁ = 2, k₂ = 5.5, α = 9/11; adaptive gain: μ = 10, δ = 0.05, σ = 0.1
- PID control: for case study 3, K_P = 15, K_I = 18, K_D = 0.05; for the other cases, K_P = 35, K_I = 28, K_D = 0.1
Fig. 3.1 Performance response in case study 1
Fig. 3.2 Performance response in case study 2
Case studies
- Case study 1 (no load): x_d (mm) = 8sin(0.4πt)
- Case study 2 (no load): x_d (mm) = 8sin(πt)
- Case study 3 (no load): a multistep trajectory
- Case study 4 (with load): x_d (mm) = 8sin(πt), in turn with a 2-kg and a 7-kg load
1 2 3 4 5 6 7 8
20-May-21 | 21Presenter Cong Phat Vo
Fig. 3.3 Performance response in case study 3
Fig. 3.4 Performance response in case study 4
Table 3.1 A quantitative comparison between the tracking performances
Trajectory                       | Metric |    PID |  ITSMC | Proposed
Sine 0.2 Hz - 8 mm               | L2[e]  | 0.4102 | 0.1737 |  0.1574
                                 | L2[u]  | 0.9961 | 1.0270 |  0.9602
Sine 1 Hz - 8 mm                 | L2[e]  | 1.1845 | 0.4996 |  0.3788
                                 | L2[u]  | 1.1284 | 1.1152 |  1.0900
Multistep                        | L2[e]  | 0.8587 | 0.6846 |  0.6368
                                 | L2[u]  | 0.7670 | 0.7333 |  0.7247
Sine 1 Hz - 8 mm with 2-kg load  | L2[e]  | 1.2578 | 0.8314 |  0.4779
                                 | L2[u]  | 1.1414 | 1.1506 |  1.1287
Sine 1 Hz - 8 mm with 7-kg load  | L2[e]  | 1.5937 | 1.0827 |  0.5067
                                 | L2[u]  | 1.1613 | 1.1729 |  1.1584
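Table 3.1 scores each run with an L2[·] metric on the error and control signals. A plausible reading of that notation (an assumption about the thesis' convention) is a root-mean-square value over the sampled run:

```python
import math

def l2_norm(samples):
    """RMS-style L2 metric: sqrt((1/N) * sum of squares) over a sampled signal."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))
```

Under this reading, a smaller L2[e] with a comparable L2[u] (as in the "Proposed" column) means better tracking without spending noticeably more control effort.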
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
1 2 3 4 5 6 7 8
20-May-21 | 22Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Summary
- Robust performance against uncertainties and disturbances
- Finite-time convergence of the tracking errors
- Reduced chattering phenomenon
- Elimination of the reaching phase
- Fast transient response
Next, the work is directed toward control using the force properties.
1 2 3 4 5 6 7 8
20-May-21 | 23Presenter Cong Phat Vo
CONTENTS
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
5
6
C
H
A
P
T
E
R
1
7
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Adaptive Finite-Time Force Sensorless Control scheme4
1 2 3 4 5 6 7 8
20-May-21 | 24Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Proposed control idea
1. The proposed FSOB scheme obtains force information with noise and static friction eliminated.
2. The FITSMC guarantees finite-time convergence; a new adaptive gain and the TDE are deployed to handle uncertainties and the chattering phenomenon online.
3. The reliability and superiority of the proposed control are demonstrated in different challenging working conditions.
1 2 3 4 5 6 7 8
20-May-21 | 25Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Control design
- The FITSM surface is defined on the force error e_f: σ(t) = e_f + ∫₀ᵗ [γ₁ e_f + γ₂ sig(e_f)^v] dτ
- An adaptive switching-gain law is driven by the surface and error signals; the lumped disturbance d̂ is estimated by a force-based time-delay estimator, and the force signal itself comes from the friction-free force observer based on the LeD [40].
- The control law combines the disturbance estimate d̂(t), a feedback gain K on the force error, and sig(·)^v switching terms on the surface.
- Proof: see Appendix 4.1-4.3.
[Control block diagram: reference trajectory and environment configuration → integral terminal sliding surface and control law (with adaptive law) → APAM system; the plant-plus-environment block feeds a force observer whose output closes the loop, and the force-based time-delay estimator supplies d̂.]
1 2 3 4 5 6 7 8
20-May-21 | 26Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Experimental validation
Parameters
- Proposed control (ITSMC): η₁ = 3, η₂ = 1.5, η₃ = 2.5, κ₁ = κ₃ = 10, κ₂ = κ₄ = 2, γ₁ = γ₂ = 5/7, k₁ = 2, k₂ = 5.5, v = 9/11; adaptive gain: μ = 10, δ = 0.05, σ = 0.1
- PID control: for scenario 3, K_P = 15, K_I = 18, K_D = 0.05; for the other scenarios, K_P = 35, K_I = 28, K_D = 0.1
- Observers: for the FSOB, ω = 500 rad/s, ξ = 0.001, α₁ = 20, α₂ = 12, α₃ = 5, α₄ = 15
Scenarios
- Scenario 1: τ_d (N) = 7.5sin(0.2πt − π/2) + 6.25
- Scenario 2: τ_d (N) = 7.5sin(0.8πt − π/2) + 6.25
- Scenario 3: a multistep trajectory with an LPF bandwidth of 0.2 rad/s
Fig. 4.1 Performance response in Scenario 1
Fig. 4.2 Performance response in Scenario 2
1 2 3 4 5 6 7 8
20-May-21 | 27Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Fig. 4.3 Performance response in Scenario 3
Fig. 4.4 Control gain
Table 4.1 An RMSE comparison of the tracking error performances
Trajectory  |    PID |   ISMC | FITSMC-TDE | Proposed
Sine 0.1 Hz | 2.1050 | 1.2152 |     0.9440 |   0.5158
Sine 0.4 Hz | 3.2054 | 1.8045 |     2.0632 |   1.9807
Multistep   | 2.6508 | 2.3548 |     2.0618 |   1.7685
Experimental validation
Discussion
- Improved control performance (fast response, high accuracy, and robustness)
- Guaranteed finite-time convergence
- Online cancellation of the lumped uncertainties via the TDE solution and the switching gain in the FITSMC
1 2 3 4 5 6 7 8
20-May-21 | 28Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
4
6
C
H
A
P
T
E
R
1
7
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Model-free control for Optimal contact force in Unknown Environment5
1 2 3 4 5 6 7 8
20-May-21 | 29Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Proposed control idea
1. The RL method is used to learn the optimal desired force online.
2. The stability of the closed-loop control is theoretically demonstrated by the Lyapunov approach.
3. Effectiveness for robot interaction control in unknown environments is verified.
1 2 3 4 5 6 7 8
20-May-21 | 30Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Control design
- The environment is modeled as a spring-damper system [81]: F_e(t) = −C_e ẋ(t) + K_e (x_e − x(t))
- The optimal impedance model maps the impedance error e_r to the desired force, Z(e_r(t)) = F_d(t); the force control realizes M ë_r(t) + C ė_r(t) + K e_r(t) together with a sig(·)^v robust term and the contact force −F_e(t), yielding the linear error dynamics ė_r(t) = A e_r(t) + B F_d(t)
- The position control in task space is a PI law on the force error: x_f = K_Pf e_f(t) + K_If ∫₀ᵗ e_f(τ) dτ
[Block diagram: a task-specific outer loop (prescribed impedance model with RL-based approximation ϕ of the desired force F_d) wrapped around a robot-specific inner loop (precise position control of the APAM system in contact with the environment).]
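The two model pieces above, the spring-damper environment and the linear impedance-error dynamics, can be checked with a few lines of simulation. The numerical values C_e = 0.5, K_e = 2, A = −4.006, B = −0.002 follow the chapter's parameter list with decimal points restored, which is an assumption about the garbled transcript:

```python
def env_force(x, xdot, Ce=0.5, Ke=2.0, xe=35.0):
    """Spring-damper environment model F_e = -Ce*xdot + Ke*(xe - x)."""
    return -Ce * xdot + Ke * (xe - x)

def rollout_error(e0=1.0, Fd=0.0, A=-4.006, B=-0.002, dt=1e-3, steps=3000):
    """Euler rollout of the impedance-error dynamics e_r' = A*e_r + B*F_d."""
    e = e0
    for _ in range(steps):
        e += (A * e + B * Fd) * dt
    return e
```

With A < 0 the unforced error decays exponentially, which is the property the discounted LQR/RL formulation on the next slides exploits when shaping F_d.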
20-May-21 | 31Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Implementation procedure — LQR solution (offline)
The main objective of this work is to determine the optimal desired force F_d, i.e., the minimum force that minimizes the position error e_r: F_d(t) = κ_e e_r(t)
- The discounted cost function: J(e_r(t), F_d(t)) = ∫ₜ^∞ e^{−γ(τ−t)} (e_rᵀ S e_r + F_dᵀ R F_d) dτ
- The algebraic Riccati equation is solved offline for P, and the feedback control gain is κ_e = −(Bᵀ P B + R)⁻¹ Bᵀ P A
This offline solution requires knowledge of the environment model.
20-May-21 | 32Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Implementation procedure — RL solution
The LQR feedback F_d(t) = κ_e e_r(t) cannot be computed for all time without the environment model; instead, an online solution uses off-policy RL to solve the new ARE.
- The discounted cost function: V_s(e_r(t)) = ∫ₜ^∞ e^{−γ(τ−t)} r(e_r(τ), F_d(τ)) dτ, where r(e_r, F_d) = e_rᵀ S e_r + F_dᵀ R F_d
- The Hamilton-Jacobi-Bellman equation: min_{F_d} [ V̇_s(e_r(t)) − γ V_s(e_r(t)) + r(t) ] = 0
- The new algebraic Riccati equation: e_rᵀ(t)(Aᵀ P + P A + S − γP) e_r(t) + 2 e_rᵀ(t) P B F_d(t) + F_dᵀ(t) R F_d(t) = 0
- With the Q-value function Q(e_r(t), F_d(t)) = V_s(e_r(t)) = Xᵀ H X, the optimal desired force is F_d(t) = −H₂₂⁻¹ H₂₁ e_r(t)
20-May-21 | 33Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Implementation procedure — IRL approximation
The optimal desired force F_d(t) = κ_e e_r(t) is learned by minimizing the temporal-difference error.
- The Q-value function approximator: Q̂(e_r, F_d) = Σᵢ θᵢ φᵢ(e_r, F_d), i = 1N
- Gaussian basis functions, normalized over the N kernels: φᵢ(e_r, F_d) = exp(−½ ‖(e_r − c_i^e, F_d − c_i^F)‖²) / Σⱼ exp(−½ ‖(e_r − c_j^e, F_d − c_j^F)‖²)
- The reward (utility) function: r(t) = e_rᵀ(t) S e_r(t) + F_dᵀ(t) R F_d(t)
- The Q-function update law based on IRL: Q(e_r(t), F_d(t)) = ∫ₜ^{t+Δt} e^{−γ(τ−t)} r(τ) dτ + e^{−γΔt} Q(e_r(t + Δt), F_d(t + Δt))
- The parameter update law: gradient descent on the squared temporal-difference error with respect to θ.
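The normalized-Gaussian approximator above is straightforward to implement. A minimal sketch with illustrative centers and width (the thesis' kernel placement and β are not specified in the transcript):

```python
import math

def rbf_features(er, fd, centers, width=1.0):
    """Normalized Gaussian basis over (e_r, F_d); features sum to 1."""
    raw = [math.exp(-0.5 * ((er - ce) ** 2 + (fd - cf) ** 2) / width ** 2)
           for ce, cf in centers]
    s = sum(raw)
    return [r / s for r in raw]

def q_hat(er, fd, theta, centers):
    """Q-value approximation: weighted sum of normalized basis functions."""
    return sum(t * p for t, p in zip(theta, rbf_features(er, fd, centers)))
```

Normalization makes the approximator an interpolation between the θᵢ values, so the temporal-difference update nudges only the weights of kernels near the visited (e_r, F_d) pair.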
20-May-21 | 34Presenter Cong Phat Vo
Fig. 5.1 RL in unknown environments during the learning process
Fig. 5.2 Learning curves
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Parameters
- LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position/force control: K_Pf = 0.001, K_If = 0.02, λ = 0.2, ηΔ = 0.1, ρ₁ = 0.5, ρ₂ = 20, v = 9/11, α = 5/7
- Environment: C_e = 0.5 N·s/mm, K_e = 2 N/mm, r₀ = 165 mm, x_d = 39.917 mm, x_e = 35 mm
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 35Presenter Cong Phat Vo
Fig. 5.3 Desired force/position tracking response
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Validation results
- LQR solution: three steps are required to complete the experiment: (1) identification of the environment; (2) obtaining the ARE solution using the environmental estimate; (3) the complete experiment for the position/force controller.
- IRL approximation: achieves near-optimal performance without knowledge of the environment dynamics.
1 2 3 4 5 6 7 8
20-May-21 | 36Presenter Cong Phat Vo
CONTENTS — next: Chapter 6, Optimized Human-Robot Interaction force control using IRL
1 2 3 4 5 6 7 8
20-May-21 | 37Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Proposed control idea
1. Find the optimal parameters of the impedance model to assure motion tracking and assist the human with minimum effort to perform the task in real time.
2. The stability of the closed-loop control is theoretically demonstrated.
3. The unknown human dynamics, as well as the optimal overall human-robot system efficiency, are verified in experimental trials.
1 2 3 4 5 6 7 8
20-May-21 | 38Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design — a task-specific outer loop around a robot-specific inner loop
- The control torque in task space combines a PD-type tracking term on the error e_h, an adaptive NN compensation of the unknown nonlinearities, and robust terms with gains K_s, K_q, K_B.
- An approximation of the continuous function uses an NN, τ̂_h = Ŵᵀ φ(q), with update laws for Ŵ driven by the tracking error plus a σ-modification (leakage) term.
- The prescribed impedance model: θ_m(s) = K_h τ_h(s) / (M_d s² + C_d s + K_d)
[Block diagram: Human operator → haptic interface based on APAM → torque control law with NN functional approximation; an IRL-based optimal law tunes the prescribed impedance model using the errors e_h and e_d.]
Control design
1 2 3 4 5 6 7 8
20-May-21 | 39Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design — IRL based on the LQR solution
- The human transfer function G(s) [99] maps the interaction torque τ_h to the desired error e_d (a first-order model with gains κ_p, κ_d and k_e), giving human-dynamics state equations ė_h = A_h e_h + B_h e_d with A_h, B_h built from k_e and κ_p.
- A new augmented state X collects the impedance-model error and the human state: Ẋ = AX + Bu, with the feedback control u = −KX.
1 2 3 4 5 6 7 8
20-May-21 | 40Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design — RL and IRL solutions
- The performance index (infinite-horizon quadratic cost) to minimize: J = ∫ₜ^∞ (e_dᵀ Q_m e_d + e_hᵀ Q_h e_h + uᵀ R u) dτ
- With the augmented state Ẋ = AX + Bu, the optimal feedback control is u* = argmin_u V(t, X(t), u) = −R⁻¹ Bᵀ P X, where P is the real symmetric positive-definite solution of the algebraic Riccati equation [83]: 0 = Aᵀ P + P A + Q − P B R⁻¹ Bᵀ P, or equivalently (A − BK)ᵀ P + P(A − BK) = −(Q + Kᵀ R K)
- The IRL Bellman equation for a time interval ΔT > 0:
V(X(t)) = ∫ₜ^{t+ΔT} Xᵀ (Q + Kᵀ R K) X dτ + V(X(t + ΔT))
which, with V(X(t)) = Xᵀ P X, can be written as
X(t)ᵀ P X(t) = ∫ₜ^{t+ΔT} Xᵀ (Q + Kᵀ R K) X dτ + X(t + ΔT)ᵀ P X(t + ΔT)
- Proof of convergence: see Appendix 6.2.
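The IRL Bellman identity can be verified numerically without ever using A and B inside the evaluation, which is the whole point of the integral form. A minimal scalar sketch with illustrative numbers (not the thesis' gains):

```python
def bellman_residual(A=-1.0, B=0.5, K=2.0, Q=1.0, R=0.5, x0=1.0, DT=0.5, dt=1e-5):
    """Check x0^2*P == integral of running cost over [0, DT] + x(DT)^2*P
    for the closed loop xdot = (A - B*K)x. Returns the absolute residual."""
    Ac = A - B * K
    P = (Q + K * K * R) / (-2.0 * Ac)   # scalar Lyapunov solution for gain K
    x, cost = x0, 0.0
    for _ in range(int(DT / dt)):        # integrate cost and state over [0, DT]
        cost += (Q + K * K * R) * x * x * dt
        x += Ac * x * dt
    return abs(x0 * x0 * P - (cost + x * x * P))
```

The residual is only the numerical integration error: measuring the running cost along the trajectory plus the discounted tail reproduces Xᵀ P X exactly, so P can be identified from data alone.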
20-May-21 | 41Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig. 6.1 Continuous-time linear policy iteration algorithm (online implementation)
1. Initialization: start with an admissible control input u⁰ = −K⁰X (P⁰ = 0, i = 1, with K¹ such that A − BK¹ is stable).
2. Policy evaluation: given a control policy uⁱ, find Pⁱ from the off-policy IRL Bellman equation
X(t)ᵀ Pⁱ X(t) = ∫ₜ^{t+ΔT} Xᵀ (Q + (Kⁱ)ᵀ R Kⁱ) X dτ + X(t + ΔT)ᵀ Pⁱ X(t + ΔT)
solving for the cost with least squares over N sampled intervals.
3. Policy improvement: update the control policy u^{i+1} = −R⁻¹ Bᵀ Pⁱ X.
4. Repeat (i → i + 1) until ‖pⁱ − p^{i−1}‖ falls below a tolerance.
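The evaluate-then-improve loop in Fig. 6.1 is Kleinman-style policy iteration. A minimal scalar sketch of that loop, here solving the Lyapunov step in closed form rather than from data; the numbers reuse the chapter-5 parameter list (A = −4.006, B = −0.002, S = 1, R = 10) purely as an illustration:

```python
def policy_iteration(A=-4.006, B=-0.002, Q=1.0, R=10.0, K0=0.0, iters=20):
    """Scalar policy iteration: evaluate P for gain K, then improve K = B*P/R.

    Converges to the positive solution of 2*A*P + Q - P^2*B^2/R = 0.
    """
    K = K0                                # K0 = 0 is admissible since A is stable
    P = 0.0
    for _ in range(iters):
        Ac = A - B * K
        P = (Q + K * K * R) / (-2.0 * Ac)  # policy evaluation (scalar Lyapunov eq.)
        K = (B * P) / R                    # policy improvement
    return P, K
```

Each evaluation step would, in the online version, be replaced by the least-squares fit of Pⁱ to sampled Bellman intervals, so the iteration never needs the A matrix explicitly.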
20-May-21 | 42Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Validation results — simulation
Fig. 6.2 The tracking error between the trajectory of the robot system and the prescribed impedance model
Fig. 6.3 Reference trajectory versus the state of the robot system during and after learning
Parameters
- LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: K_s = 20, F = 10, G = 1, μ₁ = μ₂ = 5, k = 0.1, Z̄ = 100, κ_p = 20, κ_d = 10, k_e = 0.18 s
- Impedance model: θ_d = 10sin(0.2πt), M_d = 1 kg, B_d = 718 N·s/mm, K_d = 1576 N/mm, K_h = 0.046
1 2 3 4 5 6 7 8
20-May-21 | 43Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Validation results — experiment
Fig. 6.4 The output trajectory and the impedance model in the inner loop
Fig. 6.5 The desired trajectory and the output trajectory in the outer loop
Fig. 6.6 Interaction force
Discussion
- Near-perfect performance of the human-robot cooperation
- A small deviation between the desired and actual trajectories
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller
1 2 3 4 5 6 7 8
20-May-21 | 44Presenter Cong Phat Vo
CONTENTS — next: Chapter 7, Force Sensorless Reflecting Control for BHTS
1 2 3 4 5 6 7 8
20-May-21 | 45Presenter Cong Phat Vo
1 2 3 4 5 6 7 8
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
1 2 3 4 5 6 7 8
20-May-21 | 6Presenter Cong Phat Vo
1 INTRODUCTION
Overview of the BHTS structure
[Figure 1.1 The bilateral haptic teleoperation architecture: the operator applies force f_h to the master haptic interface; master and slave controllers exchange position/force signals (x_m, x_s, w_m, w_sd) over the communication channel; the slave teleoperator contacts the remote environment (force f_e), observed by a vision sensor and displayed on the operator's monitor]
Perception: it refers to the operator's feeling of being there. Ideally, the operator can distinguish the remote environment; presence is correlated with task performance in a positive causal way, implying that by improving the feeling of presence, task performance is improved as well.
20-May-21 | 7Presenter Cong Phat Vo
1 INTRODUCTION
Overview of the BHTS structure
[Figure 1.1 The bilateral haptic teleoperation architecture (repeated)]
Transparency: compared to perception, it is a quantitative objective. Transparency and robustness are conflicting objectives (a trade-off). A measure of transparency is fidelity, which describes the ability to exactly display the environment to the operator.
20-May-21 | 8Presenter Cong Phat Vo
1 INTRODUCTION
Overview of the BHTS structure
[Figure 1.1 The bilateral haptic teleoperation architecture (repeated)]
Design challenges (environment condition, operator's characteristics, communication channel) are classified against the criteria of transparency, perception, performance, and robustness; these criteria are evaluated and achieved through approaches such as Variable Impedance Control [5, 6], Model-mediated Teleoperation [40], Movement Estimation [78, 79], Passivity [27], and Prediction [39].
20-May-21 | 9Presenter Cong Phat Vo
1 INTRODUCTION
PAM actuator
Fig. 1.2 (a) Structure and (b) working principle of the PAM
[Fig. 1.2: a PAM consists of a rubber hose wrapped in loose-weave fibers with fitting connections; in the pressurized state the muscle contracts relative to the un-pressurized state, producing a stroke]
Applications: 2-joint manipulator [16, 17], forceps manipulators [20], KNEXO robot [22], wearable haptic device [23]
20-May-21 | 10Presenter Cong Phat Vo
1 INTRODUCTION
Research Objectives
01. Propose a position tracking controller: an integral terminal-style SMC is designed with the combination of an adaptive gain and the TDE technique.
02. Propose a finite-time force controller: an adaptive-gain fast ITSMC-TDE scheme is established with a friction-free disturbance technique, in finite time.
03. Propose model-free control for the slave-environment interaction: an RL approximation method learns the (minimal) optimal contact force online.
04. Propose optimal impedance for the human-master interaction: IRL optimizes the prescribed impedance model parameters to assist with minimum human effort.
05. Propose force sensorless reflecting control for bilateral haptic teleoperation: a new AFOB estimates the force signals, achieving great transparency (position and force feedback).
20-May-21 | 11Presenter Cong Phat Vo
CONTENTS (transition slide; next: Chapter 2, Modeling and platform setup based on the APAM configuration)
20-May-21 | 12Presenter Cong Phat Vo
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Mathematical modeling
The tensile force of each PAM [7], for a muscle of length L, thread length b, and n thread turns under inner pressure P:
F = −P·dV/dL = P(3L² − b²)/(4πn²)
Fig. 2.1 A cross-sectional view of the simplified geometric model (the thread of length b wraps the hose of diameter D in n turns; unrolled, the turns span nπD)
The motion equation of the system
M·ẍ + B·ẋ = −F₁ + F₂
Table 2.1 Parameters of the studied PAM
Ps, supply pressure: 6.5 bar
B, viscous damping coefficient: 50 Ns/mm
M, mass of the plate: 0.5 kg
L0, original length of the PAM: 175 mm
n, number of turns of the thread: 104
b, thread length: 200 mm
The motion equation of the system in control-affine form:
ẍ = f₀(x) + g₀(x)·u + d
where f₀(x) and g₀(x) are the nominal values of f(x) and g(x), built from the PAM force terms 3(L₀ ∓ x)² − b² over 4πn²M and the damping term −Bẋ/M, and d lumps the model mismatch (f − f₀) + (g − g₀)·u together with the external disturbances.
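The control-affine form above can be exercised with a quick numerical sketch. The nominal terms f0, g0 and the disturbance below are illustrative scalar stand-ins, not the identified PAM model:

```python
# Minimal sketch of x_ddot = f0(x) + g0(x)*u + d, integrated with forward Euler.
def simulate(u_of_t, f0, g0, d_of_t, x0=0.0, v0=0.0, dt=1e-3, t_end=20.0):
    x, v, t = x0, v0, 0.0
    while t < t_end:
        a = f0(x, v) + g0(x) * u_of_t(t) + d_of_t(t)   # acceleration x_ddot
        v += a * dt
        x += v * dt
        t += dt
    return x

f0 = lambda x, v: -4.0 * x - 1.0 * v    # assumed nominal stiffness/damping
g0 = lambda x: 2.0                      # assumed input gain
d = lambda t: 0.1                       # constant lumped disturbance
u = lambda t: 1.0                       # constant pressure command

x_final = simulate(u, f0, g0, d)        # settles near (2.0*1.0 + 0.1)/4.0
```

With these numbers the equilibrium satisfies 0 = −4x + 2·1 + 0.1, i.e. x ≈ 0.525, which the simulation approaches.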
20-May-21 | 13Presenter Cong Phat Vo
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Experimental modeling
The approximated PAM model, based on its pressure and contraction strain, uses pressure-dependent spring, damping, and contractile-force elements:
M·ẍ_k + b_k(P_k)·ẋ_k + k_k(P_k)·x_k = f_k(P_k),  (k = 1, 2)
where
b_k(P_k) = b_{k1}·P_k + b_{k0}
k_k(P_k) = k_{k1}·P_k + k_{k0}
f_k(P_k) = f_{k1}·P_k + f_{k0}
Table 2.2 List of APAM model parameters
Spring elements: k_{k1} = 11732 N/m/bar; k_{k0} = 68 N/m
Damping elements: b_{k1} = 962 Ns/m/bar; b_{k0} = 0.85 Ns/m
Force elements: f_{k1} = 9122 N/bar; f_{k0} = 473 N
[Fig. 2.3 Experimental model for certain separate pressures: measured force (0-500 N) versus contraction length (mm) at several pressures (bar)]
[Fig. 2.2 Block diagram of the experimental system: two PAMs (PAM 1, PAM 2) drive a plate and load via a pulley; proportional pressure regulator valves receive the control signal through a PCI-1711 card, pressure signals return through the same card, and a linear encoder feeds the position signal back through a PCI-QUAD04 card]
20-May-21 | 14Presenter Cong Phat Vo
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Platform setup
Fig. 2.5 The APAM-based (a) master and (b) slave subsystem configuration
[(a) Master: pneumatic muscle actuators and a pulley drive a control stick, with an optical encoder, loadcells, an interactive loadcell, and proportional pressure regulator valves (front and back views). (b) Slave: an APAM system (PAM and pulley) contacts a spring-damper environment through a loadcell; a proportional valve, a control box, and a PC with PCI cards exchange position, control, stiffness/damping, and contact-force signals]
20-May-21 | 15Presenter Cong Phat Vo
2 MODELING OF THE BILATERAL TELEOPERATION SYSTEM
[Fig. 2.6 Photograph of the experimental apparatus of the overall BHTS: master robot with control stick, slave robot, and environment]
Table 2.3 Specifications of the experimental devices
PAM: Festo MAS-10-200N-AA-MC-O; nominal length 200 mm; inside diameter 10 mm; max pressure 8 bar
PPR valve: Festo VPPM-8L-L-1-G14-0L8H; max pressure 8 bar
DAQ cards: ADVANTECH PCI-1711 (AI/AO, 12-bit resolution); MEASUREMENT PCI-QUAD04
Linear encoder: type Rational WTB5-0500MM; resolution 0.005 mm
Platform setup
20-May-21 | 16Presenter Cong Phat Vo
CONTENTS (transition slide; next: Chapter 3, Adaptive Gain ITSMC with TDE scheme for the position control)
20-May-21 | 17Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Proposed control idea
1. For the first time, the AITSMC-TDE scheme is proposed for a PAM system.
2. The stability of the controlled system is theoretically analyzed.
3. The control effectiveness is verified by experimental trials in different challenging working conditions (comparing several controllers).
20-May-21 | 18Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
A terminal sliding surface is chosen:
σ = e₂ + k₁·e₁ + k₂·e₁^{γ₁}
A proposed integral sliding surface is designed:
s(t) = σ(t) − σ(t₀) − ∫_{t₀}^{t} (k₁·e₂ + k₂·γ₁·e₁^{γ₁−1}·e₂ + f₀(x) + g₀(x)·u_n − ẍ_d) dτ
A robust control signal can be redesigned with the adaptive gain η̂₃:
u_s(t) = −g₀⁻¹(x)·η̂₃·sign(s)
An adaptive law for the robust gain:
dη̂₃/dt = μ·|s| when |s| > δ, and η̂₃ is held (or reset to σ) otherwise
Proof: see Appendix 3.2, 3.3.
[Block diagram of the proposed control approach: the reference trajectory (x_{1d}, x_{2d}) passes through a coordinate transfer into the integral terminal sliding surface; the nominal control signal u_n and robust control signal u_s (tuned by the adaptation law for η̂₃) form the control law u for the PAM system, while the time-delay estimator feeds the disturbance estimate d̂ back from the measured states (x₁, x₂)]
Control design
20-May-21 | 19Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
The auxiliary variables are defined as
e₁ = x₁ − x_{1d},  e₂ = ė₁ = x₂ − ẋ_{1d}
The nominal control signal can be designed as
u_n = g₀⁻¹(x)·(ẍ_{1d} − f₀(x₁, x₂) − k₁·e₂ − k₂·γ₁·e₁^{γ₁−1}·e₂)
The time-delay estimation of the lumped disturbance uses signals delayed by t_d:
d̂(t) = ẍ(t − t_d) − f₀(x(t − t_d)) − g₀(x(t − t_d))·u(t − t_d)
The control law can be rewritten as
u(t) = u_n + u_s − g₀⁻¹(x)·d̂
Proof: see Appendix 3.1, 3.4-3.5.
[Block diagram of the proposed control approach (repeated from the previous slide)]
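The TDE idea (recover the lumped disturbance from one-sample-delayed acceleration and input) can be sketched numerically. The plant and nominal model below are illustrative scalars, not the PAM dynamics:

```python
# Sketch of time-delay estimation: d_hat(t) = a(t - t_d) - f0(t - t_d) - g0*u(t - t_d),
# where a is the measured acceleration and t_d is one sample period.
def run_tde(steps=2000, dt=1e-3):
    f0 = lambda x, v: -4.0 * x - 1.0 * v    # assumed nominal model
    g0 = 2.0
    d_true = lambda t: 0.5                  # unknown constant disturbance
    x = v = 0.0
    u_prev = a_prev = x_prev = v_prev = 0.0
    d_hat = 0.0
    for k in range(steps):
        t = k * dt
        # TDE: use the one-sample-delayed acceleration, states, and input
        d_hat = a_prev - f0(x_prev, v_prev) - g0 * u_prev
        u = 0.0                             # open loop; TDE works regardless
        a = f0(x, v) + g0 * u + d_true(t)   # true plant acceleration
        u_prev, a_prev, x_prev, v_prev = u, a, x, v
        v += a * dt
        x += v * dt
    return d_hat

estimate = run_tde()   # recovers the unknown disturbance 0.5
```

Because the delayed acceleration already contains the disturbance, subtracting the nominal terms isolates it exactly for slowly varying d.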
20-May-21 | 20Presenter Cong Phat Vo
Parameters
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Experimental validation
Proposed control: for ITSMC, η1 = 3, η2 = 1.5, η3 = 2.5, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1 = γ2 = 5/7, k1 = 2, k2 = 5.5, α = 9/11; for the adaptive gain, μ = 10, δ = 0.05, σ = 0.1
PID control: for case study 3, KP = 15, KI = 18, KD = 0.05; for the other cases, KP = 35, KI = 28, KD = 0.1
Fig. 3.1 Performance response in case study 1
Fig. 3.2 Performance response in case study 2
Case study 1 (no load): xd (mm) = 8sin(0.4πt)
Case study 2 (no load): xd (mm) = 8sin(πt)
Case study 3 (no load): a multistep trajectory
Case study 4 (with load): xd (mm) = 8sin(πt), in turn with a 2-kg load and a 7-kg load
20-May-21 | 21Presenter Cong Phat Vo
Fig. 3.3 Performance response in case study 3
Fig. 3.4 Performance response in case study 4
Table 3.1 A quantitative comparison between the tracking performances (PID / ITSMC / Proposed):
Sine 0.2 Hz, 8 mm: L2[e] = 0.4102 / 0.1737 / 0.1574; L2[u] = 0.9961 / 1.0270 / 0.9602
Sine 1 Hz, 8 mm: L2[e] = 1.1845 / 0.4996 / 0.3788; L2[u] = 1.1284 / 1.1152 / 1.0900
Multistep: L2[e] = 0.8587 / 0.6846 / 0.6368; L2[u] = 0.7670 / 0.7333 / 0.7247
Sine 1 Hz, 8 mm with 2-kg load: L2[e] = 1.2578 / 0.8314 / 0.4779; L2[u] = 1.1414 / 1.1506 / 1.1287
Sine 1 Hz, 8 mm with 7-kg load: L2[e] = 1.5937 / 1.0827 / 0.5067; L2[u] = 1.1613 / 1.1729 / 1.1584
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Case study 1 (no load): xd (mm) = 8sin(0.4πt)
Case study 2 (no load): xd (mm) = 8sin(πt)
Case study 3 (no load): a multistep trajectory
Case study 4 (with load): xd (mm) = 8sin(πt), in turn with a 2-kg load and a 7-kg load
Experimental validation
20-May-21 | 22Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Summary
Robust performance against uncertainties and disturbances
[Block diagram of the proposed AITSMC-TDE control approach (repeated)]
Finite-time convergence of the tracking errors
Reducing the chattering phenomenon
Eliminating the reaching phase
Fast transient response
Future work will be directed to control using the force properties.
20-May-21 | 23Presenter Cong Phat Vo
CONTENTS (transition slide; next: Chapter 4, Adaptive Finite-Time Force Sensorless Control scheme with TDE)
20-May-21 | 24Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
1. The proposed FSOB scheme obtains force information with noise and static friction eliminated.
2. The FITSMC guarantees finite-time convergence; a new adaptive gain and TDE are deployed to handle uncertainties and the chattering phenomenon online.
3. The reliability and superiority of the proposed control are demonstrated in different challenging working conditions.
Proposed control idea
20-May-21 | 25Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
The FITSM surface is defined on the force tracking error e_f:
s = e_f(t) + ∫₀ᵗ (β₁·e_f + β₂·|e_f|^v·sign(e_f)) dτ
An estimated lumped disturbance is obtained from a force observer based on the LeD concept [40] with a time-delay update:
d̂(t) = d̂(t − t_d) + u_c(t − t_d) − K·τ̇_e(t − t_d)
The control law can be rewritten as
u_c(t) = d̂(t) + K_d·e_f + β₁·e_f + β₂·|e_f|^v·sign(e_f) + η̂·sign(s)
where the switching gain η̂ follows an adaptive law driven by the sliding variable and the force-error dynamics.
[Block diagram of the proposed control approach: the reference trajectory and environment configuration feed the integral terminal sliding surface and adaptive law; the control law u, combining the force-based time-delay estimator output d̂ and the gain k₂, drives the APAM system (plant + environment), and the force observer closes the loop on the estimated torque τ_e]
Proof: see Appendix 4.1-4.3.
Control design
20-May-21 | 26Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Parameters
Proposed control: for ITSMC, η1 = 3, η2 = 1.5, η3 = 2.5, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1 = γ2 = 5/7, k1 = 2, k2 = 5.5, v = 9/11; for the adaptive gain, μ = 10, δ = 0.05, σ = 0.1
PID control: for Scenario 3, KP = 15, KI = 18, KD = 0.05; for the other scenarios, KP = 35, KI = 28, KD = 0.1
Observers: for FSOB, ω = 500 rad/s, ξ = 0.001, α1 = 20, α2 = 12, α3 = 5, α4 = 15
Scenario 1: τd (N) = 7.5sin(0.2πt − π/2) + 6.25
Scenario 2: τd (N) = 7.5sin(0.8πt − π/2) + 6.25
Scenario 3: a multistep trajectory with an LPF bandwidth of 0.2 rad/s
Fig. 4.1 Performance response in Scenario 1
Fig. 4.2 Performance response in Scenario 2
Experimental validation
20-May-21 | 27Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Fig. 4.3 Performance response in Scenario 3
Fig. 4.4 Control gain
Table 4.1 An RMSE comparison of the tracking error performances (PID / ISMC / FITSMC-TDE / Proposed):
Sine 0.1 Hz: 2.1050 / 1.2152 / 0.9440 / 0.5158
Sine 0.4 Hz: 3.2054 / 1.8045 / 2.0632 / 1.9807
Multistep: 2.6508 / 2.3548 / 2.0618 / 1.7685
Experimental validation
Discussion:
- Improved control performance (fast response, high accuracy, and robustness)
- Guaranteed finite-time convergence
- Cancelled the lumped uncertainties online via the TDE solution and the switching gain in the FITSMC
20-May-21 | 28Presenter Cong Phat Vo
CONTENTS (transition slide; next: Chapter 5, Model-free control for Optimal contact force in Unknown Environment)
20-May-21 | 29Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
1. The RL method is used to learn the optimal desired force online.
2. The stability of the closed-loop control is theoretically demonstrated by the Lyapunov approach.
3. The approach is verified to be effective for robot interaction control in unknown environments.
Proposed control idea
20-May-21 | 30Presenter Cong Phat Vo
The environment is modeled as a spring-damper system [81]:
F_e(t) = −C_e·ẋ(t) + K_e·(x_e − x(t))
The optimal (prescribed) impedance model Z(e_r(t)) = F_d(t) shapes the position error e_r through inertia, damping, and stiffness terms M, C, K, with an additional finite-time correction term with gains ρ₁, ρ₂ and a |·|^v·sign(·) nonlinearity driven by the force error.
The force control (position correction) in task space is a PI law on the force error e_f:
x_f(t) = K_{Pf}·e_f(t) + K_{If}·∫₀ᵗ e_f(τ) dτ
[Control structure: a robot-specific inner loop provides precise position control of the APAM system, while a task-specific outer loop combines the prescribed impedance model, the force control producing the correction x_f, and an RL-based approximation that supplies the desired force F_d from the environment configuration (signals x, x_r, e_f, F_e, F_d)]
Control design
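The outer force loop can be sketched numerically. The environment constants follow the slide (K_e = 2 N/mm, x_e = 35 mm, x_d ≈ 39.917 mm); the PI gains are illustrative, the desired force is a hypothetical 5 N, and the inner position loop is idealized (the commanded position is reached instantly):

```python
# Sketch of the outer force loop: the environment acts as a spring once in
# contact, F_e = K_e*(x - x_e), and a PI correction x_f shifts the position
# command until F_e reaches the desired force F_d.
K_e, x_e = 2.0, 35.0          # environment stiffness (N/mm), rest position (mm)
F_d = 5.0                     # desired contact force (N), illustrative
xd = 39.917                   # nominal position reference (mm)
Kp, Ki, dt = 0.05, 2.0, 0.01  # illustrative PI gains and sample time

x, integ, F_e = xd, 0.0, 0.0
for _ in range(5000):
    F_e = max(0.0, K_e * (x - x_e))      # contact force (0 when not touching)
    e_f = F_d - F_e                      # force tracking error
    integ += e_f * dt
    x = xd + Kp * e_f + Ki * integ       # commanded = reference + correction
```

The integral action drives e_f to zero, so the plate settles where K_e·(x − x_e) = F_d, i.e. x = 37.5 mm for these numbers.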
The tracking-error dynamics in state-space form: ė_r(t) = A·e_r(t) + B·F_d(t)
20-May-21 | 31Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Implementation procedure
The main objective of this work is to determine the optimal desired force F_d, i.e., the minimum force that minimizes the position error e_r:
F_d(t) = κ_e·e_r(t)
LQR solution (offline):
The discounted cost function:
J_s(e_r(t), F_d(t)) = ∫ₜ^∞ γ^{τ−t}·(e_rᵀ·S·e_r + F_dᵀ·R·F_d) dτ
The algebraic Riccati equation:
Aᵀ·P·A + S − P + Aᵀ·P·B·κ_e = 0
The feedback control gain:
κ_e = −(Bᵀ·P·B + R)⁻¹·Bᵀ·P·A
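The offline LQR step can be sketched for a scalar case. The loop below solves the discrete algebraic Riccati equation by fixed-point iteration; A, B, S, R are illustrative numbers (the identified environment values would replace them):

```python
# Value iteration on the discrete-time ARE used for the offline LQR solution:
#   P = S + A'PA - (A'PB)(B'PB + R)^-1 (B'PA),  kappa = -(B'PB + R)^-1 B'PA,
# shown here for scalars.
A, B, S, R = 0.9, 0.1, 1.0, 10.0
P = 0.0
for _ in range(500):
    gain = (A * P * B) / (B * P * B + R)
    P = S + A * P * A - gain * B * P * A

kappa = -(A * P * B) / (B * P * B + R)   # optimal feedback F_d = kappa * e_r
```

At convergence P satisfies the ARE, and the closed-loop pole A + B·kappa lies inside the unit circle.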
20-May-21 | 32Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
RL solution (the offline LQR solution requires the environment model, which is impossible to know for all time):
The discounted cost function:
V_s(e_r(t)) = ∫ₜ^∞ γ^{τ−t}·r(e_r(τ), F_d(τ)) dτ,  where r(e_r, F_d) = e_rᵀ·S·e_r + F_dᵀ·R·F_d
The Hamilton-Jacobi-Bellman equation requires the optimal value function to satisfy
min_{F_d} [ V̇_s(e_r(t)) + r(t) − γ·V_s(e_r(t)) ] = 0
The new algebraic Riccati equation:
e_rᵀ(t)·(Aᵀ·P + P·A + S − γ·P)·e_r(t) + 2·e_rᵀ(t)·P·B·F_d(t) + F_dᵀ(t)·R·F_d(t) = 0
The Q-value function over X = [e_rᵀ F_dᵀ]ᵀ:
Q(e_r(t), F_d(t)) = V_s(e_r(t)) = Xᵀ·H·X
The optimal desired force:
F_d(t) = −H₂₂⁻¹·H₂₁·e_r(t)
The RL solution is online: off-policy RL finds the solution of the given new ARE, yielding F_d(t) = κ_e·e_r(t) without knowing the environment dynamics.
20-May-21 | 33Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
LQR solution RL solution
IRL approximation (minimizing the temporal-difference error):
The approximator of the Q-value function:
Q̂(e_r, F_d) = Σ_{i=1}^{N} ϑ_i·ϕ_i(e_r, F_d)
with Gaussian radial basis functions centered at (c_{ri}, c_{di}),
ϕ_i(e_r, F_d) = exp(−½·[e_r − c_{ri}; F_d − c_{di}]ᵀ·Σ_i⁻¹·[e_r − c_{ri}; F_d − c_{di}])
normalized as ϕ̄_i = ϕ_i / Σ_{j=1}^{N} ϕ_j
The reward (utility) function:
r(t) = e_rᵀ(t)·S·e_r(t) + F_dᵀ(t)·R·F_d(t)
The Q-function update law based on IRL:
Q(e_r(t), F_d(t)) = ∫ₜ^{t+Δt} γ^{τ−t}·r(e_r(τ), F_d(τ)) dτ + γ^{Δt}·Q(e_r(t+Δt), F_d(t+Δt))
The parameter update law adjusts ϑ by a normalized gradient step on the squared temporal-difference error of Q̂; the learned Q-function then yields the optimal desired force F_d(t) = κ_e·e_r(t).
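A minimal sketch of the normalized-RBF Q-function approximator, fitted by stochastic gradient descent against a toy quadratic target standing in for Xᵀ·H·X; the centers, widths, learning rate, and target are illustrative:

```python
# Normalized Gaussian RBF approximation of Q(e_r, F_d) = sum_i theta_i * phi_i.
import math

def features(er, fd, centers, width=1.0):
    raw = [math.exp(-((er - ce) ** 2 + (fd - cf) ** 2) / (2.0 * width ** 2))
           for ce, cf in centers]
    s = sum(raw)
    return [r / s for r in raw]          # normalized RBFs (sum to 1)

def q_hat(er, fd, theta, centers):
    return sum(t * p for t, p in zip(theta, features(er, fd, centers)))

centers = [(e, f) for e in (-1.0, 0.0, 1.0) for f in (-1.0, 0.0, 1.0)]
theta = [0.0] * len(centers)
target = lambda er, fd: er * er + 10.0 * fd * fd   # stands in for X'HX
samples = [(0.1 * i - 1.0, 0.2 * j - 1.0) for i in range(21) for j in range(11)]

def sse(th):
    return sum((q_hat(er, fd, th, centers) - target(er, fd)) ** 2
               for er, fd in samples)

sse_before = sse(theta)
for _ in range(50):                      # SGD epochs on the squared error
    for er, fd in samples:
        phi = features(er, fd, centers)
        err = q_hat(er, fd, theta, centers) - target(er, fd)
        theta = [t - 0.5 * err * p for t, p in zip(theta, phi)]
sse_after = sse(theta)
```

Normalization keeps the gradient step stable (the features form a partition of unity), and the fit error shrinks over the sampled grid.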
20-May-21 | 34Presenter Cong Phat Vo
Fig. 5.1 RL in unknown environments during the learning process
Fig. 5.2 Learning curves
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Parameters
LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
Position/force control: K_Pf = 0.001, K_If = 0.02, λ = 0.2, ηΔ = 0.1, ρ1 = 0.5, ρ2 = 20, v = 9/11, α = 5/7
Environment: C_e = 0.5 Ns·mm⁻¹, K_e = 2 N·mm⁻¹, r0 = 16.5 mm, x_d = 39.917 mm, x_e = 35 mm
Validation results
20-May-21 | 35Presenter Cong Phat Vo
Fig. 5.3 Desired force/position tracking response
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Three steps are required to complete the experiment:
1. Identification of the environment.
2. Obtaining the ARE solution using the environmental estimate (offline LQR solution).
3. The complete experiment with the position/force controller (IRL approximation), achieving near-optimal performance without knowledge of the environment dynamics.
Validation results
20-May-21 | 36Presenter Cong Phat Vo
CONTENTS (transition slide; next: Chapter 6, Optimized Human-Robot Interaction force control using IRL)
20-May-21 | 37Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
1. Find the optimal parameters of the impedance model to assure motion tracking and to assist the human with minimum effort to perform the task in real time.
2. The stability of the closed-loop control is theoretically demonstrated.
3. The unknown human dynamics, as well as the optimal overall human-robot system efficiency, are verified in experimental trials.
Proposed control idea
20-May-21 | 38Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
The control structure consists of a robot-specific inner loop and a task-specific outer loop.
The control torque in task space combines a K_s-weighted feedback term on the human-interaction error e_h (with gains μ₁, μ₂ on the error and its integral), an NN approximation of the unknown continuous robot function, Φ̂(q) = Ŵᵀ·φ(q), and a robust term K_q·(‖Ẑ‖ + Z̄_B):
τ = K_s·(ė_h + μ₁·e_h + μ₂·∫₀ᵗ e_h dτ) + Ŵᵀ·φ(q) + K_q·(‖Ẑ‖ + Z̄_B)
The NN weights Ŵ follow update laws driven by the tracking error, with gain matrices F, G and a k·‖·‖·Ŵ leakage term.
The prescribed impedance model, from the human torque τ_h to the model trajectory θ_m:
θ_m(s)/τ_h(s) = K_h / (M_d·s² + C_d·s + K_d)
Impedance Model
Haptic interface
based-APAM
Torque
control lawHuman
Operator
Optimal Law-based
IRL
Kh+-
NN functional
approximation
1
s
θ +
-
τ θ d θ m eh
τ h
ξ
ed
Control design
20-May-21 | 39Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
The human transfer function [99], whose purpose is to map τ_h → e_d:
G(s) = (κ_p + κ_d·s) / (k_e·s + 1)
realized in state-space as ξ̇_h = A_h·ξ_h + B_h·e_d, where A_h and B_h are built from k_e⁻¹, κ_p, κ_d and e_d = [e_dᵀ ė_dᵀ]ᵀ.
A new augmented state X stacks the impedance-model error dynamics ė_q = A_q·e_q + B_q·u_e with the human state:
Ẋ = [A_q 0; B_h A_h]·X + [B_q; 0]·u_e,  u_e = −K·X
[Block diagram as on the previous slide: human operator, prescribed impedance model, IRL-based optimal law, and NN-based torque control of the APAM haptic interface]
IRL based on LQR solution
Control design
20-May-21 | 40Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
The infinite-horizon quadratic cost (performance index) to minimize:
J = ∫ₜ^∞ (e_dᵀ·Q_m·e_d + ξ_hᵀ·Q_h·ξ_h + u_eᵀ·R·u_e) dτ
over the augmented state Ẋ = [A_q 0; B_h A_h]·X + [B_q; 0]·u_e with feedback u_e = −K·X.
RL solution: solve the algebraic Riccati equation [83]
0 = Aᵀ·P + P·A + Q − P·B·R⁻¹·Bᵀ·P
for the optimal feedback control
u_e* = argmin_{u_e} V(t, X(t)) = −R⁻¹·Bᵀ·P·X
IRL solution: the value of a policy u_e = −K·X satisfies
V(X(t)) = ∫ₜ^∞ Xᵀ·(Q + Kᵀ·R·K)·X dτ = Xᵀ·P·X
where P is the real symmetric positive-definite solution of
(A − B·K)ᵀ·P + P·(A − B·K) = −(Q + Kᵀ·R·K)
The IRL Bellman equation for a time interval ΔT > 0 can then be written as
Xᵀ(t)·P·X(t) = ∫ₜ^{t+ΔT} Xᵀ·(Q + Kᵀ·R·K)·X dτ + Xᵀ(t+ΔT)·P·X(t+ΔT)
which can be evaluated from trajectory data alone, without knowing A.
Proof of convergence: see Appendix 6.2.
20-May-21 | 41Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig. 6.1 Continuous-time linear policy iteration algorithm (online implementation):
1. Initialization: P⁰ = 0, i = 1; start with an admissible control input u_e¹ = −K¹·X such that A − B·K¹ is stable.
2. Policy evaluation: given the control policy u_eⁱ, find Pⁱ from the off-policy IRL Bellman equation
Xᵀ(t)·Pⁱ·X(t) = ∫ₜ^{t+ΔT} Xᵀ·(Q + (Kⁱ)ᵀ·R·Kⁱ)·X dτ + Xᵀ(t+ΔT)·Pⁱ·X(t+ΔT)
solving for the cost entries pⁱ by least squares over N sampled intervals.
3. Policy improvement: update the control policy as u_e^{i+1} = −R⁻¹·Bᵀ·Pⁱ·X.
4. If ‖pⁱ − p^{i−1}‖ exceeds the tolerance, set i → i+1 and return to step 2; otherwise stop.
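The policy-iteration loop can be sketched on a scalar continuous-time system. For brevity, policy evaluation below uses the model-based Lyapunov equation in closed form, whereas the IRL algorithm above would obtain the same P from trajectory data by least squares; a, b, q, R are illustrative:

```python
# Policy iteration for x_dot = a*x + b*u with cost integral of q*x^2 + R*u^2:
#   evaluation: 2*(a - b*K)*P = -(q + R*K**2)   (scalar Lyapunov equation)
#   improvement: K = b*P / R
import math

a, b, q, R = -1.0, 1.0, 1.0, 1.0
K = 1.0                      # admissible initial policy: a - b*K = -2 < 0
for _ in range(20):
    P = (q + R * K * K) / (2.0 * (b * K - a))   # policy evaluation
    K = b * P / R                               # policy improvement
```

For these numbers the iterates converge (quadratically, as in Kleinman's algorithm) to the LQR gain K* = sqrt(2) − 1.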
20-May-21 | 42Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig. 6.2 The tracking error between the trajectory of the robot system and the prescribed impedance model
Fig. 6.3 Reference trajectory versus the state of the robot system during and after learning
Parameters (simulation)
LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
Position control: K_s = 20, F = 10, G = 1, μ₁ = μ₂ = 5, k = 0.1, Z̄ = 100, κ_p = 20, κ_d = 10, k_e = 0.18 s
Impedance model: θ_d = 10sin(0.2πt), M_d = 1 kg, B_d = 7.18 Ns·mm⁻¹, K_d = 15.76 N·mm⁻¹, K_h = 0.046
Validation results
20-May-21 | 43Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig. 6.4 The output trajectory and impedance model in the inner loop
Fig. 6.5 The desired trajectory and output trajectory in the outer loop
Fig. 6.6 Interaction force
Experiment: validation results and discussion
- Near-perfect performance of the human-robot cooperation, with only a little deviation between the desired and actual trajectories.
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller.
20-May-21 | 44Presenter Cong Phat Vo
CONTENTS (transition slide; next: Chapter 7, Force Sensorless Reflecting Control for BHTS)
20-May-21 | 45Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed.
2. The stability and finite-time convergence characteristics of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified in different working conditions.
Proposed control idea
20-May-21 | 46Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
The external torque estimation: an adaptive force sensorless observer (AFOB) estimates the interaction torque τ̂_{id} from the observer residual, with an adaptive FSOB gain κ_i updated by a piecewise rule (parameters λ_i, η_{im}, η_{ia}) that grows with the residual and saturates near convergence.
The modified FNTSM surface design:
s_i = ε̇_i + λ_i·ε_i + ς_i·ε_i^{p_i/q_i}
with a piecewise substitute z_i for the fractional-power term when s_i = 0 or the error is small, avoiding the singularity.
The torque control design by the FNTSMC-style law:
τ_i = M_i·θ̈_{ir} + C_i·θ̇_{ir} + K_i·θ_{ir} − ρ_{1i}·s_i − ρ_{2i}·|s_i|^{ν_i}·sign(s_i) + τ̂_{id}
[Block diagram of the BHTS: the operator drives the APAM master system (master controller + estimator); the master and slave exchange the trajectories θ_m, θ_s, the references θ_{sd}, and the torque estimates τ̂_{md}, τ̂_{sd}; the APAM slave system (slave controller + estimator) contacts the environment through an impedance filter]
Control design
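The fast finite-time reaching behaviour underlying the FNTSM design can be illustrated by simulating the reaching law ṡ = −ρ₁·s − ρ₂·|s|^v·sign(s) with 0 < v < 1; the gains here are illustrative, not the tuned ρ_{1i}, ρ_{2i}:

```python
# Euler simulation of the fast finite-time reaching law: the linear term gives
# a fast transient far from s = 0, the fractional-power term gives finite-time
# convergence near s = 0 (a purely linear law only converges asymptotically).
def settle_time(rho1=1.0, rho2=1.0, v=9.0 / 11.0, s0=1.0, dt=1e-4, tol=1e-3):
    s, t = s0, 0.0
    while abs(s) > tol and t < 10.0:
        s += dt * (-rho1 * s - rho2 * abs(s) ** v * (1.0 if s > 0 else -1.0))
        t += dt
    return t

t_reach = settle_time()   # finite reaching time, well under the 10 s cap
```

For these gains the theoretical reaching-time bound is (1/(ρ₁(1 − v)))·ln(1 + ρ₁·s₀^{1−v}/ρ₂) ≈ 3.8 s, and the simulation reaches the tolerance sooner.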
20-May-21 | 47Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig. 7.1 Torque estimation performances
Parameters
Proposed control: λ_i = 8, p_i = 5, q_i = 7, ς_i = 10⁻⁴, ν_i = 9/11, ε_i = 0.1, ρ_{1i} = 10, ρ_{2i} = 2
PID control: KP = 38, KI = 25, KD = 0.15
Observers: for NDO, L_h = L_e = 10; for RTOB, β = 350 rad/s; for AFOB, κ_i(0) = 0.1, λ_i = 10, η_{im} = 0.1, η_{ia} = 10
Scenarios: the free-space situation, θ_d(t) = 5π(1 − 0.18t)sin(0.72πt); the contact-and-recovery situation with a soft environment (K_e = 500 N·mm⁻¹, B_e = 4 Ns·mm⁻¹) and a stiff environment (K_e = 2800 N·mm⁻¹, B_e = 20 Ns·mm⁻¹)
The RMSEs associated with the NDO, the RTOB, and the proposed algorithm are calculated as 0.211, 0.192, and 0.076, respectively.
Validation results
20-May-21 | 48Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig. 7.2 Position tracking performances
Fig. 7.3 Position tracking error performances
Fig. 7.4 Transparency performance in "soft" contact
Fig. 7.5 Transparency error in soft contact
Fig. 7.6 Transparency performance in "stiff" contact
Fig. 7.7 Transparency error in stiff contact
Validation results
20-May-21 | 49Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Finite-time convergence of the
tracking errors
Eliminating the uncertainties and the noise effect in the obtained force sensing via the new AFOB
Robust performance against uncertainties
and disturbances in the system dynamics
Stability of the overall system and the
transparency performance can be
attained simultaneously
Fast transient response
Summary
Great transparency performance and global stability of the BHTS are achieved under the proposed control scheme, despite uncertainties and across different working conditions.
20-May-21 | 50Presenter Cong Phat Vo
CONTENTS (transition slide; next: Chapter 8, Conclusion and Future works)
20-May-21 | 51Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions (uncertainties and disturbances are handled).
- Validated the significant effectiveness on finite-time force tracking problems based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding highly effective position control.
- Achieved good transparency performance with both force feedback and position tracking simultaneously, by the new adaptive force estimation combined with separate FNTSMC schemes.
1 2 3 4 5 6 7 8
20-May-21 | 52Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Future works
To improve the control performance in joint-space the robot system model can be
improved by robust identification approaches
Effectiveness of the proposed algorithm can be further increased by developing an automatic
tune method for both the estimation and control gains
Apply optimized strategies on both master and slave sides Note also that the computational
complexity should be reduced by the data size
Based on the developed prototype in this thesis it is possible to develop a high-quality
commercial teleoperation device in the rehabilitation field with a safer solution
1 2 3 4 5 6 7 8
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
1 2 3 4 5 6 7 8
20-May-21 | 7Presenter Cong Phat Vo
Performance
Perception
Transparency
1 INTRODUCTION
Overview of the BHTS structure
Slave control
Master control
Communication channel
Monitor
Operator
Master
Haptic interface
Slave
teleoperator
Remote
environment
Vision
sensor
Figure 11 The bilateral haptic teleoperation architecture
1 2 3 4 5 6 7 8
Robustness: Compared to perception, it is a quantitative objective. Transparency and robustness are conflicting objectives (a trade-off). A measure for transparency is fidelity: it describes the ability to exactly display the environment to the operator.
20-May-21 | 8Presenter Cong Phat Vo
1 INTRODUCTION
Overview of the BHTS structure
Slave control
Master control
Communication channel
Monitor
Operator
Master
Haptic interface
Slave
teleoperator
Remote
environment
Vision
sensor
Figure 11 The bilateral haptic teleoperation architecture
Variable Impedance Control [5 6]
Model-mediated Teleoperation [40]
Movement Estimation [78 79]
Passivity [27], Prediction [39], …
Robustness
Performance
Perception
Transparency
Environment condition
Operator's characteristics
Communication channel
1 2 3 4 5 6 7 8
classify
evaluate
achieve
20-May-21 | 9Presenter Cong Phat Vo
1 INTRODUCTION
PAM actuator
Fig 12 (a) Structure and (b) working principle of the PAM
KNEXO robot [22], Wearable Haptic Device [23], 2-joint manipulator [16, 17], Forceps' manipulators [20]
(a) Structure: rubber hose, loose-weave fibers, fitting connection
(b) Working principle: un-pressured state → pressured state, producing a stroke
1 2 3 4 5 6 7 8
20-May-21 | 10Presenter Cong Phat Vo
1 INTRODUCTION
Research Objectives
Propose Position
Tracking Controller
An integral terminal-style SMC is designed with the combination of an adaptive gain and TDE technique
0102
Propose Model-free Control for
slave-environment interaction03
Propose Force Sensorless reflecting
control for bilateral haptic teleoperation
Propose Finite-Time
Force Controller
An adaptive-gain fast ITSMC-TDE scheme is established with a friction-free disturbance technique in finite time
Using IRL to optimize the prescribed impedance model parameters to assist minimum human effort
Propose optimal impedance for
human-master interaction
05
04 Applying the RL approximation method to learn the optimal (minimal) contact force online
The new AFOB is proposed to estimate the force signals, achieving great transparency (position and force feedback)
1 2 3 4 5 6 7 8
20-May-21 | 11Presenter Cong Phat Vo
CONTENTS
1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme with TDE
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 12Presenter Cong Phat Vo
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Mathematical modeling
[Fig 21 geometry labels: thread of length b wound in n turns, one turn spanning πD, cylinder diameter D, muscle length L]

  F = −P dV/dL = P ( 3L² − b² ) / ( 4πn² )
The tensile force of each PAM [7]
Fig 21 A cross-sectional view of the simplified geometric model
The motion equation of the system
  Mẍ + Bẋ = −F₁ + F₂
Parameter | Description | Value | Unit
Ps | Supply pressure | 6.5 | bar
B | Viscous damping coefficient | 50 | N·s/mm
M | Mass of the plate | 0.5 | kg
L0 | Original length of the PAM | 175 | mm
n | Number of turns of the thread | 104 | –
b | Thread length | 200 | mm
Table 21 Parameters of the studied PAM
The motion equation in state-space form:

  ẍ = f₀(x) + g₀(x) u + d

where f₀(x) and g₀(x) are the nominal values of f(x) and g(x), built from the PAM force relation (terms in 3(L₀ ∓ x)², b², P and Bẋ over 4πn²M), and

  d = ( f(x, ẋ) − f₀(x, ẋ) ) + ( g(x) − g₀(x) ) u

is the total of unknown disturbances and modeling errors.
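The tensile-force relation above can be exercised numerically. A minimal sketch (Python), assuming mm-based lengths with pressure in MPa so the result comes out in newtons; the decimal placement of the Table 21 values (0.65 MPa supply, n = 10.4 turns) is an assumption:

```python
import math

def pam_force(P, L, b, n):
    """McKibben-type PAM tensile force: F = P * (3*L**2 - b**2) / (4*pi*n**2).

    P: gauge pressure in MPa (N/mm^2), L: current muscle length in mm,
    b: thread length in mm, n: number of thread turns -> F in N.
    """
    return P * (3.0 * L**2 - b**2) / (4.0 * math.pi * n**2)

# Table 2.1-style values (decimal placement assumed): Ps = 0.65 MPa,
# L0 = 175 mm, b = 200 mm, n = 10.4 turns.
F_rest = pam_force(P=0.65, L=175.0, b=200.0, n=10.4)
F_contracted = pam_force(P=0.65, L=160.0, b=200.0, n=10.4)
```

The force decreases as the muscle shortens and vanishes at the theoretical maximum contraction L = b/√3, which is the expected PAM behavior.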
1 2 3 4 5 6 7 8
20-May-21 | 13Presenter Cong Phat Vo
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Experimental modeling
The approximated PAM model based on its
pressure and contraction strain
  M_k ẍ + b_k(P_k) ẋ + k_k(P_k) x = f_k(P_k),  k = 1, 2
  b_k(P_k) = b_k1 P_k + b_k0
  k_k(P_k) = k_k1 P_k + k_k0
  f_k(P_k) = f_k1 P_k + f_k0
Parameter | Value | Units
Spring elements: k_k1 | 117.32 | N/(mm·bar)
Spring elements: k_k0 | 6.8 | N/mm
Damping elements: b_k1 | 9.62 | N·s/(mm·bar)
Damping elements: b_k0 | 0.85 | N·s/mm
Force elements: f_k1 | 91.22 | N/bar
Force elements: f_k0 | 4.73 | N
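Using the identified coefficients, the pressure-dependent model above can be simulated directly. A forward-Euler sketch (Python), with the decimal placement of the Table 22 coefficients assumed:

```python
def simulate_pam(P, t_end=2.0, dt=1e-3, M=0.5):
    """Integrate M*x'' + b_k(P)*x' + k_k(P)*x = f_k(P) by forward Euler.

    Coefficients are affine in pressure P [bar]; values follow Table 22
    with assumed decimal points (units: N, mm, s).
    """
    bk = 9.62 * P + 0.85    # damping element  b_k1*P + b_k0
    kk = 117.32 * P + 6.8   # spring element   k_k1*P + k_k0
    fk = 91.22 * P + 4.73   # force element    f_k1*P + f_k0
    x, v = 0.0, 0.0
    for _ in range(int(t_end / dt)):
        a = (fk - bk * v - kk * x) / M
        x, v = x + dt * v, v + dt * a
    return x

x_ss = simulate_pam(P=3.0)  # settles near the static balance f_k(P)/k_k(P)
```

The steady-state contraction equals f_k(P)/k_k(P), so the model reproduces the static force-balance curves of Fig 23.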
[Fig 23 axes: force [N] versus contraction length [mm] at several separate pressures [bar]]
Fig 23 Experimental model for certain separate pressure
Table 22 List of APAM model parameters
[Fig 22 layout: load, PAM 1, PAM 2, plate, pulley, linear encoder, proportional pressure regulator valves, PCI-1711 card, PCI-QUAD04 card; pressure, control, and position signals]
Fig 22 Block diagram of the experimental system
1 2 3 4 5 6 7 8
20-May-21 | 14Presenter Cong Phat Vo
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Platform setup
Fig 25 The APAM-based (a) master and (b) slave subsystem configuration
[(a) Master: proportional pressure regulator valves, optical encoder, control stick, loadcells, pneumatic muscle actuators, interactive loadcell, pulley (front and back views). (b) Slave: proportional valve, loadcell, damper, spring, environment, APAM system, PAM, pulley, PC with PCI cards, control box; position, control, stiffness/damping, and contact-force signals]
1 2 3 4 5 6 7 8
20-May-21 | 15Presenter Cong Phat Vo
2 MODELING OF THE BILATERAL TELEOPERATION SYSTEM
Slave robot
Environment
Master robot
Control stick
Device | Description
PAM | Festo MAS-10-200N-AA-MC-O; nominal length 200 mm; inside diameter 10 mm; max pressure 8 bar
PPR Valve | Festo VPPM-8L-L-1-G14-0L8H; max pressure 8 bar
DAQ Card | ADVANTECH PCI-1711, AI/AO resolution 12 bits; MEASUREMENT PCI-QUAD04
Linear encoder | Type Rational WTB5-0500MM; resolution 0.005 mm
Table 23 Specifications of the exp devices
Fig 26 Photograph of the experimental apparatus of the overview BHTS
Platform setup
1 2 3 4 5 6 7 8
20-May-21 | 16Presenter Cong Phat Vo
CONTENTS
1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme for the position control
4 Adaptive Finite-Time Force Sensorless Control scheme with TDE
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 17Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Proposed control idea
1 For the first time, the AITSMC-TDE scheme is proposed for a PAM system
2 The stability of the controlled system is theoretically analyzed
3 The control effectiveness is verified by experimental trials in different challenging work conditions (comparison with several controllers)
1 2 3 4 5 6 7 8
20-May-21 | 18Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
A terminal sliding surface is chosen:
  s = e₂ + k₁ e₁ + k₂ sig^α( e₁ ),  α = 9/11

A proposed integral sliding surface is designed:
  σ(t) = s(t) − s(0) − ∫₀ᵗ ( k₁ e₂ + α k₂ |e₁|^{α−1} e₂ + f₀( x₁, x₂ ) + g₀( x₁ ) uₙ − ẍ_d ) dτ

A robust control signal can be redesigned:
  u_s(t) = −g₀( x₁ )⁻¹ ( η̂₃ + η₄ ) sign( σ )

An adaptive law for the robust gain:
  η̂̇₃(t) = μ |σ(t)|

Proof: see Appendix 32, 33 for details.
[Block diagram of the proposed control approach: reference trajectory x_d → coordinate transfer → terminal and integral terminal sliding surfaces → nominal control signal u_n and robust control signal u_s with adaptation law η̂₃, summed into u → the PAM system, with the time-delay estimator feeding d̂ back]
Control design
1 2 3 4 5 6 7 8
20-May-21 | 19Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
The control law can be rewritten:
  u(t) = uₙ + u_s − g₀( x₁ )⁻¹ d̂
where the lumped disturbance is estimated by time delay:
  d̂(t) = ẋ₂( t − t_d ) − f₀( x₁( t − t_d ), x₂( t − t_d ) ) − g₀( x₁( t − t_d ) ) u( t − t_d )

Proof: see Appendix 31, 34-35 for details.

Control design

The auxiliary variables are defined:
  e₁ = x₁ − x_d,  e₂ = ė₁ = x₂ − ẋ_d

The nominal control signal can be designed as:
  uₙ = g₀( x₁ )⁻¹ ( ẍ_d − f₀( x₁, x₂ ) − k₁ e₂ − α k₂ |e₁|^{α−1} e₂ )
[Block diagram of the proposed control approach: reference trajectory x_d → coordinate transfer → terminal and integral terminal sliding surfaces → nominal control signal u_n and robust control signal u_s with adaptation law η̂₃, summed into u → the PAM system, with the time-delay estimator feeding d̂ back]
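The time-delay estimation step above can be sketched in isolation: the estimator simply reuses the one-sample-old residual of the nominal model. A scalar sketch (Python; the constant test disturbance is illustrative):

```python
def tde_estimate(x2dot_delayed, f0_delayed, g0, u_delayed):
    """Time-delay estimation of the lumped disturbance:
    d_hat(t) = x2dot(t - td) - f0(x(t - td)) - g0 * u(t - td),
    i.e. whatever the nominal model failed to explain one sample ago."""
    return x2dot_delayed - f0_delayed - g0 * u_delayed

# For x2dot = f0 + g0*u + d with a slowly varying d, the estimate lags by
# one sample; with constant d it is exact.
d_true, f0, g0, u = 1.5, -2.0, 0.8, 0.3
x2dot = f0 + g0 * u + d_true
d_hat = tde_estimate(x2dot, f0, g0, u)
```

This one-sample lag is the price of model-freeness, which is why the TDE is paired with the robust sliding-mode term in the chapter.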
1 2 3 4 5 6 7 8
20-May-21 | 20Presenter Cong Phat Vo
Parameters
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Experimental validation
for ITSMC: η₁ = 3, η₂ = 15, η₃ = 25, κ₁ = κ₃ = 10, κ₂ = κ₄ = 2, γ₁ = γ₂ = 57, k₁ = 2, k₂ = 55, α = 9/11
for adaptive gain: μ = 10, δ = 0.05, σ = 0.1
for case study 3: K_P = 15, K_I = 18, K_D = 0.05
for the other cases: K_P = 35, K_I = 28, K_D = 0.1
Proposed control
PID control
Fig 31 Performance response in case study 1
Fig 32 Performance response in case study 2
Case study 1 (no load): x_d (mm) = 8sin(0.4πt)
Case study 2 (no load): x_d (mm) = 8sin(πt)
Case study 3 (no load): a multistep trajectory
Case study 4 (with load): x_d (mm) = 8sin(πt), in turn with a 2-kg and a 7-kg load
1 2 3 4 5 6 7 8
20-May-21 | 21Presenter Cong Phat Vo
Fig 33 Performance response in case study 3
Fig 34 Performance response in case study 4
Controller | PID | ITSMC | Proposed
Sine 0.2 Hz – 8 mm, L2[e]: 0.4102 | 0.1737 | 0.1574
Sine 0.2 Hz – 8 mm, L2[u]: 0.9961 | 1.0270 | 0.9602
Sine 1 Hz – 8 mm, L2[e]: 1.1845 | 0.4996 | 0.3788
Sine 1 Hz – 8 mm, L2[u]: 1.1284 | 1.1152 | 1.0900
Multistep, L2[e]: 0.8587 | 0.6846 | 0.6368
Multistep, L2[u]: 0.7670 | 0.7333 | 0.7247
Sine 1 Hz – 8 mm with 2-kg load, L2[e]: 1.2578 | 0.8314 | 0.4779
Sine 1 Hz – 8 mm with 2-kg load, L2[u]: 1.1414 | 1.1506 | 1.1287
Sine 1 Hz – 8 mm with 7-kg load, L2[e]: 1.5937 | 1.0827 | 0.5067
Sine 1 Hz – 8 mm with 7-kg load, L2[u]: 1.1613 | 1.1729 | 1.1584
Table 31 A quantitative comparison between the tracking performances
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Case study 1 (no load): x_d (mm) = 8sin(0.4πt)
Case study 2 (no load): x_d (mm) = 8sin(πt)
Case study 3 (no load): a multistep trajectory
Case study 4 (with load): x_d (mm) = 8sin(πt), in turn with a 2-kg and a 7-kg load
Experimental validation
1 2 3 4 5 6 7 8
20-May-21 | 22Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Summary
Robust performance against uncertainties and disturbances
[Block diagram of the proposed control approach: reference trajectory, terminal and integral terminal sliding surfaces, nominal and robust control signals with adaptation law, time-delay estimator, PAM system]
Finite-time convergence of the tracking errors
Reducing the chattering phenomenon
Eliminating the reaching phase
Fast transient response
Future work will be directed to control using the force properties
1 2 3 4 5 6 7 8
20-May-21 | 23Presenter Cong Phat Vo
CONTENTS
1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 24Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
1 The proposed FSOB scheme obtains force information with noise and static friction eliminated
2 The FITSMC guarantees finite-time convergence; a new adaptive gain and the TDE are deployed online to handle uncertainties and the chattering phenomenon
3 Verified the reliability and superiority of the proposed control in different challenging work conditions
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 25Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
The FITSM surface is defined:
  σ = e_f + ∫₀ᵗ ( ρ₁ e_f + ρ₂ sig^v( e_f ) ) dτ

The control law can be rewritten:
  u(t) = d̂(t) + K_d e_f + ρ₁ e_f + ρ₂ sig^v( e_f ) + ( η̂₄ + η₅ ) sign( σ )

with an adaptive switching gain law updating η̂₄ online from |σ|, an estimated lumped disturbance obtained in TDE fashion,
  d̂(t) = d̂( t − t_d ) + u_c( t − t_d ) − K_e τ̇( t − t_d ),
and a force observer based on LeD [40] acting as a second-order filter:
  τ̈̂_e + 2ξω τ̇̂_e + ω² τ̂_e = ω² τ_e
[Block diagram of the proposed control approach: reference trajectory → integral terminal sliding surface → control law with adaptive law and force-based time-delay estimator → APAM system (plant + environment), with a force observer recovering τ_e]
Proof See in Appendix 41 ndash 43
Control design
1 2 3 4 5 6 7 8
20-May-21 | 26Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
for ITSMC: η₁ = 3, η₂ = 15, η₃ = 25, κ₁ = κ₃ = 10, κ₂ = κ₄ = 2, γ₁ = γ₂ = 57, k₁ = 2, k₂ = 55, v = 9/11
for adaptive gain: μ = 10, δ = 0.05, σ = 0.1
for case study 3: K_P = 15, K_I = 18, K_D = 0.05
for the other cases: K_P = 35, K_I = 28, K_D = 0.1
Proposed control
PID control
Fig 42 Performance response in Scenario 2
Scenario 1: τ_d (N) = 7.5sin(0.2πt − π/2) + 6.25
Scenario 2: τ_d (N) = 7.5sin(0.8πt − π/2) + 6.25
Scenario 3: a multistep trajectory with an LPF bandwidth of 0.2 rad/s
Observers: for the FSOB, ω = 500 rad/s, ξ = 0.001, α₁ = 20, α₂ = 12, α₃ = 5, α₄ = 15
Fig 41 Performance response in Scenario 1
Parameters
Experimental validation
1 2 3 4 5 6 7 8
20-May-21 | 27Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Fig 43 Performance response in Scenario 3
Fig 44 Control gain
Controller | PID | ISMC | FITSMC-TDE | Proposed
Sine 0.1 Hz | 2.1050 | 1.2152 | 0.9440 | 0.5158
Sine 0.4 Hz | 3.2054 | 1.8045 | 2.0632 | 1.9807
Multistep | 2.6508 | 2.3548 | 2.0618 | 1.7685
Table 41 A RMSE comparison between the tracking error performances
Experimental validation
- Improved the control performance (fast response, high accuracy, and robustness)
- Guaranteed finite-time convergence
- Cancelled the lumped uncertainties online via the TDE solution and the switching gain in the FITSMC
Discussion
1 2 3 4 5 6 7 8
20-May-21 | 28Presenter Cong Phat Vo
CONTENTS
1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 29Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
1 The RL method is used to learn the optimal desired force online
2 The stability of the closed-loop control is theoretically demonstrated by the Lyapunov approach
3 Verified effectiveness for robot interaction control in unknown environments
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 30Presenter Cong Phat Vo
  F_e(t) = −C_e ẋ(t) + K_e ( x_e − x(t) )
  Z( e_r(t) ) = F_d(t)
  M ë_r(t) + C ė_r(t) + K e_r(t) = F_d(t) − F_e(t)
The force control
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The environment is modeled as
a spring-damper system [81]
The optimal impedance model | The position control in task space
  x_f(t) = K_Pf e_f(t) + K_If ∫₀ᵗ e_f(τ) dτ
[Block diagram: a task-specific outer loop (prescribed impedance model, RL-based approximation of φ, force control) wrapped around a robot-specific inner loop (precise position control of the APAM system against the environment); signals x_d, x_r, x, e_f, F_e, F_d, τ_e]
Control design
1 2 3 4 5 6 7 8
  ė_r(t) = A e_r(t) + B F_d(t)
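The outer force loop above (a PI action on the force error producing a position reference) can be sketched against a static spring environment; the gains follow the chapter parameters, while the contact model, units, and time step are assumptions:

```python
class ForcePI:
    """Outer-loop force-to-position law: x_f = K_Pf*e_f + K_If * integral(e_f).

    The returned increment is added to the position reference sent to the
    inner-loop position controller (assumed ideal in this sketch).
    """
    def __init__(self, K_Pf=0.001, K_If=0.02):
        self.K_Pf, self.K_If = K_Pf, K_If
        self.integral = 0.0

    def step(self, F_d, F_e, dt):
        e_f = F_d - F_e
        self.integral += e_f * dt
        return self.K_Pf * e_f + self.K_If * self.integral

# Drive the contact force toward F_d against a spring K_e past x_e = 35 mm.
pi_loop, x, F_d, K_e, x_e = ForcePI(), 35.0, 10.0, 2.0, 35.0
for _ in range(10000):
    F_e = K_e * (x - x_e)
    x += pi_loop.step(F_d, F_e, dt=1e-3)
F_e = K_e * (x - x_e)
```

The integral term removes the steady-state force error, so the measured contact force settles at the desired value.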
20-May-21 | 31Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
LQR solution
The main objective of this work is to determine the optimal desired force F_d, i.e., the minimum force that minimizes the position error e_r:
  F_d(t) = φ_e e_r(t)
The discounted cost function:
  J_s( e_r(t), F_d(t) ) = ∫ₜ^∞ e^{−γ(τ−t)} ( e_rᵀ(τ) S e_r(τ) + F_dᵀ(τ) R F_d(τ) ) dτ
The algebraic Riccati equation:
  AᵀPA + S − P − AᵀPB ( BᵀPB + R )⁻¹ BᵀPA = 0
The feedback control gain:
  φ_e = −( BᵀPB + R )⁻¹ BᵀPA
RL solution | IRL approximation
Implementation procedure
1 2 3 4 5 6 7 8
offline solution
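For the scalar error model ė_r = A e_r + B F_d, the offline LQR solution above can be computed by iterating the Riccati recursion. A sketch (Python) using the identified values A = −4.006, B = −0.002, Euler-discretized at T = 0.001 s (the discretization is an assumption):

```python
def dlqr_scalar(A, B, S, R, n_iter=20000):
    """Discrete-time LQR for e_r[k+1] = A*e_r[k] + B*F_d[k]:
    iterate P <- A*P*A + S - (A*P*B)**2 / (B*P*B + R), then the feedback
    gain is phi = -(B*P*B + R)**(-1) * B*P*A, so that F_d = phi * e_r."""
    P = S
    for _ in range(n_iter):
        P = A * P * A + S - (A * P * B) ** 2 / (B * P * B + R)
    phi = -(B * P * A) / (B * P * B + R)
    return P, phi

T = 0.001
A_d, B_d = 1.0 + (-4.006) * T, (-0.002) * T   # Euler discretization
P, phi = dlqr_scalar(A_d, B_d, S=1.0, R=10.0)
```

This is exactly what the IRL scheme later recovers online, without needing A and B.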
20-May-21 | 32Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
LQR solution → RL solution (the offline model-based solution cannot hold for all time)
The Hamilton-Jacobi-Bellman equation:
  min_{F_d} [ V̇_s( e_r(t) ) − γ V_s( e_r(t) ) + r( e_r(t), F_d(t) ) ] = 0
The new algebraic Riccati equation:
  e_rᵀ(t) ( AᵀP + PA + S − γP ) e_r(t) + 2 e_rᵀ(t) P B F_d(t) + F_dᵀ(t) R F_d(t) = 0
The optimal desired force:
  F_d(t) = −H₂₂⁻¹ H₂₁ e_r(t),  where Q( e_r(t), F_d(t) ) = V_s( e_r(t) ) = Xᵀ H X
IRL approximation
The discounted cost function:
  V_s( e_r(t) ) = ∫ₜ^∞ e^{−γ(τ−t)} r( e_r(τ), F_d(τ) ) dτ,
  where r( e_r, F_d ) = e_rᵀ S e_r + F_dᵀ R F_d
Implementation procedure
1 2 3 4 5 6 7 8
  F_d(t) = φ_e e_r(t)
online solution: the off-policy RL finds the solution of the given new ARE
20-May-21 | 33Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
LQR solution RL solution
minimize temporal difference error
The approximator of the Q-value function:
  Q̂( e_r, F_d ) = Σ_{i=1}^{N} θ_i ϕ_i( e_r, F_d )
with Gaussian features
  ϕ_i( e_r, F_d ) = exp( −½ [ e_r − c_i^e, F_d − c_i^F ] Σ_i⁻¹ [ e_r − c_i^e, F_d − c_i^F ]ᵀ )
The parameter update law (gradient descent minimizing the temporal-difference error δ(t)):
  θ( t + Δt ) = θ(t) + β ( ϕ(t) / ( ϕᵀ(t) ϕ(t) ) ) δ(t),  δ(t) = target(t) − Q̂( e_r(t), F_d(t) )
IRL approximation
Q-function update law based on IRL:
  Q( e_r(t), F_d(t) ) = ∫ₜ^{t+Δt} e^{−γ(τ−t)} r( e_r(τ), F_d(τ) ) dτ + e^{−γΔt} Q( e_r(t + Δt), F_d(t + Δt) )
The reward (utility) function:
  r(t) = e_rᵀ(t) S e_r(t) + F_dᵀ(t) R F_d(t)
Implementation procedure
1 2 3 4 5 6 7 8
The features are normalized:
  ϕ̄_i( e_r, F_d ) = ϕ_i( e_r, F_d ) / Σ_{j=1}^{N} ϕ_j( e_r, F_d )
  F_d(t) = φ_e e_r(t)
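The normalized Gaussian features used by the Q-function approximator can be sketched as follows (Python); the feature centers and width are illustrative assumptions:

```python
import math

def rbf_features(e_r, F_d, centers, width=0.1):
    """Normalized Gaussian features for Q_hat(e_r, F_d) = sum_i theta_i*phi_i:
    phi_i = exp(-||(e_r, F_d) - c_i||^2 / (2*width)), divided by their sum
    so the features form a partition of unity."""
    raw = [math.exp(-((e_r - ce) ** 2 + (F_d - cf) ** 2) / (2.0 * width))
           for ce, cf in centers]
    total = sum(raw)
    return [a / total for a in raw]

centers = [(-1.0, -1.0), (0.0, 0.0), (1.0, 1.0)]  # illustrative grid
phi_bar = rbf_features(0.0, 0.0, centers)
```

A Q-value estimate is then the dot product of phi_bar with the learned parameter vector θ, which is what the temporal-difference update adjusts.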
20-May-21 | 34Presenter Cong Phat Vo
Fig 51 RL in unknown environments during the
learning process
Fig 52 Learning curves
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
υ = 0.9, γ = 0.9, δ_Δ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
LQR solution
RL approximation
Position/Force control: K_Pf = 0.001, K_If = 0.02, λ = 0.2, η_Δ = 0.1, ρ₁ = 0.5, ρ₂ = 20, v = 9/11, α = 57
Environment: C_e = 0.5 N·s·mm⁻¹, K_e = 2 N·mm⁻¹, r₀ = 165 mm, x_d = 39.917 mm, x_e = 35 mm
Parameters
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 35Presenter Cong Phat Vo
Fig 53 Desired force/position tracking response
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Three steps are required to complete the experiment:
1 - identification of the environment
2 - obtaining the ARE solution using the environment estimate
3 - the complete experiment with the position/force controller
IRL approximation
Achieving near-optimal performance without
knowledge of the environment dynamics
LQR solution
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 36Presenter Cong Phat Vo
CONTENTS
1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 37Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
1 Find the optimal parameters of the impedance model to assure motion tracking and also assist the human with minimum effort to perform the task in real time
2 The stability of the closed-loop control is theoretically demonstrated
3 The unknown human dynamics as well as the optimal overall human-robot system efficiency are verified in experimental trials
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 38Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Robot-specific inner loop | Task-specific outer loop

The control torque in task space:
  τ = K_s ( ė_h + μ₁ e_h + μ₂ ∫₀ᵗ e_h dτ ) + τ̂_h − K_B ( ‖Ẑ‖_F + Z_B ) ξ

An approximation of the continuous function (NN functional approximation):
  τ̂_h(q) = Ŵᵀ σ( V̂ᵀ q )
with weight update laws
  Ŵ̇ = F σ̂ ξᵀ − k F ‖ξ‖ Ŵ,  V̂̇ = G q ξᵀ − k G ‖ξ‖ V̂

The prescribed impedance model:
  K_hm = K_h / ( M_d s² + C_d s + K_d )
[Block diagram: human operator torque τ_h → prescribed impedance model → desired trajectory θ_d; the torque control law with NN functional approximation drives the APAM-based haptic interface θ_m; an IRL-based optimal law tunes the impedance parameters; K_h relates the interaction error e_h]
Control design
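The prescribed impedance model above, K_hm = K_h/(M_d s² + C_d s + K_d), can be sketched as a discrete-time filter from the human torque to the desired trajectory (Python; the decimal placement of the chapter-6 parameters is assumed):

```python
class PrescribedImpedance:
    """Prescribed impedance model M_d*th'' + C_d*th' + K_d*th = K_h*tau_h,
    integrated by forward Euler; step() returns the desired trajectory th_d.

    Default parameters follow the chapter-6 values with assumed decimal
    points: M_d = 1, C_d = 7.18, K_d = 15.76, K_h = 0.046.
    """
    def __init__(self, M_d=1.0, C_d=7.18, K_d=15.76, K_h=0.046):
        self.M_d, self.C_d, self.K_d, self.K_h = M_d, C_d, K_d, K_h
        self.th, self.w = 0.0, 0.0

    def step(self, tau_h, dt):
        acc = (self.K_h * tau_h - self.C_d * self.w - self.K_d * self.th) / self.M_d
        self.th += self.w * dt
        self.w += acc * dt
        return self.th

model = PrescribedImpedance()
for _ in range(5000):              # 5 s at dt = 1 ms, constant human torque
    th_d = model.step(tau_h=10.0, dt=1e-3)
```

A constant torque settles at th = K_h*tau_h/K_d, so the IRL layer effectively shapes this static and dynamic response by retuning M_d, C_d, K_d.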
1 2 3 4 5 6 7 8
20-May-21 | 39Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Robot-specific inner loop | Task-specific outer loop

The human transfer function [99] (purpose: map τ_h to e_d):
  G_h(s) = κ_p ( s + 1 ) / ( k_e + s ),  realized in state space as ξ̇_h = A_h ξ_h + B_h e_d

A new augmented state X = [ e_qᵀ  ξ_hᵀ ]ᵀ combines the tracking-error dynamics and the human model:
  Ẋ = A X + B u_e,  u_e = −K X
[Block diagram: prescribed impedance model, torque control law with NN functional approximation, APAM-based haptic interface, human operator, and the IRL-based optimal law; signals τ_h, θ_d, θ_m, e_h, ξ, e_d]
IRL based on LQR solution
Control design
1 2 3 4 5 6 7 8
20-May-21 | 40Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
The performance index (infinite-horizon quadratic cost) to minimize:
  J_m = ∫ₜ^∞ ( e_dᵀ Q_d e_d + ξ_hᵀ Q_h ξ_h + u_eᵀ R u_e ) dτ
is solved through the algebraic Riccati equation [83]:
  0 = AᵀP + PA + Q − P B R⁻¹ Bᵀ P
over the new augmented state dynamics:
  Ẋ = A X + B u_e,  u_e = −K X
RL solution | IRL solution: an optimal feedback control
Control design
The IRL Bellman equation for time interval ΔT gt 0
Proof of convergence See in Appendix 62
1 2 3 4 5 6 7 8
  u_e* = arg min_{u_e} V( X(t) ) = −R⁻¹ Bᵀ P X
  V( X(t) ) = ∫ₜ^∞ Xᵀ ( Q + Kᵀ R K ) X dτ = Xᵀ P X
  ( A − B K )ᵀ P + P ( A − B K ) = −( Q + Kᵀ R K )
where P is the real symmetric positive-definite solution. The IRL Bellman equation can then be written as
  Xᵀ(t) P X(t) = ∫ₜ^{t+ΔT} Xᵀ ( Q + Kᵀ R K ) X dτ + Xᵀ(t + ΔT) P X(t + ΔT)
20-May-21 | 41Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 61 Continuous-time linear policy iteration algorithm
1 Start with an admissible control input u_e⁰ = −K⁰ X
2 Policy evaluation: given a control policy uⁱ, find Pⁱ using the off-policy Bellman equation
3 Policy improvement: update the control policy u_e^{i+1} = −R⁻¹ Bᵀ Pⁱ X
Online implementation
Control design
1 2 3 4 5 6 7 8
  Xᵀ(t) Pⁱ X(t) = ∫ₜ^{t+ΔT} Xᵀ ( Q + (Kⁱ)ᵀ R Kⁱ ) X dτ + Xᵀ(t + ΔT) Pⁱ X(t + ΔT)

Flowchart: Begin → Initialization: P⁰ = 0, i = 1, K¹ chosen so that ( A − B K¹ ) is stable → solve for the cost Pⁱ by least squares over N data samples → policy update u_e^{i+1} = −R⁻¹ Bᵀ Pⁱ X → if ‖pⁱ − p^{i−1}‖ ≤ ε then End, else i → i + 1 and repeat
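The evaluate-improve loop of Fig 61 can be sketched for a scalar plant. In the thesis the evaluation step solves the IRL Bellman equation from measured data by least squares; the model-based stand-in below (Kleinman's iteration, used here only for illustration) has the same structure:

```python
def policy_iteration(A, B, Q, R, K0=0.0, tol=1e-12, max_iter=100):
    """Policy iteration for scalar x' = A*x + B*u with u = -K*x.
    Evaluation: solve the Lyapunov equation 2*(A - B*K)*P = -(Q + R*K**2).
    Improvement: K <- B*P/R. Converges to the LQR solution."""
    K = K0
    for _ in range(max_iter):
        Ac = A - B * K
        assert Ac < 0, "current policy must be stabilizing"
        P = (Q + R * K * K) / (-2.0 * Ac)   # policy evaluation
        K_new = B * P / R                   # policy improvement
        if abs(K_new - K) < tol:
            return P, K_new
        K = K_new
    return P, K

# For A = -1, B = Q = R = 1 the ARE gives P = sqrt(2) - 1 and K = P.
P, K = policy_iteration(A=-1.0, B=1.0, Q=1.0, R=1.0)
```

Each improvement step stays stabilizing and the cost P decreases monotonically, which is the property the online least-squares version inherits.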
20-May-21 | 42Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 62 The tracking error of the trajectory of the robot system and
the prescribed impedance model
Fig 63 Reference trajectory versus the state of the robot systems
during and after learning
κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
υ = 0.9, γ = 0.9, δ_Δ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
LQR solution
RL approximation
Position control: K_s = 20, F = 10, G = 1, μ₁ = μ₂ = 5, k = 0.1, Z_B = 100, κ_p = 20, κ_d = 10, and k_e = 0.18 s
θ_d = 10sin(0.2πt), M_d = 1 kg, B_d = 7.18 N·s·mm⁻¹, K_d = 15.76 N·mm⁻¹, K_h = 0.046
Simulation
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 43Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 64 The output trajectory and impedance model in the inner loop
Fig 65 The desired trajectory and the output trajectory in the outer loop
Fig 66 Interaction force
Near-perfect performance of the human-robot cooperation
Only a little deviation between the desired and actual trajectories
The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller
Experiment
Validation results
Discussion
1 2 3 4 5 6 7 8
20-May-21 | 44Presenter Cong Phat Vo
CONTENTS
1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 45Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
1 For the first time, a hybrid control based on a fast finite-time NTSMC and the AFOB is proposed
2 The stability and finite-time convergence characteristic of the closed-loop control are theoretically analyzed by the Lyapunov approach
3 The effectiveness of the proposed control method is verified in different working conditions
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 46Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
The external torque estimation: τ̂_id is reconstructed by a first-order observer on the joint dynamics, whose gain κ_i is adapted online (the adaptive FSOB gain): κ̇_i follows a switching law in the sliding variable s_i, frozen when s_i = 0 and driven by z_i sign(·) otherwise.

The modified FNTSM surface design:
  s_i = ė_i + λ_i sig^{p_i/q_i}( e_i )

The torque control design in FNTSMC style:
  τ_i = M_i θ̈_ir + C_i θ̇_ir + K_i θ_ir − τ̂_id − ρ_{1i} s_i − ρ_{2i} sig^{ν_i}( s_i ) − ε_i sign( s_i )
[Block diagram: operator → master controller and APAM master system with torque estimator τ̂_md; communication of master/slave positions θ_m, θ_s; slave controller and APAM slave system with torque estimator τ̂_sd in contact with the environment; an impedance filter shapes the force f_m reflected to the operator]
Control design
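The torque-estimation idea can be sketched with a fixed-gain first-order observer; the adaptive gain and friction handling of the actual AFOB are omitted, and the scalar plant and numbers are assumptions (Python):

```python
def force_observer(M=1.0, kappa=10.0, dt=1e-3, steps=2000, d_true=0.7, u=0.5):
    """First-order external-torque observer for M*q'' = u + d:
    update d_hat' = kappa * (M*q'' - u - d_hat), so d_hat -> d with time
    constant 1/kappa when d is (near-)constant."""
    d_hat = 0.0
    for _ in range(steps):
        qdd = (u + d_true) / M            # measured joint acceleration
        d_hat += kappa * (M * qdd - u - d_hat) * dt
    return d_hat

d_hat = force_observer()
```

The gain kappa trades convergence speed against noise amplification, which is exactly the trade-off the adaptive gain of the AFOB manages online.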
1 2 3 4 5 6 7 8
20-May-21 | 47Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 71 Torque estimation performances
Proposed control: λ_i = 8, p_i = 5, q_i = 7, ς_i = 10⁻⁴, ν_i = 9/11, ε_i = 0.1, ρ_{1i} = 10, ρ_{2i} = 2
PID control: K_P = 38, K_I = 25, K_D = 0.15
Observers: for the NDO, L_h = L_e = 10; for the RTOB, β = 350 rad/s; for the AFOB, κ_i(0) = 0.1, λ_i = 10, η_i = 0, m = 0.1, a = 10
RMSEs associated with the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively
Parameters
Scenarios: free-space situation, θ_d(t) = 5π(1 − 0.18t)sin(0.72πt); contact-and-recovery situation with a soft environment (K_e = 500 N·mm⁻¹, B_e = 4 N·s·mm⁻¹) and a stiff environment (K_e = 2800 N·mm⁻¹, B_e = 20 N·s·mm⁻¹)
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 48Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 72 Position tracking performances
Fig 72 Position tracking error performances
Fig 72 Transparency performance in "soft" contact
Fig 72 Transparency error in soft contact
Fig 72 Transparency performance in "stiff" contact
Fig 72 Transparency error in stiff contact
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 49Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Finite-time convergence of the
tracking errors
Eliminating the uncertainties and the noise effect in the obtained force sensing via the new AFOB
Robust performance against uncertainties
and disturbances in the system dynamics
Stability of the overall system and the
transparency performance can be
attained simultaneously
Fast transient response
Summary
Great transparency performance and global stability of the BHTS are achieved under the proposed control scheme, despite uncertainties and across different working conditions
1 2 3 4 5 6 7 8
20-May-21 | 50Presenter Cong Phat Vo
CONTENTS
1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 51Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Verified an effective position-tracking controller via the AITSMC with the TDE technique under various working conditions (uncertainties and disturbances are handled)
Validated the significant effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB)
Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position- and force-tracking performance
Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding highly effective position control
Achieved good transparency performance, with both force feedback and position tracking simultaneously, by the new adaptive force estimation combined with the separate FNTSMC schemes
Conclusions
1 2 3 4 5 6 7 8
20-May-21 | 52Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Future works
To improve the control performance in joint space, the robot system model can be improved by robust identification approaches
The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains
Apply optimized strategies on both master and slave sides; note also that the computational complexity should be reduced as the data size grows
Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device for the rehabilitation field with a safer solution
1 2 3 4 5 6 7 8
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
1 2 3 4 5 6 7 8
20-May-21 | 8Presenter Cong Phat Vo
1 INTRODUCTION
Overview of the BHTS structure
Slave
control
Master
control
Co
mm
un
icatio
n c
ha
nn
el
Monitor
Operator
Master
Haptic interface
Slave
teleoperator
Remote
environment
Vision
sensor
hf
m
m
x
x
sdw
s
s
x
x
efmdwm
m
x
x
Figure 11 The bilateral haptic teleoperation architecture
Variable Impedance Control [5 6]
Model-mediated Teleoperation [40]
Movement Estimation [78 79]
Passivity [27] Prediction [39] hellip
Robustness
Performence
Perception
Transparency
Environment condition
Operatorrsquos characteristics
Communication channel
1 2 3 4 5 6 7 8
classify
evaluate
achieve
20-May-21 | 9Presenter Cong Phat Vo
1 INTRODUCTION
PAM actuator
Fig 1.2 (a) Structure and (b) working principle of the PAM: a rubber hose in loose-weave fibers with fitting connections; pressurizing the un-pressured muscle contracts it by a stroke
Applications: 2-joint manipulator [16, 17], Forceps' manipulators [20], KNEXO robot [22], Wearable Haptic Device [23]
1 2 3 4 5 6 7 8
20-May-21 | 10Presenter Cong Phat Vo
1 INTRODUCTION
Research Objectives
01 Propose Position Tracking Controller: an integral terminal-style SMC is designed with the combination of an adaptive gain and the TDE technique.
02 Propose Finite-Time Force Controller: an adaptive-gain fast ITSMC-TDE scheme is established with a friction-free disturbance technique in finite time.
03 Propose Model-free Control for slave-environment interaction: applying an RL approximation method to learn the (minimal) optimal contact force online.
04 Propose optimal impedance for human-master interaction: using IRL to optimize the prescribed impedance model parameters to assist with minimum human effort.
05 Propose Force Sensorless reflecting control for bilateral haptic teleoperation: a new AFOB is proposed to estimate the force signals, achieving great transparency (position and force feedback).
1 2 3 4 5 6 7 8
20-May-21 | 11Presenter Cong Phat Vo
CONTENTS
1. Introduction
2. Modeling and platform setup based on the APAM configuration
3. Adaptive Gain ITSMC with TDE scheme for the position control
4. Adaptive Finite-Time Force Sensorless Control scheme with TDE
5. Model-free control for Optimal contact force in Unknown Environment
6. Optimized Human-Robot Interaction control using IRL
7. Force Sensorless Reflecting Control for BHTS
8. Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 12Presenter Cong Phat Vo
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Mathematical modeling
Fig 2.1 A cross-sectional view of the simplified geometric model (a braid thread of length b wound in n turns around a muscle of length L and diameter D; one turn unwraps to πD, n turns to nπD)
The tensile force of each PAM [7]:
  F = −P dV/dL = −P (b² − 3L²) / (4πn²)
The motion equation of the system (antagonistic pair):
  M ẍ + B ẋ = −F1 + F2
Table 2.1 Parameters of the studied PAM
  Ps, supply pressure: 6.5 bar
  B, viscous damping coefficient: 50 N·s/mm
  M, mass of the plate: 0.5 kg
  L0, original length of the PAM: 175 mm
  n, number of turns of the thread: 104
  b, thread length: 200 mm
The motion equation in state-space form:
  ẍ = f0(x, ẋ) + g0(x) u + d
where f0(x, ẋ) and g0(x) are the nominal values of f(x, ẋ) and g(x), obtained by substituting the tensile-force model at the nominal pressure into the motion equation, and d = (f − f0) + (g − g0) u + w is the total of unknown disturbances (w: external disturbance).
1 2 3 4 5 6 7 8
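The braided-muscle force model cited above ([7]) can be checked numerically. A minimal sketch in Python, commonly written as F = −P·dV/dL = −P(b² − 3L²)/(4πn²); the pressure, thread length, and turn count below are illustrative stand-ins, not the rig's identified values.

```python
import math

def pam_force(P, L, b, n):
    """Braided-PAM tensile force from the cylinder geometry:
    V = (b^2*L - L^3)/(4*pi*n^2)  =>  F = -P*dV/dL = -P*(b^2 - 3*L^2)/(4*pi*n^2)."""
    return -P * (b**2 - 3.0 * L**2) / (4.0 * math.pi * n**2)

# Illustrative values only: pressure in Pa, lengths in m.
P = 3.0e5                                  # ~3 bar
b = 0.20                                   # thread length [m]
n = 3.0                                    # turns of the braid thread
F_stretched = pam_force(P, 0.19, b, n)     # near full length -> large pulling force
F_contracted = pam_force(P, 0.13, b, n)    # contracted -> small pulling force
L_zero = b / math.sqrt(3.0)                # length at which the force vanishes
```

The model predicts that the pulling force decreases monotonically with contraction and vanishes at L = b/√3, the theoretical maximum contraction, which agrees qualitatively with the measured force-contraction curves of Fig 2.3.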
20-May-21 | 13Presenter Cong Phat Vo
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Experimental modeling
The approximated PAM model, based on its pressure and contraction strain:
  M ẍk + bk(Pk) ẋk + kk(Pk) xk = fk(Pk), (k = 1, 2)
  bk(Pk) = bk1 Pk + bk0
  kk(Pk) = kk1 Pk + kk0
  fk(Pk) = fk1 Pk + fk0
Table 2.2 List of APAM model parameters
  Spring elements: kk1 = 11732 N/m/bar; kk0 = 68 N/m
  Damping elements: bk1 = 962 N·s/m/bar; bk0 = 0.85 N·s/m
  Force elements: fk1 = 9122 N/bar; fk0 = 473 N
Fig 2.3 Experimental model for certain separate pressures (measured force [N] versus contraction length [mm] at pressures from 1 to 6 bar)
Fig 2.2 Block diagram of the experimental system: PAM 1 and PAM 2 act antagonistically on a plate through a pulley with a load; a linear encoder measures position; proportional pressure regulator valves receive the control signal; PCI-1711 and PCI-QUAD04 cards exchange the pressure, control, and position signals
1 2 3 4 5 6 7 8
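The pressure-dependent three-element model above (mass plus linear-in-pressure damping, stiffness, and force terms) can be exercised with a short simulation. The coefficient values below are placeholders in SI units chosen for illustration, not the identified Table 2.2 entries.

```python
def coeff(P, c1, c0):
    """Linear pressure dependence used for each element: c(P) = c1*P + c0."""
    return c1 * P + c0

def simulate(P, T=5.0, dt=1e-3):
    """Euler integration of M*x'' + b(P)*x' + k(P)*x = f(P); returns final and static x."""
    M = 0.5                          # plate mass [kg]
    b = coeff(P, 9.62, 0.85)         # damping [N*s/m]   (placeholder values)
    k = coeff(P, 117.32, 6.8)        # stiffness [N/m]   (placeholder values)
    f = coeff(P, 91.22, 47.3)        # force term [N]    (placeholder values)
    x = v = 0.0
    for _ in range(int(T / dt)):
        a = (f - b * v - k * x) / M
        v += a * dt
        x += v * dt
    return x, f / k                  # simulated endpoint vs. static prediction f(P)/k(P)

x_end, x_static = simulate(P=3.0)    # P in bar, matching the linear coefficient form
```

A quick sanity check on such a model is that the simulated displacement settles to the static value f(P)/k(P), which is what the identification in Fig 2.3 effectively fits pressure by pressure.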
20-May-21 | 14Presenter Cong Phat Vo
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Platform setup
Fig 2.5 The APAM-based (a) master and (b) slave subsystem configuration
(a) Master: pneumatic muscle actuators on a pulley with a control stick, proportional pressure regulator valves, an optical encoder, loadcells, and an interactive loadcell (front and back views)
(b) Slave: the APAM system (PAM and pulley) with a proportional valve, a loadcell, a spring-damper environment (stiffness/damping), a control box, and a PC with PCI cards exchanging the position, control, and contact-force signals
1 2 3 4 5 6 7 8
20-May-21 | 15Presenter Cong Phat Vo
2 MODELING OF THE BILATERAL TELEOPERATION SYSTEM
Platform setup
Fig 2.6 Photograph of the experimental apparatus of the overall BHTS: master robot with control stick, slave robot, and environment
Table 2.3 Specifications of the experimental devices
  PAM: Festo MAS-10-200N-AA-MC-O; nominal length 200 mm; inside diameter 10 mm; max pressure 8 bar
  PPR Valve: Festo VPPM-8L-L-1-G14-0L8H; max pressure 8 bar
  DAQ Cards: ADVANTECH PCI-1711 (AI/AO, 12-bit resolution); MEASUREMENT PCI-QUAD04
  Linear encoder: type Rational WTB5-0500MM; resolution 0.005 mm
1 2 3 4 5 6 7 8
20-May-21 | 16Presenter Cong Phat Vo
CONTENTS
1. Introduction
2. Modeling and platform setup based on the APAM configuration
3. Adaptive Gain ITSMC with TDE scheme for the position control
4. Adaptive Finite-Time Force Sensorless Control scheme with TDE
5. Model-free control for Optimal contact force in Unknown Environment
6. Optimized Human-Robot Interaction control using IRL
7. Force Sensorless Reflecting Control for BHTS
8. Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 17Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Proposed control idea
1. For the first time, the AITSMC-TDE scheme is proposed for a PAM system.
2. The stability of the controlled system is theoretically analyzed.
3. The control effectiveness is verified by experimental trials in different challenging work conditions (comparison with several controllers).
1 2 3 4 5 6 7 8
20-May-21 | 18Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Control design
A terminal sliding surface is chosen:
  s = e2 + k1 e1 + k2 [e1]^(γ1/γ2)
A proposed integral terminal sliding surface is designed (σ(0) = 0, so the reaching phase is eliminated):
  σ(t) = s(t) − s(0) − ∫0..t [ k1 e2 + k2 d([e1]^(γ1/γ2))/dτ + f0(x1, x2) + g0(x1) un − ẍd ] dτ
A robust control signal can be redesigned:
  us(t) = −ĝ0⁻¹(x1) (η̂3 + η4) sign(σ)
An adaptive law for the robust gain:
  dη̂3/dt = μ |σ(t)| sign(|σ(t)| − δ), with η̂3(0) > 0 (the gain grows outside the boundary layer δ and decays inside it)
Proof: see Appendix 3.2, 3.3 for more detail.
[Block diagram of the proposed control approach: the reference trajectory xd and the measured states (x1, x2) enter the transfer coordinate and the terminal / integral terminal sliding surfaces; the nominal control signal un, the robust control signal us (with its adaptation law for η̂3), and the time-delay estimator output d̂ are combined into the control law u for the PAM system]
1 2 3 4 5 6 7 8
20-May-21 | 19Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Control design
The auxiliary variables are defined:
  e1 = x1 − xd, e2 = ė1 = x2 − ẋd
The nominal control signal can be designed as:
  un = g0⁻¹(x1) [ ẍd − f0(x1, x2) − k1 e2 − k2 (γ1/γ2) |e1|^(γ1/γ2 − 1) e2 ]
The lumped disturbance is estimated by time delay (TDE):
  d̂(t) = ẋ2(t − td) − f0(x1(t − td), x2(t − td)) − g0(x1(t − td)) u(t − td)
The control law can be rewritten:
  u(t) = un + us − ĝ0⁻¹(x1) d̂
Proof: see Appendix 3.1, 3.4-3.5 for more detail.
1 2 3 4 5 6 7 8
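The TDE step is simple enough to sketch numerically: the lumped disturbance is approximated by its value one sample earlier, computed from the delayed acceleration, the nominal dynamics, and the delayed input. The toy plant and gains below are assumptions for illustration, not the APAM model.

```python
import math

# Toy double integrator: xdd = f0(x, v) + g0*u + d(t), with d unknown to the controller.
# TDE idea: d_hat(t) = a(t - dt) - f0(t - dt) - g0*u(t - dt)  ~=  d(t - dt).
f0 = lambda x, v: -2.0 * x - 0.5 * v           # nominal dynamics (assumed)
g0 = 1.0
d_true = lambda t: 1.5 * math.sin(2.0 * t)     # "unknown" lumped disturbance

dt = 1e-3
x = v = 0.0
a_prev = f0_prev = u_prev = 0.0
residuals = []
for k in range(5000):
    t = k * dt
    d_hat = (a_prev - f0_prev - g0 * u_prev) if k else 0.0  # one-sample-delayed estimate
    u = -d_hat / g0                       # cancellation term only (no tracking feedback here)
    f0_now = f0(x, v)
    a = f0_now + g0 * u + d_true(t)       # plant acceleration this step
    if k:
        residuals.append(abs(d_true(t) - d_hat))  # what the delayed estimate misses
    a_prev, f0_prev, u_prev = a, f0_now, u
    v += a * dt
    x += v * dt
max_residual = max(residuals)
```

The uncancelled residual scales with ḋ·dt, which is why TDE is effective at high sampling rates and why the thesis pairs it with an adaptive robust term for what remains.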
20-May-21 | 20Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Experimental validation
Parameters:
- for ITSMC: η1 = 3, η2 = 15, η3 = 25, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1/γ2 = 5/7, k1 = 2, k2 = 55, α = 9/11
- for the adaptive gain: μ = 10, δ = 0.05, σ = 0.1
- for PID (case study 3): KP = 15, KI = 18, KD = 0.05; (other cases): KP = 35, KI = 28, KD = 0.1
Case study 1 (no load): xd (mm) = 8 sin(0.4πt)
Case study 2 (no load): xd (mm) = 8 sin(πt)
Case study 3 (no load): a multistep trajectory
Case study 4 (with load): xd (mm) = 8 sin(πt), in turn with a 2 kg and a 7 kg load
Fig 3.1 Performance response (proposed control vs. PID control) in case study 1
Fig 3.2 Performance response (proposed control vs. PID control) in case study 2
1 2 3 4 5 6 7 8
20-May-21 | 21Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Experimental validation
Fig 3.3 Performance response in case study 3
Fig 3.4 Performance response in case study 4
Table 3.1 A quantitative comparison of the tracking performances (controller: PID | ITSMC | Proposed)
  Sine 0.2 Hz, 8 mm: L2[e] 0.4102 | 0.1737 | 0.1574; L2[u] 0.9961 | 1.0270 | 0.9602
  Sine 1 Hz, 8 mm: L2[e] 1.1845 | 0.4996 | 0.3788; L2[u] 1.1284 | 1.1152 | 1.0900
  Multistep: L2[e] 0.8587 | 0.6846 | 0.6368; L2[u] 0.7670 | 0.7333 | 0.7247
  Sine 1 Hz, 8 mm, with 2 kg load: L2[e] 1.2578 | 0.8314 | 0.4779; L2[u] 1.1414 | 1.1506 | 1.1287
  Sine 1 Hz, 8 mm, with 7 kg load: L2[e] 1.5937 | 1.0827 | 0.5067; L2[u] 1.1613 | 1.1729 | 1.1584
1 2 3 4 5 6 7 8
20-May-21 | 22Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Summary
- Robust performance against uncertainties and disturbances
- Finite-time convergence of the tracking errors
- Reduced chattering phenomenon
- Elimination of the reaching phase
- Fast transient response
Next, the work is directed to control using the force properties.
1 2 3 4 5 6 7 8
20-May-21 | 23Presenter Cong Phat Vo
CONTENTS
1. Introduction
2. Modeling and platform setup based on the APAM configuration
3. Adaptive Gain ITSMC with TDE scheme for the position control
4. Adaptive Finite-Time Force Sensorless Control scheme with TDE
5. Model-free control for Optimal contact force in Unknown Environment
6. Optimized Human-Robot Interaction control using IRL
7. Force Sensorless Reflecting Control for BHTS
8. Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 24Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Proposed control idea
1. The proposed FSOB scheme obtains the force information with noise and static friction eliminated.
2. The FITSMC guarantees finite-time convergence; a new adaptive gain and the TDE are deployed to handle uncertainties and the chattering phenomenon online.
3. The reliability and superiority of the proposed control are demonstrated in different challenging work conditions.
1 2 3 4 5 6 7 8
20-May-21 | 25Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Control design
The FITSM surface is defined:
  σ = ef + ∫0..t [ κ1 ef + κ2 |ef|^v sign(ef) ] dτ
An adaptive switching gain law raises the robust gain with |σ| and shrinks it inside the boundary layer, avoiding gain overestimation and chattering.
The lumped disturbance is estimated from its one-sample-delayed value (force-based TDE), computed from the delayed control input and the delayed force-error dynamics.
The control law combines the TDE compensation d̂d(t), feedback on the desired force dynamics and the force error, and the adaptive robust term acting through [σ]^v sign(σ).
Force observer based on the LeD [40]: the force estimate τ̂e is shaped by second-order dynamics of the form
  d²τ̂e/dt² = −2ξω dτ̂e/dt − ω²(τ̂e − τe)
so that measurement noise and static friction are filtered out (ω, ξ as listed in the parameters).
[Block diagram of the proposed control approach: the reference trajectory and the environment configuration enter the integral terminal sliding surface and the control law with its adaptive law; the plant (APAM system plus environment) is wrapped by the force observer and the force-based time-delay estimator, feeding back τ̂e and d̂d]
Proof: see Appendix 4.1-4.3.
1 2 3 4 5 6 7 8
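One common realization of such a sensorless force estimate is a second-order low-pass shaping of the measured force channel. A toy sketch follows; the structure, bandwidth, damping, and signal are assumptions for illustration, not the thesis design or its parameter values.

```python
import math

# Second-order force-observer filter:
#   tau_hat'' = -2*xi*w*tau_hat' - w^2*(tau_hat - tau_meas)
w, xi = 200.0, 0.7          # observer bandwidth [rad/s] and damping (illustrative)
dt = 1e-4
clean = lambda t: 10.0 + 2.0 * math.sin(3.0 * t)        # true interaction force [N]
noisy = lambda t: clean(t) + 0.5 * math.sin(400.0 * t)  # high-frequency sensing noise
tau_hat = dtau_hat = 0.0
errs = []
for k in range(100000):                                 # 10 s of simulated time
    t = k * dt
    acc = -2.0 * xi * w * dtau_hat - w * w * (tau_hat - noisy(t))
    dtau_hat += acc * dt
    tau_hat += dtau_hat * dt
    if k > 50000:                                       # after the startup transient
        errs.append(abs(tau_hat - clean(t)))
max_err = max(errs)
```

With the filter bandwidth well above the force dynamics but well below the noise, the estimate tracks the slow force while strongly attenuating the 400 rad/s component; the residual stays far below the raw noise amplitude.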
20-May-21 | 26Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Experimental validation
Parameters:
- for ITSMC: η1 = 3, η2 = 15, η3 = 25, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1/γ2 = 5/7, k1 = 2, k2 = 55, v = 9/11
- for the adaptive gain: μ = 10, δ = 0.05, σ = 0.1
- for PID (scenario 3): KP = 15, KI = 18, KD = 0.05; (other scenarios): KP = 35, KI = 28, KD = 0.1
- for the FSOB observer: ω = 500 rad/s, ξ = 0.001, α1 = 20, α2 = 12, α3 = 5, α4 = 15
Scenario 1: τd (N) = 75 sin(0.2πt − π/2) + 625
Scenario 2: τd (N) = 75 sin(0.8πt − π/2) + 625
Scenario 3: a multistep trajectory with an LPF bandwidth of 0.2 rad/s
Fig 4.1 Performance response (proposed control vs. PID control) in Scenario 1
Fig 4.2 Performance response in Scenario 2
Experimental validation
1 2 3 4 5 6 7 8
20-May-21 | 27Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Fig 4.3 Performance response in Scenario 3
Fig 4.4 Control gain
Table 4.1 An RMSE comparison of the tracking error performances (PID | ISMC | FITSMC-TDE | Proposed)
  Sine 0.1 Hz: 2.1050 | 1.2152 | 0.9440 | 0.5158
  Sine 0.4 Hz: 3.2054 | 1.8045 | 2.0632 | 1.9807
  Multistep: 2.6508 | 2.3548 | 2.0618 | 1.7685
Discussion
- Improved control performance (fast response, high accuracy, and robustness)
- Guaranteed finite-time convergence
- Cancelled the lumped uncertainties online via the TDE solution and the switching gain in the FITSMC
1 2 3 4 5 6 7 8
20-May-21 | 28Presenter Cong Phat Vo
CONTENTS
1. Introduction
2. Modeling and platform setup based on the APAM configuration
3. Adaptive Gain ITSMC with TDE scheme for the position control
4. Adaptive Finite-Time Force Sensorless Control scheme with TDE
5. Model-free control for Optimal contact force in Unknown Environment
6. Optimized Human-Robot Interaction control using IRL
7. Force Sensorless Reflecting Control for BHTS
8. Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 29Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Proposed control idea
1. The RL method is used to learn the optimal desired force online.
2. The stability of the closed-loop control is theoretically demonstrated by the Lyapunov approach.
3. The effectiveness for robot interaction control in unknown environments is verified.
1 2 3 4 5 6 7 8
20-May-21 | 30Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Control design
The environment is modeled as a spring-damper system [81]:
  Fe(t) = −Ce ẋ(t) + Ke (xe − x(t))
The optimal impedance model maps the impedance error to the desired force, Z(er(t)) = Fd(t), with the linearized error dynamics:
  ėr(t) = A er(t) + B Fd(t)
The force control shapes the prescribed impedance error dynamics M ër(t) + C ėr(t) + K er(t) with a robust sliding-mode term (gains ρ1, ρ2, fractional power v) driven by the measured force Fe(t).
The position control in the task space (robot-specific inner loop):
  xf(s) = KPf ef(t) + KIf ∫0..t ef(τ) dτ
[Block diagram: a task-specific outer loop (RL-based approximation → Fd → prescribed impedance model → xr) is closed around a robot-specific inner loop (precise position control of the APAM system in contact with the environment); signals er, ef, Fe, Fd, φ]
1 2 3 4 5 6 7 8
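The inner/outer split can be sketched quasi-statically: the inner position loop is idealized (the robot reaches its commanded position each step) and the outer PI force law regulates contact force against a spring environment. The stiffness, gains, and setpoint below are illustrative, not the thesis values.

```python
# Outer-loop PI force regulation against a spring environment.
# Inner position loop is assumed ideal: x equals the commanded position each step.
Ke, xe = 2.0, 35.0          # environment stiffness [N/mm] and contact location [mm] (illustrative)
Fd = 8.0                    # desired contact force [N]
KPf, KIf = 0.05, 2.0        # PI gains on the force error (illustrative)
dt = 0.001
x = xe                      # start just touching the environment
integ = 0.0
for _ in range(5000):
    F = max(0.0, Ke * (x - xe))       # unilateral contact force
    ef = Fd - F
    integ += ef * dt
    x = xe + KPf * ef + KIf * integ   # position reference from the PI force law
F_final = max(0.0, Ke * (x - xe))
```

The integral term is what absorbs the unknown environment stiffness: at convergence the penetration settles to Fd/Ke regardless of the proportional gain.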
20-May-21 | 31Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
LQR solution
The main objective of this work is to determine the optimal desired force Fd, i.e., the minimum force that minimizes the position error er:
  Fd(t) = κe er(t)
The discounted cost function:
  Js(er(t), Fd(t)) = ∫t..∞ e^(−γ(τ−t)) [ erᵀ(τ) S er(τ) + Fdᵀ(τ) R Fd(τ) ] dτ
The algebraic Riccati equation:
  AᵀPA + S − P + AᵀPB κe = 0
The feedback control gain:
  κe = −(BᵀPB + R)⁻¹ BᵀPA
This is an offline solution: it requires the model (A, B) of the environment-error dynamics.
Implementation procedure: LQR solution → RL solution → IRL approximation
1 2 3 4 5 6 7 8
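The offline LQR recipe (iterate the Riccati relation, then take the gain κe = −(BᵀPB + R)⁻¹BᵀPA as read from the slide) can be checked numerically in the scalar case. The A, B values below are assumed stand-ins for the identified error dynamics, not the thesis values.

```python
# Offline LQR: iterate the Riccati recursion to a fixed point, then form the gain.
A, B = 0.96, 0.05            # discretized error dynamics (assumed, scalar)
S, R = 1.0, 10.0             # state and force weights
P = 0.0
for _ in range(2000):
    kappa = -(B * P * B + R) ** -1 * B * P * A
    P = S + A * P * A + A * P * B * kappa      # Riccati recursion
kappa_e = -(B * P * B + R) ** -1 * B * P * A   # feedback control gain
residual = S + A * P * A + A * P * B * kappa_e - P
closed_loop = A + B * kappa_e                  # closed-loop pole
```

At the fixed point the Riccati residual vanishes and the closed-loop pole sits inside the unit circle, so Fd = κe·er both minimizes the cost and stabilizes the error dynamics.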
20-May-21 | 32Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
RL solution
The main objective remains to determine the optimal desired force Fd(t) = κe er(t), but the offline LQR solution is impossible for all time when the environment is unknown.
The discounted value function:
  Vs(er(t)) = ∫t..∞ e^(−γ(τ−t)) r(er(τ), Fd(τ)) dτ, where r(er, Fd) = erᵀ S er + Fdᵀ R Fd
The Hamilton-Jacobi-Bellman equation:
  min over Fd of [ dVs(er(t))/dt − γ Vs(er(t)) + r(t) ] = 0
The new algebraic Riccati equation:
  erᵀ(t) (AᵀP + PA + S − γP) er(t) + 2 erᵀ(t) P B Fd(t) + Fdᵀ(t) R Fd(t) = 0
The Q-value function:
  Q(er(t), Fd(t)) = Vs(er(t)) = XᵀHX, with X = [erᵀ Fdᵀ]ᵀ
The optimal desired force:
  Fd(t) = −H22⁻¹ H21 er(t)
This is an online solution: the off-policy RL is used to find the solution of the given new ARE.
Implementation procedure: LQR solution → RL solution → IRL approximation
1 2 3 4 5 6 7 8
20-May-21 | 33Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
IRL approximation
The Q-function is approximated while minimizing the temporal-difference (TD) error.
The approximator of the Q-value function:
  Q̂(er, Fd) = Σ(i = 1..N) θi φ̄i(er, Fd)
with Gaussian basis functions
  φi(er, Fd) = exp( −(1/2) [er − ci_r, Fd − ci_d] Σi⁻¹ [er − ci_r, Fd − ci_d]ᵀ )
normalized as φ̄i = φi / Σ(j = 1..N) φj.
The reward (utility) function:
  r(t) = erᵀ(t) S er(t) + Fdᵀ(t) R Fd(t)
The Q-function update law based on IRL:
  Q(er(t), Fd(t)) = ∫t..t+Δt e^(−γ(τ−t)) r(τ) dτ + e^(−γΔt) Q(er(t + Δt), Fd(t + Δt))
The parameter update law is a normalized gradient descent that drives θ to cancel the TD error between Q̂ and this target; the optimal desired force Fd(t) = κe er(t) is then recovered from the minimizing argument of Q̂.
1 2 3 4 5 6 7 8
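The normalized Gaussian approximator can be exercised on a known quadratic Q-function. For clarity the TD target is replaced here by the true Q; the centers, width, learning rate, and target are illustrative assumptions, not the thesis configuration.

```python
import math

# Normalized Gaussian basis approximation of a quadratic Q-function (illustrative).
centers = [(e, f) for e in (-1.0, 0.0, 1.0) for f in (-1.0, 0.0, 1.0)]
beta = 1.0                                        # basis width (assumed)

def features(e, F):
    phi = [math.exp(-((e - ce) ** 2 + (F - cf) ** 2) / (2 * beta ** 2))
           for ce, cf in centers]
    s = sum(phi)
    return [p / s for p in phi]                   # normalized basis (sums to 1)

q_true = lambda e, F: e * e + 0.1 * F * F         # stand-in for X'HX
theta = [0.0] * len(centers)
samples = [(0.1 * i - 1.0, 0.2 * j - 1.0) for i in range(21) for j in range(11)]
alpha = 0.5                                       # LMS step (stable since ||phi_bar|| <= 1)

def sse():
    return sum((sum(t * p for t, p in zip(theta, features(e, F))) - q_true(e, F)) ** 2
               for e, F in samples)

loss0 = sse()
for _ in range(200):                              # cyclic LMS sweeps over the grid
    for e, F in samples:
        phi = features(e, F)
        err = sum(t * p for t, p in zip(theta, phi)) - q_true(e, F)
        theta = [t - alpha * err * p for t, p in zip(theta, phi)]
loss1 = sse()
```

Normalization makes the basis a partition of unity, which keeps the per-sample update contraction-safe for any step size below 2 and lets a small number of centers cover the operating range.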
20-May-21 | 34Presenter Cong Phat Vo
Fig 5.1 RL in unknown environments during the learning process
Fig 5.2 Learning curves
Parameters:
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position/force control: KPf = 0.001, KIf = 0.02, λ = 0.2, ηΔ = 0.1, ρ1 = 0.5, ρ2 = 20, v = 9/11, α = 5/7
- Environment: Ce = 0.5 Ns·mm⁻¹, Ke = 2 N·mm⁻¹, r0 = 165 mm, xd = 39.917 mm, xe = 35 mm
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 35Presenter Cong Phat Vo
Fig 5.3 Desired force/position tracking response
Three steps are required to complete the experiment:
1. identification of the environment;
2. obtaining the ARE solution using the environmental estimate (LQR solution);
3. the complete experiment for the position/force controller (IRL approximation).
The IRL approximation achieves near-optimal performance without knowledge of the environment dynamics.
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 36Presenter Cong Phat Vo
CONTENTS
1. Introduction
2. Modeling and platform setup based on the APAM configuration
3. Adaptive Gain ITSMC with TDE scheme for the position control
4. Adaptive Finite-Time Force Sensorless Control scheme with TDE
5. Model-free control for Optimal contact force in Unknown Environment
6. Optimized Human-Robot Interaction control using IRL
7. Force Sensorless Reflecting Control for BHTS
8. Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 37Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Proposed control idea
1. Find the optimal parameters of the impedance model to assure motion tracking and to assist the human with minimum effort in performing the task in real time.
2. The stability of the closed-loop control is theoretically demonstrated.
3. The unknown human dynamics, as well as the optimal overall human-robot system efficiency, is verified in experimental trials.
1 2 3 4 5 6 7 8
20-May-21 | 38Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design (robot-specific inner loop and task-specific outer loop)
The control torque in the task space combines PID-like feedback on the tracking error eh (gains Ks, μ1, μ2) with a neural-network compensation of the unknown robot nonlinearities and a bounding robust term (gains KB, K̄).
An approximation of the continuous (unknown) function by an NN:
  Ẑq = Ŵᵀ ξ(q), with weight update laws of the form dŴ/dt = F ( ξ ehᵀ − k ||eh|| Ŵ )
The prescribed impedance model:
  θm = Kh / (Md s² + Bd s + Kd) · τh
[Block diagram: the human operator applies τh to the haptic interface based on the APAM; the inner loop runs the torque control law with the NN functional approximation; the outer loop feeds the tracking error ed through the prescribed impedance model, whose gain Kh is adapted by the IRL-based optimal law]
1 2 3 4 5 6 7 8
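The prescribed impedance model θm = Kh/(Md s² + Bd s + Kd)·τh can be sanity-checked with a step response. The parameter values follow the slide's list as read here (decimal placement assumed), and the constant torque input is illustrative.

```python
# Step response of the prescribed impedance model
#   theta_m = K_h / (M_d s^2 + B_d s + K_d) * tau_h   (decimal placement assumed).
Md, Bd, Kd, Kh = 1.0, 7.18, 15.76, 0.046
tau_h = 1.0                      # constant human torque input (illustrative)
dt = 1e-4
th = dth = 0.0
for _ in range(100000):          # 10 s, Euler integration
    ddth = (Kh * tau_h - Bd * dth - Kd * th) / Md
    dth += ddth * dt
    th += dth * dt
theta_ss = Kh * tau_h / Kd       # analytic steady state of the second-order model
```

With these values the model is slightly underdamped and settles within a couple of seconds, so the outer IRL loop effectively tunes how much motion a given human torque prescribes at steady state (Kh/Kd).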
20-May-21 | 39Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design (task-specific outer loop): IRL based on the LQR solution
The human transfer function [99] maps the operator torque τh to the tracking error ed; it is modeled as a first-order system G(s) with gains κp, κd and pole ke.
A new augmented state X stacks the impedance-model error dynamics and the human model, giving
  Ẋ = A X + B ue, with the feedback ue = −K X
[Block diagram as on the previous slide: prescribed impedance model, NN functional approximation, and IRL-based optimal law around the APAM haptic interface]
1 2 3 4 5 6 7 8
20-May-21 | 40Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design
The performance index (infinite-horizon quadratic cost) to minimize:
  Jm = ∫t..∞ ( edᵀ Qd ed + τhᵀ Qh τh + ueᵀ R ue ) dτ
A new augmented state (as above): Ẋ = A X + B ue, with ue = −K X
RL solution: solve the algebraic Riccati equation [83]
  0 = AᵀP + PA + Q − P B R⁻¹ Bᵀ P
An optimal feedback control:
  ue = argmin over ue of V(t, X(t), ue) = −R⁻¹ Bᵀ P X
For a stabilizing gain K, P is the real symmetric positive-definite solution of the Lyapunov equation
  (A − BK)ᵀ P + P (A − BK) = −(Q + KᵀRK)
and the value function is
  V(X(t)) = ∫t..∞ Xᵀ (Q + KᵀRK) X dτ = XᵀPX
IRL solution: the IRL Bellman equation for a time interval ΔT > 0,
  V(X(t)) = ∫t..t+ΔT Xᵀ (Q + KᵀRK) X dτ + V(X(t + ΔT))
can be written as
  X(t)ᵀ P X(t) = ∫t..t+ΔT Xᵀ (Q + KᵀRK) X dτ + X(t + ΔT)ᵀ P X(t + ΔT)
Proof of convergence: see Appendix 6.2.
1 2 3 4 5 6 7 8
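The IRL Bellman relation can be verified numerically for a scalar system, where the Lyapunov equation has a closed form. The system, weights, and policy below are an illustrative stable choice.

```python
import math

# Scalar check of the IRL Bellman identity for xdot = (a - b*k) x, V(x) = P x^2.
# Lyapunov equation: 2 (a - b*k) P = -(q + k^2 r)  =>  P = (q + k^2 r) / (2 (b*k - a)).
a, b, q, r, k = -1.0, 1.0, 1.0, 1.0, 1.0      # illustrative stable choice
acl = a - b * k                               # closed-loop pole (here -2)
P = (q + k * k * r) / (2.0 * (b * k - a))     # value-function weight (here 0.5)
x0, t, DT = 1.3, 0.4, 0.7
x_t = x0 * math.exp(acl * t)
x_tT = x0 * math.exp(acl * (t + DT))
# Closed-form integral of (q + k^2 r) x(tau)^2 over [t, t + DT]:
integral = (q + k * k * r) * (x_t ** 2) * (1 - math.exp(2 * acl * DT)) / (-2 * acl)
lhs = P * x_t ** 2
rhs = integral + P * x_tT ** 2
```

The identity holds for any interval ΔT, which is exactly what lets the online algorithm estimate P from measured trajectory segments without knowing A.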
20-May-21 | 41Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 6.1 Continuous-time linear policy iteration algorithm (online implementation)
1. Initialization: start with an admissible control input u0 = −K1 X (P0 = 0, i = 1, A − BK1 stable).
2. Policy evaluation: given the control policy ui, find Pi from the off-policy IRL Bellman equation
   X(t)ᵀ Pi X(t) = ∫t..t+ΔT Xᵀ (Q + (Ki)ᵀ R Ki) X dτ + X(t + ΔT)ᵀ Pi X(t + ΔT)
   solving for the cost parameters pi by least squares over N sampled intervals.
3. Policy improvement: update the control policy ui+1 = −R⁻¹ Bᵀ Pi X.
4. If ||pi − pi−1|| ≤ ε, stop; otherwise set i → i + 1 and repeat from step 2.
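The policy iteration of Fig 6.1 can be sketched in the scalar case, where each policy evaluation step is a one-line Lyapunov solve (this is the model-based Kleinman form of the same iteration; the IRL version replaces the solve with least squares over measured data).

```python
import math

# Policy iteration (Kleinman) for the scalar CT LQR with a = b = q = r = 1.
# ARE: 2 a P + q - P^2 b^2 / r = 0  =>  P* = 1 + sqrt(2), and K* = b P* / r = P*.
a = b = q = r = 1.0
K = 2.0                                   # admissible initial policy: a - b*K = -1 < 0
for _ in range(20):
    # Policy evaluation: solve 2 (a - b*K) P = -(q + K^2 r) for P.
    P = (q + K * K * r) / (2.0 * (b * K - a))
    # Policy improvement: K = r^-1 * b * P.
    K = b * P / r
K_star = 1.0 + math.sqrt(2.0)             # analytic optimum for comparison
```

Each improvement step preserves admissibility (the closed-loop pole stays negative), and the iteration converges quadratically, reaching machine precision in a handful of steps.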
20-May-21 | 42Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 6.2 The tracking error between the trajectory of the robot system and the prescribed impedance model
Fig 6.3 Reference trajectory versus the state of the robot system during and after learning
Parameters (simulation):
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: Ks = 20, F = 10, G = 1, μ1 = μ2 = 5, k = 0.1, Z̄ = 100, κp = 20, κd = 10, ke = 0.18 s
- Prescribed impedance model: θd = 10 sin(0.2πt), Md = 1 kg, Bd = 7.18 Ns·mm⁻¹, Kd = 15.76 N·mm⁻¹, Kh = 0.046
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 43Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 6.4 The output trajectory and impedance model in the inner loop
Fig 6.5 The desired trajectory and output trajectory in the outer loop
Fig 6.6 Interaction force
Discussion (experiment, validation results)
- Nearly perfect performance of the human-robot cooperation
- A small deviation between the desired and actual trajectories
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller
1 2 3 4 5 6 7 8
20-May-21 | 44Presenter Cong Phat Vo
CONTENTS
1. Introduction
2. Modeling and platform setup based on the APAM configuration
3. Adaptive Gain ITSMC with TDE scheme for the position control
4. Adaptive Finite-Time Force Sensorless Control scheme with TDE
5. Model-free control for Optimal contact force in Unknown Environment
6. Optimized Human-Robot Interaction control using IRL
7. Force Sensorless Reflecting Control for BHTS
8. Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 45Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Proposed control idea
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed.
2. The stability and finite-time convergence characteristics of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified in different working conditions.
1 2 3 4 5 6 7 8
20-May-21 | 46Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Control design
The external torque estimation (for each side i = m, s): an adaptive force/torque observer (AFOB) builds an auxiliary variable from the generalized momentum and the applied input, and outputs the external-torque estimate τ̂id; the observer gain κi adapts online so that sensorless force information is obtained without amplifying noise.
The modified FNTSM surface design:
  si = ėi + λi φ(ei)
  φ(ei) = [ei]^(pi/qi) if s̄i = 0, or s̄i ≠ 0 and |ei| ≥ ςi;  φ(ei) = ζ1 ei + ζ2 ei² sign(ei) otherwise
(a piecewise switch that avoids the singularity near ei = 0)
The torque control design in FNTSMC style combines the nominal dynamics (Mi, Ci, Ki) along the reference θir, the estimate τ̂id, and fast robust terms of the form ρ1i si + ρ2i [si]^(νi) sign(si).
[Block diagram: operator → master controller and APAM master system with its estimator ⇄ communication channel ⇄ slave controller and APAM slave system with its estimator → environment; an impedance filter shapes the reflected estimate τ̂sd fed back to the master, while τ̂md compensates the operator side]
1 2 3 4 5 6 7 8
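The bilateral architecture in the diagram can be sketched with a toy 1-DOF loop: the slave PD-tracks a scripted master motion, and the environment torque is reflected through a first-order impedance filter. All dynamics, gains, and the contact model are illustrative assumptions, not the thesis design.

```python
import math

# Toy 1-DOF bilateral loop: slave tracks a scripted master; contact torque is
# reflected through a first-order impedance filter (all values illustrative).
Js = 0.01                   # slave inertia
Kp, Kd = 50.0, 1.0          # slave PD tracking gains
Ke = 200.0                  # environment stiffness, contact begins at qs > 0.5
Kf, lam = 0.5, 50.0         # reflection scale and impedance-filter pole [1/s]
dt = 1e-4
qs = dqs = tau_ref = 0.0
free_errs, refl = [], []
for kstep in range(200000):                         # 20 s
    t = kstep * dt
    qm = 0.6 * math.sin(0.5 * t)                    # master position (scripted operator)
    dqm = 0.3 * math.cos(0.5 * t)
    tau_e = Ke * max(0.0, qs - 0.5)                 # unilateral environment torque
    tau_ref += lam * (Kf * tau_e - tau_ref) * dt    # filtered reflected torque
    ddqs = (Kp * (qm - qs) + Kd * (dqm - dqs) - tau_e) / Js
    dqs += ddqs * dt
    qs += dqs * dt
    if t > 1.0 and abs(qm) < 0.4:                   # free-space phase, past transients
        free_errs.append(abs(qm - qs))
    refl.append(tau_ref)
max_free_err = max(free_errs)
max_reflected = max(refl)
```

Even this crude loop exhibits the two transparency requirements of the chapter: near-zero position error in free space, and a nonzero reflected torque whenever the slave is in contact.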
20-May-21 | 47Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 7.1 Torque estimation performances
Parameters:
- Proposed control: λi = 8, pi = 5, qi = 7, ςi = 10⁻⁴, νi = 9/11, εi = 0.1, ρ1i = 10, ρ2i = 2
- PID control: KP = 38, KI = 25, KD = 0.15
- Observers: for the NDO, Lh = Le = 10; for the RTOB, β = 350 rad/s; for the AFOB, κi(0) = 0.1, λi = 10, ηi = 0, m = 0.1, a = 10
Scenarios:
- The free-space situation: θd(t) = 5π(1 − 0.18t) sin(0.72πt)
- The contact and recovery situation: soft environment Ke = 500 N·mm⁻¹, Be = 4 Ns·mm⁻¹; stiff environment Ke = 2800 N·mm⁻¹, Be = 20 Ns·mm⁻¹
The RMSEs associated with the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 48Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 7.2 Position tracking performances
Fig 7.3 Position tracking error performances
Fig 7.4 Transparency performance in the "soft" contact
Fig 7.5 Transparency error in the soft contact
Fig 7.6 Transparency performance in the "stiff" contact
Fig 7.7 Transparency error in the stiff contact
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 49Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Summary
- Finite-time convergence of the tracking errors
- Elimination of the uncertainties and the noise effect in the obtained force sensing via the new AFOB
- Robust performance against uncertainties and disturbances in the system dynamics
- Stability of the overall system and transparency performance attained simultaneously
- Fast transient response
The great transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and across different working conditions.
1 2 3 4 5 6 7 8
20-May-21 | 50Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Introduction
4
5
6
C
H
A
P
T
E
R
1
7
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Conclusion and Future works78
1 2 3 4 5 6 7 8
20-May-21 | 51Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions (uncertainties and disturbances are handled).
- Validated the significant effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding highly effective position control.
- Achieved good transparency, with force feedback and position tracking attained simultaneously, by the new adaptive force estimation combined with separate FNTSMC schemes.
1 2 3 4 5 6 7 8
20-May-21 | 52Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Future works
- To improve the control performance in the joint space, the robot system model can be improved by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both the master and slave sides; the computational complexity should also be reduced by limiting the data size.
- Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device for the rehabilitation field with a safer solution.
1 2 3 4 5 6 7 8
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
1 2 3 4 5 6 7 8
20-May-21 | 9Presenter Cong Phat Vo
1 INTRODUCTION
PAM actuator
Fig 12 (a) Structure and (b) working principle of the PAM
KNEXO robot [22] Wearable Haptic Device [23]2-joint manipulator [16 17] Forcepsrsquo manipulators [20]
Rubber hose Fitting connectionLoose-weave
fibersUn-Pressured state
Pressured state
Stroke
(a) (b)
Rubber hoseLoose-weave
fibers
Fitting
connection
1 2 3 4 5 6 7 8
20-May-21 | 10Presenter Cong Phat Vo
1 INTRODUCTION
Research Objectives
Propose Position
Tracking Controller
An integral terminal-style SMC is designed with the combination of an adaptive gain and TDE technique
0102
Propose Model-free Control for
slave-environment interaction03
Propose Force Sensorless refleting
control for bilateral haptic teleoperation
Propose Finite-Time
Force Controller
An adaptive gain Fast ITSMC-TDE scheme is established with a friction-free disturbance technique in finite-time
Using IRL to optimize the prescribed impedance model parameters to assist minimum human effort
Propose optimal impedance for
human-master interaction
05
04Applying RL approximation method to learning the optimal contact force (minimal) online
The new AFOB is proposed to estimate the force signalsAchieving great transparency (position force feedback)
1 2 3 4 5 6 7 8
20-May-21 | 11Presenter Cong Phat Vo
CONTENTS
Adaptive Gain ITSMC with TDE scheme
Adaptive Finite-Time Force Sensorless Control scheme with TDE
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction control using IRL
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
3
4
5
6
C
H
A
P
T
E
R
1
7
8
Modeling and platform setup based on the APAM configuration2
1 2 3 4 5 6 7 8
20-May-21 | 12Presenter Cong Phat Vo
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Mathematical modeling
n turns
nπD
bL
1 t
urn
s
D
2 2
2
3
4
dV dV d b LF P P P
dL dL d n
minus= minus = minus = minus
The tensile force of each PAM [7]
Fig 21 A cross-sectional view of the simplified geometric model
The motion equation of the system
1 2Mx Bx F F+ = minus +
Parameter Description Value UnitPs Supply pressure 65 bar
B Viscous damping coefficient 50 Nsmm
M Mass of the plate 05 Kg
L0 Original length of the PAM 175 mm
n Number of turns of the thread 104
b Thread length 200 mm
Table 21 Parameters of the studied PAM
The motion equation of the system
0 0( ) ( )x f x x g x u d= + +
where f0(x) g0(x) are the nominal value of f(x)
g(x) d is a total of unknown disturbances
( )2 2 2
00 0
2 2
0 0
33 ( ) ( )
2
( ( ) ( )) ( ( ) ( ))
L x bP L x Bxf x g x
n M M n M
d f x x f x x g x g x u
+ minus = minus minus = = minus + minus +
1 2 3 4 5 6 7 8
20-May-21 | 13Presenter Cong Phat Vo
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Experimental modeling
The approximated PAM model based on its
pressure and contraction strain
1 0
1 0
1 0
( ) ( ) ( )
( )
( ) ( 1 2)
( )
k k k k k k k k k k
k k k k k
k k k k k
k k k k k
M b P k P f P
b P b P b
k P k P k k
f P f P f
+ + =
= +
= + = = +
Parameter Value Units
Spring
elements
kk1 11732 Nmbar
kk0 68 Nm
Damping
elements
bk1 962 Nmsbar
bk0 085 Nms
Force
elements
fk1 9122 Nbar
fk0 473 N
0-100
6
0
100
55
200
Fo
rce
[N
]
Contraction length
[mm]
300
400
4
Pressure [bar]
500
103 2 151
Fig 23 Experimental model for certain separate pressure
Table 22 List of APAM model parameters
Fig 22 Block diagram of the experimental system: PAM 1 and PAM 2 act on a plate over a pulley carrying the load; a linear encoder returns the position signal through a PCI-QUAD04 card, while a PCI-1711 card issues the control signal to the proportional pressure regulator valves (pressure signal)
20-May-21 | 14Presenter Cong Phat Vo
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Platform setup
Fig 25 The APAM-based (a) master and (b) slave subsystem configuration
(a) Master: control stick, optical encoder, pneumatic muscle actuators, loadcells, interactive loadcell, pulley, and proportional pressure regulator valves (front and back views).
(b) Slave: the APAM system (PAM, pulley, proportional valve, loadcell) contacts a spring-damper environment; position, control, stiffness/damping, and contact force signals are exchanged with a PC through PCI cards and a control box.
20-May-21 | 15Presenter Cong Phat Vo
2 MODELING OF THE BILATERAL TELEOPERATION SYSTEM
Fig 26 Photograph of the experimental apparatus of the overall BHTS: master robot with control stick, slave robot, and environment

Table 23 Specifications of the experimental devices

Device | Description
PAM | Festo MAS-10-200N-AA-MC-O; nominal length 200 mm; inside diameter 10 mm; max pressure 8 bar
PPR Valve | Festo VPPM-8L-L-1-G14-0L8H; max pressure 8 bar
DAQ Card | ADVANTECH PCI-1711 (AI/AO, 12-bit resolution); MEASUREMENT PCI-QUAD04
Linear encoder | Type Rational WTB5-0500MM; resolution 0.005 mm
Platform setup
20-May-21 | 16Presenter Cong Phat Vo
CONTENTS

1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme for the position control
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
20-May-21 | 17Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Proposed control idea
1 The first time the AITSMC-TDE scheme is proposed for a PAM system
2 The stability of the controlled system is theoretically analyzed
3 The control effectiveness is verified by experimental trials in different challenging working conditions (comparison with several controllers)
20-May-21 | 18Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
A terminal sliding surface is chosen:
s2 = ė1 + k1e1 + k2e1^α

A proposed integral sliding surface is designed:
s(t) = s2(t) − s2(t0) − ∫t0→t [k1e2 + k2 d(e1^α)/dτ + f0(x1, x2) + g0(x1)un − ẍ1d + d̂] dτ

A robust control signal can be redesigned:
us(t) = −g0^−1(x1)[η̂3 sign(s) + κ4 s]

An adaptive law for the robust gain:
dη̂3/dt = μ|s| sign(|s| − δ), with η̂3(t0) = η̂3(0)

Proof: See more detail in Appendix 32, 33
(Control structure: the reference trajectory x1d, x2d is transferred to error coordinates; the integral terminal sliding surface s, built on the terminal sliding surface, drives the nominal control signal un and the robust control signal us with the adaptation law for η̂3, while the time-delay estimator supplies d̂; their sum u is applied to the PAM system with states x1, x2.)
Control design
20-May-21 | 19Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
The time-delay estimation of the lumped disturbance:
d̂ = ẋ2(t − td) − f0(x1(t − td), x2(t − td)) − g0(x1(t − td))u(t − td)

The control law can be rewritten:
u(t) = un + us − g0^−1(x1)d̂
Proof See more detail in Appendix 31 34-35
Control design
The auxiliary variables are defined as
e1 = x1 − x1d,  e2 = x2 − ẋ1d

The nominal control signal can be designed as
un = g0^−1(x1)[ẍ1d − f0(x1, x2) − k1e2 − k2 d(e1^α)/dt]
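The TDE term replaces a disturbance model with one-sample-old measurements: whatever the nominal model failed to predict at t − td is assumed to still act at t. A minimal sketch (Python; the toy plant and gains are hypothetical, not the PAM model):

```python
from collections import deque

class TimeDelayEstimator:
    """d_hat(t) = xdot2(t - td) - f0(t - td) - g0(t - td) * u(t - td):
    the lumped disturbance is estimated from signals one delay step old."""
    def __init__(self, delay_steps=1):
        self.buf = deque(maxlen=delay_steps)

    def update(self, xdot2, f0, g0, u):
        if len(self.buf) == self.buf.maxlen:
            xd2, f, g, uu = self.buf[0]      # oldest sample, i.e. t - td
            d_hat = xd2 - f - g * uu
        else:
            d_hat = 0.0                      # no estimate until buffer fills
        self.buf.append((xdot2, f0, g0, u))
        return d_hat

# Toy check: a constant disturbance d on xdot2 = f0 + g0*u + d is recovered
tde, d_true = TimeDelayEstimator(), 0.7
est = 0.0
for k in range(5):
    u = 0.1 * k
    f0, g0 = -0.2, 1.5                       # hypothetical nominal dynamics
    xdot2 = f0 + g0 * u + d_true             # "measured" acceleration
    est = tde.update(xdot2, f0, g0, u)
print(est)
```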
20-May-21 | 20Presenter Cong Phat Vo
Parameters
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Experimental validation
for ITSMC: η1 = 3, η2 = 15, η3 = 25, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1 = γ2 = 5/7, k1 = 2, k2 = 55, α = 9/11
for adaptive gain: μ = 10, δ = 0.05, σ = 0.1
for case study 3: KP = 15, KI = 18, KD = 0.05
for the other cases: KP = 35, KI = 28, KD = 0.1
Proposed control
PID control
Fig 31 Performance response in case study 1
Fig 32 Performance response in case study 2
Case study 1 (no load): xd (mm) = 8sin(0.4πt)
Case study 2 (no load): xd (mm) = 8sin(πt)
Case study 3 (no load): a multistep trajectory
Case study 4 (with load): xd (mm) = 8sin(πt), in turn with 2-kg and 7-kg loads
20-May-21 | 21Presenter Cong Phat Vo
Fig 33 Performance response in case study 3
Fig 34 Performance response in case study 4
Table 31 A quantitative comparison between the tracking performances

Trajectory | Metric | PID | ITSMC | Proposed
Sine 0.2 Hz, 8 mm | L2[e] | 0.4102 | 0.1737 | 0.1574
Sine 0.2 Hz, 8 mm | L2[u] | 0.9961 | 1.0270 | 0.9602
Sine 1 Hz, 8 mm | L2[e] | 1.1845 | 0.4996 | 0.3788
Sine 1 Hz, 8 mm | L2[u] | 1.1284 | 1.1152 | 1.0900
Multistep | L2[e] | 0.8587 | 0.6846 | 0.6368
Multistep | L2[u] | 0.7670 | 0.7333 | 0.7247
Sine 1 Hz, 8 mm, 2-kg load | L2[e] | 1.2578 | 0.8314 | 0.4779
Sine 1 Hz, 8 mm, 2-kg load | L2[u] | 1.1414 | 1.1506 | 1.1287
Sine 1 Hz, 8 mm, 7-kg load | L2[e] | 1.5937 | 1.0827 | 0.5067
Sine 1 Hz, 8 mm, 7-kg load | L2[u] | 1.1613 | 1.1729 | 1.1584
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Case study 1 (no load): xd (mm) = 8sin(0.4πt)
Case study 2 (no load): xd (mm) = 8sin(πt)
Case study 3 (no load): a multistep trajectory
Case study 4 (with load): xd (mm) = 8sin(πt), in turn with 2-kg and 7-kg loads
Experimental validation
20-May-21 | 22Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Summary
Robust performance against uncertainties and disturbances
Finite-time convergence of the tracking errors
Reducing the chattering phenomenon
Eliminating the reaching phase
Fast transient response
Future work will be directed to control using the force properties
20-May-21 | 23Presenter Cong Phat Vo
CONTENTS

1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
20-May-21 | 24Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
1 The proposed FSOB scheme obtains force information with noise and static friction eliminated
2 The FITSMC guarantees finite-time convergence; a new adaptive gain and TDE are deployed online to handle uncertainties and the chattering phenomenon
3 The reliability and superiority of the proposed control are demonstrated in different challenging working conditions
20-May-21 | 25Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
The FITSM surface is defined:
σ(t) = ef(t) + ∫0→t [ρ1ef + ρ2|ef|^v sign(ef)] dτ

The control law can be rewritten:
uc(t) = d̂(t) + Kd^−1[ρ1ef + ρ2|ef|^v sign(ef)] + ρ4σ + ρ̂5|σ|^v sign(σ)

where ρ̂5 is updated online by an adaptive switching-gain law driven by |σ|, d̂(t) is the lumped disturbance estimated by TDE from one-sample-delayed signals, and the contact force is reconstructed without a force sensor by the force observer based on the LeD [40].
(Control structure: the reference trajectory and the environment configuration feed the integral terminal sliding surface; the control law combines the adaptive law, the force-based time-delay estimator, and the force observer around the APAM system and its environment.)
Proof See in Appendix 41 ndash 43
Control design
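The transcript only partially preserves the adaptive law, so the sketch below shows a generic adaptive switching-gain rule in the same spirit (Python; the μ, ε, σ values are hypothetical): the gain grows while the sliding variable stays outside a small boundary layer and leaks away inside it, which suppresses chattering without overestimating the gain.

```python
def adaptive_gain_step(eta, s, mu=10.0, eps=0.1, sigma=5.0, dt=1e-3):
    """One Euler step of a generic adaptive switching-gain law:
    grow with mu*|s| outside the boundary layer |s| <= eps,
    decay with rate sigma inside it (anti-windup leakage)."""
    eta_dot = mu * abs(s) if abs(s) > eps else -sigma * eta
    return max(eta + dt * eta_dot, 0.0)

# The gain ramps up while |s| is large ...
eta = 0.0
for _ in range(100):
    eta = adaptive_gain_step(eta, s=0.5)
ramped = eta
# ... and decays once the trajectory enters the boundary layer
for _ in range(100):
    eta = adaptive_gain_step(eta, s=0.0)
print(ramped, eta)
```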
20-May-21 | 26Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
for ITSMC: η1 = 3, η2 = 15, η3 = 25, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1 = γ2 = 5/7, k1 = 2, k2 = 55, v = 9/11
for adaptive gain: μ = 10, δ = 0.05, σ = 0.1
for case study 3: KP = 15, KI = 18, KD = 0.05
for the other cases: KP = 35, KI = 28, KD = 0.1
Proposed control
PID control
Fig 42 Performance response in Scenario 2
Scenario 1: τd (N) = 7.5sin(0.2πt − π/2) + 6.25
Scenario 2: τd (N) = 7.5sin(0.8πt − π/2) + 6.25
Scenario 3: a multistep trajectory with an LPF bandwidth of 0.2 rad/s
Observers: for FSOB ω = 500 rad/s, ξ = 0.001, α1 = 20, α2 = 12, α3 = 5, α4 = 15
Fig 41 Performance response in Scenario 1
Parameters
Experimental validation
20-May-21 | 27Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Fig 43 Performance response in Scenario 3
Fig 44 Control gain
Table 41 An RMSE comparison between the tracking error performances

Trajectory | PID | ISMC | FITSMC-TDE | Proposed
Sine 0.1 Hz | 2.1050 | 1.2152 | 0.9440 | 0.5158
Sine 0.4 Hz | 3.2054 | 1.8045 | 2.0632 | 1.9807
Multistep | 2.6508 | 2.3548 | 2.0618 | 1.7685
Experimental validation
Discussion
- Improved control performance: fast response, high accuracy, and robustness
- Guaranteed finite-time convergence
- Cancels the lumped uncertainties online via the TDE solution and the switching gain in the FITSMC
20-May-21 | 28Presenter Cong Phat Vo
CONTENTS

1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
20-May-21 | 29Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
1 The RL method is used to learn the optimal desired force online
2 The stability of the closed-loop control is theoretically demonstrated by the Lyapunov approach
3 The approach is verified to be effective for robot interaction control in unknown environments
Proposed control idea
20-May-21 | 30Presenter Cong Phat Vo
The environment is modeled as a spring-damper system [81]:
Fe(t) = −Ce ẋ(t) + Ke (xe − x(t))

The optimal impedance model Z(er(t)) = Fd(t) is prescribed through
M ër(t) + C ėr(t) + K er(t) = −ρ1(·) − ρ2|·|^v sign(·) + Fe(t)

The position control in task space (force control outer loop):
xf(t) = KPf ef(t) + KIf ∫0→t ef(τ) dτ
(Structure: a robot-specific inner loop performs precise position control of the APAM system; a task-specific outer loop runs the prescribed impedance model and the force control on ef to produce xf, while the RL-based approximation supplies the desired force Fd for the environment configuration.)
Control design
ėr(t) = A er(t) + B Fd(t)
20-May-21 | 31Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
LQR solution
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
Fd(t) = κe er(t)
The discounted cost function:
J(er(t), Fd(t)) = ∫t→∞ e^−γ(τ−t) [erᵀ(τ)S er(τ) + Fdᵀ(τ)R Fd(τ)] dτ

The algebraic Riccati equation:
AᵀPA + S − P + AᵀPB κe = 0

The feedback control gain:
κe = −(BᵀPB + R)^−1 BᵀPA
LQR solution, RL solution, IRL approximation
Implementation procedure: the LQR formulation is an offline solution
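The gain formula κe = −(BᵀPB + R)^−1 BᵀPA corresponds to a discrete-time Riccati equation, which the offline LQR step can solve by simple fixed-point iteration. A scalar sketch (Python; A, B, S, R follow the parameter slide, with decimal placement assumed):

```python
# Scalar discrete-time LQR solved offline by Riccati iteration:
# P <- S + A P A - (A P B)^2 / (B P B + R), then k = -(B P B + R)^-1 B P A.
A, B, S, R = -4.006, -0.002, 1.0, 10.0   # decimal placement assumed

P = S
for _ in range(200):                     # iterate the Riccati recursion
    P = S + A * P * A - (A * P * B) ** 2 / (B * P * B + R)

k = -(B * P * B + R) ** -1 * B * P * A   # optimal feedback F_d = k * e_r
print(P, k)
```

The resulting closed loop A + Bk is stable even though A alone is not, which is exactly what the feedback gain is for.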
20-May-21 | 32Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
From the LQR solution to the RL solution: solving the model-based LQR offline is impossible for all time in an unknown environment
The Hamilton-Jacobi-Bellman equation:
min over Fd of [ V̇s(er(t)) − γVs(er(t)) + r(er(t), Fd(t)) ] = 0

The new algebraic Riccati equation:
erᵀ(t)[AᵀP + PA + S − γP]er(t) + 2erᵀ(t)PBFd(t) + Fdᵀ(t)RFd(t) = 0

The optimal desired force:
Fd(t) = −H22^−1 H21 er(t),  where Q(er(t), Fd(t)) = Vs(er(t)) = XᵀHX

The discounted value function:
Vs(er(t)) = ∫t→∞ e^−γ(τ−t) r(er(τ), Fd(τ)) dτ,  where r(er, Fd) = erᵀSer + FdᵀRFd
Implementation procedure
1 2 3 4 5 6 7 8
Fd(t) = κe er(t)
an online solution, using off-policy RL to find the solution of the given new ARE
20-May-21 | 33Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
From the RL solution to the IRL approximation: minimize the temporal-difference error
The approximator of the Q-value function:
Q̂(er, Fd) = Σ i=1→N ωi φi(er, Fd)

with normalized Gaussian basis functions
φi(er, Fd) = ψi(er, Fd) / Σ j=1→N ψj(er, Fd),  ψi(er, Fd) = exp(−½ [er − ci,r; Fd − ci,d]ᵀ Λi^−1 [er − ci,r; Fd − ci,d])

The parameter update law (normalized gradient descent on the TD error δ):
ω̇(t) = β δ(t) φ(t) / [1 + φᵀ(t)φ(t)]

The Q-function update law based on IRL:
Q(er(t), Fd(t)) = ∫t→t+Δt e^−γ(τ−t) r(er(τ), Fd(τ)) dτ + e^−γΔt Q(er(t + Δt), Fd(t + Δt))

The reward function of the utility:
r(t) = erᵀ(t)S er(t) + Fdᵀ(t)R Fd(t)

Implementation procedure
Fd(t) = κe er(t)
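The normalized-RBF approximation of the Q-function can be sketched as follows (Python). The centers, widths, learning rate, and fixed training target are hypothetical, chosen only to show the temporal-difference update shrinking the error.

```python
import math

def rbf_features(e_r, F_d, centers, beta=1.0):
    """Normalized Gaussian features phi_i(e_r, F_d) over a grid of
    hypothetical centers (c_e, c_F), as in Q_hat = sum_i w_i phi_i."""
    raw = [math.exp(-((e_r - ce) ** 2 + (F_d - cF) ** 2) / (2 * beta ** 2))
           for ce, cF in centers]
    s = sum(raw)
    return [r / s for r in raw]

def q_hat(w, e_r, F_d, centers):
    return sum(wi * pi for wi, pi in zip(w, rbf_features(e_r, F_d, centers)))

def td_update(w, e_r, F_d, target, centers, lr=0.5):
    """Gradient step shrinking the temporal-difference error
    delta = target - Q_hat(e_r, F_d) along the feature vector."""
    phi = rbf_features(e_r, F_d, centers)
    delta = target - sum(wi * pi for wi, pi in zip(w, phi))
    return [wi + lr * delta * pi for wi, pi in zip(w, phi)], delta

centers = [(ce, cF) for ce in (-1.0, 0.0, 1.0) for cF in (-1.0, 0.0, 1.0)]
w = [0.0] * len(centers)
for _ in range(200):
    w, delta = td_update(w, 0.3, -0.2, target=1.7, centers=centers)
print(q_hat(w, 0.3, -0.2, centers))
```

In the actual IRL loop the target would be the integrated reward plus the discounted Q-value at the next sample, rather than a fixed number.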
20-May-21 | 34Presenter Cong Phat Vo
Fig 51 RL in unknown environments during the
learning process
Fig 52 Learning curves
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Parameters
LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
Position/Force control: KPf = 0.001, KIf = 0.02, λ = 0.2, ηΔ = 0.1, ρ1 = 0.5, ρ2 = 20, v = 9/11, α = 5/7
Environment: Ce = 0.5 Ns/mm, Ke = 2 N/mm, r0 = 165 mm, xd = 39.917 mm, xe = 35 mm
Validation results
20-May-21 | 35Presenter Cong Phat Vo
Fig 53 Desired forceposition tracking response
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Three steps are required to complete the experiment:
1 identification of the environment
2 obtaining the ARE solution using the environmental estimate (LQR solution)
3 the complete experiment with the position/force controller (IRL approximation)

Achieving near-optimal performance without knowledge of the environment dynamics
20-May-21 | 36Presenter Cong Phat Vo
CONTENTS

1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
20-May-21 | 37Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
1 The stability of the closed-loop control is theoretically demonstrated
2 The optimal parameters of the impedance model are found so as to assure motion tracking and to assist the human with minimum effort to perform the task in real time
3 The unknown human dynamics as well as the optimal overall human-robot system efficiency are verified in experimental trials
Proposed control idea
20-May-21 | 38Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Robot-specific inner loop and task-specific outer loop.

The control torque in task space:
τs = Ks(ė + μ1e + μ2 ∫0→t e dτ) + τ̂h + KB‖Ẑ‖ sign(·) + Kq(·)

An approximation of the continuous function (the human torque) by a neural network:
τ̂h = Ŵᵀφ(q),  with update laws Ŵ̇1 = Fφ1e − kF‖e‖Ŵ1,  Ŵ̇2 = Gφ2e − kG‖e‖Ŵ2

The prescribed impedance model:
θm(s) = Kh τh(s) / (Md s² + Cd s + Kd)
(Structure: the human operator acts on the APAM-based haptic interface through the torque control law; the prescribed impedance model produces θd from τh and eh, the NN functional approximation supplies τ̂h, and the IRL-based optimal law updates Kh in the outer loop.)
Control design
20-May-21 | 39Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
The human transfer function [99] (purpose: map τh to ed):
G(s) = (κp + κd s) / (ke s + 1)
realized in state form as τ̇h = Ahτh + Bhed, with Ah = −ke^−1, Bh = ke^−1[κp, κd], and ed = [edᵀ, ėdᵀ]ᵀ

A new augmented state X = [edᵀ, τhᵀ]ᵀ collects the impedance tracking error and the human torque:
Ẋ = AX + Bue,  ue = −KX
A new augmented state
Prescribed
Impedance Model
Haptic interface
based-APAM
Torque
control lawHuman
Operator
Optimal Law-based
IRL
Kh+-
NN functional
approximation
1
s
θ +
-
τ θ d θ m eh
τ h
ξ
ed
IRL based on LQR solution
Control design
20-May-21 | 40Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
The performance index (infinite-horizon quadratic cost) to minimize:
J = ∫t→∞ (edᵀQded + τhᵀQhτh + ueᵀRue) dτ

The algebraic Riccati equation [83] to solve:
0 = AᵀP + PA + Q − PBR^−1BᵀP

with the augmented state dynamics
Ẋ = AX + Bue,  ue = −KX
RL solution and IRL solution

An optimal feedback control:
u* = argmin over ue of V(t, X(t)) = −R^−1BᵀPX

The value function satisfies
V(X(t)) = ∫t→∞ Xᵀ(Q + KᵀRK)X dτ = XᵀPX
(A − BK)ᵀP + P(A − BK) = −(Q + KᵀRK)
where P is the real symmetric positive definite solution.

The IRL Bellman equation for a time interval ΔT > 0:
V(X(t)) = ∫t→t+ΔT Xᵀ(Q + KᵀRK)X dτ + V(X(t + ΔT))
equivalently
Xᵀ(t)PX(t) = ∫t→t+ΔT Xᵀ(Q + KᵀRK)X dτ + Xᵀ(t + ΔT)PX(t + ΔT)

Proof of convergence: See Appendix 62
20-May-21 | 41Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 61 Continuous-time linear policy iteration algorithm
1 Start with an admissible control input u0 = −K1X (initialization: P0 = 0, i = 1, K1 chosen so that A − BK1 is stable)
2 Policy evaluation: given the control policy ui, find Pi from the off-policy IRL Bellman equation
Xᵀ(t)PiX(t) = ∫t→t+ΔT Xᵀ[Q + (Ki)ᵀRKi]X dτ + Xᵀ(t + ΔT)PiX(t + ΔT)
solving for the cost by least squares over the collected data
3 Policy improvement: update the control policy ui+1 = −R^−1BᵀPiX
If ‖pi − pi−1‖ ≥ ε, set i → i + 1 and repeat from step 2; otherwise stop
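The loop in Fig 61 is the IRL form of policy iteration; its model-based scalar counterpart (Kleinman's algorithm) makes the evaluate/improve structure explicit. A sketch (Python; the A, B, Q, R values here are illustrative, not the thesis's):

```python
# Scalar continuous-time policy iteration: evaluate the current gain K by
# solving the Lyapunov equation (A - BK) P + P (A - BK) = -(Q + K R K),
# then improve with K <- B P / R; the iterates converge to the ARE solution.
A, B, Q, R = 1.0, 1.0, 1.0, 1.0

K = 2.0                                  # admissible start: A - B*K = -1 < 0
for _ in range(20):
    Ac = A - B * K
    P = -(Q + K * R * K) / (2.0 * Ac)    # scalar Lyapunov solve
    K_new = B * P / R                    # policy improvement, u = -K x
    if abs(K_new - K) < 1e-12:
        break
    K = K_new

# The fixed point satisfies the ARE 2 A P - (P B)^2 / R + Q = 0
print(P)
```

The IRL version replaces the Lyapunov solve with the least-squares fit of the Bellman equation over measured data, so A never has to be known.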
20-May-21 | 42Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 62 The tracking error of the trajectory of the robot system and
the prescribed impedance model
Fig 63 Reference trajectory versus the state of the robot systems
during and after learning
Parameters
LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
Position control: Ks = 20, F = 10, G = 1, μ1 = μ2 = 5, k = 0.1, Z̄ = 100, κp = 20, κd = 10, ke = 0.018 s
Prescribed impedance model: θd = 10sin(0.2πt), Md = 1 kg, Bd = 7.18 Ns/mm, Kd = 15.76 N/mm, Kh = 0.046
Simulation
Validation results
20-May-21 | 43Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 64 The output trajectory and impedance model in the inner loop
Fig 65 The desired trajectory and output trajectory the outer loop
Fig 66 Interaction force
Experiment
Validation results
Discussion
- Near-perfect performance of the human-robot cooperation
- Only a small deviation between the desired and actual trajectories
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller
20-May-21 | 44Presenter Cong Phat Vo
CONTENTS

1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
20-May-21 | 45Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
1 The first time a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed
2 The stability and finite-time convergence characteristic of the closed-loop control are theoretically analyzed by the Lyapunov approach
3 The effectiveness of the proposed control method is verified in different working conditions
Proposed control idea
20-May-21 | 46Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
The external torque estimation: an adaptive first-order observer (AFOB) with internal state ξi and gain κi reconstructs the external torque from the measured velocity and applied torque,
d̂i = −κi(ξi + aiθ̇i)

The modified FNTSM surface design:
si = ėi + λiφi(ei)
φi(ei) = |ei|^(pi/qi) sign(ei)  if s̄i = 0, or s̄i ≠ 0 and |ei| ≥ ςi
φi(ei) = z1iei + z2iei² sign(ei)  if s̄i ≠ 0 and |ei| < ςi
so that the terminal term switches to a polynomial near the origin and avoids singularity.

The torque control design in FNTSMC style:
τi = Mi(·) + Ci(·) + Ki(·) − d̂i + ρ1isi + ρ2i|si|^νi sign(si)
(Structure: on the master side, the operator drives the APAM master system through the master controller and its torque estimator d̂m; positions θm and θs are exchanged over the channel; on the slave side, the slave controller with its estimator d̂s drives the APAM slave system against the environment, and an impedance filter shapes the reflected force fm returned to the operator.)
Control design
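A common sensorless way to recover the external torque, in the same spirit as the chapter's estimator, is a first-order momentum observer. The sketch below (Python) uses a hypothetical one-joint plant M q̈ + B q̇ = τ + τ_ext with made-up gains; it is an illustration of the idea, not the thesis's AFOB.

```python
# First-order momentum observer: the residual r converges to tau_ext with
# time constant 1/lam, using only the commanded torque and joint velocity.
M, B, lam, dt = 0.5, 0.2, 50.0, 1e-4     # hypothetical plant and observer gain

q_dot, z, r = 0.0, 0.0, 0.0              # velocity, observer state, estimate
tau_ext = 0.8                            # constant external torque to recover
for _ in range(20000):                   # 2 s of simulated time
    tau = 0.1                            # hypothetical commanded torque
    q_ddot = (tau + tau_ext - B * q_dot) / M
    q_dot += dt * q_ddot
    z += dt * (tau - B * q_dot + r)      # integrates the modeled momentum flow
    r = lam * (M * q_dot - z)            # residual -> estimated external torque
print(r)
```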
20-May-21 | 47Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 71 Torque estimation performances
Proposed control: λi = 8, pi = 5, qi = 7, ςi = 10^−4, νi = 9/11, εi = 0.1, ρ1i = 10, ρ2i = 2
PID control: KP = 38, KI = 25, KD = 0.15
Observers: for NDO Lh = Le = 10; for RTOB β = 350 rad/s; for AFOB κi(0) = 0.1, λi = 10, ηi = 0, m = 0.1, a = 10
RMSEs associated with the NDO, RTOB, and proposed algorithm are 0.211, 0.192, and 0.076, respectively
Parameters
Scenarios: the free-space situation, θd(t) = 5π(1 − 0.18t)sin(0.72πt);
the contact and recovery situation, with a soft environment (Ke = 500 N/mm, Be = 4 Ns/mm)
and a stiff environment (Ke = 2800 N/mm, Be = 20 Ns/mm)
Validation results
20-May-21 | 48Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 72 Position tracking performances
Fig 72 Position tracking error performances
Fig 72 Transparency performance in "soft" contact
Fig 72 Transparency error in soft contact
Fig 72 Transparency performance in "stiff" contact
Fig 72 Transparency error in stiff contact
Validation results
20-May-21 | 49Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Summary
Finite-time convergence of the tracking errors
Eliminating the uncertainties and the noise effect in the obtained force sensing via the new AFOB
Robust performance against uncertainties and disturbances in the system dynamics
Stability of the overall system and transparency performance are attained simultaneously
Fast transient response

The great transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and across different working conditions
20-May-21 | 50Presenter Cong Phat Vo
CONTENTS

1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
20-May-21 | 51Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions (uncertainties and disturbances are handled)
- Validated the significant effectiveness on finite-time force tracking problems of the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB)
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding a highly effective position control purpose
- Achieved good transparency performance, with force feedback and position tracking simultaneously, by the new adaptive force estimation combined with separate FNTSMC schemes
20-May-21 | 52Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Future works
- To improve the control performance in joint space, the robot system model can be improved by robust identification approaches
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains
- Apply optimized strategies on both the master and slave sides; the computational complexity should also be reduced as the data size grows
- Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device for the rehabilitation field with a safer solution
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
20-May-21 | 12Presenter Cong Phat Vo
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Mathematical modeling
n turns
nπD
bL
1 t
urn
s
D
2 2
2
3
4
dV dV d b LF P P P
dL dL d n
minus= minus = minus = minus
The tensile force of each PAM [7]
Fig 21 A cross-sectional view of the simplified geometric model
The motion equation of the system
1 2Mx Bx F F+ = minus +
Parameter Description Value UnitPs Supply pressure 65 bar
B Viscous damping coefficient 50 Nsmm
M Mass of the plate 05 Kg
L0 Original length of the PAM 175 mm
n Number of turns of the thread 104
b Thread length 200 mm
Table 21 Parameters of the studied PAM
The motion equation of the system
0 0( ) ( )x f x x g x u d= + +
where f0(x) g0(x) are the nominal value of f(x)
g(x) d is a total of unknown disturbances
( )2 2 2
00 0
2 2
0 0
33 ( ) ( )
2
( ( ) ( )) ( ( ) ( ))
L x bP L x Bxf x g x
n M M n M
d f x x f x x g x g x u
+ minus = minus minus = = minus + minus +
1 2 3 4 5 6 7 8
20-May-21 | 13Presenter Cong Phat Vo
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Experimental modeling
The approximated PAM model based on its
pressure and contraction strain
1 0
1 0
1 0
( ) ( ) ( )
( )
( ) ( 1 2)
( )
k k k k k k k k k k
k k k k k
k k k k k
k k k k k
M b P k P f P
b P b P b
k P k P k k
f P f P f
+ + =
= +
= + = = +
Parameter Value Units
Spring
elements
kk1 11732 Nmbar
kk0 68 Nm
Damping
elements
bk1 962 Nmsbar
bk0 085 Nms
Force
elements
fk1 9122 Nbar
fk0 473 N
0-100
6
0
100
55
200
Fo
rce
[N
]
Contraction length
[mm]
300
400
4
Pressure [bar]
500
103 2 151
Fig 23 Experimental model for certain separate pressure
Table 22 List of APAM model parameters
Load
PAM 1 PAM 2Plate
Linear encoder
Proportional pressure
regulator valves
Card
PCI-1711
Card
PCI-QUAD04
Pressure signal
Control signal
Position signal
Pulley
Fig 22 Block diagram of the experimental system
1 2 3 4 5 6 7 8
20-May-21 | 14Presenter Cong Phat Vo
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Platform setup
Fig 25 The APAM-based (a) master and (b) slave subsystem configuration
Proportional
pressure
regulator
valves
Optical
encoder
Control stick
Loadcells
Pneumatic
Muscle
Actuators Interactive
Loadcell
Pulley
Front viewBack view(a)
Proportional valve
Loadcell DamperSpring
Environment
APAM system
PAMPulley
PositionSignal
ControlSignal
StiffnessDamping
Contact Force Signal
PCPCI Cards
Control box
(b)
1 2 3 4 5 6 7 8
20-May-21 | 15Presenter Cong Phat Vo
2 MODELING OF THE BILATERAL TELEOPERATION SYSTEM
Slave robot
Environment
Master robot
Control stick
Device Description
PAM
Festo MAS-10-200N-AA-MC-O
Nominal length 200 mm
Inside diameter 10 mm
Max pressure 8 bar
PPR ValveFesto VPPM-8L-L-1-G14-0L8H
Max pressure 8 bar
DAQ Card
ADVANTECH PCI-1711
AIAO 12 bits (resolution)
MEASUREMENT PCI-QUAD04
Linear
encoder
Type Rational WTB5-0500MM
Resolution 0005 mm
Table 23 Specifications of the exp devices
Fig 26 Photograph of the experimental apparatus of the overview BHTS
Platform setup
1 2 3 4 5 6 7 8
20-May-21 | 16Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
4
5
6
C
H
A
P
T
E
R
1
7
8
Adaptive Gain ITSMC with TDE scheme for the position control3
Modeling and platform setup based on the APAM configuration2
1 2 3 4 5 6 7 8
20-May-21 | 17Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Proposed control idea
3The control effectiveness is verified by experimental trials in
different challenging work conditions (compare several controllers)
2
The stability of the controlled system is theoretically analyzed
The first time the AITSMC-TDE scheme
is proposed for a PAM system1
1 2 3 4 5 6 7 8
20-May-21 | 18Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
2 1 1 2 1s e k e k e= + +
( ) ( ) (( ) ( ) )
0
1
0 1 2 2 1 2
0 1 2 0 1 1
t
t
n d
s t s t k e k e e
f x x g x u x d
minus= minus minus +
+ + minus
( ) ( )( ) ( ) 31
0 1 3 3 4ˆ
su t g x sign
minus = minus + +
( )( ) ( )
( )3 3 0
3
ˆ ˆ1ˆ or
( )mt t
tsign t
= +=
+ =
A terminal sliding surface is chosen
A proposed integral sliding surface is designed
An adaptive law for the robust gain
A robust control signal can be redesigned
Proof See more detail in Appendix 32 33
1 2d dx x
Time-delay
Estimator
The PAM system
Control law
Robust
control signal
Adaptation
law
Nominal
control signal
Transfer
Coordinate Integral Terminal
Sliding Surface
Terminal
Sliding Surface
Reference
Trajectory
d
The proposed
control approach
u
1 2x x
su
3ˆ
1 2x x
s
1
2
nu
Control design
1 2 3 4 5 6 7 8
20-May-21 | 19Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Control design
The auxiliary variables are defined:
    e1 = x1 − x1d,  e2 = ė1 = x2 − ẋ1d
The nominal control signal can be designed as:
    un = g0(x1)^−1 [ẍ1d − f0(x1, x2) − k1e2 − k2α|e1|^(α−1)e2 − η1σ − η2|σ|^γ sign(σ)]
The lumped disturbance is estimated from one-sample-delayed data (TDE):
    d̂(t) = ẋ2(t − td) − f0(x1(t − td), x2(t − td)) − g0(x1(t − td))u(t − td)
The control law can be rewritten:
    u(t) = un + us − g0(x1)^−1 d̂
Proof: see Appendix 3.1, 3.4–3.5.
[Block diagram of the proposed control approach repeated from the previous slide.]
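The per-sample logic of the scheme above can be sketched in code. Everything here is an illustrative stand-in: the nominal model f0/g0, all gains, and a fixed robust gain eta in place of the adaptive κ̂3 are hypothetical, not the thesis implementation.

```python
def sgn(v):
    return (v > 0) - (v < 0)

def aitsmc_tde_step(x1, x2, x1d, x1d_dot, x1d_ddot, x2dot_prev, u_prev,
                    k1=2.0, k2=5.5, alpha=9/11, eta=3.0,
                    f0=lambda x1, x2: -2.0 * x2, g0=lambda x1: 1.0):
    """One sample of an AITSMC-TDE-style position controller (sketch)."""
    e1 = x1 - x1d                       # position tracking error
    e2 = x2 - x1d_dot                   # velocity tracking error
    # terminal sliding variable s = e2 + k1*e1 + k2*|e1|^alpha*sign(e1)
    s = e2 + k1 * e1 + k2 * abs(e1) ** alpha * sgn(e1)
    # time-delay estimate of the lumped disturbance from one-sample-old data
    d_hat = x2dot_prev - f0(x1, x2) - g0(x1) * u_prev
    # |e1|^(alpha-1) is singular at e1 = 0 (the reaching-phase issue the
    # integral surface avoids); guard it here for the sketch
    term = 0.0 if e1 == 0.0 else k2 * alpha * abs(e1) ** (alpha - 1.0) * e2
    u_n = (x1d_ddot - f0(x1, x2) - k1 * e2 - term) / g0(x1)
    u_s = -eta * sgn(s) / g0(x1)        # robust switching term (fixed gain)
    return u_n + u_s - d_hat / g0(x1), s
```

A positive position error yields a positive sliding variable, and the surface is odd in the error, which is easy to check numerically.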
20-May-21 | 20Presenter Cong Phat Vo
Parameters
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Experimental validation
Proposed control — for ITSMC: η1 = 3, η2 = 1.5, η3 = 2.5, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1 = γ2 = 5/7, k1 = 2, k2 = 5.5, α = 9/11; for the adaptive gain: μ = 10, δ = 0.05, σ = 0.1
PID control — for case study 3: KP = 15, KI = 18, KD = 0.05; for the other cases: KP = 35, KI = 28, KD = 0.1
Fig 3.1 Performance response in case study 1
Fig 3.2 Performance response in case study 2
Case study 1 (no load): xd (mm) = 8sin(0.4πt)
Case study 2 (no load): xd (mm) = 8sin(πt)
Case study 3 (no load): a multistep trajectory
Case study 4 (with load): xd (mm) = 8sin(πt), in turn with a 2-kg and a 7-kg load
20-May-21 | 21Presenter Cong Phat Vo
Fig 33 Performance response in case study 3
Fig 34 Performance response in case study 4
Controller                        PID     ITSMC   Proposed
Sine 0.2 Hz – 8 mm       L2[e]   0.4102  0.1737  0.1574
                         L2[u]   0.9961  1.0270  0.9602
Sine 1 Hz – 8 mm         L2[e]   1.1845  0.4996  0.3788
                         L2[u]   1.1284  1.1152  1.0900
Multistep                L2[e]   0.8587  0.6846  0.6368
                         L2[u]   0.7670  0.7333  0.7247
Sine 1 Hz – 8 mm, 2 kg   L2[e]   1.2578  0.8314  0.4779
                         L2[u]   1.1414  1.1506  1.1287
Sine 1 Hz – 8 mm, 7 kg   L2[e]   1.5937  1.0827  0.5067
                         L2[u]   1.1613  1.1729  1.1584
Table 3.1 A quantitative comparison between the tracking performances
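The L2 indices in Table 3.1 can be computed from sampled error and control signals. The slide does not state the exact normalization, so the RMS-style definition below is an assumption for illustration.

```python
import math

def l2_index(samples):
    """Normalized L2 index of a sampled signal: sqrt((1/N) * sum(v^2))."""
    return math.sqrt(sum(v * v for v in samples) / len(samples))
```

For example, a signal alternating between 3 and 4 has an index of sqrt(12.5).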
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Case study 1 (no load): xd (mm) = 8sin(0.4πt)
Case study 2 (no load): xd (mm) = 8sin(πt)
Case study 3 (no load): a multistep trajectory
Case study 4 (with load): xd (mm) = 8sin(πt), in turn with a 2-kg and a 7-kg load
Experimental validation
20-May-21 | 22Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Summary
Robust performance against uncertainties and disturbances
Finite-time convergence of the tracking errors
Reducing the chattering phenomenon
Eliminating the reaching phase
Fast transient response
Future work will be directed to control using the force properties.
20-May-21 | 23Presenter Cong Phat Vo
CONTENTS
1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
20-May-21 | 24Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Proposed control idea
1. The proposed FSOB scheme obtains force information with noise and static friction eliminated.
2. The FITSMC guarantees finite-time convergence; a new adaptive gain and TDE are deployed online to handle uncertainties and the chattering phenomenon.
3. The reliability and superiority of the proposed control are demonstrated in different challenging working conditions.
20-May-21 | 25Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
( )[ ]
1 20
( ) ( ( )) = + +t
v
f f fe e sign e d
( ) ( )[ 1]5
5
4
ˆ
+
minus= minus
v
signsign
1 1 [ ]
1 2
[ ]
4 5
ˆ( ) ( ) ( )
ˆ ( )
v
c d f f
v
u t d t K e sign e
sign
minus minus= + + +
+ +
1ˆ ˆ ˆ( ) ( ) ( ) ( )minus minus = minus minus minusd d c e dd t d t t u t t K t t
The FITSM surface is defined
The control law can be rewritten
2
2
ˆ ˆ ˆ2ˆ
+ += e e e e e e
e
e
An adaptive switching gain law An estimated lumped disturbance Force observer based-LeD [40]
APAM system
Force based
Time-delay
Estimator
Integral Terminal
Sliding SurfaceControl law
Adaptive law
Environment
Configuration
Reference
Trajectoryu
d
ˆe
d
+
+
The proposed control
approach
2k
Plant + Environment
Force Observer
e
Proof See in Appendix 41 ndash 43
Control design
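The integral-terminal surface for the force error can be accumulated sample by sample. The gains and step size below are illustrative placeholders, not the thesis values, and a simple forward-Euler integral stands in for the continuous one.

```python
def sgn(v):
    return (v > 0) - (v < 0)

class FitsmSurface:
    """sigma = e_f + integral(beta1*e_f + beta2*|e_f|^v*sign(e_f)) dt (sketch)."""
    def __init__(self, beta1=2.0, beta2=5.5, v=9/11, dt=1e-3):
        self.beta1, self.beta2, self.v, self.dt = beta1, beta2, v, dt
        self.integral = 0.0

    def step(self, e_f):
        # forward-Euler accumulation of the integral term
        self.integral += (self.beta1 * e_f
                          + self.beta2 * abs(e_f) ** self.v * sgn(e_f)) * self.dt
        return e_f + self.integral
```

A persistent force error makes the surface value grow, which is what drives the switching term of the controller.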
20-May-21 | 26Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Parameters
Proposed control — for ITSMC: η1 = 3, η2 = 1.5, η3 = 2.5, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1 = γ2 = 5/7, k1 = 2, k2 = 5.5, v = 9/11; for the adaptive gain: μ = 10, δ = 0.05, σ = 0.1
PID control — for case study 3: KP = 15, KI = 18, KD = 0.05; for the other cases: KP = 35, KI = 28, KD = 0.1
Observers — for FSOB: ω = 500 rad/s, ξ = 0.001, α1 = 20, α2 = 12, α3 = 5, α4 = 15
Scenario 1: τd (N) = 7.5sin(0.2πt − π/2) + 6.25
Scenario 2: τd (N) = 7.5sin(0.8πt − π/2) + 6.25
Scenario 3: a multistep trajectory with an LPF bandwidth of 0.2 rad/s
Fig 4.1 Performance response in Scenario 1
Fig 4.2 Performance response in Scenario 2
Experimental validation
20-May-21 | 27Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Fig 43 Performance response in Scenario 3
Fig 44 Control gain
Controller    PID     ISMC    FITSMC-TDE  Proposed
Sine 0.1 Hz   2.1050  1.2152  0.9440      0.5158
Sine 0.4 Hz   3.2054  1.8045  2.0632      1.9807
Multistep     2.6508  2.3548  2.0618      1.7685
Table 4.1 An RMSE comparison between the tracking error performances
Experimental validation
Discussion
- Improved control performance (fast response, high accuracy, and robustness)
- Guaranteed finite-time convergence
- Cancelled the lumped uncertainties online via the TDE solution and the switching gain in the FITSMC
20-May-21 | 28Presenter Cong Phat Vo
CONTENTS
1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
20-May-21 | 29Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
3 Verified effective for robot interaction control in
unknown environments
2 The stability of the closed-loop control is theoretically demonstrated
by the Lyapunov approach
The RL method is used to learn on-line
the optimal desired force1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 30Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Control design (task-specific outer loop / robot-specific inner loop)
The environment is modeled as a spring-damper system [81]:
    Fe(t) = −Ceẋ(t) + Ke(xe − x(t))
The optimal impedance model prescribes the desired force:
    Z(er(t)) = Fd(t)
The force control (prescribed impedance with sliding-mode force-tracking terms):
    Mër(t) + Cėr(t) + Ker(t) = −ρ1σ − ρ2|σ|^v sign(σ) − Fe(t)
The position control in task space (PI compensation of the force error ef):
    xf(t) = KPf ef(t) + KIf ∫0..t ef(τ) dτ
[Block diagram — the RL-based approximation and the prescribed impedance model in the outer loop generate the reference xr from Fd and Fe; precise position control of the APAM system in contact with the environment forms the inner loop.]
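The spring-damper environment model above is straightforward to code. The parameter values follow the slide's environment parameters (with extraction-lost decimal points restored as an assumption), and the zero-force branch before contact is an added assumption for illustration.

```python
def contact_force(x, x_dot, x_e=35.0, C_e=0.5, K_e=2.0):
    """F_e = -C_e*x_dot + K_e*(x_e - x) after contact (mm, N units).

    x_e is the environment surface location; before the end-effector
    reaches it, no force is assumed to act.
    """
    if x < x_e:
        return 0.0
    return -C_e * x_dot + K_e * (x_e - x)
```

Penetrating the surface by 1 mm at rest produces a 2 N restoring force; moving into the surface adds the damping contribution.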
ėr(t) = Aer(t) + BFd(t)
20-May-21 | 31Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Implementation procedure — LQR solution (offline)
The main objective of this work is to determine the optimal desired force Fd, the minimum force that minimizes the position error er:
    Fd(t) = κe er(t)
The discounted cost function:
    Js(er(t), Fd(t)) = ∫t..∞ e^(−γ(τ−t)) [erᵀSer + FdᵀRFd] dτ
The algebraic Riccati equation:
    AᵀPA + S − P + AᵀPBκe = 0
The feedback control gain:
    κe = −(BᵀPB + R)^−1 BᵀPA
RL solution: IRL approximation (online)
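The gain formula κe = −(BᵀPB + R)⁻¹BᵀPA matches the discrete-time (sampled) Riccati equation, which can be solved offline by fixed-point iteration. The scalar sketch below uses hypothetical numbers, not the thesis values.

```python
def lqr_gain(A, B, S, R, iters=1000):
    """Scalar discrete-time LQR: iterate P = A'PA + S + A'PB*K with
    K = -(B'PB + R)^-1 B'PA until P converges (sketch)."""
    P = S
    for _ in range(iters):
        K = -(B * P * A) / (B * P * B + R)   # feedback gain kappa_e
        P = A * P * A + S + A * P * B * K    # Riccati recursion
    return K, P
```

With a stabilizable pair the closed loop A + BK ends up inside the unit circle, which is easy to verify.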
20-May-21 | 32Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Implementation procedure — RL solution
The LQR solution is impossible for all time, so the optimal desired force Fd(t) = κe er(t), the minimum force that minimizes the position error er, is sought by RL.
The discounted cost function:
    Vs(er(t)) = ∫t..∞ e^(−γ(τ−t)) r(er(τ), Fd(τ)) dτ,  where r(er, Fd) = erᵀSer + FdᵀRFd
The Hamilton-Jacobi-Bellman equation:
    min over Fd of [V̇s(er(t)) − γVs(er(t)) + r(t)] = 0
The new algebraic Riccati equation:
    erᵀ(t)(AᵀP + PA + S − γP)er(t) + 2erᵀ(t)PBFd(t) + Fdᵀ(t)RFd(t) = 0
The Q-value function:
    Q(er(t), Fd(t)) = Vs(er(t)) = XᵀHX,  with X = [erᵀ Fdᵀ]ᵀ
The optimal desired force:
    Fd(t) = −H22^−1 H21 er(t)
IRL approximation: an online solution using off-policy RL to find the solution of the given new ARE.
20-May-21 | 33Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Implementation procedure — IRL approximation
The optimal desired force Fd(t) = κe er(t), the minimum force that minimizes the position error er, is learned by minimizing the temporal-difference error.
The approximator of the Q-value function:
    Q̂(er, Fd) = Σi=1..N θi φ̄i(er, Fd)
with Gaussian radial basis functions
    φi(er, Fd) = exp(−½ [er − ci,r; Fd − ci,d]ᵀ Σi^−1 [er − ci,r; Fd − ci,d])
normalized as
    φ̄i(er, Fd) = φi(er, Fd) / Σi=1..N φi(er, Fd)
The reward (utility) function:
    r(t) = erᵀ(t)Ser(t) + Fdᵀ(t)RFd(t)
The Q-function update law based on IRL:
    Q(er(t), Fd(t)) = ∫t..t+Δt e^(−γ(τ−t)) r(τ) dτ + e^(−γΔt) Q(er(t + Δt), Fd(t + Δt))
The parameter update law follows a normalized gradient of the temporal-difference error δ(t):
    θ(t + Δt) = θ(t) − β φ(t)δ(t) / (1 + φᵀ(t)φ(t))
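The normalized radial-basis features can be sketched directly. An isotropic width sigma is used here as a simplification of the per-basis covariance Σi, and the centers are hypothetical.

```python
import math

def rbf_features(e_r, F_d, centers, sigma=1.0):
    """Normalized Gaussian RBF features phi_bar_i(e_r, F_d) (sketch)."""
    raw = [math.exp(-((e_r - c_e) ** 2 + (F_d - c_f) ** 2)
                    / (2.0 * sigma ** 2))
           for c_e, c_f in centers]
    total = sum(raw)
    return [r / total for r in raw]    # features sum to one

def q_hat(theta, feats):
    """Q-value estimate as a weighted sum of normalized features."""
    return sum(t * f for t, f in zip(theta, feats))
```

Because the features are normalized, equal weights give a Q-estimate equal to that common weight, a handy sanity check.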
20-May-21 | 34Presenter Cong Phat Vo
Fig 5.1 RL in unknown environments during the learning process
Fig 5.2 Learning curves
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Parameters
LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
Position/force control: KPf = 0.001, KIf = 0.02, λ = 0.2, ηΔ = 0.1, ρ1 = 0.5, ρ2 = 2.0, v = 9/11, α = 5/7
Environment: Ce = 0.5 N·s·mm^−1, Ke = 2 N·mm^−1, r0 = 16.5 mm, xd = 39.917 mm, xe = 35 mm
Validation results
20-May-21 | 35Presenter Cong Phat Vo
Fig 5.3 Desired force/position tracking response
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Validation results
LQR solution — three steps are required to complete the experiment:
1. identification of the environment;
2. obtaining the ARE solution using the environmental estimate;
3. the complete experiment for the position/force controller.
IRL approximation — achieves near-optimal performance without knowledge of the environment dynamics.
20-May-21 | 36Presenter Cong Phat Vo
CONTENTS
1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
20-May-21 | 37Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
3 The unknown human dynamics as well as the optimal overall
human-robot system efficiency is verified in experimental trials
2
Find the optimal parameters of the impedance model to
assure motion tracking and also assist the human with
minimum effort to perform the task in real time
The stability of the closed-loop control is theoretically
demonstrated
1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 38Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design (robot-specific inner loop / task-specific outer loop)
The control torque in task space combines NN compensation with error feedback:
    τs = Ŵᵀφ(q) + Ks[ėh + β1eh + β2∫0..t eh dτ] + KB(‖Ẑ‖ − Z̄)
An approximation of the continuous function by a NN Ŵᵀφ(q), with update laws:
    Ŵ̇ = Fφ(q)ehᵀ − kF‖eh‖Ŵ,   Ĝ̇ = Gqehᵀ − kG‖eh‖Ĝ
The prescribed impedance model:
    θm(s)/τh(s) = Khm / (Mds² + Cds + Kd)
[Block diagram — human operator → torque control law with NN functional approximation → haptic interface based on APAM; the prescribed impedance model and the IRL-based optimal law shape the outer loop (signals τh, θd, θm, eh, ed, ξ; gain Kh).]
20-May-21 | 39Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design — IRL based on the LQR solution
The human transfer function [99] (purpose: map τh → ed):
    Gh(s) = kp(λds + 1) / (kes + 1)
in state-space form:
    τ̇h = Ahτh + Bhed,  Ah = −ke^−1,  Bh = ke^−1 kp,  ed = [edᵀ ėdᵀ]ᵀ
A new augmented state X = [edᵀ τhᵀ]ᵀ obeys
    Ẋ = AX + Bu,  u = −KX
where A stacks the impedance-error dynamics (Aq, Bq) and the human dynamics (Bh, Ah) in block form.
[Block diagram as on the previous slide.]
20-May-21 | 40Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design
The performance index (infinite-horizon quadratic cost) to minimize:
    Jm = ∫t..∞ [edᵀQded + τhᵀQhτh + ueᵀRue] dτ
RL solution — for the augmented state Ẋ = AX + Bue, ue = −KX, solve the algebraic Riccati equation [83]:
    0 = AᵀP + PA + Q − PBR^−1BᵀP
An optimal feedback control:
    ue* = argmin over ue of V(t, X(t), ue) = −R^−1BᵀPX
IRL solution — the value function
    V(X(t)) = ∫t..∞ Xᵀ(Q + KᵀRK)X dτ = XᵀPX
where P is the real symmetric positive-definite solution of the Lyapunov equation
    (A − BK)ᵀP + P(A − BK) = −(Q + KᵀRK)
The IRL Bellman equation for a time interval ΔT > 0:
    V(X(t)) = ∫t..t+ΔT Xᵀ(Q + KᵀRK)X dτ + V(X(t + ΔT))
which can be written as
    Xᵀ(t)PX(t) = ∫t..t+ΔT Xᵀ(Q + KᵀRK)X dτ + Xᵀ(t + ΔT)PX(t + ΔT)
Proof of convergence: see Appendix 6.2.
20-May-21 | 41Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design — online implementation
Fig 6.1 Continuous-time linear policy iteration algorithm:
1. Initialization: start with an admissible (stabilizing) control input ue⁰ = −K¹X, i.e., A − BK¹ is stable.
2. Policy evaluation: given a control policy uⁱ, find Pⁱ from the off-policy IRL Bellman equation
       Xᵀ(t)PⁱX(t) = ∫t..t+ΔT Xᵀ(Q + (Kⁱ)ᵀRKⁱ)X dτ + Xᵀ(t + ΔT)PⁱX(t + ΔT)
   solving for the cost parameters pⁱ by least squares over sampled data.
3. Policy improvement: update the control policy
       ue^(i+1) = −R^−1BᵀPⁱX,  i.e., K^(i+1) = R^−1BᵀPⁱ
4. If ‖pⁱ − pⁱ⁻¹‖ ≤ ε, stop; otherwise set i → i + 1 and return to step 2.
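The iteration above can be sketched in its model-based scalar form (Kleinman's algorithm): the policy-evaluation step solves the Lyapunov equation directly, standing in for the slide's data-driven least-squares fit. All numbers are hypothetical.

```python
def lqr_policy_iteration(a, b, q, r, k0, tol=1e-12, max_iter=100):
    """Scalar continuous-time LQR policy iteration (sketch)."""
    k = k0                       # k0 must stabilize: a - b*k0 < 0
    p_prev = None
    for _ in range(max_iter):
        ac = a - b * k           # closed-loop dynamics under policy k
        # policy evaluation: scalar Lyapunov equation 2*ac*p = -(q + r*k^2)
        p = (q + r * k * k) / (-2.0 * ac)
        if p_prev is not None and abs(p - p_prev) < tol:
            break                # cost parameters converged
        p_prev = p
        k = b * p / r            # policy improvement, u = -k*x
    return k, p
```

For a = b = q = r = 1 the analytic Riccati solution is p = 1 + sqrt(2), and the iteration converges to it quadratically from any stabilizing start.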
20-May-21 | 42Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 6.2 The tracking error between the trajectory of the robot system and the prescribed impedance model
Fig 6.3 Reference trajectory versus the state of the robot system during and after learning
Parameters (simulation)
LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
Position control: Ks = 20, F = 10, G = 1, μ1 = μ2 = 5, k = 0.1, Z̄ = 100, κp = 20, κd = 10, ke = 0.18 s
Prescribed impedance: θd = 10sin(0.2πt), Md = 1 kg, Bd = 7.18 N·s·mm^−1, Kd = 15.76 N·mm^−1, Kh = 0.046
Validation results
20-May-21 | 43Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 6.4 The output trajectory and the impedance model in the inner loop
Fig 6.5 The desired trajectory and the output trajectory in the outer loop
Fig 6.6 Interaction force
Validation results (experiment) — discussion:
- Excellent performance of the human-robot cooperation.
- A slight deviation between the desired and actual trajectories.
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance parameters is found by the outer-loop controller.
20-May-21 | 44Presenter Cong Phat Vo
CONTENTS
1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
20-May-21 | 45Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
3 The effectiveness of the proposed control method is verified in
the different working conditions
2The stability and finite-time convergence characteristic of the closed-
loop control are theoretically analyzed by the Lyapunov approach
The first time the hybrid control based
on a fast finite-time NTSMC and AFOB1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 46Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
( )1 2id = minus +
1 1
2
= minus + = minus +
id
T
i i id i
T
i ia = minus
The external torque estimation
[ ]
1 2ˆ( ) = + + minus minus minus +iv
i i ir i i i i i i i i i i idM C K sign s s s
( )i i i is = +
[ ]
2
1 2
if 0 or 0 and ( )
( ) if 0 and
i
i i i i i i
i i
i i i i i i i i
s s
z z sign s
= =
+
An adaptive FSOB gain
where
where
The torque control design by style-FNTSMC
The modified FNTSM surface design
Master Slave
Environment
APAM Slave
system
Slave
Controller
Estimator
Oper
ator
Master
Controller
APAM Master
system
Estimator
Impendence
filter
+
++
-
+
-
+
+md
m s
s
sd
ˆsdˆmd
mf
m
d
Control design
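The piecewise terminal function used in the FNTSM surface can be sketched as follows. The parameters p, q, ς follow the slide; the polynomial coefficients z1, z2 are derived here (an assumption) by matching value and slope at the switching boundary so the function stays continuous while avoiding the singular derivative at zero error.

```python
def fntsm_phi(e, p=5, q=7, eps=1e-4):
    """Piecewise terminal function for s = de + lam*phi(e) (sketch)."""
    r = p / q
    if abs(e) >= eps:
        # usual fractional-power term away from zero
        return abs(e) ** r * (1 if e > 0 else -1)
    # inside |e| < eps: polynomial with matched value and slope at |e| = eps
    z1 = (2.0 - r) * eps ** (r - 1.0)
    z2 = (r - 1.0) * eps ** (r - 2.0)
    return z1 * e + z2 * abs(e) * e
```

The function is odd, vanishes at zero, and the two branches agree at the boundary, which removes the |e|^(p/q − 1) singularity from the surface derivative.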
20-May-21 | 47Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 7.1 Torque estimation performances
Parameters
Proposed control: λi = 8, pi = 5, qi = 7, ςi = 10^−4, νi = 9/11, εi = 0.1, ρ1i = 10, ρ2i = 2
PID control: KP = 3.8, KI = 2.5, KD = 0.15
Observers — for NDO: Lh = Le = 10; for RTOB: β = 350 rad/s; for AFOB: κi(0) = 0.1, λi = 10, ηi = 0, m = 0.1, a = 10
Scenarios — free space: θd(t) = 5π(1 − 0.18t)sin(0.72πt); contact and recovery: soft environment Ke = 500 N·mm^−1, Be = 4 N·s·mm^−1; stiff environment Ke = 2800 N·mm^−1, Be = 20 N·s·mm^−1
RMSEs associated with the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.
Validation results
20-May-21 | 48Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 7.2 Position tracking performances
Fig 7.3 Position tracking error performances
Fig 7.4 Transparency performance in "soft" contact
Fig 7.5 Transparency error in "soft" contact
Fig 7.6 Transparency performance in "stiff" contact
Fig 7.7 Transparency error in "stiff" contact
Validation results
20-May-21 | 49Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
- Finite-time convergence of the tracking errors
- Elimination of the uncertainties and the noise effect in the obtained force sensing via the new AFOB
- Robust performance against uncertainties and disturbances in the system dynamics
- Stability of the overall system and transparency performance attained simultaneously
- Fast transient response
Summary
Great transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and across different working conditions.
20-May-21 | 50Presenter Cong Phat Vo
CONTENTS
1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
20-May-21 | 51Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions, with uncertainties and disturbances handled.
- Validated the significant effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures excellent position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while achieving highly effective position control.
- Achieved good transparency with both force feedback and position tracking simultaneously by the new adaptive force estimation combined with separate FNTSMC schemes.
Conclusions
20-May-21 | 52Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Future works
- To improve control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both the master and slave sides; the computational complexity should also be reduced with respect to data size.
- Based on the prototype developed in this thesis, a high-quality commercial teleoperation device for the rehabilitation field can be developed with a safer solution.
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
20-May-21 | 11Presenter Cong Phat Vo
CONTENTS
1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme with TDE
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
20-May-21 | 12Presenter Cong Phat Vo
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Mathematical modeling
[Fig 2.1 A cross-sectional view of the simplified geometric model: a braid of n turns with thread length b over muscle diameter D and length L.]
The tensile force of each PAM [7]:
    F = −P dV/dL = P(3L² − b²) / (4πn²)
The motion equation of the system:
    Mẍ + Bẋ = −F1 + F2
Table 2.1 Parameters of the studied PAM
    Ps  Supply pressure                6.5  bar
    B   Viscous damping coefficient    50   N·s/mm
    M   Mass of the plate              0.5  kg
    L0  Original length of the PAM     175  mm
    n   Number of turns of the thread  104
    b   Thread length                  200  mm
The motion equation in state form:
    ẍ = f0(x) + g0(x)u + d
where f0(x) and g0(x) are the nominal values of f(x) and g(x), obtained from the PAM force model (terms in 3(L0 ∓ x)² − b², the nominal pressure, B, and M), and d is the total unknown disturbance:
    d = (f(x, ẋ) − f0(x, ẋ)) + (g(x) − g0(x))u
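The tensile-force formula can be coded directly. The default b and n below are placeholders (the slide's table values may have lost decimal points in extraction), so treat the numbers as illustrative rather than the thesis parameters.

```python
import math

def pam_force(P, L, b=0.2, n=1.04):
    """Tensile force of a braided PAM: F = P*(3L^2 - b^2)/(4*pi*n^2).

    P in Pa, lengths in metres; b is the thread length and n the number
    of turns of the thread (both hypothetical defaults here).
    """
    return P * (3.0 * L ** 2 - b ** 2) / (4.0 * math.pi * n ** 2)
```

Two structural properties hold regardless of the exact parameter values: the force vanishes at the braid angle where 3L² = b², and at a fixed pressure it grows with muscle length (i.e., it drops as the muscle contracts).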
20-May-21 | 13Presenter Cong Phat Vo
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Experimental modeling
The approximated PAM model, based on its pressure and contraction strain:
    Mẍk + bk(Pk)ẋk + kk(Pk)xk = fk(Pk),  k = 1, 2
with pressure-affine elements:
    bk(Pk) = bk1Pk + bk0,  kk(Pk) = kk1Pk + kk0,  fk(Pk) = fk1Pk + fk0
Parameter           Value    Units
Spring elements:    kk1 = 117.32 N/m/bar;  kk0 = 6.8 N/m
Damping elements:   bk1 = 9.62 N·s/m/bar;  bk0 = 0.85 N·s/m
Force elements:     fk1 = 91.22 N/bar;  fk0 = 4.73 N
[3-D surface plot: force (N, 0–500) versus contraction length (mm) and pressure (bar, 0–6).]
Fig 2.3 Experimental model for certain separate pressures
Table 2.2 List of APAM model parameters
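The three-element model above is easy to simulate with forward Euler. The coefficient defaults are placeholders in the spirit of Table 2.2 (decimal points were lost in extraction), not the thesis values; the structural check is that the response settles at the static value f(P)/k(P).

```python
def pam_step_response(P, steps=2000, dt=1e-3, M=0.5,
                      b1=9.6, b0=0.85, k1=117.0, k0=6.8, f1=91.0, f0=4.7):
    """Forward-Euler simulation of M*x'' + b(P)*x' + k(P)*x = f(P)."""
    b = b1 * P + b0            # pressure-affine damping
    k = k1 * P + k0            # pressure-affine stiffness
    f = f1 * P + f0            # pressure-affine contractile force
    x, v = 0.0, 0.0
    for _ in range(steps):
        a = (f - b * v - k * x) / M    # acceleration from the force balance
        v += a * dt
        x += v * dt
    return x                   # settles near the static value f/k
```

With these placeholder coefficients the steady-state contraction also grows slightly with pressure, matching the qualitative shape of Fig 2.3.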
[Block diagram of the experimental system: two PAMs coupled by a pulley drive a plate with load; proportional pressure regulator valves receive the control signal through a PCI-1711 card, and a linear encoder returns the position signal through a PCI-QUAD04 card.]
Fig 2.2 Block diagram of the experimental system
20-May-21 | 14Presenter Cong Phat Vo
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Platform setup
Fig 2.5 The APAM-based (a) master and (b) slave subsystem configurations
[(a) Master: control stick, optical encoder, loadcells, pneumatic muscle actuators, interactive loadcell, pulley, and proportional pressure regulator valves (front and back views). (b) Slave: APAM system (PAM and pulley) with proportional valve, loadcell, and spring-damper environment; position, control, stiffness/damping, and contact-force signals are exchanged with the PC and PCI cards through a control box.]
20-May-21 | 15Presenter Cong Phat Vo
2 MODELING OF THE BILATERAL TELEOPERATION SYSTEM
Table 2.3 Specifications of the experimental devices
Device          Description
PAM             Festo MAS-10-200N-AA-MC-O; nominal length 200 mm; inside diameter 10 mm; max pressure 8 bar
PPR valve       Festo VPPM-8L-L-1-G14-0L8H; max pressure 8 bar
DAQ cards       ADVANTECH PCI-1711 (AI/AO, 12-bit resolution); MEASUREMENT PCI-QUAD04
Linear encoder  Type Rational WTB5-0500MM; resolution 0.005 mm
Fig 2.6 Photograph of the experimental apparatus: overview of the BHTS (master robot with control stick; slave robot in contact with the environment)
Platform setup
20-May-21 | 16Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
4
5
6
C
H
A
P
T
E
R
1
7
8
Adaptive Gain ITSMC with TDE scheme for the position control3
Modeling and platform setup based on the APAM configuration2
1 2 3 4 5 6 7 8
20-May-21 | 17Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Proposed control idea
3The control effectiveness is verified by experimental trials in
different challenging work conditions (compare several controllers)
2
The stability of the controlled system is theoretically analyzed
The first time the AITSMC-TDE scheme
is proposed for a PAM system1
1 2 3 4 5 6 7 8
20-May-21 | 18Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
2 1 1 2 1s e k e k e= + +
( ) ( ) (( ) ( ) )
0
1
0 1 2 2 1 2
0 1 2 0 1 1
t
t
n d
s t s t k e k e e
f x x g x u x d
minus= minus minus +
+ + minus
( ) ( )( ) ( ) 31
0 1 3 3 4ˆ
su t g x sign
minus = minus + +
( )( ) ( )
( )3 3 0
3
ˆ ˆ1ˆ or
( )mt t
tsign t
= +=
+ =
A terminal sliding surface is chosen
A proposed integral sliding surface is designed
An adaptive law for the robust gain
A robust control signal can be redesigned
Proof See more detail in Appendix 32 33
1 2d dx x
Time-delay
Estimator
The PAM system
Control law
Robust
control signal
Adaptation
law
Nominal
control signal
Transfer
Coordinate Integral Terminal
Sliding Surface
Terminal
Sliding Surface
Reference
Trajectory
d
The proposed
control approach
u
1 2x x
su
3ˆ
1 2x x
s
1
2
nu
Control design
1 2 3 4 5 6 7 8
20-May-21 | 19Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
( ) ( ) ( )( )( )( ) ( )
2 0 1 2
0 1
ˆ
d d d
d d
d x t t f x t t x t t
g x t t u t t
= minus minus minus minus
minus minus minus
( ) ( )1
1ˆ
n su t u u g x dminus
= + minus
The control law can be rewritten
where
Proof See more detail in Appendix 31 34-35
Control design
1 1 1 1
2 2 1
de x x
x
= = minus = minus
( )( ) ( )1 2
1
0 1 1 1 0 1 2
1
1 1 1 1 1 2 2 2 2
n du g x x f x x
minus
minus
= minus minus
minus minus minus minus
The auxiliary variables are defined
The nominal control signal can be designed as
1 2d dx x
Time-delay
Estimator
The PAM system
Control law
Robust
control signal
Adaptation
law
Nominal
control signal
Transfer
Coordinate Integral Terminal
Sliding Surface
Terminal
Sliding Surface
Reference
Trajectory
d
The proposed
control approach
u
1 2x x
su
3ˆ
1 2x x
s
1
2
nu
1 2 3 4 5 6 7 8
20-May-21 | 20Presenter Cong Phat Vo
Parameters
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Experimental validation
for ITSMC η1 = 3 η2 = 15 η3 = 25 κ1 = κ3 = 10
κ2 = κ4 = 2 γ1 = γ2 = 57 k1 = 2 k2 = 55 α = 911
for adaptative gain μ = 10 δ = 005 σ = 01
for case study 3 KP = 15 KI = 18 KD = 005
for other case KP = 35 KI = 28 KD = 01
Proposed control
PID control
Fig 31 Performance response in case study 1
Fig 32 Performance response in case study 2
Case study 1 (no load) xd (mm) = 8sin(04πt)
Case study 2 (no load) xd (mm) = 8sin(πt)
Case study 3 (no load) A multistep trajectory
Case study 4 (with load) xd (mm) = 8sin(πt)
in turn 2-kg-load and 7-kg-load
1 2 3 4 5 6 7 8
20-May-21 | 21Presenter Cong Phat Vo
Fig 33 Performance response in case study 3
Fig 34 Performance response in case study 4
Controller PID ITSMC Proposed
Sine 02 Hz ndash 8 mmL2[e] 04102 01737 01574
L2[u] 09961 10270 09602
Sine 1 Hz ndash 8 mmL2[e] 11845 04996 03788
L2[u] 11284 11152 10900
MultistepL2[e] 08587 06846 06368
L2[u] 07670 07333 07247
Sine 1 Hz ndash 8 mm
with 2-kg-load
L2[e] 12578 08314 04779
L2[u] 11414 11506 11287
Sine 1 Hz ndash 8 mm
with 7-kg-load
L2[e] 15937 10827 05067
L2[u] 11613 11729 11584
Table 31 A quantitative comparison between the tracking performances
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Case study 1 (no load) xd (mm) = 8sin(04πt)
Case study 2 (no load) xd (mm) = 8sin(πt)
Case study 3 (no load) A multistep trajectory
Case study 4 (with load) xd (mm) = 8sin(πt)
in turn 2-kg-load and 7-kg-load
Experimental validation
1 2 3 4 5 6 7 8
20-May-21 | 22Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Summary
Robust performance against uncertainties and disturbances1 2d dx x
Time-delay
Estimator
The PAM system
Control law
Robust
control signal
Adaptation
law
Nominal
control signal
Transfer
Coordinate Integral Terminal
Sliding Surface
Terminal
Sliding Surface
Reference
Trajectory
d
The proposed
control approach
u
1 2x x
su
3ˆ
1 2x x
s
1
2
nu
Finite-time convergence of the tracking errors
Reducing the chattering phenomenon
Eliminating the reaching phase
Fast transient response
Next works will be directed to the control using the force properties
1 2 3 4 5 6 7 8
20-May-21 | 23Presenter Cong Phat Vo
CONTENTS
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
5
6
C
H
A
P
T
E
R
1
7
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Adaptive Finite-Time Force Sensorless Control scheme4
1 2 3 4 5 6 7 8
20-May-21 | 24Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
3 Verified the reliability and superiority of the proposed control
demonstrated in different challenging work conditions
2
The FITSMC guarantees the finite convergence new adaptive gain and TDE
are deployed to handle online (uncertainties and chattering phenomenon)
The proposed FSOB scheme obtains
force information eliminated noise and
static friction
1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 25Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
( )[ ]
1 20
( ) ( ( )) = + +t
v
f f fe e sign e d
( ) ( )[ 1]5
5
4
ˆ
+
minus= minus
v
signsign
1 1 [ ]
1 2
[ ]
4 5
ˆ( ) ( ) ( )
ˆ ( )
v
c d f f
v
u t d t K e sign e
sign
minus minus= + + +
+ +
1ˆ ˆ ˆ( ) ( ) ( ) ( )minus minus = minus minus minusd d c e dd t d t t u t t K t t
The FITSM surface is defined
The control law can be rewritten
2
2
ˆ ˆ ˆ2ˆ
+ += e e e e e e
e
e
An adaptive switching gain law An estimated lumped disturbance Force observer based-LeD [40]
APAM system
Force based
Time-delay
Estimator
Integral Terminal
Sliding SurfaceControl law
Adaptive law
Environment
Configuration
Reference
Trajectoryu
d
ˆe
d
+
+
The proposed control
approach
2k
Plant + Environment
Force Observer
e
Proof See in Appendix 41 ndash 43
Control design
1 2 3 4 5 6 7 8
20-May-21 | 26Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Experimental validation
Parameters
- Proposed control, for the ITSMC: η1 = 3, η2 = 15, η3 = 25, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1 = γ2 = 57, k1 = 2, k2 = 55, v = 911; for the adaptive gain: μ = 10, δ = 0.05, σ = 0.1
- PID control, for Scenario 3: KP = 15, KI = 18, KD = 0.05; for the other scenarios: KP = 35, KI = 28, KD = 0.1
- Observers, for the FSOB: ω = 500 rad/s, ξ = 0.001, α1 = 20, α2 = 12, α3 = 5, α4 = 15
Scenarios
- Scenario 1: τd (N) = 75sin(0.2πt − π/2) + 625
- Scenario 2: τd (N) = 75sin(0.8πt − π/2) + 625
- Scenario 3: a multistep trajectory shaped by an LPF with a bandwidth of 0.2 rad/s
Fig 41 Performance response in Scenario 1
Fig 42 Performance response in Scenario 2
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Experimental validation
Fig 43 Performance response in Scenario 3
Fig 44 Control gain

Table 41 RMSE comparison between the tracking-error performances
Controller    PID     ISMC    FITSMC-TDE   Proposed
Sine 0.1 Hz   2.1050  1.2152  0.9440       0.5158
Sine 0.4 Hz   3.2054  1.8045  2.0632       1.9807
Multistep     2.6508  2.3548  2.0618       1.7685

Discussion
- Improved control performance: fast response, high accuracy, and robustness
- Guaranteed finite-time convergence
- Cancelled the lumped uncertainties online via the TDE solution and the switching gain in the FITSMC
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Proposed control idea
1. The RL method is used to learn the optimal desired force online.
2. The stability of the closed-loop control is theoretically demonstrated by the Lyapunov approach.
3. The approach is verified to be effective for robot interaction control in unknown environments.
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Control design (robot-specific inner loop, task-specific outer loop)
The environment is modeled as a spring-damper system [81]:
F_e(t) = −C_e ẋ(t) + K_e (x_e − x(t))
The optimal impedance model Z(e_r(t)) = F_d(t) prescribes the error dynamics M ë_r + C ė_r + K e_r, driven by the contact force F_e and robustified by a fractional-power switching term; linearized, the error dynamics become
ė_r(t) = A e_r(t) + B F_d(t)
The position control in task space is generated by a PI force loop:
x_f(t) = K_Pf e_f(t) + K_If ∫_0^t e_f(τ) dτ
[Block diagram: Reference and RL-based approximation → Prescribed Impedance Model → Force control → Precise position control → APAM system and Environment, with the contact force F_e fed back to the impedance model and the force loop.]
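The outer PI force loop above can be sketched numerically. The inner position loop is idealized as a first-order lag, the muscle is assumed to start in contact, and the gains, environment values, and contact sign convention are all illustrative assumptions, not the thesis parameters.

```python
# Sketch of the outer force loop: a PI law on the force error
# e_f = F_d - F_e generates the position correction x_f, while an
# idealized first-order inner loop tracks it. Spring-damper contact
# force per the environment model (sign convention assumed: pressing
# from rest at the environment surface x = 0).
def outer_force_loop(F_d=1.0, K_e=2.0, C_e=0.5, K_Pf=0.02, K_If=2.0,
                     tau=0.05, dt=1e-3, steps=20000):
    x, xdot, integ, F_e = 0.0, 0.0, 0.0, 0.0
    for _ in range(steps):
        F_e = K_e * x + C_e * xdot          # contact force while pressing
        e_f = F_d - F_e                     # force tracking error
        integ += e_f * dt
        x_f = K_Pf * e_f + K_If * integ     # PI -> position correction
        xdot = (x_f - x) / tau              # idealized inner position loop
        x += xdot * dt
    return F_e

F_e_final = outer_force_loop()              # settles toward F_d
```

The integral term is what forces the steady-state contact force to the desired value even though the environment stiffness is unknown to the controller.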
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Implementation procedure: the LQR solution (offline)
The main objective of this work is to determine the optimal desired force F_d, i.e. the minimum force that minimizes the position error e_r:
F_d(t) = κ_e e_r(t)
The discounted cost function:
J(e_r(t), F_d(t)) = ∫_t^∞ e^(−γ(τ−t)) [ e_r^T S e_r + F_d^T R F_d ] dτ
The algebraic Riccati equation:
A^T P A + S − P + A^T P B κ_e = 0
The feedback control gain:
κ_e = −(B^T P B + R)^(−1) B^T P A
This is an offline solution: it requires knowledge of the environment model (A, B).
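The offline LQR recipe above can be sketched as a fixed-point iteration on the discrete-time Riccati equation, followed by the gain formula; the scalar values of A, B, S, R below are illustrative assumptions, not the thesis environment model.

```python
# Iterate the discrete-time algebraic Riccati equation
#   P = A'PA + S - A'PB (B'PB + R)^-1 B'PA
# to a fixed point, then form kappa_e = -(B'PB + R)^-1 B'PA,
# matching the ARE and gain above. Scalar system for clarity.
def solve_dare(A, B, S, R, iters=2000):
    P = S
    for _ in range(iters):
        P = A * P * A + S - (A * P * B) ** 2 / (B * P * B + R)
    return P

def feedback_gain(P, A, B, R):
    return -(B * P * A) / (B * P * B + R)

A, B, S, R = 0.95, 0.1, 1.0, 10.0      # illustrative scalars
P = solve_dare(A, B, S, R)
kappa_e = feedback_gain(P, A, B, R)    # desired-force gain: F_d = kappa_e * e_r
```

Substituting the gain back into the ARE shows P is a fixed point, and the closed loop e_r ← (A + B kappa_e) e_r is stable.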
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective is again the optimal desired force F_d(t) = κ_e e_r(t) that minimizes the position error e_r; since the model-based LQR solution is impossible when the environment is unknown, an RL solution is formulated.
The discounted value function:
V_s(e_r(t)) = ∫_t^∞ e^(−γ(τ−t)) r(e_r(τ), F_d(τ)) dτ, where r(e_r, F_d) = e_r^T S e_r + F_d^T R F_d
The Hamilton-Jacobi-Bellman equation:
min over F_d of [ V̇_s(e_r(t)) − γ V_s(e_r(t)) + r(t) ] = 0
The new algebraic Riccati equation:
e_r^T(t) (A^T P + P A + S − γP) e_r(t) + 2 e_r^T(t) P B F_d(t) + F_d^T(t) R F_d(t) = 0
With the Q-function Q(e_r(t), F_d(t)) = V_s(e_r(t)) = X^T H X, X = [e_r; F_d], the optimal desired force is
F_d(t) = −H_22^(−1) H_21 e_r(t)
This is an online solution, using off-policy RL to find the solution of the given new ARE.
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The IRL approximation finds the optimal desired force F_d(t) = κ_e e_r(t) by minimizing the temporal-difference error.
The approximator of the Q-value function:
Q̂(e_r, F_d) = Σ_{i=1..N} θ_i φ̄_i(e_r, F_d)
with Gaussian basis functions
φ_i(e_r, F_d) = exp( −(1/2) [e_r − c_ri, F_d − c_di] Σ_i^(−1) [e_r − c_ri, F_d − c_di]^T )
normalized as φ̄_i = φ_i / Σ_{j=1..N} φ_j.
The reward (utility) function:
r(t) = e_r^T(t) S e_r(t) + F_d^T(t) R F_d(t)
The Q-function update law based on IRL, over an interval Δt:
Q(e_r(t), F_d(t)) = ∫_t^{t+Δt} e^(−γ(τ−t)) r(e_r(τ), F_d(τ)) dτ + e^(−γΔt) Q(e_r(t+Δt), F_d(t+Δt))
The parameter update law adjusts θ by a normalized gradient step that minimizes the temporal-difference error of this equation.
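The Q-function approximator and TD-style parameter update above can be sketched as follows; the Gaussian centers, common width, learning rate, discount, and interval are illustrative assumptions, not the thesis settings.

```python
import math

# Sketch of the normalized-Gaussian Q-function approximator and one
# gradient step on the IRL Bellman residual. Centers/width are assumed.
CENTERS = [(-1.0, -1.0), (0.0, 0.0), (1.0, 1.0)]   # (c_r, c_d) pairs
SIGMA = 1.0

def features(e_r, F_d):
    """Normalized Gaussian basis phi_bar_i(e_r, F_d)."""
    raw = [math.exp(-0.5 * ((e_r - cr) ** 2 + (F_d - cd) ** 2) / SIGMA ** 2)
           for cr, cd in CENTERS]
    s = sum(raw)
    return [p / s for p in raw]

def q_value(theta, e_r, F_d):
    return sum(t * p for t, p in zip(theta, features(e_r, F_d)))

def td_update(theta, e_r, F_d, integral_reward, e_r_next, F_d_next,
              gamma=0.9, dt=0.001, lr=0.1):
    """One step toward Q(t) = int. discounted reward + e^(-gamma dt) Q(t+dt)."""
    target = integral_reward + math.exp(-gamma * dt) * q_value(theta, e_r_next, F_d_next)
    delta = q_value(theta, e_r, F_d) - target
    phi = features(e_r, F_d)
    return [t - lr * delta * p for t, p in zip(theta, phi)]
```

Each update moves Q̂ at the visited state-action pair toward the measured one-interval return plus the discounted bootstrap value, which is exactly the Bellman condition being enforced.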
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Validation results
Parameters
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −4006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position/force control: KPf = 0.001, KIf = 0.02, λ = 0.2, ηΔ = 0.1, ρ1 = 0.5, ρ2 = 20, v = 911, α = 57
- Environment: Ce = 0.5 Ns/mm, Ke = 2 N/mm, r0 = 165 mm, xd = 39.917 mm, xe = 35 mm
Fig 51 RL in unknown environments during the learning process
Fig 52 Learning curves
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Validation results
Fig 53 Desired force/position tracking response
Three steps are required to complete the experiment:
1. identification of the environment;
2. obtaining the ARE solution (the LQR solution) using the environmental estimate;
3. the complete experiment with the position/force controller.
The IRL approximation achieves near-optimal performance without knowledge of the environment dynamics.
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Proposed control idea
1. Find the optimal parameters of the impedance model to assure motion tracking and assist the human to perform the task with minimum effort in real time.
2. The stability of the closed-loop control is theoretically demonstrated.
3. The unknown human dynamics, as well as the optimal overall human-robot system efficiency, are verified in experimental trials.
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design (robot-specific inner loop, task-specific outer loop)
- The control torque in task space combines a feedback term on the tracking error, an integral action, and the NN-based compensation of the human torque.
- An approximation of the continuous function: the unknown human torque is approximated by a neural network, τ̂_h = Ŵ^T φ(q), whose weights Ŵ follow adaptive update laws.
- The prescribed impedance model K_h / (M_d s^2 + C_d s + K_d) shapes the response to the human torque τ_h.
[Block diagram: Human Operator → Torque control law → Haptic interface based-APAM; the Prescribed Impedance Model and the NN functional approximation close the inner loop, while the Optimal Law-based IRL tunes K_h in the outer loop.]
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design: IRL based on the LQR solution
- The human transfer function [99] maps the interaction torque to the trajectory adjustment (purpose: τ_h → e_d); it is modeled as a first-order system G(s) with gains κ_p and κ_d and time constant k_e.
- A new augmented state stacks the impedance-model error e_d and the human-model error e_h into X, giving the linear dynamics Ẋ = AX + Bu_e with the state feedback u_e = −KX.
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design: from the RL solution to the IRL solution
To minimize the infinite-horizon quadratic cost (performance index)
J_m = ∫_t^∞ ( e_d^T Q_d e_d + e_h^T Q_h e_h + u_e^T R u_e ) dτ
for the augmented state Ẋ = AX + Bu_e, solve the algebraic Riccati equation [83]
0 = A^T P + P A + Q − P B R^(−1) B^T P
The optimal feedback control is
u_e* = argmin over u_e of V(t, X(t)) = −R^(−1) B^T P X
For a stabilizing policy u = −KX, the value function
V(X(t)) = ∫_t^∞ X^T (Q + K^T R K) X dτ = X^T P X
where P is the real symmetric positive definite solution of
(A − BK)^T P + P (A − BK) = −(Q + K^T R K)
The IRL Bellman equation for a time interval ΔT > 0:
V(X(t)) = ∫_t^{t+ΔT} X^T (Q + K^T R K) X dτ + V(X(t + ΔT))
which can be written as
X^T(t) P X(t) = ∫_t^{t+ΔT} X^T (Q + K^T R K) X dτ + X^T(t+ΔT) P X(t+ΔT)
Proof of convergence: See Appendix 6.2.
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 61 Continuous-time linear policy iteration algorithm
Online implementation:
1. Initialization: start with an admissible control input u_e^0 = −K^0 X (P^0 = 0, i = 1, with A − BK^0 stable).
2. Policy evaluation: given the control policy u^i, find P^i from the off-policy IRL Bellman equation
   X^T(t) P^i X(t) = ∫_t^{t+ΔT} X^T (Q + (K^i)^T R K^i) X dτ + X^T(t+ΔT) P^i X(t+ΔT)
   solving for the cost using least squares over measured data.
3. Policy improvement: update the control policy u^{i+1} = −R^(−1) B^T P^i X.
4. Set i → i + 1 and repeat from step 2 until ‖p^i − p^{i−1}‖ falls below a tolerance.
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 62 The tracking error between the robot system trajectory and the prescribed impedance model
Fig 63 Reference trajectory versus the state of the robot system during and after learning
Validation results: simulation
Parameters
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −4006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: Ks = 20, F = 10, G = 1, μ1 = μ2 = 5, k = 0.1, Z = 100, κp = 20, κd = 10, ke = 0.18 s
- Impedance model: θd = 10sin(0.2πt), Md = 1 kg, Bd = 718 Ns/mm, Kd = 1576 N/mm, Kh = 0.046
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 64 The output trajectory and the impedance model in the inner loop
Fig 65 The desired trajectory and the output trajectory in the outer loop
Fig 66 Interaction force
Validation results: experiment
Discussion
- Excellent performance of the human-robot cooperation
- Only a small deviation between the desired and actual trajectories
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Proposed control idea
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed.
2. The stability and the finite-time convergence characteristic of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified in different working conditions.
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Control design
- The external torque estimation: an adaptive force observer (AFOB) on each side estimates the operator and environment torques without force sensors.
- An adaptive FSOB gain law switches the observer gain according to the sliding variable, which avoids overestimating the gain.
- The modified FNTSM surface s_i combines the tracking error with nonlinear fractional-power terms for fast, singularity-free finite-time convergence.
- The torque control design in FNTSMC style: the control on each side combines the model terms M_i and C_i, the estimated torque, and switching terms on s_i with fractional power ν_i.
[Block diagram: Operator → Master Controller → APAM Master system, and Slave Controller → APAM Slave system → Environment; estimators on both sides supply the estimated torques τ̂_md and τ̂_sd, and an impedance filter shapes the reflected force between the sides.]
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Validation results
Parameters
- Proposed control: λi = 8, pi = 5, qi = 7, ςi = 10^-4, νi = 911, εi = 0.1, ρ1i = 10, ρ2i = 2
- PID control: KP = 38, KI = 25, KD = 0.15
- Observers: for the NDO, Lh = Le = 10; for the RTOB, β = 350 rad/s; for the AFOB, κi(0) = 0.1, λi = 10, ηi = 0, m = 0.1, a = 10
Scenarios
- The free-space situation: θd(t) = 5π(1 − 0.18t)sin(0.72πt)
- The contact-and-recovery situation: soft environment Ke = 500 N/mm, Be = 4 Ns/mm; stiff environment Ke = 2800 N/mm, Be = 20 Ns/mm
Fig 71 Torque estimation performances
The RMSEs associated with the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 72 Position tracking performances
Fig 72 Position tracking error performances
Fig 72 Transparency performance in "soft" contact
Fig 72 Transparency error in soft contact
Fig 72 Transparency performance in "stiff" contact
Fig 72 Transparency error in stiff contact
Validation results
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Summary
- Finite-time convergence of the tracking errors
- Elimination of uncertainties and of the noise effect in the obtained force sensing via the new AFOB
- Robust performance against uncertainties and disturbances in the system dynamics
- Stability of the overall system and transparency performance attained simultaneously
- Fast transient response
The great transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and in different working conditions.
8 CONCLUSION AND FUTURE WORKS
Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions (uncertainties and disturbances are handled).
- Validated the significant effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding a highly effective position control.
- Achieved good transparency with both force feedback and position tracking simultaneously by the new adaptive force estimation combined with separate FNTSMC schemes on each side.
8 CONCLUSION AND FUTURE WORKS
Future works
- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both the master and slave sides; the computational complexity should also be reduced as the data size grows.
- Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device for the rehabilitation field with a safer solution.
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
1 2 3 4 5 6 7 8
20-May-21 | 12Presenter Cong Phat Vo
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Mathematical modeling
n turns
nπD
bL
1 t
urn
s
D
2 2
2
3
4
dV dV d b LF P P P
dL dL d n
minus= minus = minus = minus
The tensile force of each PAM [7]
Fig 21 A cross-sectional view of the simplified geometric model
The motion equation of the system
1 2Mx Bx F F+ = minus +
Parameter Description Value UnitPs Supply pressure 65 bar
B Viscous damping coefficient 50 Nsmm
M Mass of the plate 05 Kg
L0 Original length of the PAM 175 mm
n Number of turns of the thread 104
b Thread length 200 mm
Table 21 Parameters of the studied PAM
The motion equation of the system
0 0( ) ( )x f x x g x u d= + +
where f0(x) g0(x) are the nominal value of f(x)
g(x) d is a total of unknown disturbances
( )2 2 2
00 0
2 2
0 0
33 ( ) ( )
2
( ( ) ( )) ( ( ) ( ))
L x bP L x Bxf x g x
n M M n M
d f x x f x x g x g x u
+ minus = minus minus = = minus + minus +
1 2 3 4 5 6 7 8
20-May-21 | 13Presenter Cong Phat Vo
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Experimental modeling
The approximated PAM model based on its
pressure and contraction strain
1 0
1 0
1 0
( ) ( ) ( )
( )
( ) ( 1 2)
( )
k k k k k k k k k k
k k k k k
k k k k k
k k k k k
M b P k P f P
b P b P b
k P k P k k
f P f P f
+ + =
= +
= + = = +
Parameter Value Units
Spring
elements
kk1 11732 Nmbar
kk0 68 Nm
Damping
elements
bk1 962 Nmsbar
bk0 085 Nms
Force
elements
fk1 9122 Nbar
fk0 473 N
0-100
6
0
100
55
200
Fo
rce
[N
]
Contraction length
[mm]
300
400
4
Pressure [bar]
500
103 2 151
Fig 23 Experimental model for certain separate pressure
Table 22 List of APAM model parameters
Load
PAM 1 PAM 2Plate
Linear encoder
Proportional pressure
regulator valves
Card
PCI-1711
Card
PCI-QUAD04
Pressure signal
Control signal
Position signal
Pulley
Fig 22 Block diagram of the experimental system
1 2 3 4 5 6 7 8
20-May-21 | 14Presenter Cong Phat Vo
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Platform setup
Fig 25 The APAM-based (a) master and (b) slave subsystem configuration
Proportional
pressure
regulator
valves
Optical
encoder
Control stick
Loadcells
Pneumatic
Muscle
Actuators Interactive
Loadcell
Pulley
Front viewBack view(a)
Proportional valve
Loadcell DamperSpring
Environment
APAM system
PAMPulley
PositionSignal
ControlSignal
StiffnessDamping
Contact Force Signal
PCPCI Cards
Control box
(b)
1 2 3 4 5 6 7 8
20-May-21 | 15Presenter Cong Phat Vo
2 MODELING OF THE BILATERAL TELEOPERATION SYSTEM
Slave robot
Environment
Master robot
Control stick
Device Description
PAM
Festo MAS-10-200N-AA-MC-O
Nominal length 200 mm
Inside diameter 10 mm
Max pressure 8 bar
PPR ValveFesto VPPM-8L-L-1-G14-0L8H
Max pressure 8 bar
DAQ Card
ADVANTECH PCI-1711
AIAO 12 bits (resolution)
MEASUREMENT PCI-QUAD04
Linear
encoder
Type Rational WTB5-0500MM
Resolution 0005 mm
Table 23 Specifications of the exp devices
Fig 26 Photograph of the experimental apparatus of the overview BHTS
Platform setup
1 2 3 4 5 6 7 8
20-May-21 | 16Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
4
5
6
C
H
A
P
T
E
R
1
7
8
Adaptive Gain ITSMC with TDE scheme for the position control3
Modeling and platform setup based on the APAM configuration2
1 2 3 4 5 6 7 8
20-May-21 | 17Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Proposed control idea
3The control effectiveness is verified by experimental trials in
different challenging work conditions (compare several controllers)
2
The stability of the controlled system is theoretically analyzed
The first time the AITSMC-TDE scheme
is proposed for a PAM system1
1 2 3 4 5 6 7 8
20-May-21 | 18Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
2 1 1 2 1s e k e k e= + +
( ) ( ) (( ) ( ) )
0
1
0 1 2 2 1 2
0 1 2 0 1 1
t
t
n d
s t s t k e k e e
f x x g x u x d
minus= minus minus +
+ + minus
( ) ( )( ) ( ) 31
0 1 3 3 4ˆ
su t g x sign
minus = minus + +
( )( ) ( )
( )3 3 0
3
ˆ ˆ1ˆ or
( )mt t
tsign t
= +=
+ =
A terminal sliding surface is chosen
A proposed integral sliding surface is designed
An adaptive law for the robust gain
A robust control signal can be redesigned
Proof See more detail in Appendix 32 33
1 2d dx x
Time-delay
Estimator
The PAM system
Control law
Robust
control signal
Adaptation
law
Nominal
control signal
Transfer
Coordinate Integral Terminal
Sliding Surface
Terminal
Sliding Surface
Reference
Trajectory
d
The proposed
control approach
u
1 2x x
su
3ˆ
1 2x x
s
1
2
nu
Control design
1 2 3 4 5 6 7 8
20-May-21 | 19Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
( ) ( ) ( )( )( )( ) ( )
2 0 1 2
0 1
ˆ
d d d
d d
d x t t f x t t x t t
g x t t u t t
= minus minus minus minus
minus minus minus
( ) ( )1
1ˆ
n su t u u g x dminus
= + minus
The control law can be rewritten
where
Proof See more detail in Appendix 31 34-35
Control design
1 1 1 1
2 2 1
de x x
x
= = minus = minus
( )( ) ( )1 2
1
0 1 1 1 0 1 2
1
1 1 1 1 1 2 2 2 2
n du g x x f x x
minus
minus
= minus minus
minus minus minus minus
The auxiliary variables are defined
The nominal control signal can be designed as
1 2d dx x
Time-delay
Estimator
The PAM system
Control law
Robust
control signal
Adaptation
law
Nominal
control signal
Transfer
Coordinate Integral Terminal
Sliding Surface
Terminal
Sliding Surface
Reference
Trajectory
d
The proposed
control approach
u
1 2x x
su
3ˆ
1 2x x
s
1
2
nu
1 2 3 4 5 6 7 8
20-May-21 | 20Presenter Cong Phat Vo
Parameters
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Experimental validation
for ITSMC η1 = 3 η2 = 15 η3 = 25 κ1 = κ3 = 10
κ2 = κ4 = 2 γ1 = γ2 = 57 k1 = 2 k2 = 55 α = 911
for adaptative gain μ = 10 δ = 005 σ = 01
for case study 3 KP = 15 KI = 18 KD = 005
for other case KP = 35 KI = 28 KD = 01
Proposed control
PID control
Fig 31 Performance response in case study 1
Fig 32 Performance response in case study 2
Case study 1 (no load) xd (mm) = 8sin(04πt)
Case study 2 (no load) xd (mm) = 8sin(πt)
Case study 3 (no load) A multistep trajectory
Case study 4 (with load) xd (mm) = 8sin(πt)
in turn 2-kg-load and 7-kg-load
1 2 3 4 5 6 7 8
20-May-21 | 21Presenter Cong Phat Vo
Fig 33 Performance response in case study 3
Fig 34 Performance response in case study 4
Controller PID ITSMC Proposed
Sine 02 Hz ndash 8 mmL2[e] 04102 01737 01574
L2[u] 09961 10270 09602
Sine 1 Hz ndash 8 mmL2[e] 11845 04996 03788
L2[u] 11284 11152 10900
MultistepL2[e] 08587 06846 06368
L2[u] 07670 07333 07247
Sine 1 Hz ndash 8 mm
with 2-kg-load
L2[e] 12578 08314 04779
L2[u] 11414 11506 11287
Sine 1 Hz ndash 8 mm
with 7-kg-load
L2[e] 15937 10827 05067
L2[u] 11613 11729 11584
Table 31 A quantitative comparison between the tracking performances
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Case study 1 (no load) xd (mm) = 8sin(04πt)
Case study 2 (no load) xd (mm) = 8sin(πt)
Case study 3 (no load) A multistep trajectory
Case study 4 (with load) xd (mm) = 8sin(πt)
in turn 2-kg-load and 7-kg-load
Experimental validation
1 2 3 4 5 6 7 8
20-May-21 | 22Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Summary
Robust performance against uncertainties and disturbances1 2d dx x
Time-delay
Estimator
The PAM system
Control law
Robust
control signal
Adaptation
law
Nominal
control signal
Transfer
Coordinate Integral Terminal
Sliding Surface
Terminal
Sliding Surface
Reference
Trajectory
d
The proposed
control approach
u
1 2x x
su
3ˆ
1 2x x
s
1
2
nu
Finite-time convergence of the tracking errors
Reducing the chattering phenomenon
Eliminating the reaching phase
Fast transient response
Next works will be directed to the control using the force properties
1 2 3 4 5 6 7 8
20-May-21 | 23Presenter Cong Phat Vo
CONTENTS
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
5
6
C
H
A
P
T
E
R
1
7
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Adaptive Finite-Time Force Sensorless Control scheme4
1 2 3 4 5 6 7 8
20-May-21 | 24Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
3 Verified the reliability and superiority of the proposed control
demonstrated in different challenging work conditions
2
The FITSMC guarantees the finite convergence new adaptive gain and TDE
are deployed to handle online (uncertainties and chattering phenomenon)
The proposed FSOB scheme obtains
force information eliminated noise and
static friction
1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 25Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
( )[ ]
1 20
( ) ( ( )) = + +t
v
f f fe e sign e d
( ) ( )[ 1]5
5
4
ˆ
+
minus= minus
v
signsign
1 1 [ ]
1 2
[ ]
4 5
ˆ( ) ( ) ( )
ˆ ( )
v
c d f f
v
u t d t K e sign e
sign
minus minus= + + +
+ +
1ˆ ˆ ˆ( ) ( ) ( ) ( )minus minus = minus minus minusd d c e dd t d t t u t t K t t
The FITSM surface is defined
The control law can be rewritten
2
2
ˆ ˆ ˆ2ˆ
+ += e e e e e e
e
e
An adaptive switching gain law An estimated lumped disturbance Force observer based-LeD [40]
APAM system
Force based
Time-delay
Estimator
Integral Terminal
Sliding SurfaceControl law
Adaptive law
Environment
Configuration
Reference
Trajectoryu
d
ˆe
d
+
+
The proposed control
approach
2k
Plant + Environment
Force Observer
e
Proof See in Appendix 41 ndash 43
Control design
1 2 3 4 5 6 7 8
20-May-21 | 26Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
for ITSMC η1 = 3 η2 = 15 η3 = 25 κ1 = κ3 = 10
κ2 = κ4 = 2 γ1 = γ2 = 57 k1 = 2 k2 = 55 v = 911
for adaptative gain μ = 10 δ = 005 σ = 01
for case study 3 KP = 15 KI = 18 KD = 005
for other case KP = 35 KI = 28 KD = 01
Proposed control
PID control
Fig 42 Performance response in Scenario 2
Scenario 1 τd (N) = 75sin(02πt- π2)+625
Scenario 2 τd (N) = 75sin(08πt- π2)+625
Scenario 3 A multistep trajectory with a LPFrsquos bandwidth
of 02(rads)
Observersfor FSOB ω = 500 rads ξ = 0001 α1 = 20 α2 = 12
α3 = 5 α4 = 15
Fig 42 Performance response in Scenario 1
Parameters
Experimental validation
1 2 3 4 5 6 7 8
20-May-21 | 27Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Fig 43 Performance response in Scenario 3
Fig 44 Control gain
Controller PID ISMC FITSMC-TDE Proposed
Sine 01 Hz 21050 12152 09440 05158
Sine 04 Hz 32054 18045 20632 19807
Multistep 26508 23548 20618 17685
Table 41 A RMSE comparison between the tracking error performances
Experimental validation
- Improved the control performances (fast response high
accuracy and robustness)
- Guaranteed the finite convergence
- Cancelling the lumped uncertainties via TDE solution and
the switching gain in the FITSMC online
Discussion
1 2 3 4 5 6 7 8
20-May-21 | 28Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
4
6
C
H
A
P
T
E
R
1
7
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Model-free control for Optimal contact force in Unknown Environment5
1 2 3 4 5 6 7 8
20-May-21 | 29Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
3 Verified effective for robot interaction control in
unknown environments
2 The stability of the closed-loop control is theoretically demonstrated
by the Lyapunov approach
The RL method is used to learn on-line
the optimal desired force1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 30Presenter Cong Phat Vo
( ) ( ) ( ( ))= minus + minuse e e eF t C x t K x x t
( ( )) ( )r dZ e t F t= [ ]
1 2
( ) ( ) ( )
( ) ( )
= + +
minus minus minus +
r r r
v
e
Me t Ce t Ke t
sign F t
The force control
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The environment is modeled as
a spring-damper system [81]
The optimal impedance model The position control in task space
( ) ( ) ( )
= + f Pf f If f
t
x s K e t K e d
Task-specific Outer-loopRobot-specific Inner-loop
Precise position
control
Prescribed
Impedance Model
Environment
Configuration
APAM system
xx
d
Force controlx
rx
fe
r
Fe
Fd
ef
-
+ -+ -
+
RL-based
approximation
ϕ
Control design
1 2 3 4 5 6 7 8
( ) ( ) ( )r r de t Ae t BF t= +
20-May-21 | 31Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
LQR solution
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
( ) ( )=d e rF t e t
The discounted cost function
( ) ( )( ) ( ) ( ) ( ) ( ) ( )
= minus +T T
s r d r r d dt
J e t F t e Se F RF d
The Algebraic Riccati equation
0+ + + =T T
eA PA S P A PB
The feedback control gain
( )1
1 minus
minus= minus +T T
e B PB R B PA
IRL approximationRL solution
Implementation procedure
1 2 3 4 5 6 7 8
offline soliton
20-May-21 | 32Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
LQR solution RL solution
impossible for all time
The Hamilton-Jacobi-Bellman equation
( ) ( ) ( )min ( ) ( ) ( ) 0
minus minus + minus =
d
t
s rtF
V e t r t r e dt
The new Algebraic Riccati equation
( )( ) ( )
2 ( ) ( ) ( ) ( ) 0
T T
r r
T T
r d d d
e t A P PA S P e t
e t PBF t F t RF t
+ + minus
+ + =
The optimal desired force 1
22 21( ) ( )d rF t H H e tminus= minus
( ) ( ) ( ) ( ) ( )= = T
r d s rQ e t F t V e t X HX
IRL approximation
The discounted cost function
( ) ( )
( )
( )( ) ( ) ( )
where ( ) ( ) ( ) ( ) ( ) ( )
t
s r r dt
T T
r d r r d d
V e t r e F e d
r e F e Se F RF
minus minus=
= +
Implementation procedure
1 2 3 4 5 6 7 8
( ) ( )=d e rF t e t
online soliton using the off-policy RL to
find the solution of the given new ARE
20-May-21 | 33Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
LQR solution RL solution
minimize temporal difference error
The approximator of the Q-value function
( )( ) ( )1
ˆ N
r d i r d i
i
Q e F e F =
=
The Gaussian basis function
φ_i(e_r, F_d) = exp( −½ [e_r − c_i^r; F_d − c_i^d]^T Σ_i^(−1) [e_r − c_i^r; F_d − c_i^d] )
The parameter update law (normalized gradient descent on the TD error e_Q)
Q̂(e_r(t), F_d(t)) = θ^T(t) φ̄(t)
dθ(t)/dt = −β φ̄(t) e_Q(t) / (1 + φ̄^T(t) φ̄(t))²
IRL approximation
Q-function update law based on IRL
Q(e_r(t), F_d(t)) = ∫_t^(t+Δt) e^(−γ(τ−t)) r(e_r(τ), F_d(τ)) dτ + e^(−γΔt) Q(e_r(t+Δt), F_d(t+Δt))
The reward (utility) function
r(t) = e_r^T(t) S e_r(t) + F_d^T(t) R F_d(t)
Implementation procedure
1 2 3 4 5 6 7 8
The normalized basis
φ̄_i(e_r, F_d) = φ_i(e_r, F_d) / Σ_{i=1}^N φ_i(e_r, F_d)
F_d(t) = κ_e e_r(t)
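A minimal sketch of the normalized Gaussian bases and the TD-style update above; the centers, width, and rates below are illustrative placeholders rather than the thesis values:

```python
import numpy as np

def phi_bar(x, centers, width=0.5):
    """Normalized Gaussian bases over the stacked (e_r, F_d) input."""
    g = np.exp(-0.5 * np.sum((x - centers) ** 2, axis=1) / width ** 2)
    return g / np.sum(g)

def td_step(theta, x, x_next, reward, centers, gamma=0.9, dt=0.001, beta=0.1):
    """One gradient step on the IRL temporal-difference error:
    target = reward*dt + exp(-gamma*dt) * Q_hat(x_next)."""
    q = theta @ phi_bar(x, centers)
    target = reward * dt + np.exp(-gamma * dt) * (theta @ phi_bar(x_next, centers))
    e_q = q - target
    return theta - beta * e_q * phi_bar(x, centers), e_q

# Illustrative centers on a small grid; repeated updates on one transition
centers = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
theta = np.zeros(4)
x, x_next = np.array([0.1, 0.1]), np.array([0.9, 0.9])
for _ in range(200):
    theta, e_q = td_step(theta, x, x_next, reward=1.0, centers=centers)
```

Each step shrinks the TD error, driving Q̂ toward consistency with the Bellman update above.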
20-May-21 | 34Presenter Cong Phat Vo
Fig 5.1 RL in unknown environments during the learning process
Fig 5.2 Learning curves
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Parameters
LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
Position/Force control: KPf = 0.001, KIf = 0.02, λ = 0.2, ηΔ = 0.1, ρ1 = 0.5, ρ2 = 2.0, v = 9/11, α = 5/7
Environment: Ce = 0.5 N·s·mm⁻¹, Ke = 2 N·mm⁻¹, r0 = 16.5 mm, xd = 39.917 mm, xe = 35 mm
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 35Presenter Cong Phat Vo
Fig 5.3 Desired force/position tracking response
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Three steps are required to complete the experiment:
1) identification of the environment;
2) obtaining the ARE solution using the environment estimate;
3) the complete experiment with the position/force controller.
IRL approximation
Achieving near-optimal performance without
knowledge of the environment dynamics
LQR solution
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 36Presenter Cong Phat Vo
CONTENTS
1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 37Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Proposed control idea
1 Find the optimal parameters of the impedance model to assure motion tracking and also assist the human with minimum effort to perform the task in real time
2 The stability of the closed-loop control is theoretically demonstrated
3 The unknown human dynamics as well as the optimal overall human-robot system efficiency are verified in experimental trials
1 2 3 4 5 6 7 8
20-May-21 | 38Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Robot-specific Inner-loop | Task-specific Outer-loop
The control torque in task space
τ = ξ̂_q + K_s ( ė_h + λ_1 e_h + λ_2 ∫_0^t e_h dτ ) + K_B ( ‖Ẑ‖_F + Z̄ ) ė_h
An approximation of the continuous function (two-layer NN)
ξ̂_q = Ŵ_1^T σ̂(Ŵ_2^T q̄)
dŴ_1/dt = F σ̂ ė_h^T − F σ̂′ Ŵ_2^T q̄ ė_h^T − k F ‖ė_h‖ Ŵ_1
dŴ_2/dt = G q̄ (σ̂′^T Ŵ_1 ė_h)^T − k G ‖ė_h‖ Ŵ_2
The prescribed impedance model
θ_m = K_h τ_h / (M_d s² + C_d s + K_d)
(Block diagram: the Human Operator's torque τ_h drives the Prescribed Impedance Model; its output θ_d is compared with θ_m, and the torque control law with NN functional approximation actuates the APAM-based haptic interface, while the IRL-based optimal law tunes the impedance parameters through K_h.)
Control design
1 2 3 4 5 6 7 8
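The two-layer approximation above can be sketched as a simple online regression. The sigmoid activation, the gains F and k, and the constant training target here are illustrative assumptions, not the exact thesis update laws:

```python
import numpy as np

def nn_forward(W1, W2, q):
    """Two-layer approximator: xi_hat = W1' * sigmoid(W2' * q)."""
    sigma = 1.0 / (1.0 + np.exp(-(W2.T @ q)))
    return W1.T @ sigma, sigma

def outer_update(W1, sigma, err, F=10.0, k=0.1, dt=0.01):
    """Gradient update with a robustifying e-modification term:
    W1_dot = F*sigma*err' - k*F*||err||*W1."""
    return W1 + dt * (F * np.outer(sigma, err) - k * F * np.linalg.norm(err) * W1)

# Illustrative online approximation of a constant target output
rng = np.random.default_rng(0)
W2 = rng.standard_normal((2, 5))        # fixed inner-layer weights
W1 = np.zeros((5, 1))                   # trained outer-layer weights
q = np.array([0.3, -0.2])
target = np.array([1.0])
for _ in range(500):
    out, sigma = nn_forward(W1, W2, q)
    W1 = outer_update(W1, sigma, target - out)
out, _ = nn_forward(W1, W2, q)
```

The e-modification term bounds the weights without requiring persistent excitation, which is why such terms appear in NN adaptive controllers of this style.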
20-May-21 | 39Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Robot-specific Inner-loop | Task-specific Outer-loop
The human transfer function [99] (purpose: τ_h → e_d)
G(s) = (κ_p + κ_d s) / (k_e s + 1)
equivalently dτ_h/dt = A_h τ_h + B_h ē_d, with A_h = −k_e^(−1), B_h = k_e^(−1) [κ_p κ_d], ē_d = [e_d^T ė_d^T]^T
A new augmented state
ė_d = A_q e_d + B_q u_e, X = [ē_d^T τ_h^T]^T
Ẋ = [A_q 0; B_h A_h] X + [B_q; 0] u_e, u_e = −K X
(Block diagram: same control structure as on the previous slide.)
IRL based on LQR solution
Control design
1 2 3 4 5 6 7 8
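The operator model above is a first-order lag and can be stepped with plain Euler integration. The gains kp and kd below are illustrative, while k_e = 0.18 s matches the parameter slide later in this chapter:

```python
import numpy as np

def human_response(e_d, de_d, dt=0.001, k_e=0.18, kp=2.0, kd=1.0):
    """Euler integration of tau_h_dot = (-tau_h + kp*e_d + kd*de_d)/k_e,
    i.e. the operator model G(s) = (kp + kd*s)/(k_e*s + 1)."""
    tau = np.zeros(len(e_d))
    for k in range(1, len(e_d)):
        tau_dot = (-tau[k - 1] + kp * e_d[k - 1] + kd * de_d[k - 1]) / k_e
        tau[k] = tau[k - 1] + dt * tau_dot
    return tau

# Step response: e_d jumps to 1, so tau_h settles at kp (the DC gain of G)
n = 2000
tau = human_response(np.ones(n), np.zeros(n))
```

The DC gain of G(s) is κ_p, so a sustained position error produces a proportional steady-state human torque, which is the coupling the augmented state exploits.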
20-May-21 | 40Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
RL solution | IRL solution
To minimize the infinite-horizon quadratic cost (the performance index)
J_m = ∫_t^∞ ( e_d^T Q_d e_d + τ_h^T Q_h τ_h + u_e^T R u_e ) dτ
with the new augmented state
Ẋ = [A_q 0; B_h A_h] X + [B_q; 0] u_e, u_e = −K X
solve the algebraic Riccati equation [83]
0 = A^T P + P A + Q − P B R^(−1) B^T P
Control design
The IRL Bellman equation for time interval ΔT > 0
V(X(t)) = ∫_t^(t+ΔT) X^T (Q + K^T R K) X dτ + V(X(t+ΔT))
with the infinite-horizon value
V(X(t)) = ∫_t^∞ X^T (Q + K^T R K) X dτ = X^T P X
where P is the real symmetric positive definite solution of
(A − BK)^T P + P(A − BK) = −(Q + K^T R K)
An optimal feedback control
u_e = arg min_{u_e} V(t, X(t), u_e) = −R^(−1) B^T P X
The IRL Bellman equation can be written as
X^T(t) P X(t) = ∫_t^(t+ΔT) X^T (Q + K^T R K) X dτ + X^T(t+ΔT) P X(t+ΔT)
Proof of convergence: see Appendix 6.2
1 2 3 4 5 6 7 8
20-May-21 | 41Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 6.1 Continuous-time linear policy iteration algorithm
1 Start with an admissible control input u_e^0 = −K^0 X
2 Policy evaluation: given a control policy u^i, find P^i using the off-policy Bellman equation
3 Policy improvement: update the control policy u_e^(i+1) = −R^(−1) B^T P^i X
Online implementation
Control design
1 2 3 4 5 6 7 8
X^T(t) P^i X(t) = ∫_t^(t+ΔT) X^T (Q + (K^i)^T R K^i) X dτ + X^T(t+ΔT) P^i X(t+ΔT)
(Flowchart: Begin → Initialization: P^0 = 0, i = 1, K^1 such that A − BK^1 is stable → solve for the cost P^i by least squares over N sampled data intervals → policy update u^i = −R^(−1) B^T P^(i−1) X → if ‖p^i − p^(i−1)‖ < ε, End; otherwise i → i + 1 and repeat.)
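The policy-iteration loop above can be checked offline on a known linear system, where the policy-evaluation step reduces to a Lyapunov solve (in the data-driven IRL case this step is replaced by the least-squares fit over trajectory samples). The matrices below are placeholders:

```python
import numpy as np

def lyap(Ac, Qc):
    """Solve Ac' P + P Ac = -Qc via a Kronecker-product linear system."""
    n = Ac.shape[0]
    M = np.kron(np.eye(n), Ac.T) + np.kron(Ac.T, np.eye(n))
    return np.linalg.solve(M, -Qc.reshape(n * n)).reshape(n, n)

def policy_iteration(A, B, Q, R, K0, iters=30):
    """Kleinman-style policy iteration: evaluate P^i for the current
    policy, then improve the gain K <- R^-1 B' P^i."""
    K = K0
    for _ in range(iters):
        P = lyap(A - B @ K, Q + K.T @ R @ K)
        K = np.linalg.solve(R, B.T @ P)
    return K, P

# Placeholder stable system (K0 = 0 is admissible since A is Hurwitz)
A = np.array([[0.0, 1.0], [-1.0, -2.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2); R = np.array([[1.0]])
K, P = policy_iteration(A, B, Q, R, np.zeros((1, 2)))
```

Each iteration's Lyapunov solve plays the role of the least-squares cost fit in the flowchart; the iterates converge to the ARE solution.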
20-May-21 | 42Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 6.2 The tracking error between the trajectory of the robot system and the prescribed impedance model
Fig 6.3 Reference trajectory versus the state of the robot system during and after learning
Parameters
LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
Position control: Ks = 20, F = 10, G = 1, μ1 = μ2 = 5, k = 0.1, Z̄ = 100, κp = 20, κd = 10, ke = 0.18 s
Prescribed impedance model: θd = 10 sin(0.2πt), Md = 1 kg, Bd = 7.18 N·s·mm⁻¹, Kd = 15.76 N·mm⁻¹, Kh = 0.046
Simulation
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 43Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 6.4 The output trajectory and impedance model in the inner loop
Fig 6.5 The desired trajectory and output trajectory in the outer loop
Fig 6.6 Interaction force
Near-perfect performance of the human-robot cooperation
A small deviation between the desired and actual trajectories
The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller
Experiment
Validation results
Discussion
1 2 3 4 5 6 7 8
20-May-21 | 44Presenter Cong Phat Vo
CONTENTS
1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 45Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Proposed control idea
1 For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed
2 The stability and finite-time convergence characteristics of the closed-loop control are theoretically analyzed by the Lyapunov approach
3 The effectiveness of the proposed control method is verified in different working conditions
1 2 3 4 5 6 7 8
20-May-21 | 46Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
The external torque estimation: a disturbance-observer structure driven by the measured state and the applied torque produces the estimate τ̂_id for each side i ∈ {m, s}.
The modified FNTSM surface design
s_i = σ̇_i + λ_i [σ_i]^(p_i/q_i), where [x]^a = |x|^a sign(x)
The torque control design by FNTSMC
τ_i = M_i θ̈_ir + C_i θ̇_i + K_i θ_i − τ̂_id − κ_i sign(s_i) − ρ_1i s_i − ρ_2i [s_i]^(ν_i)
An adaptive FSOB gain: κ_i is adapted from |s_i| with a dead-zone threshold ε_i, so the switching gain grows only while the sliding variable stays outside the boundary layer and decays otherwise.
(Block diagram: Operator → Master Controller → APAM Master system, and Slave Controller → APAM Slave system → Environment, each side with its own torque Estimator; an Impedance filter couples the master and slave references θ_md, θ_sd and the estimates τ̂_md, τ̂_sd over the communication channel.)
Control design
1 2 3 4 5 6 7 8
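As a generic illustration only: a single-joint, double-integrator sketch of a sliding-mode loop with an adaptive switching gain. The plant, boundary layer, and adaptation rule are simplifications for illustration, not the AFOB/FNTSMC design of this chapter:

```python
import numpy as np

def simulate(T=5.0, dt=0.001, lam=8.0, rho=5.0, eta=5.0, eps=0.01):
    """Single-joint double integrator tracking theta_d = sin(t), with a
    sliding variable s = de + lam*e and an adaptive switching gain kappa
    that grows only while |s| is outside the boundary layer eps."""
    theta, dtheta, kappa = 0.0, 0.0, 0.1
    t = np.arange(0.0, T, dt)
    e_hist = np.zeros_like(t)
    for k, tk in enumerate(t):
        theta_d, dtheta_d, ddtheta_d = np.sin(tk), np.cos(tk), -np.sin(tk)
        e, de = theta - theta_d, dtheta - dtheta_d
        s = de + lam * e
        # adaptive gain: grow outside the boundary layer, decay inside
        kappa += dt * (eta * abs(s) if abs(s) > eps else -eta * kappa)
        d = 0.5 * np.sin(5.0 * tk)          # bounded matched disturbance
        u = ddtheta_d - lam * de - rho * s - kappa * np.tanh(s / eps)
        dtheta += dt * (u + d)              # plant: theta_ddot = u + d
        theta += dt * dtheta
        e_hist[k] = e
    return e_hist

e = simulate()
```

The tanh boundary layer stands in for the discontinuous sign term to limit chattering; the adaptive gain lets the controller reject the disturbance without a priori knowledge of its bound.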
20-May-21 | 47Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 7.1 Torque estimation performances
Parameters
Proposed control: λi = 8, pi = 5, qi = 7, ςi = 10⁻⁴, νi = 9/11, εi = 0.1, ρ1i = 1.0, ρ2i = 2
PID control: KP = 38, KI = 25, KD = 0.15
Observers: for NDO, Lh = Le = 10; for RTOB, β = 350 rad/s; for AFOB, κi(0) = 0.1, λi = 10, ηim = 0.1, a = 1.0
RMSEs associated with the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.
Scenarios: the free-space situation, θd(t) = 5π(1 − 0.18t)sin(0.72πt); the contact-and-recovery situation, with a soft environment (Ke = 500 N·mm⁻¹, Be = 4 N·s·mm⁻¹) and a stiff environment (Ke = 2800 N·mm⁻¹, Be = 20 N·s·mm⁻¹)
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 48Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 7.2 Position tracking performances
Fig 7.3 Position tracking error performances
Fig 7.4 Transparency performance in "soft" contact
Fig 7.5 Transparency error in "soft" contact
Fig 7.6 Transparency performance in "stiff" contact
Fig 7.7 Transparency error in "stiff" contact
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 49Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Summary
Finite-time convergence of the tracking errors
Elimination of the uncertainties and noise effects in the obtained force sensing via the new AFOB
Robust performance against uncertainties and disturbances in the system dynamics
Stability of the overall system and transparency performance attained simultaneously
Fast transient response
Great transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and in different working conditions
1 2 3 4 5 6 7 8
20-May-21 | 50Presenter Cong Phat Vo
CONTENTS
1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 51Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Conclusions
Verified an effective position tracking controller via the AITSMC with TDE technique under various working conditions (its uncertainties and disturbances are handled)
Validated the significant effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB)
Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance
Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding a highly effective position control
Achieved good transparency performance, with both force feedback and position tracking attained simultaneously, by the new adaptive force estimation combined with the separate FNTSMC schemes
1 2 3 4 5 6 7 8
20-May-21 | 52Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Future works
To improve the control performance in joint space, the robot system model can be improved by robust identification approaches
The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains
Apply optimized strategies on both master and slave sides; note also that the computational complexity should be reduced as the data size grows
Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device for the rehabilitation field with a safer solution
1 2 3 4 5 6 7 8
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
1 2 3 4 5 6 7 8
20-May-21 | 31Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
LQR solution
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
( ) ( )=d e rF t e t
The discounted cost function
( ) ( )( ) ( ) ( ) ( ) ( ) ( )
= minus +T T
s r d r r d dt
J e t F t e Se F RF d
The Algebraic Riccati equation
0+ + + =T T
eA PA S P A PB
The feedback control gain
( )1
1 minus
minus= minus +T T
e B PB R B PA
IRL approximationRL solution
Implementation procedure
1 2 3 4 5 6 7 8
offline soliton
20-May-21 | 32Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
LQR solution RL solution
impossible for all time
The Hamilton-Jacobi-Bellman equation
( ) ( ) ( )min ( ) ( ) ( ) 0
minus minus + minus =
d
t
s rtF
V e t r t r e dt
The new Algebraic Riccati equation
( )( ) ( )
2 ( ) ( ) ( ) ( ) 0
T T
r r
T T
r d d d
e t A P PA S P e t
e t PBF t F t RF t
+ + minus
+ + =
The optimal desired force 1
22 21( ) ( )d rF t H H e tminus= minus
( ) ( ) ( ) ( ) ( )= = T
r d s rQ e t F t V e t X HX
IRL approximation
The discounted cost function
( ) ( )
( )
( )( ) ( ) ( )
where ( ) ( ) ( ) ( ) ( ) ( )
t
s r r dt
T T
r d r r d d
V e t r e F e d
r e F e Se F RF
minus minus=
= +
Implementation procedure
1 2 3 4 5 6 7 8
( ) ( )=d e rF t e t
online soliton using the off-policy RL to
find the solution of the given new ARE
20-May-21 | 33Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
LQR solution RL solution
minimize temporal difference error
The approximator of the Q-value function
( )( ) ( )1
ˆ N
r d i r d i
i
Q e F e F =
=
The parameter update law
11( ) exp
2
T
i r d r i d i i r i d ie F e c F c e c F c minus = minus minus minus minus minus
( )
( ) ( ) ( ) ( )
ˆ ( ) ( ) ( )1( ) ( )
( )
=
= minus +
r d
t t t t
Q e t F t tt t
t t
IRL approximation
Q-function update law based on IRL
( )
( )
( )( ) ( ) ( )
( ) ( )
t tt
r dt
t
r d
Q e t F t r e d
e Q e t t F t t
+
minus minus
minus
=
+ + +
( ) ( ) ( ) ( ) ( )T T
r r d dr t e t Se t F t RF t= +
The reward function of the utility function
Implementation procedure
1 2 3 4 5 6 7 8
1
( )( )
( )
i r di r d N
i r di
e Fe F
e F
=
=
( ) ( )=d e rF t e t
20-May-21 | 34Presenter Cong Phat Vo
Fig 51 RL in unknown environments during the
learning process
Fig 52 Learning curves
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
κe = 00025 S = 1 R = 10 T = 0001 A = minus4006
B = minus0002
υ = 09 γ = 09 δΔ = 065 ε = 01 ε-decay = 099
K = 10 β = 01
LQR solution
RL aproximation
PositionForce controlKPf = 0001 KIf = 002 λ = 02 ηΔ = 01 ρ1 = 05
ρ2 = 20 v = 911 α = 57
EnvironmentCe = 05 Nsmm-1 Ke = 2 N mm-1 r0 = 165 mm
xd = 39917 mm xe = 35 mm
Parameters
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 35Presenter Cong Phat Vo
Fig 53 Desired forceposition tracking response
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Three steps are required to complete
the experiment
1-the identification of the environment
2-obtain the ARE solution using the
environmental estimation
3-the complete experiment for the
positionforce controller
IRL approximation
Achieving near-optimal performance without
knowledge of the environment dynamics
LQR solution
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 36Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
4
5
C
H
A
P
T
E
R
1
7
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Optimized Human-Robot Interaction force control using IRL6
1 2 3 4 5 6 7 8
20-May-21 | 37Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
3 The unknown human dynamics as well as the optimal overall
human-robot system efficiency is verified in experimental trials
2
Find the optimal parameters of the impedance model to
assure motion tracking and also assist the human with
minimum effort to perform the task in real time
The stability of the closed-loop control is theoretically
demonstrated
1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 38Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Task-specific Outer-loopRobot-specific Inner-loop
( )1 20
ˆ ( )
ˆ (|| || )
= + + +
+ + minus
t
s h h h
q B h h
K e e e d
K Z Z K
The control torque in task space
1 2
1 2 1
2 1 2
ˆ ˆˆ ( ) ( )
ˆ ˆˆ ˆ ˆ || ||
ˆˆ ˆ ˆ( ) || ||
=
= minus minus = minus
T T
T T T
h h h
T
h h
q W W q
W F F W q kF W
W Gq W kG W
2
h hm
d d d
K
M s C s K
=
+ +
An approximation of the continuous functionThe prescribed
impedance model
Prescribed
Impedance Model
Haptic interface
based-APAM
Torque
control lawHuman
Operator
Optimal Law-based
IRL
Kh+-
NN functional
approximation
1
s
θ +
-
τ θ d θ m eh
τ h
ξ
ed
Control design
1 2 3 4 5 6 7 8
20-May-21 | 39Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Task-specific Outer-loopRobot-specific Inner-loop
1
1
[ ]
minus
minus
= +
= minus
=
=
h h h h d
h e
h e p d
T T T
d d d
A B e
A k
B k
e e e
( )1
+=
+
p d
e
sG s
k s
The human transfer function [99] Purpose τh rarr ed
0
0
= + = +
= minus
q d qd e
e
h h hh e
A e Be X AX Buu
B A u KX
A new augmented state
Prescribed
Impedance Model
Haptic interface
based-APAM
Torque
control lawHuman
Operator
Optimal Law-based
IRL
Kh+-
NN functional
approximation
1
s
θ +
-
τ θ d θ m eh
τ h
ξ
ed
IRL based on LQR solution
Control design
1 2 3 4 5 6 7 8
20-May-21 | 40Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
The performance index
( )T T T
m d d d h h h e et
J e Q e Q u Ru d
= + +
10 T TA P PA Q PB R BPminus= + + minus
The algebraic Riccati equation [83]
0
0
q d qd
e
h h hh
e
A e BeX u
B A
u KX
= = +
= minus
A new augmented state
IRL solutionRL solution
An optimal feeback control
To minimize
solve
The infinite-horizon quadratic cost
Control design
The IRL Bellman equation for time interval ΔT gt 0
Proof of convergence See in Appendix 62
1 2 3 4 5 6 7 8
( ) 1
( )
arg min ( ) ( )e
T
e eu t
u V t X t u R B PX
minus
= = minus
( )7 7( ( )) ( ) ( ) ( ( ))t T
T T T
tV X t X Q K RK X d V X t T
+
= + minus +
( )7 ( ( )) ( ) ( )T T T T
tV X t X Q K RK X d X PX
= + =
( ) ( ) ( )T TA BK P P A BK Q K RKminus + minus = minus +
where P is the real symmetric positive definite solution
( )( ) ( ) = ( ) ( )
( ) ( )
t tT T T T
t
T
X t PX t X Q K RK X d
X t t PX t t
+
+
+ + +
The IRL Bellman equation can be written as
20-May-21 | 41Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 61 Continuous-time linear policy iteration algorithm
1 Start with an admissible control input
0 0
1= minuseu K X
2 Policy evaluation given a control policy ui find the
Pi using off-policy Bellman equation
1 1
1
+ minus= minusi T i
eu R B P X
3 Policy Improvement update the control policy
Online implementation
Control design
1 2 3 4 5 6 7 8
( )( )( ) ( )= ( ) ( )
( ) ( )
t t TT i T i i T
t
T i
X t P X t X Q K RK X d
X t t P X t t
+
+
+ + +
Begin
1 2
1 2
1
( ) ( )
( ) ( ) ( )
( )
N i i i
Ti i N i
i T
t t t
K K K
p
minus
= = minus +
=
=
Solving for the cost using least squares
1|| ||i ip p minusminus
0
1 10 1 ( ) 0P i K A BK= = minus
Initialization
1 1i T i
eu R B P Xminus minus= minus
Policy update
End
No
Yes
1i irarr +
20-May-21 | 42Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 6.2 The tracking error between the trajectory of the robot system and the prescribed impedance model
Fig 6.3 Reference trajectory versus the state of the robot system during and after learning
Parameters
LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
Position control: Ks = 20, F = 10, G = 1, μ1 = μ2 = 5, k = 0.1, Z = 100, κp = 20, κd = 10, and ke = 0.18 s
Prescribed impedance model: θd = 10sin(0.2πt), Md = 1 kg, Bd = 718 Ns·mm−1, Kd = 1576 N·mm−1, Kh = 0.046
Validation results - Simulation
1 2 3 4 5 6 7 8
20-May-21 | 43Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 6.4 The output trajectory and impedance model in the inner loop
Fig 6.5 The desired trajectory and output trajectory in the outer loop
Fig 6.6 Interaction force
Excellent performance of the human-robot cooperation
A small deviation between the desired and actual trajectories
The human interaction force is reduced after learning, and the optimal parameter set of the prescribed impedance model is found by the outer-loop controller
Experiment
Validation results
Discussion
1 2 3 4 5 6 7 8
20-May-21 | 44Presenter Cong Phat Vo
CONTENTS
1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 45Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
1 For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed
2 The stability and finite-time convergence characteristics of the closed-loop control are theoretically analyzed by the Lyapunov approach
3 The effectiveness of the proposed control method is verified in different working conditions
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 46Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
The external torque estimation \hat{\tau}_{id} is obtained from the proposed observer, whose adaptive FSOB gain \kappa_i is updated by a switching law on the FNTSM variable s_i (switching between growth and a sign-driven reset depending on s_i and the threshold \epsilon_i)
The modified FNTSM surface s_i is designed from the tracking error and its derivative, and the torque control design by FNTSMC style combines the model terms M_i, C_i, the switching term K_i sign(s_i)^{[\nu_i]}, the correction gains \rho_{1i}, \rho_{2i}, and the estimate \hat{\tau}_{id}
Block diagram (master and slave): the operator drives the master controller and the APAM master system, the slave controller drives the APAM slave system against the environment, each side carries its torque estimator (\hat{\tau}_{md}, \hat{\tau}_{sd}), and an impedance filter shapes the reflected signals between the two sides
Control design
1 2 3 4 5 6 7 8
20-May-21 | 47Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 7.1 Torque estimation performances
Parameters
Proposed control: λi = 8, pi = 5, qi = 7, ςi = 10−4, νi = 9/11, εi = 0.1, ρ1i = 10, ρ2i = 2
PID control: KP = 38, KI = 25, KD = 0.15
Observers: for the NDO, Lh = Le = 10; for the RTOB, β = 350 rad/s; for the AFOB, κi(0) = 0.1, λi = 10, ηi = 0, m = 0.1, a = 10
Scenarios: the free-space situation, θd(t) = 5π(1 − 0.18t)sin(0.72πt); the contact-and-recovery situation with a soft environment (Ke = 500 N·mm−1, Be = 4 Ns·mm−1) and a stiff environment (Ke = 2800 N·mm−1, Be = 20 Ns·mm−1)
The RMSEs associated with the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 48Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 7.2 Position tracking performances
Fig 7.3 Position tracking error performances
Fig 7.4 Transparency performance in "soft" contact
Fig 7.5 Transparency error in soft contact
Fig 7.6 Transparency performance in "stiff" contact
Fig 7.7 Transparency error in stiff contact
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 49Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Summary
Finite-time convergence of the tracking errors
Elimination of uncertainties and noise effects in the obtained force sensing via the new AFOB
Robust performance against uncertainties and disturbances in the system dynamics
Stability of the overall system and the transparency performance are attained simultaneously
Fast transient response
Great transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and in different working conditions
1 2 3 4 5 6 7 8
20-May-21 | 50Presenter Cong Phat Vo
CONTENTS
1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 51Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Conclusions
Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions (uncertainties and disturbances are handled)
Validated the significant effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB)
Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance
Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding highly effective position control
Achieved good transparency performance with both force feedback and position tracking simultaneously by the new adaptive force estimation combined with separate FNTSMC schemes
1 2 3 4 5 6 7 8
20-May-21 | 52Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Future works
To improve the control performance in joint space, the robot system model can be refined by robust identification approaches
The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains
Apply optimized strategies on both the master and slave sides; note also that the computational complexity should be reduced by limiting the data size
Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device in the rehabilitation field with a safer solution
1 2 3 4 5 6 7 8
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
1 2 3 4 5 6 7 8
20-May-21 | 14Presenter Cong Phat Vo
2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION
Platform setup
Fig 2.5 The APAM-based (a) master and (b) slave subsystem configuration
(a) Master (front and back views): proportional pressure regulator valves, optical encoder, control stick, loadcells, pneumatic muscle actuators, interactive loadcell, and pulley
(b) Slave: proportional valve, loadcell, spring-damper environment, APAM system (PAM and pulley), PC/PCI cards, and control box, exchanging position, control, stiffness/damping, and contact-force signals
1 2 3 4 5 6 7 8
20-May-21 | 15Presenter Cong Phat Vo
2 MODELING OF THE BILATERAL TELEOPERATION SYSTEM
Fig 2.6 Photograph of the experimental apparatus of the overall BHTS: master robot with control stick, slave robot, and environment
Table 2.3 Specifications of the experimental devices
PAM: Festo MAS-10-200N-AA-MC-O (nominal length 200 mm, inside diameter 10 mm, max pressure 8 bar)
PPR valve: Festo VPPM-8L-L-1-G14-0L8H (max pressure 8 bar)
DAQ cards: ADVANTECH PCI-1711 (AI/AO resolution 12 bits) and MEASUREMENT PCI-QUAD04
Linear encoder: type Rational WTB5-0500MM (resolution 0.005 mm)
Platform setup
1 2 3 4 5 6 7 8
20-May-21 | 16Presenter Cong Phat Vo
CONTENTS
1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme for the position control
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 17Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Proposed control idea
1 For the first time, the AITSMC-TDE scheme is proposed for a PAM system
2 The stability of the controlled system is theoretically analyzed
3 The control effectiveness is verified by experimental trials in different challenging working conditions (comparing several controllers)
1 2 3 4 5 6 7 8
20-May-21 | 18Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
A terminal sliding surface is chosen: s_2 = \dot{e}_1 + k_1 e_1 + k_2 e_1^{[\gamma]}
A proposed integral sliding surface is designed: s_0(t) = s_2(t) - s_2(t_0) - \int_{t_0}^{t} [k_1 \dot{e}_1 + k_2 \frac{d}{d\tau} e_1^{[\gamma]} + f(x_1, x_2) + g(x_1) u_n - \ddot{x}_{1d}] d\tau
A robust control signal can be redesigned: u_s(t) = -g(x_1)^{-1} (\hat{\eta}_3 sign(s_0) + \eta_4 s_0)
An adaptive law for the robust gain \hat{\eta}_3 is driven by |s_0|, with a switching term that keeps the gain bounded
Proof: see more detail in Appendices 3.2 and 3.3
Block diagram: the reference trajectory x_{1d}, x_{2d} and the PAM states x_1, x_2 pass through a coordinate transfer to the integral terminal sliding surface s; the control law u combines the nominal control signal u_n, the robust control signal u_s, and the adaptation law \hat{\eta}_3, while the time-delay estimator supplies the lumped disturbance estimate to close the loop around the PAM system
Control design
1 2 3 4 5 6 7 8
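The fractional-power terms written e^{[γ]} in these surfaces are the odd signum-power function sign(e)|e|^γ; a minimal sketch of the terminal surface, with placeholder gains rather than the tuned values:

```python
import numpy as np

def sig(x, gamma):
    # Odd fractional power: sig(x)^[gamma] = sign(x) * |x|^gamma.
    # Sign-preserving and continuous through zero, as used in TSM surfaces.
    return np.sign(x) * np.abs(x) ** gamma

def terminal_surface(e, e_dot, k1=2.0, k2=5.5, gamma=5 / 7):
    # s2 = e_dot + k1*e + k2*sig(e)^[gamma]; k1, k2, gamma are placeholders.
    return e_dot + k1 * e + k2 * sig(e, gamma)
```

The surface vanishes only at the origin of the error space, which is what gives the finite-time convergence argument its traction.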
20-May-21 | 19Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
The control law can be rewritten: u(t) = u_n + u_s - g(x_1)^{-1} \hat{d}
where the time-delay estimate of the lumped disturbance is \hat{d}(t) = \dot{x}_2(t - t_d) - f(x_1(t - t_d), x_2(t - t_d)) - g(x_1(t - t_d)) u(t - t_d)
The auxiliary variables are defined: e_1 = x_1 - x_{1d}, e_2 = \dot{e}_1 = x_2 - \dot{x}_{1d}
The nominal control signal can be designed as: u_n = g(x_1)^{-1} [\ddot{x}_{1d} - f(x_1, x_2) - k_1 e_2 - k_2 \frac{d}{dt} e_1^{[\gamma]}]
Proof: see more detail in Appendices 3.1 and 3.4-3.5
Control design: the same block diagram as on the previous slide (reference trajectory, coordinate transfer, integral terminal sliding surface, nominal and robust control signals with the adaptation law, time-delay estimator, and the PAM system)
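The TDE term estimates the lumped disturbance as whatever the model failed to explain one sample earlier; a minimal sketch (the plant terms f, g and the constant disturbance are hypothetical):

```python
def tde(x2_dot_prev, f_prev, g_prev, u_prev):
    # d_hat(t) = x2_dot(t - td) - f(x(t - td)) - g(x(t - td)) * u(t - td):
    # the residual of the modeled dynamics one delay step ago.
    return x2_dot_prev - f_prev - g_prev * u_prev

# Toy check: with x2_dot = f + g*u + d and a constant disturbance d,
# the delayed residual recovers d exactly.
d_true = 0.7
x2, u, g = 0.0, 1.0, 1.5
f = lambda x: -2.0 * x          # hypothetical plant term
x2_dot = f(x2) + g * u + d_true
d_hat = tde(x2_dot, f(x2), g, u)
```

In practice the estimate lags by one sample, so fast-varying disturbances are only approximately cancelled; the robust term u_s covers that residual.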
1 2 3 4 5 6 7 8
20-May-21 | 20Presenter Cong Phat Vo
Parameters
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Experimental validation
Proposed control: for the ITSMC, η1 = 3, η2 = 15, η3 = 25, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1 = γ2 = 5/7, k1 = 2, k2 = 55, α = 9/11; for the adaptive gain, μ = 10, δ = 0.05, σ = 0.1
PID control: for case study 3, KP = 15, KI = 18, KD = 0.05; for the other cases, KP = 35, KI = 28, KD = 0.1
Fig 3.1 Performance response in case study 1
Fig 3.2 Performance response in case study 2
Case study 1 (no load): xd (mm) = 8sin(0.4πt)
Case study 2 (no load): xd (mm) = 8sin(πt)
Case study 3 (no load): a multistep trajectory
Case study 4 (with load): xd (mm) = 8sin(πt), in turn with a 2-kg load and a 7-kg load
1 2 3 4 5 6 7 8
20-May-21 | 21Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Experimental validation
Fig 3.3 Performance response in case study 3
Fig 3.4 Performance response in case study 4
Case study 1 (no load): xd (mm) = 8sin(0.4πt)
Case study 2 (no load): xd (mm) = 8sin(πt)
Case study 3 (no load): a multistep trajectory
Case study 4 (with load): xd (mm) = 8sin(πt), in turn with a 2-kg load and a 7-kg load
Table 3.1 A quantitative comparison between the tracking performances
Controller: PID | ITSMC | Proposed
Sine 0.2 Hz - 8 mm: L2[e] 0.4102 | 0.1737 | 0.1574; L2[u] 0.9961 | 1.0270 | 0.9602
Sine 1 Hz - 8 mm: L2[e] 1.1845 | 0.4996 | 0.3788; L2[u] 1.1284 | 1.1152 | 1.0900
Multistep: L2[e] 0.8587 | 0.6846 | 0.6368; L2[u] 0.7670 | 0.7333 | 0.7247
Sine 1 Hz - 8 mm with 2-kg load: L2[e] 1.2578 | 0.8314 | 0.4779; L2[u] 1.1414 | 1.1506 | 1.1287
Sine 1 Hz - 8 mm with 7-kg load: L2[e] 1.5937 | 1.0827 | 0.5067; L2[u] 1.1613 | 1.1729 | 1.1584
1 2 3 4 5 6 7 8
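The L2[e] and L2[u] indices of Table 3.1 can be reproduced from the logged error and control signals; a sketch assuming L2[·] denotes the discrete L2 (RMS) value of the sampled signal:

```python
import numpy as np

def l2_index(signal):
    # Discrete L2 (RMS) value of a sampled signal; the metric for
    # Table 3.1 is assumed, not stated explicitly in the slides.
    x = np.asarray(signal, dtype=float)
    return float(np.sqrt(np.mean(x ** 2)))
```

Lower L2[e] means tighter tracking over the whole run, while comparable L2[u] values indicate the improvement is not bought with extra control effort.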
20-May-21 | 22Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Summary
Robust performance against uncertainties and disturbances
Block diagram: the proposed control approach (reference trajectory, coordinate transfer, integral terminal sliding surface, nominal and robust control signals with the adaptation law, time-delay estimator, and the PAM system)
Finite-time convergence of the tracking errors
Reducing the chattering phenomenon
Eliminating the reaching phase
Fast transient response
Future work will be directed toward control using the force properties
1 2 3 4 5 6 7 8
20-May-21 | 23Presenter Cong Phat Vo
CONTENTS
1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 24Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
1 The proposed FSOB scheme obtains force information with noise and static friction eliminated
2 The FITSMC guarantees finite-time convergence; a new adaptive gain and the TDE are deployed online to handle uncertainties and the chattering phenomenon
3 The reliability and superiority of the proposed control are demonstrated in different challenging working conditions
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 25Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
The FITSM surface is defined: \sigma(t) = e_f + \int_0^t [\eta_1 e_f + \eta_2 sign(e_f)^{[v]}] d\tau
An adaptive switching gain law updates \hat{\eta}_5 through a sign-driven switching rule on \sigma
An estimated lumped disturbance (force observer based on LeD [40]) is fed back through the force-based time-delay estimator, formed from the delayed control input u_c(t - t_d) and the delayed force dynamics
The control law can be rewritten: u_c(t) = \hat{d}(t) + K^{-1}[\eta_1 e_f + \eta_2 sign(e_f)^{[v]}] + \hat{\eta}_4 \sigma + \hat{\eta}_5 sign(\sigma)^{[v]}
Block diagram: the reference trajectory and the environment configuration feed the integral terminal sliding surface and the control law u, with the adaptive law, the force-based time-delay estimator \hat{d}, and the force observer (plant + environment) closing the loop around the APAM system
Proof: see Appendices 4.1-4.3
Control design
1 2 3 4 5 6 7 8
20-May-21 | 26Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Parameters
Proposed control: for the ITSMC, η1 = 3, η2 = 15, η3 = 25, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1 = γ2 = 5/7, k1 = 2, k2 = 55, v = 9/11; for the adaptive gain, μ = 10, δ = 0.05, σ = 0.1
PID control: for scenario 3, KP = 15, KI = 18, KD = 0.05; for the other scenarios, KP = 35, KI = 28, KD = 0.1
Observers: for the FSOB, ω = 500 rad/s, ξ = 0.001, α1 = 20, α2 = 12, α3 = 5, α4 = 15
Scenario 1: τd (N) = 75sin(0.2πt − π/2) + 625
Scenario 2: τd (N) = 75sin(0.8πt − π/2) + 625
Scenario 3: a multistep trajectory with an LPF bandwidth of 0.2 rad/s
Fig 4.1 Performance response in Scenario 1
Fig 4.2 Performance response in Scenario 2
Experimental validation
1 2 3 4 5 6 7 8
20-May-21 | 27Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Fig 4.3 Performance response in Scenario 3
Fig 4.4 Control gain
Table 4.1 An RMSE comparison between the tracking error performances
Controller: PID | ISMC | FITSMC-TDE | Proposed
Sine 0.1 Hz: 2.1050 | 1.2152 | 0.9440 | 0.5158
Sine 0.4 Hz: 3.2054 | 1.8045 | 2.0632 | 1.9807
Multistep: 2.6508 | 2.3548 | 2.0618 | 1.7685
Experimental validation
Discussion
- Improved control performances (fast response, high accuracy, and robustness)
- Guaranteed finite-time convergence
- Cancelled the lumped uncertainties online via the TDE solution and the switching gain in the FITSMC
1 2 3 4 5 6 7 8
20-May-21 | 28Presenter Cong Phat Vo
CONTENTS
1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 29Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Proposed control idea
1 The RL method is used to learn the optimal desired force online
2 The stability of the closed-loop control is theoretically demonstrated by the Lyapunov approach
3 Effectiveness for robot interaction control in unknown environments is verified
1 2 3 4 5 6 7 8
20-May-21 | 30Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Control design
The environment is modeled as a spring-damper system [81]: F_e(t) = -C_e \dot{x}(t) + K_e (x_e - x(t))
The optimal impedance model (task-specific outer loop): Z(e_r(t)) = F_d(t), with the error dynamics \dot{e}_r(t) = A e_r(t) + B F_d(t)
The force control: M \ddot{e}_r(t) + C \dot{e}_r(t) + K e_r(t) = -\rho_1 (\cdot) - \rho_2 sign(\cdot)^{[v]} + F_e(t)
The position control in task space (robot-specific inner loop): x_f(s) = K_{Pf} e_f(t) + K_{If} \int_0^t e_f d\tau
Block diagram: the RL-based approximation supplies F_d to the prescribed impedance model, whose output x_r (compared with the environment force F_e and the desired trajectory x_d) drives the force control and the precise position control of the APAM system
1 2 3 4 5 6 7 8
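The spring-damper environment model can be sketched directly; the default values below are the simulation settings quoted later in this chapter (Ce = 0.5 Ns/mm, Ke = 2 N/mm, xe = 35 mm):

```python
def environment_force(x, x_dot, x_e=35.0, Ce=0.5, Ke=2.0):
    # Spring-damper environment [81]: F_e = -Ce*x_dot + Ke*(x_e - x).
    # Units as in the slides: x in mm, Ce in Ns/mm, Ke in N/mm -> F_e in N.
    return -Ce * x_dot + Ke * (x_e - x)
```

At rest on the contact point (x = x_e, ẋ = 0) the force is zero; pushing past x_e produces a restoring force opposing the penetration.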
20-May-21 | 31Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Implementation procedure - LQR solution (offline solution)
The main objective of this work is to determine the optimal desired force F_d, the minimum force that minimizes the position error e_r: F_d(t) = \kappa_e e_r(t)
The discounted cost function: J_s(e_r(t), F_d(t)) = \int_t^\infty e^{-\gamma(\tau - t)} (e_r^T S e_r + F_d^T R F_d) d\tau
The algebraic Riccati equation: 0 = A^T P A + S - P + A^T P B \kappa_e
The feedback control gain: \kappa_e = -(B^T P B + R)^{-1} B^T P A
1 2 3 4 5 6 7 8
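The offline LQR step can be sketched as a fixed-point Riccati iteration producing the feedback gain κe = −(BᵀPB + R)⁻¹BᵀPA (a sketch with toy scalar values, not the identified environment model):

```python
import numpy as np

def riccati_gain(A, B, S, R, iters=1000):
    # Iterate the Riccati recursion P <- S + A^T P (A - B K) to a fixed
    # point, with K = (B^T P B + R)^-1 B^T P A; return kappa_e = -K and P.
    P = S.copy()
    for _ in range(iters):
        K = np.linalg.solve(B.T @ P @ B + R, B.T @ P @ A)
        P = S + A.T @ P @ (A - B @ K)
    return -K, P

# Hypothetical scalar model, not the dissertation's identified values.
A = np.array([[0.9]]); B = np.array([[0.1]])
S = np.array([[1.0]]); R = np.array([[1.0]])
kappa_e, P = riccati_gain(A, B, S, R)
```

The IRL version in the next slides replaces this model-based solve with online data, which is the point of the model-free formulation.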
20-May-21 | 32Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Implementation procedure - RL solution (online solution, using the off-policy RL to find the solution of the given new ARE)
The main objective is again the optimal desired force F_d(t) = \kappa_e e_r(t), the minimum force that minimizes the position error e_r; the LQR solution is impossible for all time
The discounted cost function: V_s(e_r(t)) = \int_t^\infty e^{-\gamma(\tau - t)} r(e_r(\tau), F_d(\tau)) d\tau, where r(e_r, F_d) = e_r^T S e_r + F_d^T R F_d
The Hamilton-Jacobi-Bellman equation: \min_{F_d} (\dot{V}_s(e_r(t)) - \gamma V_s(e_r(t)) + r(t)) = 0
The new algebraic Riccati equation: e_r^T(t)(A^T P + P A + S - \gamma P) e_r(t) + 2 e_r^T(t) P B F_d(t) + F_d^T(t) R F_d(t) = 0
The optimal desired force: F_d(t) = -H_{22}^{-1} H_{21} e_r(t), where Q(e_r(t), F_d(t)) = V_s(e_r(t)) = X^T H X (IRL approximation)
1 2 3 4 5 6 7 8
20-May-21 | 33Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Implementation procedure - IRL approximation (minimizing the temporal-difference error)
The approximator of the Q-value function: \hat{Q}(e_r, F_d) = \sum_{i=1}^{N} \theta_i \bar{\phi}_i(e_r, F_d)
with Gaussian bases \phi_i(e_r, F_d) = exp(-\frac{1}{2}[e_r - c_{e_i}; F_d - c_{F_i}]^T \Sigma_i^{-1} [e_r - c_{e_i}; F_d - c_{F_i}])
normalized as \bar{\phi}_i(e_r, F_d) = \phi_i(e_r, F_d) / \sum_{i=1}^{N} \phi_i(e_r, F_d)
The reward (utility) function: r(t) = e_r^T(t) S e_r(t) + F_d^T(t) R F_d(t)
Q-function update law based on IRL: Q(e_r(t), F_d(t)) = \int_t^{t+\Delta t} e^{-\gamma(\tau - t)} r(e_r(\tau), F_d(\tau)) d\tau + e^{-\gamma \Delta t} Q(e_r(t+\Delta t), F_d(t+\Delta t))
The parameter update law adjusts \theta(t) along the gradient of \hat{Q} to minimize the temporal-difference error of this update law
The optimal desired force: F_d(t) = \kappa_e e_r(t)
1 2 3 4 5 6 7 8
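The normalized Gaussian Q-approximator can be sketched as follows (the centers, width, and weights are hypothetical placeholders, and an isotropic Σ_i = σ²I is assumed):

```python
import numpy as np

def rbf_features(x, centers, sigma=1.0):
    # Gaussian bases phi_i = exp(-||x - c_i||^2 / (2 sigma^2)),
    # normalized so the feature vector sums to one.
    d2 = np.sum((centers - x) ** 2, axis=1)
    phi = np.exp(-0.5 * d2 / sigma ** 2)
    return phi / phi.sum()

def q_hat(theta, x, centers, sigma=1.0):
    # Q_hat(e_r, F_d) = sum_i theta_i * phi_bar_i(e_r, F_d)
    return float(theta @ rbf_features(x, centers, sigma))

# x stacks the error and the desired force: x = [e_r, F_d].
centers = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
theta = np.array([0.5, 1.0, -0.2])
q = q_hat(theta, np.array([0.2, 0.1]), centers)
```

Because the features are normalized, the estimate is a convex combination of the weights, which keeps the approximator well scaled during the TD updates.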
20-May-21 | 34Presenter Cong Phat Vo
Fig 5.1 RL in unknown environments during the learning process
Fig 5.2 Learning curves
Parameters
LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
Position/force control: KPf = 0.001, KIf = 0.02, λ = 0.2, ηΔ = 0.1, ρ1 = 0.5, ρ2 = 20, v = 9/11, α = 5/7
Environment: Ce = 0.5 Ns·mm−1, Ke = 2 N·mm−1, r0 = 165 mm, xd = 39.917 mm, xe = 35 mm
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 35Presenter Cong Phat Vo
Fig 5.3 Desired force/position tracking response
Three steps are required to complete the experiment:
1 the identification of the environment
2 obtaining the ARE solution using the environmental estimation
3 the complete experiment for the position/force controller
Unlike the LQR solution, the IRL approximation achieves near-optimal performance without knowledge of the environment dynamics
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 36Presenter Cong Phat Vo
CONTENTS
1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 37Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Proposed control idea
1 Find the optimal parameters of the impedance model to assure motion tracking and also assist the human with minimum effort to perform the task in real time
2 The stability of the closed-loop control is theoretically demonstrated
3 The unknown human dynamics as well as the optimal overall human-robot system efficiency are verified in experimental trials
1 2 3 4 5 6 7 8
20-May-21 | 38Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design
Robot-specific inner loop: the control torque in task space combines error feedback (gain K_s), an NN functional approximation \hat{W}^T \phi(q) of the continuous unknown dynamics (with update laws for \hat{W} driven by the tracking error and the regressors F, G), and robust bounding terms (gains K_q, K_B)
The prescribed impedance model: \theta_m / \tau_h = K_h / (M_d s^2 + C_d s + K_d)
Task-specific outer loop: the human transfer function [99] (purpose: \tau_h -> e_d) closes the loop from the interaction torque to the distance error through the gains \kappa_p, \kappa_d, and k_e
Block diagram: the human operator acts through the torque control law on the haptic interface based on APAM; the NN functional approximation compensates the unknown dynamics, the prescribed impedance model shapes \theta_d from \tau_h, and the optimal law based on IRL tunes the gain K_h
1 2 3 4 5 6 7 8
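The prescribed impedance model is a second-order filter from τh to θm; a minimal discretized sketch (semi-implicit Euler, with placeholder parameter values rather than the tuned Md, Cd, Kd, Kh):

```python
def impedance_step(theta, theta_dot, tau_h, dt,
                   Md=1.0, Cd=2.0, Kd=4.0, Kh=1.0):
    # One semi-implicit Euler step of Md*theta'' + Cd*theta' + Kd*theta = Kh*tau_h.
    theta_dd = (Kh * tau_h - Cd * theta_dot - Kd * theta) / Md
    theta_dot = theta_dot + dt * theta_dd
    theta = theta + dt * theta_dot
    return theta, theta_dot

# Under a constant tau_h the output settles at the static gain Kh/Kd * tau_h.
th, thd = 0.0, 0.0
for _ in range(100000):
    th, thd = impedance_step(th, thd, tau_h=2.0, dt=1e-3)
```

Tuning Md, Cd, Kd (here by the IRL outer loop) trades off how compliantly θm follows the operator torque against how quickly it settles.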
20-May-21 | 39Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design - IRL based on LQR solution
Task-specific outer loop: the human transfer function [99] gives the human error dynamics \dot{e}_h = A_h e_h + B_h e_d, with A_h and B_h formed from k_e^{-1}, \kappa_p, and \kappa_d
Combining them with the robot error dynamics \dot{e}_q = A_q e_q + B_e u_e yields a new augmented state X = [e_q^T, e_h^T]^T with \dot{X} = AX + B u_e and the optimal feedback u_e = -KX
Block diagram: human operator - torque control law - haptic interface based on APAM, with the NN functional approximation, the prescribed impedance model, and the optimal law based on IRL adjusting K_h
1 2 3 4 5 6 7 8
20-May-21 | 40Presenter Cong Phat Vo
20-May-21 | 15Presenter Cong Phat Vo
2 MODELING OF THE BILATERAL TELEOPERATION SYSTEM
Platform setup

Fig 2.6 Photograph of the experimental apparatus of the overall BHTS (master robot with control stick, slave robot, environment)

Table 2.3 Specifications of the experimental devices

Device          Description
PAM             Festo MAS-10-200N-AA-MC-O; nominal length 200 mm; inside diameter 10 mm; max pressure 8 bar
PPR valve       Festo VPPM-8L-L-1-G14-0L8H; max pressure 8 bar
DAQ card        Advantech PCI-1711 (AI/AO resolution: 12 bits); Measurement PCI-QUAD04
Linear encoder  Type Rational WTB5-0500MM; resolution 0.005 mm
CONTENTS

1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme for the position control
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Proposed control idea
1 For the first time, the AITSMC-TDE scheme is proposed for a PAM system
2 The stability of the controlled system is theoretically analyzed
3 The control effectiveness is verified by experimental trials under different challenging working conditions, in comparison with several controllers
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Control design

A terminal sliding surface is chosen:
    s = e2 + k1 e1 + k2 sig^α(e1)

A proposed integral sliding surface is designed:
    s0(t) = s(t) − s(t0) − ∫[t0,t] ( k1 e2 + k2 (d/dτ) sig^α(e1) + f0(x1, x2) + g0(x1) un − ẍ1d ) dτ

A robust control signal can be redesigned:
    us(t) = −g0^-1(x1) (η3 + κ̂3) sign(s0)

An adaptive law updates the robust gain κ̂3 online, driven by |s0| with the parameters μ, δ, and σ.

Proof: see Appendix 3.2 and 3.3 for details.

[Block diagram of the proposed control approach: the reference trajectory (x1d, x2d) passes through a coordinate transfer into the terminal and integral terminal sliding surfaces, which feed the nominal control signal un, the robust control signal us, and the adaptation law; a time-delay estimator provides d̂; the combined control law u drives the PAM system with states (x1, x2).]
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Control design

The auxiliary variables are defined:
    e1 = x1 − x1d,   e2 = ė1 = x2 − ẋ1d

The nominal control signal can be designed as:
    un = g0^-1(x1) ( ẍ1d − f0(x1, x2) − k1 e2 − k2 (d/dt) sig^α(e1) )

The lumped uncertainty is estimated by the time-delay estimation (TDE) with delay td:
    d̂(t) = ẋ2(t − td) − f0(x1(t − td), x2(t − td)) − g0(x1(t − td)) u(t − td)

The control law can be rewritten as:
    u(t) = un + us − g0^-1(x1) d̂

Proof: see Appendix 3.1 and 3.4–3.5 for details.

[Block diagram of the proposed control approach (reference trajectory, sliding surfaces, nominal/robust control signals, adaptation law, time-delay estimator, PAM system).]
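The TDE step above can be sketched numerically: with a one-sample delay, a finite difference of the measured state recovers the lumped disturbance almost exactly. The toy dynamics f0, g0 and the disturbance d below are illustrative stand-ins, not the PAM model from the thesis.

```python
# Minimal numerical sketch of time-delay estimation (TDE), assuming a toy
# scalar system x2' = f0(x2) + g0*u + d with a one-sample delay td = dt.
import math

dt = 0.001
g0 = 2.0

def f0(x):          # assumed nominal dynamics
    return -0.5 * x

def d_true(t):      # unknown lumped disturbance to be estimated
    return 1.5 * math.sin(3.0 * t)

x2 = 0.0
prev = None         # (x2, f0(x2), u) stored from the previous sample
est_errors = []
for k in range(5000):
    t = k * dt
    u = math.sin(t)                      # arbitrary excitation input
    if prev is not None:
        x2_prev, f0_prev, u_prev = prev
        # TDE: d_hat(t) = x2'(t - td) - f0(t - td) - g0*u(t - td)
        d_hat = (x2 - x2_prev) / dt - f0_prev - g0 * u_prev
        est_errors.append(abs(d_hat - d_true(t - dt)))
    prev = (x2, f0(x2), u)
    x2 += dt * (f0(x2) + g0 * u + d_true(t))   # Euler step of the true plant
print(max(est_errors))
```

In this Euler-discretized toy the estimate matches the one-sample-delayed disturbance to numerical precision; on the real plant the TDE error is bounded by how much the disturbance changes over one delay period.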
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Experimental validation

Parameters
- Proposed control, for ITSMC: η1 = 3, η2 = 15, η3 = 25, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1 = γ2 = 5/7, k1 = 2, k2 = 55, α = 9/11; for the adaptive gain: μ = 10, δ = 0.05, σ = 0.1
- PID control, for case study 3: KP = 15, KI = 18, KD = 0.05; for the other cases: KP = 35, KI = 28, KD = 0.1

Case studies
- Case study 1 (no load): xd (mm) = 8 sin(0.4πt)
- Case study 2 (no load): xd (mm) = 8 sin(πt)
- Case study 3 (no load): a multistep trajectory
- Case study 4 (with load): xd (mm) = 8 sin(πt), with a 2-kg load and a 7-kg load in turn

Fig 3.1 Performance response in case study 1
Fig 3.2 Performance response in case study 2
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Experimental validation

Fig 3.3 Performance response in case study 3
Fig 3.4 Performance response in case study 4

Table 3.1 A quantitative comparison of the tracking performances

Case                          Metric   PID      ITSMC    Proposed
Sine 0.2 Hz – 8 mm            L2[e]    0.4102   0.1737   0.1574
                              L2[u]    0.9961   1.0270   0.9602
Sine 1 Hz – 8 mm              L2[e]    1.1845   0.4996   0.3788
                              L2[u]    1.1284   1.1152   1.0900
Multistep                     L2[e]    0.8587   0.6846   0.6368
                              L2[u]    0.7670   0.7333   0.7247
Sine 1 Hz – 8 mm, 2-kg load   L2[e]    1.2578   0.8314   0.4779
                              L2[u]    1.1414   1.1506   1.1287
Sine 1 Hz – 8 mm, 7-kg load   L2[e]    1.5937   1.0827   0.5067
                              L2[u]    1.1613   1.1729   1.1584
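The L2[e] and L2[u] scores in Table 3.1 can be reproduced with a simple RMS-style metric; the exact normalization used in the thesis is not stated here, so the definition below (root mean square of the sampled signal) is an assumption.

```python
# RMS-style metric for comparing sampled tracking-error signals.
import math

def l2_norm(signal):
    """Root mean square of a sampled signal."""
    return math.sqrt(sum(v * v for v in signal) / len(signal))

# Toy check: a faster-decaying tracking error scores lower.
e_pid      = [8.0 * math.exp(-0.5 * k / 100.0) for k in range(1000)]
e_proposed = [8.0 * math.exp(-2.0 * k / 100.0) for k in range(1000)]
print(l2_norm(e_pid), l2_norm(e_proposed))
```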
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Summary
- Robust performance against uncertainties and disturbances
- Finite-time convergence of the tracking errors
- Reduced chattering phenomenon
- Elimination of the reaching phase
- Fast transient response

The next works are directed to control using the force properties.

[Block diagram of the proposed control approach (reference trajectory, sliding surfaces, nominal/robust control signals, adaptation law, time-delay estimator, PAM system).]
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Proposed control idea
1 The proposed FSOB scheme obtains force information with noise and static friction eliminated
2 The FITSMC guarantees finite-time convergence; a new adaptive gain and the TDE are deployed to handle uncertainties and the chattering phenomenon online
3 The reliability and superiority of the proposed control are verified in different challenging working conditions
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Control design

The FITSM surface is defined:
    σ(t) = ef + ∫[0,t] ( ρ1 ef + ρ2 sig^v(ef) ) dτ

The control law can be rewritten as:
    uc(t) = d̂(t) + K ef + ρ1 ef + ρ2 sig^v(ef) + κ̂ sign(σ)

An adaptive switching gain law updates κ̂ online from the sliding variable σ, so that no conservative upper bound on the disturbance is needed.

The lumped disturbance is estimated by the force-based TDE from the one-sample-delayed control input and estimated contact torque:
    d̂(t) = d̂(t − td) − uc(t − td) − K τ̂e(t − td)

The force observer based on the LeD [40] reconstructs the contact torque through second-order estimation dynamics of the form
    τ̂e'' + 2ξω τ̂e' + ω² τ̂e = ω² τe

Proof: see Appendix 4.1–4.3.

[Block diagram: reference trajectory → integral terminal sliding surface → control law with adaptive law and force-based time-delay estimator; the force observer closes the loop around the APAM system and the environment configuration.]
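A second-order observer filter of the form above can be sketched as follows; ω = 500 rad/s follows the slide, while the damping ratio 0.7 is an assumed value for illustration, and the real observer also compensates friction, which is not modeled here.

```python
# Second-order observer/filter dynamics of the assumed form
#   tau_hat'' + 2*zeta*w*tau_hat' + w^2*tau_hat = w^2*tau,
# showing the estimate converging to a constant contact torque.
w, zeta = 500.0, 0.7
tau_true = 1.0

dt = 1e-5
tau_hat, dtau_hat = 0.0, 0.0
for _ in range(5000):  # simulate 0.05 s, well past the settling time
    ddtau = w * w * (tau_true - tau_hat) - 2.0 * zeta * w * dtau_hat
    dtau_hat += dt * ddtau
    tau_hat += dt * dtau_hat
print(tau_hat)  # converges toward tau_true
```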
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Experimental validation

Parameters
- Proposed control, for ITSMC: η1 = 3, η2 = 15, η3 = 25, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1 = γ2 = 5/7, k1 = 2, k2 = 55, v = 9/11; for the adaptive gain: μ = 10, δ = 0.05, σ = 0.1
- Observers, for FSOB: ω = 500 rad/s, ξ = 0.001, α1 = 20, α2 = 12, α3 = 5, α4 = 15
- PID control, for scenario 3: KP = 15, KI = 18, KD = 0.05; for the other scenarios: KP = 35, KI = 28, KD = 0.1

Scenarios
- Scenario 1: τd (N) = 7.5 sin(0.2πt − π/2) + 6.25
- Scenario 2: τd (N) = 7.5 sin(0.8πt − π/2) + 6.25
- Scenario 3: a multistep trajectory with an LPF bandwidth of 0.2 rad/s

Fig 4.1 Performance response in Scenario 1
Fig 4.2 Performance response in Scenario 2
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Experimental validation

Fig 4.3 Performance response in Scenario 3
Fig 4.4 Control gain

Table 4.1 An RMSE comparison of the tracking error performances

Controller    PID      ISMC     FITSMC-TDE   Proposed
Sine 0.1 Hz   2.1050   1.2152   0.9440       0.5158
Sine 0.4 Hz   3.2054   1.8045   2.0632       1.9807
Multistep     2.6508   2.3548   2.0618       1.7685

Discussion
- Improved control performance (fast response, high accuracy, and robustness)
- Guaranteed finite-time convergence
- Cancelled the lumped uncertainties online via the TDE solution and the switching gain in the FITSMC
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Proposed control idea
1 The RL method is used to learn the optimal desired force online
2 The stability of the closed-loop control is theoretically demonstrated by the Lyapunov approach
3 The effectiveness is verified for robot interaction control in unknown environments
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Control design (robot-specific inner loop and task-specific outer loop)

The environment is modeled as a spring-damper system [81]:
    Fe(t) = −Ce ẋ(t) − Ke (x(t) − xe)

The optimal impedance model maps the position error er to the desired force:
    Z(er(t)) = Fd(t),   with error dynamics ėr(t) = A er(t) + B Fd(t)

The force control enforces the prescribed impedance dynamics M ër + C ėr + K er together with robust switching terms and the contact force Fe.

The position control in task space:
    xf(s) = KPf ef(t) + KIf ∫[0,t] ef(τ) dτ

[Block diagram: the inner loop performs precise position control of the APAM system; the outer loop contains the prescribed impedance model, the RL-based approximation of Fd, and the force control, closed around the environment configuration.]
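The spring-damper environment above can be sketched directly; the zero-force clamp in free space and the sign convention are assumptions for illustration, with Ce, Ke, xe taken from the slide's parameter list (decimal points restored).

```python
# Spring-damper environment sketch: Fe = -Ce*xdot - Ke*(x - xe) during
# contact, zero in free space before the surface at xe.
Ce, Ke, xe = 0.5, 2.0, 35.0   # N*s/mm, N/mm, mm

def contact_force(x, xdot):
    """Contact force felt by the tool at position x (mm), velocity xdot (mm/s)."""
    if x <= xe:
        return 0.0            # no contact before the surface
    return -Ce * xdot - Ke * (x - xe)

print(contact_force(34.0, 1.0), contact_force(36.0, 0.0))
```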
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Implementation procedure — LQR solution (an offline solution)

The main objective of this work is to determine the optimal desired force Fd, i.e., the minimum force that minimizes the position error er:
    Fd(t) = κe er(t)

The discounted cost function:
    J(er(t), Fd(t)) = ∫[t,∞) e^(−γ(τ−t)) ( er^T S er + Fd^T R Fd ) dτ

The algebraic Riccati equation:
    A^T P A + S − P + A^T P B κe = 0

The feedback control gain:
    κe = −(B^T P B + R)^-1 B^T P A
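The Riccati equation and gain formula above can be checked numerically in the scalar case by fixed-point (value) iteration; the scalars A, B, S, R below are illustrative stand-ins, not the thesis values for the estimated environment.

```python
# Scalar sketch of the slide's Riccati equation, solved by value iteration:
#   k = -(B*P*B + R)^(-1) * B*P*A,   P <- A*P*A + S + A*P*B*k.
A, B, S, R = 0.9, 0.1, 1.0, 1.0

P = 0.0
k = 0.0
for _ in range(2000):
    k = -(B * P * B + R) ** (-1) * B * P * A   # feedback control gain
    P = A * P * A + S + A * P * B * k          # Riccati recursion

residual = A * P * A + S - P + A * P * B * k   # ARE from the slide
print(residual, A + B * k)                     # residual ~ 0, closed loop stable
```

At the fixed point the ARE residual vanishes and the closed-loop coefficient A + Bk has magnitude below one, confirming that the computed gain stabilizes the error dynamics.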
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Implementation procedure — RL solution

The LQR solution is impossible to apply for all time because the environment dynamics are unknown; the RL solution is an online solution that uses off-policy RL to find the solution of the new ARE below. The objective remains to determine the optimal desired force Fd that minimizes the position error er.

The discounted cost (value) function:
    Vs(er(t)) = ∫[t,∞) e^(−γ(τ−t)) r(er(τ), Fd(τ)) dτ,
    where r(er, Fd) = er^T S er + Fd^T R Fd

The Hamilton-Jacobi-Bellman equation:
    min over Fd of ( V̇s(er(t)) − γ Vs(er(t)) + r(t) ) = 0

The new algebraic Riccati equation:
    er^T(t) (A^T P + P A + S − γP) er(t) + 2 er^T(t) P B Fd(t) + Fd^T(t) R Fd(t) = 0

The Q-value function and the optimal desired force:
    Q(er(t), Fd(t)) = Vs(er(t)) = X^T H X,  with X = [er^T, Fd^T]^T
    Fd(t) = −H22^-1 H21 er(t)
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Implementation procedure — IRL approximation (minimizing the temporal-difference error)

The approximator of the Q-value function uses N normalized Gaussian basis functions:
    Q̂(er, Fd) = Σ[i=1..N] ωi φ̄i(er, Fd),
    φi(er, Fd) = exp( −(1/2) (er − ci)^T (er − ci) − (1/2) (Fd − cFi)^T (Fd − cFi) ),
    φ̄i = φi / Σ[j] φj

The reward (utility) function:
    r(t) = er^T(t) S er(t) + Fd^T(t) R Fd(t)

The Q-function update law based on IRL over an interval Δt:
    Q(er(t), Fd(t)) = ∫[t,t+Δt] e^(−γ(τ−t)) r(τ) dτ + e^(−γΔt) Q(er(t + Δt), Fd(t + Δt))

The parameter update law performs gradient descent of the weights ω on the temporal-difference error between the two sides of this equation, and the learned Q gives the optimal desired force Fd(t) = κe er(t).
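The interval (IRL) Bellman equation above can be verified numerically for a scalar closed-loop system with a known quadratic value function; the dynamics, reward weight, and discount below are illustrative assumptions.

```python
# Check of the interval Bellman identity for e(t) = e0*exp(a_cl*t),
# reward r = q_tot*e^2, and discounted value V(e) = P*e^2 with
# P = q_tot/(gamma - 2*a_cl):
#   V(t) = int_[t,t+D] exp(-gamma*(s-t))*r(s) ds + exp(-gamma*D)*V(t+D)
import math

a_cl, q_tot, gamma = -1.0, 2.0, 0.5
P = q_tot / (gamma - 2.0 * a_cl)   # closed-form value-function coefficient
e0, D = 1.0, 0.3

# integrate the discounted reward over [0, D] with a midpoint Riemann sum
n = 30000
h = D / n
integral = sum(
    math.exp(-gamma * (i + 0.5) * h)
    * q_tot * (e0 * math.exp(a_cl * (i + 0.5) * h)) ** 2 * h
    for i in range(n)
)
rhs = integral + math.exp(-gamma * D) * P * (e0 * math.exp(a_cl * D)) ** 2
lhs = P * e0 ** 2
print(abs(lhs - rhs))  # ~0: the identity holds
```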
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Validation results

Fig 5.1 RL in unknown environments during the learning process
Fig 5.2 Learning curves

Parameters
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position/force control: KPf = 0.001, KIf = 0.02, λ = 0.2, ηΔ = 0.1, ρ1 = 0.5, ρ2 = 20, v = 9/11, α = 5/7
- Environment: Ce = 0.5 Ns/mm, Ke = 2 N/mm, r0 = 16.5 mm, xd = 39.917 mm, xe = 35 mm
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Validation results

Fig 5.3 Desired force/position tracking response

Three steps are required to complete the experiment:
1 Identification of the environment
2 Obtaining the ARE solution using the environmental estimate (LQR solution)
3 The complete experiment for the position/force controller (IRL approximation)

The IRL approximation achieves near-optimal performance without knowledge of the environment dynamics.
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Proposed control idea
1 Find the optimal parameters of the impedance model to assure motion tracking and to assist the human with minimum effort to perform the task in real time
2 The stability of the closed-loop control is theoretically demonstrated
3 The unknown human dynamics as well as the optimal overall human-robot system efficiency are verified in experimental trials
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design (robot-specific inner loop and task-specific outer loop)

The control torque in task space combines an error-feedback term with gain Ks, the NN functional approximation of the unknown robot dynamics, and a robust bounding term with gain KB.

An approximation of the continuous function:
    f(q) = W^T φ(q) + ε,  with adaptive update laws for the weight estimates Ŵ (gains F, G, k)

The prescribed impedance model, from the human torque τh to the model trajectory θm:
    θm(s) / τh(s) = Kh / (Md s² + Cd s + Kd)

[Block diagram: human operator ↔ haptic interface based on APAM; torque control law with NN functional approximation in the inner loop; prescribed impedance model and IRL-based optimal law in the outer loop; signals τh, θd, θm, eh, ξ, ed.]
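A minimal sketch of the prescribed impedance model's response, simulated by Euler integration; the Md, Cd, Kd, Kh values reuse the parameter list of the simulation slides with decimal points restored (an assumption), and the input torque is a constant stand-in for τh.

```python
# Euler-integration sketch of the prescribed impedance model
#   theta_m(s)/tau_h(s) = Kh / (Md*s^2 + Cd*s + Kd).
Md, Cd, Kd, Kh = 1.0, 7.18, 15.76, 0.046
tau_h = 1.0

dt = 0.001
theta_m, dtheta_m = 0.0, 0.0
for _ in range(20000):  # 20 s, well past the settling time
    ddtheta_m = (Kh * tau_h - Cd * dtheta_m - Kd * theta_m) / Md
    dtheta_m += dt * ddtheta_m
    theta_m += dt * dtheta_m
print(theta_m)  # steady state ~ Kh/Kd * tau_h
```

The steady-state value Kh/Kd shows how the impedance parameters trade off compliance to the human torque against tracking stiffness, which is exactly what the IRL outer loop tunes.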
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design (robot-specific inner loop and task-specific outer loop)

The human transfer function [99] maps the interaction torque τh to the desired trajectory adjustment ed:
    G(s) = kp / (1 + ke s)

In state-space form, the human model and the prescribed-impedance error dynamics are stacked into a new augmented state X:
    Ẋ = A X + B u,   u = −K X

IRL based on the LQR solution is then applied to find the optimal gain K.

[Block diagram: human operator ↔ haptic interface based on APAM; torque control law and NN functional approximation in the inner loop; prescribed impedance model and IRL-based optimal law in the outer loop; signals τh, θd, θm, eh, ξ, ed.]
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design — IRL based on the LQR solution

The infinite-horizon quadratic cost (performance index) to minimize:
    J = ∫[t,∞) ( ed^T Qm ed + eh^T Qh eh + ue^T R ue ) dτ

The augmented state:
    Ẋ = A X + B ue,   ue = −K X

An optimal feedback control:
    ue = argmin over ue of V(t, X(t)) = −R^-1 B^T P X

where P solves the algebraic Riccati equation [83]:
    0 = A^T P + P A + Q − P B R^-1 B^T P

RL solution: the value of a policy K satisfies
    V(X(t)) = ∫[t,∞) X^T (Q + K^T R K) X dτ = X^T P X

and the IRL Bellman equation for a time interval ΔT > 0 is
    V(X(t)) = ∫[t,t+ΔT] X^T (Q + K^T R K) X dτ + V(X(t + ΔT))

With V(X) = X^T P X, the IRL Bellman equation can be written as
    X^T(t) P X(t) = ∫[t,t+ΔT] X^T (Q + K^T R K) X dτ + X^T(t + ΔT) P X(t + ΔT)

where P is the real symmetric positive-definite solution of
    (A − B K)^T P + P (A − B K) = −(Q + K^T R K)

Proof of convergence: see Appendix 6.2.
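For a scalar system, the policy-evaluation/policy-improvement loop above reduces to Kleinman's iteration, which can be sketched and checked against the closed-form LQR gain; the scalars a, b, q, r and the initial gain below are illustrative assumptions.

```python
# Scalar sketch of continuous-time policy iteration (Kleinman's algorithm)
# for xdot = a*x + b*u with cost integral of (q*x^2 + r*u^2).
a, b, q, r = 1.0, 1.0, 1.0, 1.0

k = 2.0  # admissible initial policy: a - b*k = -1 < 0 (stabilizing)
for _ in range(30):
    a_cl = a - b * k
    # Policy evaluation: scalar Lyapunov equation 2*a_cl*p = -(q + r*k^2)
    p = (q + r * k * k) / (-2.0 * a_cl)
    # Policy improvement: k = r^-1 * b * p
    k = b * p / r

print(k)  # converges to the LQR gain 1 + sqrt(2) ~ 2.4142
```

Each iteration keeps the closed loop stable and the gain converges quadratically, which is why the online least-squares version in Fig 6.1 needs only a few policy updates.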
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 6.1 Continuous-time linear policy iteration algorithm

Online implementation
1 Start with an admissible (stabilizing) control input u⁰ = −K¹X, initialized with P⁰ = 0, i = 1, and K¹ such that A − BK¹ is stable
2 Policy evaluation: given the control policy uⁱ, find Pⁱ from the off-policy IRL Bellman equation
    X^T(t) Pⁱ X(t) = ∫[t,t+ΔT] X^T (Q + (Kⁱ)^T R Kⁱ) X dτ + X^T(t + ΔT) Pⁱ X(t + ΔT),
  solving for the cost parameters pⁱ by least squares over measured trajectories
3 Policy improvement: update the control policy
    u^(i+1) = −R^-1 B^T Pⁱ X
Repeat (i → i + 1) until ‖pⁱ − pⁱ⁻¹‖ < ε.
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Validation results — Simulation

Fig 6.2 The tracking error between the trajectory of the robot system and the prescribed impedance model
Fig 6.3 Reference trajectory versus the state of the robot system during and after learning

Parameters
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: Ks = 20, F = 10, G = 1, μ1 = μ2 = 5, k = 0.1, Z̄ = 100, κp = 20, κd = 10, ke = 0.18 s
- Impedance model: θd = 10 sin(0.2πt), Md = 1 kg, Bd = 7.18 Ns/mm, Kd = 15.76 N/mm, Kh = 0.046
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Validation results — Experiment

Fig 6.4 The output trajectory and impedance model in the inner loop
Fig 6.5 The desired trajectory and the output trajectory in the outer loop
Fig 6.6 Interaction force

Discussion
- Near-perfect performance of the human-robot cooperation
- A small deviation between the desired and actual trajectories
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Proposed control idea
1 For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed
2 The stability and finite-time convergence characteristics of the closed-loop control are theoretically analyzed by the Lyapunov approach
3 The effectiveness of the proposed control method is verified in different working conditions
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Control design

The external torque on each side i ∈ {m, s} is estimated without force sensors by an adaptive force observer (AFOB): the estimate τ̂id is driven by a generalized-momentum residual between the measured motion and the motion predicted from the nominal dynamics and the applied torque.

An adaptive FSOB gain: the observer gain κi adapts through a switching law on the sliding variable si (with parameters ηi, m, a), avoiding a conservative fixed gain.

The modified FNTSM surface design:
    si = ėi + λi sig^(pi/qi)(ei)

The torque control design in the FNTSMC style:
    τi = Mi θ̈ir + Ci θ̇ir − ρ1i si − ρ2i sig^(νi)(si) − τ̂id

[Block diagram: operator → master controller and APAM master system with its torque estimator; communication channel; slave controller and APAM slave system with its torque estimator; impedance filter and environment; the estimated torques τ̂md and τ̂sd are reflected to the opposite side.]
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Validation results

Parameters
- Proposed control: λi = 8, pi = 5, qi = 7, ςi = 10⁻⁴, νi = 9/11, εi = 0.1, ρ1i = 10, ρ2i = 2
- PID control: KP = 38, KI = 25, KD = 0.15
- Observers: for NDO, Lh = Le = 10; for RTOB, β = 350 rad/s; for AFOB, κi(0) = 0.1, λi = 10, ηi = 0, m = 0.1, a = 10

Scenarios
- The free-space situation: θd(t) = 5π(1 − 0.18t) sin(0.72πt)
- The contact-and-recovery situation: soft environment Ke = 500 N/mm, Be = 4 Ns/mm; stiff environment Ke = 2800 N/mm, Be = 20 Ns/mm

Fig 7.1 Torque estimation performances
The RMSEs associated with the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Validation results

Fig 7.2 Position tracking performances
Fig 7.3 Position tracking error performances
Fig 7.4 Transparency performance in "soft" contact
Fig 7.5 Transparency error in "soft" contact
Fig 7.6 Transparency performance in "stiff" contact
Fig 7.7 Transparency error in "stiff" contact
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Summary
- Finite-time convergence of the tracking errors
- Elimination of uncertainties and noise effects in the obtained force sensing via the new AFOB
- Robust performance against uncertainties and disturbances in the system dynamics
- Stability of the overall system and transparency performance attained simultaneously
- Fast transient response

Great transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and across different working conditions.
8 CONCLUSION AND FUTURE WORKS
Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions; uncertainties and disturbances are handled
- Validated the significant effectiveness on finite-time force tracking problems based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB)
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding highly effective position control
- Achieved good transparency performance with both force feedback and position tracking simultaneously by the new adaptive force estimation combined with the separate FNTSMC schemes
8 CONCLUSION AND FUTURE WORKS
Future works
- To improve the control performance in joint space, the robot system model can be improved by robust identification approaches
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains
- Apply optimized strategies on both master and slave sides; the computational complexity should also be reduced with respect to the data size
- Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device for the rehabilitation field with a safer solution
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
20-May-21 | 16Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
4
5
6
C
H
A
P
T
E
R
1
7
8
Adaptive Gain ITSMC with TDE scheme for the position control3
Modeling and platform setup based on the APAM configuration2
1 2 3 4 5 6 7 8
20-May-21 | 17Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Proposed control idea
3The control effectiveness is verified by experimental trials in
different challenging work conditions (compare several controllers)
2
The stability of the controlled system is theoretically analyzed
The first time the AITSMC-TDE scheme
is proposed for a PAM system1
1 2 3 4 5 6 7 8
20-May-21 | 18Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
2 1 1 2 1s e k e k e= + +
( ) ( ) (( ) ( ) )
0
1
0 1 2 2 1 2
0 1 2 0 1 1
t
t
n d
s t s t k e k e e
f x x g x u x d
minus= minus minus +
+ + minus
( ) ( )( ) ( ) 31
0 1 3 3 4ˆ
su t g x sign
minus = minus + +
( )( ) ( )
( )3 3 0
3
ˆ ˆ1ˆ or
( )mt t
tsign t
= +=
+ =
A terminal sliding surface is chosen
A proposed integral sliding surface is designed
An adaptive law for the robust gain
A robust control signal can be redesigned
Proof See more detail in Appendix 32 33
1 2d dx x
Time-delay
Estimator
The PAM system
Control law
Robust
control signal
Adaptation
law
Nominal
control signal
Transfer
Coordinate Integral Terminal
Sliding Surface
Terminal
Sliding Surface
Reference
Trajectory
d
The proposed
control approach
u
1 2x x
su
3ˆ
1 2x x
s
1
2
nu
Control design
1 2 3 4 5 6 7 8
20-May-21 | 19Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
( ) ( ) ( )( )( )( ) ( )
2 0 1 2
0 1
ˆ
d d d
d d
d x t t f x t t x t t
g x t t u t t
= minus minus minus minus
minus minus minus
( ) ( )1
1ˆ
n su t u u g x dminus
= + minus
The control law can be rewritten
where
Proof See more detail in Appendix 31 34-35
Control design
1 1 1 1
2 2 1
de x x
x
= = minus = minus
( )( ) ( )1 2
1
0 1 1 1 0 1 2
1
1 1 1 1 1 2 2 2 2
n du g x x f x x
minus
minus
= minus minus
minus minus minus minus
The auxiliary variables are defined
The nominal control signal can be designed as
1 2d dx x
Time-delay
Estimator
The PAM system
Control law
Robust
control signal
Adaptation
law
Nominal
control signal
Transfer
Coordinate Integral Terminal
Sliding Surface
Terminal
Sliding Surface
Reference
Trajectory
d
The proposed
control approach
u
1 2x x
su
3ˆ
1 2x x
s
1
2
nu
1 2 3 4 5 6 7 8
20-May-21 | 20Presenter Cong Phat Vo
Parameters
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Experimental validation
for ITSMC η1 = 3 η2 = 15 η3 = 25 κ1 = κ3 = 10
κ2 = κ4 = 2 γ1 = γ2 = 57 k1 = 2 k2 = 55 α = 911
for adaptative gain μ = 10 δ = 005 σ = 01
for case study 3 KP = 15 KI = 18 KD = 005
for other case KP = 35 KI = 28 KD = 01
Proposed control
PID control
Fig 31 Performance response in case study 1
Fig 32 Performance response in case study 2
Case study 1 (no load) xd (mm) = 8sin(04πt)
Case study 2 (no load) xd (mm) = 8sin(πt)
Case study 3 (no load) A multistep trajectory
Case study 4 (with load) xd (mm) = 8sin(πt)
in turn 2-kg-load and 7-kg-load
1 2 3 4 5 6 7 8
20-May-21 | 21Presenter Cong Phat Vo
Fig 33 Performance response in case study 3
Fig 34 Performance response in case study 4
Controller PID ITSMC Proposed
Sine 02 Hz ndash 8 mmL2[e] 04102 01737 01574
L2[u] 09961 10270 09602
Sine 1 Hz ndash 8 mmL2[e] 11845 04996 03788
L2[u] 11284 11152 10900
MultistepL2[e] 08587 06846 06368
L2[u] 07670 07333 07247
Sine 1 Hz ndash 8 mm
with 2-kg-load
L2[e] 12578 08314 04779
L2[u] 11414 11506 11287
Sine 1 Hz ndash 8 mm
with 7-kg-load
L2[e] 15937 10827 05067
L2[u] 11613 11729 11584
Table 31 A quantitative comparison between the tracking performances
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Case study 1 (no load) xd (mm) = 8sin(04πt)
Case study 2 (no load) xd (mm) = 8sin(πt)
Case study 3 (no load) A multistep trajectory
Case study 4 (with load) xd (mm) = 8sin(πt)
in turn 2-kg-load and 7-kg-load
Experimental validation
1 2 3 4 5 6 7 8
20-May-21 | 22Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Summary
Robust performance against uncertainties and disturbances1 2d dx x
Time-delay
Estimator
The PAM system
Control law
Robust
control signal
Adaptation
law
Nominal
control signal
Transfer
Coordinate Integral Terminal
Sliding Surface
Terminal
Sliding Surface
Reference
Trajectory
d
The proposed
control approach
u
1 2x x
su
3ˆ
1 2x x
s
1
2
nu
Finite-time convergence of the tracking errors
Reducing the chattering phenomenon
Eliminating the reaching phase
Fast transient response
Next works will be directed to the control using the force properties
1 2 3 4 5 6 7 8
20-May-21 | 23Presenter Cong Phat Vo
CONTENTS
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
5
6
C
H
A
P
T
E
R
1
7
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Adaptive Finite-Time Force Sensorless Control scheme4
1 2 3 4 5 6 7 8
20-May-21 | 24Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
3 Verified the reliability and superiority of the proposed control
demonstrated in different challenging work conditions
2
The FITSMC guarantees the finite convergence new adaptive gain and TDE
are deployed to handle online (uncertainties and chattering phenomenon)
The proposed FSOB scheme obtains
force information eliminated noise and
static friction
1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 25Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
( )[ ]
1 20
( ) ( ( )) = + +t
v
f f fe e sign e d
( ) ( )[ 1]5
5
4
ˆ
+
minus= minus
v
signsign
1 1 [ ]
1 2
[ ]
4 5
ˆ( ) ( ) ( )
ˆ ( )
v
c d f f
v
u t d t K e sign e
sign
minus minus= + + +
+ +
1ˆ ˆ ˆ( ) ( ) ( ) ( )minus minus = minus minus minusd d c e dd t d t t u t t K t t
The FITSM surface is defined
The control law can be rewritten
2
2
ˆ ˆ ˆ2ˆ
+ += e e e e e e
e
e
An adaptive switching gain law An estimated lumped disturbance Force observer based-LeD [40]
APAM system
Force based
Time-delay
Estimator
Integral Terminal
Sliding SurfaceControl law
Adaptive law
Environment
Configuration
Reference
Trajectoryu
d
ˆe
d
+
+
The proposed control
approach
2k
Plant + Environment
Force Observer
e
Proof See in Appendix 41 ndash 43
Control design
1 2 3 4 5 6 7 8
20-May-21 | 26Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Parameters
Proposed control: for the ITSMC, η1 = 3, η2 = 15, η3 = 25, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1 = γ2 = 5/7, k1 = 2, k2 = 55, v = 9/11; for the adaptive gain, μ = 10, δ = 0.05, σ = 0.1
PID control: KP = 15, KI = 18, KD = 0.05 for case study 3; KP = 35, KI = 28, KD = 0.1 for the other cases
Observers: for the FSOB, ω = 500 rad/s, ξ = 0.001, α1 = 20, α2 = 12, α3 = 5, α4 = 15
Scenario 1: τd (N) = 7.5 sin(0.2πt - π/2) + 6.25
Scenario 2: τd (N) = 7.5 sin(0.8πt - π/2) + 6.25
Scenario 3: a multistep trajectory shaped by an LPF with a bandwidth of 0.2 rad/s
Fig 4.1 Performance response in Scenario 1
Fig 4.2 Performance response in Scenario 2
Experimental validation
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Fig 4.3 Performance response in Scenario 3
Fig 4.4 Control gain
| Reference   | PID    | ISMC   | FITSMC-TDE | Proposed |
| Sine 0.1 Hz | 2.1050 | 1.2152 | 0.9440     | 0.5158   |
| Sine 0.4 Hz | 3.2054 | 1.8045 | 2.0632     | 1.9807   |
| Multistep   | 2.6508 | 2.3548 | 2.0618     | 1.7685   |
Table 4.1 An RMSE comparison of the force-tracking error performances
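An illustrative sketch of how such RMSE entries are computed (the trajectories below are synthetic stand-ins, not the measured thesis data):

```python
import numpy as np

# RMSE between a reference force profile and a measured response.
# The Scenario-1-style profile is reused; the "measured" signal is synthetic.
def rmse(reference, measured):
    e = np.asarray(reference) - np.asarray(measured)
    return float(np.sqrt(np.mean(e ** 2)))

t = np.linspace(0.0, 10.0, 1001)
ref = 7.5 * np.sin(0.2 * np.pi * t - np.pi / 2) + 6.25
meas = ref + 0.05 * np.sin(3.0 * t)      # small synthetic tracking error
print(rmse(ref, meas))
```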
Experimental validation
Discussion
- Improved control performance: fast response, high accuracy, and robustness
- Guaranteed finite-time convergence
- Online cancellation of the lumped uncertainties via the TDE solution and the switching gain in the FITSMC
CONTENTS
1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
→ 5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Proposed control idea
1. The RL method is used to learn the optimal desired force online.
2. The stability of the closed-loop control is theoretically demonstrated by the Lyapunov approach.
3. The scheme is verified to be effective for robot interaction control in unknown environments.
The force control: the impedance error dynamics are driven by the measured contact force,
  M ë_r(t) + C ė_r(t) + K e_r(t) = -ρ1 σ - ρ2 sign(σ)^[v] + F_e(t)
with the environment force F_e(t) = -C_e ẋ(t) + K_e ( x_e - x(t) ) and the impedance mapping Z(e_r(t)) = F_d(t).
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The environment is modeled as a spring-damper system [81]. The position control in task space adds a PI correction on the force error,
  x_f(s) = K_Pf e_f(t) + K_If ∫_0^t e_f(τ) dτ
inside the robot-specific inner loop, while the task-specific outer loop adapts the desired force through an RL-based approximation.
[Block diagram: prescribed impedance model → force control → precise position control → APAM system + environment; the RL-based approximation supplies F_d, and F_e, e_f, x_r close the loops.]
Control design
The error dynamics are ė_r(t) = A e_r(t) + B F_d(t).
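A minimal sketch of the spring-damper contact model, reusing the parameter values quoted later (Ce = 0.5 N·s/mm, Ke = 2 N/mm, xe = 35 mm); the probe states are made up:

```python
import numpy as np

# The spring-damper environment F_e = -Ce*xdot + Ke*(xe - x), active only in
# contact (x beyond xe).  Ce, Ke, xe mirror the parameter slide; the probe
# states below are made up.
Ce, Ke, xe = 0.5, 2.0, 35.0   # N*s/mm, N/mm, mm

def contact_force(x, xdot):
    """Environment reaction force; zero in free space (x <= xe)."""
    if x <= xe:
        return 0.0
    return -Ce * xdot + Ke * (xe - x)   # restoring force, negative for x > xe

print(contact_force(36.0, 0.0))   # 1 mm quasi-static penetration -> -2 N
```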
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
LQR solution
The main objective of this work is to determine the optimal desired force F_d, the minimum force that minimizes the position error e_r, in the linear feedback form F_d(t) = κ_e e_r(t).
The discounted cost function:
  J( e_r(t), F_d(t) ) = ∫_t^∞ e^{-γ(τ-t)} ( e_rᵀ S e_r + F_dᵀ R F_d ) dτ
The algebraic Riccati equation (discrete form):
  AᵀPA + S - P + AᵀPB κ_e = 0
The feedback control gain:
  κ_e = -( BᵀPB + R )⁻¹ BᵀPA
Implementation procedure
offline solution
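A sketch of this offline solution: iterate the discrete-form Riccati recursion until the gain κ_e = -(BᵀPB + R)⁻¹BᵀPA stabilizes. The scalars reuse the slide's identified values with decimal points restored; treating them directly as a discrete-time pair is purely illustrative.

```python
import numpy as np

# Offline LQR sketch in the slide's discrete form: iterate
#   P <- S + A'PA - A'PB (B'PB + R)^-1 B'PA
# to a fixed point, then kappa_e = -(B'PB + R)^-1 B'PA.
A = np.array([[-4.006]])
B = np.array([[-0.002]])
S = np.array([[1.0]])
R = np.array([[10.0]])

P = S.copy()
for _ in range(2000):
    G = np.linalg.inv(B.T @ P @ B + R)
    P_next = S + A.T @ P @ A - A.T @ P @ B @ G @ B.T @ P @ A
    if np.max(np.abs(P_next - P)) < 1e-6 * np.max(np.abs(P_next)):
        P = P_next
        break
    P = P_next

kappa = -np.linalg.inv(B.T @ P @ B + R) @ B.T @ P @ A
print(kappa.item())                    # optimal feedback gain
print(abs((A + B @ kappa).item()))     # closed-loop magnitude, below 1
```

The resulting closed loop A + Bκ_e has magnitude below one, i.e. the offline gain is stabilizing; the limitation, as the next slide argues, is that A and B must be known.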
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
RL solution: computing the LQR solution offline is impossible for all time when the environment dynamics are unknown, so the optimal desired force F_d(t) = κ_e e_r(t) must be found online.
The Hamilton-Jacobi-Bellman equation:
  min_{F_d} [ r(t) + V̇_s( e_r(t) ) - γ V_s( e_r(t) ) ] = 0
The new algebraic Riccati equation:
  e_rᵀ(t) ( AᵀP + PA + S - γP ) e_r(t) + 2 e_rᵀ(t) P B F_d(t) + F_dᵀ(t) R F_d(t) = 0
The optimal desired force:
  F_d(t) = -H_22⁻¹ H_21 e_r(t)
where Q( e_r(t), F_d(t) ) = V_s( e_r(t) ) = Xᵀ H X with X = [ e_rᵀ  F_dᵀ ]ᵀ.
The discounted cost function:
  V_s( e_r(t) ) = ∫_t^∞ e^{-γ(τ-t)} r( e_r(τ), F_d(τ) ) dτ,  where r( e_r, F_d ) = e_rᵀ S e_r + F_dᵀ R F_d
Implementation procedure
Online solution: off-policy RL is used to find the solution of the given new ARE.
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
IRL approximation: the optimal desired force F_d(t) = κ_e e_r(t) is obtained by tuning a Q-function approximator to minimize the temporal-difference error.
The approximator of the Q-value function:
  Q̂( e_r, F_d ) = Σ_{i=1}^{N} θ_i φ_i( e_r, F_d )
with normalized Gaussian basis functions
  ψ_i( e_r, F_d ) = exp( -( || e_r - c_{r,i} ||² + || F_d - c_{d,i} ||² ) / 2σ_i² ),  φ_i = ψ_i / Σ_{j=1}^{N} ψ_j
The parameter update law adjusts θ by normalized gradient descent on the temporal-difference error of Q̂( e_r(t), F_d(t) ).
Q-function update law based on IRL:
  Q( e_r(t), F_d(t) ) = ∫_t^{t+ΔT} e^{-γ(τ-t)} r(τ) dτ + e^{-γΔT} Q( e_r(t+ΔT), F_d(t+ΔT) )
The reward (utility) function:
  r(t) = e_rᵀ(t) S e_r(t) + F_dᵀ(t) R F_d(t)
Implementation procedure
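The normalized-Gaussian approximator can be sketched as follows; the centers, width, and weights below are illustrative (in the scheme above, θ is adapted online from the TD error):

```python
import numpy as np

# Sketch of a normalized Gaussian (RBF) Q-value approximator
#   Q_hat(e_r, F_d) = sum_i theta_i * phi_i(e_r, F_d).
# Centers, width, and weights are illustrative, not learned thesis values.
centers = [(-1.0, -1.0), (0.0, 0.0), (1.0, 1.0)]   # (e_r, F_d) centers
beta = 0.5                                         # common width

def phi(er, fd):
    raw = np.array([np.exp(-((er - ce) ** 2 + (fd - cf) ** 2) / (2 * beta))
                    for ce, cf in centers])
    return raw / raw.sum()              # normalized activations (sum to 1)

theta = np.array([2.0, 0.5, 2.0])       # weights; adapted online in the scheme

def q_hat(er, fd):
    return float(theta @ phi(er, fd))

print(q_hat(0.0, 0.0))   # dominated by the center basis function
```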
Fig 5.1 RL in unknown environments during the learning process
Fig 5.2 Learning curves
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Parameters
LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = -4.006, B = -0.002
RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
Position/force control: KPf = 0.001, KIf = 0.02, λ = 0.2, ηΔ = 0.1, ρ1 = 0.5, ρ2 = 20, v = 9/11, α = 5.7
Environment: Ce = 0.5 N·s/mm, Ke = 2 N/mm, r0 = 16.5 mm, xd = 39.917 mm, xe = 35 mm
Validation results
Fig 5.3 Desired force/position tracking response
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Three steps are required to complete the experiment:
1. identification of the environment;
2. obtaining the ARE solution using the environmental estimate (the LQR solution);
3. the complete experiment with the position/force controller (the IRL approximation).
The IRL approximation achieves near-optimal performance without knowledge of the environment dynamics.
Validation results
CONTENTS
1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
→ 6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Proposed control idea
1. The stability of the closed-loop control is theoretically demonstrated.
2. The optimal parameters of the impedance model are found in real time, assuring motion tracking while assisting the human to perform the task with minimum effort.
3. The handling of unknown human dynamics, as well as the optimal overall human-robot system efficiency, is verified in experimental trials.
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Robot-specific inner loop and task-specific outer loop. The control torque in task space combines tracking terms, NN-based compensation of the unknown human-arm dynamics, and a bounding term on the estimated uncertainty Ẑ. The unknown continuous function is approximated by a neural network, Ŵᵀ ξ(q), whose weights follow adaptive update laws. The prescribed impedance model relates the human torque to the model trajectory:
  θ_m(s) / τ_h(s) = K_h / ( M_d s² + C_d s + K_d )
[Block diagram: Human Operator → torque τ_h → prescribed impedance model (gain K_h) → desired trajectory θ_d → torque control law → haptic interface based on APAM (output θ_m); an NN functional approximation compensates the dynamics, and an IRL-based optimal law tunes the impedance parameters from e_h and e_d.]
Control design
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Robot-specific inner loop and task-specific outer loop. The human operator is modeled by a first-order transfer function G_h(s) [99] mapping the torque τ_h to the desired error e_d, giving the state-space form ė_h = A_h e_h + B_h e_d with A_h and B_h determined by the model constants k_e, κ_p, and τ_d. Stacking this with the impedance-error dynamics yields a new augmented state X with
  Ẋ = A X + B u_e,  u_e = -K X
IRL based on LQR solution
Control design
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
RL solution. To minimize the infinite-horizon quadratic cost (performance index)
  J = ∫_t^∞ ( e_dᵀ Q_d e_d + e_hᵀ Q_h e_h + u_eᵀ R u_e ) dτ
for the augmented state dynamics Ẋ = A X + B u_e with u_e = -K X, solve the algebraic Riccati equation [83]
  0 = AᵀP + PA + Q - P B R⁻¹ Bᵀ P
The IRL solution instead evaluates the policy from data over a time interval ΔT > 0.
Proof of convergence: see Appendix 6.2.
An optimal feedback control minimizes the value function:
  u_e* = arg min_{u_e} V(t) = -R⁻¹ Bᵀ P X(t)
The IRL Bellman equation for the interval ΔT > 0:
  V( X(t) ) = ∫_t^{t+ΔT} Xᵀ ( Q + KᵀRK ) X dτ + V( X(t+ΔT) )
With V( X(t) ) = ∫_t^∞ Xᵀ ( Q + KᵀRK ) X dτ = Xᵀ P X, P satisfies the Lyapunov equation
  ( A - BK )ᵀ P + P ( A - BK ) = -( Q + KᵀRK )
where P is the real symmetric positive-definite solution. The IRL Bellman equation can then be written as
  Xᵀ(t) P X(t) = ∫_t^{t+ΔT} Xᵀ ( Q + KᵀRK ) X dτ + Xᵀ(t+ΔT) P X(t+ΔT)
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 6.1 Continuous-time linear policy iteration algorithm
Online implementation:
1. Start with an admissible control input u_e⁰ = -K⁰ X.
2. Policy evaluation: given a control policy uⁱ, find Pⁱ using the off-policy Bellman equation.
3. Policy improvement: update the control policy, u_e^{i+1} = -R⁻¹ Bᵀ Pⁱ X.
Control design
[Flowchart: Begin → initialization P⁰ = 0, i = 1, K¹ chosen so that A - BK¹ is stable → solve for the cost Pⁱ by least squares over measured state trajectories → policy update uⁱ⁺¹ = -R⁻¹BᵀPⁱX → if ||pⁱ - pⁱ⁻¹|| is below tolerance, End; otherwise i → i + 1 and repeat.]
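A model-based sketch of this loop (Kleinman's algorithm): policy evaluation solves a Lyapunov equation in place of the data-driven least squares, and the policy update matches step 3. A, B, Q, R here are illustrative, not the thesis values.

```python
import numpy as np

# Policy iteration for continuous-time LQR (Kleinman's algorithm):
#   evaluate K_i:  (A - B K_i)'P + P(A - B K_i) = -(Q + K_i' R K_i)
#   improve:       K_{i+1} = R^-1 B' P
A = np.array([[0.0, 1.0], [-1.0, -2.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

def lyap(Acl, W):
    """Solve Acl'P + P Acl = -W via the Kronecker-product linear system."""
    n = Acl.shape[0]
    M = np.kron(np.eye(n), Acl.T) + np.kron(Acl.T, np.eye(n))
    return np.linalg.solve(M, -W.reshape(-1)).reshape(n, n)

K = np.zeros((1, 2))                    # admissible start: A itself is stable
for _ in range(20):
    Acl = A - B @ K
    P = lyap(Acl, Q + K.T @ R @ K)      # policy evaluation
    K_new = np.linalg.inv(R) @ B.T @ P  # policy improvement
    if np.max(np.abs(K_new - K)) < 1e-10:
        K = K_new
        break
    K = K_new

print(K)   # converges to the LQR-optimal gain
```

The data-driven version in Fig 6.1 replaces the Lyapunov solve with least squares over measured X(t), which is what removes the need to know A.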
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 6.2 The tracking error between the trajectory of the robot system and the prescribed impedance model
Fig 6.3 Reference trajectory versus the state of the robot system during and after learning
Parameters
LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = -4.006, B = -0.002
RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
Position control: Ks = 20, F = 10, G = 1, μ1 = μ2 = 5, k = 0.1, Z̄ = 100, κp = 20, κd = 10, ke = 0.18
Impedance model: θd = 10 sin(0.2πt), Md = 1 kg, Bd = 7.18 N·s/mm, Kd = 15.76 N/mm, Kh = 0.046
Simulation
Validation results
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 6.4 The output trajectory and impedance model in the inner loop
Fig 6.5 The desired trajectory and output trajectory in the outer loop
Fig 6.6 Interaction force
- Near-perfect performance of the human-robot cooperation
- Only a small deviation between the desired and actual trajectories
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance-model parameters is found by the outer-loop controller
Experiment
Validation results
Discussion
CONTENTS
1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
→ 7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Proposed control idea
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed.
2. The stability and finite-time convergence characteristic of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified under different working conditions.
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
The external torque τ̂_id on each side (i = m, s) is reconstructed by the adaptive force observer. The torque control is designed in FNTSMC style on the modified fast nonsingular terminal sliding surface
  s_i = ė_i + λ_i e_i + ς_i sign(e_i)^{p_i/q_i}
with robust terms ρ_{1i} s_i + ρ_{2i} sign(s_i)^[ν_i] added to the model-based feedforward, and an adaptive FSOB gain that switches depending on the sliding variable s_i.
[Block diagram: Operator → Master controller + estimator → APAM master system; the communication channel exchanges (θ_m, τ̂_md) and (θ_s, τ̂_sd); Slave controller + estimator → APAM slave system → Environment; an impedance filter shapes the reflected force.]
Control design
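The position-synchronizing core of such a master-slave loop can be sketched with a toy 1-DOF pair; the PD coupling, double-integrator dynamics, and operator force are all illustrative (the thesis uses the FNTSMC/AFOB scheme above, not plain PD):

```python
import numpy as np

# Toy bilateral position synchronization: each side runs a PD law toward the
# other side's position, so the slave tracks the operator-driven master.
# Gains, dynamics, and the operator force are illustrative only.
dt, steps = 1e-3, 4000
kp, kd = 400.0, 40.0
xm = vm = xs = vs = 0.0
for k in range(steps):
    t = k * dt
    f_h = 5.0 * np.sin(2 * np.pi * 0.5 * t)     # operator force on the master
    am = f_h + kp * (xs - xm) + kd * (vs - vm)  # master also feels the slave
    a_s = kp * (xm - xs) + kd * (vm - vs)       # slave tracks the master
    vm += am * dt; xm += vm * dt
    vs += a_s * dt; xs += vs * dt
print(abs(xm - xs))   # positions stay synchronized
```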
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 7.1 Torque estimation performances
Parameters
Proposed control: λi = 8, pi = 5, qi = 7, ςi = 10⁻⁴, νi = 9/11, εi = 0.1, ρ1i = 10, ρ2i = 2
PID control: KP = 38, KI = 25, KD = 0.15
Observers: for the NDO, Lh = Le = 10; for the RTOB, β = 350 rad/s; for the AFOB, κi(0) = 0.1, λi = 10, ηi = 0, m = 0.1, a = 10
The RMSEs of the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.
Scenarios
- Free-space situation: θd(t) = 5π(1 - 0.18t) sin(0.72πt)
- Contact-and-recovery situation: soft environment, Ke = 500 N/mm, Be = 4 N·s/mm; stiff environment, Ke = 2800 N/mm, Be = 20 N·s/mm
Validation results
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 7.2 Position tracking performances
Fig 7.3 Position tracking error performances
Fig 7.4 Transparency performance in "soft" contact
Fig 7.5 Transparency error in "soft" contact
Fig 7.6 Transparency performance in "stiff" contact
Fig 7.7 Transparency error in "stiff" contact
Validation results
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Summary
- Finite-time convergence of the tracking errors
- Elimination of the uncertainties and the noise effect in the obtained force sensing via the new AFOB
- Robust performance against uncertainties and disturbances in the system dynamics
- Stability of the overall system and transparency performance attained simultaneously
- Fast transient response
Great transparency performance and global stability of the BHTS are achieved under the proposed control scheme, despite the uncertainties and across different working conditions.
CONTENTS
1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
→ 8 Conclusion and Future works
8 CONCLUSION AND FUTURE WORKS
Conclusions
- Verified an effective position-tracking controller via the AITSMC with the TDE technique under various working conditions; its uncertainties and disturbances are handled.
- Validated the significant effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures excellent position- and force-tracking performance.
- Optimized the prescribed impedance-model parameters by the IRL approach, requiring little operator information while yielding a highly effective position-control outcome.
- Achieved good transparency, with force feedback and position tracking attained simultaneously, by the new adaptive force estimation combined with separate FNTSMC schemes.
8 CONCLUSION AND FUTURE WORKS
Future works
- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both the master and slave sides; the computational complexity should also be reduced with respect to the data size.
- Based on the prototype developed in this thesis, a high-quality commercial teleoperation device for the rehabilitation field can be developed with a safer solution.
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Proposed control idea
1. For the first time, the AITSMC-TDE scheme is proposed for a PAM system.
2. The stability of the controlled system is theoretically analyzed.
3. The control effectiveness is verified by experimental trials under different challenging working conditions, in comparison with several controllers.
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
A terminal sliding surface is chosen as
  s2 = ė1 + k1 e1 + k2 sign(e1)^{γ1}
A proposed integral sliding surface is designed on top of it, subtracting the initial value s2(t0) and the integral of the nominal closed-loop dynamics, which eliminates the reaching phase.
A robust control signal can be redesigned as
  u_s(t) = -g0(x1)⁻¹ ( η̂3 + η4 ) sign(s)
with an adaptive law for the robust gain η̂3 driven by |s| (a two-branch form keeps the gain bounded).
Proof: see Appendix 3.2 and 3.3.
[Block diagram: reference trajectory (x_1d, x_2d) → transfer coordinate → terminal sliding surface and integral terminal sliding surface → nominal control signal u_n + robust control signal u_s (with the adaptation law) → control law u → the PAM system; a time-delay estimator feeds d̂ back into the control law.]
Control design
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
The control law can be rewritten as
  u(t) = u_n + u_s - g0(x1)⁻¹ d̂2
where the TDE estimate of the lumped uncertainty is built from one-sample-delayed signals:
  d̂2(t) = ẋ2(t - Δt) - f0( x1(t - Δt), x2(t - Δt) ) - g0( x1(t - Δt) ) u(t - Δt)
Proof: see Appendix 3.1 and 3.4-3.5.
Control design
The auxiliary variables are defined as e1 = x1 - x_1d and e2 = ė1 = x2 - ẋ_1d.
The nominal control signal can be designed as
  u_n = g0(x1)⁻¹ ( ẋ_2d - f0(x1, x2) - κ1 e1 - κ2 sign(e1)^{γ1} - κ3 e2 - κ4 sign(e2)^{γ2} )
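The finite-time property can be illustrated on the scalar fractional-power dynamics ẋ = -k |x|^v sign(x); k and x0 here are illustrative, and v = 9/11 matches the exponent used in the scheme:

```python
import numpy as np

# Why a fractional exponent v in (0,1) gives *finite-time* convergence:
# for xdot = -k |x|^v sign(x), the origin is reached at
#   t* = |x0|^(1-v) / (k (1-v)),
# unlike the linear case, which only decays exponentially.  k and x0 are
# illustrative; v = 9/11 is the slide's exponent.
k, v, x0 = 2.0, 9.0 / 11.0, 1.0
t_star = abs(x0) ** (1 - v) / (k * (1 - v))   # predicted hitting time

dt = 1e-5
x, t = x0, 0.0
while x > 0.0:
    x -= k * (abs(x) ** v) * dt   # forward-Euler step of the dynamics
    t += dt
print(t, t_star)   # numerical hitting time matches the closed form
```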
Parameters
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Experimental validation
Proposed control: for the ITSMC, η1 = 3, η2 = 15, η3 = 25, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1 = γ2 = 5/7, k1 = 2, k2 = 55, α = 9/11; for the adaptive gain, μ = 10, δ = 0.05, σ = 0.1
PID control: KP = 15, KI = 18, KD = 0.05 for case study 3; KP = 35, KI = 28, KD = 0.1 for the other cases
Fig 3.1 Performance response in case study 1
Fig 3.2 Performance response in case study 2
Case study 1 (no load): xd (mm) = 8 sin(0.4πt)
Case study 2 (no load): xd (mm) = 8 sin(πt)
Case study 3 (no load): a multistep trajectory
Case study 4 (with load): xd (mm) = 8 sin(πt), in turn with a 2-kg load and a 7-kg load
Fig 3.3 Performance response in case study 3
Fig 3.4 Performance response in case study 4
| Case                        | Metric | PID    | ITSMC  | Proposed |
| Sine 0.2 Hz - 8 mm          | L2[e]  | 0.4102 | 0.1737 | 0.1574   |
|                             | L2[u]  | 0.9961 | 1.0270 | 0.9602   |
| Sine 1 Hz - 8 mm            | L2[e]  | 1.1845 | 0.4996 | 0.3788   |
|                             | L2[u]  | 1.1284 | 1.1152 | 1.0900   |
| Multistep                   | L2[e]  | 0.8587 | 0.6846 | 0.6368   |
|                             | L2[u]  | 0.7670 | 0.7333 | 0.7247   |
| Sine 1 Hz - 8 mm, 2-kg load | L2[e]  | 1.2578 | 0.8314 | 0.4779   |
|                             | L2[u]  | 1.1414 | 1.1506 | 1.1287   |
| Sine 1 Hz - 8 mm, 7-kg load | L2[e]  | 1.5937 | 1.0827 | 0.5067   |
|                             | L2[u]  | 1.1613 | 1.1729 | 1.1584   |
Table 3.1 A quantitative comparison of the tracking performances
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Experimental validation
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Summary
- Robust performance against uncertainties and disturbances
[Block diagram: the proposed control approach: reference trajectory, transfer coordinate, terminal and integral terminal sliding surfaces, nominal and robust control signals with the adaptation law, the time-delay estimator, and the PAM system.]
- Finite-time convergence of the tracking errors
- Reduced chattering phenomenon
- Elimination of the reaching phase
- Fast transient response
The next chapters direct the control design toward exploiting the force properties.
CONTENTS
1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
→ 4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works
20-May-21 | 24Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
3 Verified the reliability and superiority of the proposed control
demonstrated in different challenging work conditions
2
The FITSMC guarantees the finite convergence new adaptive gain and TDE
are deployed to handle online (uncertainties and chattering phenomenon)
The proposed FSOB scheme obtains
force information eliminated noise and
static friction
1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 25Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
( )[ ]
1 20
( ) ( ( )) = + +t
v
f f fe e sign e d
( ) ( )[ 1]5
5
4
ˆ
+
minus= minus
v
signsign
1 1 [ ]
1 2
[ ]
4 5
ˆ( ) ( ) ( )
ˆ ( )
v
c d f f
v
u t d t K e sign e
sign
minus minus= + + +
+ +
1ˆ ˆ ˆ( ) ( ) ( ) ( )minus minus = minus minus minusd d c e dd t d t t u t t K t t
The FITSM surface is defined
The control law can be rewritten
2
2
ˆ ˆ ˆ2ˆ
+ += e e e e e e
e
e
An adaptive switching gain law An estimated lumped disturbance Force observer based-LeD [40]
APAM system
Force based
Time-delay
Estimator
Integral Terminal
Sliding SurfaceControl law
Adaptive law
Environment
Configuration
Reference
Trajectoryu
d
ˆe
d
+
+
The proposed control
approach
2k
Plant + Environment
Force Observer
e
Proof See in Appendix 41 ndash 43
Control design
1 2 3 4 5 6 7 8
20-May-21 | 26Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
for ITSMC η1 = 3 η2 = 15 η3 = 25 κ1 = κ3 = 10
κ2 = κ4 = 2 γ1 = γ2 = 57 k1 = 2 k2 = 55 v = 911
for adaptative gain μ = 10 δ = 005 σ = 01
for case study 3 KP = 15 KI = 18 KD = 005
for other case KP = 35 KI = 28 KD = 01
Proposed control
PID control
Fig 42 Performance response in Scenario 2
Scenario 1 τd (N) = 75sin(02πt- π2)+625
Scenario 2 τd (N) = 75sin(08πt- π2)+625
Scenario 3 A multistep trajectory with a LPFrsquos bandwidth
of 02(rads)
Observersfor FSOB ω = 500 rads ξ = 0001 α1 = 20 α2 = 12
α3 = 5 α4 = 15
Fig 42 Performance response in Scenario 1
Parameters
Experimental validation
1 2 3 4 5 6 7 8
20-May-21 | 27Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Fig 43 Performance response in Scenario 3
Fig 44 Control gain
Controller PID ISMC FITSMC-TDE Proposed
Sine 01 Hz 21050 12152 09440 05158
Sine 04 Hz 32054 18045 20632 19807
Multistep 26508 23548 20618 17685
Table 41 A RMSE comparison between the tracking error performances
Experimental validation
- Improved the control performances (fast response high
accuracy and robustness)
- Guaranteed the finite convergence
- Cancelling the lumped uncertainties via TDE solution and
the switching gain in the FITSMC online
Discussion
1 2 3 4 5 6 7 8
20-May-21 | 28Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
4
6
C
H
A
P
T
E
R
1
7
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Model-free control for Optimal contact force in Unknown Environment5
1 2 3 4 5 6 7 8
20-May-21 | 29Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
3 Verified effective for robot interaction control in
unknown environments
2 The stability of the closed-loop control is theoretically demonstrated
by the Lyapunov approach
The RL method is used to learn on-line
the optimal desired force1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 30Presenter Cong Phat Vo
( ) ( ) ( ( ))= minus + minuse e e eF t C x t K x x t
( ( )) ( )r dZ e t F t= [ ]
1 2
( ) ( ) ( )
( ) ( )
= + +
minus minus minus +
r r r
v
e
Me t Ce t Ke t
sign F t
The force control
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The environment is modeled as
a spring-damper system [81]
The optimal impedance model The position control in task space
( ) ( ) ( )
= + f Pf f If f
t
x s K e t K e d
Task-specific Outer-loopRobot-specific Inner-loop
Precise position
control
Prescribed
Impedance Model
Environment
Configuration
APAM system
xx
d
Force controlx
rx
fe
r
Fe
Fd
ef
-
+ -+ -
+
RL-based
approximation
ϕ
Control design
1 2 3 4 5 6 7 8
( ) ( ) ( )r r de t Ae t BF t= +
20-May-21 | 31Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
LQR solution
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
( ) ( )=d e rF t e t
The discounted cost function
( ) ( )( ) ( ) ( ) ( ) ( ) ( )
= minus +T T
s r d r r d dt
J e t F t e Se F RF d
The Algebraic Riccati equation
0+ + + =T T
eA PA S P A PB
The feedback control gain
( )1
1 minus
minus= minus +T T
e B PB R B PA
IRL approximationRL solution
Implementation procedure
1 2 3 4 5 6 7 8
offline soliton
20-May-21 | 32Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
LQR solution RL solution
impossible for all time
The Hamilton-Jacobi-Bellman equation
( ) ( ) ( )min ( ) ( ) ( ) 0
minus minus + minus =
d
t
s rtF
V e t r t r e dt
The new Algebraic Riccati equation
( )( ) ( )
2 ( ) ( ) ( ) ( ) 0
T T
r r
T T
r d d d
e t A P PA S P e t
e t PBF t F t RF t
+ + minus
+ + =
The optimal desired force 1
22 21( ) ( )d rF t H H e tminus= minus
( ) ( ) ( ) ( ) ( )= = T
r d s rQ e t F t V e t X HX
IRL approximation
The discounted cost function
( ) ( )
( )
( )( ) ( ) ( )
where ( ) ( ) ( ) ( ) ( ) ( )
t
s r r dt
T T
r d r r d d
V e t r e F e d
r e F e Se F RF
minus minus=
= +
Implementation procedure
1 2 3 4 5 6 7 8
( ) ( )=d e rF t e t
online soliton using the off-policy RL to
find the solution of the given new ARE
20-May-21 | 33Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
LQR solution RL solution
minimize temporal difference error
The approximator of the Q-value function
( )( ) ( )1
ˆ N
r d i r d i
i
Q e F e F =
=
The parameter update law
11( ) exp
2
T
i r d r i d i i r i d ie F e c F c e c F c minus = minus minus minus minus minus
( )
( ) ( ) ( ) ( )
ˆ ( ) ( ) ( )1( ) ( )
( )
=
= minus +
r d
t t t t
Q e t F t tt t
t t
IRL approximation
Q-function update law based on IRL
( )
( )
( )( ) ( ) ( )
( ) ( )
t tt
r dt
t
r d
Q e t F t r e d
e Q e t t F t t
+
minus minus
minus
=
+ + +
( ) ( ) ( ) ( ) ( )T T
r r d dr t e t Se t F t RF t= +
The reward function of the utility function
Implementation procedure
1 2 3 4 5 6 7 8
1
( )( )
( )
i r di r d N
i r di
e Fe F
e F
=
=
( ) ( )=d e rF t e t
20-May-21 | 34Presenter Cong Phat Vo
Fig 51 RL in unknown environments during the
learning process
Fig 52 Learning curves
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
κe = 00025 S = 1 R = 10 T = 0001 A = minus4006
B = minus0002
υ = 09 γ = 09 δΔ = 065 ε = 01 ε-decay = 099
K = 10 β = 01
LQR solution
RL aproximation
PositionForce controlKPf = 0001 KIf = 002 λ = 02 ηΔ = 01 ρ1 = 05
ρ2 = 20 v = 911 α = 57
EnvironmentCe = 05 Nsmm-1 Ke = 2 N mm-1 r0 = 165 mm
xd = 39917 mm xe = 35 mm
Parameters
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 35Presenter Cong Phat Vo
Fig 53 Desired forceposition tracking response
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Three steps are required to complete
the experiment
1-the identification of the environment
2-obtain the ARE solution using the
environmental estimation
3-the complete experiment for the
positionforce controller
IRL approximation
Achieving near-optimal performance without
knowledge of the environment dynamics
LQR solution
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 36Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
4
5
C
H
A
P
T
E
R
1
7
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Optimized Human-Robot Interaction force control using IRL6
1 2 3 4 5 6 7 8
20-May-21 | 37Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
3 The unknown human dynamics as well as the optimal overall
human-robot system efficiency is verified in experimental trials
2
Find the optimal parameters of the impedance model to
assure motion tracking and also assist the human with
minimum effort to perform the task in real time
The stability of the closed-loop control is theoretically
demonstrated
1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 38Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Task-specific Outer-loopRobot-specific Inner-loop
( )1 20
ˆ ( )
ˆ (|| || )
= + + +
+ + minus
t
s h h h
q B h h
K e e e d
K Z Z K
The control torque in task space
1 2
1 2 1
2 1 2
ˆ ˆˆ ( ) ( )
ˆ ˆˆ ˆ ˆ || ||
ˆˆ ˆ ˆ( ) || ||
=
= minus minus = minus
T T
T T T
h h h
T
h h
q W W q
W F F W q kF W
W Gq W kG W
2
h hm
d d d
K
M s C s K
=
+ +
An approximation of the continuous functionThe prescribed
impedance model
Prescribed
Impedance Model
Haptic interface
based-APAM
Torque
control lawHuman
Operator
Optimal Law-based
IRL
Kh+-
NN functional
approximation
1
s
θ +
-
τ θ d θ m eh
τ h
ξ
ed
Control design
1 2 3 4 5 6 7 8
20-May-21 | 39Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Task-specific Outer-loopRobot-specific Inner-loop
1
1
[ ]
minus
minus
= +
= minus
=
=
h h h h d
h e
h e p d
T T T
d d d
A B e
A k
B k
e e e
( )1
+=
+
p d
e
sG s
k s
The human transfer function [99] Purpose τh rarr ed
0
0
= + = +
= minus
q d qd e
e
h h hh e
A e Be X AX Buu
B A u KX
A new augmented state
Prescribed
Impedance Model
Haptic interface
based-APAM
Torque
control lawHuman
Operator
Optimal Law-based
IRL
Kh+-
NN functional
approximation
1
s
θ +
-
τ θ d θ m eh
τ h
ξ
ed
IRL based on LQR solution
Control design
1 2 3 4 5 6 7 8
20-May-21 | 40Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
The performance index (the infinite-horizon quadratic cost) to minimize:
J_m = ∫_t^∞ ( e_d^T Q_d e_d + ξ_h^T Q_h ξ_h + u_e^T R u_e ) dτ
A new augmented state: Ẋ = A X + B u_e, with feedback u_e = −K X
RL solution: solve the algebraic Riccati equation [83]
0 = A^T P + P A + Q − P B R⁻¹ B^T P
An optimal feedback control:
u* = arg min_{u_e} V(t, X(t)) ⇒ u_e = −R⁻¹ B^T P X
The value function of policy K:
V(X(t)) = ∫_t^∞ X^T (Q + K^T R K) X dτ = X^T P X
where P is the real symmetric positive-definite solution of
(A − B K)^T P + P (A − B K) = −(Q + K^T R K)
IRL solution: the IRL Bellman equation for time interval ΔT > 0,
V(X(t)) = ∫_t^{t+ΔT} X^T (Q + K^T R K) X dτ + V(X(t + ΔT))
can be written as
X^T(t) P X(t) = ∫_t^{t+ΔT} X^T (Q + K^T R K) X dτ + X^T(t+ΔT) P X(t+ΔT)
Control design
Proof of convergence: see Appendix 6.2
Fig. 6.1 Continuous-time linear policy iteration algorithm
Online implementation (control design):
1. Initialization: P⁰ = 0, i = 1; start with an admissible control input u⁰ = −K¹X, where K¹ makes (A − BK¹) stable.
2. Policy evaluation: given a control policy u^i, find P^i from the off-policy Bellman equation
   X^T(t) P^i X(t) = ∫_t^{t+ΔT} X^T (Q + (K^i)^T R K^i) X dτ + X^T(t+ΔT) P^i X(t+ΔT),
   solving for the cost parameters p^i by least squares over N sampled intervals.
3. Policy improvement: update the control policy u^{i+1} = −R⁻¹ B^T P^i X.
Set i → i + 1 and repeat steps 2-3 until ‖p^i − p^{i−1}‖ falls below a tolerance, then stop.
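The policy-iteration loop above can be sketched for a scalar linear system. In this toy sketch the model (a, b) and cost weights (q, r) are illustrative placeholders; the evaluation step uses the model directly via the Lyapunov equation, whereas the IRL variant in the slides would instead fit the cost from measured trajectory data by least squares.

```python
# Kleinman-style policy iteration for dx/dt = a*x + b*u, cost q*x^2 + r*u^2.
# Step 2 solves the scalar Lyapunov equation 2*(a - b*k)*p = -(q + r*k^2);
# step 3 improves the policy with k <- b*p/r (i.e., u = -(1/r)*b*p*x).
def policy_iteration(a, b, q, r, k1=0.0, tol=1e-12, max_iter=100):
    k, p_prev = k1, float("inf")        # k1 must stabilize: a - b*k1 < 0
    for _ in range(max_iter):
        p = -(q + r * k**2) / (2.0 * (a - b * k))   # policy evaluation
        k = b * p / r                               # policy improvement
        if abs(p - p_prev) < tol:
            break
        p_prev = p
    return p, k

p, k = policy_iteration(a=-4.0, b=-0.02, q=1.0, r=10.0)
# At convergence p solves the ARE 0 = 2*a*p + q - (b*p)^2 / r.
residual = 2 * (-4.0) * p + 1.0 - ((-0.02) * p) ** 2 / 10.0
```

Each iterate is the cost of the current stabilizing policy, and the residual of the algebraic Riccati equation shrinks to zero at the fixed point, which is exactly the convergence property the slide's proof establishes.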
Fig. 6.2 The tracking error between the trajectory of the robot system and the prescribed impedance model
Fig. 6.3 Reference trajectory versus the state of the robot system during and after learning
Parameters
LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
Position control: K_s = 20, F = 10, G = 1, μ₁ = μ₂ = 5, k = 0.1, Z̄ = 100, κ_p = 20, κ_d = 10, and k_e = 0.18
Prescribed impedance model: θ_d = 10sin(0.2πt), M_d = 1 kg, B_d = 7.18 N·s·mm⁻¹, K_d = 15.76 N·mm⁻¹, K_h = 0.046
Simulation
Validation results
Fig. 6.4 The output trajectory and impedance model in the inner loop
Fig. 6.5 The desired trajectory and output trajectory in the outer loop
Fig. 6.6 Interaction force
Near-perfect performance of the human-robot cooperation
A small deviation between the desired and actual trajectories
The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller
Experiment
Validation results
Discussion
CONTENTS
Chapter 1  Introduction
Chapter 2  Modeling and platform setup based on the APAM configuration
Chapter 3  Adaptive Gain ITSMC with TDE scheme
Chapter 4  Adaptive Finite-Time Force Sensorless Control scheme
Chapter 5  Model-free control for Optimal contact force in Unknown Environment
Chapter 6  Optimized Human-Robot Interaction force control using IRL
Chapter 7  Force Sensorless Reflecting Control for BHTS
Chapter 8  Conclusion and Future works
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Proposed control idea
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is developed.
2. The stability and finite-time convergence characteristics of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified in different working conditions.
The external torque estimation τ̂_id via the adaptive force-sensorless observer (AFOB) [expressions illegible in the transcript]
An adaptive FSOB gain κ_i, defined piecewise in terms of s_i, ε_i, and z_i [expression illegible in the transcript]
The modified FNTSM surface design: s_i = ė_i + λ_i Φ(e_i), with Φ(·) a piecewise fast-terminal function
The torque control design by TDE-style FNTSMC: a model-based feedforward term plus switching terms K_i |s_i|^{ν_i} sign(s_i), ρ_{1i} s_i, ρ_{2i}(·), and the estimate τ̂_id [full expression illegible in the transcript]
[Block diagram (Master-Slave): Operator → Master Controller → APAM Master system with Estimator; positions θ_m, θ_s and estimated torques τ̂_md, τ̂_sd exchanged between sides; Slave Controller → APAM Slave system with Estimator → Environment; an Impedance filter shapes the reflected signal]
Control design
Fig. 7.1 Torque estimation performances
Parameters
Proposed control: λ_i = 8, p_i = 5, q_i = 7, ς_i = 10⁻⁴, ν_i = 9/11, ε_i = 0.1, ρ_{1i} = 10, ρ_{2i} = 2
PID control: K_P = 38, K_I = 25, K_D = 0.15
Observers: for NDO, L_h = L_e = 10; for RTOB, β = 350 rad/s; for AFOB, κ_i(0) = 0.1, λ̄_i = 10, η_i = 0, m = 0.1, a = 10
The RMSEs associated with the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.
Scenarios:
- The free-space situation: θ_d(t) = 5π(1 − 0.18t)sin(0.72πt)
- The contact and recovery situation: soft environment K_e = 500 N·mm⁻¹, B_e = 4 N·s·mm⁻¹; stiff environment K_e = 2800 N·mm⁻¹, B_e = 20 N·s·mm⁻¹
Validation results
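The RMSE figures quoted above summarize each observer's estimation error over a trial. For reference, a minimal sketch of the metric itself; the sample signals are illustrative, not data from the experiment.

```python
import math

def rmse(estimates, references):
    # Root-mean-square error between two equal-length signals.
    n = len(estimates)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(estimates, references)) / n)

err = rmse([1.0, 2.0, 3.0], [1.0, 2.5, 3.5])
# sqrt((0 + 0.25 + 0.25) / 3) = sqrt(1/6) ≈ 0.4082
```

In the experiments this would be evaluated between each observer's torque estimate and the ground-truth sensor signal over the whole scenario.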
Fig. 7.2 Position tracking performances
Fig. 7.2 Position tracking error performances
Fig. 7.2 Transparency performance in "soft" contact
Fig. 7.2 Transparency error in soft contact
Fig. 7.2 Transparency performance in "stiff" contact
Fig. 7.2 Transparency error in stiff contact
Validation results
Summary
- Finite-time convergence of the tracking errors
- Elimination of uncertainties and noise effects in the obtained force sensing via the new AFOB
- Robust performance against uncertainties and disturbances in the system dynamics
- Stability of the overall system and transparency performance attained simultaneously
- Fast transient response
Great transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and in different working conditions.
8 CONCLUSION AND FUTURE WORKS
Conclusions
Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions (uncertainties and disturbances are handled)
Validated the significant effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB)
Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance
Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding highly effective position control
Achieved good transparency performance with both force feedback and position tracking simultaneously by the new adaptive force estimation combined with separate FNTSMC schemes
Future works
To improve the control performance in joint space, the robot system model can be improved by robust identification approaches
The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains
Apply optimized strategies on both the master and slave sides; the computational complexity should also be reduced with the data size
Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device in the rehabilitation field with a safer solution
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
[The design equations on this slide are illegible in the transcript; their roles are:]
- A terminal sliding surface s₂, chosen on the tracking error e₁ with gains k₁ and k₂
- A proposed integral sliding surface s(t), designed on s₂ and the nominal closed-loop dynamics (eliminating the reaching phase)
- A robust control signal u_s(t), redesigned through g₀⁻¹(x₁), a sign(s) switching term, and the adaptive gain κ̂₃
- An adaptive law for the robust gain κ̂₃
Proof: see more detail in Appendices 3.2 and 3.3
[Block diagram of the proposed control approach: Reference Trajectory (x_{1d}, x_{2d}) → Transfer Coordinate → Terminal Sliding Surface → Integral Terminal Sliding Surface s → Nominal control signal u_n plus Robust control signal u_s (with Adaptation law κ̂₃) → Control law u → the PAM system (x₁, x₂), with a Time-delay Estimator feeding back d̂]
Control design
The control law can be rewritten: u(t) = u_n + u_s − g₀⁻¹(x₁) d̂
where the lumped disturbance is estimated by time delay:
d̂ = ẋ₂(t − t_d) − f₀(x₁(t − t_d), x₂(t − t_d)) − g₀(x₁(t − t_d)) u(t − t_d)
Proof: see more detail in Appendices 3.1 and 3.4-3.5
Control design
The auxiliary variables are defined: e₁ = x₁ − x_{1d}, ξ₂ = ė₁ = x₂ − x_{2d}
The nominal control signal u_n is designed from g₀⁻¹(x₁), the nominal dynamics f₀(x₁, x₂), and feedback terms in e₁ and ξ₂ [full expression illegible in the transcript]
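The time-delay estimation idea above can be illustrated on a toy first-order plant ẋ = f₀(x) + g₀·u + d(t): the lumped disturbance is reconstructed from signals one sample in the past. The plant, f₀, g₀, and d(t) below are illustrative choices, not the PAM model identified in the dissertation.

```python
import math

f0 = lambda x: -2.0 * x                 # known nominal dynamics
g0 = 1.5                                # known nominal input gain
d = lambda t: 0.8 * math.sin(0.5 * t)   # slowly varying true disturbance

dt, x, u = 1e-3, 0.0, 0.3               # sample time doubles as the TDE delay
prev = None                             # (xdot, u, x) one sample in the past
errs = []
t = 0.0
for _ in range(20000):
    xdot = f0(x) + g0 * u + d(t)        # true plant rate (measured in practice)
    if prev is not None:
        xdot_prev, u_prev, x_prev = prev
        d_hat = xdot_prev - f0(x_prev) - g0 * u_prev  # TDE: d_hat(t) = d(t - dt)
        errs.append(abs(d_hat - d(t)))
    prev = (xdot, u, x)
    x += dt * xdot
    t += dt

max_err = max(errs)     # small because d(t) varies slowly relative to dt
```

The estimation error is bounded by how much d(t) can change over one delay interval, which is exactly the assumption the TDE analysis relies on.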
Parameters
Experimental validation
Proposed control:
for ITSMC: η₁ = 3, η₂ = 15, η₃ = 25, κ₁ = κ₃ = 10, κ₂ = κ₄ = 2, γ₁ = γ₂ = 5/7, k₁ = 2, k₂ = 55, α = 9/11
for the adaptive gain: μ = 10, δ = 0.05, σ = 0.1
PID control: for case study 3, K_P = 15, K_I = 18, K_D = 0.05; for the other cases, K_P = 35, K_I = 28, K_D = 0.1
Fig. 3.1 Performance response in case study 1
Fig. 3.2 Performance response in case study 2
Case study 1 (no load): x_d (mm) = 8sin(0.4πt)
Case study 2 (no load): x_d (mm) = 8sin(πt)
Case study 3 (no load): a multistep trajectory
Case study 4 (with load): x_d (mm) = 8sin(πt), in turn with a 2-kg load and a 7-kg load
Fig. 3.3 Performance response in case study 3
Fig. 3.4 Performance response in case study 4
Table 3.1 A quantitative comparison between the tracking performances
Controller                        |  PID   | ITSMC  | Proposed
Sine 0.2 Hz - 8 mm          L2[e] | 0.4102 | 0.1737 | 0.1574
                            L2[u] | 0.9961 | 1.0270 | 0.9602
Sine 1 Hz - 8 mm            L2[e] | 1.1845 | 0.4996 | 0.3788
                            L2[u] | 1.1284 | 1.1152 | 1.0900
Multistep                   L2[e] | 0.8587 | 0.6846 | 0.6368
                            L2[u] | 0.7670 | 0.7333 | 0.7247
Sine 1 Hz - 8 mm, 2-kg load L2[e] | 1.2578 | 0.8314 | 0.4779
                            L2[u] | 1.1414 | 1.1506 | 1.1287
Sine 1 Hz - 8 mm, 7-kg load L2[e] | 1.5937 | 1.0827 | 0.5067
                            L2[u] | 1.1613 | 1.1729 | 1.1584
Summary
- Robust performance against uncertainties and disturbances
- Finite-time convergence of the tracking errors
- Reducing the chattering phenomenon
- Eliminating the reaching phase
- Fast transient response
Subsequent work is directed to control using the force properties.
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Proposed control idea
1. The proposed FSOB scheme obtains force information with noise and static friction eliminated.
2. The FITSMC guarantees finite-time convergence; a new adaptive gain and the TDE are deployed to handle uncertainties and the chattering phenomenon online.
3. The reliability and superiority of the proposed control are demonstrated in different challenging working conditions.
The FITSM surface is defined:
s = ė_f + ∫₀ᵗ (η₁ e_f + η₂ |e_f|^v sign(e_f)) dτ
An adaptive switching gain law updates κ̂ online [expression illegible in the transcript]
The control law can be rewritten: u_c(t) = d̂(t) plus feedback and switching terms in e_f, s, and sign(s) [full expression illegible in the transcript]
An estimated lumped disturbance via force-based TDE: d̂(t) built from d̂(t − t_d), u_c(t − t_d), and the delayed dynamics
Force observer based on LeD [40] [expression illegible in the transcript]
[Block diagram: Reference Trajectory and Environment Configuration → Integral Terminal Sliding Surface → Control law u (with Adaptive law and force-based Time-delay Estimator d̂) → APAM system plus Environment, with the Force Observer returning τ̂_e]
Proof: see Appendices 4.1-4.3
Control design
Proposed control:
for ITSMC: η₁ = 3, η₂ = 15, η₃ = 25, κ₁ = κ₃ = 10, κ₂ = κ₄ = 2, γ₁ = γ₂ = 5/7, k₁ = 2, k₂ = 55, v = 9/11
for the adaptive gain: μ = 10, δ = 0.05, σ = 0.1
PID control: for case study 3, K_P = 15, K_I = 18, K_D = 0.05; for the other cases, K_P = 35, K_I = 28, K_D = 0.1
Observers: for FSOB, ω = 500 rad/s, ξ = 0.001, α₁ = 20, α₂ = 12, α₃ = 5, α₄ = 15
Scenario 1: τ_d (N) = 7.5sin(0.2πt − π/2) + 6.25
Scenario 2: τ_d (N) = 7.5sin(0.8πt − π/2) + 6.25
Scenario 3: a multistep trajectory with an LPF bandwidth of 0.2 rad/s
Fig. 4.1 Performance response in Scenario 1
Fig. 4.2 Performance response in Scenario 2
Parameters
Experimental validation
Fig. 4.3 Performance response in Scenario 3
Fig. 4.4 Control gain
Table 4.1 An RMSE comparison between the tracking error performances
Controller  |  PID   |  ISMC  | FITSMC-TDE | Proposed
Sine 0.1 Hz | 2.1050 | 1.2152 |   0.9440   |  0.5158
Sine 0.4 Hz | 3.2054 | 1.8045 |   2.0632   |  1.9807
Multistep   | 2.6508 | 2.3548 |   2.0618   |  1.7685
Experimental validation
Discussion
- Improved the control performances (fast response, high accuracy, and robustness)
- Guaranteed finite-time convergence
- Cancelled the lumped uncertainties online via the TDE solution and the switching gain in the FITSMC
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Proposed control idea
1. The RL method is used to learn the optimal desired force online.
2. The stability of the closed-loop control is theoretically demonstrated by the Lyapunov approach.
3. Verified effectiveness for robot interaction control in unknown environments.
The environment is modeled as a spring-damper system [81]:
F_e(t) = −C_e ẋ(t) + K_e (x_e − x(t))
The optimal impedance model: Z(e_r(t)) = F_d(t), with error dynamics ė_r(t) = A e_r(t) + B F_d(t)
The force control:
M ë_r(t) + C ė_r(t) + K e_r(t) = −(switching terms in λ and sign(·)^v) + F_e(t) [full expression illegible in the transcript]
The position control in task space:
x_f = K_Pf e_f(t) + K_If ∫₀ᵗ e_f(τ) dτ
[Block diagram: Prescribed Impedance Model → precise position control of the APAM system (robot-specific inner loop); Environment Configuration and RL-based approximation generating F_d and x_r from e_f and F_e (task-specific outer loop)]
Control design
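The spring-damper environment model above is easy to check numerically. The parameter values below follow the simulation settings listed later in this chapter (C_e = 0.5 N·s·mm⁻¹, K_e = 2 N·mm⁻¹, x_e = 35 mm).

```python
# Numeric check of the environment model F_e(t) = -C_e*xdot + K_e*(x_e - x).
def environment_force(x, xdot, Ce=0.5, Ke=2.0, xe=35.0):
    """Contact force (N) for tool position x (mm) and velocity xdot (mm/s)."""
    return -Ce * xdot + Ke * (xe - x)

f = environment_force(x=34.0, xdot=1.0)
# -0.5*1.0 + 2.0*(35.0 - 34.0) = 1.5
```

The damping term opposes the tool velocity while the stiffness term scales with the distance from the environment surface x_e.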
LQR solution (offline solution)
The main objective of this work is to determine the optimal desired force F_d, i.e., the minimum force that minimizes the position error e_r:
F_d(t) = κ_e e_r(t)
The discounted cost function:
J_s(e_r(t), F_d(t)) = ∫_t^∞ ( e_r^T S e_r + F_d^T R F_d ) dτ
The algebraic Riccati equation:
A^T P A + S − P + A^T P B κ_e = 0
The feedback control gain:
κ_e = −(B^T P B + R)⁻¹ B^T P A
RL solution / IRL approximation
Implementation procedure
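The offline LQR gain above can be sketched for a scalar system by iterating the Riccati recursion to a fixed point and reading off κ_e. The scalars A, B, S, R below are illustrative assumptions, not the identified environment model.

```python
# Iterate P <- A'PA + S - A'PB (B'PB + R)^{-1} B'PA, then compute
# kappa_e = -(B'PB + R)^{-1} B'PA.  Scalar illustrative sketch.
def lqr_gain(A, B, S, R, iters=500):
    P = S
    for _ in range(iters):
        P = A * P * A + S - (A * P * B) ** 2 / (B * P * B + R)
    kappa_e = -(B * P * A) / (B * P * B + R)
    return P, kappa_e

P, kappa_e = lqr_gain(A=0.9, B=0.1, S=1.0, R=0.5)
# Fixed-point check: A'PA + S - P + A'PB*kappa_e should vanish.
residual = 0.9 * P * 0.9 + 1.0 - P + 0.9 * P * 0.1 * kappa_e
```

At the fixed point the recursion reproduces the algebraic Riccati equation exactly, which is why the residual check works as a convergence test.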
The main objective of this work is to determine the optimal desired force F_d, i.e., the minimum force that minimizes the position error e_r: F_d(t) = κ_e e_r(t)
LQR solution: impossible for all time. RL solution: an online solution using off-policy RL to find the solution of the given new ARE.
The discounted cost function:
V_s(e_r(t)) = ∫_t^∞ e^{−γ(τ−t)} r(e_r(τ), F_d(τ)) dτ,
where r(e_r, F_d) = e_r^T S e_r + F_d^T R F_d
The Hamilton-Jacobi-Bellman equation:
min_{F_d} [ V̇_s(e_r(t)) − γ V_s(e_r(t)) + r(t) ] = 0
The new algebraic Riccati equation:
e_r^T(t) (A^T P + P A + S − γP) e_r(t) + 2 e_r^T(t) P B F_d(t) + F_d^T(t) R F_d(t) = 0
The optimal desired force: F_d(t) = −H₂₂⁻¹ H₂₁ e_r(t),
with the Q-function Q(e_r(t), F_d(t)) = V_s(e_r(t)) = X^T H X, where X = [e_r^T, F_d^T]^T
Implementation procedure
The main objective remains to determine the optimal desired force F_d = κ_e e_r that minimizes the position error e_r; the RL solution minimizes the temporal-difference error.
The approximator of the Q-value function:
Q̂(e_r, F_d) = Σ_{i=1}^N θ_i ψ_i(e_r, F_d)
with normalized Gaussian basis functions
ψ_i(e_r, F_d) = φ_i(e_r, F_d) / Σ_{j=1}^N φ_j(e_r, F_d),
φ_i(e_r, F_d) = exp( −½ [e_r − c_i^r; F_d − c_i^d]^T Σ_i⁻¹ [e_r − c_i^r; F_d − c_i^d] )
The reward (utility) function: r(t) = e_r^T(t) S e_r(t) + F_d^T(t) R F_d(t)
The Q-function update law based on IRL:
Q(e_r(t), F_d(t)) = ∫_t^{t+Δt} e^{−γ(τ−t)} r(τ) dτ + e^{−γΔt} Q(e_r(t+Δt), F_d(t+Δt))
The parameter update law: a normalized gradient step of θ on the temporal-difference error [full expression illegible in the transcript]
Implementation procedure
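The normalized-Gaussian Q-function approximator above can be sketched directly. The centers, width, and weights below are illustrative placeholders, not values identified in the thesis.

```python
import math

centers = [(-1.0, -1.0), (0.0, 0.0), (1.0, 1.0)]   # (e_r, F_d) centers c_i
sigma = 1.0                                          # shared isotropic width

def features(er, fd):
    # Gaussian activations phi_i, then normalize so the psi_i sum to 1.
    phi = [math.exp(-0.5 * ((er - cr) ** 2 + (fd - cd) ** 2) / sigma ** 2)
           for cr, cd in centers]
    total = sum(phi)
    return [p / total for p in phi]

def q_hat(er, fd, theta):
    # Linear-in-parameters Q estimate: sum_i theta_i * psi_i(e_r, F_d).
    return sum(t * p for t, p in zip(theta, features(er, fd)))

psi = features(0.2, -0.1)
q = q_hat(0.2, -0.1, theta=[0.5, 0.1, 0.9])
```

Because the features are normalized and positive, Q̂ is a convex combination of the weights θ_i, which keeps the approximation well scaled during the TD updates.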
Fig. 5.1 RL in unknown environments during the learning process
Fig. 5.2 Learning curves
Parameters
LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
Position/Force control: K_Pf = 0.001, K_If = 0.02, λ = 0.2, ηΔ = 0.1, ρ₁ = 0.5, ρ₂ = 20, v = 9/11, α = 5/7
Environment: C_e = 0.5 N·s·mm⁻¹, K_e = 2 N·mm⁻¹, r₀ = 165 mm, x_d = 39.917 mm, x_e = 35 mm
Validation results
Fig. 5.3 Desired force/position tracking response
Three steps are required to complete the experiment:
1. identification of the environment;
2. obtaining the ARE solution using the environmental estimate (LQR solution);
3. the complete experiment with the position/force controller (IRL approximation).
Near-optimal performance is achieved without knowledge of the environment dynamics.
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 36Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
4
5
C
H
A
P
T
E
R
1
7
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Optimized Human-Robot Interaction force control using IRL6
1 2 3 4 5 6 7 8
20-May-21 | 37Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
3 The unknown human dynamics as well as the optimal overall
human-robot system efficiency is verified in experimental trials
2
Find the optimal parameters of the impedance model to
assure motion tracking and also assist the human with
minimum effort to perform the task in real time
The stability of the closed-loop control is theoretically
demonstrated
1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 38Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Task-specific Outer-loopRobot-specific Inner-loop
( )1 20
ˆ ( )
ˆ (|| || )
= + + +
+ + minus
t
s h h h
q B h h
K e e e d
K Z Z K
The control torque in task space
1 2
1 2 1
2 1 2
ˆ ˆˆ ( ) ( )
ˆ ˆˆ ˆ ˆ || ||
ˆˆ ˆ ˆ( ) || ||
=
= minus minus = minus
T T
T T T
h h h
T
h h
q W W q
W F F W q kF W
W Gq W kG W
2
h hm
d d d
K
M s C s K
=
+ +
An approximation of the continuous functionThe prescribed
impedance model
Prescribed
Impedance Model
Haptic interface
based-APAM
Torque
control lawHuman
Operator
Optimal Law-based
IRL
Kh+-
NN functional
approximation
1
s
θ +
-
τ θ d θ m eh
τ h
ξ
ed
Control design
1 2 3 4 5 6 7 8
20-May-21 | 39Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Task-specific Outer-loopRobot-specific Inner-loop
1
1
[ ]
minus
minus
= +
= minus
=
=
h h h h d
h e
h e p d
T T T
d d d
A B e
A k
B k
e e e
( )1
+=
+
p d
e
sG s
k s
The human transfer function [99] Purpose τh rarr ed
0
0
= + = +
= minus
q d qd e
e
h h hh e
A e Be X AX Buu
B A u KX
A new augmented state
Prescribed
Impedance Model
Haptic interface
based-APAM
Torque
control lawHuman
Operator
Optimal Law-based
IRL
Kh+-
NN functional
approximation
1
s
θ +
-
τ θ d θ m eh
τ h
ξ
ed
IRL based on LQR solution
Control design
1 2 3 4 5 6 7 8
20-May-21 | 40Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
The performance index
( )T T T
m d d d h h h e et
J e Q e Q u Ru d
= + +
10 T TA P PA Q PB R BPminus= + + minus
The algebraic Riccati equation [83]
0
0
q d qd
e
h h hh
e
A e BeX u
B A
u KX
= = +
= minus
A new augmented state
IRL solutionRL solution
An optimal feeback control
To minimize
solve
The infinite-horizon quadratic cost
Control design
The IRL Bellman equation for time interval ΔT gt 0
Proof of convergence See in Appendix 62
1 2 3 4 5 6 7 8
( ) 1
( )
arg min ( ) ( )e
T
e eu t
u V t X t u R B PX
minus
= = minus
( )7 7( ( )) ( ) ( ) ( ( ))t T
T T T
tV X t X Q K RK X d V X t T
+
= + minus +
( )7 ( ( )) ( ) ( )T T T T
tV X t X Q K RK X d X PX
= + =
( ) ( ) ( )T TA BK P P A BK Q K RKminus + minus = minus +
where P is the real symmetric positive definite solution
( )( ) ( ) = ( ) ( )
( ) ( )
t tT T T T
t
T
X t PX t X Q K RK X d
X t t PX t t
+
+
+ + +
The IRL Bellman equation can be written as
20-May-21 | 41Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 61 Continuous-time linear policy iteration algorithm
1 Start with an admissible control input
0 0
1= minuseu K X
2 Policy evaluation given a control policy ui find the
Pi using off-policy Bellman equation
1 1
1
+ minus= minusi T i
eu R B P X
3 Policy Improvement update the control policy
Online implementation
Control design
1 2 3 4 5 6 7 8
( )( )( ) ( )= ( ) ( )
( ) ( )
t t TT i T i i T
t
T i
X t P X t X Q K RK X d
X t t P X t t
+
+
+ + +
Begin
1 2
1 2
1
( ) ( )
( ) ( ) ( )
( )
N i i i
Ti i N i
i T
t t t
K K K
p
minus
= = minus +
=
=
Solving for the cost using least squares
1|| ||i ip p minusminus
0
1 10 1 ( ) 0P i K A BK= = minus
Initialization
1 1i T i
eu R B P Xminus minus= minus
Policy update
End
No
Yes
1i irarr +
20-May-21 | 42Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 62 The tracking error of the trajectory of the robot system and
the prescribed impedance model
Fig 63 Reference trajectory versus the state of the robot systems
during and after learning
κe = 00025 S = 1 R = 10 T = 0001
A = minus4006 B = minus0002
υ = 09 γ = 09 δΔ = 065 ε = 01
ε-decay = 099 K = 10 β = 01
LQR solution
RL aproximation
Position controlKs = 20 F = 10 G = 1 μ1 = μ2 = 5 k = 01
= 100 κp = 20 κd = 10 and ke = 018 s
θd = 10sin(02πt) Md = 1 Kg Bd = 718
Nsmm-1 Kd =1576 Nmm-1 Kh = 0046
Parameters
Z
Simulation
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 43Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 64 The output trajectory and impedance model in the inner loop
Fig 65 The desired trajectory and output trajectory the outer loop
Fig 66 Interaction force
Perfectly performance of the human-
robot cooperation
A little deviation between the desired
and actual trajectory
The human interaction force is reduced
after learning and the optimal set of the
prescribed impedance model is found by
the outer-loop controller
Experiment
Validation results
Discussion
1 2 3 4 5 6 7 8
20-May-21 | 44Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Conclusion and Future works
Introduction
7
4
5
6
C
H
A
P
T
E
R
1
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Force Sensorless Reflecting Control for BHTS7
1 2 3 4 5 6 7 8
20-May-21 | 45Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
3 The effectiveness of the proposed control method is verified in
the different working conditions
2The stability and finite-time convergence characteristic of the closed-
loop control are theoretically analyzed by the Lyapunov approach
The first time the hybrid control based
on a fast finite-time NTSMC and AFOB1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 46Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
( )1 2id = minus +
1 1
2
= minus + = minus +
id
T
i i id i
T
i ia = minus
The external torque estimation
[ ]
1 2ˆ( ) = + + minus minus minus +iv
i i ir i i i i i i i i i i idM C K sign s s s
( )i i i is = +
[ ]
2
1 2
if 0 or 0 and ( )
( ) if 0 and
i
i i i i i i
i i
i i i i i i i i
s s
z z sign s
= =
+
An adaptive FSOB gain
where
where
The torque control design by style-FNTSMC
The modified FNTSM surface design
Master Slave
Environment
APAM Slave
system
Slave
Controller
Estimator
Oper
ator
Master
Controller
APAM Master
system
Estimator
Impendence
filter
+
++
-
+
-
+
+md
m s
s
sd
ˆsdˆmd
mf
m
d
Control design
1 2 3 4 5 6 7 8
20-May-21 | 47Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 71 Torque estimation performances
λi = 8 pi = 5 qi = 7 ςi = 10-4 νi = 911
εi = 01 ρ1i = 10 ρ2i = 2
Proposed control
PID control KP = 38 KI = 25 KD = 015
Observersfor NDO Lh = Le = 10 for RTOB β = 350 rads
for AFOB κi(0) = 01 λi = 10 ηi = 0 m = 01
a = 10
RMSEs associated with NDO RTOB and proposed algorithm
are calculated of 0211 0192 and 0076 respectively
Parameters
Scenarios- The free space situation θd (t) = 5π(1 - 018t)sin(072πt)
- The contact and recovery situation soft environment Ke = 500 Nmm-1 Be = 4 Nsmm-1
stiff environment Ke = 2800 Nmm-1 Be = 20 Nsmm-1
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 48Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 72 Position tracking performances
Fig 72 Position tracking error performances
Fig 72 Transparency performance in ldquosoftrdquo
Fig 72 Transparency error in soft contract
Fig 72 Transparency performance in ldquostiffrdquo
Fig 72 Transparency error in stiff contract
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 49Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Finite-time convergence of the
tracking errors
Eliminating the uncertainties the
noise effect in the obtained force
sensing via new AFOB
Robust performance against uncertainties
and disturbances in the system dynamics
Stability of the overall system and the
transparency performance can be
attained simultaneously
Fast transient response
Summary
The great transparency performance and global stability of the BHTS under the proposed
control scheme are achieved despite the uncertainties and in different working conditions
1 2 3 4 5 6 7 8
20-May-21 | 50Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Introduction
4
5
6
C
H
A
P
T
E
R
1
7
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Conclusion and Future works78
1 2 3 4 5 6 7 8
20-May-21 | 51Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Verified an effective position tracking controller via the AITSMC with TDE technique under
various working conditions (Its uncertainties and disturbance are handled)
Validated the significant effectiveness of finite-time force tracking problems based on the
FITSMC combined with a new adaptive gain and a friction-free disturbance (FSOB)
Based on the online learning ability of the RL approximation method the model-free controller
for optimal contact force ensures great position and force tracking performance
Optimized a prescribed impedance model parameters by the IRL approach to provide little
operator information in yielding highly effective position control purpose
Achieved good transparency performance with both force feedback and position tracking
simultaneously by the new adaptive force estimation combined the separately FNTSMC schemes
Conclutions
1 2 3 4 5 6 7 8
20-May-21 | 52Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Future works
To improve the control performance in joint-space the robot system model can be
improved by robust identification approaches
Effectiveness of the proposed algorithm can be further increased by developing an automatic
tune method for both the estimation and control gains
Apply optimized strategies on both master and slave sides Note also that the computational
complexity should be reduced by the data size
Based on the developed prototype in this thesis it is possible to develop a high-quality
commercial teleoperation device in the rehabilitation field with a safer solution
1 2 3 4 5 6 7 8
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
1 2 3 4 5 6 7 8
20-May-21 | 19Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Control design

The auxiliary error variables are defined as
    e1 = x1 − x1d,   e2 = ė1 = x2 − ẋ1d

The nominal control signal can be designed as
    un = g0⁻¹(x1)[ẍ1d − f0(x1, x2) − κ1|e1|^γ1 sign(e1) − κ2|e2|^γ2 sign(e2)]

The lumped uncertainty is estimated by the time-delay estimator from one-sample-delayed signals:
    d̂(t) = ẋ2(t − td) − f0(x1(t − td), x2(t − td)) − g0(x1(t − td)) u(t − td)

The control law can be rewritten as
    u(t) = un + us − ĝ⁻¹(x1) d̂

where us is the robust control signal supplied by the adaptation law.

Proof: see more detail in Appendix 3.1 (3.4-3.5).

[Block diagram: the proposed control approach. Reference trajectory x1d, x2d -> transfer coordinate -> integral terminal sliding surface / terminal sliding surface s -> nominal control signal un plus robust control signal us (adaptation law) -> control law u -> the PAM system (x1, x2), with the time-delay estimator providing d̂ in the feedback path.]
1 2 3 4 5 6 7 8
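The TDE step above can be sketched numerically: for dynamics ẋ2 = f0(x) + g0·u + d, the lumped disturbance is recovered from signals delayed by one sample, with an error that shrinks with the delay td. A minimal sketch; the scalar dynamics, input, and disturbance below are illustrative stand-ins, not the thesis PAM model:

```python
import numpy as np

# Illustrative scalar system: x2_dot = f0(x) + g0*u + d(t)
f0 = lambda x1, x2: -2.0 * x2 - 5.0 * x1   # assumed nominal dynamics
g0 = 1.0
d_true = lambda t: 3.0 * np.sin(2.0 * t)   # unknown lumped disturbance

dt = 1e-3                                   # sample time, also the delay t_d
x1, x2 = 0.0, 0.0
prev = None                                 # (x2_dot, f0, u) one sample back
est_err = []
for k in range(5000):
    t = k * dt
    u = np.sin(t)                           # arbitrary bounded input
    x2_dot = f0(x1, x2) + g0 * u + d_true(t)
    if prev is not None:
        # TDE: d_hat(t) = x2_dot(t - t_d) - f0(t - t_d) - g0*u(t - t_d)
        d_hat = prev[0] - prev[1] - g0 * prev[2]
        est_err.append(abs(d_hat - d_true(t)))
    prev = (x2_dot, f0(x1, x2), u)
    x1 += dt * x2
    x2 += dt * x2_dot

print(max(est_err))  # delay-induced estimation error is O(t_d)
```

Because the delayed measurement equals the true disturbance one sample ago, the residual error is bounded by the disturbance's rate of change times td, which is the usual TDE accuracy argument.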
20-May-21 | 20Presenter Cong Phat Vo
Parameters
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Experimental validation
for ITSMC: η1 = 3, η2 = 15, η3 = 25, κ1 = κ3 = 10,
κ2 = κ4 = 2, γ1 = γ2 = 5/7, k1 = 2, k2 = 5.5, α = 9/11
for adaptive gain: μ = 10, δ = 0.05, σ = 0.1
for case study 3: KP = 1.5, KI = 1.8, KD = 0.05
for other cases: KP = 3.5, KI = 2.8, KD = 0.1
Proposed control
PID control
Fig. 3.1 Performance response in case study 1
Fig. 3.2 Performance response in case study 2
Case study 1 (no load): xd (mm) = 8sin(0.4πt)
Case study 2 (no load): xd (mm) = 8sin(πt)
Case study 3 (no load): a multistep trajectory
Case study 4 (with load): xd (mm) = 8sin(πt), in turn with a 2-kg load and a 7-kg load
1 2 3 4 5 6 7 8
20-May-21 | 21Presenter Cong Phat Vo
Fig. 3.3 Performance response in case study 3
Fig. 3.4 Performance response in case study 4
Table 3.1 A quantitative comparison of the tracking performances

Controller                            PID      ITSMC    Proposed
Sine 0.2 Hz - 8 mm            L2[e]   0.4102   0.1737   0.1574
                              L2[u]   0.9961   1.0270   0.9602
Sine 1 Hz - 8 mm              L2[e]   1.1845   0.4996   0.3788
                              L2[u]   1.1284   1.1152   1.0900
Multistep                     L2[e]   0.8587   0.6846   0.6368
                              L2[u]   0.7670   0.7333   0.7247
Sine 1 Hz - 8 mm, 2-kg load   L2[e]   1.2578   0.8314   0.4779
                              L2[u]   1.1414   1.1506   1.1287
Sine 1 Hz - 8 mm, 7-kg load   L2[e]   1.5937   1.0827   0.5067
                              L2[u]   1.1613   1.1729   1.1584
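The L2[·] entries in the table are scalar summaries of the error and control signals over a trial; one common choice, assumed here, is the RMS value of the sampled signal:

```python
import numpy as np

def l2_measure(signal, dt):
    """RMS-style L2 measure of a sampled signal over its duration T."""
    signal = np.asarray(signal, dtype=float)
    T = dt * len(signal)
    return float(np.sqrt(np.sum(signal ** 2) * dt / T))  # equals the RMS value

dt = 1e-3
t = np.arange(0.0, 10.0, dt)
e = 0.5 * np.sin(np.pi * t)            # example tracking error (mm)
print(l2_measure(e, dt))               # RMS of a sine = amplitude / sqrt(2)
```

With this definition the relative ordering in Table 3.1 (Proposed < ITSMC < PID on L2[e]) is independent of the trial length, since the measure is normalized by duration.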
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Case study 1 (no load): xd (mm) = 8sin(0.4πt)
Case study 2 (no load): xd (mm) = 8sin(πt)
Case study 3 (no load): a multistep trajectory
Case study 4 (with load): xd (mm) = 8sin(πt), in turn with a 2-kg load and a 7-kg load
Experimental validation
1 2 3 4 5 6 7 8
20-May-21 | 22Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Summary
Robust performance against uncertainties and disturbances
[Block diagram: the proposed control approach, as on the control-design slide. Reference trajectory, transfer coordinate, integral/terminal sliding surfaces, nominal and robust control signals with the adaptation law, control law u, the PAM system, and the time-delay estimator.]
Finite-time convergence of the tracking errors
Reducing the chattering phenomenon
Eliminating the reaching phase
Fast transient response
Future work will be directed to control using the force properties.
1 2 3 4 5 6 7 8
20-May-21 | 23Presenter Cong Phat Vo
CONTENTS
1  Introduction
2  Modeling and platform setup based on the APAM configuration
3  Adaptive Gain ITSMC with TDE scheme
4  Adaptive Finite-Time Force Sensorless Control scheme
5  Model-free control for Optimal contact force in Unknown Environment
6  Optimized Human-Robot Interaction force control using IRL
7  Force Sensorless Reflecting Control for BHTS
8  Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 24Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Proposed control idea
1. The proposed FSOB scheme obtains the force information with noise and static friction eliminated.
2. The FITSMC guarantees finite-time convergence; a new adaptive gain and the TDE are deployed online to handle uncertainties and the chattering phenomenon.
3. The reliability and superiority of the proposed control are verified in different challenging working conditions.
1 2 3 4 5 6 7 8
20-May-21 | 25Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Control design

The FITSM surface is defined as
    s(t) = ef(t) + ∫_0^t (ρ1 ef(τ) + ρ2 |ef(τ)|^v sign(ef(τ))) dτ

The control law is rewritten from three elements: (i) an adaptive switching-gain law that tunes the discontinuous gain online, (ii) an estimated lumped disturbance d̂(t), computed by the force-based time-delay estimator from the one-sample-delayed control input and estimated contact force, and (iii) the force observer based on the LeD method [40], which supplies the contact force, and hence the force error ef, without a force sensor while rejecting noise and static friction.

Proof: see Appendix 4.1-4.3.

[Block diagram: the proposed control approach. Reference trajectory -> integral terminal sliding surface -> control law u (with the adaptive law and the force-based time-delay estimator d̂) -> APAM system (plant + environment configuration), with the force observer closing the force-feedback loop τ̂e.]
1 2 3 4 5 6 7 8
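The signed-power operator sign^[v](·) used in the surface and control law (here with v = 9/11) is |x|^v·sgn(x); unlike the pure sign function it is continuous at zero, which is what tempers chattering. A small sketch:

```python
def sig(x, v):
    """Signed power sign^[v](x) = |x|**v * sgn(x); continuous at 0 for v > 0."""
    return abs(x) ** v * (1.0 if x > 0 else -1.0 if x < 0 else 0.0)

v = 9 / 11
print(sig(1.0, v), sig(-1.0, v), sig(1e-6, v))  # odd-symmetric, small near zero
```

For |x| near 1 the term behaves almost linearly in sign, while near the surface (x -> 0) it vanishes smoothly instead of switching between ±1.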
20-May-21 | 26Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
for ITSMC: η1 = 3, η2 = 15, η3 = 25, κ1 = κ3 = 10,
κ2 = κ4 = 2, γ1 = γ2 = 5/7, k1 = 2, k2 = 5.5, v = 9/11
for adaptive gain: μ = 10, δ = 0.05, σ = 0.1
for case study 3: KP = 1.5, KI = 1.8, KD = 0.05
for other cases: KP = 3.5, KI = 2.8, KD = 0.1
Proposed control
PID control
Fig. 4.2 Performance response in Scenario 2
Scenario 1: τd (N) = 7.5sin(0.2πt − π/2) + 62.5
Scenario 2: τd (N) = 7.5sin(0.8πt − π/2) + 62.5
Scenario 3: a multistep trajectory with an LPF bandwidth of 0.2 rad/s
Observers: for FSOB, ω = 500 rad/s, ξ = 0.001, α1 = 20, α2 = 12,
α3 = 5, α4 = 15
Fig. 4.1 Performance response in Scenario 1
Parameters
Experimental validation
1 2 3 4 5 6 7 8
20-May-21 | 27Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Fig. 4.3 Performance response in Scenario 3
Fig. 4.4 Control gain
Table 4.1 An RMSE comparison of the tracking error performances

Controller    PID      ISMC     FITSMC-TDE   Proposed
Sine 0.1 Hz   2.1050   1.2152   0.9440       0.5158
Sine 0.4 Hz   3.2054   1.8045   2.0632       1.9807
Multistep     2.6508   2.3548   2.0618       1.7685
Experimental validation
- Improved the control performances (fast response, high accuracy, and robustness)
- Guaranteed finite-time convergence
- Cancelled the lumped uncertainties online via the TDE solution and the switching gain in the FITSMC
Discussion
1 2 3 4 5 6 7 8
20-May-21 | 28Presenter Cong Phat Vo
CONTENTS
1  Introduction
2  Modeling and platform setup based on the APAM configuration
3  Adaptive Gain ITSMC with TDE scheme
4  Adaptive Finite-Time Force Sensorless Control scheme
5  Model-free control for Optimal contact force in Unknown Environment
6  Optimized Human-Robot Interaction force control using IRL
7  Force Sensorless Reflecting Control for BHTS
8  Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 29Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Proposed control idea
1. The RL method is used to learn the optimal desired force online.
2. The stability of the closed-loop control is theoretically demonstrated by the Lyapunov approach.
3. Effectiveness for robot interaction control in unknown environments is verified.
1 2 3 4 5 6 7 8
20-May-21 | 30Presenter Cong Phat Vo
The environment is modeled as a spring-damper system [81]:
    Fe(t) = −Ce ẋ(t) + Ke (xe − x(t))

The optimal impedance model relates the position error to the desired force, Z(er(t)) = Fd(t); the force control realizes it through second-order prescribed dynamics in er (inertia M, damping C, stiffness K) driven by the environment force Fe(t) together with a signed-power robust term.

The position control in task space is a PI law on the force error ef:
    xf(t) = KPf ef(t) + KIf ∫_0^t ef(τ) dτ

[Block diagram: robot-specific inner loop (precise position control of the APAM system in the environment configuration) and task-specific outer loop (prescribed impedance model, RL-based approximation, and force control), with signals xd, xr, x, Fd, Fe, and ef.]
Control design
1 2 3 4 5 6 7 8
ėr(t) = A er(t) + B Fd(t)
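With the spring-damper model above and the environment parameters reported later in this chapter (Ce = 0.5 Ns·mm⁻¹, Ke = 2 N·mm⁻¹), the contact force for a given motion is a one-line evaluation; xe denotes the environment rest position:

```python
def environment_force(x, x_dot, x_e, C_e=0.5, K_e=2.0):
    """Spring-damper contact model: F_e = -C_e*x_dot + K_e*(x_e - x)."""
    return -C_e * x_dot + K_e * (x_e - x)

# Pressing 1 mm into the environment while moving inward at 2 mm/s:
print(environment_force(36.0, 2.0, 35.0))  # damping and spring both resist: -3.0
```

The sign convention follows the reconstructed model: penetration beyond xe and inward velocity both produce a restoring (negative) force on the end-effector.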
20-May-21 | 31Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
LQR solution
The main objective of this work is to determine the optimal desired force Fd, i.e., the minimum force that minimizes the position error er:
    Fd(t) = κe er(t)

The discounted cost function:
    Js(er(t), Fd(t)) = ∫_t^∞ e^(−υ(τ−t)) (erᵀ S er + Fdᵀ R Fd) dτ

The Algebraic Riccati equation:
    Aᵀ P A + S − P + Aᵀ P B κe = 0

The feedback control gain:
    κe = −(Bᵀ P B + R)⁻¹ Bᵀ P A

This is an offline solution: it requires the model (A, B).
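For scalar (A, B), the Riccati equation above can be solved offline by fixed-point iteration, after which κe = −(BᵀPB + R)⁻¹BᵀPA follows in closed form. The A, B below are illustrative stand-ins, not the identified environment model:

```python
# Scalar discrete-time LQR: iterate the Riccati recursion to a fixed point.
A, B = 0.95, 0.1      # illustrative scalar dynamics for e_r[k+1] = A*e_r + B*F_d
S, R = 1.0, 10.0      # state and input weights

P = S
for _ in range(10000):
    K = -(B * P * B + R) ** -1 * B * P * A        # candidate kappa_e
    P_next = A * P * A + S + A * P * B * K        # ARE: A'PA + S - P + A'PB*kappa_e = 0
    if abs(P_next - P) < 1e-12:
        break
    P = P_next

kappa_e = -(B * P * B + R) ** -1 * B * P * A
residual = A * P * A + S - P + A * P * B * kappa_e
print(kappa_e, residual)   # residual ~ 0 at convergence
```

At the fixed point the closed loop A + B·κe is stable, which is the property the online IRL scheme later has to reach without knowing (A, B).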
20-May-21 | 32Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force Fd, i.e., the minimum force that minimizes the position error er.
LQR solution vs. RL solution: the exact model-based solution is impossible for all time, so an online solution uses off-policy RL to find the solution of the given new ARE.

The Hamilton-Jacobi-Bellman equation:
    min over Fd of ( V̇s(er(t)) − υ Vs(er(t)) + r(t) ) = 0

The discounted cost (value) function:
    Vs(er(t)) = ∫_t^∞ e^(−υ(τ−t)) r(er(τ), Fd(τ)) dτ,
    where r(er, Fd) = erᵀ S er + Fdᵀ R Fd

The new Algebraic Riccati equation:
    erᵀ(t)(Aᵀ P + P A + S − υP) er(t) + 2 erᵀ(t) P B Fd(t) + Fdᵀ(t) R Fd(t) = 0

The Q-value function:
    Q(er(t), Fd(t)) = Vs(er(t)) = Xᵀ H X

The optimal desired force:
    Fd(t) = −H22⁻¹ H21 er(t) = κe er(t)

Implementation procedure
1 2 3 4 5 6 7 8
20-May-21 | 33Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force Fd, i.e., the minimum force that minimizes the position error er.
IRL approximation: minimize the temporal-difference error online.

The approximator of the Q-value function:
    Q̂(er, Fd) = Σ_{i=1..N} θi φ̄i(er, Fd)

with Gaussian basis functions centered at (ci_r, ci_d),
    φi(er, Fd) = exp(−(1/2) [er − ci_r, Fd − ci_d] Σi⁻¹ [er − ci_r, Fd − ci_d]ᵀ)
normalized as
    φ̄i(er, Fd) = φi(er, Fd) / Σ_{i=1..N} φi(er, Fd)

The parameter update law adjusts θ by normalized gradient descent to minimize the temporal-difference error.

Q-function update law based on IRL:
    Q(er(t), Fd(t)) = ∫_t^{t+Δt} e^(−υ(τ−t)) r(τ) dτ + e^(−υΔt) Q(er(t+Δt), Fd(t+Δt))

The reward (utility) function:
    r(t) = erᵀ(t) S er(t) + Fdᵀ(t) R Fd(t)

The optimal desired force: Fd(t) = κe er(t)

Implementation procedure
1 2 3 4 5 6 7 8
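The normalized Gaussian basis and the linear-in-parameters Q estimate can be sketched as follows; the grid of centers, the width, and the evaluation point are illustrative choices, not the values used in the experiments:

```python
import numpy as np

def normalized_basis(er, Fd, centers, width=1.0):
    """phi_bar_i(e_r, F_d): Gaussian features normalized to sum to one."""
    z = np.array([er, Fd])
    phi = np.exp(-0.5 * np.sum((z - centers) ** 2, axis=1) / width ** 2)
    return phi / phi.sum()

# 3x3 grid of centers over the (e_r, F_d) plane (illustrative)
centers = np.array([[x, f] for x in (-1.0, 0.0, 1.0) for f in (-1.0, 0.0, 1.0)])
theta = np.ones(len(centers)) * 0.1      # parameters, learned by the TD update
phi_bar = normalized_basis(0.2, -0.3, centers)
Q_hat = float(theta @ phi_bar)           # Q_hat = sum_i theta_i * phi_bar_i
print(Q_hat)                             # equal weights, basis sums to 1 -> 0.1
```

Normalization keeps the feature vector on the unit simplex, so the TD update's step size has a consistent meaning regardless of where (er, Fd) lies relative to the centers.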
20-May-21 | 34Presenter Cong Phat Vo
Fig. 5.1 RL in unknown environments during the learning process
Fig. 5.2 Learning curves
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006,
B = −0.002
RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99,
K = 10, β = 0.1
Position/Force control: KPf = 0.001, KIf = 0.02, λ = 0.2, ηΔ = 0.1, ρ1 = 0.5,
ρ2 = 2.0, v = 9/11, α = 5/7
Environment: Ce = 0.5 Ns·mm⁻¹, Ke = 2 N·mm⁻¹, r0 = 165 mm,
xd = 39.917 mm, xe = 35 mm
Parameters
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 35Presenter Cong Phat Vo
Fig. 5.3 Desired force/position tracking response
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Three steps are required to complete the experiment:
1. identification of the environment;
2. obtaining the ARE solution using the environmental estimate (LQR solution);
3. the complete experiment for the position/force controller (IRL approximation).
The IRL approximation achieves near-optimal performance without knowledge of the environment dynamics.
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 36Presenter Cong Phat Vo
CONTENTS
1  Introduction
2  Modeling and platform setup based on the APAM configuration
3  Adaptive Gain ITSMC with TDE scheme
4  Adaptive Finite-Time Force Sensorless Control scheme
5  Model-free control for Optimal contact force in Unknown Environment
6  Optimized Human-Robot Interaction force control using IRL
7  Force Sensorless Reflecting Control for BHTS
8  Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 37Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Proposed control idea
1. Find the optimal parameters of the prescribed impedance model to assure motion tracking and to assist the human with minimum effort in performing the task in real time.
2. The stability of the closed-loop control is theoretically demonstrated.
3. The unknown human dynamics, as well as the optimal overall human-robot system efficiency, are verified in experimental trials.
1 2 3 4 5 6 7 8
20-May-21 | 38Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design (robot-specific inner loop and task-specific outer loop)

The control torque in task space combines proportional-derivative-integral feedback on the human-interaction error eh (gain Ks), a robust bounding term in ||Z||, and NN-based compensation of the unknown robot nonlinearities.

The unknown continuous dynamics are approximated by a two-layer NN, f̂(q) = Ŵ1ᵀ σ(Ŵ2ᵀ q), with robust update laws for Ŵ1 and Ŵ2 (gains F and G) that include damping terms proportional to k||·||Ŵ so the weight estimates remain bounded.

The prescribed impedance model, from the human torque τh to the model trajectory θm, is the second-order filter
    θm(s)/τh(s) = Kh / (Md s² + Bd s + Kd)

[Block diagram: human operator -> torque control law -> haptic interface based on APAM, with the prescribed impedance model, NN functional approximation, and the optimal law based on IRL generating θd from τh through the gain Kh; signals θd, θm, eh, ed, τ, τh, ξ.]
1 2 3 4 5 6 7 8
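The prescribed impedance model acts as a second-order filter from human torque to reference motion. A forward-Euler sketch using the parameter values as read from the results slide (Md = 1, Bd = 7.18, Kd = 15.76, Kh = 0.046; decimal placement restored, so treat them as indicative):

```python
# Prescribed impedance model as a filter from human torque to reference motion.
M_d, B_d, K_d, K_h = 1.0, 7.18, 15.76, 0.046

def impedance_response(tau_h, dt=1e-3):
    """Integrate M_d*th'' + B_d*th' + K_d*th = K_h*tau_h (forward Euler)."""
    th, th_dot = 0.0, 0.0
    out = []
    for tau in tau_h:
        th_ddot = (K_h * tau - B_d * th_dot - K_d * th) / M_d
        th_dot += dt * th_ddot
        th += dt * th_dot
        out.append(th)
    return out

resp = impedance_response([10.0] * 20000)   # constant 10-unit torque, 20 s
print(resp[-1])  # settles near the static gain K_h/K_d times the torque
```

The static deflection Kh·τh/Kd is what the IRL outer loop effectively tunes: softer (smaller Kd) models yield more motion per unit of human effort.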
20-May-21 | 39Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Robot-specific inner loop and task-specific outer loop.

The human transfer function [99] maps the human torque to the tracking error (purpose: τh -> ed); it is modeled as a first-order transfer function with gain kp and pole ke:
    G(s) = kp(s + 1)/(ke + s)

In state-space form, τ̇h = Ah τh + Bh ed, with Ah = −ke, Bh = ke kp, and ed = [ed, ėd, ëd]ᵀ.

A new augmented state X stacks the impedance-model error dynamics and the human model:
    Ẋ = A X + B ue,   ue = −K X

[Block diagram: as on the previous slide, with the human operator, torque control law, haptic interface based on APAM, prescribed impedance model, NN functional approximation, and optimal law based on IRL.]
IRL based on LQR solution
Control design
1 2 3 4 5 6 7 8
20-May-21 | 40Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design

A new augmented state:
    Ẋ = A X + B ue,   ue = −K X

The infinite-horizon quadratic cost (performance index) to minimize:
    Jm = ∫_t^∞ (edᵀ Qd ed + τhᵀ Qh τh + ueᵀ R ue) dτ

RL solution: solve the algebraic Riccati equation [83]
    0 = Aᵀ P + P A + Q − P B R⁻¹ Bᵀ P

An optimal feedback control:
    ue* = argmin over ue of V(t, X(t), ue) = −R⁻¹ Bᵀ P X

IRL solution: the value function satisfies
    V(X(t)) = ∫_t^∞ Xᵀ(Q + Kᵀ R K) X dτ = Xᵀ P X
where P is the real symmetric positive-definite solution of
    (A − B K)ᵀ P + P (A − B K) = −(Q + Kᵀ R K)

The IRL Bellman equation for a time interval ΔT > 0:
    V(X(t)) = ∫_t^{t+ΔT} Xᵀ(Q + Kᵀ R K) X dτ + V(X(t + ΔT))
which can be written as
    Xᵀ(t) P X(t) = ∫_t^{t+ΔT} Xᵀ(Q + Kᵀ R K) X dτ + Xᵀ(t+ΔT) P X(t+ΔT)

Proof of convergence: see Appendix 6.2.

1 2 3 4 5 6 7 8
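The Bellman identity above can be verified in closed form for a scalar closed loop ẋ = (A − BK)x: XᵀPX must equal the running cost over [t, t+ΔT] plus the cost-to-go at t+ΔT. All numbers below are illustrative:

```python
import math

# Scalar closed loop x_dot = (A - B*K) x with quadratic cost (Q + K*R*K) x^2.
A, B, K, Q, R = -1.0, 1.0, 0.5, 1.0, 2.0
Ac = A - B * K                     # closed-loop pole, must be negative
W = Q + K * R * K                  # effective state weight
P = -W / (2 * Ac)                  # scalar Lyapunov equation: 2*Ac*P = -W

x0, dT = 1.5, 0.3
x_dT = x0 * math.exp(Ac * dT)
# integral over [t, t+dT] of W * x(tau)^2 for x(tau) = x0 * exp(Ac*tau)
integral = W * x0 ** 2 * (math.exp(2 * Ac * dT) - 1) / (2 * Ac)
lhs = P * x0 ** 2
rhs = integral + P * x_dT ** 2
print(lhs, rhs)   # the two sides agree: the Bellman identity holds
```

This is exactly what the online IRL scheme exploits: the identity lets P be estimated from measured trajectory data without using A or B explicitly.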
20-May-21 | 41Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig. 6.1 Continuous-time linear policy iteration algorithm

Online implementation:
1. Initialization: start with an admissible control input ue⁰ = −K¹X (P⁰ = 0, i = 1, with A − BK¹ stable).
2. Policy evaluation: given the control policy ue^i, solve for the cost P^i by least squares from the off-policy Bellman equation
    Xᵀ(t) P^i X(t) = ∫_t^{t+ΔT} Xᵀ(Q + (K^i)ᵀ R K^i) X dτ + Xᵀ(t+ΔT) P^i X(t+ΔT)
3. Policy improvement: update the control policy
    ue^{i+1} = −R⁻¹ Bᵀ P^i X
4. If ||p^i − p^{i−1}|| ≤ ε, stop; otherwise set i -> i+1 and return to step 2.

1 2 3 4 5 6 7 8
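The loop in Fig. 6.1 can be reproduced for a scalar continuous-time plant, where each policy-evaluation step reduces to a scalar Lyapunov solve; for the illustrative weights below the iterates converge to the ARE solution P = 1 + √2:

```python
# Scalar continuous-time policy iteration converging to the ARE solution.
A, B, Q, R = 1.0, 1.0, 1.0, 1.0

K = 2.0                                    # admissible: A - B*K = -1 < 0
for i in range(50):
    Ac = A - B * K
    P = (Q + K * R * K) / (-2 * Ac)        # policy evaluation (scalar Lyapunov)
    K_new = P * B / R                      # policy improvement: K = R^-1 * B' * P
    if abs(K_new - K) < 1e-12:
        break
    K = K_new

# ARE: 2*A*P + Q - P^2*B^2/R = 0  ->  P^2 - 2P - 1 = 0  ->  P = 1 + sqrt(2)
print(P, K)
```

Each improved policy stays stabilizing, which is the property that lets the online version evaluate policies purely from measured trajectory data.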
20-May-21 | 42Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig. 6.2 The tracking error between the trajectory of the robot system and
the prescribed impedance model
Fig. 6.3 Reference trajectory versus the state of the robot systems
during and after learning
Parameters (simulation)
LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001,
A = −4.006, B = −0.002
RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1,
ε-decay = 0.99, K = 10, β = 0.1
Position control: Ks = 20, F = 10, G = 1, μ1 = μ2 = 5, k = 0.1,
Z = 100, κp = 20, κd = 10, and ke = 0.18 s
θd = 10sin(0.2πt), Md = 1 kg, Bd = 7.18 Ns·mm⁻¹,
Kd = 15.76 N·mm⁻¹, Kh = 0.046
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 43Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig. 6.4 The output trajectory and impedance model in the inner loop
Fig. 6.5 The desired trajectory and output trajectory in the outer loop
Fig. 6.6 Interaction force
Near-perfect performance of the human-robot cooperation.
A small deviation between the desired and actual trajectories.
The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller.
Experiment
Validation results
Discussion
1 2 3 4 5 6 7 8
20-May-21 | 44Presenter Cong Phat Vo
CONTENTS
1  Introduction
2  Modeling and platform setup based on the APAM configuration
3  Adaptive Gain ITSMC with TDE scheme
4  Adaptive Finite-Time Force Sensorless Control scheme
5  Model-free control for Optimal contact force in Unknown Environment
6  Optimized Human-Robot Interaction force control using IRL
7  Force Sensorless Reflecting Control for BHTS
8  Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 45Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Proposed control idea
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed.
2. The stability and finite-time convergence characteristic of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified in different working conditions.
1 2 3 4 5 6 7 8
20-May-21 | 46Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
The modified FNTSM surface is designed for each side (i = m for the master, i = s for the slave) as a fast nonsingular terminal sliding surface si built from the tracking error and its derivative.

The torque control design, FNTSMC style: the control torque combines the nominal model terms Mi, Ci, Ki evaluated along the reference motion θir, the estimated external torque τ̂id, a linear term ρ1i si, and a signed-power switching term ρ2i |si|^[νi] sign(si).

The external torque estimation: the adaptive force observer (AFOB) reconstructs τ̂id on each side from the measured motion and the applied torque through an auxiliary first-order observer state.

An adaptive FSOB gain: the observer switching gain κi follows a piecewise adaptation law that depends on whether |si| has reached the boundary layer εi, so the gain grows only while it is needed.

[Block diagram: bilateral teleoperation. Operator -> master controller -> APAM master system with its torque estimator; the communication channel exchanges θm, θs and the torque estimates τ̂md, τ̂sd through an impedance filter; slave controller -> APAM slave system interacting with the environment.]
Control design
1 2 3 4 5 6 7 8
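The external-torque estimation idea can be illustrated with a first-order observer: the estimate is driven toward the residual between the measured and model-predicted dynamics, converging for slowly varying torques. The scalar model and gain are illustrative, and this sketch uses the joint acceleration directly for clarity, whereas the thesis observers (FSOB/AFOB) avoid acceleration measurement:

```python
# First-order torque observer sketch: tau_hat converges to a constant torque.
M, L, dt = 0.5, 40.0, 1e-3      # inertia, observer gain, sample time (illustrative)
tau_ext = 2.0                    # unknown constant external torque
u = 0.0                          # control torque held at zero

q_dot, tau_hat = 0.0, 0.0
for _ in range(2000):
    q_ddot = (u + tau_ext) / M               # true joint dynamics
    # observer: residual between inertial torque and commanded torque
    tau_hat += dt * L * (M * q_ddot - u - tau_hat)
    q_dot += dt * q_ddot

print(tau_hat)   # approaches tau_ext = 2.0
```

The gain L sets the estimation bandwidth: larger L tracks faster torque changes but amplifies measurement noise, which is the trade-off the adaptive FSOB gain manages online.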
20-May-21 | 47Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig. 7.1 Torque estimation performances
Parameters
Proposed control: λi = 8, pi = 5, qi = 7, ςi = 10⁻⁴, νi = 9/11,
εi = 0.1, ρ1i = 10, ρ2i = 2
PID control: KP = 3.8, KI = 2.5, KD = 0.15
Observers: for NDO, Lh = Le = 10; for RTOB, β = 350 rad/s;
for AFOB, κi(0) = 0.1, λi = 10, ηi = 0, m = 0.1, a = 10
RMSEs associated with the NDO, RTOB, and proposed algorithm
are calculated as 0.211, 0.192, and 0.076, respectively.
Scenarios: - The free-space situation: θd (t) = 5π(1 − 0.18t)sin(0.72πt)
- The contact and recovery situation: soft environment Ke = 500 N·mm⁻¹, Be = 4 Ns·mm⁻¹;
stiff environment Ke = 2800 N·mm⁻¹, Be = 20 Ns·mm⁻¹
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 48Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig. 7.2 Position tracking performances
Fig. 7.3 Position tracking error performances
Fig. 7.4 Transparency performance in "soft" contact
Fig. 7.5 Transparency error in soft contact
Fig. 7.6 Transparency performance in "stiff" contact
Fig. 7.7 Transparency error in stiff contact
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 49Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Finite-time convergence of the tracking errors.
Eliminating the uncertainties and the noise effect in the obtained force sensing via the new AFOB.
Robust performance against uncertainties and disturbances in the system dynamics.
Stability of the overall system and the transparency performance can be attained simultaneously.
Fast transient response.
Summary
Great transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and the different working conditions.
1 2 3 4 5 6 7 8
20-May-21 | 50Presenter Cong Phat Vo
CONTENTS
1  Introduction
2  Modeling and platform setup based on the APAM configuration
3  Adaptive Gain ITSMC with TDE scheme
4  Adaptive Finite-Time Force Sensorless Control scheme
5  Model-free control for Optimal contact force in Unknown Environment
6  Optimized Human-Robot Interaction force control using IRL
7  Force Sensorless Reflecting Control for BHTS
8  Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 51Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Verified an effective position tracking controller via the AITSMC with TDE technique under
various working conditions (Its uncertainties and disturbance are handled)
Validated the significant effectiveness of finite-time force tracking problems based on the
FITSMC combined with a new adaptive gain and a friction-free disturbance (FSOB)
Based on the online learning ability of the RL approximation method the model-free controller
for optimal contact force ensures great position and force tracking performance
Optimized a prescribed impedance model parameters by the IRL approach to provide little
operator information in yielding highly effective position control purpose
Achieved good transparency performance with both force feedback and position tracking
simultaneously by the new adaptive force estimation combined the separately FNTSMC schemes
Conclutions
1 2 3 4 5 6 7 8
20-May-21 | 52Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Future works
To improve the control performance in joint-space the robot system model can be
improved by robust identification approaches
Effectiveness of the proposed algorithm can be further increased by developing an automatic
tune method for both the estimation and control gains
Apply optimized strategies on both master and slave sides Note also that the computational
complexity should be reduced by the data size
Based on the developed prototype in this thesis it is possible to develop a high-quality
commercial teleoperation device in the rehabilitation field with a safer solution
1 2 3 4 5 6 7 8
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
1 2 3 4 5 6 7 8
20-May-21 | 20Presenter Cong Phat Vo
Parameters
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Experimental validation
for ITSMC η1 = 3 η2 = 15 η3 = 25 κ1 = κ3 = 10
κ2 = κ4 = 2 γ1 = γ2 = 57 k1 = 2 k2 = 55 α = 911
for adaptative gain μ = 10 δ = 005 σ = 01
for case study 3 KP = 15 KI = 18 KD = 005
for other case KP = 35 KI = 28 KD = 01
Proposed control
PID control
Fig 31 Performance response in case study 1
Fig 32 Performance response in case study 2
Case study 1 (no load) xd (mm) = 8sin(04πt)
Case study 2 (no load) xd (mm) = 8sin(πt)
Case study 3 (no load) A multistep trajectory
Case study 4 (with load) xd (mm) = 8sin(πt)
in turn 2-kg-load and 7-kg-load
1 2 3 4 5 6 7 8
20-May-21 | 21Presenter Cong Phat Vo
Fig 33 Performance response in case study 3
Fig 34 Performance response in case study 4
Controller PID ITSMC Proposed
Sine 02 Hz ndash 8 mmL2[e] 04102 01737 01574
L2[u] 09961 10270 09602
Sine 1 Hz ndash 8 mmL2[e] 11845 04996 03788
L2[u] 11284 11152 10900
MultistepL2[e] 08587 06846 06368
L2[u] 07670 07333 07247
Sine 1 Hz ndash 8 mm
with 2-kg-load
L2[e] 12578 08314 04779
L2[u] 11414 11506 11287
Sine 1 Hz ndash 8 mm
with 7-kg-load
L2[e] 15937 10827 05067
L2[u] 11613 11729 11584
Table 31 A quantitative comparison between the tracking performances
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Case study 1 (no load) xd (mm) = 8sin(04πt)
Case study 2 (no load) xd (mm) = 8sin(πt)
Case study 3 (no load) A multistep trajectory
Case study 4 (with load) xd (mm) = 8sin(πt)
in turn 2-kg-load and 7-kg-load
Experimental validation
1 2 3 4 5 6 7 8
20-May-21 | 22Presenter Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Summary
Robust performance against uncertainties and disturbances1 2d dx x
Time-delay
Estimator
The PAM system
Control law
Robust
control signal
Adaptation
law
Nominal
control signal
Transfer
Coordinate Integral Terminal
Sliding Surface
Terminal
Sliding Surface
Reference
Trajectory
d
The proposed
control approach
u
1 2x x
su
3ˆ
1 2x x
s
1
2
nu
Finite-time convergence of the tracking errors
Reducing the chattering phenomenon
Eliminating the reaching phase
Fast transient response
Next works will be directed to the control using the force properties
1 2 3 4 5 6 7 8
20-May-21 | 23Presenter Cong Phat Vo
CONTENTS
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
5
6
C
H
A
P
T
E
R
1
7
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Adaptive Finite-Time Force Sensorless Control scheme4
1 2 3 4 5 6 7 8
20-May-21 | 24Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
3 Verified the reliability and superiority of the proposed control
demonstrated in different challenging work conditions
2
The FITSMC guarantees the finite convergence new adaptive gain and TDE
are deployed to handle online (uncertainties and chattering phenomenon)
The proposed FSOB scheme obtains
force information eliminated noise and
static friction
1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 25Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
( )[ ]
1 20
( ) ( ( )) = + +t
v
f f fe e sign e d
( ) ( )[ 1]5
5
4
ˆ
+
minus= minus
v
signsign
1 1 [ ]
1 2
[ ]
4 5
ˆ( ) ( ) ( )
ˆ ( )
v
c d f f
v
u t d t K e sign e
sign
minus minus= + + +
+ +
1ˆ ˆ ˆ( ) ( ) ( ) ( )minus minus = minus minus minusd d c e dd t d t t u t t K t t
The FITSM surface is defined
The control law can be rewritten
2
2
ˆ ˆ ˆ2ˆ
+ += e e e e e e
e
e
An adaptive switching gain law An estimated lumped disturbance Force observer based-LeD [40]
APAM system
Force based
Time-delay
Estimator
Integral Terminal
Sliding SurfaceControl law
Adaptive law
Environment
Configuration
Reference
Trajectoryu
d
ˆe
d
+
+
The proposed control
approach
2k
Plant + Environment
Force Observer
e
Proof See in Appendix 41 ndash 43
Control design
1 2 3 4 5 6 7 8
20-May-21 | 26Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
for ITSMC η1 = 3 η2 = 15 η3 = 25 κ1 = κ3 = 10
κ2 = κ4 = 2 γ1 = γ2 = 57 k1 = 2 k2 = 55 v = 911
for adaptative gain μ = 10 δ = 005 σ = 01
for case study 3 KP = 15 KI = 18 KD = 005
for other case KP = 35 KI = 28 KD = 01
Proposed control
PID control
Fig 42 Performance response in Scenario 2
Scenario 1 τd (N) = 75sin(02πt- π2)+625
Scenario 2 τd (N) = 75sin(08πt- π2)+625
Scenario 3 A multistep trajectory with a LPFrsquos bandwidth
of 02(rads)
Observersfor FSOB ω = 500 rads ξ = 0001 α1 = 20 α2 = 12
α3 = 5 α4 = 15
Fig 42 Performance response in Scenario 1
Parameters
Experimental validation
1 2 3 4 5 6 7 8
20-May-21 | 27Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Fig 43 Performance response in Scenario 3
Fig 44 Control gain
Controller PID ISMC FITSMC-TDE Proposed
Sine 01 Hz 21050 12152 09440 05158
Sine 04 Hz 32054 18045 20632 19807
Multistep 26508 23548 20618 17685
Table 41 A RMSE comparison between the tracking error performances
Experimental validation
- Improved the control performances (fast response high
accuracy and robustness)
- Guaranteed the finite convergence
- Cancelling the lumped uncertainties via TDE solution and
the switching gain in the FITSMC online
Discussion
1 2 3 4 5 6 7 8
20-May-21 | 28Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
4
6
C
H
A
P
T
E
R
1
7
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Model-free control for Optimal contact force in Unknown Environment5
1 2 3 4 5 6 7 8
20-May-21 | 29Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
3 Verified effective for robot interaction control in
unknown environments
2 The stability of the closed-loop control is theoretically demonstrated
by the Lyapunov approach
The RL method is used to learn on-line
the optimal desired force1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 30Presenter Cong Phat Vo
( ) ( ) ( ( ))= minus + minuse e e eF t C x t K x x t
( ( )) ( )r dZ e t F t= [ ]
1 2
( ) ( ) ( )
( ) ( )
= + +
minus minus minus +
r r r
v
e
Me t Ce t Ke t
sign F t
The force control
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The environment is modeled as
a spring-damper system [81]
The optimal impedance model The position control in task space
( ) ( ) ( )
= + f Pf f If f
t
x s K e t K e d
Task-specific Outer-loopRobot-specific Inner-loop
Precise position
control
Prescribed
Impedance Model
Environment
Configuration
APAM system
xx
d
Force controlx
rx
fe
r
Fe
Fd
ef
-
+ -+ -
+
RL-based
approximation
ϕ
Control design
1 2 3 4 5 6 7 8
( ) ( ) ( )r r de t Ae t BF t= +
20-May-21 | 31Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
LQR solution
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
( ) ( )=d e rF t e t
The discounted cost function
( ) ( )( ) ( ) ( ) ( ) ( ) ( )
= minus +T T
s r d r r d dt
J e t F t e Se F RF d
The Algebraic Riccati equation
0+ + + =T T
eA PA S P A PB
The feedback control gain
( )1
1 minus
minus= minus +T T
e B PB R B PA
IRL approximationRL solution
Implementation procedure
1 2 3 4 5 6 7 8
offline soliton
20-May-21 | 32Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
LQR solution RL solution
impossible for all time
The Hamilton-Jacobi-Bellman equation
( ) ( ) ( )min ( ) ( ) ( ) 0
minus minus + minus =
d
t
s rtF
V e t r t r e dt
The new Algebraic Riccati equation
( )( ) ( )
2 ( ) ( ) ( ) ( ) 0
T T
r r
T T
r d d d
e t A P PA S P e t
e t PBF t F t RF t
+ + minus
+ + =
The optimal desired force 1
22 21( ) ( )d rF t H H e tminus= minus
( ) ( ) ( ) ( ) ( )= = T
r d s rQ e t F t V e t X HX
IRL approximation
The discounted cost function
( ) ( )
( )
( )( ) ( ) ( )
where ( ) ( ) ( ) ( ) ( ) ( )
t
s r r dt
T T
r d r r d d
V e t r e F e d
r e F e Se F RF
minus minus=
= +
Implementation procedure
1 2 3 4 5 6 7 8
( ) ( )=d e rF t e t
online soliton using the off-policy RL to
find the solution of the given new ARE
20-May-21 | 33Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
LQR solution RL solution
minimize temporal difference error
The approximator of the Q-value function
( )( ) ( )1
ˆ N
r d i r d i
i
Q e F e F =
=
The parameter update law
11( ) exp
2
T
i r d r i d i i r i d ie F e c F c e c F c minus = minus minus minus minus minus
( )
( ) ( ) ( ) ( )
ˆ ( ) ( ) ( )1( ) ( )
( )
=
= minus +
r d
t t t t
Q e t F t tt t
t t
IRL approximation
Q-function update law based on IRL
( )
( )
( )( ) ( ) ( )
( ) ( )
t tt
r dt
t
r d
Q e t F t r e d
e Q e t t F t t
+
minus minus
minus
=
+ + +
( ) ( ) ( ) ( ) ( )T T
r r d dr t e t Se t F t RF t= +
The reward function of the utility function
Implementation procedure
1 2 3 4 5 6 7 8
1
( )( )
( )
i r di r d N
i r di
e Fe F
e F
=
=
( ) ( )=d e rF t e t
20-May-21 | 34Presenter Cong Phat Vo
Fig 51 RL in unknown environments during the
learning process
Fig 52 Learning curves
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Parameters
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = -4.006, B = -0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position/Force control: KPf = 0.001, KIf = 0.02, λ = 0.2, ηΔ = 0.1, ρ1 = 0.5, ρ2 = 20, v = 9/11, α = 5/7
- Environment: Ce = 0.5 N·s·mm^-1, Ke = 2 N·mm^-1, r0 = 16.5 mm, xd = 39.917 mm, xe = 35 mm
Validation results
Fig. 5.3 Desired force/position tracking response
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Three steps are required to complete the experiment:
1. Identify the environment.
2. Obtain the ARE solution using the environmental estimate.
3. Run the complete experiment with the position/force controller.
IRL approximation (vs. the LQR solution): achieves near-optimal performance without knowledge of the environment dynamics.
Validation results
CONTENTS
CHAPTER 1  Introduction
CHAPTER 2  Modeling and platform setup based on the APAM configuration
CHAPTER 3  Adaptive Gain ITSMC with TDE scheme
CHAPTER 4  Adaptive Finite-Time Force Sensorless Control scheme
CHAPTER 5  Model-free control for Optimal contact force in Unknown Environment
CHAPTER 6  Optimized Human-Robot Interaction force control using IRL
CHAPTER 7  Force Sensorless Reflecting Control for BHTS
CHAPTER 8  Conclusion and Future works
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Proposed control idea
1. Find the optimal parameters of the impedance model to assure motion tracking and also assist the human with minimum effort to perform the task in real time.
2. The stability of the closed-loop control is theoretically demonstrated.
3. The unknown human dynamics, as well as the optimal overall human-robot system efficiency, are verified in experimental trials.
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Robot-specific Inner-loop / Task-specific Outer-loop
The control torque in task space combines a PID-type tracking term Ks( ėh + eh + ∫0^t eh dτ ), the NN estimate of the unknown robot dynamics, and a robust bounding term in ||Ẑ||.
An approximation of the continuous function: φ̂(q) = Ŵ^T σ(q), with adaptive update laws for the NN weight estimates Ŵ (including k||·|| σ-modification terms that keep the estimates bounded).
The prescribed impedance model
θm / τh = Kh / (Md s^2 + Cd s + Kd)
[Block diagram: Human Operator (τh), Prescribed Impedance Model (θd), Torque control law, APAM-based haptic interface (θm), NN functional approximation, and IRL-based Optimal Law, with signals eh, ed, and ξ closing the loop through the gain Kh.]
Control design
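As a quick check of the prescribed impedance model θm/τh = Kh/(Md s^2 + Cd s + Kd), the sketch below integrates it with semi-implicit Euler. The parameter values (Md = 1 kg, Bd = 7.18, Kd = 15.76, Kh = 0.046) follow the simulation-parameter slide of this chapter; the constant-torque input and time step are illustrative assumptions.

```python
def impedance_response(tau_h, Md=1.0, Bd=7.18, Kd=15.76, Kh=0.046,
                       dt=1e-3, T=10.0):
    """Euler simulation of Md*edd + Bd*ed + Kd*e = Kh*tau_h.

    For a constant human torque, the deflection settles at the static
    value Kh*tau_h/Kd, which is what the prescribed impedance model
    asks the robot trajectory to track.
    """
    e, ed = 0.0, 0.0
    for _ in range(int(T / dt)):
        edd = (Kh * tau_h - Bd * ed - Kd * e) / Md
        ed += edd * dt   # semi-implicit Euler: velocity first,
        e += ed * dt     # then position with the updated velocity
    return e

e_ss = impedance_response(tau_h=10.0)
```

The settled value matches the static gain Kh/Kd of the transfer function, confirming the reconstruction of the model is self-consistent.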
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Robot-specific Inner-loop / Task-specific Outer-loop
The human transfer function [99] (purpose: τh → ed)
G(s) = (κp + κd s) / (ke s + 1)
or, in state-space form,
τ̇h = Ah τh + Bh ēd,  with Ah = -ke^{-1},  Bh = ke^{-1} [κp, κd],  ēd = [ed, ėd]^T
A new augmented state X stacks the impedance-model error dynamics and the human state:
Ẋ = A X + B ue,  with A = [[Aq, Bq], [Bh, Ah]] in block form, and the feedback ue = -K X
IRL based on LQR solution
Control design
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design
The performance index (an infinite-horizon quadratic cost) to minimize:
Jm = ∫t^∞ ( ed^T Qd ed + τh^T Qh τh + ue^T R ue ) dτ
A new augmented state: Ẋ = A X + B ue, with the feedback ue = -K X.
RL solution: solve the algebraic Riccati equation [83]
0 = A^T P + P A + Q - P B R^{-1} B^T P
An optimal feedback control
ue = arg min_{ue} V(t, X(t), ue) = -R^{-1} B^T P X
The value function
V(X(t)) = ∫t^∞ X^T ( Q + K^T R K ) X dτ = X^T P X
where P is the real symmetric positive-definite solution of
(A - BK)^T P + P (A - BK) = -( Q + K^T R K )
IRL solution: the IRL Bellman equation for a time interval ΔT > 0,
V(X(t)) = ∫t^{t+ΔT} X^T ( Q + K^T R K ) X dτ + V(X(t + ΔT))
which can be written as
X^T(t) P X(t) = ∫t^{t+ΔT} X^T ( Q + K^T R K ) X dτ + X^T(t+ΔT) P X(t+ΔT)
Proof of convergence: see Appendix 6.2.
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Online implementation
Fig. 6.1 Continuous-time linear policy iteration algorithm
1. Start with an admissible (stabilizing) control input ue^0 = -K^1 X, with P^0 = 0 and i = 1.
2. Policy evaluation: given a control policy u^i, find P^i using the off-policy Bellman equation
X^T(t) P^i X(t) = ∫t^{t+ΔT} X^T ( Q + (K^i)^T R K^i ) X dτ + X^T(t+ΔT) P^i X(t+ΔT)
solving for the cost parameters p^i by least squares over N sampled intervals.
3. Policy improvement: update the control policy u^{i+1} = -R^{-1} B^T P^i X.
Repeat (i → i + 1) until ||p^i - p^{i-1}|| falls below a threshold.
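The policy-iteration loop of Fig. 6.1 can be sketched for a scalar system, assuming the model (a, b) is known for clarity; the IRL version in the slide replaces this Lyapunov solve with the data-driven least-squares step. All numeric values here are illustrative, not the dissertation's.

```python
def policy_iteration_lqr(a, b, q, r, k0, iters=20):
    """Kleinman policy iteration for the scalar CT LQR x' = a*x + b*u.

    Each sweep: (1) policy evaluation, solving the Lyapunov equation
    2*(a - b*k)*p + q + r*k**2 = 0 for the cost p of policy u = -k*x;
    (2) policy improvement, k <- b*p/r.
    """
    k = k0
    for _ in range(iters):
        assert a - b * k < 0, "policy must be stabilizing"
        p = (q + r * k * k) / (-2.0 * (a - b * k))  # policy evaluation
        k = b * p / r                               # policy improvement
    return p, k

p, k = policy_iteration_lqr(a=1.0, b=1.0, q=1.0, r=1.0, k0=2.0)
# p should satisfy the scalar ARE: 2*a*p - (b**2/r)*p**2 + q = 0,
# whose positive root for these values is 1 + sqrt(2)
residual = 2.0 * 1.0 * p - p * p + 1.0
```

Each sweep is a Newton step on the Riccati equation, which is why the iteration in Fig. 6.1 converges in a handful of passes once the initial policy is stabilizing.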
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig. 6.2 The tracking error between the trajectory of the robot system and the prescribed impedance model
Fig. 6.3 Reference trajectory versus the state of the robot system during and after learning
Parameters
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = -4.006, B = -0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: Ks = 20, F = 10, G = 1, μ1 = μ2 = 5, k = 0.1, Z̄ = 100, κp = 20, κd = 10, and ke = 0.18 s
- Impedance model: θd = 10 sin(0.2πt), Md = 1 kg, Bd = 7.18 N·s·mm^-1, Kd = 15.76 N·mm^-1, Kh = 0.046
Simulation
Validation results
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig. 6.4 The output trajectory and impedance model in the inner loop
Fig. 6.5 The desired trajectory and output trajectory in the outer loop
Fig. 6.6 Interaction force
- Excellent performance of the human-robot cooperation
- A slight deviation between the desired and actual trajectories
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller
Experiment
Validation results
Discussion
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Proposed control idea
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB.
2. The stability and finite-time convergence characteristics of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified in different working conditions.
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
The external torque estimation: the AFOB reconstructs the external torque τ̂id (i = m, s) from auxiliary observer states driven by the measured motion, with no torque sensor.
An adaptive FSOB gain: the observer gain κi grows at a rate set by |si| while the sliding variable is large and decays inside a boundary layer, which avoids gain overestimation.
The torque control design by FNTSMC style: a nominal model-based term (M̄i, Ci, Ki), compensation by the estimate τ̂id, and fast finite-time reaching terms -ρ1i |si|^{νi} sign(si) - ρ2i si.
The modified FNTSM surface: si combines the tracking error with a fractional-power term, giving fast, nonsingular finite-time convergence.
The modified FNTSM surface design
Master Slave
Environment
APAM Slave
system
Slave
Controller
Estimator
Oper
ator
Master
Controller
APAM Master
system
Estimator
Impendence
filter
+
++
-
+
-
+
+md
m s
s
sd
ˆsdˆmd
mf
m
d
Control design
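The AFOB equations above did not survive extraction intact, so as a hedged stand-in for the sensorless idea they implement, here is the classical momentum-residual observer for a single joint: it recovers the external torque from motion and commanded torque alone. The values of M, K, and tau_ext are illustrative assumptions, not the dissertation's parameters.

```python
def simulate_momentum_observer(M=1.0, K=50.0, tau_ext=0.3, u=0.0,
                               dt=1e-3, T=1.0):
    """Momentum-residual observer for a 1-DOF joint M*qdd = u + tau_ext.

    The residual r = K*(M*qd - integral(u + r)) obeys
    r' = K*(tau_ext - r), so it converges to the external torque as a
    first-order lag with bandwidth K, without a torque sensor.
    """
    qd = 0.0      # joint velocity
    integ = 0.0   # running integral of (u + r)
    r = 0.0       # torque residual (the estimate)
    for _ in range(int(T / dt)):
        qdd = (u + tau_ext) / M   # true plant acceleration
        qd += qdd * dt
        integ += (u + r) * dt     # uses the previous residual
        r = K * (M * qd - integ)  # residual update
    return r

tau_hat = simulate_momentum_observer()
```

After one second the residual has settled on the applied external torque, illustrating why observer-based force sensing can replace a physical torque sensor in the bilateral scheme.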
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig. 7.1 Torque estimation performances
Parameters
- Proposed control: λi = 8, pi = 5, qi = 7, ςi = 10^-4, νi = 9/11, εi = 0.1, ρ1i = 10, ρ2i = 2
- PID control: KP = 38, KI = 25, KD = 0.15
- Observers: for the NDO, Lh = Le = 10; for the RTOB, β = 350 rad/s; for the AFOB, κi(0) = 0.1, λi = 10, ηi = 0, m = 0.1, a = 10
RMSEs associated with the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.
Scenarios
- The free-space situation: θd(t) = 5π(1 - 0.18t) sin(0.72πt)
- The contact-and-recovery situation: soft environment, Ke = 500 N·mm^-1, Be = 4 N·s·mm^-1; stiff environment, Ke = 2800 N·mm^-1, Be = 20 N·s·mm^-1
Validation results
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig. 7.2 Position tracking performances
Fig. 7.3 Position tracking error performances
Fig. 7.4 Transparency performance in "soft" contact
Fig. 7.5 Transparency error in "soft" contact
Fig. 7.6 Transparency performance in "stiff" contact
Fig. 7.7 Transparency error in "stiff" contact
Validation results
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Summary
- Finite-time convergence of the tracking errors
- Elimination of the uncertainties and the noise effect in the obtained force sensing via the new AFOB
- Robust performance against uncertainties and disturbances in the system dynamics
- Stability of the overall system and the transparency performance attained simultaneously
- Fast transient response
Great transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and across different working conditions.
8 CONCLUSION AND FUTURE WORKS
Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions (uncertainties and disturbances are handled).
- Validated the significant effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding highly effective position control.
- Achieved good transparency performance, with both force feedback and position tracking attained simultaneously, by the new adaptive force estimation combined with separate FNTSMC schemes.
8 CONCLUSION AND FUTURE WORKS
Future works
- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both the master and slave sides; the computational complexity should also be reduced with the data size.
- Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device for the rehabilitation field with a safer solution.
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter: Cong Phat Vo
Department of Mechanical Engineering, University of Ulsan, Korea
Thank you for attending!
Fig. 3.3 Performance response in case study 3
Fig. 3.4 Performance response in case study 4
Table 3.1 A quantitative comparison between the tracking performances

Trajectory                  | Metric | PID    | ITSMC  | Proposed
Sine 0.2 Hz, 8 mm           | L2[e]  | 0.4102 | 0.1737 | 0.1574
                            | L2[u]  | 0.9961 | 1.0270 | 0.9602
Sine 1 Hz, 8 mm             | L2[e]  | 1.1845 | 0.4996 | 0.3788
                            | L2[u]  | 1.1284 | 1.1152 | 1.0900
Multistep                   | L2[e]  | 0.8587 | 0.6846 | 0.6368
                            | L2[u]  | 0.7670 | 0.7333 | 0.7247
Sine 1 Hz, 8 mm, 2-kg load  | L2[e]  | 1.2578 | 0.8314 | 0.4779
                            | L2[u]  | 1.1414 | 1.1506 | 1.1287
Sine 1 Hz, 8 mm, 7-kg load  | L2[e]  | 1.5937 | 1.0827 | 0.5067
                            | L2[u]  | 1.1613 | 1.1729 | 1.1584
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Case study 1 (no load): xd (mm) = 8 sin(0.4πt)
Case study 2 (no load): xd (mm) = 8 sin(πt)
Case study 3 (no load): a multistep trajectory
Case study 4 (with load): xd (mm) = 8 sin(πt), in turn with a 2-kg load and a 7-kg load
Experimental validation
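The L2[e] and L2[u] indices in Table 3.1 are root-mean-square-type norms of the tracking error and control effort. The sketch below assumes per-sample RMS normalization; the thesis may normalize differently, so treat the exact scaling as an assumption.

```python
import math

def l2_norm(samples):
    """RMS-style L2 index used to score a tracking-error or control
    signal; per-sample normalization is an assumption of this sketch."""
    return math.sqrt(sum(x * x for x in samples) / len(samples))

# sanity check: a pure sine of amplitude A has RMS A/sqrt(2)
N = 10000
sine = [8.0 * math.sin(2 * math.pi * i / N) for i in range(N)]
rms = l2_norm(sine)
```

With such a norm, a lower L2[e] at a similar L2[u] (as the Proposed column shows) means better tracking for comparable control effort.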
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Summary
- Robust performance against uncertainties and disturbances
- Finite-time convergence of the tracking errors
- Reducing the chattering phenomenon
- Eliminating the reaching phase
- Fast transient response
[Block diagram: Reference Trajectory (xd) → coordinate transfer → integral terminal sliding surface → control law (nominal + robust control signals, with an adaptation law) → the PAM system, with a time-delay estimator feeding the lumped-disturbance estimate back into the control law.]
The next works are directed to control using the force properties.
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Proposed control idea
1. The proposed FSOB scheme obtains force information with noise and static friction eliminated.
2. The FITSMC guarantees finite-time convergence; a new adaptive gain and the TDE are deployed to handle uncertainties and the chattering phenomenon online.
3. Verified the reliability and superiority of the proposed control in different challenging working conditions.
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
The FITSM surface is defined as
s(t) = ef(t) + ∫0^t [ η1 ef(τ) + η2 |ef(τ)|^v sign(ef(τ)) ] dτ
An adaptive switching gain law tunes the robust gain online from the surface magnitude, avoiding gain overestimation.
An estimated lumped disturbance: the force observer based on LeD [40] supplies d̂(t) through the time-delay relation
d̂(t) = d̂(t - td) + uc(t - td) - Ke τ̂̇e(t - td)
The control law can then be rewritten as a nominal model term plus the estimate d̂(t), finite-time feedback terms in ef and |ef|^v sign(ef), and the adaptive switching term.
Time-delay
Estimator
Integral Terminal
Sliding SurfaceControl law
Adaptive law
Environment
Configuration
Reference
Trajectoryu
d
ˆe
d
+
+
The proposed control
approach
2k
Plant + Environment
Force Observer
e
Proof: see Appendices 4.1-4.3.
Control design
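The TDE idea used above, recovering the lumped disturbance from one-sample-old signals via the nominal model, can be sketched as follows. The names u_hist/acc_hist, the nominal mass m_bar, the disturbance profile, and the sample time are illustrative assumptions for this sketch.

```python
def tde_estimate(u_hist, acc_hist, m_bar=1.0):
    """One-sample time-delay estimate of the lumped disturbance.

    With the nominal model m_bar*xdd = u + d, the disturbance one
    sample ago is recovered algebraically:
    d_hat(t) = m_bar*xdd(t - L) - u(t - L).
    """
    return m_bar * acc_hist[-1] - u_hist[-1]

# toy run: true plant xdd = (u + d)/m with a slowly varying disturbance
m_bar, dt = 1.0, 1e-3
d_true = lambda t: 0.5 + 0.1 * t            # slowly varying disturbance
u_prev = 2.0                                # control one sample ago
acc_prev = (u_prev + d_true(0.0)) / m_bar   # measured acceleration
d_hat = tde_estimate([u_prev], [acc_prev], m_bar)
err = abs(d_hat - d_true(dt))               # lag error is O(dt)
```

Because the estimate lags by one sample, the residual error scales with the sampling period, which is why TDE pairs well with the adaptive switching gain that absorbs the leftover mismatch.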
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Parameters
- Proposed control: for the ITSMC, η1 = 3, η2 = 15, η3 = 25, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1 = γ2 = 5/7, k1 = 2, k2 = 55, v = 9/11; for the adaptive gain, μ = 10, δ = 0.05, σ = 0.1
- PID control: for case study 3, KP = 15, KI = 18, KD = 0.05; for the other cases, KP = 35, KI = 28, KD = 0.1
- Observers: for the FSOB, ω = 500 rad/s, ξ = 0.001, α1 = 20, α2 = 12, α3 = 5, α4 = 15
Scenarios
- Scenario 1: τd (N) = 7.5 sin(0.2πt - π/2) + 6.25
- Scenario 2: τd (N) = 7.5 sin(0.8πt - π/2) + 6.25
- Scenario 3: a multistep trajectory with an LPF bandwidth of 0.2 rad/s
Fig. 4.1 Performance response in Scenario 1
Fig. 4.2 Performance response in Scenario 2
Experimental validation
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Fig. 4.3 Performance response in Scenario 3
Fig. 4.4 Control gain
Table 4.1 An RMSE comparison between the tracking error performances

Trajectory   | PID    | ISMC   | FITSMC-TDE | Proposed
Sine 0.1 Hz  | 2.1050 | 1.2152 | 0.9440     | 0.5158
Sine 0.4 Hz  | 3.2054 | 1.8045 | 2.0632     | 1.9807
Multistep    | 2.6508 | 2.3548 | 2.0618     | 1.7685
Experimental validation
Discussion
- Improved control performance (fast response, high accuracy, and robustness)
- Guaranteed finite-time convergence
- Cancelled the lumped uncertainties online via the TDE solution and the switching gain in the FITSMC
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Proposed control idea
1. The RL method is used to learn the optimal desired force online.
2. The stability of the closed-loop control is theoretically demonstrated by the Lyapunov approach.
3. Verified effectiveness for robot interaction control in unknown environments.
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The environment is modeled as a spring-damper system [81]:
Fe(t) = -Ce ẋ(t) + Ke (xe - x(t))
The optimal impedance model / the force control: the impedance error er follows
M ër(t) + C ėr(t) + K er(t) = (robust fractional-power reaching terms) + Fe(t)
with the RL-based approximation supplying the desired force Fd(t) = Z(er(t)).
The position control in task space:
xf(s) = KPf ef(t) + KIf ∫0^t ef(τ) dτ
Robot-specific Inner-loop / Task-specific Outer-loop
[Block diagram: Reference Trajectory → precise position control → APAM system ↔ environment configuration, with the Prescribed Impedance Model and the RL-based approximation generating the desired force Fd from Fe and the errors ef, er.]
Control design
The impedance error dynamics: ėr(t) = A er(t) + B Fd(t)
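The spring-damper environment model can be sketched directly. The sign convention follows the reconstructed equation Fe = -Ce*ẋ + Ke*(xe - x), and the parameter values (Ce = 0.5, Ke = 2, xe = 35 mm) come from the chapter's parameter slide; treat the sign convention itself as part of the reconstruction.

```python
def contact_force(x, xdot, Ce=0.5, Ke=2.0, xe=35.0):
    """Spring-damper environment model Fe = -Ce*xdot + Ke*(xe - x);
    units follow the slide (N, mm, s). The sign convention mirrors the
    reconstructed slide equation."""
    return -Ce * xdot + Ke * (xe - x)

# at the undeflected surface with no motion, the contact force is zero
f_rest = contact_force(x=35.0, xdot=0.0)
# a 1 mm static deflection past the surface produces Ke * 1 mm of force
f_press = contact_force(x=36.0, xdot=0.0)
```

This is the unknown plant that the RL scheme interacts with: the controller never uses Ce, Ke, or xe directly, only the measured force it produces.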
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
LQR solution
The main objective of this work is to determine the optimal desired force Fd, i.e., the minimum force that minimizes the position error er:
Fd(t) = κe er(t)
The discounted cost function
Js(er(t), Fd(t)) = ∫t^∞ e^{-γ(τ-t)} ( er^T S er + Fd^T R Fd ) dτ
The algebraic Riccati equation
A^T P A + S - P + A^T P B κe = 0
The feedback control gain
κe = -(B^T P B + R)^{-1} B^T P A
Implementation procedure: an offline solution (in contrast to the RL solution / IRL approximation, which works online).
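The offline LQR step, solving the Riccati equation and forming κe = -(B^T P B + R)^{-1} B^T P A, can be sketched for the scalar case with a plain fixed-point Riccati iteration. The (a, b, s, r) values are illustrative, not the identified environment model from the thesis.

```python
def solve_dare_scalar(a, b, s, r, iters=500):
    """Fixed-point iteration for the scalar discrete-time ARE
    p = a*p*a + s - (a*p*b)**2 / (b*p*b + r),
    followed by the LQR gain k = -(b*p*b + r)**-1 * b*p*a."""
    p = s
    for _ in range(iters):
        p = a * p * a + s - (a * p * b) ** 2 / (b * p * b + r)
    k = -(b * p * b + r) ** -1 * b * p * a
    return p, k

p, k = solve_dare_scalar(a=0.9, b=0.1, s=1.0, r=1.0)
# residual of the ARE at the converged p (should be ~0)
residual = 0.9 * p * 0.9 + 1.0 - (0.9 * p * 0.1) ** 2 / (0.1 * p * 0.1 + 1.0) - p
# the closed loop a + b*k must sit inside the unit circle
stable = abs(0.9 + 0.1 * k) < 1.0
```

Doing this offline requires knowing (a, b), i.e., the environment model; that requirement is exactly what the online IRL approximation on the following slides removes.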
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
LQR solution RL solution
impossible for all time
The Hamilton-Jacobi-Bellman equation
( ) ( ) ( )min ( ) ( ) ( ) 0
minus minus + minus =
d
t
s rtF
V e t r t r e dt
The new Algebraic Riccati equation
( )( ) ( )
2 ( ) ( ) ( ) ( ) 0
T T
r r
T T
r d d d
e t A P PA S P e t
e t PBF t F t RF t
+ + minus
+ + =
The optimal desired force 1
22 21( ) ( )d rF t H H e tminus= minus
( ) ( ) ( ) ( ) ( )= = T
r d s rQ e t F t V e t X HX
IRL approximation
The discounted cost function
( ) ( )
( )
( )( ) ( ) ( )
where ( ) ( ) ( ) ( ) ( ) ( )
t
s r r dt
T T
r d r r d d
V e t r e F e d
r e F e Se F RF
minus minus=
= +
Implementation procedure
1 2 3 4 5 6 7 8
( ) ( )=d e rF t e t
online soliton using the off-policy RL to
find the solution of the given new ARE
20-May-21 | 33Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
LQR solution RL solution
minimize temporal difference error
The approximator of the Q-value function
( )( ) ( )1
ˆ N
r d i r d i
i
Q e F e F =
=
The parameter update law
11( ) exp
2
T
i r d r i d i i r i d ie F e c F c e c F c minus = minus minus minus minus minus
( )
( ) ( ) ( ) ( )
ˆ ( ) ( ) ( )1( ) ( )
( )
=
= minus +
r d
t t t t
Q e t F t tt t
t t
IRL approximation
Q-function update law based on IRL
( )
( )
( )( ) ( ) ( )
( ) ( )
t tt
r dt
t
r d
Q e t F t r e d
e Q e t t F t t
+
minus minus
minus
=
+ + +
( ) ( ) ( ) ( ) ( )T T
r r d dr t e t Se t F t RF t= +
The reward function of the utility function
Implementation procedure
1 2 3 4 5 6 7 8
1
( )( )
( )
i r di r d N
i r di
e Fe F
e F
=
=
( ) ( )=d e rF t e t
20-May-21 | 34Presenter Cong Phat Vo
Fig 51 RL in unknown environments during the
learning process
Fig 52 Learning curves
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
κe = 00025 S = 1 R = 10 T = 0001 A = minus4006
B = minus0002
υ = 09 γ = 09 δΔ = 065 ε = 01 ε-decay = 099
K = 10 β = 01
LQR solution
RL aproximation
PositionForce controlKPf = 0001 KIf = 002 λ = 02 ηΔ = 01 ρ1 = 05
ρ2 = 20 v = 911 α = 57
EnvironmentCe = 05 Nsmm-1 Ke = 2 N mm-1 r0 = 165 mm
xd = 39917 mm xe = 35 mm
Parameters
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 35Presenter Cong Phat Vo
Fig 53 Desired forceposition tracking response
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Three steps are required to complete
the experiment
1-the identification of the environment
2-obtain the ARE solution using the
environmental estimation
3-the complete experiment for the
positionforce controller
IRL approximation
Achieving near-optimal performance without
knowledge of the environment dynamics
LQR solution
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 36Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
4
5
C
H
A
P
T
E
R
1
7
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Optimized Human-Robot Interaction force control using IRL6
1 2 3 4 5 6 7 8
20-May-21 | 37Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
3 The unknown human dynamics as well as the optimal overall
human-robot system efficiency is verified in experimental trials
2
Find the optimal parameters of the impedance model to
assure motion tracking and also assist the human with
minimum effort to perform the task in real time
The stability of the closed-loop control is theoretically
demonstrated
1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 38Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Task-specific Outer-loopRobot-specific Inner-loop
( )1 20
ˆ ( )
ˆ (|| || )
= + + +
+ + minus
t
s h h h
q B h h
K e e e d
K Z Z K
The control torque in task space
1 2
1 2 1
2 1 2
ˆ ˆˆ ( ) ( )
ˆ ˆˆ ˆ ˆ || ||
ˆˆ ˆ ˆ( ) || ||
=
= minus minus = minus
T T
T T T
h h h
T
h h
q W W q
W F F W q kF W
W Gq W kG W
2
h hm
d d d
K
M s C s K
=
+ +
An approximation of the continuous functionThe prescribed
impedance model
Prescribed
Impedance Model
Haptic interface
based-APAM
Torque
control lawHuman
Operator
Optimal Law-based
IRL
Kh+-
NN functional
approximation
1
s
θ +
-
τ θ d θ m eh
τ h
ξ
ed
Control design
1 2 3 4 5 6 7 8
20-May-21 | 39Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Task-specific Outer-loopRobot-specific Inner-loop
1
1
[ ]
minus
minus
= +
= minus
=
=
h h h h d
h e
h e p d
T T T
d d d
A B e
A k
B k
e e e
( )1
+=
+
p d
e
sG s
k s
The human transfer function [99] Purpose τh rarr ed
0
0
= + = +
= minus
q d qd e
e
h h hh e
A e Be X AX Buu
B A u KX
A new augmented state
Prescribed
Impedance Model
Haptic interface
based-APAM
Torque
control lawHuman
Operator
Optimal Law-based
IRL
Kh+-
NN functional
approximation
1
s
θ +
-
τ θ d θ m eh
τ h
ξ
ed
IRL based on LQR solution
Control design
1 2 3 4 5 6 7 8
20-May-21 | 40Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
The performance index
( )T T T
m d d d h h h e et
J e Q e Q u Ru d
= + +
10 T TA P PA Q PB R BPminus= + + minus
The algebraic Riccati equation [83]
0
0
q d qd
e
h h hh
e
A e BeX u
B A
u KX
= = +
= minus
A new augmented state
IRL solutionRL solution
An optimal feeback control
To minimize
solve
The infinite-horizon quadratic cost
Control design
The IRL Bellman equation for time interval ΔT gt 0
Proof of convergence See in Appendix 62
1 2 3 4 5 6 7 8
( ) 1
( )
arg min ( ) ( )e
T
e eu t
u V t X t u R B PX
minus
= = minus
( )7 7( ( )) ( ) ( ) ( ( ))t T
T T T
tV X t X Q K RK X d V X t T
+
= + minus +
( )7 ( ( )) ( ) ( )T T T T
tV X t X Q K RK X d X PX
= + =
( ) ( ) ( )T TA BK P P A BK Q K RKminus + minus = minus +
where P is the real symmetric positive definite solution
( )( ) ( ) = ( ) ( )
( ) ( )
t tT T T T
t
T
X t PX t X Q K RK X d
X t t PX t t
+
+
+ + +
The IRL Bellman equation can be written as
20-May-21 | 41Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 61 Continuous-time linear policy iteration algorithm
1 Start with an admissible control input
0 0
1= minuseu K X
2 Policy evaluation given a control policy ui find the
Pi using off-policy Bellman equation
1 1
1
+ minus= minusi T i
eu R B P X
3 Policy Improvement update the control policy
Online implementation
Control design
1 2 3 4 5 6 7 8
( )( )( ) ( )= ( ) ( )
( ) ( )
t t TT i T i i T
t
T i
X t P X t X Q K RK X d
X t t P X t t
+
+
+ + +
Begin
1 2
1 2
1
( ) ( )
( ) ( ) ( )
( )
N i i i
Ti i N i
i T
t t t
K K K
p
minus
= = minus +
=
=
Solving for the cost using least squares
1|| ||i ip p minusminus
0
1 10 1 ( ) 0P i K A BK= = minus
Initialization
1 1i T i
eu R B P Xminus minus= minus
Policy update
End
No
Yes
1i irarr +
20-May-21 | 42Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 62 The tracking error of the trajectory of the robot system and
the prescribed impedance model
Fig 63 Reference trajectory versus the state of the robot systems
during and after learning
κe = 00025 S = 1 R = 10 T = 0001
A = minus4006 B = minus0002
υ = 09 γ = 09 δΔ = 065 ε = 01
ε-decay = 099 K = 10 β = 01
LQR solution
RL aproximation
Position controlKs = 20 F = 10 G = 1 μ1 = μ2 = 5 k = 01
= 100 κp = 20 κd = 10 and ke = 018 s
θd = 10sin(02πt) Md = 1 Kg Bd = 718
Nsmm-1 Kd =1576 Nmm-1 Kh = 0046
Parameters
Z
Simulation
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 43Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 64 The output trajectory and impedance model in the inner loop
Fig 65 The desired trajectory and output trajectory the outer loop
Fig 66 Interaction force
Perfectly performance of the human-
robot cooperation
A little deviation between the desired
and actual trajectory
The human interaction force is reduced
after learning and the optimal set of the
prescribed impedance model is found by
the outer-loop controller
Experiment
Validation results
Discussion
1 2 3 4 5 6 7 8
20-May-21 | 44Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Conclusion and Future works
Introduction
7
4
5
6
C
H
A
P
T
E
R
1
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Force Sensorless Reflecting Control for BHTS7
1 2 3 4 5 6 7 8
20-May-21 | 45Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
3 The effectiveness of the proposed control method is verified in
the different working conditions
2The stability and finite-time convergence characteristic of the closed-
loop control are theoretically analyzed by the Lyapunov approach
The first time the hybrid control based
on a fast finite-time NTSMC and AFOB1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 46Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
( )1 2id = minus +
1 1
2
= minus + = minus +
id
T
i i id i
T
i ia = minus
The external torque estimation
[ ]
1 2ˆ( ) = + + minus minus minus +iv
i i ir i i i i i i i i i i idM C K sign s s s
( )i i i is = +
[ ]
2
1 2
if 0 or 0 and ( )
( ) if 0 and
i
i i i i i i
i i
i i i i i i i i
s s
z z sign s
= =
+
An adaptive FSOB gain
where
where
The torque control design by style-FNTSMC
The modified FNTSM surface design
Master Slave
Environment
APAM Slave
system
Slave
Controller
Estimator
Oper
ator
Master
Controller
APAM Master
system
Estimator
Impendence
filter
+
++
-
+
-
+
+md
m s
s
sd
ˆsdˆmd
mf
m
d
Control design
1 2 3 4 5 6 7 8
20-May-21 | 47Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 71 Torque estimation performances
λi = 8 pi = 5 qi = 7 ςi = 10-4 νi = 911
εi = 01 ρ1i = 10 ρ2i = 2
Proposed control
PID control KP = 38 KI = 25 KD = 015
Observersfor NDO Lh = Le = 10 for RTOB β = 350 rads
for AFOB κi(0) = 01 λi = 10 ηi = 0 m = 01
a = 10
RMSEs associated with NDO RTOB and proposed algorithm
are calculated of 0211 0192 and 0076 respectively
Parameters
Scenarios- The free space situation θd (t) = 5π(1 - 018t)sin(072πt)
- The contact and recovery situation soft environment Ke = 500 Nmm-1 Be = 4 Nsmm-1
stiff environment Ke = 2800 Nmm-1 Be = 20 Nsmm-1
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 48Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 72 Position tracking performances
Fig 72 Position tracking error performances
Fig 72 Transparency performance in ldquosoftrdquo
Fig 72 Transparency error in soft contract
Fig 72 Transparency performance in ldquostiffrdquo
Fig 72 Transparency error in stiff contract
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 49Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Finite-time convergence of the
tracking errors
Eliminating the uncertainties the
noise effect in the obtained force
sensing via new AFOB
Robust performance against uncertainties
and disturbances in the system dynamics
Stability of the overall system and the
transparency performance can be
attained simultaneously
Fast transient response
Summary
20-May-21 | 22 | Presenter: Cong Phat Vo
3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL
Summary
Robust performance against uncertainties and disturbances
[Block diagram: the proposed control approach — the reference trajectory x_d and the measured states x_1, x_2 pass through a coordinate transfer onto the terminal and integral terminal sliding surfaces; the control law u combines a nominal control signal u_n, a robust control signal u_s, and an adaptation law, with a time-delay estimator compensating the lumped uncertainty of the PAM system.]
Finite-time convergence of the tracking errors
Reducing the chattering phenomenon
Eliminating the reaching phase
Fast transient response
Future work will be directed to control using the force properties.
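The time-delay estimation idea summarized above — estimating the lumped unknown dynamics from one-sample-old control and acceleration signals — can be sketched as follows (the plant, gains, and signals are illustrative, not the dissertation's PAM model):

```python
import math

# Toy 1-DOF plant m*x'' = u + d(t) with unknown disturbance d(t).
# TDE uses one-sample-old signals:  d_hat(t) = m_bar * a(t - dt) - u(t - dt).
m, m_bar, dt = 1.0, 1.0, 0.001
kp, kd = 400.0, 40.0                 # PD gains of the nominal control part
x, v = 1.0, 0.0                      # start away from the x = 0 target
u_prev, a_prev = 0.0, 0.0
for k in range(10000):
    d = 2.0 * math.sin(2.0 * math.pi * k * dt)   # unknown, smooth disturbance
    d_hat = m_bar * a_prev - u_prev              # time-delay estimate of d
    u = m_bar * (-kd * v - kp * x) - d_hat       # TDE-compensated PD control
    a = (u + d) / m                              # true plant acceleration
    v += a * dt
    x += v * dt
    u_prev, a_prev = u, a
print(abs(x) < 1e-2)                             # tracking error driven near zero
```

Because the disturbance is smooth, the one-sample-old estimate cancels almost all of it, leaving only an O(dt) residual for the PD part to reject.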
CONTENTS
1. Introduction
2. Modeling and platform setup based on the APAM configuration
3. Adaptive Gain ITSMC with TDE scheme
4. Adaptive Finite-Time Force Sensorless Control scheme
5. Model-free control for Optimal contact force in Unknown Environment
6. Optimized Human-Robot Interaction force control using IRL
7. Force Sensorless Reflecting Control for BHTS
8. Conclusion and Future works
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Proposed control idea
1. The proposed FSOB scheme obtains the force information with measurement noise and static friction eliminated.
2. The FITSMC guarantees finite-time convergence; a new adaptive gain and the TDE are deployed to handle uncertainties and the chattering phenomenon online.
3. The reliability and superiority of the proposed control are demonstrated under different challenging working conditions.
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Control design
The FITSM surface is defined on the force tracking error e_f as
  σ(t) = ė_f(t) + ∫_0^t ( κ_1 e_f + κ_2 |e_f|^v sign(e_f) ) dτ
The control law u_c(t) is rewritten as the sum of the TDE-based estimate of the lumped disturbance d̂(t), linear feedback on e_f, and terminal switching terms in σ with gains κ_4, κ_5. The TDE updates d̂ online from time-delayed measurements of the control input and the surface dynamics, while an adaptive switching-gain law adjusts the robust gain online. The interaction force itself is supplied by a force observer based on LeD [40] (FSOB), so no force sensor is required.
[Block diagram: reference trajectory → integral terminal sliding surface → control law (with adaptive law and force-based time-delay estimator) acting on the APAM system; the force observer closes the loop around the plant and environment.]
Proof: see Appendices 4.1–4.3.
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Experimental validation
Parameters
- Proposed control, ITSMC: η1 = 3, η2 = 15, η3 = 25, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1 = γ2 = 5/7, k1 = 2, k2 = 55, v = 9/11
- Adaptive gain: μ = 10, δ = 0.05, σ = 0.1
- Observers, FSOB: ω = 500 rad/s, ξ = 0.001, α1 = 20, α2 = 12, α3 = 5, α4 = 15
- PID control: KP = 15, KI = 18, KD = 0.05 for case study 3; KP = 35, KI = 28, KD = 0.1 for the other cases
Scenarios
- Scenario 1: τd (N) = 7.5 sin(0.2πt − π/2) + 6.25
- Scenario 2: τd (N) = 7.5 sin(0.8πt − π/2) + 6.25
- Scenario 3: a multistep trajectory filtered by an LPF with a bandwidth of 0.2 rad/s
Fig. 4.1 Performance response in Scenario 1
Fig. 4.2 Performance response in Scenario 2
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Experimental validation
Fig. 4.3 Performance response in Scenario 3
Fig. 4.4 Control gain
Table 4.1 RMSE comparison of the tracking error performances

  Trajectory   | PID    | ISMC   | FITSMC-TDE | Proposed
  Sine 0.1 Hz  | 2.1050 | 1.2152 | 0.9440     | 0.5158
  Sine 0.4 Hz  | 3.2054 | 1.8045 | 2.0632     | 1.9807
  Multistep    | 2.6508 | 2.3548 | 2.0618     | 1.7685

Discussion
- Improved control performance: fast response, high accuracy, and robustness
- Guaranteed finite-time convergence
- Cancelled the lumped uncertainties online via the TDE solution and the switching gain in the FITSMC
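For reference, the RMSE metric used in the comparison can be computed as below (a generic sketch; the sample signals are illustrative):

```python
import math

def rmse(reference, measured):
    """Root-mean-square error between two equal-length signals."""
    n = len(reference)
    return math.sqrt(sum((r - m) ** 2 for r, m in zip(reference, measured)) / n)

print(round(rmse([0.0, 0.0], [3.0, 4.0]), 4))  # → 3.5355
```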
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Proposed control idea
1. The RL method is used to learn the optimal desired force online.
2. The stability of the closed-loop control is theoretically demonstrated by the Lyapunov approach.
3. The controller is verified to be effective for robot interaction control in unknown environments.
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Control design
- The environment is modeled as a spring-damper system [81]:
  F_e(t) = −C_e ẋ(t) + K_e (x_e − x(t))
- The optimal impedance model maps the position error e_r to the desired force, Z(e_r(t)) = F_d(t), with error dynamics ė_r(t) = A e_r(t) + B F_d(t).
- The force control shapes the impedance response M ë_r + C ė_r + K e_r through switching terms driven by the measured contact force F_e.
- The position control in task space is a PI force loop:
  x_f(t) = K_Pf e_f(t) + K_If ∫_0^t e_f(τ) dτ
[Block diagram: a task-specific outer loop (prescribed impedance model with RL-based approximation of the desired force F_d) wrapped around a robot-specific inner loop (precise position control of the APAM system in contact with the environment).]
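The outer PI force loop can be sketched as a toy simulation (an assumed static spring environment and an idealized inner position loop; all names and values are illustrative, not the dissertation's implementation):

```python
# Hypothetical sketch: outer PI force loop commanding an ideal inner position
# loop against a spring environment F_contact = Ke * (x - x_e).
Ke, x_e = 2.0, 35.0        # environment stiffness (N/mm) and surface location (mm)
KPf, KIf = 0.001, 0.02     # PI gains of the outer force loop (decimals assumed)
F_d, dt = 5.0, 0.001       # desired contact force and sample time
x, integ = x_e, 0.0        # start exactly at the surface
for _ in range(500_000):
    F_contact = Ke * (x - x_e)          # reaction force while pressing in
    e_f = F_d - F_contact               # force tracking error
    integ += e_f * dt
    x = x_e + KPf * e_f + KIf * integ   # position command; ideal inner loop tracks it
print(round(Ke * (x - x_e), 3))         # → 5.0 (settles at the desired contact force)
```

The integral term slowly pushes the commanded position into the surface until the spring reaction equals the desired force.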
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Implementation procedure — LQR solution (offline)
The main objective of this work is to determine the optimal desired force F_d, i.e., the minimum force that minimizes the position error e_r.
- Feedback law: F_d(t) = κ_e e_r(t)
- Discounted cost function:
  J(e_r(t), F_d(t)) = ∫_t^∞ e^{−γ(τ−t)} ( e_r^T S e_r + F_d^T R F_d ) dτ
- Algebraic Riccati equation: A^T P A + S − P + A^T P B κ_e = 0
- Feedback control gain: κ_e = −(B^T P B + R)^{−1} B^T P A
This LQR solution is computed offline and requires the environment model (A, B).
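With a model of the environment available, the ARE/gain pair above can be solved offline by iterating the Riccati recursion. A scalar sketch using the parameter values listed later in this chapter, A = −4.006, B = −0.002, S = 1, R = 10 (decimal points assumed from the slides):

```python
# Scalar discrete-time Riccati iteration (illustrative values assumed from the slides).
A, B, S, R = -4.006, -0.002, 1.0, 10.0
P = S
for _ in range(200):
    P_next = S + A * P * A - (A * P * B) ** 2 / (B * P * B + R)
    if abs(P_next - P) < 1e-9 * max(1.0, abs(P)):
        P = P_next
        break
    P = P_next
k_e = -(B * P * B + R) ** -1 * B * P * A   # feedback gain, F_d = k_e * e_r
print(abs(A + B * k_e) < 1.0)              # closed loop is stable → True
```

Although the open-loop A is unstable, the converged gain places the closed-loop pole A + B·k_e inside the unit circle.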
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Implementation procedure — RL solution (online)
Maintaining the offline LQR solution is impossible for all time, since it requires the environment model; instead, an online solution uses off-policy RL to find the solution of the new ARE.
- Discounted value function:
  V_s(e_r(t)) = ∫_t^∞ e^{−γ(τ−t)} r(e_r(τ), F_d(τ)) dτ, where r(e_r, F_d) = e_r^T S e_r + F_d^T R F_d
- Hamilton-Jacobi-Bellman equation:
  min_{F_d} ( V̇_s(e_r(t)) − γ V_s(e_r(t)) + r(t) ) = 0
- New Algebraic Riccati equation:
  e_r^T(t) ( A^T P + P A + S − γP ) e_r(t) + 2 e_r^T(t) P B F_d(t) + F_d^T(t) R F_d(t) = 0
- Q-value function: Q(e_r(t), F_d(t)) = V_s(e_r(t)) = X^T H X, with X built from e_r and F_d
- Optimal desired force: F_d(t) = −H_22^{−1} H_21 e_r(t), again of the form F_d(t) = κ_e e_r(t)
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Implementation procedure — IRL approximation
The Q-value function is approximated by a normalized radial-basis-function network and trained to minimize the temporal-difference error.
- Approximator of the Q-value function:
  Q̂(e_r, F_d) = Σ_{i=1}^N ϑ_i φ̄_i(e_r, F_d)
  with Gaussian features φ_i(e_r, F_d) = exp( −½ [e_r − c_i^r, F_d − c_i^d] Σ_i^{−1} [e_r − c_i^r, F_d − c_i^d]^T )
  normalized as φ̄_i = φ_i / Σ_{j=1}^N φ_j
- Q-function update law based on IRL (Bellman equation over one sample interval Δt):
  Q(e_r(t), F_d(t)) = ∫_t^{t+Δt} e^{−γ(τ−t)} r(e_r(τ), F_d(τ)) dτ + e^{−γΔt} Q(e_r(t+Δt), F_d(t+Δt))
- Reward (utility) function: r(t) = e_r^T(t) S e_r(t) + F_d^T(t) R F_d(t)
- The parameter update law is a normalized gradient step on the temporal-difference error of Q̂, from which the feedback F_d(t) = κ_e e_r(t) is recovered.
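A toy sketch of the normalized-RBF temporal-difference update (the centers, widths, learning gains, and the fake transition model are made-up illustrative values, not the dissertation's):

```python
import math, random

# Normalized Gaussian features over the (e_r, F_d) plane (illustrative centers/width).
centers = [(er, fd) for er in (-1.0, 0.0, 1.0) for fd in (-1.0, 0.0, 1.0)]
width = 1.0

def features(er, fd):
    phi = [math.exp(-0.5 * ((er - cr) ** 2 + (fd - cd) ** 2) / width ** 2)
           for cr, cd in centers]
    s = sum(phi)
    return [p / s for p in phi]                  # normalization, as on the slide

def q_hat(theta, er, fd):
    return sum(t * p for t, p in zip(theta, features(er, fd)))

# One-step TD update toward the discounted Bellman target.
S, R, gamma, dt, lr = 1.0, 10.0, 0.9, 0.001, 0.5
theta = [0.0] * len(centers)
random.seed(0)
for _ in range(5000):
    er, fd = random.uniform(-1, 1), random.uniform(-1, 1)
    er2, fd2 = 0.99 * er, 0.99 * fd              # assumed contracting next sample
    reward = (S * er * er + R * fd * fd) * dt
    target = reward + math.exp(-gamma * dt) * q_hat(theta, er2, fd2)
    td = target - q_hat(theta, er, fd)           # temporal-difference error
    theta = [t + lr * td * p for t, p in zip(theta, features(er, fd))]
```

After training, the learned Q̂ assigns larger cost-to-go to states far from the origin, which is what the gradient step on the TD error is meant to achieve.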
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Validation results
Parameters
- LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position/force control: K_Pf = 0.001, K_If = 0.02, λ = 0.2, ηΔ = 0.1, ρ1 = 0.5, ρ2 = 20, v = 9/11, α = 5/7
- Environment: C_e = 0.5 N·s·mm⁻¹, K_e = 2 N·mm⁻¹, r0 = 165 mm, x_d = 39.917 mm, x_e = 35 mm
Fig. 5.1 RL in unknown environments during the learning process
Fig. 5.2 Learning curves
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Validation results
Three steps are required to complete the experiment (LQR solution):
1. identification of the environment;
2. obtaining the ARE solution using the environmental estimate;
3. the complete experiment with the position/force controller.
IRL approximation: achieves near-optimal performance without knowledge of the environment dynamics.
Fig. 5.3 Desired force/position tracking response
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Proposed control idea
1. Find the optimal parameters of the impedance model to assure motion tracking and assist the human with minimum effort to perform the task in real time.
2. The stability of the closed-loop control is theoretically demonstrated.
3. The unknown human dynamics, as well as the optimal overall human-robot system efficiency, are verified in experimental trials.
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design
- The prescribed impedance model shaping the inner loop:
  θ_m(s) / τ_h(s) = K_h / ( M_d s² + B_d s + K_d )
- The control torque in task space combines sliding feedback on the human-tracking error e_h with a neural-network term K̂_B Ẑ approximating the unknown continuous robot dynamics, plus a robust bounding term K_q.
- The NN weight matrices Ŵ₁, Ŵ₂ are updated online by the adaptation laws.
[Block diagram: robot-specific inner loop (torque control law with NN functional approximation around the APAM-based haptic interface) and task-specific outer loop (prescribed impedance model whose gain K_h is tuned by the IRL-based optimal law, driven by the operator torque τ_h).]
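The prescribed impedance model can be simulated directly; a minimal Euler sketch, assuming the parameter values listed later in this chapter (decimal points assumed, operator torque illustrative):

```python
import math

# Euler simulation of the prescribed impedance model
#   M_d*th'' + B_d*th' + K_d*th = K_h * tau_h
M_d, B_d, K_d, K_h = 1.0, 7.18, 15.76, 0.046
dt = 0.001
th, w = 0.0, 0.0                       # impedance-model output and its rate
for k in range(20000):                 # 20 s of simulated time
    tau_h = 10.0 * math.sin(0.2 * math.pi * k * dt)   # operator torque (illustrative)
    a = (K_h * tau_h - B_d * w - K_d * th) / M_d
    w += a * dt
    th += w * dt
print(abs(th) < 0.1)                   # bounded, well-damped response → True
```

With these values the model is heavily damped (ζ ≈ 0.9), so the output settles into a small bounded oscillation that follows the operator torque.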
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design — IRL based on the LQR solution
- The human operator is modeled by the transfer function G(s) [99], with gains κ_p, κ_d and time constant k_e, with the purpose of mapping τ_h → e_d.
- In state-space form, ξ̇_h = A_h ξ_h + B_h e_d, where A_h and B_h are formed from k_e⁻¹ and the gains κ_p, κ_d, and e_d = [e_d, ė_d, ë_d]^T is the tracking-error vector of the prescribed impedance model.
- A new augmented state X stacks the impedance-model error dynamics ė_q = A_q e_q + B_q e_d with the human state, giving Ẋ = A X + B u_e with feedback u_e = −K X.
[Block diagram: same structure as the previous slide, with the IRL-based optimal law tuning K_h.]
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design — RL and IRL solutions
- The infinite-horizon quadratic performance index to minimize:
  J = ∫_t^∞ ( e_d^T Q_m e_d + ξ_h^T Q_h ξ_h + u_e^T R u_e ) dτ
- Solved through the algebraic Riccati equation [83]:
  0 = A^T P + P A + Q − P B R^{−1} B^T P
- The optimal feedback control:
  u_e = arg min_{u_e} V(t, X(t)) = −R^{−1} B^T P X
- Equivalently, in Lyapunov form:
  (A − BK)^T P + P (A − BK) = −( Q + K^T R K ), where P is the real symmetric positive-definite solution.
- The IRL Bellman equation for a time interval ΔT > 0:
  V(X(t)) = ∫_t^{t+ΔT} X^T ( Q + K^T R K ) X dτ + V(X(t + ΔT))
  which, with V(X) = X^T P X, can be written as
  X^T(t) P X(t) = ∫_t^{t+ΔT} X^T ( Q + K^T R K ) X dτ + X^T(t + ΔT) P X(t + ΔT)
Proof of convergence: see Appendix 6.2.
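The IRL Bellman equation can be checked numerically on a scalar example: along a closed-loop trajectory, the running cost over [t, t+ΔT] plus the terminal value reproduces X^T P X (the system and weights here are illustrative, not the deck's):

```python
import math

# Scalar system x' = a*x + b*u with cost weights q, r; optimal P solves
# the ARE 2*a*p - b^2*p^2/r + q = 0 (here p = 1 + sqrt(2) in closed form).
a, b, q, r = 1.0, 1.0, 1.0, 1.0
p = 1 + math.sqrt(2.0)
k = b * p / r                 # optimal gain, u = -k*x
ac = a - b * k                # closed-loop pole (negative)
dt, T = 1e-5, 0.5             # integration step and Bellman interval ΔT
x = 1.0
integral = 0.0
for _ in range(int(T / dt)):
    integral += (q + r * k * k) * x * x * dt   # running cost X^T(Q + K^T R K)X
    x += ac * x * dt                           # closed-loop Euler step
lhs = p * 1.0 ** 2                             # V(X(t)) = x(t)^2 * p
rhs = integral + p * x * x                     # cost over ΔT + terminal value
print(abs(lhs - rhs) < 1e-3)                   # → True
```

The identity holds exactly in continuous time; the small residual here is pure Euler discretization error.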
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Online implementation
Fig. 6.1 Continuous-time linear policy iteration algorithm:
1. Initialization: start with an admissible control input u_e^0 = −K_1^0 X, with P^0 = 0, i = 1, and K_1 chosen so that A − BK_1 is stable.
2. Policy evaluation: given the control policy u^i, find P^i from the off-policy IRL Bellman equation
   X^T(t) P^i X(t) = ∫_t^{t+ΔT} X^T ( Q + (K^i)^T R K^i ) X dτ + X^T(t + ΔT) P^i X(t + ΔT),
   solving for the cost by least squares over N sampled intervals.
3. Policy improvement: update the control policy u_e^{i+1} = −R^{−1} B^T P^i X.
Repeat (i → i + 1) until ‖p^i − p^{i−1}‖ falls below a tolerance, then stop.
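The evaluate/improve loop of Fig. 6.1 is Kleinman-style policy iteration. A scalar, model-based sketch (illustrative numbers, not the deck's human-robot system) converges to the ARE solution in a few iterations:

```python
# Kleinman policy iteration for the scalar system x' = a*x + b*u
# with cost ∫ (q*x^2 + r*u^2) dt.
a, b, q, r = 1.0, 1.0, 1.0, 1.0
k = 2.0                                  # admissible initial gain: a - b*k = -1 < 0
for _ in range(20):
    # Policy evaluation: solve the scalar Lyapunov equation
    #   2*(a - b*k)*p = -(q + r*k**2)   for the cost p of policy u = -k*x.
    p = (q + r * k ** 2) / (-2.0 * (a - b * k))
    k = b * p / r                        # policy improvement
residual = 2 * a * p - (b * p) ** 2 / r + q   # ARE: 2ap - b^2 p^2 / r + q = 0
print(abs(residual) < 1e-9)                   # → True
```

Here each evaluation step is exact (it is just a Lyapunov solve), so the iteration converges quadratically to the ARE solution p = 1 + √2.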
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Validation results — Simulation
Parameters
- LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: K_s = 20, F = 10, G = 1, μ1 = μ2 = 5, k = 0.1, Z = 100, κ_p = 20, κ_d = 10, k_e = 0.18 s
- Impedance model: θ_d = 10 sin(0.2πt), M_d = 1 kg, B_d = 7.18 N·s·mm⁻¹, K_d = 15.76 N·mm⁻¹, K_h = 0.046
Fig. 6.2 Tracking error between the robot-system trajectory and the prescribed impedance model
Fig. 6.3 Reference trajectory versus the state of the robot system during and after learning
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Validation results — Experiment
Fig. 6.4 The output trajectory and impedance model in the inner loop
Fig. 6.5 The desired trajectory and output trajectory in the outer loop
Fig. 6.6 Interaction force
Discussion
- Near-perfect performance of the human-robot cooperation
- A small deviation between the desired and actual trajectories
- The human interaction force is reduced after learning, and the optimal set of prescribed-impedance-model parameters is found by the outer-loop controller
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Proposed control idea
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed.
2. The stability and finite-time convergence of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified in different working conditions.
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Control design
- The external torque on each side i ∈ {m, s} is estimated sensorlessly; the adaptive force observer (AFOB) gain κ_i follows a piecewise adaptation law that freezes when the sliding variable vanishes and grows with the sliding variable otherwise, keeping the estimate smooth in steady state and fast during transients.
- The modified FNTSM surface s_i combines the tracking error e_i with a fractional-power term, avoiding singularity while keeping fast finite-time convergence.
- The FNTSMC-style torque control law feeds back the nominal dynamics (M_i, C_i, K_i), switching terms with gains ρ_{1i}, ρ_{2i} and exponent ν_i, and compensates the estimated external torque τ̂_{id}.
[Block diagram: operator → master controller + estimator ↔ communication channel ↔ slave controller + estimator → APAM slave system in contact with the environment; an impedance filter shapes the reflected torque τ̂_{sd} fed back to the master.]
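As background for this sensorless estimation step, the general idea can be sketched with a standard momentum-based residual observer (a common technique, not the AFOB itself; scalar toy values):

```python
# Momentum-based residual observer for external torque (illustrative).
# Plant: M*q'' = tau + tau_ext ; residual r converges to tau_ext at rate K.
M, K, dt = 1.0, 50.0, 0.001
tau, tau_ext = 0.0, 2.0          # motor torque and unknown external torque
qd, integ, r = 0.0, 0.0, 0.0
for _ in range(2000):            # 2 s of simulated time
    qd += (tau + tau_ext) / M * dt   # joint velocity from the true dynamics
    integ += (tau + r) * dt
    r = K * (M * qd - integ)         # residual: first-order estimate of tau_ext
print(round(r, 3))                   # → 2.0
```

Differentiating the residual gives ṙ = K(τ_ext − r), so the estimate converges exponentially to the external torque without any force/torque sensor.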
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Validation results
Parameters
- Proposed control: λ_i = 8, p_i = 5, q_i = 7, ς_i = 10⁻⁴, ν_i = 9/11, ε_i = 0.1, ρ1i = 10, ρ2i = 2
- PID control: KP = 38, KI = 25, KD = 0.15
- Observers: for NDO, L_h = L_e = 10; for RTOB, β = 350 rad/s; for AFOB, κ_i(0) = 0.1, λ_i = 10, η_i = 0, m = 0.1, a = 10
Scenarios
- Free space: θ_d(t) = 5π(1 − 0.18t) sin(0.72πt)
- Contact and recovery: soft environment K_e = 500 N·mm⁻¹, B_e = 4 N·s·mm⁻¹; stiff environment K_e = 2800 N·mm⁻¹, B_e = 20 N·s·mm⁻¹
Fig. 7.1 Torque estimation performances
The RMSEs associated with the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Validation results
Fig. 7.2 Position tracking performances
Fig. 7.3 Position tracking error performances
Fig. 7.4 Transparency performance in "soft" contact
Fig. 7.5 Transparency error in "soft" contact
Fig. 7.6 Transparency performance in "stiff" contact
Fig. 7.7 Transparency error in "stiff" contact
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Summary
- Finite-time convergence of the tracking errors
- Robust performance against uncertainties and disturbances in the system dynamics
- Elimination of the uncertainties and noise effects in the obtained force sensing via the new AFOB
- Stability of the overall system and transparency performance attained simultaneously
- Fast transient response
The proposed control scheme achieves great transparency performance and global stability of the BHTS despite the uncertainties and across different working conditions.
8 CONCLUSION AND FUTURE WORKS
Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions; its uncertainties and disturbances are handled.
- Validated the significant effectiveness of the FITSMC on finite-time force tracking problems, combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding highly effective position control.
- Achieved good transparency performance, with both force feedback and position tracking attained simultaneously, by the new adaptive force estimation combined with the separate FNTSMC schemes.
8 CONCLUSION AND FUTURE WORKS
Future works
- To improve the control performance in joint space, the robot system model can be improved by robust identification approaches.
- The effectiveness of the proposed algorithms can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both the master and slave sides; the computational complexity should also be reduced with respect to the data size.
- Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device for the rehabilitation field with a safer solution.
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter: Cong Phat Vo
Department of Mechanical Engineering, University of Ulsan, Korea
Thank you for attending.
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Conclusion and Future works
Introduction
7
4
5
6
C
H
A
P
T
E
R
1
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Force Sensorless Reflecting Control for BHTS7
1 2 3 4 5 6 7 8
20-May-21 | 45Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
3 The effectiveness of the proposed control method is verified in
the different working conditions
2The stability and finite-time convergence characteristic of the closed-
loop control are theoretically analyzed by the Lyapunov approach
The first time the hybrid control based
on a fast finite-time NTSMC and AFOB1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 46Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
( )1 2id = minus +
1 1
2
= minus + = minus +
id
T
i i id i
T
i ia = minus
The external torque estimation
[ ]
1 2ˆ( ) = + + minus minus minus +iv
i i ir i i i i i i i i i i idM C K sign s s s
( )i i i is = +
[ ]
2
1 2
if 0 or 0 and ( )
( ) if 0 and
i
i i i i i i
i i
i i i i i i i i
s s
z z sign s
= =
+
An adaptive FSOB gain
where
where
The torque control design by style-FNTSMC
The modified FNTSM surface design
Master Slave
Environment
APAM Slave
system
Slave
Controller
Estimator
Oper
ator
Master
Controller
APAM Master
system
Estimator
Impendence
filter
+
++
-
+
-
+
+md
m s
s
sd
ˆsdˆmd
mf
m
d
Control design
1 2 3 4 5 6 7 8
20-May-21 | 47Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 71 Torque estimation performances
λi = 8 pi = 5 qi = 7 ςi = 10-4 νi = 911
εi = 01 ρ1i = 10 ρ2i = 2
Proposed control
PID control KP = 38 KI = 25 KD = 015
Observersfor NDO Lh = Le = 10 for RTOB β = 350 rads
for AFOB κi(0) = 01 λi = 10 ηi = 0 m = 01
a = 10
RMSEs associated with NDO RTOB and proposed algorithm
are calculated of 0211 0192 and 0076 respectively
Parameters
Scenarios- The free space situation θd (t) = 5π(1 - 018t)sin(072πt)
- The contact and recovery situation soft environment Ke = 500 Nmm-1 Be = 4 Nsmm-1
stiff environment Ke = 2800 Nmm-1 Be = 20 Nsmm-1
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 48Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 72 Position tracking performances
Fig 72 Position tracking error performances
Fig 72 Transparency performance in ldquosoftrdquo
Fig 72 Transparency error in soft contract
Fig 72 Transparency performance in ldquostiffrdquo
Fig 72 Transparency error in stiff contract
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 49Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Finite-time convergence of the
tracking errors
Eliminating the uncertainties the
noise effect in the obtained force
sensing via new AFOB
Robust performance against uncertainties
and disturbances in the system dynamics
Stability of the overall system and the
transparency performance can be
attained simultaneously
Fast transient response
Summary
The great transparency performance and global stability of the BHTS under the proposed
control scheme are achieved despite the uncertainties and in different working conditions
1 2 3 4 5 6 7 8
20-May-21 | 50Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Introduction
4
5
6
C
H
A
P
T
E
R
1
7
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Conclusion and Future works78
1 2 3 4 5 6 7 8
20-May-21 | 51Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Verified an effective position tracking controller via the AITSMC with TDE technique under
various working conditions (Its uncertainties and disturbance are handled)
Validated the significant effectiveness of finite-time force tracking problems based on the
FITSMC combined with a new adaptive gain and a friction-free disturbance (FSOB)
Based on the online learning ability of the RL approximation method the model-free controller
for optimal contact force ensures great position and force tracking performance
Optimized a prescribed impedance model parameters by the IRL approach to provide little
operator information in yielding highly effective position control purpose
Achieved good transparency performance with both force feedback and position tracking
simultaneously by the new adaptive force estimation combined the separately FNTSMC schemes
Conclutions
1 2 3 4 5 6 7 8
20-May-21 | 52Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Future works
To improve the control performance in joint-space the robot system model can be
improved by robust identification approaches
Effectiveness of the proposed algorithm can be further increased by developing an automatic
tune method for both the estimation and control gains
Apply optimized strategies on both master and slave sides Note also that the computational
complexity should be reduced by the data size
Based on the developed prototype in this thesis it is possible to develop a high-quality
commercial teleoperation device in the rehabilitation field with a safer solution
1 2 3 4 5 6 7 8
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
1 2 3 4 5 6 7 8
20-May-21 | 24Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
1. The proposed FSOB scheme obtains force information with the noise and static friction eliminated.
2. The FITSMC guarantees finite-time convergence; a new adaptive gain and TDE are deployed to handle the uncertainties and the chattering phenomenon online.
3. The reliability and superiority of the proposed control are demonstrated in different challenging working conditions.
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 25Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
The FITSM surface is defined as
σ(t) = e_f(t) + ∫₀ᵗ [κ₁ e_f(τ) + κ₂ sign(e_f(τ))^[v]] dτ
The control law can be rewritten as the estimated lumped disturbance d̂(t) plus the ITSM feedback and switching terms with gain K.
An adaptive switching gain law grows the switching gain while the surface is not reached and shrinks it near σ = 0 to suppress chattering.
An estimated lumped disturbance is obtained by time-delay estimation, d̂(t) ≈ d(t − Δt), built from one-sample-delayed control and state signals; the force signal itself comes from a force observer based on LeD [40].
The proposed control approach (block diagram): Reference Trajectory → Integral Terminal Sliding Surface → Control law, with the Adaptive law and the Force-based Time-delay Estimator acting on the control law, driving the APAM system (Plant + Environment); the Force Observer feeds the estimated contact force back.
Proof: see Appendices 4.1–4.3.
Control design
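The TDE idea above can be sketched numerically: the lumped disturbance is read off one sample late from the delayed control and acceleration signals. The model m̄θ̈ = u + d and all numbers here are illustrative, not the dissertation's values.

```python
import numpy as np

# Minimal time-delay estimation (TDE) sketch: d_hat(t_k) = m_bar*acc(t_{k-1}) - u(t_{k-1}),
# which equals the true disturbance one sample earlier. Model and values are illustrative.
m_bar, dt = 1.0, 1e-3
t = np.arange(0.0, 1.0, dt)
d_true = 0.5 * np.sin(2 * np.pi * t)     # slowly varying lumped disturbance
u = np.cos(t)                            # arbitrary recorded control history
acc = (u + d_true) / m_bar               # accelerations consistent with the model

d_hat = m_bar * acc[:-1] - u[:-1]        # delayed-sample estimate of d
err = float(np.max(np.abs(d_true[1:] - d_hat)))   # one-sample lag error
```

The estimation error is only the one-sample lag of the disturbance, which is why TDE works well when the lumped disturbance varies slowly relative to the sampling rate.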
1 2 3 4 5 6 7 8
20-May-21 | 26Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
for ITSMC: η₁ = 3, η₂ = 15, η₃ = 25, κ₁ = κ₃ = 10, κ₂ = κ₄ = 2, γ₁ = γ₂ = 5.7, k₁ = 2, k₂ = 55, v = 9/11
for adaptive gain: μ = 10, δ = 0.05, σ = 0.1
for case study 3: KP = 15, KI = 18, KD = 0.05; for the other cases: KP = 35, KI = 28, KD = 0.1
Proposed control
PID control
Fig. 4.1 Performance response in Scenario 1
Fig. 4.2 Performance response in Scenario 2
Scenarios:
- Scenario 1: τ_d (N) = 7.5 sin(0.2πt − π/2) + 6.25
- Scenario 2: τ_d (N) = 7.5 sin(0.8πt − π/2) + 6.25
- Scenario 3: a multistep trajectory with an LPF bandwidth of 0.2 rad/s
Observers: for the FSOB, ω = 500 rad/s, ξ = 0.001, α₁ = 20, α₂ = 12, α₃ = 5, α₄ = 15
Parameters
Experimental validation
1 2 3 4 5 6 7 8
20-May-21 | 27Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Fig. 4.3 Performance response in Scenario 3
Fig. 4.4 Control gain
Table 4.1 RMSE comparison of the tracking-error performances
Controller    PID     ISMC    FITSMC-TDE  Proposed
Sine 0.1 Hz   2.1050  1.2152  0.9440      0.5158
Sine 0.4 Hz   3.2054  1.8045  2.0632      1.9807
Multistep     2.6508  2.3548  2.0618      1.7685
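As an illustration of how an RMSE entry like those in the table is computed, a short sketch; the reference follows the Scenario-1 torque profile, and the "actual" response here is hypothetical.

```python
import numpy as np

# RMSE between a reference trajectory and a tracked response.
def rmse(ref, actual):
    ref, actual = np.asarray(ref), np.asarray(actual)
    return float(np.sqrt(np.mean((ref - actual) ** 2)))

t = np.linspace(0.0, 10.0, 1001)
ref = 7.5 * np.sin(0.2 * np.pi * t - np.pi / 2) + 6.25   # Scenario-1-style reference
actual = ref + 0.05 * np.sin(5.0 * t)                    # made-up small tracking error
score = rmse(ref, actual)
```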
Experimental validation
- Improved control performance (fast response, high accuracy, and robustness)
- Guaranteed finite-time convergence
- Cancelled the lumped uncertainties online via the TDE solution and the switching gain in the FITSMC
Discussion
1 2 3 4 5 6 7 8
20-May-21 | 28Presenter Cong Phat Vo
CONTENTS
Chapter 1: Introduction
Chapter 2: Modeling and platform setup based on the APAM configuration
Chapter 3: Adaptive Gain ITSMC with TDE scheme
Chapter 4: Adaptive Finite-Time Force Sensorless Control scheme
Chapter 5: Model-free control for Optimal contact force in Unknown Environment
Chapter 6: Optimized Human-Robot Interaction force control using IRL
Chapter 7: Force Sensorless Reflecting Control for BHTS
Chapter 8: Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 29Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
1. The RL method is used to learn the optimal desired force online.
2. The stability of the closed-loop control is theoretically demonstrated by the Lyapunov approach.
3. Verified effective for robot interaction control in unknown environments.
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 30Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Control design (robot-specific inner loop and task-specific outer loop):
The environment is modeled as a spring-damper system [81]: F_e(t) = −C_e ẋ(t) + K_e(x_e − x(t)).
The optimal impedance model maps the position error to the desired force, Z(e_r(t)) = F_d(t), with the error dynamics ė_r(t) = A e_r(t) + B F_d(t).
The force control shapes the impedance error through M ë_r(t) + C ė_r(t) + K e_r(t), with reaching terms ρ₁(·) and ρ₂ sign(·)^[v] and the measured contact force F_e(t).
The position control in task space is a PI law on the force error: x_f = K_Pf e_f(t) + K_If ∫₀ᵗ e_f(τ) dτ.
Block diagram: Prescribed Impedance Model → Force control → Precise position control → APAM system, in contact with the Environment; an RL-based approximation supplies the desired force F_d from F_e and e_r.
1 2 3 4 5 6 7 8
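The environment model can be exercised directly. A minimal sketch using the chapter-5 parameter values (C_e = 0.5 N·s/mm, K_e = 2 N/mm, x_e = 35 mm), with the sign convention exactly as written on the slide:

```python
# Spring-damper environment model from the slide:
# F_e(t) = -C_e*x_dot(t) + K_e*(x_e - x(t)).
C_e, K_e, x_e = 0.5, 2.0, 35.0

def environment_force(x, x_dot):
    # Restoring force relative to the environment rest position x_e.
    return -C_e * x_dot + K_e * (x_e - x)

f_rest = environment_force(35.0, 0.0)   # at the contact point, at rest
f_push = environment_force(36.0, 0.0)   # 1 mm of penetration
```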
20-May-21 | 31Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
LQR solution
The main objective of this work is to determine the optimal desired force F_d, i.e., the minimum force that minimizes the position error e_r: F_d(t) = κ_e e_r(t).
The discounted cost function:
J_s(e_r(t), F_d(t)) = ∫ₜ^∞ γ^(τ−t) (e_rᵀ S e_r + F_dᵀ R F_d) dτ
The algebraic Riccati equation: 0 = AᵀPA + S − P − AᵀPB(BᵀPB + R)⁻¹BᵀPA
The feedback control gain: κ_e = −(BᵀPB + R)⁻¹BᵀPA
Implementation procedure: this LQR solution is offline; the RL solution and the IRL approximation that follow make it online.
1 2 3 4 5 6 7 8
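The offline LQR step can be sketched numerically. This version uses SciPy's continuous-time Riccati solver as a simplifying assumption (the slide's discrete-time Riccati equation would use the matching DARE solver instead); A, B, S, R are the scalar values listed on the parameter slide.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Offline LQR sketch for the scalar error dynamics e_r_dot = A e_r + B F_d.
A = np.array([[-4.006]])
B = np.array([[-0.002]])
S = np.array([[1.0]])    # state weight
R = np.array([[10.0]])   # force weight

P = solve_continuous_are(A, B, S, R)     # Riccati solution, P > 0
kappa_e = -np.linalg.solve(R, B.T @ P)   # feedback gain: F_d = kappa_e @ e_r
```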
20-May-21 | 32Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force F_d, i.e., the minimum force that minimizes the position error e_r: F_d(t) = κ_e e_r(t).
RL solution (evaluating the LQR solution offline for all time is impossible, so the value is learned online):
The discounted cost function:
V_s(e_r(t)) = ∫ₜ^∞ γ^(τ−t) r(e_r(τ), F_d(τ)) dτ, where r(e_r, F_d) = e_rᵀ S e_r + F_dᵀ R F_d
The Hamilton-Jacobi-Bellman equation:
min_{F_d} [ V̇_s(e_r(t)) − γ V_s(e_r(t)) + r(t) ] = 0
The new algebraic Riccati equation:
e_rᵀ(t)(AᵀP + PA + S − γP)e_r(t) + 2e_rᵀ(t)PBF_d(t) + F_dᵀ(t)RF_d(t) = 0
The optimal desired force: F_d(t) = −H₂₂⁻¹H₂₁ e_r(t), with the quadratic value Q(e_r(t), F_d(t)) = V_s(e_r(t)) = XᵀHX
IRL approximation / Implementation procedure: an online solution, using off-policy RL to find the solution of the given new ARE.
1 2 3 4 5 6 7 8
20-May-21 | 33Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force F_d, i.e., the minimum force that minimizes the position error e_r: F_d(t) = κ_e e_r(t).
RL solution: minimize the temporal-difference error.
The approximator of the Q-value function: Q̂(e_r, F_d) = Σᵢ₌₁ᴺ ωᵢ φ̄ᵢ(e_r, F_d),
with normalized Gaussian features φ̄ᵢ(e_r, F_d) = φᵢ(e_r, F_d) / Σⱼ₌₁ᴺ φⱼ(e_r, F_d), where
φᵢ(e_r, F_d) = exp(−½ [e_r − c_i^r, F_d − c_i^d] Σᵢ⁻¹ [e_r − c_i^r, F_d − c_i^d]ᵀ)
The reward (utility) function: r(t) = e_rᵀ(t)Se_r(t) + F_dᵀ(t)RF_d(t)
The Q-function update law based on IRL:
Q(e_r(t), F_d(t)) = ∫ₜ^(t+Δt) γ^(τ−t) r(τ) dτ + γ^(Δt) Q(e_r(t + Δt), F_d(t + Δt))
The parameter update law adjusts ω(t) by normalized gradient descent on the temporal-difference error between the two sides of this equation.
IRL approximation
Implementation procedure
1 2 3 4 5 6 7 8
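A minimal sketch of the normalized Gaussian (RBF) Q-function approximator, with one gradient step toward a temporal-difference target. The number of centers, their placement, the shared width, and the step size are illustrative assumptions, not the dissertation's values.

```python
import numpy as np

# Normalized RBF approximator of Q(e_r, F_d) with a TD-style weight update.
rng = np.random.default_rng(0)
centers = rng.uniform(-1.0, 1.0, size=(25, 2))  # centers in the (e_r, F_d) plane
sigma = 0.5                                     # shared kernel width
w = np.zeros(25)                                # weights omega_i

def features(e_r, F_d):
    z = np.array([e_r, F_d])
    phi = np.exp(-0.5 * np.sum((z - centers) ** 2, axis=1) / sigma**2)
    return phi / np.sum(phi)                    # normalized, as in the update law

def q_value(e_r, F_d):
    return float(w @ features(e_r, F_d))

alpha, target = 0.1, 1.0                        # target stands in for the TD return
phi = features(0.2, -0.3)
w += alpha * (target - w @ phi) * phi           # gradient step on the TD error
```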
20-May-21 | 34Presenter Cong Phat Vo
Fig. 5.1 RL in unknown environments during the learning process
Fig. 5.2 Learning curves
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Parameters
LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
RL approximation: υ = 0.9, γ = 0.9, δ_Δ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
Position/Force control: K_Pf = 0.001, K_If = 0.02, λ = 0.2, η_Δ = 0.1, ρ₁ = 0.5, ρ₂ = 2.0, v = 9/11, α = 5.7
Environment: C_e = 0.5 N·s·mm⁻¹, K_e = 2 N·mm⁻¹, r₀ = 16.5 mm, x_d = 39.917 mm, x_e = 35 mm
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 35Presenter Cong Phat Vo
Fig. 5.3 Desired force/position tracking response
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Three steps are required to complete the experiment (LQR solution):
1. identification of the environment;
2. obtaining the ARE solution using the environmental estimate;
3. the complete experiment with the position/force controller.
IRL approximation: achieves near-optimal performance without knowledge of the environment dynamics.
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 36Presenter Cong Phat Vo
CONTENTS
Chapter 1: Introduction
Chapter 2: Modeling and platform setup based on the APAM configuration
Chapter 3: Adaptive Gain ITSMC with TDE scheme
Chapter 4: Adaptive Finite-Time Force Sensorless Control scheme
Chapter 5: Model-free control for Optimal contact force in Unknown Environment
Chapter 6: Optimized Human-Robot Interaction force control using IRL
Chapter 7: Force Sensorless Reflecting Control for BHTS
Chapter 8: Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 37Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
1. Find the optimal parameters of the impedance model to assure motion tracking and also assist the human with minimum effort to perform the task in real time.
2. The stability of the closed-loop control is theoretically demonstrated.
3. The unknown human dynamics, as well as the optimal overall human-robot system efficiency, is verified in experimental trials.
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 38Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Robot-specific inner loop and task-specific outer loop:
The control torque in task space combines a gain K_s acting on the tracking error, the NN compensation term, and robust bounding terms K_q and K_B.
An approximation of the continuous function uses an NN, f(q) = WᵀΦ(q), with online update laws for the weight estimates Ŵ driven by the tracking error (gains F, G, k).
The prescribed impedance model: θ_m/τ_h = K_h/(M_d s² + C_d s + K_d)
Block diagram: the Human Operator torque τ_h, scaled by K_h, drives the Prescribed Impedance Model to produce θ_d; the Torque control law with the NN functional approximation drives the haptic interface based on APAM to θ_m; the Optimal Law based on IRL updates the impedance parameters from the errors e_h and e_d.
Control design
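The prescribed impedance model can be simulated directly. A sketch using the parameter values listed on the simulation slide (M_d = 1, K_d = 15.76, K_h = 0.046), taking the damping coefficient C_d equal to the listed B_d = 7.18 as an assumption:

```python
import numpy as np
from scipy import signal

# Step response of theta_m(s)/tau_h(s) = K_h / (M_d s^2 + C_d s + K_d).
K_h, M_d, C_d, K_d = 0.046, 1.0, 7.18, 15.76
model = signal.TransferFunction([K_h], [M_d, C_d, K_d])

t = np.linspace(0.0, 5.0, 1001)
tau_h = np.ones_like(t)                    # unit-step human torque
_, theta_m, _ = signal.lsim(model, U=tau_h, T=t)
steady = float(theta_m[-1])                # settles near K_h / K_d
```

With these values the pair (C_d, K_d) gives a well-damped response, so the output settles close to the static gain K_h/K_d within the 5 s window.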
1 2 3 4 5 6 7 8
20-May-21 | 39Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Task-specific outer loop and robot-specific inner loop:
The human transfer function [99] (purpose: mapping τ_h → e_d) is taken as G(s) = k_p/(k_e s + 1), giving the human error dynamics ė_h = A_h e_h + B_h ē_d with A_h = −k_e⁻¹, B_h = k_e⁻¹ k_p, and ē_d = [e_dᵀ ė_dᵀ]ᵀ.
A new augmented state X stacks the impedance error and the human dynamics, Ẋ = AX + B u_e, with the feedback u_e = −KX.
(The block diagram is the same as on the previous slide.)
IRL based on LQR solution
Control design
1 2 3 4 5 6 7 8
20-May-21 | 40Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
The performance index (the infinite-horizon quadratic cost) to minimize:
J_m = ∫ₜ^∞ (e_dᵀ Q_d e_d + e_hᵀ Q_h e_h + u_eᵀ R u_e) dτ
RL solution: solve the algebraic Riccati equation [83],
0 = AᵀP + PA + Q − PBR⁻¹BᵀP,
where P is the real symmetric positive-definite solution; an optimal feedback control is
u_e = arg min_{u_e} V(t, X(t)) = −R⁻¹BᵀPX
IRL solution: the value function V(X(t)) = ∫ₜ^∞ Xᵀ(Q + KᵀRK)X dτ = XᵀPX satisfies (A − BK)ᵀP + P(A − BK) = −(Q + KᵀRK), and the IRL Bellman equation for a time interval ΔT > 0 is
V(X(t)) = ∫ₜ^(t+ΔT) Xᵀ(Q + KᵀRK)X dτ + V(X(t + ΔT)),
which can be written as
Xᵀ(t)PX(t) = ∫ₜ^(t+ΔT) Xᵀ(Q + KᵀRK)X dτ + Xᵀ(t + ΔT)PX(t + ΔT)
Proof of convergence: see Appendix 6.2.
Control design
1 2 3 4 5 6 7 8
20-May-21 | 41Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig. 6.1 Continuous-time linear policy iteration algorithm
1. Start with an admissible control input u_e⁰ = −K₁X (initialization: P⁰ = 0, i = 1, K₁ chosen so that A − BK₁ is stable).
2. Policy evaluation: given a control policy uⁱ, find Pⁱ from the off-policy IRL Bellman equation
Xᵀ(t)PⁱX(t) = ∫ₜ^(t+ΔT) Xᵀ(Q + (Kⁱ)ᵀRKⁱ)X dτ + Xᵀ(t + ΔT)PⁱX(t + ΔT),
solving for the cost over N samples by least squares.
3. Policy improvement: update the control policy as u_eⁱ⁺¹ = −R⁻¹BᵀPⁱX.
Set i → i + 1 and repeat until ‖pⁱ − pⁱ⁻¹‖ falls below a tolerance, then stop.
Online implementation
Control design
1 2 3 4 5 6 7 8
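The policy-iteration loop above can be sketched for scalar dynamics. This model-based (Kleinman-style) version replaces the data-driven least-squares evaluation with a Lyapunov solve, to show the fixed point the online IRL algorithm converges to; the scalar A, B, Q, R reuse the parameter-slide values and are illustrative here.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

# Policy iteration for LQR: evaluate the cost of the current gain K,
# then improve the policy, until the gain stops changing.
A = np.array([[-4.006]]); B = np.array([[-0.002]])
Q = np.array([[1.0]]);    R = np.array([[10.0]])
K = np.zeros((1, 1))      # admissible initial policy: A - B K is stable

for _ in range(20):
    Ac = A - B @ K
    # policy evaluation: (A - BK)^T P + P (A - BK) = -(Q + K^T R K)
    P = solve_continuous_lyapunov(Ac.T, -(Q + K.T @ R @ K))
    K_new = np.linalg.solve(R, B.T @ P)    # policy improvement
    if np.max(np.abs(K_new - K)) < 1e-12:
        K = K_new
        break
    K = K_new
```

At convergence, P and K satisfy the same algebraic Riccati equation that the offline LQR solution targets.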
20-May-21 | 42Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig. 6.2 The tracking error between the trajectory of the robot system and the prescribed impedance model
Fig. 6.3 Reference trajectory versus the state of the robot system, during and after learning
Parameters
LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
RL approximation: υ = 0.9, γ = 0.9, δ_Δ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
Position control: K_s = 20, F = 10, G = 1, μ₁ = μ₂ = 5, k = 0.1, Z = 100, κ_p = 20, κ_d = 10, k_e = 0.18 s
Impedance model: θ_d = 10 sin(0.2πt), M_d = 1 kg, B_d = 7.18 N·s·mm⁻¹, K_d = 15.76 N·mm⁻¹, K_h = 0.046
Simulation
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 43Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig. 6.4 The output trajectory and the impedance model in the inner loop
Fig. 6.5 The desired and output trajectories in the outer loop
Fig. 6.6 Interaction force
Experiment / Validation results
Discussion:
- Near-perfect performance of the human-robot cooperation
- A small deviation between the desired and actual trajectories
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller
1 2 3 4 5 6 7 8
20-May-21 | 44Presenter Cong Phat Vo
CONTENTS
Chapter 1: Introduction
Chapter 2: Modeling and platform setup based on the APAM configuration
Chapter 3: Adaptive Gain ITSMC with TDE scheme
Chapter 4: Adaptive Finite-Time Force Sensorless Control scheme
Chapter 5: Model-free control for Optimal contact force in Unknown Environment
Chapter 6: Optimized Human-Robot Interaction force control using IRL
Chapter 7: Force Sensorless Reflecting Control for BHTS
Chapter 8: Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 45Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed.
2. The stability and finite-time convergence characteristics of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified in different working conditions.
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 46Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
The external torque estimation: an adaptive force-sensorless observer (AFOB) reconstructs the operator and environment torques τ̂_md and τ̂_sd on the master and slave sides from delayed dynamics signals.
An adaptive FSOB gain: the observer gain κ_i adapts with the sliding variable s_i (growing while s_i ≠ 0, decaying otherwise) so that force sensing stays accurate without a physical sensor.
The modified FNTSM surface design: each side i ∈ {m, s} uses a fast nonsingular terminal sliding surface in the tracking error, with terminal exponent p_i/q_i.
The torque control design by FNTSMC: the control combines the model terms M_i, C_i, K_i with the reaching terms ρ_1i s_i and ρ_2i sign(s_i)^[ν_i] and the estimated torque.
Block diagram: Operator → Master Controller + Estimator → APAM Master system ↔ communication channel ↔ APAM Slave system ← Slave Controller + Estimator ← Environment, with an impedance filter shaping the reflected force.
Control design
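The FNTSM surface can be sketched as a function. The split of the surface gain into a separate linear gain and terminal gain, and the reuse of λ_i = 8 for both, are illustrative assumptions; p_i = 5 and q_i = 7 follow the parameter slide.

```python
import numpy as np

# A fast nonsingular terminal sliding surface of the common form
# s = e_dot + lam1*e + lam2*|e|**(p/q)*sign(e).
lam1, lam2, p, q = 8.0, 8.0, 5, 7

def fntsm_surface(e, e_dot):
    # Linear term dominates far from the origin; the fractional-power
    # terminal term dominates near it, giving finite-time convergence.
    return e_dot + lam1 * e + lam2 * np.sign(e) * np.abs(e) ** (p / q)

s_origin = fntsm_surface(0.0, 0.0)   # on the surface at the equilibrium
s_off = fntsm_surface(0.1, 0.0)      # positive error pushes s positive
```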
1 2 3 4 5 6 7 8
20-May-21 | 47Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig. 7.1 Torque estimation performances
Parameters
Proposed control: λ_i = 8, p_i = 5, q_i = 7, ς_i = 10⁻⁴, ν_i = 9/11, ε_i = 0.1, ρ_1i = 10, ρ_2i = 2
PID control: K_P = 38, K_I = 25, K_D = 0.15
Observers: for the NDO, L_h = L_e = 10; for the RTOB, β = 350 rad/s; for the AFOB, κ_i(0) = 0.1, λ_i = 10, η_i = 0, m = 0.1, a = 10
The RMSEs associated with the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.
Scenarios: the free-space situation, θ_d(t) = 5π(1 − 0.18t) sin(0.72πt); the contact-and-recovery situation, soft environment K_e = 500 N·mm⁻¹, B_e = 4 N·s·mm⁻¹, stiff environment K_e = 2800 N·mm⁻¹, B_e = 20 N·s·mm⁻¹
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 48Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig. 7.2 Position tracking performances
Fig. 7.3 Position tracking error performances
Fig. 7.4 Transparency performance in "soft" contact
Fig. 7.5 Transparency error in soft contact
Fig. 7.6 Transparency performance in "stiff" contact
Fig. 7.7 Transparency error in stiff contact
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 49Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Summary
- Finite-time convergence of the tracking errors
- Elimination of the uncertainties and noise effects in the obtained force sensing via the new AFOB
- Robust performance against uncertainties and disturbances in the system dynamics
- Stability of the overall system and transparency performance attained simultaneously
- Fast transient response
Great transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and in different working conditions.
1 2 3 4 5 6 7 8
20-May-21 | 50Presenter Cong Phat Vo
CONTENTS
Chapter 1: Introduction
Chapter 2: Modeling and platform setup based on the APAM configuration
Chapter 3: Adaptive Gain ITSMC with TDE scheme
Chapter 4: Adaptive Finite-Time Force Sensorless Control scheme
Chapter 5: Model-free control for Optimal contact force in Unknown Environment
Chapter 6: Optimized Human-Robot Interaction force control using IRL
Chapter 7: Force Sensorless Reflecting Control for BHTS
Chapter 8: Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 51Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Conclusions
- Verified an effective position-tracking controller via the AITSMC with the TDE technique under various working conditions (its uncertainties and disturbances are handled).
- Validated the significant effectiveness on finite-time force-tracking problems of the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position- and force-tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding highly effective position control.
- Achieved good transparency performance, with force feedback and position tracking attained simultaneously, by the new adaptive force estimation combined with the separate FNTSMC schemes.
1 2 3 4 5 6 7 8
20-May-21 | 52Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Future works
- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both the master and slave sides; note also that the computational complexity should be reduced with the data size.
- Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device for the rehabilitation field with a safer solution.
1 2 3 4 5 6 7 8
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
1 2 3 4 5 6 7 8
20-May-21 | 25Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
( )[ ]
1 20
( ) ( ( )) = + +t
v
f f fe e sign e d
( ) ( )[ 1]5
5
4
ˆ
+
minus= minus
v
signsign
1 1 [ ]
1 2
[ ]
4 5
ˆ( ) ( ) ( )
ˆ ( )
v
c d f f
v
u t d t K e sign e
sign
minus minus= + + +
+ +
1ˆ ˆ ˆ( ) ( ) ( ) ( )minus minus = minus minus minusd d c e dd t d t t u t t K t t
The FITSM surface is defined
The control law can be rewritten
2
2
ˆ ˆ ˆ2ˆ
+ += e e e e e e
e
e
An adaptive switching gain law An estimated lumped disturbance Force observer based-LeD [40]
APAM system
Force based
Time-delay
Estimator
Integral Terminal
Sliding SurfaceControl law
Adaptive law
Environment
Configuration
Reference
Trajectoryu
d
ˆe
d
+
+
The proposed control
approach
2k
Plant + Environment
Force Observer
e
Proof See in Appendix 41 ndash 43
Control design
1 2 3 4 5 6 7 8
20-May-21 | 26Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
for ITSMC η1 = 3 η2 = 15 η3 = 25 κ1 = κ3 = 10
κ2 = κ4 = 2 γ1 = γ2 = 57 k1 = 2 k2 = 55 v = 911
for adaptative gain μ = 10 δ = 005 σ = 01
for case study 3 KP = 15 KI = 18 KD = 005
for other case KP = 35 KI = 28 KD = 01
Proposed control
PID control
Fig 42 Performance response in Scenario 2
Scenario 1 τd (N) = 75sin(02πt- π2)+625
Scenario 2 τd (N) = 75sin(08πt- π2)+625
Scenario 3 A multistep trajectory with a LPFrsquos bandwidth
of 02(rads)
Observersfor FSOB ω = 500 rads ξ = 0001 α1 = 20 α2 = 12
α3 = 5 α4 = 15
Fig 42 Performance response in Scenario 1
Parameters
Experimental validation
1 2 3 4 5 6 7 8
20-May-21 | 27Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Fig 43 Performance response in Scenario 3
Fig 44 Control gain
Controller PID ISMC FITSMC-TDE Proposed
Sine 01 Hz 21050 12152 09440 05158
Sine 04 Hz 32054 18045 20632 19807
Multistep 26508 23548 20618 17685
Table 41 A RMSE comparison between the tracking error performances
Experimental validation
- Improved the control performances (fast response high
accuracy and robustness)
- Guaranteed the finite convergence
- Cancelling the lumped uncertainties via TDE solution and
the switching gain in the FITSMC online
Discussion
1 2 3 4 5 6 7 8
20-May-21 | 28Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
4
6
C
H
A
P
T
E
R
1
7
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Model-free control for Optimal contact force in Unknown Environment5
1 2 3 4 5 6 7 8
20-May-21 | 29Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
3 Verified effective for robot interaction control in
unknown environments
2 The stability of the closed-loop control is theoretically demonstrated
by the Lyapunov approach
The RL method is used to learn on-line
the optimal desired force1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 30Presenter Cong Phat Vo
( ) ( ) ( ( ))= minus + minuse e e eF t C x t K x x t
( ( )) ( )r dZ e t F t= [ ]
1 2
( ) ( ) ( )
( ) ( )
= + +
minus minus minus +
r r r
v
e
Me t Ce t Ke t
sign F t
The force control
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The environment is modeled as
a spring-damper system [81]
The optimal impedance model The position control in task space
( ) ( ) ( )
= + f Pf f If f
t
x s K e t K e d
Task-specific Outer-loopRobot-specific Inner-loop
Precise position
control
Prescribed
Impedance Model
Environment
Configuration
APAM system
xx
d
Force controlx
rx
fe
r
Fe
Fd
ef
-
+ -+ -
+
RL-based
approximation
ϕ
Control design
1 2 3 4 5 6 7 8
( ) ( ) ( )r r de t Ae t BF t= +
20-May-21 | 31Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
LQR solution
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
( ) ( )=d e rF t e t
The discounted cost function
( ) ( )( ) ( ) ( ) ( ) ( ) ( )
= minus +T T
s r d r r d dt
J e t F t e Se F RF d
The Algebraic Riccati equation
0+ + + =T T
eA PA S P A PB
The feedback control gain
( )1
1 minus
minus= minus +T T
e B PB R B PA
IRL approximationRL solution
Implementation procedure
1 2 3 4 5 6 7 8
offline soliton
20-May-21 | 32Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
LQR solution RL solution
impossible for all time
The Hamilton-Jacobi-Bellman equation
( ) ( ) ( )min ( ) ( ) ( ) 0
minus minus + minus =
d
t
s rtF
V e t r t r e dt
The new Algebraic Riccati equation
( )( ) ( )
2 ( ) ( ) ( ) ( ) 0
T T
r r
T T
r d d d
e t A P PA S P e t
e t PBF t F t RF t
+ + minus
+ + =
The optimal desired force 1
22 21( ) ( )d rF t H H e tminus= minus
( ) ( ) ( ) ( ) ( )= = T
r d s rQ e t F t V e t X HX
IRL approximation
The discounted cost function
( ) ( )
( )
( )( ) ( ) ( )
where ( ) ( ) ( ) ( ) ( ) ( )
t
s r r dt
T T
r d r r d d
V e t r e F e d
r e F e Se F RF
minus minus=
= +
Implementation procedure
1 2 3 4 5 6 7 8
( ) ( )=d e rF t e t
online soliton using the off-policy RL to
find the solution of the given new ARE
Implementation procedure: IRL approximation (minimize the temporal-difference error)

The approximator of the Q-value function:

  Q̂(e_r, F_d) = Σ_{i=1}^N θ_i ψ_i(e_r, F_d)

with Gaussian basis functions

  φ_i(e_r, F_d) = exp(-½ [e_r - c_i^r, F_d - c_i^d] Σ_i^{-1} [e_r - c_i^r, F_d - c_i^d]^T)

normalized as ψ_i = φ_i / Σ_{j=1}^N φ_j.

The reward (utility) function:

  r(t) = e_r^T(t) S e_r(t) + F_d^T(t) R F_d(t)

The Q-function update law based on IRL:

  Q(e_r(t), F_d(t)) = ∫_t^{t+Δt} e^{-γ(τ-t)} r(e_r(τ), F_d(τ)) dτ + e^{-γΔt} Q(e_r(t+Δt), F_d(t+Δt))

The parameter update law adjusts θ along the gradient of Q̂ so as to reduce the temporal-difference error (the extracted equation is garbled; schematically, θ̇ ∝ δ(t) ∂Q̂/∂θ, where δ is the TD error of the Q-function update above).
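The normalized-Gaussian approximator with a TD-style weight update can be sketched as follows. The centers, shared width, learning rate, and discount below are hypothetical choices for illustration, not the thesis values:

```python
import numpy as np

rng = np.random.default_rng(0)
centers = rng.uniform(-1, 1, size=(10, 2))   # c_i = (c_i^r, c_i^d), hypothetical
sigma = 0.5                                   # shared Gaussian width (assumed)
theta = np.zeros(10)                          # adjustable weights

def basis(e_r, F_d):
    """Normalized Gaussian basis psi_i = phi_i / sum_j phi_j."""
    x = np.array([e_r, F_d])
    phi = np.exp(-0.5 * np.sum((x - centers) ** 2, axis=1) / sigma ** 2)
    return phi / np.sum(phi)

def q_hat(e_r, F_d):
    return float(theta @ basis(e_r, F_d))

def td_update(e_r, F_d, reward, q_next, gamma=0.9, beta=0.1):
    """Move theta along the gradient of Q-hat to shrink the TD error."""
    global theta
    delta = reward + gamma * q_next - q_hat(e_r, F_d)
    theta = theta + beta * delta * basis(e_r, F_d)
    return delta

# two illustrative updates toward a fixed target: the error shrinks
d0 = td_update(0.2, 0.1, reward=1.0, q_next=0.0)
d1 = td_update(0.2, 0.1, reward=1.0, q_next=0.0)
```

Because the basis is normalized, each update moves Q̂ a bounded step toward the Bellman target, so |d1| < |d0|.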
Validation results: parameters

  LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = -4.006, B = -0.002
  RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
  Position/force control: K_Pf = 0.001, K_If = 0.02, λ = 0.2, ηΔ = 0.1, ρ_1 = 0.5, ρ_2 = 2.0, v = 9/11, α = 5/7
  Environment: C_e = 0.5 N·s·mm⁻¹, K_e = 2 N·mm⁻¹, r_0 = 16.5 mm, x_d = 39.917 mm, x_e = 35 mm

Fig 5.1 RL in unknown environments during the learning process
Fig 5.2 Learning curves
Validation results

Fig 5.3 Desired force/position tracking response

Three steps are required to complete the experiment:
1 - identification of the environment;
2 - obtaining the ARE solution from the environment estimate (LQR solution);
3 - the complete experiment with the position/force controller (IRL approximation).

The IRL approximation achieves near-optimal performance without knowledge of the environment dynamics.
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Proposed control idea
1. Find the optimal parameters of the impedance model to assure motion tracking and also assist the human with minimum effort to perform the task in real time.
2. The stability of the closed-loop control is theoretically demonstrated.
3. The unknown human dynamics, as well as the optimal overall human-robot system efficiency, are verified in experimental trials.
Control design: robot-specific inner loop and task-specific outer loop

The prescribed impedance model (from the human torque τ_h to the model output θ_m):

  θ_m(s) / τ_h(s) = K_h / (M_d s² + C_d s + K_d)

The control torque in task space (the extracted equation is garbled) combines a PID-like tracking term K_s(·) on the error e, the NN estimate Ŵ^T φ(q) of the unknown continuous dynamics, and an adaptive bounding term κ̂_B ‖Z‖, with adaptation laws for the estimates Ŵ and κ̂_B driven by the tracking error.

[Block diagram: Human Operator → Prescribed Impedance Model → torque control law → APAM-based haptic interface, with an NN functional approximation and an IRL-based optimal law closing the loops; signals τ_h, θ_d, θ_m, e_h, ξ, e_d.]
Control design: IRL based on the LQR solution

The human transfer function [99] (purpose: relate τ_h to e_d):

  G(s) = (κ_p + κ_d s) / (k_e s + 1)

written in state-space form as ξ̇_h = A_h ξ_h + B_h ē_d with A_h = -k_e^{-1}, B_h = k_e^{-1}[κ_p  κ_d], and ē_d = [e_d^T  ė_d^T]^T (reconstructed from the garbled extraction).

A new augmented state X = [ē_d^T  ξ_h^T]^T:

  Ẋ = A X + B u_e,  with A = [A_q  0; B_h  A_h],  B = [B_q^T  0]^T,  u_e = -K X
Control design: RL solution and IRL solution

The infinite-horizon quadratic cost (performance index) to minimize:

  J_m = ∫_t^∞ (e_d^T Q_d e_d + ξ_h^T Q_h ξ_h + u_e^T R u_e) dτ

With the augmented state Ẋ = A X + B u_e and u_e = -K X, solve the algebraic Riccati equation [83]:

  0 = A^T P + P A + Q - P B R^{-1} B^T P

An optimal feedback control:

  u_e = arg min_{u_e} V(t, X(t)) = -R^{-1} B^T P X

The IRL Bellman equation for a time interval ΔT > 0:

  V(X(t)) = ∫_t^{t+ΔT} X^T (Q + K^T R K) X dτ + V(X(t + ΔT))

Since V(X(t)) = ∫_t^∞ X^T (Q + K^T R K) X dτ = X^T P X, where P is the real symmetric positive-definite solution of the Lyapunov equation

  (A - BK)^T P + P (A - BK) = -(Q + K^T R K),

the IRL Bellman equation can be written as

  X^T(t) P X(t) = ∫_t^{t+ΔT} X^T (Q + K^T R K) X dτ + X^T(t+ΔT) P X(t+ΔT)

Proof of convergence: see Appendix 6.2.
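The equivalence between the infinite-horizon value X^T P X and the finite-window Bellman form can be checked numerically on a scalar closed-loop system, where the integral has a closed form. All numbers below are illustrative, not the thesis system:

```python
import numpy as np

# scalar closed-loop system dx/dt = (a - b*k) x  (illustrative values)
a, b, k = -1.0, 0.5, 2.0
q, r = 1.0, 0.5
a_c = a - b * k                       # stable closed-loop pole (a_c = -2)
p = -(q + r * k**2) / (2.0 * a_c)     # Lyapunov solution (A-BK)'P+P(A-BK)=-(Q+K'RK)

x0, dT = 1.3, 0.4
x_T = x0 * np.exp(a_c * dT)           # state after the interval
# finite-horizon cost over [t, t+dT], integrated in closed form
cost = (q + r * k**2) * x0**2 * (np.exp(2 * a_c * dT) - 1.0) / (2.0 * a_c)
lhs = p * x0**2                       # V(X(t)) = X'PX
rhs = cost + p * x_T**2               # IRL Bellman right-hand side
```

The two sides agree to machine precision, which is exactly what lets the IRL scheme identify P from data windows without knowing A.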
Online implementation

Fig 6.1 Continuous-time linear policy iteration algorithm

1. Initialization: start with an admissible control input u_e^0 = -K^1 X (P^0 = 0, i = 1, with A - BK^1 Hurwitz).
2. Policy evaluation: given the control policy u^i, find P^i from the off-policy IRL Bellman equation

  X^T(t) P^i X(t) = ∫_t^{t+ΔT} X^T (Q + (K^i)^T R K^i) X dτ + X^T(t+ΔT) P^i X(t+ΔT),

solving for the cost parameters p^i by least squares over N sampled intervals.
3. Policy improvement: update the control policy u_e^{i+1} = -R^{-1} B^T P^i X, set i → i+1, and repeat steps 2-3 until ‖p^i - p^{i-1}‖ < ε.
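The evaluate-then-improve loop above can be illustrated on a scalar system where the policy-evaluation step is available in closed form from the Lyapunov equation (in the algorithm of Fig 6.1 this step is instead done model-free, by least squares on measured data). The plant and cost weights below are placeholders:

```python
import numpy as np

# scalar plant dx/dt = a*x + b*u with cost integrand Q*x^2 + R*u^2 (illustrative)
a, b, Q, R = 1.0, 1.0, 1.0, 1.0

def evaluate_policy(k):
    """Policy evaluation: solve the scalar Lyapunov equation
    2*(a - b*k)*p = -(Q + R*k**2) for the cost p of u = -k*x."""
    a_c = a - b * k
    assert a_c < 0, "policy must be admissible (stabilizing)"
    return -(Q + R * k**2) / (2.0 * a_c)

k = 2.0                        # admissible initial gain: a - b*k = -1 < 0
for _ in range(20):            # iterate: evaluate, then improve
    p = evaluate_policy(k)
    k = b * p / R              # policy improvement: u = -R^{-1} b p x

# analytic LQR gain from the scalar CARE, for comparison
k_star = (a + np.sqrt(a**2 + b**2 * Q / R)) / b
```

The iterates converge quadratically (this is Kleinman's algorithm), so a handful of steps already reach the CARE solution.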
Validation results: simulation

Fig 6.2 The tracking error between the trajectory of the robot system and the prescribed impedance model
Fig 6.3 Reference trajectory versus the state of the robot system during and after learning

Parameters
  LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = -4.006, B = -0.002
  RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
  Position control: K_s = 20, F = 10, G = 1, μ_1 = μ_2 = 5, k = 0.1, Z̄ = 100, κ_p = 20, κ_d = 10, k_e = 0.18 s
  Impedance model: θ_d = 10 sin(0.2πt), M_d = 1 kg, B_d = 7.18 N·s·mm⁻¹, K_d = 15.76 N·mm⁻¹, K_h = 0.046
Validation results: experiment

Fig 6.4 The output trajectory and the impedance model in the inner loop
Fig 6.5 The desired trajectory and the output trajectory in the outer loop
Fig 6.6 Interaction force

Discussion
- Excellent performance of the human-robot cooperation.
- Only a small deviation between the desired and actual trajectories.
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller.
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Proposed control idea
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed.
2. The stability and finite-time convergence characteristics of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified under different working conditions.
Control design

- External torque estimation: an adaptive force-sensorless observer (AFOB) provides the estimate τ̂_id for each side i ∈ {m, s}; its gain κ_i adapts according to the sliding variable s_i (the extracted observer equations are garbled).
- The modified FNTSM surface s_i and the torque control law τ_ir, designed in the FNTSMC style, use the estimate τ̂_id together with the robust gains ρ_1i, ρ_2i and the fractional power ν_i (garbled in extraction).

[Block diagram (master-slave): Operator → Master Controller → APAM master system with estimator ↔ communication channel ↔ Slave Controller → APAM slave system with estimator → Environment, plus an impedance filter; signals τ_md, θ_m, θ_s, τ_sd, τ̂_sd, τ̂_md, τ_mf.]
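The AFOB equations above are too garbled by extraction to reproduce, but the underlying idea of estimating an external torque without a force sensor can be illustrated with a standard momentum-based observer on a 1-DoF joint. This is a generic observer sketch, not the thesis AFOB, and all parameters are hypothetical:

```python
# 1-DoF joint: J*ddtheta = tau - b*dtheta + tau_ext   (illustrative parameters)
J, b = 0.05, 0.2
K_o = 50.0                 # observer bandwidth (rad/s)
dt, T = 1e-4, 1.0

tau_ext_true = 0.3         # constant external torque to be estimated
theta_dot, integral, r = 0.0, 0.0, 0.0
tau = 0.0                  # motor torque held at zero for this test

for _ in range(int(T / dt)):
    # plant integration (explicit Euler)
    theta_ddot = (tau - b * theta_dot + tau_ext_true) / J
    theta_dot += theta_ddot * dt
    # momentum observer: dr/dt = K_o * (tau_ext - r), so r -> tau_ext
    p = J * theta_dot
    integral += (tau - b * theta_dot + r) * dt
    r = K_o * (p - integral)
```

The residual r converges to the true external torque with time constant 1/K_o, independently of the (slower) velocity transient.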
Validation results: parameters

  Proposed control: λ_i = 8, p_i = 5, q_i = 7, ς_i = 10⁻⁴, ν_i = 9/11, ε_i = 0.1, ρ_1i = 10, ρ_2i = 2
  PID control: K_P = 38, K_I = 25, K_D = 0.15
  Observers: for the NDO, L_h = L_e = 10; for the RTOB, β = 350 rad/s; for the AFOB, κ_i(0) = 0.1, λ_i = 10, η_i = 0, m = 0.1, a = 10

Scenarios
- Free space: θ_d(t) = 5π(1 - 0.18t) sin(0.72πt)
- Contact and recovery: soft environment K_e = 500 N·mm⁻¹, B_e = 4 N·s·mm⁻¹; stiff environment K_e = 2800 N·mm⁻¹, B_e = 20 N·s·mm⁻¹

Fig 7.1 Torque estimation performances. The RMSEs associated with the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.
Validation results

Fig 7.2 Position tracking performances and tracking errors; transparency performance and error in "soft" contact; transparency performance and error in "stiff" contact
Summary
- Finite-time convergence of the tracking errors.
- Fast transient response.
- Robust performance against uncertainties and disturbances in the system dynamics.
- The uncertainties and the noise effect in the obtained force sensing are eliminated via the new AFOB.
- Stability of the overall system and transparency performance are attained simultaneously.

Overall, excellent transparency performance and global stability of the BHTS are achieved under the proposed control scheme, despite the uncertainties and across different working conditions.
8 CONCLUSION AND FUTURE WORKS

Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions (uncertainties and disturbances are handled).
- Validated the effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures good position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding highly effective position control.
- Achieved good transparency performance, with both force feedback and position tracking attained simultaneously, by the new adaptive force estimation combined with separate FNTSMC schemes.
Future works
- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both master and slave sides; the computational complexity should also be reduced with the data size.
- Based on the prototype developed in this thesis, a high-quality commercial teleoperation device for the rehabilitation field can be developed with a safer solution.
Thank you for attending.
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Experimental validation: parameters

  Proposed control: for the ITSMC, η_1 = 3, η_2 = 15, η_3 = 25, κ_1 = κ_3 = 10, κ_2 = κ_4 = 2, γ_1 = γ_2 = 5/7, k_1 = 2, k_2 = 55, v = 9/11; for the adaptive gain, μ = 10, δ = 0.05, σ = 0.1
  PID control: for case study 3, K_P = 15, K_I = 18, K_D = 0.05; for the other cases, K_P = 35, K_I = 28, K_D = 0.1
  Observers: for the FSOB, ω = 500 rad/s, ξ = 0.001, α_1 = 20, α_2 = 12, α_3 = 5, α_4 = 15

Scenarios
- Scenario 1: τ_d (N) = 7.5 sin(0.2πt - π/2) + 6.25
- Scenario 2: τ_d (N) = 7.5 sin(0.8πt - π/2) + 6.25
- Scenario 3: a multistep trajectory with an LPF bandwidth of 0.2 rad/s

Fig 4.2 Performance responses in Scenarios 1 and 2
Experimental validation

Fig 4.3 Performance response in Scenario 3
Fig 4.4 Control gain

Table 4.1 An RMSE comparison of the tracking error performances

  Controller   | PID     | ISMC    | FITSMC-TDE | Proposed
  Sine 0.1 Hz  | 2.1050  | 1.2152  | 0.9440     | 0.5158
  Sine 0.4 Hz  | 3.2054  | 1.8045  | 2.0632     | 1.9807
  Multistep    | 2.6508  | 2.3548  | 2.0618     | 1.7685

Discussion
- Improved control performance (fast response, high accuracy, and robustness).
- Guaranteed finite-time convergence.
- The lumped uncertainties are cancelled via the TDE solution and the online switching gain in the FITSMC.
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Proposed control idea
1. The RL method is used to learn the optimal desired force online.
2. The stability of the closed-loop control is theoretically demonstrated by the Lyapunov approach.
3. Verified effective for robot interaction control in unknown environments.
Control design: robot-specific inner loop and task-specific outer loop

The environment is modeled as a spring-damper system [81]:

  F_e(t) = -C_e ẋ(t) + K_e (x_e - x(t))

The optimal impedance model, with Z(e_r(t)) = F_d(t) and force error e_f = F_d - F_e (reconstructed from the garbled extraction):

  M ë_r(t) + C ė_r(t) + K e_r(t) = -ρ_1 e_f - ρ_2 |e_f|^v sign(e_f) + F_e(t)

which, for control design, is reduced to the state-space form

  ė_r(t) = A e_r(t) + B F_d(t)

The position reference of the force control loop in task space:

  x_f(t) = K_Pf e_f(t) + K_If ∫_0^t e_f(τ) dτ

[Block diagram: the prescribed impedance model and the RL-based approximation in the task-specific outer loop generate the desired force F_d; the force control and precise position control in the robot-specific inner loop drive the APAM system against the environment; signals x, x_d, x_r, F_e, F_d, e_f, φ.]
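The outer force loop above can be sketched as a discrete simulation in which an integral-type position correction drives the contact force toward F_d. The inner position loop is idealized as perfect tracking, the environment damping is omitted, and all gains and geometry values are illustrative, not the thesis parameters:

```python
K_e, x_e = 2.0, 35.0        # environment stiffness and rest position (illustrative)
F_d = 5.0                   # desired contact force
K_pf, K_if, dt = 0.02, 0.5, 0.01
x_d = 36.0                  # nominal position command (inside the environment)

integ, F_e = 0.0, 0.0
for _ in range(5000):
    e_f = F_d - F_e                     # force error
    integ += K_if * e_f * dt            # integral of the force error
    x = x_d + K_pf * e_f + integ        # outer-loop position reference
    F_e = max(0.0, K_e * (x - x_e))     # spring contact force (damping omitted)
```

The integral action removes the steady-state force error: the position settles at x_e + F_d/K_e while the integrator holds the required offset from x_d.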
20-May-21 | 31Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
LQR solution
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
( ) ( )=d e rF t e t
The discounted cost function
( ) ( )( ) ( ) ( ) ( ) ( ) ( )
= minus +T T
s r d r r d dt
J e t F t e Se F RF d
The Algebraic Riccati equation
0+ + + =T T
eA PA S P A PB
The feedback control gain
( )1
1 minus
minus= minus +T T
e B PB R B PA
IRL approximationRL solution
Implementation procedure
1 2 3 4 5 6 7 8
offline soliton
20-May-21 | 32Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
LQR solution RL solution
impossible for all time
The Hamilton-Jacobi-Bellman equation
( ) ( ) ( )min ( ) ( ) ( ) 0
minus minus + minus =
d
t
s rtF
V e t r t r e dt
The new Algebraic Riccati equation
( )( ) ( )
2 ( ) ( ) ( ) ( ) 0
T T
r r
T T
r d d d
e t A P PA S P e t
e t PBF t F t RF t
+ + minus
+ + =
The optimal desired force 1
22 21( ) ( )d rF t H H e tminus= minus
( ) ( ) ( ) ( ) ( )= = T
r d s rQ e t F t V e t X HX
IRL approximation
The discounted cost function
( ) ( )
( )
( )( ) ( ) ( )
where ( ) ( ) ( ) ( ) ( ) ( )
t
s r r dt
T T
r d r r d d
V e t r e F e d
r e F e Se F RF
minus minus=
= +
Implementation procedure
1 2 3 4 5 6 7 8
( ) ( )=d e rF t e t
online soliton using the off-policy RL to
find the solution of the given new ARE
20-May-21 | 33Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
LQR solution RL solution
minimize temporal difference error
The approximator of the Q-value function
( )( ) ( )1
ˆ N
r d i r d i
i
Q e F e F =
=
The parameter update law
11( ) exp
2
T
i r d r i d i i r i d ie F e c F c e c F c minus = minus minus minus minus minus
( )
( ) ( ) ( ) ( )
ˆ ( ) ( ) ( )1( ) ( )
( )
=
= minus +
r d
t t t t
Q e t F t tt t
t t
IRL approximation
Q-function update law based on IRL
( )
( )
( )( ) ( ) ( )
( ) ( )
t tt
r dt
t
r d
Q e t F t r e d
e Q e t t F t t
+
minus minus
minus
=
+ + +
( ) ( ) ( ) ( ) ( )T T
r r d dr t e t Se t F t RF t= +
The reward function of the utility function
Implementation procedure
1 2 3 4 5 6 7 8
1
( )( )
( )
i r di r d N
i r di
e Fe F
e F
=
=
( ) ( )=d e rF t e t
20-May-21 | 34Presenter Cong Phat Vo
Fig 51 RL in unknown environments during the
learning process
Fig 52 Learning curves
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
κe = 00025 S = 1 R = 10 T = 0001 A = minus4006
B = minus0002
υ = 09 γ = 09 δΔ = 065 ε = 01 ε-decay = 099
K = 10 β = 01
LQR solution
RL aproximation
PositionForce controlKPf = 0001 KIf = 002 λ = 02 ηΔ = 01 ρ1 = 05
ρ2 = 20 v = 911 α = 57
EnvironmentCe = 05 Nsmm-1 Ke = 2 N mm-1 r0 = 165 mm
xd = 39917 mm xe = 35 mm
Parameters
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 35Presenter Cong Phat Vo
Fig 53 Desired forceposition tracking response
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Three steps are required to complete
the experiment
1-the identification of the environment
2-obtain the ARE solution using the
environmental estimation
3-the complete experiment for the
positionforce controller
IRL approximation
Achieving near-optimal performance without
knowledge of the environment dynamics
LQR solution
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 36Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
4
5
C
H
A
P
T
E
R
1
7
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Optimized Human-Robot Interaction force control using IRL6
1 2 3 4 5 6 7 8
20-May-21 | 37Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
3 The unknown human dynamics as well as the optimal overall
human-robot system efficiency is verified in experimental trials
2
Find the optimal parameters of the impedance model to
assure motion tracking and also assist the human with
minimum effort to perform the task in real time
The stability of the closed-loop control is theoretically
demonstrated
1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 38Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Task-specific Outer-loopRobot-specific Inner-loop
( )1 20
ˆ ( )
ˆ (|| || )
= + + +
+ + minus
t
s h h h
q B h h
K e e e d
K Z Z K
The control torque in task space
1 2
1 2 1
2 1 2
ˆ ˆˆ ( ) ( )
ˆ ˆˆ ˆ ˆ || ||
ˆˆ ˆ ˆ( ) || ||
=
= minus minus = minus
T T
T T T
h h h
T
h h
q W W q
W F F W q kF W
W Gq W kG W
2
h hm
d d d
K
M s C s K
=
+ +
An approximation of the continuous functionThe prescribed
impedance model
Prescribed
Impedance Model
Haptic interface
based-APAM
Torque
control lawHuman
Operator
Optimal Law-based
IRL
Kh+-
NN functional
approximation
1
s
θ +
-
τ θ d θ m eh
τ h
ξ
ed
Control design
1 2 3 4 5 6 7 8
20-May-21 | 39Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Task-specific Outer-loopRobot-specific Inner-loop
1
1
[ ]
minus
minus
= +
= minus
=
=
h h h h d
h e
h e p d
T T T
d d d
A B e
A k
B k
e e e
( )1
+=
+
p d
e
sG s
k s
The human transfer function [99] Purpose τh rarr ed
0
0
= + = +
= minus
q d qd e
e
h h hh e
A e Be X AX Buu
B A u KX
A new augmented state
Prescribed
Impedance Model
Haptic interface
based-APAM
Torque
control lawHuman
Operator
Optimal Law-based
IRL
Kh+-
NN functional
approximation
1
s
θ +
-
τ θ d θ m eh
τ h
ξ
ed
IRL based on LQR solution
Control design
1 2 3 4 5 6 7 8
20-May-21 | 40Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
The performance index
( )T T T
m d d d h h h e et
J e Q e Q u Ru d
= + +
10 T TA P PA Q PB R BPminus= + + minus
The algebraic Riccati equation [83]
0
0
q d qd
e
h h hh
e
A e BeX u
B A
u KX
= = +
= minus
A new augmented state
IRL solutionRL solution
An optimal feeback control
To minimize
solve
The infinite-horizon quadratic cost
Control design
The IRL Bellman equation for time interval ΔT gt 0
Proof of convergence See in Appendix 62
1 2 3 4 5 6 7 8
( ) 1
( )
arg min ( ) ( )e
T
e eu t
u V t X t u R B PX
minus
= = minus
( )7 7( ( )) ( ) ( ) ( ( ))t T
T T T
tV X t X Q K RK X d V X t T
+
= + minus +
( )7 ( ( )) ( ) ( )T T T T
tV X t X Q K RK X d X PX
= + =
( ) ( ) ( )T TA BK P P A BK Q K RKminus + minus = minus +
where P is the real symmetric positive definite solution
( )( ) ( ) = ( ) ( )
( ) ( )
t tT T T T
t
T
X t PX t X Q K RK X d
X t t PX t t
+
+
+ + +
The IRL Bellman equation can be written as
20-May-21 | 41Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 61 Continuous-time linear policy iteration algorithm
1 Start with an admissible control input
0 0
1= minuseu K X
2 Policy evaluation given a control policy ui find the
Pi using off-policy Bellman equation
1 1
1
+ minus= minusi T i
eu R B P X
3 Policy Improvement update the control policy
Online implementation
Control design
1 2 3 4 5 6 7 8
( )( )( ) ( )= ( ) ( )
( ) ( )
t t TT i T i i T
t
T i
X t P X t X Q K RK X d
X t t P X t t
+
+
+ + +
Begin
1 2
1 2
1
( ) ( )
( ) ( ) ( )
( )
N i i i
Ti i N i
i T
t t t
K K K
p
minus
= = minus +
=
=
Solving for the cost using least squares
1|| ||i ip p minusminus
0
1 10 1 ( ) 0P i K A BK= = minus
Initialization
1 1i T i
eu R B P Xminus minus= minus
Policy update
End
No
Yes
1i irarr +
20-May-21 | 42Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 62 The tracking error of the trajectory of the robot system and
the prescribed impedance model
Fig 63 Reference trajectory versus the state of the robot systems
during and after learning
κe = 00025 S = 1 R = 10 T = 0001
A = minus4006 B = minus0002
υ = 09 γ = 09 δΔ = 065 ε = 01
ε-decay = 099 K = 10 β = 01
LQR solution
RL aproximation
Position controlKs = 20 F = 10 G = 1 μ1 = μ2 = 5 k = 01
= 100 κp = 20 κd = 10 and ke = 018 s
θd = 10sin(02πt) Md = 1 Kg Bd = 718
Nsmm-1 Kd =1576 Nmm-1 Kh = 0046
Parameters
Z
Simulation
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 43Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 64 The output trajectory and impedance model in the inner loop
Fig 65 The desired trajectory and output trajectory the outer loop
Fig 66 Interaction force
Perfectly performance of the human-
robot cooperation
A little deviation between the desired
and actual trajectory
The human interaction force is reduced
after learning and the optimal set of the
prescribed impedance model is found by
the outer-loop controller
Experiment
Validation results
Discussion
1 2 3 4 5 6 7 8
20-May-21 | 44Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Conclusion and Future works
Introduction
7
4
5
6
C
H
A
P
T
E
R
1
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Force Sensorless Reflecting Control for BHTS7
1 2 3 4 5 6 7 8
20-May-21 | 45Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
3 The effectiveness of the proposed control method is verified in
the different working conditions
2The stability and finite-time convergence characteristic of the closed-
loop control are theoretically analyzed by the Lyapunov approach
The first time the hybrid control based
on a fast finite-time NTSMC and AFOB1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 46Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
( )1 2id = minus +
1 1
2
= minus + = minus +
id
T
i i id i
T
i ia = minus
The external torque estimation
[ ]
1 2ˆ( ) = + + minus minus minus +iv
i i ir i i i i i i i i i i idM C K sign s s s
( )i i i is = +
[ ]
2
1 2
if 0 or 0 and ( )
( ) if 0 and
i
i i i i i i
i i
i i i i i i i i
s s
z z sign s
= =
+
An adaptive FSOB gain
where
where
The torque control design by style-FNTSMC
The modified FNTSM surface design
Master Slave
Environment
APAM Slave
system
Slave
Controller
Estimator
Oper
ator
Master
Controller
APAM Master
system
Estimator
Impendence
filter
+
++
-
+
-
+
+md
m s
s
sd
ˆsdˆmd
mf
m
d
Control design
1 2 3 4 5 6 7 8
20-May-21 | 47Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 71 Torque estimation performances
λi = 8 pi = 5 qi = 7 ςi = 10-4 νi = 911
εi = 01 ρ1i = 10 ρ2i = 2
Proposed control
PID control KP = 38 KI = 25 KD = 015
Observersfor NDO Lh = Le = 10 for RTOB β = 350 rads
for AFOB κi(0) = 01 λi = 10 ηi = 0 m = 01
a = 10
RMSEs associated with NDO RTOB and proposed algorithm
are calculated of 0211 0192 and 0076 respectively
Parameters
Scenarios- The free space situation θd (t) = 5π(1 - 018t)sin(072πt)
- The contact and recovery situation soft environment Ke = 500 Nmm-1 Be = 4 Nsmm-1
stiff environment Ke = 2800 Nmm-1 Be = 20 Nsmm-1
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 48Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 72 Position tracking performances
Fig 72 Position tracking error performances
Fig 72 Transparency performance in ldquosoftrdquo
Fig 72 Transparency error in soft contract
Fig 72 Transparency performance in ldquostiffrdquo
Fig 72 Transparency error in stiff contract
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 49Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Finite-time convergence of the
tracking errors
Eliminating the uncertainties the
noise effect in the obtained force
sensing via new AFOB
Robust performance against uncertainties
and disturbances in the system dynamics
Stability of the overall system and the
transparency performance can be
attained simultaneously
Fast transient response
Summary
The great transparency performance and global stability of the BHTS under the proposed
control scheme are achieved despite the uncertainties and in different working conditions
1 2 3 4 5 6 7 8
20-May-21 | 50Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Introduction
4
5
6
C
H
A
P
T
E
R
1
7
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Conclusion and Future works78
1 2 3 4 5 6 7 8
20-May-21 | 51Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Verified an effective position tracking controller via the AITSMC with TDE technique under
various working conditions (Its uncertainties and disturbance are handled)
Validated the significant effectiveness of finite-time force tracking problems based on the
FITSMC combined with a new adaptive gain and a friction-free disturbance (FSOB)
Based on the online learning ability of the RL approximation method the model-free controller
for optimal contact force ensures great position and force tracking performance
Optimized a prescribed impedance model parameters by the IRL approach to provide little
operator information in yielding highly effective position control purpose
Achieved good transparency performance with both force feedback and position tracking
simultaneously by the new adaptive force estimation combined the separately FNTSMC schemes
Conclutions
1 2 3 4 5 6 7 8
20-May-21 | 52Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Future works
To improve the control performance in joint-space the robot system model can be
improved by robust identification approaches
Effectiveness of the proposed algorithm can be further increased by developing an automatic
tune method for both the estimation and control gains
Apply optimized strategies on both master and slave sides Note also that the computational
complexity should be reduced by the data size
Based on the developed prototype in this thesis it is possible to develop a high-quality
commercial teleoperation device in the rehabilitation field with a safer solution
1 2 3 4 5 6 7 8
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
1 2 3 4 5 6 7 8
20-May-21 | 27Presenter Cong Phat Vo
4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME
Fig 43 Performance response in Scenario 3
Fig 44 Control gain
Controller PID ISMC FITSMC-TDE Proposed
Sine 01 Hz 21050 12152 09440 05158
Sine 04 Hz 32054 18045 20632 19807
Multistep 26508 23548 20618 17685
Table 41 A RMSE comparison between the tracking error performances
Experimental validation
- Improved the control performances (fast response high
accuracy and robustness)
- Guaranteed the finite convergence
- Cancelling the lumped uncertainties via TDE solution and
the switching gain in the FITSMC online
Discussion
1 2 3 4 5 6 7 8
20-May-21 | 28Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
4
6
C
H
A
P
T
E
R
1
7
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Model-free control for Optimal contact force in Unknown Environment5
1 2 3 4 5 6 7 8
20-May-21 | 29Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
3 Verified effective for robot interaction control in
unknown environments
2 The stability of the closed-loop control is theoretically demonstrated
by the Lyapunov approach
The RL method is used to learn on-line
the optimal desired force1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 30Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Control design

The environment is modeled as a spring-damper system [81]:
    F_e(t) = -C_e \dot{x}(t) + K_e (x_e - x(t))

The force control follows the optimal impedance model, written on the augmented vector Z(e_r(t)) = [e_r(t), F_d(t)]^T:
    M \ddot{e}_r(t) + C \dot{e}_r(t) + K e_r(t) = -\rho_1 s - \rho_2 |s|^{\upsilon} \mathrm{sign}(s) + F_e(t)

The position control in task space:
    x_f(s) = K_{Pf} e_f(t) + K_{If} \int_0^t e_f(\tau) \, d\tau

The tracking-error dynamics:
    \dot{e}_r(t) = A e_r(t) + B F_d(t)

[Block diagram: a robot-specific inner loop (precise position control of the APAM system) and a task-specific outer loop (prescribed impedance model, environment configuration, and RL-based approximation) that generates the desired force F_d.]
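The environment model and the tracking-error dynamics above can be sketched numerically. The environment parameters below follow the later parameter slide with the stripped decimal points restored (C_e = 0.5, K_e = 2, x_e = 35 mm); A and B are illustrative stand-ins, not the identified plant:

```python
import numpy as np

def environment_force(x, x_dot, Ce=0.5, Ke=2.0, xe=35.0):
    """Spring-damper environment model: F_e = -Ce*x_dot + Ke*(xe - x)."""
    return -Ce * x_dot + Ke * (xe - x)

def error_step(er, Fd, A=-4.0, B=-0.002, dt=1e-3):
    """One Euler step of the tracking-error dynamics e_r_dot = A*e_r + B*F_d."""
    return er + dt * (A * er + B * Fd)

# no deflection and no velocity -> zero environment force
f0 = environment_force(35.0, 0.0)

# with A < 0 and Fd = 0, the tracking error decays toward zero
er = 1.0
for _ in range(5000):          # 5 s of simulated time
    er = error_step(er, 0.0)
```

Running the decay loop shows the open-loop error dynamics are stable for A < 0, which is what makes the later LQR/RL gain search well-posed.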
20-May-21 | 31Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Implementation procedure: LQR solution (offline solution)

The main objective of this work is to determine the optimal desired force F_d, i.e., the minimum force that minimizes the position error e_r:
    F_d(t) = \kappa_e e_r(t)

The discounted cost function:
    J_s(e_r(t), F_d(t)) = \int_t^{\infty} e^{-\gamma(\tau - t)} (e_r^T S e_r + F_d^T R F_d) \, d\tau

The Algebraic Riccati Equation (ARE):
    A^T P A + S - P + A^T P B \kappa_e = 0

The feedback control gain:
    \kappa_e = -(B^T P B + R)^{-1} B^T P A
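As a sketch of how this offline step can be carried out, the ARE and the gain \kappa_e above can be iterated to a fixed point. The scalar A, B, S, R values below are illustrative only, not the identified system:

```python
import numpy as np

def solve_dare(A, B, S, R, iters=500):
    """Fixed-point iteration on the discrete ARE:
    P = A'PA + S - A'PB (B'PB + R)^-1 B'PA."""
    P = np.zeros_like(S)
    for _ in range(iters):
        BtPB = B.T @ P @ B + R
        P = A.T @ P @ A + S - A.T @ P @ B @ np.linalg.solve(BtPB, B.T @ P @ A)
    # feedback gain kappa_e = -(B'PB + R)^-1 B'PA
    K = -np.linalg.solve(B.T @ P @ B + R, B.T @ P @ A)
    return P, K

A = np.array([[0.9]]); B = np.array([[0.1]])
S = np.array([[1.0]]); R = np.array([[1.0]])
P, K = solve_dare(A, B, S, R)

# residual of the slide's ARE: A'PA + S - P + A'PB*kappa_e
res = A.T @ P @ A + S - P + A.T @ P @ B @ K
```

The residual vanishing confirms that the iterated P and the gain K jointly satisfy the equation pair shown on the slide.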
20-May-21 | 32Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force F_d, i.e., the minimum force that minimizes the position error e_r: F_d(t) = \kappa_e e_r(t)

Implementation procedure: RL solution (the LQR solution is impossible for all time)

The discounted cost function:
    V_s(e_r(t)) = \int_t^{\infty} e^{-\gamma(\tau - t)} r(e_r(\tau), F_d(\tau)) \, d\tau,
    where r(e_r, F_d) = e_r^T S e_r + F_d^T R F_d

The Hamilton-Jacobi-Bellman equation:
    \min_{F_d} [ \dot{V}_s(e_r(t)) - \gamma V_s(e_r(t)) + r(t) ] = 0

The new Algebraic Riccati equation:
    e_r^T(t) (A^T P + P A + S - \gamma P) e_r(t) + 2 e_r^T(t) P B F_d(t) + F_d^T(t) R F_d(t) = 0

The Q-value function, with X = [e_r, F_d]^T:
    Q(e_r(t), F_d(t)) = V_s(e_r(t)) = X^T H X

The optimal desired force:
    F_d(t) = -H_{22}^{-1} H_{21} e_r(t)

IRL approximation: an online solution using off-policy RL to find the solution of the given new ARE.
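The optimal desired force F_d = -H_22^{-1} H_21 e_r is simply the minimizer of the quadratic Q-function in its second argument. A quick numerical check with a hypothetical positive-definite kernel H (not the thesis's learned kernel):

```python
import numpy as np

# hypothetical symmetric positive-definite kernel for X = [e_r, F_d]
H = np.array([[5.0, 1.2],
              [1.2, 2.0]])
H21, H22 = H[1, 0], H[1, 1]

def optimal_Fd(er):
    """Minimizer of Q(X) = X'HX over F_d: F_d = -H22^-1 H21 e_r."""
    return -H21 / H22 * er

def Q(er, Fd):
    X = np.array([er, Fd])
    return X @ H @ X

er = 0.7
Fd_star = optimal_Fd(er)
```

Perturbing F_d on either side of Fd_star only increases Q, confirming the stationary point is the minimum when H22 > 0.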
20-May-21 | 33Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force F_d, i.e., the minimum force that minimizes the position error e_r: F_d(t) = \kappa_e e_r(t)

Implementation procedure: IRL approximation (minimizing the temporal-difference error)

The approximator of the Q-value function:
    \hat{Q}(e_r, F_d) = \sum_{i=1}^{N} \theta_i \bar{\phi}_i(e_r, F_d)
with Gaussian basis functions
    \phi_i(e_r, F_d) = \exp( -\frac{1}{2} [e_r - c_i^r, F_d - c_i^d] \Sigma_i^{-1} [e_r - c_i^r, F_d - c_i^d]^T )
normalized as
    \bar{\phi}_i(e_r, F_d) = \phi_i(e_r, F_d) / \sum_{i=1}^{N} \phi_i(e_r, F_d)

The parameter update law: the weights \theta are adapted along the normalized gradient of the temporal-difference error between \hat{Q}(e_r(t), F_d(t)) and its IRL target.

Q-function update law based on IRL:
    Q(e_r(t), F_d(t)) = \int_t^{t+\Delta t} e^{-\gamma(\tau - t)} r(\tau) \, d\tau + e^{-\gamma \Delta t} Q(e_r(t+\Delta t), F_d(t+\Delta t))

The reward (utility) function:
    r(t) = e_r^T(t) S e_r(t) + F_d^T(t) R F_d(t)
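A minimal sketch of the normalized-Gaussian Q-function approximator follows. The centers are hypothetical, and a plain gradient step on the squared TD error stands in for the thesis's normalized update law:

```python
import numpy as np

# hypothetical RBF centers over the (e_r, F_d) plane
centers = np.array([[-1.0, -1.0], [0.0, 0.0], [1.0, 1.0]])
Sigma_inv = np.eye(2)           # shared inverse covariance
theta = np.zeros(len(centers))  # parameters theta_i

def phi(er, Fd):
    """Normalized Gaussian features: phi_i / sum_j phi_j."""
    d = np.array([er, Fd]) - centers
    g = np.exp(-0.5 * np.einsum('ij,jk,ik->i', d, Sigma_inv, d))
    return g / g.sum()

def Q_hat(er, Fd):
    return theta @ phi(er, Fd)

def td_update(er, Fd, target, alpha=0.5):
    """Gradient step on the squared TD error (Q_hat - target)^2."""
    global theta
    theta -= alpha * (Q_hat(er, Fd) - target) * phi(er, Fd)

# drive Q_hat(0, 0) toward a fixed target value of 1.0
for _ in range(200):
    td_update(0.0, 0.0, 1.0)
```

Because the features are normalized, the update stays well-scaled regardless of how many basis functions are active at a given state.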
20-May-21 | 34Presenter Cong Phat Vo
Fig. 5.1. RL in unknown environments during the learning process
Fig. 5.2. Learning curves

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Parameters
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = -4.006, B = -0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position/Force control: KPf = 0.001, KIf = 0.02, λ = 0.2, ηΔ = 0.1, ρ1 = 0.5, ρ2 = 2.0, v = 9/11, α = 5.7
- Environment: Ce = 0.5 Ns·mm⁻¹, Ke = 2 N·mm⁻¹, r0 = 16.5 mm, xd = 39.917 mm, xe = 35 mm

Validation results
20-May-21 | 35Presenter Cong Phat Vo
Fig. 5.3. Desired force/position tracking response

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Validation results

Three steps are required to complete the experiment:
1. identification of the environment;
2. obtaining the ARE solution using the environmental estimate (LQR solution);
3. running the complete experiment for the position/force controller (IRL approximation).

The IRL approximation achieves near-optimal performance without knowledge of the environment dynamics.
20-May-21 | 36Presenter Cong Phat Vo
CONTENTS

1. Introduction
2. Modeling and platform setup based on the APAM configuration
3. Adaptive Gain ITSMC with TDE scheme
4. Adaptive Finite-Time Force Sensorless Control scheme
5. Model-free control for Optimal contact force in Unknown Environment
6. Optimized Human-Robot Interaction force control using IRL
7. Force Sensorless Reflecting Control for BHTS
8. Conclusion and Future works
20-May-21 | 37Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Proposed control idea

1. Find the optimal parameters of the impedance model to assure motion tracking and to assist the human in performing the task in real time with minimum effort.
2. The stability of the closed-loop control is theoretically demonstrated.
3. The handling of unknown human dynamics and the optimal overall human-robot system efficiency are verified in experimental trials.
20-May-21 | 38Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design (Robot-specific Inner loop / Task-specific Outer loop)

The prescribed impedance model:
    \theta_m / \tau_h = K_h / (M_d s^2 + C_d s + K_d)

An approximation of the continuous function by an NN:
    \hat{\xi}(q) = \hat{W}_1^T \Phi_1(q) + \hat{W}_2^T \Phi_2(q)
with weight update laws combining gradient terms and damping terms of the form -k F ||e_h|| \hat{W}_1 and -k G ||e_h|| \hat{W}_2 to keep the estimates bounded.

The control torque in task space combines a tracking term K_s (\dot{e}_h + \mu_1 e_h + \mu_2 \int_0^t e_h \, d\tau), the NN compensation \hat{\xi}(q), and a robust bounding term K_B (||\hat{Z}|| + \bar{Z}).

[Block diagram: the human operator and the APAM-based haptic interface form the inner loop under the torque control law with NN functional approximation; the prescribed impedance model and the IRL-based optimal law form the outer loop that shapes e_d from \tau_h.]
20-May-21 | 39Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design (IRL based on an LQR solution)

The human transfer function [99] (purpose: map \tau_h to e_d):
    G(s) = (\kappa_p + \kappa_d s) / (k_e s + 1)
realized in state-space form as
    \dot{\xi}_h = A_h \xi_h + B_h \bar{e}_d,
    A_h = -k_e^{-1},  B_h = k_e^{-1} [\kappa_p, \kappa_d],  \bar{e}_d = [e_d, \dot{e}_d]^T

A new augmented state X = [e_q, \xi_h]^T:
    \dot{X} = A X + B u_e,  u_e = -K X
with the block matrices A = [[A_q, 0], [B_h, A_h]] and B = [B_q, 0]^T.

[Block diagram as on the previous slide: the inner torque-control loop with NN functional approximation; the outer loop with the prescribed impedance model and the IRL-based optimal law.]
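The human transfer function G(s) = (κp + κd s)/(ke s + 1) can be simulated directly with the values from the later simulation slide (κp = 20, κd = 10, ke = 0.18 s, decimals restored). The first-order splitting below is one possible realization, not necessarily the one used in the thesis:

```python
import numpy as np

def human_response(u, dt=1e-3, kp=20.0, kd=10.0, ke=0.18):
    """Simulate G(s) = (kp + kd*s)/(ke*s + 1), split as the direct
    feedthrough kd/ke plus a first-order lag (kp - kd/ke)/(ke*s + 1)."""
    x = 0.0                      # state of the first-order lag
    y = np.empty_like(u)
    for n, un in enumerate(u):
        x += dt * (un - x) / ke  # ke*x_dot = u - x
        y[n] = (kd / ke) * un + (kp - kd / ke) * x
    return y

u = np.ones(5000)                # unit step over 5 s
y = human_response(u)
```

The step response settles at the DC gain G(0) = κp, i.e., the proportional pathway dominates once the ke-second transient has died out.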
20-May-21 | 40Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design (RL solution / IRL solution)

To minimize the infinite-horizon quadratic cost (the performance index)
    J_m = \int_t^{\infty} ( e_d^T Q_d e_d + \xi_h^T Q_h \xi_h + u_e^T R u_e ) \, d\tau
over the augmented state dynamics \dot{X} = A X + B u_e with u_e = -K X, solve the algebraic Riccati equation [83]:
    0 = A^T P + P A + Q - P B R^{-1} B^T P

An optimal feedback control:
    u_e = \arg\min_{u_e} V(t, X(t), u) = -R^{-1} B^T P X

The IRL Bellman equation for a time interval \Delta T > 0:
    V(X(t)) = \int_t^{t+\Delta T} X^T (Q + K^T R K) X \, d\tau + V(X(t+\Delta T))

With the quadratic value function
    V(X(t)) = \int_t^{\infty} X^T (Q + K^T R K) X \, d\tau = X^T P X,
where P is the real symmetric positive-definite solution of the Lyapunov equation
    (A - B K)^T P + P (A - B K) = -(Q + K^T R K),
the IRL Bellman equation can be written as
    X^T(t) P X(t) = \int_t^{t+\Delta T} X^T (Q + K^T R K) X \, d\tau + X^T(t+\Delta T) P X(t+\Delta T)

Proof of convergence: see Appendix 6.2.
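The IRL Bellman identity above is easy to verify numerically on a scalar closed loop: with P taken from the Lyapunov equation, the running cost over [t, t+ΔT] plus the terminal term X(t+ΔT)'PX(t+ΔT) reproduces X(t)'PX(t). All numbers below are illustrative:

```python
import numpy as np

# scalar closed-loop system x_dot = (a - b*k)*x with quadratic cost
a, b, k, Q, R = -1.0, 1.0, 0.5, 1.0, 1.0
a_cl = a - b * k                       # stable closed loop
P = -(Q + k * k * R) / (2.0 * a_cl)    # Lyapunov: 2*a_cl*P = -(Q + k^2 R)

def cost_integral(x0, T, dt=1e-5):
    """Integrate x'(Q + K'RK)x along the closed-loop trajectory."""
    t = np.arange(0.0, T, dt)
    x = x0 * np.exp(a_cl * t)
    return np.sum(x * x * (Q + k * k * R)) * dt

x0, dT = 2.0, 0.3
lhs = x0 * x0 * P                      # X(t)' P X(t)
x_dT = x0 * np.exp(a_cl * dT)
rhs = cost_integral(x0, dT) + x_dT * x_dT * P
```

The two sides agree to the accuracy of the Riemann sum, which is exactly what lets the IRL scheme estimate P from trajectory data without knowing A.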
20-May-21 | 41Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Online implementation (Control design)

Fig. 6.1. Continuous-time linear policy iteration algorithm

1. Start with an admissible control input u_e^0 = -K_1^0 X.
2. Policy evaluation: given a control policy u^i, find P^i from the off-policy Bellman equation
       X^T(t) P^i X(t) = \int_t^{t+\Delta T} X^T (Q + (K^i)^T R K^i) X \, d\tau + X^T(t+\Delta T) P^i X(t+\Delta T),
   solving for the cost by least squares over N sampled intervals.
3. Policy improvement: update the control policy
       u^{i+1} = -R^{-1} B^T P^i X.

Flow of the algorithm: Begin; initialization (P^0 = 0, i = 1, K^1 chosen so that A - B K^1 is stable); policy evaluation by least squares; policy update u^i = -R^{-1} B^T P^{i-1} X; if ||p^i - p^{i-1}|| is below a tolerance, End, otherwise set i to i + 1 and repeat.
20-May-21 | 42Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig. 6.2. Tracking error between the trajectory of the robot system and the prescribed impedance model
Fig. 6.3. Reference trajectory versus the state of the robot system during and after learning

Parameters (Simulation)
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = -4.006, B = -0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: Ks = 20, F = 10, G = 1, μ1 = μ2 = 5, k = 0.1, Z̄ = 100, κp = 20, κd = 10, ke = 0.18 s
- Impedance model: θd = 10 sin(0.2πt), Md = 1 kg, Bd = 7.18 Ns·mm⁻¹, Kd = 15.76 N·mm⁻¹, Kh = 0.046

Validation results
20-May-21 | 43Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig. 6.4. The output trajectory and the impedance model in the inner loop
Fig. 6.5. The desired trajectory and the output trajectory in the outer loop
Fig. 6.6. Interaction force

Validation results (Experiment)

Discussion
- Excellent performance of the human-robot cooperation.
- Only a small deviation between the desired and actual trajectories.
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller.
20-May-21 | 44Presenter Cong Phat Vo
CONTENTS

1. Introduction
2. Modeling and platform setup based on the APAM configuration
3. Adaptive Gain ITSMC with TDE scheme
4. Adaptive Finite-Time Force Sensorless Control scheme
5. Model-free control for Optimal contact force in Unknown Environment
6. Optimized Human-Robot Interaction force control using IRL
7. Force Sensorless Reflecting Control for BHTS
8. Conclusion and Future works
20-May-21 | 45Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Proposed control idea

1. For the first time, a hybrid control scheme based on a fast finite-time NTSMC and an AFOB is proposed.
2. The stability and finite-time convergence of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified under different working conditions.
20-May-21 | 46Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Control design

The external torque estimation: the disturbance torque \hat{\tau}_{id} of each device (i = m for the master, i = s for the slave) is reconstructed by an adaptive observer driven by the estimation error, so no force/torque sensor is required.

The torque control design by the FNTSMC-style law:
    \tau_i = M_i \ddot{\theta}_{ir} + C_i \dot{\theta}_i + K_i \theta_i - [\hat{\kappa}_i \mathrm{sign}(s_i) + \rho_{1i} s_i + \rho_{2i} |s_i|^{\nu_i} \mathrm{sign}(s_i)] + \hat{\tau}_{id}

The modified FNTSM surface design:
    s_i = \dot{e}_i + \lambda_i \xi(e_i),
where \xi(e_i) is a terminal term e_i^{p_i/q_i} away from the origin and switches to a polynomial term z_1 e_i + z_2 e_i^2 \mathrm{sign}(e_i) near the origin to avoid singularity.

An adaptive FSOB gain: \hat{\kappa}_i grows while the sliding variable remains nonzero and is held otherwise, so the bound of the lumped disturbance need not be known in advance.

[Block diagram: master and slave APAM systems, each with its own controller and torque estimator, coupled through the communication channel and an impedance filter; the operator drives the master side while the environment acts on the slave side.]
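As a sketch of why a terminal exponent p/q < 1 yields finite-time convergence once the surface s_i = 0 is reached: integrating e_dot = -lam*e^(p/q) gives the finite settling time e0^(1-p/q)*q/(lam*(q-p)). The reported λi = 8, pi = 5, qi = 7 are used here generically:

```python
import numpy as np

def fntsm_surface(e, e_dot, lam=8.0, p=5, q=7):
    """Generic fast terminal sliding surface s = e_dot + lam*|e|^(p/q)*sign(e);
    an exponent p/q < 1 makes e converge in finite time once s = 0."""
    return e_dot + lam * np.sign(e) * np.abs(e) ** (p / q)

def slide_to_zero(e0, lam=8.0, p=5, q=7, dt=1e-5):
    """Integrate the s = 0 dynamics e_dot = -lam*sign(e)*|e|^(p/q)
    and return the time at which |e| drops below 1e-9."""
    e, t = e0, 0.0
    while abs(e) > 1e-9:
        e -= dt * lam * np.sign(e) * abs(e) ** (p / q)
        t += dt
    return t

t_num = slide_to_zero(1.0)
# closed-form settling time for e0 = 1: q / (lam * (q - p))
t_theory = 7 / (8.0 * 2)
```

A linear surface (p = q) would only give exponential, asymptotic decay; the fractional exponent is what produces the finite settling time.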
20-May-21 | 47Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig. 7.1. Torque estimation performances

Parameters
- Proposed control: λi = 8, pi = 5, qi = 7, ςi = 10⁻⁴, νi = 9/11, εi = 0.1, ρ1i = 10, ρ2i = 2
- PID control: KP = 38, KI = 25, KD = 0.15
- Observers: for the NDO, Lh = Le = 10; for the RTOB, β = 350 rad/s; for the AFOB, κi(0) = 0.1, λi = 10, ηi = 0, m = 0.1, a = 10

Scenarios
- Free-space situation: θd(t) = 5π(1 - 0.18t) sin(0.72πt)
- Contact-and-recovery situation: soft environment Ke = 500 N·mm⁻¹, Be = 4 Ns·mm⁻¹; stiff environment Ke = 2800 N·mm⁻¹, Be = 20 Ns·mm⁻¹

Validation results
The RMSEs associated with the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.
20-May-21 | 48Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Validation results

Fig. 7.2. Position tracking performances
Fig. 7.3. Position tracking error performances
Fig. 7.4. Transparency performance in "soft" contact
Fig. 7.5. Transparency error in soft contact
Fig. 7.6. Transparency performance in "stiff" contact
Fig. 7.7. Transparency error in stiff contact
20-May-21 | 49Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Summary
- Finite-time convergence of the tracking errors.
- Elimination of the uncertainties and noise effects in the obtained force sensing via the new AFOB.
- Robust performance against uncertainties and disturbances in the system dynamics.
- Stability of the overall system and transparency performance are attained simultaneously.
- Fast transient response.

Great transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and across different working conditions.
20-May-21 | 50Presenter Cong Phat Vo
CONTENTS

1. Introduction
2. Modeling and platform setup based on the APAM configuration
3. Adaptive Gain ITSMC with TDE scheme
4. Adaptive Finite-Time Force Sensorless Control scheme
5. Model-free control for Optimal contact force in Unknown Environment
6. Optimized Human-Robot Interaction force control using IRL
7. Force Sensorless Reflecting Control for BHTS
8. Conclusion and Future works
20-May-21 | 51Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions; uncertainties and disturbances are handled.
- Validated the effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures excellent position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding a highly effective position control.
- Achieved good transparency performance, with force feedback and position tracking attained simultaneously, by the new adaptive force estimation combined with separate FNTSMC schemes.
20-May-21 | 52Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Future works
- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both the master and slave sides; the computational complexity should also be reduced with respect to the data size.
- Based on the prototype developed in this thesis, it is possible to build a high-quality commercial teleoperation device for the rehabilitation field with a safer solution.
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
20-May-21 | 28Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
4
6
C
H
A
P
T
E
R
1
7
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Model-free control for Optimal contact force in Unknown Environment5
1 2 3 4 5 6 7 8
20-May-21 | 29Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
3 Verified effective for robot interaction control in
unknown environments
2 The stability of the closed-loop control is theoretically demonstrated
by the Lyapunov approach
The RL method is used to learn on-line
the optimal desired force1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 30Presenter Cong Phat Vo
( ) ( ) ( ( ))= minus + minuse e e eF t C x t K x x t
( ( )) ( )r dZ e t F t= [ ]
1 2
( ) ( ) ( )
( ) ( )
= + +
minus minus minus +
r r r
v
e
Me t Ce t Ke t
sign F t
The force control
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The environment is modeled as
a spring-damper system [81]
The optimal impedance model The position control in task space
( ) ( ) ( )
= + f Pf f If f
t
x s K e t K e d
Task-specific Outer-loopRobot-specific Inner-loop
Precise position
control
Prescribed
Impedance Model
Environment
Configuration
APAM system
xx
d
Force controlx
rx
fe
r
Fe
Fd
ef
-
+ -+ -
+
RL-based
approximation
ϕ
Control design
1 2 3 4 5 6 7 8
( ) ( ) ( )r r de t Ae t BF t= +
20-May-21 | 31Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
LQR solution
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
( ) ( )=d e rF t e t
The discounted cost function
( ) ( )( ) ( ) ( ) ( ) ( ) ( )
= minus +T T
s r d r r d dt
J e t F t e Se F RF d
The Algebraic Riccati equation
0+ + + =T T
eA PA S P A PB
The feedback control gain
( )1
1 minus
minus= minus +T T
e B PB R B PA
IRL approximationRL solution
Implementation procedure
1 2 3 4 5 6 7 8
offline soliton
20-May-21 | 32Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
LQR solution RL solution
impossible for all time
The Hamilton-Jacobi-Bellman equation
( ) ( ) ( )min ( ) ( ) ( ) 0
minus minus + minus =
d
t
s rtF
V e t r t r e dt
The new Algebraic Riccati equation
( )( ) ( )
2 ( ) ( ) ( ) ( ) 0
T T
r r
T T
r d d d
e t A P PA S P e t
e t PBF t F t RF t
+ + minus
+ + =
The optimal desired force 1
22 21( ) ( )d rF t H H e tminus= minus
( ) ( ) ( ) ( ) ( )= = T
r d s rQ e t F t V e t X HX
IRL approximation
The discounted cost function
( ) ( )
( )
( )( ) ( ) ( )
where ( ) ( ) ( ) ( ) ( ) ( )
t
s r r dt
T T
r d r r d d
V e t r e F e d
r e F e Se F RF
minus minus=
= +
Implementation procedure
1 2 3 4 5 6 7 8
( ) ( )=d e rF t e t
online soliton using the off-policy RL to
find the solution of the given new ARE
20-May-21 | 33Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
LQR solution RL solution
minimize temporal difference error
The approximator of the Q-value function
( )( ) ( )1
ˆ N
r d i r d i
i
Q e F e F =
=
The parameter update law
11( ) exp
2
T
i r d r i d i i r i d ie F e c F c e c F c minus = minus minus minus minus minus
( )
( ) ( ) ( ) ( )
ˆ ( ) ( ) ( )1( ) ( )
( )
=
= minus +
r d
t t t t
Q e t F t tt t
t t
IRL approximation
Q-function update law based on IRL
( )
( )
( )( ) ( ) ( )
( ) ( )
t tt
r dt
t
r d
Q e t F t r e d
e Q e t t F t t
+
minus minus
minus
=
+ + +
( ) ( ) ( ) ( ) ( )T T
r r d dr t e t Se t F t RF t= +
The reward function of the utility function
Implementation procedure
1 2 3 4 5 6 7 8
1
( )( )
( )
i r di r d N
i r di
e Fe F
e F
=
=
( ) ( )=d e rF t e t
20-May-21 | 34Presenter Cong Phat Vo
Fig 51 RL in unknown environments during the
learning process
Fig 52 Learning curves
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
κe = 00025 S = 1 R = 10 T = 0001 A = minus4006
B = minus0002
υ = 09 γ = 09 δΔ = 065 ε = 01 ε-decay = 099
K = 10 β = 01
LQR solution
RL aproximation
PositionForce controlKPf = 0001 KIf = 002 λ = 02 ηΔ = 01 ρ1 = 05
ρ2 = 20 v = 911 α = 57
EnvironmentCe = 05 Nsmm-1 Ke = 2 N mm-1 r0 = 165 mm
xd = 39917 mm xe = 35 mm
Parameters
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 35Presenter Cong Phat Vo
Fig 53 Desired forceposition tracking response
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Three steps are required to complete
the experiment
1-the identification of the environment
2-obtain the ARE solution using the
environmental estimation
3-the complete experiment for the
positionforce controller
IRL approximation
Achieving near-optimal performance without
knowledge of the environment dynamics
LQR solution
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 36Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
4
5
C
H
A
P
T
E
R
1
7
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Optimized Human-Robot Interaction force control using IRL6
1 2 3 4 5 6 7 8
20-May-21 | 37Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
3 The unknown human dynamics as well as the optimal overall
human-robot system efficiency is verified in experimental trials
2
Find the optimal parameters of the impedance model to
assure motion tracking and also assist the human with
minimum effort to perform the task in real time
The stability of the closed-loop control is theoretically
demonstrated
1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 38Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Task-specific Outer-loopRobot-specific Inner-loop
( )1 20
ˆ ( )
ˆ (|| || )
= + + +
+ + minus
t
s h h h
q B h h
K e e e d
K Z Z K
The control torque in task space
1 2
1 2 1
2 1 2
ˆ ˆˆ ( ) ( )
ˆ ˆˆ ˆ ˆ || ||
ˆˆ ˆ ˆ( ) || ||
=
= minus minus = minus
T T
T T T
h h h
T
h h
q W W q
W F F W q kF W
W Gq W kG W
2
h hm
d d d
K
M s C s K
=
+ +
An approximation of the continuous functionThe prescribed
impedance model
Prescribed
Impedance Model
Haptic interface
based-APAM
Torque
control lawHuman
Operator
Optimal Law-based
IRL
Kh+-
NN functional
approximation
1
s
θ +
-
τ θ d θ m eh
τ h
ξ
ed
Control design
1 2 3 4 5 6 7 8
20-May-21 | 39Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Task-specific Outer-loopRobot-specific Inner-loop
1
1
[ ]
minus
minus
= +
= minus
=
=
h h h h d
h e
h e p d
T T T
d d d
A B e
A k
B k
e e e
( )1
+=
+
p d
e
sG s
k s
The human transfer function [99] Purpose τh rarr ed
0
0
= + = +
= minus
q d qd e
e
h h hh e
A e Be X AX Buu
B A u KX
A new augmented state
Prescribed
Impedance Model
Haptic interface
based-APAM
Torque
control lawHuman
Operator
Optimal Law-based
IRL
Kh+-
NN functional
approximation
1
s
θ +
-
τ θ d θ m eh
τ h
ξ
ed
IRL based on LQR solution
Control design
1 2 3 4 5 6 7 8
20-May-21 | 40Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
The performance index
( )T T T
m d d d h h h e et
J e Q e Q u Ru d
= + +
10 T TA P PA Q PB R BPminus= + + minus
The algebraic Riccati equation [83]
0
0
q d qd
e
h h hh
e
A e BeX u
B A
u KX
= = +
= minus
A new augmented state
IRL solutionRL solution
An optimal feeback control
To minimize
solve
The infinite-horizon quadratic cost
Control design
The IRL Bellman equation for time interval ΔT gt 0
Proof of convergence See in Appendix 62
1 2 3 4 5 6 7 8
( ) 1
( )
arg min ( ) ( )e
T
e eu t
u V t X t u R B PX
minus
= = minus
( )7 7( ( )) ( ) ( ) ( ( ))t T
T T T
tV X t X Q K RK X d V X t T
+
= + minus +
( )7 ( ( )) ( ) ( )T T T T
tV X t X Q K RK X d X PX
= + =
( ) ( ) ( )T TA BK P P A BK Q K RKminus + minus = minus +
where P is the real symmetric positive definite solution
( )( ) ( ) = ( ) ( )
( ) ( )
t tT T T T
t
T
X t PX t X Q K RK X d
X t t PX t t
+
+
+ + +
The IRL Bellman equation can be written as
20-May-21 | 41Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 61 Continuous-time linear policy iteration algorithm
1 Start with an admissible control input
0 0
1= minuseu K X
2 Policy evaluation given a control policy ui find the
Pi using off-policy Bellman equation
1 1
1
+ minus= minusi T i
eu R B P X
3 Policy Improvement update the control policy
Online implementation
Control design
1 2 3 4 5 6 7 8
( )( )( ) ( )= ( ) ( )
( ) ( )
t t TT i T i i T
t
T i
X t P X t X Q K RK X d
X t t P X t t
+
+
+ + +
Begin
1 2
1 2
1
( ) ( )
( ) ( ) ( )
( )
N i i i
Ti i N i
i T
t t t
K K K
p
minus
= = minus +
=
=
Solving for the cost using least squares
1|| ||i ip p minusminus
0
1 10 1 ( ) 0P i K A BK= = minus
Initialization
1 1i T i
eu R B P Xminus minus= minus
Policy update
End
No
Yes
1i irarr +
20-May-21 | 42Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 62 The tracking error of the trajectory of the robot system and
the prescribed impedance model
Fig 63 Reference trajectory versus the state of the robot systems
during and after learning
κe = 00025 S = 1 R = 10 T = 0001
A = minus4006 B = minus0002
υ = 09 γ = 09 δΔ = 065 ε = 01
ε-decay = 099 K = 10 β = 01
LQR solution
RL aproximation
Position controlKs = 20 F = 10 G = 1 μ1 = μ2 = 5 k = 01
= 100 κp = 20 κd = 10 and ke = 018 s
θd = 10sin(02πt) Md = 1 Kg Bd = 718
Nsmm-1 Kd =1576 Nmm-1 Kh = 0046
Parameters
Z
Simulation
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 43Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 64 The output trajectory and impedance model in the inner loop
Fig 65 The desired trajectory and output trajectory the outer loop
Fig 66 Interaction force
Perfectly performance of the human-
robot cooperation
A little deviation between the desired
and actual trajectory
The human interaction force is reduced
after learning and the optimal set of the
prescribed impedance model is found by
the outer-loop controller
Experiment
Validation results
Discussion
1 2 3 4 5 6 7 8
20-May-21 | 44Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Conclusion and Future works
Introduction
7
4
5
6
C
H
A
P
T
E
R
1
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Force Sensorless Reflecting Control for BHTS7
1 2 3 4 5 6 7 8
20-May-21 | 45Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
3 The effectiveness of the proposed control method is verified in
the different working conditions
2The stability and finite-time convergence characteristic of the closed-
loop control are theoretically analyzed by the Lyapunov approach
The first time the hybrid control based
on a fast finite-time NTSMC and AFOB1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 46Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
( )1 2id = minus +
1 1
2
= minus + = minus +
id
T
i i id i
T
i ia = minus
The external torque estimation
[ ]
1 2ˆ( ) = + + minus minus minus +iv
i i ir i i i i i i i i i i idM C K sign s s s
( )i i i is = +
[ ]
2
1 2
if 0 or 0 and ( )
( ) if 0 and
i
i i i i i i
i i
i i i i i i i i
s s
z z sign s
= =
+
An adaptive FSOB gain
where
where
The torque control design by style-FNTSMC
The modified FNTSM surface design
Master Slave
Environment
APAM Slave
system
Slave
Controller
Estimator
Oper
ator
Master
Controller
APAM Master
system
Estimator
Impendence
filter
+
++
-
+
-
+
+md
m s
s
sd
ˆsdˆmd
mf
m
d
Control design
1 2 3 4 5 6 7 8
20-May-21 | 47Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 71 Torque estimation performances
λi = 8 pi = 5 qi = 7 ςi = 10-4 νi = 911
εi = 01 ρ1i = 10 ρ2i = 2
Proposed control
PID control KP = 38 KI = 25 KD = 015
Observersfor NDO Lh = Le = 10 for RTOB β = 350 rads
for AFOB κi(0) = 01 λi = 10 ηi = 0 m = 01
a = 10
RMSEs associated with NDO RTOB and proposed algorithm
are calculated of 0211 0192 and 0076 respectively
Parameters
Scenarios- The free space situation θd (t) = 5π(1 - 018t)sin(072πt)
- The contact and recovery situation soft environment Ke = 500 Nmm-1 Be = 4 Nsmm-1
stiff environment Ke = 2800 Nmm-1 Be = 20 Nsmm-1
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 48Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 72 Position tracking performances
Fig 72 Position tracking error performances
Fig 72 Transparency performance in ldquosoftrdquo
Fig 72 Transparency error in soft contract
Fig 72 Transparency performance in ldquostiffrdquo
Fig 72 Transparency error in stiff contract
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 49Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Finite-time convergence of the
tracking errors
Eliminating the uncertainties the
noise effect in the obtained force
sensing via new AFOB
Robust performance against uncertainties
and disturbances in the system dynamics
Stability of the overall system and the
transparency performance can be
attained simultaneously
Fast transient response
Summary
The great transparency performance and global stability of the BHTS under the proposed
control scheme are achieved despite the uncertainties and in different working conditions
1 2 3 4 5 6 7 8
20-May-21 | 50Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Introduction
4
5
6
C
H
A
P
T
E
R
1
7
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Conclusion and Future works78
1 2 3 4 5 6 7 8
20-May-21 | 51Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Verified an effective position tracking controller via the AITSMC with TDE technique under
various working conditions (Its uncertainties and disturbance are handled)
Validated the significant effectiveness of finite-time force tracking problems based on the
FITSMC combined with a new adaptive gain and a friction-free disturbance (FSOB)
Based on the online learning ability of the RL approximation method the model-free controller
for optimal contact force ensures great position and force tracking performance
Optimized a prescribed impedance model parameters by the IRL approach to provide little
operator information in yielding highly effective position control purpose
Achieved good transparency performance with both force feedback and position tracking
simultaneously by the new adaptive force estimation combined the separately FNTSMC schemes
Conclutions
1 2 3 4 5 6 7 8
20-May-21 | 52Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Future works
To improve the control performance in joint-space the robot system model can be
improved by robust identification approaches
Effectiveness of the proposed algorithm can be further increased by developing an automatic
tune method for both the estimation and control gains
Apply optimized strategies on both master and slave sides Note also that the computational
complexity should be reduced by the data size
Based on the developed prototype in this thesis it is possible to develop a high-quality
commercial teleoperation device in the rehabilitation field with a safer solution
1 2 3 4 5 6 7 8
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
1 2 3 4 5 6 7 8
20-May-21 | 29Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
3 Verified effective for robot interaction control in
unknown environments
2 The stability of the closed-loop control is theoretically demonstrated
by the Lyapunov approach
The RL method is used to learn on-line
the optimal desired force1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 30Presenter Cong Phat Vo
( ) ( ) ( ( ))= minus + minuse e e eF t C x t K x x t
( ( )) ( )r dZ e t F t= [ ]
1 2
( ) ( ) ( )
( ) ( )
= + +
minus minus minus +
r r r
v
e
Me t Ce t Ke t
sign F t
The force control
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The environment is modeled as
a spring-damper system [81]
The optimal impedance model The position control in task space
( ) ( ) ( )
= + f Pf f If f
t
x s K e t K e d
Task-specific Outer-loopRobot-specific Inner-loop
Precise position
control
Prescribed
Impedance Model
Environment
Configuration
APAM system
xx
d
Force controlx
rx
fe
r
Fe
Fd
ef
-
+ -+ -
+
RL-based
approximation
ϕ
Control design
1 2 3 4 5 6 7 8
( ) ( ) ( )r r de t Ae t BF t= +
20-May-21 | 31Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Implementation procedure: LQR solution
The main objective of this work is to determine the optimal desired force F_d, i.e., the minimum force that minimizes the position error e_r:
F_d(t) = κ_e e_r(t)
The discounted cost function:
J(e_r(t), F_d(t)) = ∫_t^∞ e^(-γ(τ-t)) [e_r^T(τ) S e_r(τ) + F_d^T(τ) R F_d(τ)] dτ
The algebraic Riccati equation:
A^T P A + S - P + A^T P B κ_e = 0
The feedback control gain:
κ_e = -(B^T P B + R)^-1 B^T P A
This is an offline solution: it requires the environment matrices A and B to be known.
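The offline LQR gain above can be obtained by iterating the Riccati recursion to a fixed point. A minimal sketch follows; the plant matrices are placeholders, not the thesis's identified environment.

```python
import numpy as np

def dlqr_gain(A, B, S, R, tol=1e-10, max_iter=10_000):
    """Iterate P <- A'PA + S + A'PB*K with K = -(B'PB + R)^-1 B'PA
    until convergence; return the gain kappa (so F_d = kappa @ e_r) and P."""
    P = np.copy(S)
    for _ in range(max_iter):
        K = -np.linalg.solve(B.T @ P @ B + R, B.T @ P @ A)
        P_next = A.T @ P @ A + S + A.T @ P @ B @ K
        if np.max(np.abs(P_next - P)) < tol:
            P = P_next
            break
        P = P_next
    return -np.linalg.solve(B.T @ P @ B + R, B.T @ P @ A), P

# Illustrative 1-D environment (placeholder values, not the thesis identification)
A = np.array([[0.95]]); B = np.array([[0.10]])
S = np.array([[1.0]]);  R = np.array([[10.0]])
kappa, P = dlqr_gain(A, B, S, R)
```

For production use, `scipy.linalg.solve_discrete_are` solves the same equation directly; the loop above just makes the fixed-point structure of the slide's ARE explicit.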
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Implementation procedure: RL solution
Solving the Riccati equation offline is impossible for all time when the environment is unknown, so the discounted cost is treated as a value function to be learned:
V_s(e_r(t)) = ∫_t^∞ e^(-γ(τ-t)) r(e_r(τ), F_d(τ)) dτ, where r(e_r, F_d) = e_r^T S e_r + F_d^T R F_d
The Hamilton-Jacobi-Bellman equation:
min over F_d of [ r(t) + V̇_s(e_r(t)) - γ V_s(e_r(t)) ] = 0
The new algebraic Riccati equation:
e_r^T(t)(A^T P + P A + S - γP)e_r(t) + 2 e_r^T(t) P B F_d(t) + F_d^T(t) R F_d(t) = 0
The Q-value function, with the augmented state X = [e_r; F_d]:
Q(e_r(t), F_d(t)) = V_s(e_r(t)) = X^T H X
The optimal desired force:
F_d(t) = -H_22^-1 H_21 e_r(t)
This is an online solution: off-policy RL is used to find the solution of the given new ARE without knowledge of A and B.
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Implementation procedure: IRL approximation
The Q-value function is approximated so as to minimize the temporal-difference error, using N Gaussian basis functions:
Q̂(e_r, F_d) = Σ_{i=1..N} θ_i φ̄_i(e_r, F_d)
with each basis normalized over the set:
φ_i(e_r, F_d) = exp(-½ [e_r - c_ri, F_d - c_di] Σ_i^-1 [e_r - c_ri, F_d - c_di]^T), φ̄_i = φ_i / Σ_{j=1..N} φ_j
The Q-function update law based on IRL over an interval Δt:
Q(e_r(t), F_d(t)) = ∫_t^{t+Δt} e^(-γ(τ-t)) r(e_r(τ), F_d(τ)) dτ + e^(-γΔt) Q(e_r(t+Δt), F_d(t+Δt))
with the reward (utility) function
r(t) = e_r^T(t) S e_r(t) + F_d^T(t) R F_d(t)
The parameters θ are updated by a normalized gradient law that drives Q̂ toward this IRL target, and the learned Q-function yields the desired force F_d(t) = κ_e e_r(t).
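The normalized Gaussian basis and the IRL temporal-difference step above can be sketched as follows. The centers, widths, learning rate, and sample values are illustrative assumptions, not the thesis settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Gaussian (RBF) features over the joint (e_r, F_d) space.
centers = rng.uniform(-1.0, 1.0, size=(25, 2))   # c_i = (c_ri, c_di), assumed
inv_sigma = 4.0                                   # isotropic Sigma_i^-1, assumed

def phi(e_r, F_d):
    """Normalized Gaussian basis vector at the sample (e_r, F_d)."""
    z = np.array([e_r, F_d]) - centers            # (N, 2) offsets from centers
    g = np.exp(-0.5 * inv_sigma * np.sum(z * z, axis=1))
    return g / np.sum(g)                          # normalization over the set

def q_hat(theta, e_r, F_d):
    return theta @ phi(e_r, F_d)

def td_update(theta, e_r, F_d, reward_int, e_r2, F_d2,
              gamma=0.9, dt=0.001, lr=0.1):
    """One IRL TD step: move Q_hat(e_r, F_d) toward the integrated reward
    plus the discounted Q_hat at the next sample."""
    target = reward_int + np.exp(-gamma * dt) * q_hat(theta, e_r2, F_d2)
    delta = target - q_hat(theta, e_r, F_d)
    return theta + lr * delta * phi(e_r, F_d)

theta = np.zeros(25)
theta = td_update(theta, 0.2, 0.1, reward_int=0.05, e_r2=0.18, F_d2=0.09)
```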
Fig 5.1 RL in unknown environments during the learning process; Fig 5.2 Learning curves.
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Validation results: parameters
- LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = -4.006, B = -0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position/force control: K_Pf = 0.001, K_If = 0.02, λ = 0.2, ηΔ = 0.1, ρ_1 = 0.5, ρ_2 = 2.0, ν = 9/11, α = 5/7
- Environment: C_e = 0.5 N·s·mm⁻¹, K_e = 2 N·mm⁻¹, r_0 = 165 mm, x_d = 39.917 mm, x_e = 35 mm
Fig 5.3 Desired force/position tracking response.
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Validation results
For the LQR solution, three steps are required to complete the experiment:
1. identification of the environment;
2. obtaining the ARE solution using the environment estimate;
3. running the complete experiment with the position/force controller.
In contrast, the IRL approximation achieves near-optimal performance without knowledge of the environment dynamics.
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Proposed control idea
1. Find the optimal parameters of the impedance model to assure motion tracking and to assist the human with minimum effort in performing the task in real time.
2. The stability of the closed-loop control is theoretically demonstrated.
3. The unknown human dynamics, as well as the overall human-robot system efficiency, are verified in experimental trials.
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design: robot-specific inner loop and task-specific outer loop
The control torque in task space combines proportional-integral-derivative tracking terms (gain K_s) with a neural-network estimate of the unknown human/robot dynamics and a robust bounding term; the NN weight matrices Ŵ_1 and Ŵ_2 are updated online by gradient laws with the tuning gains F, G, and k.
An approximation of the continuous function is provided by a two-layer NN of the form τ̂_h = Ŵ_2^T σ(Ŵ_1^T q).
The prescribed impedance model, from the human torque τ_h to the motion θ_m:
θ_m / τ_h = K_h / (M_d s² + C_d s + K_d)
[Block diagram: Human Operator → Torque control law → Haptic interface based on APAM, with the Prescribed Impedance Model, an NN functional approximation, and an Optimal Law based on IRL closing the outer loop through the signals τ_h, θ_d, θ_m, e_h, ξ, and e_d.]
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design: IRL based on the LQR solution
The human transfer function [99], whose purpose is to map τ_h → e_d:
G(s) = (κ_p + κ_d s) / (1 + k_e s)
In state-space form:
τ̇_h = A_h τ_h + B_h ē_d, with A_h = -k_e^-1, B_h = k_e^-1 [κ_p, κ_d], ē_d = [e_d, ė_d]^T
A new augmented state stacks the impedance error dynamics and the human model:
Ẋ = A X + B u_e, with the feedback control u_e = -K X
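The first-order human model G(s) = (κ_p + κ_d s)/(1 + k_e s) can be simulated directly from its state-space form τ̇_h = (-τ_h + κ_p e_d + κ_d ė_d)/k_e. A minimal Euler-integration sketch, using the gains listed on the simulation slide purely for illustration:

```python
import numpy as np

# Human operator model: tau_h' = (-tau_h + kp*e_d + kd*de_d) / ke,
# the state-space realization of G(s) = (kp + kd*s)/(1 + ke*s).
kp, kd, ke = 20.0, 10.0, 0.18   # gains from the slide, illustrative here
dt = 1e-3

def human_response(e_d, de_d, tau0=0.0):
    """Euler-integrate the human torque for given error trajectories."""
    tau, out = tau0, []
    for e, de in zip(e_d, de_d):
        tau += dt * (-tau + kp * e + kd * de) / ke
        out.append(tau)
    return np.array(out)

# Step of 0.1 in e_d: the torque settles toward the DC gain kp * 0.1 = 2.0
t = np.arange(0.0, 2.0, dt)
tau = human_response(np.full_like(t, 0.1), np.zeros_like(t))
```

The DC gain of G(s) is κ_p, so a constant error step of 0.1 drives the torque toward 2.0 with time constant k_e = 0.18 s.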
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design
The infinite-horizon quadratic performance index to minimize:
J_m = ∫_t^∞ (e_d^T Q_d e_d + τ_h^T Q_h τ_h + u_e^T R u_e) dτ
RL solution: solve the algebraic Riccati equation [83]
0 = A^T P + P A + Q - P B R^-1 B^T P
for the optimal feedback control
u_e = argmin V(t) = -R^-1 B^T P X(t)
IRL solution: for any time interval ΔT > 0, the value function satisfies the IRL Bellman equation
V(X(t)) = ∫_t^{t+ΔT} X^T (Q + K^T R K) X dτ + V(X(t + ΔT))
with V(X(t)) = ∫_t^∞ X^T (Q + K^T R K) X dτ = X^T P X, where P is the real symmetric positive-definite solution of
(A - BK)^T P + P (A - BK) = -(Q + K^T R K)
so the IRL Bellman equation can be written as
X^T(t) P X(t) = ∫_t^{t+ΔT} X^T (Q + K^T R K) X dτ + X^T(t+ΔT) P X(t+ΔT)
Proof of convergence: see Appendix 6.2.
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design: online implementation
Fig 6.1 Continuous-time linear policy iteration algorithm:
1. Initialization: start with an admissible control input u_e^0 = -K^1 X (P^0 = 0, i = 1, with A - BK^1 stable).
2. Policy evaluation: given the control policy u^i, find P^i from the off-policy Bellman equation
X^T(t) P^i X(t) = ∫_t^{t+ΔT} X^T (Q + (K^i)^T R K^i) X dτ + X^T(t+ΔT) P^i X(t+ΔT)
solving for the cost parameters p^i by least squares over the sampled intervals.
3. Policy improvement: update the control policy
u^{i+1} = -R^-1 B^T P^i X
Repeat with i → i+1 until ||p^i - p^{i-1}|| falls below a tolerance, then stop.
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Validation results: simulation
Fig 6.2 The tracking error between the trajectory of the robot system and the prescribed impedance model; Fig 6.3 Reference trajectory versus the state of the robot system during and after learning.
Parameters:
- LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = -4.006, B = -0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: K_s = 20, F = 10, G = 1, μ_1 = μ_2 = 5, k = 0.1, Z̄ = 100, κ_p = 20, κ_d = 10, k_e = 0.18 s
- Prescribed impedance model: θ_d = 10 sin(0.2πt), M_d = 1 kg, B_d = 7.18 N·s·mm⁻¹, K_d = 15.76 N·mm⁻¹, K_h = 0.046
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Validation results: experiment
Fig 6.4 The output trajectory and the impedance model in the inner loop; Fig 6.5 The desired trajectory and the output trajectory in the outer loop; Fig 6.6 Interaction force.
Discussion:
- Near-perfect performance of the human-robot cooperation.
- Only a small deviation between the desired and actual trajectories.
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance parameters is found by the outer-loop controller.
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Proposed control idea
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed.
2. The stability and the finite-time convergence characteristic of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified under different working conditions.
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Control design
The external torque estimation: on each side i ∈ {m, s}, an adaptive force-sensorless observer (AFOB) produces the estimate τ̂_id; its adaptive gain κ_i switches between a growth law, when the sliding variable s_i indicates an active estimation error, and a decay law driven by sign(s_i) otherwise.
The modified FNTSM surface is designed on the tracking error e_i with gain λ_i and the fractional exponents p_i/q_i:
s_i = ė_i + λ_i(e_i)
The torque control designed in the FNTSMC style combines the nominal dynamics M_i, C_i, K_i of side i with the estimate τ̂_id and fast-terminal reaching terms -ρ_1i s_i - ρ_2i |s_i|^ν_i sign(s_i).
[Block diagram: Operator → Master Controller → APAM Master system with its Estimator; the communication channel connects to the Slave Controller → APAM Slave system with its Estimator and an Impedance filter → Environment; the estimated torques τ̂_md and τ̂_sd are reflected across the channel.]
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Validation results
Fig 7.1 Torque estimation performances.
Parameters:
- Proposed control: λ_i = 8, p_i = 5, q_i = 7, ς_i = 10⁻⁴, ν_i = 9/11, ε_i = 0.1, ρ_1i = 10, ρ_2i = 2
- PID control: K_P = 38, K_I = 25, K_D = 0.15
- Observers: for the NDO, L_h = L_e = 10; for the RTOB, β = 350 rad/s; for the AFOB, κ_i(0) = 0.1, λ_i = 10, η_i = 0, m = 0.1, a = 10
Scenarios:
- Free space: θ_d(t) = 5π(1 - 0.18t) sin(0.72πt)
- Contact and recovery: soft environment K_e = 500 N·mm⁻¹, B_e = 4 N·s·mm⁻¹; stiff environment K_e = 2800 N·mm⁻¹, B_e = 20 N·s·mm⁻¹
The RMSEs of the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Validation results
Fig 7.2 Position tracking performances and position tracking errors; transparency performance and transparency error in "soft" contact; transparency performance and transparency error in "stiff" contact.
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Summary
- Finite-time convergence of the tracking errors.
- Elimination of the uncertainties and of the noise effect in the obtained force sensing via the new AFOB.
- Robust performance against uncertainties and disturbances in the system dynamics.
- Stability of the overall system and transparency performance are attained simultaneously.
- Fast transient response.
Overall, great transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and across different working conditions.
8 CONCLUSION AND FUTURE WORKS
Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions; system uncertainties and disturbances are handled.
- Validated the significant effectiveness on finite-time force tracking problems of the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while achieving highly effective position control.
- Achieved good transparency performance, with force feedback and position tracking attained simultaneously, by the new adaptive force estimation combined with separate FNTSMC schemes.
8 CONCLUSION AND FUTURE WORKS
Future works
- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithms can be further increased by developing an automatic tuning method for both the estimation and the control gains.
- Apply optimized strategies on both the master and slave sides; note also that the computational complexity should be reduced with respect to the data size.
- Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device for the rehabilitation field with a safer solution.

A PRESENTATION OF DOCTORAL DISSERTATION
Presenter: Cong Phat Vo
Department of Mechanical Engineering, University of Ulsan, Korea
Thank you for attending.
1 2 3 4 5 6 7 8
20-May-21 | 30Presenter Cong Phat Vo
( ) ( ) ( ( ))= minus + minuse e e eF t C x t K x x t
( ( )) ( )r dZ e t F t= [ ]
1 2
( ) ( ) ( )
( ) ( )
= + +
minus minus minus +
r r r
v
e
Me t Ce t Ke t
sign F t
The force control
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The environment is modeled as
a spring-damper system [81]
The optimal impedance model The position control in task space
( ) ( ) ( )
= + f Pf f If f
t
x s K e t K e d
Task-specific Outer-loopRobot-specific Inner-loop
Precise position
control
Prescribed
Impedance Model
Environment
Configuration
APAM system
xx
d
Force controlx
rx
fe
r
Fe
Fd
ef
-
+ -+ -
+
RL-based
approximation
ϕ
Control design
1 2 3 4 5 6 7 8
( ) ( ) ( )r r de t Ae t BF t= +
20-May-21 | 31Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
LQR solution
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
( ) ( )=d e rF t e t
The discounted cost function
( ) ( )( ) ( ) ( ) ( ) ( ) ( )
= minus +T T
s r d r r d dt
J e t F t e Se F RF d
The Algebraic Riccati equation
0+ + + =T T
eA PA S P A PB
The feedback control gain
( )1
1 minus
minus= minus +T T
e B PB R B PA
IRL approximationRL solution
Implementation procedure
1 2 3 4 5 6 7 8
offline soliton
20-May-21 | 32Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
LQR solution RL solution
impossible for all time
The Hamilton-Jacobi-Bellman equation
( ) ( ) ( )min ( ) ( ) ( ) 0
minus minus + minus =
d
t
s rtF
V e t r t r e dt
The new Algebraic Riccati equation
( )( ) ( )
2 ( ) ( ) ( ) ( ) 0
T T
r r
T T
r d d d
e t A P PA S P e t
e t PBF t F t RF t
+ + minus
+ + =
The optimal desired force 1
22 21( ) ( )d rF t H H e tminus= minus
( ) ( ) ( ) ( ) ( )= = T
r d s rQ e t F t V e t X HX
IRL approximation
The discounted cost function
( ) ( )
( )
( )( ) ( ) ( )
where ( ) ( ) ( ) ( ) ( ) ( )
t
s r r dt
T T
r d r r d d
V e t r e F e d
r e F e Se F RF
minus minus=
= +
Implementation procedure
1 2 3 4 5 6 7 8
( ) ( )=d e rF t e t
online soliton using the off-policy RL to
find the solution of the given new ARE
20-May-21 | 33Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
LQR solution RL solution
minimize temporal difference error
The approximator of the Q-value function
( )( ) ( )1
ˆ N
r d i r d i
i
Q e F e F =
=
The parameter update law
11( ) exp
2
T
i r d r i d i i r i d ie F e c F c e c F c minus = minus minus minus minus minus
( )
( ) ( ) ( ) ( )
ˆ ( ) ( ) ( )1( ) ( )
( )
=
= minus +
r d
t t t t
Q e t F t tt t
t t
IRL approximation
Q-function update law based on IRL
( )
( )
( )( ) ( ) ( )
( ) ( )
t tt
r dt
t
r d
Q e t F t r e d
e Q e t t F t t
+
minus minus
minus
=
+ + +
( ) ( ) ( ) ( ) ( )T T
r r d dr t e t Se t F t RF t= +
The reward function of the utility function
Implementation procedure
1 2 3 4 5 6 7 8
1
( )( )
( )
i r di r d N
i r di
e Fe F
e F
=
=
( ) ( )=d e rF t e t
20-May-21 | 34Presenter Cong Phat Vo
Fig 51 RL in unknown environments during the
learning process
Fig 52 Learning curves
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
κe = 00025 S = 1 R = 10 T = 0001 A = minus4006
B = minus0002
υ = 09 γ = 09 δΔ = 065 ε = 01 ε-decay = 099
K = 10 β = 01
LQR solution
RL aproximation
PositionForce controlKPf = 0001 KIf = 002 λ = 02 ηΔ = 01 ρ1 = 05
ρ2 = 20 v = 911 α = 57
EnvironmentCe = 05 Nsmm-1 Ke = 2 N mm-1 r0 = 165 mm
xd = 39917 mm xe = 35 mm
Parameters
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 35Presenter Cong Phat Vo
Fig 53 Desired forceposition tracking response
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Three steps are required to complete
the experiment
1-the identification of the environment
2-obtain the ARE solution using the
environmental estimation
3-the complete experiment for the
positionforce controller
IRL approximation
Achieving near-optimal performance without
knowledge of the environment dynamics
LQR solution
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 36Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
4
5
C
H
A
P
T
E
R
1
7
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Optimized Human-Robot Interaction force control using IRL6
1 2 3 4 5 6 7 8
20-May-21 | 37Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
3 The unknown human dynamics as well as the optimal overall
human-robot system efficiency is verified in experimental trials
2
Find the optimal parameters of the impedance model to
assure motion tracking and also assist the human with
minimum effort to perform the task in real time
The stability of the closed-loop control is theoretically
demonstrated
1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 38Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Task-specific Outer-loopRobot-specific Inner-loop
( )1 20
ˆ ( )
ˆ (|| || )
= + + +
+ + minus
t
s h h h
q B h h
K e e e d
K Z Z K
The control torque in task space
1 2
1 2 1
2 1 2
ˆ ˆˆ ( ) ( )
ˆ ˆˆ ˆ ˆ || ||
ˆˆ ˆ ˆ( ) || ||
=
= minus minus = minus
T T
T T T
h h h
T
h h
q W W q
W F F W q kF W
W Gq W kG W
2
h hm
d d d
K
M s C s K
=
+ +
An approximation of the continuous functionThe prescribed
impedance model
Prescribed
Impedance Model
Haptic interface
based-APAM
Torque
control lawHuman
Operator
Optimal Law-based
IRL
Kh+-
NN functional
approximation
1
s
θ +
-
τ θ d θ m eh
τ h
ξ
ed
Control design
1 2 3 4 5 6 7 8
20-May-21 | 39Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Task-specific Outer-loopRobot-specific Inner-loop
1
1
[ ]
minus
minus
= +
= minus
=
=
h h h h d
h e
h e p d
T T T
d d d
A B e
A k
B k
e e e
( )1
+=
+
p d
e
sG s
k s
The human transfer function [99] Purpose τh rarr ed
0
0
= + = +
= minus
q d qd e
e
h h hh e
A e Be X AX Buu
B A u KX
A new augmented state
Prescribed
Impedance Model
Haptic interface
based-APAM
Torque
control lawHuman
Operator
Optimal Law-based
IRL
Kh+-
NN functional
approximation
1
s
θ +
-
τ θ d θ m eh
τ h
ξ
ed
IRL based on LQR solution
Control design
1 2 3 4 5 6 7 8
20-May-21 | 40Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
The performance index
( )T T T
m d d d h h h e et
J e Q e Q u Ru d
= + +
10 T TA P PA Q PB R BPminus= + + minus
The algebraic Riccati equation [83]
0
0
q d qd
e
h h hh
e
A e BeX u
B A
u KX
= = +
= minus
A new augmented state
IRL solutionRL solution
An optimal feeback control
To minimize
solve
The infinite-horizon quadratic cost
Control design
The IRL Bellman equation for time interval ΔT gt 0
Proof of convergence See in Appendix 62
1 2 3 4 5 6 7 8
( ) 1
( )
arg min ( ) ( )e
T
e eu t
u V t X t u R B PX
minus
= = minus
( )7 7( ( )) ( ) ( ) ( ( ))t T
T T T
tV X t X Q K RK X d V X t T
+
= + minus +
( )7 ( ( )) ( ) ( )T T T T
tV X t X Q K RK X d X PX
= + =
( ) ( ) ( )T TA BK P P A BK Q K RKminus + minus = minus +
where P is the real symmetric positive definite solution
( )( ) ( ) = ( ) ( )
( ) ( )
t tT T T T
t
T
X t PX t X Q K RK X d
X t t PX t t
+
+
+ + +
The IRL Bellman equation can be written as
20-May-21 | 41Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 61 Continuous-time linear policy iteration algorithm
1 Start with an admissible control input
0 0
1= minuseu K X
2 Policy evaluation given a control policy ui find the
Pi using off-policy Bellman equation
1 1
1
+ minus= minusi T i
eu R B P X
3 Policy Improvement update the control policy
Online implementation
Control design
1 2 3 4 5 6 7 8
( )( )( ) ( )= ( ) ( )
( ) ( )
t t TT i T i i T
t
T i
X t P X t X Q K RK X d
X t t P X t t
+
+
+ + +
Begin
1 2
1 2
1
( ) ( )
( ) ( ) ( )
( )
N i i i
Ti i N i
i T
t t t
K K K
p
minus
= = minus +
=
=
Solving for the cost using least squares
1|| ||i ip p minusminus
0
1 10 1 ( ) 0P i K A BK= = minus
Initialization
1 1i T i
eu R B P Xminus minus= minus
Policy update
End
No
Yes
1i irarr +
20-May-21 | 42Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 62 The tracking error of the trajectory of the robot system and
the prescribed impedance model
Fig 63 Reference trajectory versus the state of the robot systems
during and after learning
κe = 00025 S = 1 R = 10 T = 0001
A = minus4006 B = minus0002
υ = 09 γ = 09 δΔ = 065 ε = 01
ε-decay = 099 K = 10 β = 01
LQR solution
RL aproximation
Position controlKs = 20 F = 10 G = 1 μ1 = μ2 = 5 k = 01
= 100 κp = 20 κd = 10 and ke = 018 s
θd = 10sin(02πt) Md = 1 Kg Bd = 718
Nsmm-1 Kd =1576 Nmm-1 Kh = 0046
Parameters
Z
Simulation
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 43Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 64 The output trajectory and impedance model in the inner loop
Fig 65 The desired trajectory and output trajectory the outer loop
Fig 66 Interaction force
Perfectly performance of the human-
robot cooperation
A little deviation between the desired
and actual trajectory
The human interaction force is reduced
after learning and the optimal set of the
prescribed impedance model is found by
the outer-loop controller
Experiment
Validation results
Discussion
1 2 3 4 5 6 7 8
20-May-21 | 44Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Conclusion and Future works
Introduction
7
4
5
6
C
H
A
P
T
E
R
1
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Force Sensorless Reflecting Control for BHTS7
1 2 3 4 5 6 7 8
20-May-21 | 45Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
3 The effectiveness of the proposed control method is verified in
the different working conditions
2The stability and finite-time convergence characteristic of the closed-
loop control are theoretically analyzed by the Lyapunov approach
The first time the hybrid control based
on a fast finite-time NTSMC and AFOB1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 46Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
( )1 2id = minus +
1 1
2
= minus + = minus +
id
T
i i id i
T
i ia = minus
The external torque estimation
[ ]
1 2ˆ( ) = + + minus minus minus +iv
i i ir i i i i i i i i i i idM C K sign s s s
( )i i i is = +
[ ]
2
1 2
if 0 or 0 and ( )
( ) if 0 and
i
i i i i i i
i i
i i i i i i i i
s s
z z sign s
= =
+
An adaptive FSOB gain
where
where
The torque control design by style-FNTSMC
The modified FNTSM surface design
Master Slave
Environment
APAM Slave
system
Slave
Controller
Estimator
Oper
ator
Master
Controller
APAM Master
system
Estimator
Impendence
filter
+
++
-
+
-
+
+md
m s
s
sd
ˆsdˆmd
mf
m
d
Control design
1 2 3 4 5 6 7 8
20-May-21 | 47Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 71 Torque estimation performances
λi = 8 pi = 5 qi = 7 ςi = 10-4 νi = 911
εi = 01 ρ1i = 10 ρ2i = 2
Proposed control
PID control KP = 38 KI = 25 KD = 015
Observersfor NDO Lh = Le = 10 for RTOB β = 350 rads
for AFOB κi(0) = 01 λi = 10 ηi = 0 m = 01
a = 10
RMSEs associated with NDO RTOB and proposed algorithm
are calculated of 0211 0192 and 0076 respectively
Parameters
Scenarios- The free space situation θd (t) = 5π(1 - 018t)sin(072πt)
- The contact and recovery situation soft environment Ke = 500 Nmm-1 Be = 4 Nsmm-1
stiff environment Ke = 2800 Nmm-1 Be = 20 Nsmm-1
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 48Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 72 Position tracking performances
Fig 72 Position tracking error performances
Fig 72 Transparency performance in ldquosoftrdquo
Fig 72 Transparency error in soft contract
Fig 72 Transparency performance in ldquostiffrdquo
Fig 72 Transparency error in stiff contract
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 49Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Finite-time convergence of the
tracking errors
Eliminating the uncertainties the
noise effect in the obtained force
sensing via new AFOB
Robust performance against uncertainties
and disturbances in the system dynamics
Stability of the overall system and the
transparency performance can be
attained simultaneously
Fast transient response
Summary
The great transparency performance and global stability of the BHTS under the proposed
control scheme are achieved despite the uncertainties and in different working conditions
1 2 3 4 5 6 7 8
20-May-21 | 50Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Introduction
4
5
6
C
H
A
P
T
E
R
1
7
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Conclusion and Future works78
1 2 3 4 5 6 7 8
20-May-21 | 51Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Verified an effective position tracking controller via the AITSMC with TDE technique under
various working conditions (Its uncertainties and disturbance are handled)
Validated the significant effectiveness of finite-time force tracking problems based on the
FITSMC combined with a new adaptive gain and a friction-free disturbance (FSOB)
Based on the online learning ability of the RL approximation method the model-free controller
for optimal contact force ensures great position and force tracking performance
Optimized a prescribed impedance model parameters by the IRL approach to provide little
operator information in yielding highly effective position control purpose
Achieved good transparency performance with both force feedback and position tracking
simultaneously by the new adaptive force estimation combined the separately FNTSMC schemes
Conclutions
1 2 3 4 5 6 7 8
20-May-21 | 52Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Future works
To improve the control performance in joint-space the robot system model can be
improved by robust identification approaches
Effectiveness of the proposed algorithm can be further increased by developing an automatic
tune method for both the estimation and control gains
Apply optimized strategies on both master and slave sides Note also that the computational
complexity should be reduced by the data size
Based on the developed prototype in this thesis it is possible to develop a high-quality
commercial teleoperation device in the rehabilitation field with a safer solution
1 2 3 4 5 6 7 8
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
1 2 3 4 5 6 7 8
20-May-21 | 31Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
LQR solution
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
( ) ( )=d e rF t e t
The discounted cost function
( ) ( )( ) ( ) ( ) ( ) ( ) ( )
= minus +T T
s r d r r d dt
J e t F t e Se F RF d
The Algebraic Riccati equation
0+ + + =T T
eA PA S P A PB
The feedback control gain
( )1
1 minus
minus= minus +T T
e B PB R B PA
IRL approximationRL solution
Implementation procedure
1 2 3 4 5 6 7 8
offline soliton
20-May-21 | 32Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
LQR solution RL solution
impossible for all time
The Hamilton-Jacobi-Bellman equation
( ) ( ) ( )min ( ) ( ) ( ) 0
minus minus + minus =
d
t
s rtF
V e t r t r e dt
The new Algebraic Riccati equation
( )( ) ( )
2 ( ) ( ) ( ) ( ) 0
T T
r r
T T
r d d d
e t A P PA S P e t
e t PBF t F t RF t
+ + minus
+ + =
The optimal desired force 1
22 21( ) ( )d rF t H H e tminus= minus
( ) ( ) ( ) ( ) ( )= = T
r d s rQ e t F t V e t X HX
IRL approximation
The discounted cost function
( ) ( )
( )
( )( ) ( ) ( )
where ( ) ( ) ( ) ( ) ( ) ( )
t
s r r dt
T T
r d r r d d
V e t r e F e d
r e F e Se F RF
minus minus=
= +
Implementation procedure
1 2 3 4 5 6 7 8
( ) ( )=d e rF t e t
online soliton using the off-policy RL to
find the solution of the given new ARE
20-May-21 | 33Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force F_d, i.e., the minimum force that minimizes the position error e_r.

RL solution: minimize the temporal-difference error.

The approximator of the Q-value function:
$$\hat{Q}(e_r, F_d) = \sum_{i=1}^{N} \theta_i\, \bar{\varphi}_i(e_r, F_d)$$

The Gaussian basis functions and their normalization, with $X = [e_r^T\ F_d^T]^T$ and centers $c_i = [c_i^{rT}\ c_i^{dT}]^T$:
$$\varphi_i(e_r, F_d) = \exp\!\Big(-\tfrac{1}{2}\,(X - c_i)^T \Sigma_i^{-1} (X - c_i)\Big),
\qquad \bar{\varphi}_i = \frac{\varphi_i}{\sum_{j=1}^{N} \varphi_j}$$

The parameter update law (normalized gradient descent on the TD error, with the target given by the IRL Q-update below):
$$\dot{\theta}(t) = -\beta\,\frac{\bar{\varphi}(t)}{\big(1 + \bar{\varphi}^T(t)\bar{\varphi}(t)\big)^2}\,\Big(\hat{Q}\big(e_r(t), F_d(t)\big) - Q_{\mathrm{target}}(t)\Big)$$

IRL approximation:

Q-function update law based on IRL:
$$Q\big(e_r(t), F_d(t)\big) = \int_t^{t+\Delta t} e^{-\gamma(\tau - t)}\, r(\tau)\, d\tau
+ e^{-\gamma \Delta t}\, Q\big(e_r(t+\Delta t), F_d(t+\Delta t)\big)$$

The reward (utility) function:
$$r(t) = e_r^T(t) S\, e_r(t) + F_d^T(t) R\, F_d(t)$$

Implementation procedure
1 2 3 4 5 6 7 8

$F_d(t) = \kappa_e e_r(t)$
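A minimal sketch of the Q-value approximator with normalized Gaussian bases; the centers, width, and weights θ below are hypothetical, chosen only to show the structure.

```python
import numpy as np

# Normalized Gaussian radial bases over the joint (e_r, F_d) space.
centers = np.array([[-1.0, -1.0], [0.0, 0.0], [1.0, 1.0]])  # assumed centers
sigma = 0.8                                                 # assumed width

def phi(e_r, F_d):
    X = np.array([e_r, F_d])
    d2 = np.sum((X - centers) ** 2, axis=1)
    return np.exp(-0.5 * d2 / sigma ** 2)

def phi_bar(e_r, F_d):
    p = phi(e_r, F_d)
    return p / p.sum()           # normalized basis: entries sum to 1

theta = np.array([0.3, -0.1, 0.7])   # hypothetical learned weights

def Q_hat(e_r, F_d):
    return theta @ phi_bar(e_r, F_d)  # Q-hat = sum_i theta_i * phi_bar_i

print(Q_hat(0.2, -0.1))
```

Since the normalized bases form a convex combination, the approximated Q value always stays between the smallest and largest weight, which helps keep the TD updates well conditioned.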
20-May-21 | 34Presenter Cong Phat Vo
Fig. 5.1 RL in unknown environments during the learning process
Fig. 5.2 Learning curves

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Parameters
LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
RL approximation: υ = 0.9, γ = 0.9, δ_Δ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
Position/Force control: K_Pf = 0.001, K_If = 0.02, λ = 0.2, η_Δ = 0.1, ρ_1 = 0.5, ρ_2 = 20, ν = 9/11, α = 5/7
Environment: C_e = 0.5 N·s·mm⁻¹, K_e = 2 N·mm⁻¹, r_0 = 16.5 mm, x_d = 39.917 mm, x_e = 35 mm

Validation results
1 2 3 4 5 6 7 8
20-May-21 | 35Presenter Cong Phat Vo
Fig. 5.3 Desired force/position tracking response
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Three steps are required to complete the experiment:
1. Identification of the environment.
2. Obtaining the ARE solution using the environment estimate (LQR solution).
3. The complete experiment with the position/force controller.

IRL approximation: achieves near-optimal performance without knowledge of the environment dynamics.
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 36Presenter Cong Phat Vo
CONTENTS
Chapter 1  Introduction
Chapter 2  Modeling and platform setup based on the APAM configuration
Chapter 3  Adaptive Gain ITSMC with TDE scheme
Chapter 4  Adaptive Finite-Time Force Sensorless Control scheme
Chapter 5  Model-free control for Optimal contact force in Unknown Environment
Chapter 6  Optimized Human-Robot Interaction force control using IRL
Chapter 7  Force Sensorless Reflecting Control for BHTS
Chapter 8  Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 37Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Proposed control idea
1. Find the optimal parameters of the impedance model to assure motion tracking and also assist the human with minimum effort to perform the task in real time.
2. The stability of the closed-loop control is theoretically demonstrated.
3. The unknown human dynamics as well as the optimal overall human-robot system efficiency are verified in experimental trials.
1 2 3 4 5 6 7 8
20-May-21 | 38Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Robot-specific Inner-loop | Task-specific Outer-loop

The control torque in task space:
$$\tau_s = K_s\Big(\dot{e}_h + \mu_1 e_h + \mu_2 \int_0^t e_h\, d\tau\Big) + \hat{\Phi}(q) + K_B\big(\|\hat{Z}\| + \bar{Z}\big) - K_h \xi$$

An approximation of the continuous function by a two-layer NN, with weight update laws:
$$\Phi(q) = W^T \sigma(V^T q), \qquad \hat{\Phi}(q) = \hat{W}^T \sigma(\hat{V}^T q)$$
$$\dot{\hat{W}} = F \hat{\sigma}\, e_h^T - k F \|e_h\| \hat{W}, \qquad \dot{\hat{V}} = G q\, e_h^T - k G \|e_h\| \hat{V}$$

The prescribed impedance model:
$$\frac{\theta_m}{\tau_h} = \frac{K_h}{M_d s^2 + C_d s + K_d}$$
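The prescribed impedance model above is a second-order transfer function, so it can be simulated with a simple semi-implicit Euler loop. The numerical values below are illustrative placeholders, not the thesis parameters.

```python
# Euler simulation of the prescribed impedance model
#   M_d*theta_m'' + C_d*theta_m' + K_d*theta_m = K_h * tau_h
# Parameter values are illustrative placeholders.
M_d, C_d, K_d, K_h = 1.0, 7.18, 15.76, 0.046
dt, T_end = 1e-3, 5.0

theta, dtheta = 0.0, 0.0
tau_h = 1.0                       # constant human torque step input
for _ in range(int(T_end / dt)):
    ddtheta = (K_h * tau_h - C_d * dtheta - K_d * theta) / M_d
    dtheta += dt * ddtheta        # semi-implicit Euler: velocity first
    theta += dt * dtheta

# The steady state approaches the static gain K_h / K_d
print(theta, K_h / K_d)
```

The step response settles at K_h/K_d, which is why tuning K_h (the outer-loop IRL variable) directly scales how compliant the prescribed model feels to the operator.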
[Block diagram: the Human Operator's torque τ_h drives the Prescribed Impedance Model, whose output θ_d is tracked by the torque control law on the APAM-based haptic interface (output θ_m); an NN functional approximation supports the inner loop, while the IRL-based optimal law tunes K_h from the tracking errors e_h and e_d in the outer loop.]
Control design
1 2 3 4 5 6 7 8
20-May-21 | 39Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Robot-specific Inner-loop | Task-specific Outer-loop

The human transfer function [99] (purpose: map τ_h → e_d):
$$G(s) = \frac{\kappa_p + \kappa_d s}{k_e s + 1}$$

In state-space form:
$$\dot{\xi}_h = A_h \xi_h + B_h \bar{e}_d, \qquad A_h = -k_e^{-1}, \quad B_h = k_e^{-1}[\kappa_p\ \ \kappa_d], \quad \bar{e}_d = [e_d^T\ \ \dot{e}_d^T]^T$$

A new augmented state:
$$\dot{X} = \begin{bmatrix} A_q & 0 \\ B_h & A_h \end{bmatrix} X + \begin{bmatrix} B_q \\ 0 \end{bmatrix} u_e = A X + B u_e, \qquad u_e = -K X$$
IRL based on LQR solution
Control design
1 2 3 4 5 6 7 8
20-May-21 | 40Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
The performance index (the infinite-horizon quadratic cost) to minimize:
$$J_m = \int_t^{\infty} \big( e_d^T Q_d\, e_d + \xi_h^T Q_h\, \xi_h + u_e^T R\, u_e \big)\, d\tau$$

solved through the algebraic Riccati equation [83]:
$$0 = A^T P + P A + Q - P B R^{-1} B^T P$$

with the new augmented state $\dot{X} = AX + Bu_e$ and $u_e = -KX$.

An optimal feedback control:
$$u_e = \arg\min_{u_e} V\big(t, X(t), u_e\big) = -R^{-1} B^T P X$$

RL solution | IRL solution

The value function and the corresponding Lyapunov equation:
$$V(X(t)) = \int_t^{\infty} X^T \big( Q + K^T R K \big) X\, d\tau = X^T P X$$
$$(A - BK)^T P + P (A - BK) = -\big( Q + K^T R K \big)$$
where P is the real symmetric positive-definite solution.

The IRL Bellman equation for a time interval ΔT > 0:
$$V(X(t)) = \int_t^{t+\Delta T} X^T \big( Q + K^T R K \big) X\, d\tau + V\big(X(t+\Delta T)\big)$$

The IRL Bellman equation can be written as:
$$X^T(t) P X(t) = \int_t^{t+\Delta T} X^T \big( Q + K^T R K \big) X\, d\tau + X^T(t+\Delta T)\, P\, X(t+\Delta T)$$

Proof of convergence: see Appendix 6.2.
Control design
1 2 3 4 5 6 7 8
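The Bellman identity above can be checked numerically: integrating the stage cost over [t, t+ΔT] along the closed loop and adding the terminal value X(t+ΔT)ᵀPX(t+ΔT) should reproduce X(t)ᵀPX(t). A scalar sketch with assumed A, B, Q, R values:

```python
import numpy as np

# Scalar check of the IRL Bellman equation; system values are assumptions.
A, B, Q, R = -1.0, 1.0, 1.0, 1.0

# Scalar ARE  2AP + Q - P^2 B^2 / R = 0  has the positive root:
P = (A + np.sqrt(A ** 2 + Q * B ** 2 / R)) * R / B ** 2
K = P * B / R                       # optimal gain, u = -K x
Acl = A - B * K                     # closed-loop dynamics

dt, dT = 1e-5, 0.5
x, cost = 2.0, 0.0
for _ in range(int(dT / dt)):       # integrate over [t, t+dT]
    cost += dt * (Q + K * R * K) * x * x
    x += dt * Acl * x

lhs = P * 2.0 ** 2                  # X(t)' P X(t)
rhs = cost + P * x * x              # running cost + terminal value
print(lhs, rhs)
```

The two sides agree up to the Euler discretization error, which is exactly the property the online least-squares evaluation exploits: it needs only measured trajectories, not A.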
20-May-21 | 41Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig. 6.1 Continuous-time linear policy iteration algorithm

Online implementation:
1. Start with an admissible control input $u_e^0 = -K^0 X$.
2. Policy evaluation: given a control policy $u^i$, find $P^i$ using the off-policy Bellman equation
$$X^T(t)\, P^i X(t) = \int_t^{t+\Delta T} X^T \big( Q + (K^i)^T R K^i \big) X\, d\tau + X^T(t+\Delta T)\, P^i X(t+\Delta T)$$
3. Policy improvement: update the control policy
$$u^{i+1} = -R^{-1} B^T P^i X$$

Flowchart: Begin → Initialization ($P^0 = 0$, $i = 1$, $K^1$ chosen so that $A - BK^1$ is stable) → solve for the cost vector $p^i$ by least squares over the samples $t_1, t_2, \ldots, t_N$ → if $\|p^i - p^{i-1}\| \geq \varepsilon$, apply the policy update $u^i = -R^{-1} B^T P^{i-1} X$, set $i \to i+1$, and repeat; otherwise End.

Control design
1 2 3 4 5 6 7 8
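The loop above (evaluate the current policy, then improve it with u^{i+1} = −R⁻¹BᵀP^iX) can be sketched in its model-based form, Kleinman's algorithm; the thesis version replaces the Lyapunov solve with least squares over measured data. The matrices below are assumptions for illustration only.

```python
import numpy as np

# Kleinman-style policy iteration on a small illustrative LQR problem.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])   # assumed open-loop (Hurwitz)
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])
n = A.shape[0]

K = np.zeros((1, n))                 # admissible initial policy (A stable)
for i in range(30):
    Acl = A - B @ K
    # Policy evaluation: solve  Acl'P + P Acl = -(Q + K'RK)  by vectorization
    M = np.kron(np.eye(n), Acl.T) + np.kron(Acl.T, np.eye(n))
    P = np.linalg.solve(M, -(Q + K.T @ R @ K).reshape(-1)).reshape(n, n)
    # Policy improvement
    K_new = np.linalg.solve(R, B.T @ P)
    if np.linalg.norm(K_new - K) < 1e-10:
        break
    K = K_new

# At convergence, P satisfies the ARE  A'P + PA + Q - P B R^{-1} B'P = 0
residual = A.T @ P + P @ A + Q - P @ B @ np.linalg.solve(R, B.T @ P)
print(np.round(P, 4), np.linalg.norm(residual))
```

Starting from any stabilizing gain, each evaluation/improvement pair is a Newton step on the ARE, which is why the iteration converges in only a handful of passes.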
20-May-21 | 42Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig. 6.2 The tracking error between the trajectory of the robot system and the prescribed impedance model
Fig. 6.3 Reference trajectory versus the state of the robot system during and after learning

Parameters (simulation)
LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
RL approximation: υ = 0.9, γ = 0.9, δ_Δ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
Position control: K_s = 20, F = 10, G = 1, μ_1 = μ_2 = 5, k = 0.1, Z̄ = 100, κ_p = 20, κ_d = 10, k_e = 0.18 s
θ_d = 10 sin(0.2πt), M_d = 1 kg, B_d = 7.18 N·s·mm⁻¹, K_d = 15.76 N·mm⁻¹, K_h = 0.046

Validation results
1 2 3 4 5 6 7 8
20-May-21 | 43Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig. 6.4 The output trajectory and impedance model in the inner loop
Fig. 6.5 The desired trajectory and output trajectory in the outer loop
Fig. 6.6 Interaction force

Validation results (experiment): discussion
Excellent performance of the human-robot cooperation.
Only a little deviation between the desired and actual trajectories.
The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller.
1 2 3 4 5 6 7 8
20-May-21 | 44Presenter Cong Phat Vo
CONTENTS
Chapter 1  Introduction
Chapter 2  Modeling and platform setup based on the APAM configuration
Chapter 3  Adaptive Gain ITSMC with TDE scheme
Chapter 4  Adaptive Finite-Time Force Sensorless Control scheme
Chapter 5  Model-free control for Optimal contact force in Unknown Environment
Chapter 6  Optimized Human-Robot Interaction force control using IRL
Chapter 7  Force Sensorless Reflecting Control for BHTS
Chapter 8  Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 45Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Proposed control idea
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed.
2. The stability and finite-time convergence characteristics of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified under different working conditions.
1 2 3 4 5 6 7 8
20-May-21 | 46Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
The external torque estimation (AFOB), with adaptive gain $\kappa_i$ acting on the observer internal states $\zeta_{1i}, \zeta_{2i}$:
$$\hat{\tau}_{id} = -\kappa_i\big(\zeta_{1i} + \zeta_{2i}\big), \qquad
\dot{\zeta}_{1i} = -a_i \zeta_{1i} + \tau_i, \qquad
\dot{\zeta}_{2i} = -a_i \zeta_{2i} + \hat{\tau}_{id}$$

The modified FNTSM surface design:
$$s_i = \dot{e}_i + \lambda_i\, \mathrm{sig}(e_i)^{p_i/q_i}$$

The torque control design in FNTSMC style:
$$\tau_i = M_i \ddot{\theta}_{ir} + C_i \dot{\theta}_i + K_i \theta_i - \hat{\tau}_{id}
- \big[\rho_{1i}\, \mathrm{sig}(s_i)^{\nu_i} + \rho_{2i}\, s_i\big]$$

An adaptive FSOB gain:
$$\dot{\kappa}_i = \begin{cases}
\eta_i\, |s_i|\, \mathrm{sign}\big(|s_i| - \varepsilon_i\big), & \text{if } \kappa_i > \mu \ \text{or} \ \big(\kappa_i = \mu \ \text{and} \ |s_i| > \varepsilon_i\big) \\
\mu\, z_i\, \mathrm{sign}(s_i), & \text{otherwise}
\end{cases}$$
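The fractional-power term sig(·)^a that appears in the FNTSM surface and the reaching law is easy to get wrong for negative arguments; the sketch below (with illustrative gains, not necessarily those of the thesis) shows one safe implementation.

```python
import numpy as np

def sig(x, a):
    """Fractional-power signum term sig(x)^a = |x|^a * sign(x)."""
    return np.abs(x) ** a * np.sign(x)

# Assumed FNTSM-style surface and reaching law (gains are illustrative):
#   s      = e_dot + lam * sig(e)^(p/q)
#   tau_sw = -rho1 * sig(s)^nu - rho2 * s
lam, p, q = 8.0, 5, 7
rho1, rho2, nu = 10.0, 2.0, 9 / 11

def surface(e, e_dot):
    return e_dot + lam * sig(e, p / q)

def switching_torque(s):
    return -rho1 * sig(s, nu) - rho2 * s

s = surface(0.2, -0.1)
print(s, switching_torque(s))
```

Writing |x|^a·sign(x) instead of x^a keeps the term real-valued and odd for negative errors, which is what gives the terminal sliding mode its finite-time convergence without singularity.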
[Block diagram: Operator → Master Controller → APAM Master system on the master side, and Environment → Slave Controller → APAM Slave system on the slave side, coupled through the communication channel; each side carries a torque estimator ($\hat{\tau}_{md}$, $\hat{\tau}_{sd}$), and an impedance filter shapes the reflected force $f_m$ to the operator.]
Control design
1 2 3 4 5 6 7 8
20-May-21 | 47Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig. 7.1 Torque estimation performances

Parameters
Proposed control: λ_i = 8, p_i = 5, q_i = 7, ς_i = 10⁻⁴, ν_i = 9/11, ε_i = 0.1, ρ_1i = 10, ρ_2i = 2
PID control: K_P = 38, K_I = 25, K_D = 0.15
Observers: for NDO, L_h = L_e = 10; for RTOB, β = 350 rad/s; for AFOB, κ_i(0) = 0.1, λ_i = 10, η_i = 0, m = 0.1, a = 10

The RMSEs associated with the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.

Scenarios:
- The free-space situation: θ_d(t) = 5π(1 − 0.18t) sin(0.72πt)
- The contact and recovery situation: soft environment K_e = 500 N·mm⁻¹, B_e = 4 N·s·mm⁻¹; stiff environment K_e = 2800 N·mm⁻¹, B_e = 20 N·s·mm⁻¹
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 48Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig. 7.2 Position tracking performances
Fig. 7.3 Position tracking error performances
Fig. 7.4 Transparency performance in "soft" contact
Fig. 7.5 Transparency error in "soft" contact
Fig. 7.6 Transparency performance in "stiff" contact
Fig. 7.7 Transparency error in "stiff" contact
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 49Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Summary
Finite-time convergence of the tracking errors.
Fast transient response.
Robust performance against uncertainties and disturbances in the system dynamics.
Elimination of the uncertainties and the noise effect in the obtained force sensing via the new AFOB.
Stability of the overall system and the transparency performance can be attained simultaneously.

Great transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and in different working conditions.
1 2 3 4 5 6 7 8
20-May-21 | 50Presenter Cong Phat Vo
CONTENTS
Chapter 1  Introduction
Chapter 2  Modeling and platform setup based on the APAM configuration
Chapter 3  Adaptive Gain ITSMC with TDE scheme
Chapter 4  Adaptive Finite-Time Force Sensorless Control scheme
Chapter 5  Model-free control for Optimal contact force in Unknown Environment
Chapter 6  Optimized Human-Robot Interaction force control using IRL
Chapter 7  Force Sensorless Reflecting Control for BHTS
Chapter 8  Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 51Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Conclusions
Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions (uncertainties and disturbances are handled).
Validated the significant effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance.
Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding a highly effective position control.
Achieved good transparency performance with both force feedback and position tracking simultaneously by the new adaptive force estimation combined with separate FNTSMC schemes.
1 2 3 4 5 6 7 8
20-May-21 | 52Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Future works
To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
Apply optimized strategies on both the master and slave sides; note also that the computational complexity should be reduced as the data size grows.
Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device for the rehabilitation field with a safer solution.
1 2 3 4 5 6 7 8
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
1 2 3 4 5 6 7 8
20-May-21 | 32Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
LQR solution RL solution
impossible for all time
The Hamilton-Jacobi-Bellman equation
( ) ( ) ( )min ( ) ( ) ( ) 0
minus minus + minus =
d
t
s rtF
V e t r t r e dt
The new Algebraic Riccati equation
( )( ) ( )
2 ( ) ( ) ( ) ( ) 0
T T
r r
T T
r d d d
e t A P PA S P e t
e t PBF t F t RF t
+ + minus
+ + =
The optimal desired force 1
22 21( ) ( )d rF t H H e tminus= minus
( ) ( ) ( ) ( ) ( )= = T
r d s rQ e t F t V e t X HX
IRL approximation
The discounted cost function
( ) ( )
( )
( )( ) ( ) ( )
where ( ) ( ) ( ) ( ) ( ) ( )
t
s r r dt
T T
r d r r d d
V e t r e F e d
r e F e Se F RF
minus minus=
= +
Implementation procedure
1 2 3 4 5 6 7 8
( ) ( )=d e rF t e t
online soliton using the off-policy RL to
find the solution of the given new ARE
20-May-21 | 33Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
LQR solution RL solution
minimize temporal difference error
The approximator of the Q-value function
( )( ) ( )1
ˆ N
r d i r d i
i
Q e F e F =
=
The parameter update law
11( ) exp
2
T
i r d r i d i i r i d ie F e c F c e c F c minus = minus minus minus minus minus
( )
( ) ( ) ( ) ( )
ˆ ( ) ( ) ( )1( ) ( )
( )
=
= minus +
r d
t t t t
Q e t F t tt t
t t
IRL approximation
Q-function update law based on IRL
( )
( )
( )( ) ( ) ( )
( ) ( )
t tt
r dt
t
r d
Q e t F t r e d
e Q e t t F t t
+
minus minus
minus
=
+ + +
( ) ( ) ( ) ( ) ( )T T
r r d dr t e t Se t F t RF t= +
The reward function of the utility function
Implementation procedure
1 2 3 4 5 6 7 8
1
( )( )
( )
i r di r d N
i r di
e Fe F
e F
=
=
( ) ( )=d e rF t e t
20-May-21 | 34Presenter Cong Phat Vo
Fig 51 RL in unknown environments during the
learning process
Fig 52 Learning curves
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
κe = 00025 S = 1 R = 10 T = 0001 A = minus4006
B = minus0002
υ = 09 γ = 09 δΔ = 065 ε = 01 ε-decay = 099
K = 10 β = 01
LQR solution
RL aproximation
PositionForce controlKPf = 0001 KIf = 002 λ = 02 ηΔ = 01 ρ1 = 05
ρ2 = 20 v = 911 α = 57
EnvironmentCe = 05 Nsmm-1 Ke = 2 N mm-1 r0 = 165 mm
xd = 39917 mm xe = 35 mm
Parameters
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 35Presenter Cong Phat Vo
Fig 53 Desired forceposition tracking response
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Three steps are required to complete
the experiment
1-the identification of the environment
2-obtain the ARE solution using the
environmental estimation
3-the complete experiment for the
positionforce controller
IRL approximation
Achieving near-optimal performance without
knowledge of the environment dynamics
LQR solution
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 36Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
4
5
C
H
A
P
T
E
R
1
7
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Optimized Human-Robot Interaction force control using IRL6
1 2 3 4 5 6 7 8
20-May-21 | 37Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
3 The unknown human dynamics as well as the optimal overall
human-robot system efficiency is verified in experimental trials
2
Find the optimal parameters of the impedance model to
assure motion tracking and also assist the human with
minimum effort to perform the task in real time
The stability of the closed-loop control is theoretically
demonstrated
1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 38Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Task-specific Outer-loopRobot-specific Inner-loop
( )1 20
ˆ ( )
ˆ (|| || )
= + + +
+ + minus
t
s h h h
q B h h
K e e e d
K Z Z K
The control torque in task space
1 2
1 2 1
2 1 2
ˆ ˆˆ ( ) ( )
ˆ ˆˆ ˆ ˆ || ||
ˆˆ ˆ ˆ( ) || ||
=
= minus minus = minus
T T
T T T
h h h
T
h h
q W W q
W F F W q kF W
W Gq W kG W
2
h hm
d d d
K
M s C s K
=
+ +
An approximation of the continuous functionThe prescribed
impedance model
Prescribed
Impedance Model
Haptic interface
based-APAM
Torque
control lawHuman
Operator
Optimal Law-based
IRL
Kh+-
NN functional
approximation
1
s
θ +
-
τ θ d θ m eh
τ h
ξ
ed
Control design
1 2 3 4 5 6 7 8
20-May-21 | 39Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Task-specific Outer-loopRobot-specific Inner-loop
1
1
[ ]
minus
minus
= +
= minus
=
=
h h h h d
h e
h e p d
T T T
d d d
A B e
A k
B k
e e e
( )1
+=
+
p d
e
sG s
k s
The human transfer function [99] Purpose τh rarr ed
0
0
= + = +
= minus
q d qd e
e
h h hh e
A e Be X AX Buu
B A u KX
A new augmented state
Prescribed
Impedance Model
Haptic interface
based-APAM
Torque
control lawHuman
Operator
Optimal Law-based
IRL
Kh+-
NN functional
approximation
1
s
θ +
-
τ θ d θ m eh
τ h
ξ
ed
IRL based on LQR solution
Control design
1 2 3 4 5 6 7 8
20-May-21 | 40Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
The performance index
( )T T T
m d d d h h h e et
J e Q e Q u Ru d
= + +
10 T TA P PA Q PB R BPminus= + + minus
The algebraic Riccati equation [83]
0
0
q d qd
e
h h hh
e
A e BeX u
B A
u KX
= = +
= minus
A new augmented state
IRL solutionRL solution
An optimal feeback control
To minimize
solve
The infinite-horizon quadratic cost
Control design
The IRL Bellman equation for time interval ΔT gt 0
Proof of convergence See in Appendix 62
1 2 3 4 5 6 7 8
( ) 1
( )
arg min ( ) ( )e
T
e eu t
u V t X t u R B PX
minus
= = minus
( )7 7( ( )) ( ) ( ) ( ( ))t T
T T T
tV X t X Q K RK X d V X t T
+
= + minus +
( )7 ( ( )) ( ) ( )T T T T
tV X t X Q K RK X d X PX
= + =
( ) ( ) ( )T TA BK P P A BK Q K RKminus + minus = minus +
where P is the real symmetric positive definite solution
( )( ) ( ) = ( ) ( )
( ) ( )
t tT T T T
t
T
X t PX t X Q K RK X d
X t t PX t t
+
+
+ + +
The IRL Bellman equation can be written as
20-May-21 | 41Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 61 Continuous-time linear policy iteration algorithm
1 Start with an admissible control input
0 0
1= minuseu K X
2 Policy evaluation given a control policy ui find the
Pi using off-policy Bellman equation
1 1
1
+ minus= minusi T i
eu R B P X
3 Policy Improvement update the control policy
Online implementation
Control design
1 2 3 4 5 6 7 8
( )( )( ) ( )= ( ) ( )
( ) ( )
t t TT i T i i T
t
T i
X t P X t X Q K RK X d
X t t P X t t
+
+
+ + +
Begin
1 2
1 2
1
( ) ( )
( ) ( ) ( )
( )
N i i i
Ti i N i
i T
t t t
K K K
p
minus
= = minus +
=
=
Solving for the cost using least squares
1|| ||i ip p minusminus
0
1 10 1 ( ) 0P i K A BK= = minus
Initialization
1 1i T i
eu R B P Xminus minus= minus
Policy update
End
No
Yes
1i irarr +
20-May-21 | 42Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 62 The tracking error of the trajectory of the robot system and
the prescribed impedance model
Fig 63 Reference trajectory versus the state of the robot systems
during and after learning
κe = 00025 S = 1 R = 10 T = 0001
A = minus4006 B = minus0002
υ = 09 γ = 09 δΔ = 065 ε = 01
ε-decay = 099 K = 10 β = 01
LQR solution
RL aproximation
Position controlKs = 20 F = 10 G = 1 μ1 = μ2 = 5 k = 01
= 100 κp = 20 κd = 10 and ke = 018 s
θd = 10sin(02πt) Md = 1 Kg Bd = 718
Nsmm-1 Kd =1576 Nmm-1 Kh = 0046
Parameters
Z
Simulation
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 43Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 64 The output trajectory and impedance model in the inner loop
Fig 65 The desired trajectory and output trajectory the outer loop
Fig 66 Interaction force
Perfectly performance of the human-
robot cooperation
A little deviation between the desired
and actual trajectory
The human interaction force is reduced
after learning and the optimal set of the
prescribed impedance model is found by
the outer-loop controller
Experiment
Validation results
Discussion
1 2 3 4 5 6 7 8
20-May-21 | 44Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Conclusion and Future works
Introduction
7
4
5
6
C
H
A
P
T
E
R
1
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Force Sensorless Reflecting Control for BHTS7
1 2 3 4 5 6 7 8
20-May-21 | 45Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
3 The effectiveness of the proposed control method is verified in
the different working conditions
2The stability and finite-time convergence characteristic of the closed-
loop control are theoretically analyzed by the Lyapunov approach
The first time the hybrid control based
on a fast finite-time NTSMC and AFOB1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 46Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
( )1 2id = minus +
1 1
2
= minus + = minus +
id
T
i i id i
T
i ia = minus
The external torque estimation
[ ]
1 2ˆ( ) = + + minus minus minus +iv
i i ir i i i i i i i i i i idM C K sign s s s
( )i i i is = +
[ ]
2
1 2
if 0 or 0 and ( )
( ) if 0 and
i
i i i i i i
i i
i i i i i i i i
s s
z z sign s
= =
+
An adaptive FSOB gain
where
where
The torque control design by style-FNTSMC
The modified FNTSM surface design
Master Slave
Environment
APAM Slave
system
Slave
Controller
Estimator
Oper
ator
Master
Controller
APAM Master
system
Estimator
Impendence
filter
+
++
-
+
-
+
+md
m s
s
sd
ˆsdˆmd
mf
m
d
Control design
1 2 3 4 5 6 7 8
20-May-21 | 47Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 71 Torque estimation performances
λi = 8 pi = 5 qi = 7 ςi = 10-4 νi = 911
εi = 01 ρ1i = 10 ρ2i = 2
Proposed control
PID control KP = 38 KI = 25 KD = 015
Observersfor NDO Lh = Le = 10 for RTOB β = 350 rads
for AFOB κi(0) = 01 λi = 10 ηi = 0 m = 01
a = 10
RMSEs associated with NDO RTOB and proposed algorithm
are calculated of 0211 0192 and 0076 respectively
Parameters
Scenarios- The free space situation θd (t) = 5π(1 - 018t)sin(072πt)
- The contact and recovery situation soft environment Ke = 500 Nmm-1 Be = 4 Nsmm-1
stiff environment Ke = 2800 Nmm-1 Be = 20 Nsmm-1
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 48Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 72 Position tracking performances
Fig 72 Position tracking error performances
Fig 72 Transparency performance in ldquosoftrdquo
Fig 72 Transparency error in soft contract
Fig 72 Transparency performance in ldquostiffrdquo
Fig 72 Transparency error in stiff contract
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 49Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Finite-time convergence of the
tracking errors
Eliminating the uncertainties the
noise effect in the obtained force
sensing via new AFOB
Robust performance against uncertainties
and disturbances in the system dynamics
Stability of the overall system and the
transparency performance can be
attained simultaneously
Fast transient response
Summary
The great transparency performance and global stability of the BHTS under the proposed
control scheme are achieved despite the uncertainties and in different working conditions
1 2 3 4 5 6 7 8
20-May-21 | 50Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Introduction
4
5
6
C
H
A
P
T
E
R
1
7
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Conclusion and Future works78
1 2 3 4 5 6 7 8
20-May-21 | 51Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Verified an effective position tracking controller via the AITSMC with TDE technique under
various working conditions (Its uncertainties and disturbance are handled)
Validated the significant effectiveness of finite-time force tracking problems based on the
FITSMC combined with a new adaptive gain and a friction-free disturbance (FSOB)
Based on the online learning ability of the RL approximation method the model-free controller
for optimal contact force ensures great position and force tracking performance
Optimized a prescribed impedance model parameters by the IRL approach to provide little
operator information in yielding highly effective position control purpose
Achieved good transparency performance with both force feedback and position tracking
simultaneously by the new adaptive force estimation combined the separately FNTSMC schemes
Conclutions
1 2 3 4 5 6 7 8
20-May-21 | 52Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Future works
To improve the control performance in joint-space the robot system model can be
improved by robust identification approaches
Effectiveness of the proposed algorithm can be further increased by developing an automatic
tune method for both the estimation and control gains
Apply optimized strategies on both master and slave sides Note also that the computational
complexity should be reduced by the data size
Based on the developed prototype in this thesis it is possible to develop a high-quality
commercial teleoperation device in the rehabilitation field with a safer solution
1 2 3 4 5 6 7 8
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
1 2 3 4 5 6 7 8
20-May-21 | 33Presenter Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The main objective of this work is to determine the optimal desired force
Fd which is the minimum force that minimizes the position error er
LQR solution RL solution
minimize temporal difference error
The approximator of the Q-value function
( )( ) ( )1
ˆ N
r d i r d i
i
Q e F e F =
=
The parameter update law
11( ) exp
2
T
i r d r i d i i r i d ie F e c F c e c F c minus = minus minus minus minus minus
( )
( ) ( ) ( ) ( )
ˆ ( ) ( ) ( )1( ) ( )
( )
=
= minus +
r d
t t t t
Q e t F t tt t
t t
IRL approximation
Q-function update law based on IRL
( )
( )
( )( ) ( ) ( )
( ) ( )
t tt
r dt
t
r d
Q e t F t r e d
e Q e t t F t t
+
minus minus
minus
=
+ + +
( ) ( ) ( ) ( ) ( )T T
r r d dr t e t Se t F t RF t= +
The reward function of the utility function
Implementation procedure
1 2 3 4 5 6 7 8
1
( )( )
( )
i r di r d N
i r di
e Fe F
e F
=
=
( ) ( )=d e rF t e t
20-May-21 | 34Presenter Cong Phat Vo
Fig 51 RL in unknown environments during the
learning process
Fig 52 Learning curves
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
κe = 00025 S = 1 R = 10 T = 0001 A = minus4006
B = minus0002
υ = 09 γ = 09 δΔ = 065 ε = 01 ε-decay = 099
K = 10 β = 01
LQR solution
RL aproximation
PositionForce controlKPf = 0001 KIf = 002 λ = 02 ηΔ = 01 ρ1 = 05
ρ2 = 20 v = 911 α = 57
EnvironmentCe = 05 Nsmm-1 Ke = 2 N mm-1 r0 = 165 mm
xd = 39917 mm xe = 35 mm
Parameters
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 35Presenter Cong Phat Vo
Fig 53 Desired forceposition tracking response
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Three steps are required to complete
the experiment
1-the identification of the environment
2-obtain the ARE solution using the
environmental estimation
3-the complete experiment for the
positionforce controller
IRL approximation
Achieving near-optimal performance without
knowledge of the environment dynamics
LQR solution
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 36Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
4
5
C
H
A
P
T
E
R
1
7
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Optimized Human-Robot Interaction force control using IRL6
1 2 3 4 5 6 7 8
20-May-21 | 37Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
3 The unknown human dynamics as well as the optimal overall
human-robot system efficiency is verified in experimental trials
2
Find the optimal parameters of the impedance model to
assure motion tracking and also assist the human with
minimum effort to perform the task in real time
The stability of the closed-loop control is theoretically
demonstrated
1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 38Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Task-specific Outer-loopRobot-specific Inner-loop
( )1 20
ˆ ( )
ˆ (|| || )
= + + +
+ + minus
t
s h h h
q B h h
K e e e d
K Z Z K
The control torque in task space
1 2
1 2 1
2 1 2
ˆ ˆˆ ( ) ( )
ˆ ˆˆ ˆ ˆ || ||
ˆˆ ˆ ˆ( ) || ||
=
= minus minus = minus
T T
T T T
h h h
T
h h
q W W q
W F F W q kF W
W Gq W kG W
2
h hm
d d d
K
M s C s K
=
+ +
An approximation of the continuous functionThe prescribed
impedance model
Prescribed
Impedance Model
Haptic interface
based-APAM
Torque
control lawHuman
Operator
Optimal Law-based
IRL
Kh+-
NN functional
approximation
1
s
θ +
-
τ θ d θ m eh
τ h
ξ
ed
Control design
1 2 3 4 5 6 7 8
20-May-21 | 39Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Task-specific Outer-loopRobot-specific Inner-loop
1
1
[ ]
minus
minus
= +
= minus
=
=
h h h h d
h e
h e p d
T T T
d d d
A B e
A k
B k
e e e
( )1
+=
+
p d
e
sG s
k s
The human transfer function [99] Purpose τh rarr ed
0
0
= + = +
= minus
q d qd e
e
h h hh e
A e Be X AX Buu
B A u KX
A new augmented state
Prescribed
Impedance Model
Haptic interface
based-APAM
Torque
control lawHuman
Operator
Optimal Law-based
IRL
Kh+-
NN functional
approximation
1
s
θ +
-
τ θ d θ m eh
τ h
ξ
ed
IRL based on LQR solution
Control design
1 2 3 4 5 6 7 8
20-May-21 | 40Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
The performance index
( )T T T
m d d d h h h e et
J e Q e Q u Ru d
= + +
10 T TA P PA Q PB R BPminus= + + minus
The algebraic Riccati equation [83]
0
0
q d qd
e
h h hh
e
A e BeX u
B A
u KX
= = +
= minus
A new augmented state
IRL solutionRL solution
An optimal feeback control
To minimize
solve
The infinite-horizon quadratic cost
Control design
The IRL Bellman equation for time interval ΔT gt 0
Proof of convergence See in Appendix 62
1 2 3 4 5 6 7 8
( ) 1
( )
arg min ( ) ( )e
T
e eu t
u V t X t u R B PX
minus
= = minus
( )7 7( ( )) ( ) ( ) ( ( ))t T
T T T
tV X t X Q K RK X d V X t T
+
= + minus +
( )7 ( ( )) ( ) ( )T T T T
tV X t X Q K RK X d X PX
= + =
( ) ( ) ( )T TA BK P P A BK Q K RKminus + minus = minus +
where P is the real symmetric positive definite solution
( )( ) ( ) = ( ) ( )
( ) ( )
t tT T T T
t
T
X t PX t X Q K RK X d
X t t PX t t
+
+
+ + +
The IRL Bellman equation can be written as
20-May-21 | 41Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design — online implementation
Fig 6.1 Continuous-time linear policy iteration algorithm

1. Start with an admissible control input ue^0 = −K^1 X.
2. Policy evaluation: given a control policy u^i, find P^i from the off-policy IRL Bellman equation
   X^T(t) P^i X(t) = ∫_t^{t+ΔT} X^T (Q + (K^i)^T R K^i) X dτ + X^T(t+ΔT) P^i X(t+ΔT)
3. Policy improvement: update the control policy ue^{i+1} = −R^{-1} B^T P^i X.

Flowchart: Begin → Initialization: P^0 = 0, i = 1, K^1 chosen so that (A − BK^1) is stable → solve for the cost vector p^i by least squares over N sampling intervals [t1, t2), …, [tN, tN+1) → if ||p^i − p^{i−1}|| < ε: apply the policy update ue = −R^{-1} B^T P^{i−1} X and End; otherwise i → i + 1 and repeat.
20-May-21 | 42 | Presenter: Cong Phat Vo
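Fig 6.1's loop — evaluate the current gain, then improve with u = −R⁻¹BᵀPX — can be sketched for a scalar system. Here the policy-evaluation step is done with the closed-form Lyapunov solution rather than the online least-squares identification, and all values are illustrative assumptions:

```python
a, b, q, r = 1.0, 1.0, 1.0, 1.0
K = 2.0                          # K^1: admissible, since a - b*K = -1 is stable
P_prev = 0.0                     # P^0 = 0
for i in range(50):
    # Policy evaluation: 2*(a - b*K)*P = -(q + r*K^2)
    P = (q + r * K * K) / (-2.0 * (a - b * K))
    # Policy improvement: u = -R^{-1} B^T P X  =>  K = b*P/r
    K = b * P / r
    if abs(P - P_prev) < 1e-12:  # stop when ||p^i - p^{i-1}|| < eps
        break
    P_prev = P
print(P, K)                      # both converge to 1 + sqrt(2), the ARE solution
```

Each sweep is a Newton step on the Riccati equation, so convergence to the LQR solution is fast (a handful of iterations here).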
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Validation results — simulation
Fig 6.2 Tracking error between the robot trajectory and the prescribed impedance model
Fig 6.3 Reference trajectory versus the robot state during and after learning

Parameters
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: Ks = 20, F = 10, G = 1, μ1 = μ2 = 5, k = 0.1, Z = 100, κp = 20, κd = 10, ke = 0.18
- Impedance model: θd = 10 sin(0.2πt), Md = 1 kg, Bd = 7.18 N·s·mm⁻¹, Kd = 15.76 N·mm⁻¹, Kh = 0.046
20-May-21 | 43 | Presenter: Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Validation results — experiment
Fig 6.4 Output trajectory and impedance model in the inner loop
Fig 6.5 Desired trajectory and output trajectory in the outer loop
Fig 6.6 Interaction force

Discussion
- Near-perfect human-robot cooperation performance
- Only a small deviation between the desired and actual trajectories
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance-model parameters is found by the outer-loop controller
20-May-21 | 44 | Presenter: Cong Phat Vo
CONTENTS — next up: Chapter 7, Force Sensorless Reflecting Control for BHTS
20-May-21 | 45 | Presenter: Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Proposed control idea
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed.
2. The stability and finite-time convergence of the closed-loop control are theoretically analyzed via the Lyapunov approach.
3. The effectiveness of the proposed control method is verified under different working conditions.
20-May-21 | 46 | Presenter: Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
( )1 2id = minus +
1 1
2
= minus + = minus +
id
T
i i id i
T
i ia = minus
The external torque estimation
[ ]
1 2ˆ( ) = + + minus minus minus +iv
i i ir i i i i i i i i i i idM C K sign s s s
( )i i i is = +
[ ]
2
1 2
if 0 or 0 and ( )
( ) if 0 and
i
i i i i i i
i i
i i i i i i i i
s s
z z sign s
= =
+
An adaptive FSOB gain
where
where
The torque control design by style-FNTSMC
The modified FNTSM surface design
Master Slave
Environment
APAM Slave
system
Slave
Controller
Estimator
Oper
ator
Master
Controller
APAM Master
system
Estimator
Impendence
filter
+
++
-
+
-
+
+md
m s
s
sd
ˆsdˆmd
mf
m
d
Control design
1 2 3 4 5 6 7 8
20-May-21 | 47Presenter Cong Phat Vo
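The torque loop above can be illustrated with a minimal sliding-mode regulator on a double-integrator stand-in for one joint. For brevity this sketch uses a plain linear surface s = ė + λe and the reaching law −ρ₁s − ρ₂ sign(s) rather than the thesis's fast nonsingular terminal surface; the gains and the matched disturbance d(t) are illustrative assumptions:

```python
import math

lam, rho1, rho2 = 2.0, 5.0, 1.0      # surface slope and reaching-law gains (illustrative)
dt, t_end = 1e-3, 5.0
e, edot = 1.0, 0.0                   # initial tracking error and its rate

t = 0.0
while t < t_end:
    d = 0.5 * math.sin(t)            # bounded matched disturbance, |d| <= 0.5 < rho2
    s = edot + lam * e               # (linear) sliding surface
    sgn = 1.0 if s > 0 else -1.0 if s < 0 else 0.0
    u = -lam * edot - rho1 * s - rho2 * sgn
    edot += (u + d) * dt             # double-integrator error dynamics: e'' = u + d
    e += edot * dt
    t += dt

print(abs(e))                        # driven near zero despite the disturbance
```

Because ρ₂ exceeds the disturbance bound, s reaches a small neighborhood of zero in finite time and the error then decays along ė ≈ −λe; the residual is set by the sign-function chatter at this step size.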
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Validation results
Fig 7.1 Torque estimation performances

Parameters
- Proposed control: λi = 8, pi = 5, qi = 7, ςi = 10⁻⁴, νi = 9/11, εi = 0.1, ρ1i = 10, ρ2i = 2
- PID control: KP = 38, KI = 25, KD = 0.15
- Observers: for NDO, Lh = Le = 10; for RTOB, β = 350 rad/s; for AFOB, κi(0) = 0.1, λi = 10, ηi = 0, m = 0.1, a = 10

Scenarios
- Free-space situation: θd(t) = 5π(1 − 0.18t) sin(0.72πt)
- Contact and recovery situation: soft environment Ke = 500 N·mm⁻¹, Be = 4 N·s·mm⁻¹; stiff environment Ke = 2800 N·mm⁻¹, Be = 20 N·s·mm⁻¹

The RMSEs associated with the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.
20-May-21 | 48 | Presenter: Cong Phat Vo
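The RMSE figures quoted above (0.211, 0.192, 0.076 for NDO, RTOB, and the proposed observer) are ordinary root-mean-square errors between the estimated and reference torque signals; the metric itself is simply:

```python
import math

def rmse(estimate, reference):
    """Root-mean-square error between two equal-length signals."""
    assert len(estimate) == len(reference)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(estimate, reference)) / len(estimate))

print(rmse([0.0, 0.0], [3.0, 4.0]))  # sqrt((9 + 16) / 2)
```

A lower RMSE over the same trial therefore directly means a tighter torque estimate, which is the basis of the slide's three-way comparison.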
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Validation results
Fig 7.2 Position tracking performances
Fig 7.3 Position tracking error performances
Fig 7.4 Transparency performance in "soft" contact
Fig 7.5 Transparency error in soft contact
Fig 7.6 Transparency performance in "stiff" contact
Fig 7.7 Transparency error in stiff contact
20-May-21 | 49 | Presenter: Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Summary
- Finite-time convergence of the tracking errors
- Elimination of uncertainties and noise effects in the obtained force sensing via the new AFOB
- Robust performance against uncertainties and disturbances in the system dynamics
- Stability of the overall system and transparency performance attained simultaneously
- Fast transient response

High transparency performance and global stability of the BHTS are achieved under the proposed control scheme, despite the uncertainties and across different working conditions.
20-May-21 | 50 | Presenter: Cong Phat Vo
CONTENTS — next up: Chapter 8, Conclusion and Future works
20-May-21 | 51 | Presenter: Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions (uncertainties and disturbances are handled).
- Validated the effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures good position and force tracking performance.
- Optimized the prescribed impedance-model parameters by the IRL approach, requiring little operator information while yielding highly effective position control.
- Achieved good transparency, with force feedback and position tracking attained simultaneously, by combining the new adaptive force estimation with separate FNTSMC schemes.
20-May-21 | 52 | Presenter: Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Future works
- Improve joint-space control performance by refining the robot system model through robust identification approaches.
- Further increase the effectiveness of the proposed algorithms by developing automatic tuning methods for both the estimation and control gains.
- Apply optimized strategies on both the master and slave sides; the computational complexity should also be reduced with respect to the data size.
- Building on the prototype developed in this thesis, develop a high-quality commercial teleoperation device for the rehabilitation field with a safer solution.
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter: Cong Phat Vo
Department of Mechanical Engineering, University of Ulsan, Korea
Thank you for attending.
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Validation results
Fig 5.1 RL in unknown environments during the learning process
Fig 5.2 Learning curves

Parameters
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position/force control: KPf = 0.001, KIf = 0.02, λ = 0.2, ηΔ = 0.1, ρ1 = 0.5, ρ2 = 20, ν = 9/11, α = 5/7
- Environment: Ce = 0.5 N·s·mm⁻¹, Ke = 2 N·mm⁻¹, r0 = 16.5 mm, xd = 39.917 mm, xe = 35 mm
20-May-21 | 35 | Presenter: Cong Phat Vo
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Validation results
Fig 5.3 Desired force/position tracking response

Three steps are required to complete the experiment:
1. Identification of the environment.
2. Obtaining the ARE solution using the environment estimate (LQR solution).
3. Running the complete experiment with the position/force controller (IRL approximation).

Near-optimal performance is achieved without knowledge of the environment dynamics.
20-May-21 | 36 | Presenter: Cong Phat Vo
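Step 1 presumes a contact model for the environment. The slide lists Ke, Ce, and r0 without the equation, so the sketch below assumes the standard Kelvin-Voigt spring-damper form (and the decimal placement of the slide's xd value is reconstructed, so treat the numbers as illustrative):

```python
# Kelvin-Voigt contact model assumption: f = Ke*(x - xe) + Ce*xdot for x >= xe.
Ke, Ce = 2.0, 0.5        # identified stiffness (N/mm) and damping (N*s/mm)
xe, xd = 35.0, 39.917    # environment location and commanded position (mm)

def contact_force(x, xdot):
    """Contact force under the spring-damper environment model (zero in free space)."""
    return Ke * (x - xe) + Ce * xdot if x >= xe else 0.0

f_ss = contact_force(xd, 0.0)   # steady state: xdot = 0
print(f_ss)                     # Ke * (xd - xe)
```

Under this assumption the steady-state contact force for a commanded penetration follows directly from the identified stiffness, which is what links the force objective to the position command in step 3.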
CONTENTS — next up: Chapter 6, Optimized Human-Robot Interaction force control using IRL
20-May-21 | 37 | Presenter: Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Proposed control idea
1. The stability of the closed-loop control is theoretically demonstrated.
2. The optimal parameters of the impedance model are found, assuring motion tracking while assisting the human to perform the task with minimum effort in real time.
3. The unknown human dynamics, as well as the optimal overall human-robot system efficiency, are verified in experimental trials.
20-May-21 | 38 | Presenter: Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Control design
Robot-specific inner loop / task-specific outer loop

- The control torque in task space combines feedback on the impedance tracking error eh (gain Ks, with proportional, derivative, and integral action), an NN-based compensation of the unknown robot nonlinearities, and a robustifying term with gain KB.
- An approximation of the continuous function: the unknown dynamics are approximated by a two-layer network with weight matrices Ŵ1, Ŵ2 and learning-rate gains F, G; the update laws are driven by the tracking error, with σ-modification terms (k‖·‖Ŵ) keeping the weight estimates bounded.
- The prescribed impedance model:
  θm(s)/τh(s) = Kh / (Md s² + Cd s + Kd)

[Block diagram: human operator torque τh → prescribed impedance model → torque control law → APAM-based haptic interface (θ); an NN functional approximation supports the inner loop, and the IRL-based optimal law adapts Kh in the outer loop (integrator 1/s; signals θd, θm, eh, ξ, ed).]
20-May-21 | 39 | Presenter: Cong Phat Vo
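The prescribed impedance model θm(s)/τh(s) = Kh/(Md s² + Cd s + Kd) can be exercised directly: for a constant operator torque it settles at θm = Kh·τh/Kd. A discretized sketch using the simulation chapter's parameter values (decimal points reconstructed from the extraction, so treat them as illustrative):

```python
Md, Cd, Kd, Kh = 1.0, 7.18, 15.76, 0.046   # impedance parameters (illustrative)
tau_h = 1.0                                # constant operator torque

theta, omega, dt = 0.0, 0.0, 1e-3
for _ in range(10000):                     # 10 s of Euler integration
    # Md*theta'' + Cd*theta' + Kd*theta = Kh*tau_h
    alpha = (Kh * tau_h - Cd * omega - Kd * theta) / Md
    omega += alpha * dt
    theta += omega * dt

print(theta, Kh * tau_h / Kd)              # both settle at the static gain Kh/Kd
```

The static gain Kh/Kd is exactly the lever the outer IRL loop pulls on: adapting Kh rescales how much motion the model prescribes per unit of operator torque.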
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Task-specific Outer-loopRobot-specific Inner-loop
1
1
[ ]
minus
minus
= +
= minus
=
=
h h h h d
h e
h e p d
T T T
d d d
A B e
A k
B k
e e e
( )1
+=
+
p d
e
sG s
k s
The human transfer function [99] Purpose τh rarr ed
0
0
= + = +
= minus
q d qd e
e
h h hh e
A e Be X AX Buu
B A u KX
A new augmented state
Prescribed
Impedance Model
Haptic interface
based-APAM
Torque
control lawHuman
Operator
Optimal Law-based
IRL
Kh+-
NN functional
approximation
1
s
θ +
-
τ θ d θ m eh
τ h
ξ
ed
IRL based on LQR solution
Control design
1 2 3 4 5 6 7 8
20-May-21 | 40Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
The performance index
( )T T T
m d d d h h h e et
J e Q e Q u Ru d
= + +
10 T TA P PA Q PB R BPminus= + + minus
The algebraic Riccati equation [83]
0
0
q d qd
e
h h hh
e
A e BeX u
B A
u KX
= = +
= minus
A new augmented state
IRL solutionRL solution
An optimal feeback control
To minimize
solve
The infinite-horizon quadratic cost
Control design
The IRL Bellman equation for time interval ΔT gt 0
Proof of convergence See in Appendix 62
1 2 3 4 5 6 7 8
( ) 1
( )
arg min ( ) ( )e
T
e eu t
u V t X t u R B PX
minus
= = minus
( )7 7( ( )) ( ) ( ) ( ( ))t T
T T T
tV X t X Q K RK X d V X t T
+
= + minus +
( )7 ( ( )) ( ) ( )T T T T
tV X t X Q K RK X d X PX
= + =
( ) ( ) ( )T TA BK P P A BK Q K RKminus + minus = minus +
where P is the real symmetric positive definite solution
( )( ) ( ) = ( ) ( )
( ) ( )
t tT T T T
t
T
X t PX t X Q K RK X d
X t t PX t t
+
+
+ + +
The IRL Bellman equation can be written as
20-May-21 | 41Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 61 Continuous-time linear policy iteration algorithm
1 Start with an admissible control input
0 0
1= minuseu K X
2 Policy evaluation given a control policy ui find the
Pi using off-policy Bellman equation
1 1
1
+ minus= minusi T i
eu R B P X
3 Policy Improvement update the control policy
Online implementation
Control design
1 2 3 4 5 6 7 8
( )( )( ) ( )= ( ) ( )
( ) ( )
t t TT i T i i T
t
T i
X t P X t X Q K RK X d
X t t P X t t
+
+
+ + +
Begin
1 2
1 2
1
( ) ( )
( ) ( ) ( )
( )
N i i i
Ti i N i
i T
t t t
K K K
p
minus
= = minus +
=
=
Solving for the cost using least squares
1|| ||i ip p minusminus
0
1 10 1 ( ) 0P i K A BK= = minus
Initialization
1 1i T i
eu R B P Xminus minus= minus
Policy update
End
No
Yes
1i irarr +
20-May-21 | 42Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 62 The tracking error of the trajectory of the robot system and
the prescribed impedance model
Fig 63 Reference trajectory versus the state of the robot systems
during and after learning
κe = 00025 S = 1 R = 10 T = 0001
A = minus4006 B = minus0002
υ = 09 γ = 09 δΔ = 065 ε = 01
ε-decay = 099 K = 10 β = 01
LQR solution
RL aproximation
Position controlKs = 20 F = 10 G = 1 μ1 = μ2 = 5 k = 01
= 100 κp = 20 κd = 10 and ke = 018 s
θd = 10sin(02πt) Md = 1 Kg Bd = 718
Nsmm-1 Kd =1576 Nmm-1 Kh = 0046
Parameters
Z
Simulation
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 43Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 64 The output trajectory and impedance model in the inner loop
Fig 65 The desired trajectory and output trajectory the outer loop
Fig 66 Interaction force
Perfectly performance of the human-
robot cooperation
A little deviation between the desired
and actual trajectory
The human interaction force is reduced
after learning and the optimal set of the
prescribed impedance model is found by
the outer-loop controller
Experiment
Validation results
Discussion
1 2 3 4 5 6 7 8
20-May-21 | 44Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Conclusion and Future works
Introduction
7
4
5
6
C
H
A
P
T
E
R
1
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Force Sensorless Reflecting Control for BHTS7
1 2 3 4 5 6 7 8
20-May-21 | 45Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
3 The effectiveness of the proposed control method is verified in
the different working conditions
2The stability and finite-time convergence characteristic of the closed-
loop control are theoretically analyzed by the Lyapunov approach
The first time the hybrid control based
on a fast finite-time NTSMC and AFOB1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 46Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
( )1 2id = minus +
1 1
2
= minus + = minus +
id
T
i i id i
T
i ia = minus
The external torque estimation
[ ]
1 2ˆ( ) = + + minus minus minus +iv
i i ir i i i i i i i i i i idM C K sign s s s
( )i i i is = +
[ ]
2
1 2
if 0 or 0 and ( )
( ) if 0 and
i
i i i i i i
i i
i i i i i i i i
s s
z z sign s
= =
+
An adaptive FSOB gain
where
where
The torque control design by style-FNTSMC
The modified FNTSM surface design
Master Slave
Environment
APAM Slave
system
Slave
Controller
Estimator
Oper
ator
Master
Controller
APAM Master
system
Estimator
Impendence
filter
+
++
-
+
-
+
+md
m s
s
sd
ˆsdˆmd
mf
m
d
Control design
1 2 3 4 5 6 7 8
20-May-21 | 47Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 71 Torque estimation performances
λi = 8 pi = 5 qi = 7 ςi = 10-4 νi = 911
εi = 01 ρ1i = 10 ρ2i = 2
Proposed control
PID control KP = 38 KI = 25 KD = 015
Observersfor NDO Lh = Le = 10 for RTOB β = 350 rads
for AFOB κi(0) = 01 λi = 10 ηi = 0 m = 01
a = 10
RMSEs associated with NDO RTOB and proposed algorithm
are calculated of 0211 0192 and 0076 respectively
Parameters
Scenarios- The free space situation θd (t) = 5π(1 - 018t)sin(072πt)
- The contact and recovery situation soft environment Ke = 500 Nmm-1 Be = 4 Nsmm-1
stiff environment Ke = 2800 Nmm-1 Be = 20 Nsmm-1
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 48Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 72 Position tracking performances
Fig 72 Position tracking error performances
Fig 72 Transparency performance in ldquosoftrdquo
Fig 72 Transparency error in soft contract
Fig 72 Transparency performance in ldquostiffrdquo
Fig 72 Transparency error in stiff contract
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 49Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Finite-time convergence of the
tracking errors
Eliminating the uncertainties the
noise effect in the obtained force
sensing via new AFOB
Robust performance against uncertainties
and disturbances in the system dynamics
Stability of the overall system and the
transparency performance can be
attained simultaneously
Fast transient response
Summary
The great transparency performance and global stability of the BHTS under the proposed
control scheme are achieved despite the uncertainties and in different working conditions
1 2 3 4 5 6 7 8
20-May-21 | 50Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Introduction
4
5
6
C
H
A
P
T
E
R
1
7
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Conclusion and Future works78
1 2 3 4 5 6 7 8
20-May-21 | 51Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Verified an effective position tracking controller via the AITSMC with TDE technique under
various working conditions (Its uncertainties and disturbance are handled)
Validated the significant effectiveness of finite-time force tracking problems based on the
FITSMC combined with a new adaptive gain and a friction-free disturbance (FSOB)
Based on the online learning ability of the RL approximation method the model-free controller
for optimal contact force ensures great position and force tracking performance
Optimized a prescribed impedance model parameters by the IRL approach to provide little
operator information in yielding highly effective position control purpose
Achieved good transparency performance with both force feedback and position tracking
simultaneously by the new adaptive force estimation combined the separately FNTSMC schemes
Conclutions
1 2 3 4 5 6 7 8
20-May-21 | 52Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Future works
To improve the control performance in joint-space the robot system model can be
improved by robust identification approaches
Effectiveness of the proposed algorithm can be further increased by developing an automatic
tune method for both the estimation and control gains
Apply optimized strategies on both master and slave sides Note also that the computational
complexity should be reduced by the data size
Based on the developed prototype in this thesis it is possible to develop a high-quality
commercial teleoperation device in the rehabilitation field with a safer solution
1 2 3 4 5 6 7 8
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
1 2 3 4 5 6 7 8
20-May-21 | 35Presenter Cong Phat Vo
Fig 53 Desired forceposition tracking response
5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
Three steps are required to complete
the experiment
1-the identification of the environment
2-obtain the ARE solution using the
environmental estimation
3-the complete experiment for the
positionforce controller
IRL approximation
Achieving near-optimal performance without
knowledge of the environment dynamics
LQR solution
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 36Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
4
5
C
H
A
P
T
E
R
1
7
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Optimized Human-Robot Interaction force control using IRL6
1 2 3 4 5 6 7 8
20-May-21 | 37Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
3 The unknown human dynamics as well as the optimal overall
human-robot system efficiency is verified in experimental trials
2
Find the optimal parameters of the impedance model to
assure motion tracking and also assist the human with
minimum effort to perform the task in real time
The stability of the closed-loop control is theoretically
demonstrated
1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 38Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Task-specific Outer-loopRobot-specific Inner-loop
( )1 20
ˆ ( )
ˆ (|| || )
= + + +
+ + minus
t
s h h h
q B h h
K e e e d
K Z Z K
The control torque in task space
1 2
1 2 1
2 1 2
ˆ ˆˆ ( ) ( )
ˆ ˆˆ ˆ ˆ || ||
ˆˆ ˆ ˆ( ) || ||
=
= minus minus = minus
T T
T T T
h h h
T
h h
q W W q
W F F W q kF W
W Gq W kG W
2
h hm
d d d
K
M s C s K
=
+ +
An approximation of the continuous functionThe prescribed
impedance model
Prescribed
Impedance Model
Haptic interface
based-APAM
Torque
control lawHuman
Operator
Optimal Law-based
IRL
Kh+-
NN functional
approximation
1
s
θ +
-
τ θ d θ m eh
τ h
ξ
ed
Control design
1 2 3 4 5 6 7 8
20-May-21 | 39Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Task-specific Outer-loopRobot-specific Inner-loop
1
1
[ ]
minus
minus
= +
= minus
=
=
h h h h d
h e
h e p d
T T T
d d d
A B e
A k
B k
e e e
( )1
+=
+
p d
e
sG s
k s
The human transfer function [99] Purpose τh rarr ed
0
0
= + = +
= minus
q d qd e
e
h h hh e
A e Be X AX Buu
B A u KX
A new augmented state
Prescribed
Impedance Model
Haptic interface
based-APAM
Torque
control lawHuman
Operator
Optimal Law-based
IRL
Kh+-
NN functional
approximation
1
s
θ +
-
τ θ d θ m eh
τ h
ξ
ed
IRL based on LQR solution
Control design
1 2 3 4 5 6 7 8
20-May-21 | 40Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
The performance index
( )T T T
m d d d h h h e et
J e Q e Q u Ru d
= + +
10 T TA P PA Q PB R BPminus= + + minus
The algebraic Riccati equation [83]
0
0
q d qd
e
h h hh
e
A e BeX u
B A
u KX
= = +
= minus
A new augmented state
IRL solutionRL solution
An optimal feeback control
To minimize
solve
The infinite-horizon quadratic cost
Control design
The IRL Bellman equation for time interval ΔT gt 0
Proof of convergence See in Appendix 62
1 2 3 4 5 6 7 8
( ) 1
( )
arg min ( ) ( )e
T
e eu t
u V t X t u R B PX
minus
= = minus
( )7 7( ( )) ( ) ( ) ( ( ))t T
T T T
tV X t X Q K RK X d V X t T
+
= + minus +
( )7 ( ( )) ( ) ( )T T T T
tV X t X Q K RK X d X PX
= + =
( ) ( ) ( )T TA BK P P A BK Q K RKminus + minus = minus +
where P is the real symmetric positive definite solution
( )( ) ( ) = ( ) ( )
( ) ( )
t tT T T T
t
T
X t PX t X Q K RK X d
X t t PX t t
+
+
+ + +
The IRL Bellman equation can be written as
20-May-21 | 41Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 61 Continuous-time linear policy iteration algorithm
1 Start with an admissible control input
0 0
1= minuseu K X
2 Policy evaluation given a control policy ui find the
Pi using off-policy Bellman equation
1 1
1
+ minus= minusi T i
eu R B P X
3 Policy Improvement update the control policy
Online implementation
Control design
1 2 3 4 5 6 7 8
( )( )( ) ( )= ( ) ( )
( ) ( )
t t TT i T i i T
t
T i
X t P X t X Q K RK X d
X t t P X t t
+
+
+ + +
Begin
1 2
1 2
1
( ) ( )
( ) ( ) ( )
( )
N i i i
Ti i N i
i T
t t t
K K K
p
minus
= = minus +
=
=
Solving for the cost using least squares
1|| ||i ip p minusminus
0
1 10 1 ( ) 0P i K A BK= = minus
Initialization
1 1i T i
eu R B P Xminus minus= minus
Policy update
End
No
Yes
1i irarr +
20-May-21 | 42Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 62 The tracking error of the trajectory of the robot system and
the prescribed impedance model
Fig 63 Reference trajectory versus the state of the robot systems
during and after learning
κe = 00025 S = 1 R = 10 T = 0001
A = minus4006 B = minus0002
υ = 09 γ = 09 δΔ = 065 ε = 01
ε-decay = 099 K = 10 β = 01
LQR solution
RL aproximation
Position controlKs = 20 F = 10 G = 1 μ1 = μ2 = 5 k = 01
= 100 κp = 20 κd = 10 and ke = 018 s
θd = 10sin(02πt) Md = 1 Kg Bd = 718
Nsmm-1 Kd =1576 Nmm-1 Kh = 0046
Parameters
Z
Simulation
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 43Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 64 The output trajectory and impedance model in the inner loop
Fig 65 The desired trajectory and output trajectory the outer loop
Fig 66 Interaction force
Perfectly performance of the human-
robot cooperation
A little deviation between the desired
and actual trajectory
The human interaction force is reduced
after learning and the optimal set of the
prescribed impedance model is found by
the outer-loop controller
Experiment
Validation results
Discussion
1 2 3 4 5 6 7 8
20-May-21 | 44Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Conclusion and Future works
Introduction
7
4
5
6
C
H
A
P
T
E
R
1
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Force Sensorless Reflecting Control for BHTS7
1 2 3 4 5 6 7 8
20-May-21 | 45Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
3 The effectiveness of the proposed control method is verified in
the different working conditions
2The stability and finite-time convergence characteristic of the closed-
loop control are theoretically analyzed by the Lyapunov approach
The first time the hybrid control based
on a fast finite-time NTSMC and AFOB1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 46Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
( )1 2id = minus +
1 1
2
= minus + = minus +
id
T
i i id i
T
i ia = minus
The external torque estimation
[ ]
1 2ˆ( ) = + + minus minus minus +iv
i i ir i i i i i i i i i i idM C K sign s s s
( )i i i is = +
[ ]
2
1 2
if 0 or 0 and ( )
( ) if 0 and
i
i i i i i i
i i
i i i i i i i i
s s
z z sign s
= =
+
An adaptive FSOB gain
where
where
The torque control design by style-FNTSMC
The modified FNTSM surface design
Master Slave
Environment
APAM Slave
system
Slave
Controller
Estimator
Oper
ator
Master
Controller
APAM Master
system
Estimator
Impendence
filter
+
++
-
+
-
+
+md
m s
s
sd
ˆsdˆmd
mf
m
d
Control design
1 2 3 4 5 6 7 8
20-May-21 | 47Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 71 Torque estimation performances
λi = 8 pi = 5 qi = 7 ςi = 10-4 νi = 911
εi = 01 ρ1i = 10 ρ2i = 2
Proposed control
PID control KP = 38 KI = 25 KD = 015
Observersfor NDO Lh = Le = 10 for RTOB β = 350 rads
for AFOB κi(0) = 01 λi = 10 ηi = 0 m = 01
a = 10
RMSEs associated with NDO RTOB and proposed algorithm
are calculated of 0211 0192 and 0076 respectively
Parameters
Scenarios- The free space situation θd (t) = 5π(1 - 018t)sin(072πt)
- The contact and recovery situation soft environment Ke = 500 Nmm-1 Be = 4 Nsmm-1
stiff environment Ke = 2800 Nmm-1 Be = 20 Nsmm-1
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 48Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 72 Position tracking performances
Fig 72 Position tracking error performances
Fig 72 Transparency performance in ldquosoftrdquo
Fig 72 Transparency error in soft contract
Fig 72 Transparency performance in ldquostiffrdquo
Fig 72 Transparency error in stiff contract
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 49Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Finite-time convergence of the
tracking errors
Eliminating the uncertainties the
noise effect in the obtained force
sensing via new AFOB
Robust performance against uncertainties
and disturbances in the system dynamics
Stability of the overall system and the
transparency performance can be
attained simultaneously
Fast transient response
Summary
The great transparency performance and global stability of the BHTS under the proposed
control scheme are achieved despite the uncertainties and in different working conditions
1 2 3 4 5 6 7 8
20-May-21 | 50Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Introduction
4
5
6
C
H
A
P
T
E
R
1
7
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Conclusion and Future works78
1 2 3 4 5 6 7 8
20-May-21 | 51Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Verified an effective position tracking controller via the AITSMC with TDE technique under
various working conditions (Its uncertainties and disturbance are handled)
Validated the significant effectiveness of finite-time force tracking problems based on the
FITSMC combined with a new adaptive gain and a friction-free disturbance (FSOB)
Based on the online learning ability of the RL approximation method the model-free controller
for optimal contact force ensures great position and force tracking performance
Optimized a prescribed impedance model parameters by the IRL approach to provide little
operator information in yielding highly effective position control purpose
Achieved good transparency performance with both force feedback and position tracking
simultaneously by the new adaptive force estimation combined the separately FNTSMC schemes
Conclutions
1 2 3 4 5 6 7 8
20-May-21 | 52Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Future works
To improve the control performance in joint-space the robot system model can be
improved by robust identification approaches
Effectiveness of the proposed algorithm can be further increased by developing an automatic
tune method for both the estimation and control gains
Apply optimized strategies on both master and slave sides Note also that the computational
complexity should be reduced by the data size
Based on the developed prototype in this thesis it is possible to develop a high-quality
commercial teleoperation device in the rehabilitation field with a safer solution
1 2 3 4 5 6 7 8
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
1 2 3 4 5 6 7 8
20-May-21 | 36Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Force Sensorless Reflecting Control for BHTS
Conclusion and Future works
Introduction
7
4
5
C
H
A
P
T
E
R
1
7
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Optimized Human-Robot Interaction force control using IRL6
1 2 3 4 5 6 7 8
20-May-21 | 37Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
3 The unknown human dynamics as well as the optimal overall
human-robot system efficiency is verified in experimental trials
2
Find the optimal parameters of the impedance model to
assure motion tracking and also assist the human with
minimum effort to perform the task in real time
The stability of the closed-loop control is theoretically
demonstrated
1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 38Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design: Robot-specific Inner-loop / Task-specific Outer-loop

The control torque in task space combines a gain $K_s$ on the tracking errors $(e, \dot e, \int e \, d\tau)$, a robust term $K_B$, and an NN functional approximation of the unknown human dynamics, $\hat\tau_h = \hat W^T \xi(q)$, whose weights $\hat W_1, \hat W_2$ are adapted through gains $F$, $G$ with a $\sigma$-modification term $k$. [The detailed expressions did not survive extraction; only this structure is recoverable.]

The prescribed impedance model (reconstructed from the surrounding definitions):

  $\dfrac{\theta_m(s)}{\tau_h(s)} = \dfrac{K_h}{M_d s^2 + C_d s + K_d}$

Block diagram: the Human Operator acts through the Torque control law on the Haptic interface based on APAM; the Prescribed Impedance Model, the NN functional approximation, and the Optimal Law-based IRL (gain $K_h$) close the loop through the signals $\tau_h$, $\tau$, $\theta_d$, $\theta_m$, $e_h$, $e_d$, $\xi$.
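To make the prescribed impedance model concrete, here is a minimal simulation sketch of the second-order filter from human torque to reference motion. The parameter values below are illustrative assumptions, not the thesis values:

```python
import numpy as np

def impedance_response(tau_h, dt, Md=1.0, Cd=7.0, Kd=15.0, Kh=0.05):
    """Simulate theta_m from Md*theta'' + Cd*theta' + Kd*theta = Kh*tau_h
    by semi-implicit Euler integration (all parameters illustrative)."""
    theta, dtheta = 0.0, 0.0
    out = []
    for tau in tau_h:
        ddtheta = (Kh * tau - Cd * dtheta - Kd * theta) / Md
        dtheta += dt * ddtheta
        theta += dt * dtheta
        out.append(theta)
    return np.array(out)

# A step in human torque: theta_m settles toward the DC gain Kh/Kd times tau
y = impedance_response(tau_h=np.full(20000, 10.0), dt=1e-3)
```

The DC gain of the filter is $K_h/K_d$, so a constant torque produces a proportional steady reference displacement, which is exactly the compliant behavior the outer loop later tunes.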
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design: IRL based on the LQR solution

The human transfer function [99] (purpose: $\tau_h \rightarrow e_d$), reconstructed from the fragments:

  $G(s) = \dfrac{\kappa_p + \kappa_d s}{\kappa_e s + 1}$,

written in state space as $\dot e_h = A_h e_h + B_h e_d$ with $A_h = -\kappa_e$ and $B_h = [\kappa_p \; \kappa_d]$, where $e_d = [e_d^T \; \dot e_d^T]^T$.

A new augmented state $X$ collects the impedance-model error and the human state, giving $\dot X = AX + B u_e$ with the feedback $u_e = -KX$.

(The block diagram is the same as on the previous slide.)
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design

The performance index to minimize is the infinite-horizon quadratic cost

  $J_m = \int_t^{\infty} \left( e_d^T Q_d e_d + \tau_h^T Q_h \tau_h + u_e^T R u_e \right) d\tau$

over the augmented state dynamics $\dot X = AX + Bu_e$, with feedback $u_e = -KX$.

RL solution: solve the algebraic Riccati equation [83]

  $0 = A^T P + PA + Q - PBR^{-1}B^T P$,

which yields the optimal feedback control

  $u_e = \arg\min_{u_e} V(t, X(t)) = -R^{-1} B^T P X$.

IRL solution: for a stabilizing gain $K$, the value function is

  $V(X(t)) = \int_t^{\infty} X^T \left( Q + K^T R K \right) X \, d\tau = X^T P X$,

where $P$ is the real symmetric positive-definite solution of

  $(A - BK)^T P + P(A - BK) = -(Q + K^T R K)$.

The IRL Bellman equation for a time interval $\Delta T > 0$ is

  $V(X(t)) = \int_t^{t+\Delta T} X^T \left( Q + K^T R K \right) X \, d\tau + V(X(t+\Delta T))$,

which can be written as

  $X^T(t) P X(t) = \int_t^{t+\Delta T} X^T \left( Q + K^T R K \right) X \, d\tau + X^T(t+\Delta T) P X(t+\Delta T)$.

Proof of convergence: see Appendix 6.2.
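The interval form of the Bellman equation can be sanity-checked numerically: for a stabilizing gain K and the P that solves the corresponding Lyapunov equation, the running cost over [t, t+ΔT] plus the terminal value must reproduce XᵀPX. A pure-NumPy sketch on a toy 2-state system (the matrices and gain are illustrative assumptions, not thesis data):

```python
import numpy as np

def lyap(Acl, Q):
    """Solve Acl^T P + P Acl = -Q by vectorization (fine for small systems)."""
    n = Acl.shape[0]
    M = np.kron(np.eye(n), Acl.T) + np.kron(Acl.T, np.eye(n))
    return np.linalg.solve(M, -Q.reshape(-1)).reshape(n, n)

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.array([[1.0]])
K = np.array([[1.0, 1.0]])            # a stabilizing gain (assumed)
Acl = A - B @ K                       # closed-loop matrix, eigenvalues -1, -3
P = lyap(Acl, Q + K.T @ R @ K)

# Simulate X' = Acl X over [t, t + dT] and accumulate the running cost
X0 = np.array([1.0, -0.5])
X, dt, dT, cost = X0.copy(), 1e-4, 0.5, 0.0
for _ in range(int(dT / dt)):
    cost += dt * X @ (Q + K.T @ R @ K) @ X
    X = X + dt * (Acl @ X)            # forward Euler step

lhs = X0 @ P @ X0                     # X^T(t) P X(t)
rhs = cost + X @ P @ X                # interval cost + terminal value
```

Up to integration error, `lhs` and `rhs` agree, which is the identity the on-line IRL step exploits without knowing A.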
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design: Online implementation

Fig. 6.1 Continuous-time linear policy iteration algorithm:
1. Start with an admissible control input $u_e^0 = -K^1 X$.
2. Policy evaluation: given a control policy $u^i$, find $P^i$ using the off-policy Bellman equation
  $X^T(t) P^i X(t) = \int_t^{t+\Delta T} X^T \left( Q + (K^i)^T R K^i \right) X \, d\tau + X^T(t+\Delta T) P^i X(t+\Delta T)$.
3. Policy improvement: update the control policy
  $u_e^{i+1} = -R^{-1} B^T P^i X$.

Flowchart: Begin → Initialization ($P^0 = 0$, $i = 1$, $K^1$ such that $A - BK^1$ is stable) → solve for the cost $p^i$ by least squares over $N$ sampled intervals → policy update $u_e^i = -R^{-1} B^T P^{i-1} X$ → if $\|p^i - p^{i-1}\| \geq \epsilon$, set $i \rightarrow i+1$ and repeat; otherwise End.
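The policy-iteration loop of Fig. 6.1 has a model-based analogue (Kleinman's algorithm) that is easy to test offline: each policy is evaluated through a Lyapunov equation instead of the on-line least-squares step, then improved with the same gain update. A sketch on an assumed toy system (illustrative matrices, not thesis data):

```python
import numpy as np

def lyap(Acl, Q):
    """Solve Acl^T P + P Acl = -Q by vectorization (small systems only)."""
    n = Acl.shape[0]
    M = np.kron(np.eye(n), Acl.T) + np.kron(Acl.T, np.eye(n))
    return np.linalg.solve(M, -Q.reshape(-1)).reshape(n, n)

def policy_iteration(A, B, Q, R, K0, iters=30):
    """Model-based analogue of the IRL loop: evaluate P^i for policy K^i,
    then improve K^{i+1} = R^{-1} B^T P^i."""
    K = K0
    for _ in range(iters):
        P = lyap(A - B @ K, Q + K.T @ R @ K)   # policy evaluation
        K = np.linalg.solve(R, B.T @ P)        # policy improvement
    return P, K

A = np.array([[0.0, 1.0], [-2.0, -3.0]])       # open-loop stable, so K0 = 0
B = np.array([[0.0], [1.0]])                   # is an admissible policy
Q, R = np.eye(2), np.array([[1.0]])
P, K = policy_iteration(A, B, Q, R, K0=np.zeros((1, 2)))

# The iterates converge to the ARE solution:
# A^T P + P A + Q - P B R^-1 B^T P should be ~0
residual = A.T @ P + P @ A + Q - P @ B @ np.linalg.solve(R, B.T) @ P
```

This is why the on-line version needs no knowledge of A: only the interval costs and the input matrix B enter the two update steps.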
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Validation results: Simulation

Parameters
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −4006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: Ks = 20, F = 10, G = 1, μ1 = μ2 = 5, k = 0.1, Z = 100, κp = 20, κd = 10, and ke = 0.18 s
- Impedance model: θd = 10 sin(0.2πt), Md = 1 kg, Bd = 718 N·s·mm⁻¹, Kd = 1576 N·mm⁻¹, Kh = 0.046

Fig. 6.2 The tracking error between the trajectory of the robot system and the prescribed impedance model.
Fig. 6.3 Reference trajectory versus the state of the robot system during and after learning.
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Validation results: Experiment

Fig. 6.4 The output trajectory and impedance model in the inner loop.
Fig. 6.5 The desired trajectory and the output trajectory in the outer loop.
Fig. 6.6 Interaction force.

Discussion
- Near-perfect performance of the human-robot cooperation.
- A slight deviation between the desired and actual trajectories.
- The human interaction force is reduced after learning, and the optimal set of the prescribed impedance model is found by the outer-loop controller.
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Proposed control idea
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB.
2. The stability and finite-time convergence characteristics of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified under different working conditions.
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Control design
[The slide's equations did not survive extraction; their recoverable structure is summarized below.]
- The external torque estimation: an adaptive observer produces the estimate $\hat\tau_{id}$ of the external torque on each side $i \in \{m, s\}$.
- The torque control design by FNTSMC: a model-based term built from $M_i$, $C_i$, $K_i$, the estimate $\hat\tau_{id}$, and reaching terms of the form $-\rho_{1i} s_i - \rho_{2i} |s_i|^{\nu_i}\mathrm{sign}(s_i)$, defined on the modified FNTSM surface $s_i$.
- An adaptive FSOB gain: a switching adaptation law driven by $s_i$ and $\mathrm{sign}(s_i)$, with one branch while the surface has not converged and another once the convergence condition is met.

Block diagram: on the master side, the Operator acts on the APAM master system through the Master Controller with its torque Estimator; on the slave side, the Slave Controller drives the APAM slave system against the Environment with its own Estimator. The two sides are coupled through the communication channel and an impedance filter, exchanging $\tau_{md}$, $\tau_m$, $\tau_s$, $\tau_{sd}$, the estimates $\hat\tau_{md}$, $\hat\tau_{sd}$, and the human force $f_m$.
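Since the exact FNTSMC and AFOB expressions are not recoverable here, the following is only a generic sliding-mode tracking sketch on a 1-DOF plant. It illustrates the surface-plus-reaching-law structure named above, not the thesis controller; every gain is an assumption:

```python
import numpy as np

def smc_track(theta_d, dtheta_d, ddtheta_d, dt=1e-3,
              lam=8.0, rho=10.0, eps=0.1):
    """1-DOF plant theta'' = u + d(t) tracking theta_d with a linear
    sliding surface s = de + lam*e (generic SMC sketch, assumed gains)."""
    theta, dtheta = 0.0, 0.0
    err = []
    for k in range(len(theta_d)):
        e = theta - theta_d[k]
        de = dtheta - dtheta_d[k]
        s = de + lam * e
        # feedforward + surface feedback + smoothed switching term
        u = ddtheta_d[k] - lam * de - rho * s - eps * np.tanh(s / 0.01)
        d = 0.05 * np.sin(5 * dt * k)      # bounded matched disturbance
        dtheta += dt * (u + d)
        theta += dt * dtheta
        err.append(abs(e))
    return np.array(err)

t = np.arange(0.0, 10.0, 1e-3)
err = smc_track(np.sin(t), np.cos(t), -np.sin(t))
```

On the surface, the error dynamics reduce to ṡ = −ρs − ε·tanh(s/0.01) + d, so s is driven into a small boundary layer despite the disturbance, and the tracking error decays through ė = −λe + s; the tanh replaces the discontinuous sign term to avoid chattering in this sketch.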
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results

Parameters
- Proposed control: λi = 8, pi = 5, qi = 7, ςi = 10⁻⁴, νi = 9/11, εi = 0.1, ρ1i = 10, ρ2i = 2
- PID control: KP = 38, KI = 25, KD = 0.15
- Observers: for the NDO, Lh = Le = 10; for the RTOB, β = 350 rad/s; for the AFOB, κi(0) = 0.1, λi = 10, ηi = 0, m = 0.1, a = 10

Scenarios
- The free-space situation: θd(t) = 5π(1 − 0.18t)sin(0.72πt)
- The contact-and-recovery situation: soft environment Ke = 500 N·mm⁻¹, Be = 4 N·s·mm⁻¹; stiff environment Ke = 2800 N·mm⁻¹, Be = 20 N·s·mm⁻¹

Fig. 7.1 Torque estimation performances. The RMSEs associated with the NDO, the RTOB, and the proposed algorithm are calculated as 0.211, 0.192, and 0.076, respectively.
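The soft and stiff environments above are spring-damper models, so the contact force for a penetration x and penetration rate dx follows fe = Ke·x + Be·dx. A minimal sketch using the slide's stiffness and damping values (the penetration values themselves are made up for illustration):

```python
def contact_force(x, dx, Ke, Be):
    """Spring-damper environment model: force for penetration x (mm)
    and penetration rate dx (mm/s); no pulling force out of contact."""
    return Ke * x + Be * dx if x > 0.0 else 0.0

# Same hypothetical penetration, both environments from the slide:
soft = contact_force(0.5, 2.0, Ke=500.0, Be=4.0)      # 500*0.5 + 4*2 = 258 N
stiff = contact_force(0.5, 2.0, Ke=2800.0, Be=20.0)   # 2800*0.5 + 20*2 = 1440 N
```

The stiff environment returns roughly 5–6 times the force for the same penetration, which is why the "stiff" contact scenario is the harder test of transparency and of the sensorless force estimate.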
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results
Fig. 7.2 Position tracking performances.
Fig. 7.2 Position tracking error performances.
Fig. 7.2 Transparency performance in "soft" contact.
Fig. 7.2 Transparency error in soft contact.
Fig. 7.2 Transparency performance in "stiff" contact.
Fig. 7.2 Transparency error in stiff contact.
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Summary
- Finite-time convergence of the tracking errors.
- Elimination of the uncertainties and of the noise effect in the obtained force sensing via the new AFOB.
- Robust performance against uncertainties and disturbances in the system dynamics.
- Stability of the overall system and the transparency performance can be attained simultaneously.
- Fast transient response.

The great transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and across different working conditions.
8 CONCLUSION AND FUTURE WORKS

Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions (its uncertainties and disturbances are handled).
- Validated the significant effectiveness on finite-time force tracking problems of the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding a highly effective position control.
- Achieved good transparency performance, with both force feedback and position tracking simultaneously, by the new adaptive force estimation combined with separate FNTSMC schemes.
20-May-21 | 52Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Future works
To improve the control performance in joint-space the robot system model can be
improved by robust identification approaches
Effectiveness of the proposed algorithm can be further increased by developing an automatic
tune method for both the estimation and control gains
Apply optimized strategies on both master and slave sides Note also that the computational
complexity should be reduced by the data size
Based on the developed prototype in this thesis it is possible to develop a high-quality
commercial teleoperation device in the rehabilitation field with a safer solution
1 2 3 4 5 6 7 8
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
1 2 3 4 5 6 7 8
20-May-21 | 37Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
3 The unknown human dynamics as well as the optimal overall
human-robot system efficiency is verified in experimental trials
2
Find the optimal parameters of the impedance model to
assure motion tracking and also assist the human with
minimum effort to perform the task in real time
The stability of the closed-loop control is theoretically
demonstrated
1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 38Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Task-specific Outer-loopRobot-specific Inner-loop
( )1 20
ˆ ( )
ˆ (|| || )
= + + +
+ + minus
t
s h h h
q B h h
K e e e d
K Z Z K
The control torque in task space
1 2
1 2 1
2 1 2
ˆ ˆˆ ( ) ( )
ˆ ˆˆ ˆ ˆ || ||
ˆˆ ˆ ˆ( ) || ||
=
= minus minus = minus
T T
T T T
h h h
T
h h
q W W q
W F F W q kF W
W Gq W kG W
2
h hm
d d d
K
M s C s K
=
+ +
An approximation of the continuous functionThe prescribed
impedance model
Prescribed
Impedance Model
Haptic interface
based-APAM
Torque
control lawHuman
Operator
Optimal Law-based
IRL
Kh+-
NN functional
approximation
1
s
θ +
-
τ θ d θ m eh
τ h
ξ
ed
Control design
1 2 3 4 5 6 7 8
20-May-21 | 39Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Task-specific Outer-loopRobot-specific Inner-loop
1
1
[ ]
minus
minus
= +
= minus
=
=
h h h h d
h e
h e p d
T T T
d d d
A B e
A k
B k
e e e
( )1
+=
+
p d
e
sG s
k s
The human transfer function [99] Purpose τh rarr ed
0
0
= + = +
= minus
q d qd e
e
h h hh e
A e Be X AX Buu
B A u KX
A new augmented state
Prescribed
Impedance Model
Haptic interface
based-APAM
Torque
control lawHuman
Operator
Optimal Law-based
IRL
Kh+-
NN functional
approximation
1
s
θ +
-
τ θ d θ m eh
τ h
ξ
ed
IRL based on LQR solution
Control design
1 2 3 4 5 6 7 8
20-May-21 | 40Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
The performance index
( )T T T
m d d d h h h e et
J e Q e Q u Ru d
= + +
10 T TA P PA Q PB R BPminus= + + minus
The algebraic Riccati equation [83]
0
0
q d qd
e
h h hh
e
A e BeX u
B A
u KX
= = +
= minus
A new augmented state
IRL solutionRL solution
An optimal feeback control
To minimize
solve
The infinite-horizon quadratic cost
Control design
The IRL Bellman equation for time interval ΔT gt 0
Proof of convergence See in Appendix 62
1 2 3 4 5 6 7 8
( ) 1
( )
arg min ( ) ( )e
T
e eu t
u V t X t u R B PX
minus
= = minus
( )7 7( ( )) ( ) ( ) ( ( ))t T
T T T
tV X t X Q K RK X d V X t T
+
= + minus +
( )7 ( ( )) ( ) ( )T T T T
tV X t X Q K RK X d X PX
= + =
( ) ( ) ( )T TA BK P P A BK Q K RKminus + minus = minus +
where P is the real symmetric positive definite solution
( )( ) ( ) = ( ) ( )
( ) ( )
t tT T T T
t
T
X t PX t X Q K RK X d
X t t PX t t
+
+
+ + +
The IRL Bellman equation can be written as
20-May-21 | 41Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 61 Continuous-time linear policy iteration algorithm
1 Start with an admissible control input
0 0
1= minuseu K X
2 Policy evaluation given a control policy ui find the
Pi using off-policy Bellman equation
1 1
1
+ minus= minusi T i
eu R B P X
3 Policy Improvement update the control policy
Online implementation
Control design
1 2 3 4 5 6 7 8
( )( )( ) ( )= ( ) ( )
( ) ( )
t t TT i T i i T
t
T i
X t P X t X Q K RK X d
X t t P X t t
+
+
+ + +
Begin
1 2
1 2
1
( ) ( )
( ) ( ) ( )
( )
N i i i
Ti i N i
i T
t t t
K K K
p
minus
= = minus +
=
=
Solving for the cost using least squares
1|| ||i ip p minusminus
0
1 10 1 ( ) 0P i K A BK= = minus
Initialization
1 1i T i
eu R B P Xminus minus= minus
Policy update
End
No
Yes
1i irarr +
20-May-21 | 42Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 62 The tracking error of the trajectory of the robot system and
the prescribed impedance model
Fig 63 Reference trajectory versus the state of the robot systems
during and after learning
κe = 00025 S = 1 R = 10 T = 0001
A = minus4006 B = minus0002
υ = 09 γ = 09 δΔ = 065 ε = 01
ε-decay = 099 K = 10 β = 01
LQR solution
RL aproximation
Position controlKs = 20 F = 10 G = 1 μ1 = μ2 = 5 k = 01
= 100 κp = 20 κd = 10 and ke = 018 s
θd = 10sin(02πt) Md = 1 Kg Bd = 718
Nsmm-1 Kd =1576 Nmm-1 Kh = 0046
Parameters
Z
Simulation
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 43Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 64 The output trajectory and impedance model in the inner loop
Fig 65 The desired trajectory and output trajectory the outer loop
Fig 66 Interaction force
Perfectly performance of the human-
robot cooperation
A little deviation between the desired
and actual trajectory
The human interaction force is reduced
after learning and the optimal set of the
prescribed impedance model is found by
the outer-loop controller
Experiment
Validation results
Discussion
1 2 3 4 5 6 7 8
20-May-21 | 44Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Conclusion and Future works
Introduction
7
4
5
6
C
H
A
P
T
E
R
1
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Force Sensorless Reflecting Control for BHTS7
1 2 3 4 5 6 7 8
20-May-21 | 45Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
3 The effectiveness of the proposed control method is verified in
the different working conditions
2The stability and finite-time convergence characteristic of the closed-
loop control are theoretically analyzed by the Lyapunov approach
The first time the hybrid control based
on a fast finite-time NTSMC and AFOB1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 46Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
( )1 2id = minus +
1 1
2
= minus + = minus +
id
T
i i id i
T
i ia = minus
The external torque estimation
[ ]
1 2ˆ( ) = + + minus minus minus +iv
i i ir i i i i i i i i i i idM C K sign s s s
( )i i i is = +
[ ]
2
1 2
if 0 or 0 and ( )
( ) if 0 and
i
i i i i i i
i i
i i i i i i i i
s s
z z sign s
= =
+
An adaptive FSOB gain
where
where
The torque control design by style-FNTSMC
The modified FNTSM surface design
Master Slave
Environment
APAM Slave
system
Slave
Controller
Estimator
Oper
ator
Master
Controller
APAM Master
system
Estimator
Impendence
filter
+
++
-
+
-
+
+md
m s
s
sd
ˆsdˆmd
mf
m
d
Control design
1 2 3 4 5 6 7 8
20-May-21 | 47Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 71 Torque estimation performances
λi = 8 pi = 5 qi = 7 ςi = 10-4 νi = 911
εi = 01 ρ1i = 10 ρ2i = 2
Proposed control
PID control KP = 38 KI = 25 KD = 015
Observersfor NDO Lh = Le = 10 for RTOB β = 350 rads
for AFOB κi(0) = 01 λi = 10 ηi = 0 m = 01
a = 10
RMSEs associated with NDO RTOB and proposed algorithm
are calculated of 0211 0192 and 0076 respectively
Parameters
Scenarios- The free space situation θd (t) = 5π(1 - 018t)sin(072πt)
- The contact and recovery situation soft environment Ke = 500 Nmm-1 Be = 4 Nsmm-1
stiff environment Ke = 2800 Nmm-1 Be = 20 Nsmm-1
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 48Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 72 Position tracking performances
Fig 72 Position tracking error performances
Fig 72 Transparency performance in ldquosoftrdquo
Fig 72 Transparency error in soft contract
Fig 72 Transparency performance in ldquostiffrdquo
Fig 72 Transparency error in stiff contract
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 49Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Finite-time convergence of the
tracking errors
Eliminating the uncertainties the
noise effect in the obtained force
sensing via new AFOB
Robust performance against uncertainties
and disturbances in the system dynamics
Stability of the overall system and the
transparency performance can be
attained simultaneously
Fast transient response
Summary
The great transparency performance and global stability of the BHTS under the proposed
control scheme are achieved despite the uncertainties and in different working conditions
1 2 3 4 5 6 7 8
20-May-21 | 50Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Introduction
4
5
6
C
H
A
P
T
E
R
1
7
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Conclusion and Future works78
1 2 3 4 5 6 7 8
20-May-21 | 51Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Verified an effective position tracking controller via the AITSMC with TDE technique under
various working conditions (Its uncertainties and disturbance are handled)
Validated the significant effectiveness of finite-time force tracking problems based on the
FITSMC combined with a new adaptive gain and a friction-free disturbance (FSOB)
Based on the online learning ability of the RL approximation method the model-free controller
for optimal contact force ensures great position and force tracking performance
Optimized a prescribed impedance model parameters by the IRL approach to provide little
operator information in yielding highly effective position control purpose
Achieved good transparency performance with both force feedback and position tracking
simultaneously by the new adaptive force estimation combined the separately FNTSMC schemes
Conclutions
1 2 3 4 5 6 7 8
20-May-21 | 52Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Future works
To improve the control performance in joint-space the robot system model can be
improved by robust identification approaches
Effectiveness of the proposed algorithm can be further increased by developing an automatic
tune method for both the estimation and control gains
Apply optimized strategies on both master and slave sides Note also that the computational
complexity should be reduced by the data size
Based on the developed prototype in this thesis it is possible to develop a high-quality
commercial teleoperation device in the rehabilitation field with a safer solution
1 2 3 4 5 6 7 8
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
1 2 3 4 5 6 7 8
20-May-21 | 38Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Task-specific Outer-loopRobot-specific Inner-loop
( )1 20
ˆ ( )
ˆ (|| || )
= + + +
+ + minus
t
s h h h
q B h h
K e e e d
K Z Z K
The control torque in task space
1 2
1 2 1
2 1 2
ˆ ˆˆ ( ) ( )
ˆ ˆˆ ˆ ˆ || ||
ˆˆ ˆ ˆ( ) || ||
=
= minus minus = minus
T T
T T T
h h h
T
h h
q W W q
W F F W q kF W
W Gq W kG W
2
h hm
d d d
K
M s C s K
=
+ +
An approximation of the continuous functionThe prescribed
impedance model
Prescribed
Impedance Model
Haptic interface
based-APAM
Torque
control lawHuman
Operator
Optimal Law-based
IRL
Kh+-
NN functional
approximation
1
s
θ +
-
τ θ d θ m eh
τ h
ξ
ed
Control design
1 2 3 4 5 6 7 8
20-May-21 | 39Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Task-specific Outer-loopRobot-specific Inner-loop
1
1
[ ]
minus
minus
= +
= minus
=
=
h h h h d
h e
h e p d
T T T
d d d
A B e
A k
B k
e e e
( )1
+=
+
p d
e
sG s
k s
The human transfer function [99] Purpose τh rarr ed
0
0
= + = +
= minus
q d qd e
e
h h hh e
A e Be X AX Buu
B A u KX
A new augmented state
Prescribed
Impedance Model
Haptic interface
based-APAM
Torque
control lawHuman
Operator
Optimal Law-based
IRL
Kh+-
NN functional
approximation
1
s
θ +
-
τ θ d θ m eh
τ h
ξ
ed
IRL based on LQR solution
Control design
1 2 3 4 5 6 7 8
20-May-21 | 40Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
The performance index
( )T T T
m d d d h h h e et
J e Q e Q u Ru d
= + +
10 T TA P PA Q PB R BPminus= + + minus
The algebraic Riccati equation [83]
0
0
q d qd
e
h h hh
e
A e BeX u
B A
u KX
= = +
= minus
A new augmented state
IRL solutionRL solution
An optimal feeback control
To minimize
solve
The infinite-horizon quadratic cost
Control design
The IRL Bellman equation for time interval ΔT gt 0
Proof of convergence See in Appendix 62
1 2 3 4 5 6 7 8
( ) 1
( )
arg min ( ) ( )e
T
e eu t
u V t X t u R B PX
minus
= = minus
( )7 7( ( )) ( ) ( ) ( ( ))t T
T T T
tV X t X Q K RK X d V X t T
+
= + minus +
( )7 ( ( )) ( ) ( )T T T T
tV X t X Q K RK X d X PX
= + =
( ) ( ) ( )T TA BK P P A BK Q K RKminus + minus = minus +
where P is the real symmetric positive definite solution
( )( ) ( ) = ( ) ( )
( ) ( )
t tT T T T
t
T
X t PX t X Q K RK X d
X t t PX t t
+
+
+ + +
The IRL Bellman equation can be written as
20-May-21 | 41Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 61 Continuous-time linear policy iteration algorithm
1 Start with an admissible control input
0 0
1= minuseu K X
2 Policy evaluation given a control policy ui find the
Pi using off-policy Bellman equation
1 1
1
+ minus= minusi T i
eu R B P X
3 Policy Improvement update the control policy
Online implementation
Control design
1 2 3 4 5 6 7 8
( )( )( ) ( )= ( ) ( )
( ) ( )
t t TT i T i i T
t
T i
X t P X t X Q K RK X d
X t t P X t t
+
+
+ + +
Begin
1 2
1 2
1
( ) ( )
( ) ( ) ( )
( )
N i i i
Ti i N i
i T
t t t
K K K
p
minus
= = minus +
=
=
Solving for the cost using least squares
1|| ||i ip p minusminus
0
1 10 1 ( ) 0P i K A BK= = minus
Initialization
1 1i T i
eu R B P Xminus minus= minus
Policy update
End
No
Yes
1i irarr +
20-May-21 | 42Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 62 The tracking error of the trajectory of the robot system and
the prescribed impedance model
Fig 63 Reference trajectory versus the state of the robot systems
during and after learning
κe = 00025 S = 1 R = 10 T = 0001
A = minus4006 B = minus0002
υ = 09 γ = 09 δΔ = 065 ε = 01
ε-decay = 099 K = 10 β = 01
LQR solution
RL aproximation
Position controlKs = 20 F = 10 G = 1 μ1 = μ2 = 5 k = 01
= 100 κp = 20 κd = 10 and ke = 018 s
θd = 10sin(02πt) Md = 1 Kg Bd = 718
Nsmm-1 Kd =1576 Nmm-1 Kh = 0046
Parameters
Z
Simulation
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 43Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 64 The output trajectory and impedance model in the inner loop
Fig 65 The desired trajectory and output trajectory the outer loop
Fig 66 Interaction force
Perfectly performance of the human-
robot cooperation
A little deviation between the desired
and actual trajectory
The human interaction force is reduced
after learning and the optimal set of the
prescribed impedance model is found by
the outer-loop controller
Experiment
Validation results
Discussion
1 2 3 4 5 6 7 8
20-May-21 | 44Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Conclusion and Future works
Introduction
7
4
5
6
C
H
A
P
T
E
R
1
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Force Sensorless Reflecting Control for BHTS7
1 2 3 4 5 6 7 8
20-May-21 | 45Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
3 The effectiveness of the proposed control method is verified in
the different working conditions
2The stability and finite-time convergence characteristic of the closed-
loop control are theoretically analyzed by the Lyapunov approach
The first time the hybrid control based
on a fast finite-time NTSMC and AFOB1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 46Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
( )1 2id = minus +
1 1
2
= minus + = minus +
id
T
i i id i
T
i ia = minus
The external torque estimation
[ ]
1 2ˆ( ) = + + minus minus minus +iv
i i ir i i i i i i i i i i idM C K sign s s s
( )i i i is = +
[ ]
2
1 2
if 0 or 0 and ( )
( ) if 0 and
i
i i i i i i
i i
i i i i i i i i
s s
z z sign s
= =
+
An adaptive FSOB gain
where
where
The torque control design by style-FNTSMC
The modified FNTSM surface design
Master Slave
Environment
APAM Slave
system
Slave
Controller
Estimator
Oper
ator
Master
Controller
APAM Master
system
Estimator
Impendence
filter
+
++
-
+
-
+
+md
m s
s
sd
ˆsdˆmd
mf
m
d
Control design
1 2 3 4 5 6 7 8
20-May-21 | 47Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 71 Torque estimation performances
λi = 8 pi = 5 qi = 7 ςi = 10-4 νi = 911
εi = 01 ρ1i = 10 ρ2i = 2
Proposed control
PID control KP = 38 KI = 25 KD = 015
Observersfor NDO Lh = Le = 10 for RTOB β = 350 rads
for AFOB κi(0) = 01 λi = 10 ηi = 0 m = 01
a = 10
RMSEs associated with NDO RTOB and proposed algorithm
are calculated of 0211 0192 and 0076 respectively
Parameters
Scenarios- The free space situation θd (t) = 5π(1 - 018t)sin(072πt)
- The contact and recovery situation soft environment Ke = 500 Nmm-1 Be = 4 Nsmm-1
stiff environment Ke = 2800 Nmm-1 Be = 20 Nsmm-1
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 48Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 72 Position tracking performances
Fig 72 Position tracking error performances
Fig 72 Transparency performance in ldquosoftrdquo
Fig 72 Transparency error in soft contract
Fig 72 Transparency performance in ldquostiffrdquo
Fig 72 Transparency error in stiff contract
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 49Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Finite-time convergence of the
tracking errors
Eliminating the uncertainties the
noise effect in the obtained force
sensing via new AFOB
Robust performance against uncertainties
and disturbances in the system dynamics
Stability of the overall system and the
transparency performance can be
attained simultaneously
Fast transient response
Summary
The great transparency performance and global stability of the BHTS under the proposed
control scheme are achieved despite the uncertainties and in different working conditions
1 2 3 4 5 6 7 8
20-May-21 | 50Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Introduction
4
5
6
C
H
A
P
T
E
R
1
7
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Conclusion and Future works78
1 2 3 4 5 6 7 8
20-May-21 | 51Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Verified an effective position tracking controller via the AITSMC with TDE technique under
various working conditions (Its uncertainties and disturbance are handled)
Validated the significant effectiveness of finite-time force tracking problems based on the
FITSMC combined with a new adaptive gain and a friction-free disturbance (FSOB)
Based on the online learning ability of the RL approximation method the model-free controller
for optimal contact force ensures great position and force tracking performance
Optimized a prescribed impedance model parameters by the IRL approach to provide little
operator information in yielding highly effective position control purpose
Achieved good transparency performance with both force feedback and position tracking
simultaneously by the new adaptive force estimation combined the separately FNTSMC schemes
Conclutions
1 2 3 4 5 6 7 8
20-May-21 | 52Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Future works
To improve the control performance in joint-space the robot system model can be
improved by robust identification approaches
Effectiveness of the proposed algorithm can be further increased by developing an automatic
tune method for both the estimation and control gains
Apply optimized strategies on both master and slave sides Note also that the computational
complexity should be reduced by the data size
Based on the developed prototype in this thesis it is possible to develop a high-quality
commercial teleoperation device in the rehabilitation field with a safer solution
1 2 3 4 5 6 7 8
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
1 2 3 4 5 6 7 8
20-May-21 | 39Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Task-specific Outer-loopRobot-specific Inner-loop
1
1
[ ]
minus
minus
= +
= minus
=
=
h h h h d
h e
h e p d
T T T
d d d
A B e
A k
B k
e e e
( )1
+=
+
p d
e
sG s
k s
The human transfer function [99] Purpose τh rarr ed
0
0
= + = +
= minus
q d qd e
e
h h hh e
A e Be X AX Buu
B A u KX
A new augmented state
Prescribed
Impedance Model
Haptic interface
based-APAM
Torque
control lawHuman
Operator
Optimal Law-based
IRL
Kh+-
NN functional
approximation
1
s
θ +
-
τ θ d θ m eh
τ h
ξ
ed
IRL based on LQR solution
Control design
1 2 3 4 5 6 7 8
20-May-21 | 40Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
The performance index
( )T T T
m d d d h h h e et
J e Q e Q u Ru d
= + +
10 T TA P PA Q PB R BPminus= + + minus
The algebraic Riccati equation [83]
0
0
q d qd
e
h h hh
e
A e BeX u
B A
u KX
= = +
= minus
A new augmented state
IRL solutionRL solution
An optimal feeback control
To minimize
solve
The infinite-horizon quadratic cost
Control design
The IRL Bellman equation for time interval ΔT gt 0
Proof of convergence See in Appendix 62
1 2 3 4 5 6 7 8
( ) 1
( )
arg min ( ) ( )e
T
e eu t
u V t X t u R B PX
minus
= = minus
( )7 7( ( )) ( ) ( ) ( ( ))t T
T T T
tV X t X Q K RK X d V X t T
+
= + minus +
( )7 ( ( )) ( ) ( )T T T T
tV X t X Q K RK X d X PX
= + =
( ) ( ) ( )T TA BK P P A BK Q K RKminus + minus = minus +
where P is the real symmetric positive definite solution
( )( ) ( ) = ( ) ( )
( ) ( )
t tT T T T
t
T
X t PX t X Q K RK X d
X t t PX t t
+
+
+ + +
The IRL Bellman equation can be written as
20-May-21 | 41Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig. 6.1 Continuous-time linear policy iteration algorithm

Online implementation:
1. Start with an admissible control input ue⁰ = −K⁰X.
2. Policy evaluation: given a control policy uⁱ, find Pⁱ using the off-policy Bellman equation
       Xᵀ(t)PⁱX(t) = ∫_t^{t+ΔT} Xᵀ(Q + (Kⁱ)ᵀRKⁱ)X dτ + Xᵀ(t + ΔT)PⁱX(t + ΔT).
3. Policy improvement: update the control policy ue^{i+1} = −R⁻¹BᵀPⁱX.

Control design
[Flowchart: Begin → Initialization (P⁰ = 0, i = 1, choose K¹ so that A − BK¹ is stable) → solve for the cost parameters pⁱ of Pⁱ by least squares over N sampled intervals t₁, …, t_N → if ||pⁱ − pⁱ⁻¹|| has not converged, set i → i + 1 and repeat the evaluation; otherwise apply the policy update uⁱ = −R⁻¹BᵀPⁱ⁻¹X → End.]
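Assuming known (A, B), i.e. replacing the sampled least-squares step with an exact Lyapunov solve, the policy iteration of Fig. 6.1 can be sketched as follows (Kleinman's model-based analogue; the matrices are hypothetical, not the identified APAM model).

```python
import numpy as np

A = np.array([[0.0, 1.0], [-2.0, -3.0]])   # hypothetical stable plant
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

def lyap(Acl, Qt):
    # Solve Acl^T P + P Acl = -Qt via the Kronecker (vectorized) linear system.
    n = Acl.shape[0]
    M = np.kron(np.eye(n), Acl.T) + np.kron(Acl.T, np.eye(n))
    P = np.linalg.solve(M, -Qt.reshape(n * n, order="F")).reshape(n, n, order="F")
    return 0.5 * (P + P.T)  # symmetrize against round-off

K = np.zeros((1, 2))  # admissible initial policy: A itself is stable here
for i in range(20):
    Acl = A - B @ K
    P = lyap(Acl, Q + K.T @ R @ K)          # policy evaluation
    K_new = np.linalg.solve(R, B.T @ P)     # policy improvement
    if np.linalg.norm(K_new - K) < 1e-10:   # convergence test ||p^i - p^{i-1}||
        break
    K = K_new

# At convergence P satisfies the ARE: A^T P + P A + Q - P B R^{-1} B^T P = 0
res = A.T @ P + P @ A + Q - P @ B @ np.linalg.solve(R, B.T) @ P
print(np.linalg.norm(res) < 1e-8)
```

The IRL version in Fig. 6.1 performs exactly this loop, but estimates each Pⁱ from trajectory data via the sampled Bellman equation instead of the Lyapunov solve, so A never needs to be known.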
20-May-21 | 42Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Validation results (Simulation)

Parameters:
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: Ks = 20, F = 10, G = 1, μ1 = μ2 = 5, k = 0.1, Z = 100, κp = 20, κd = 10, and ke = 0.018 s
- Prescribed impedance model: θd = 10 sin(0.2πt), Md = 1 kg, Bd = 7.18 N·s·mm⁻¹, Kd = 15.76 N·mm⁻¹, Kh = 0.046

Fig. 6.2 The tracking error between the robot-system trajectory and the prescribed impedance model
Fig. 6.3 Reference trajectory versus the state of the robot system during and after learning
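A minimal sketch of how the prescribed impedance model above turns a human torque into a reference response, reading the slide's values as Md = 1, Bd = 7.18, Kd = 15.76 (consistent units assumed) and applying a hypothetical constant torque input:

```python
# Prescribed impedance model  M_d*e'' + B_d*e' + K_d*e = tau_h,
# integrated with semi-implicit Euler; tau_h is an assumed constant input,
# not data from the dissertation's experiments.
M_d, B_d, K_d = 1.0, 7.18, 15.76
tau_h = 1.0
dt, T = 0.001, 5.0

e, e_dot = 0.0, 0.0
for _ in range(int(T / dt)):
    e_ddot = (tau_h - B_d * e_dot - K_d * e) / M_d
    e_dot += dt * e_ddot
    e += dt * e_dot

# steady state approaches the static deflection tau_h / K_d
print(abs(e - tau_h / K_d) < 1e-3)
```

With these coefficients the model is nearly critically damped (ζ ≈ 0.9), so the response settles in about a second without overshoot, which is the behaviour the outer-loop IRL tunes toward.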
20-May-21 | 43Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Validation results (Experiment)

Fig. 6.4 The output trajectory and impedance model in the inner loop
Fig. 6.5 The desired trajectory and output trajectory in the outer loop
Fig. 6.6 Interaction force

Discussion:
- Near-perfect performance of the human-robot cooperation
- A small deviation between the desired and actual trajectories
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller
20-May-21 | 44Presenter Cong Phat Vo
CONTENTS
1. Introduction
2. Modeling and platform setup based on the APAM configuration
3. Adaptive Gain ITSMC with TDE scheme
4. Adaptive Finite-Time Force Sensorless Control scheme
5. Model-free control for Optimal contact force in Unknown Environment
6. Optimized Human-Robot Interaction force control using IRL
7. Force Sensorless Reflecting Control for BHTS
8. Conclusion and Future works
20-May-21 | 45Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Proposed control idea
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed.
2. The stability and finite-time convergence characteristics of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified in different working conditions.
20-May-21 | 46Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
The external torque estimation: τ̂_id is reconstructed by the observer from two first-order auxiliary states driven by the measured motion and the control input.
An adaptive FSOB gain: κi is updated by a piecewise law, evolving with |si| while si ≠ 0 (or while si = 0 before convergence), and switching to a zi + ηi sign(si) correction otherwise.
The modified FNTSM surface design:
    si = ėi + λi |ei|^{pi/qi} sign(ei)
The torque control design by FNTSMC style:
    τi = Mi θ̈ir + Ci + Ki + τ̂_id − [ρ1i |si|^{νi} sign(si) + ρ2i si]
[Block diagram (Master/Slave): the operator drives the Master Controller and the APAM Master system, with an Estimator supplying τ̂_md and an impedance filter shaping the reflected force τ_mf; on the slave side, the Slave Controller, APAM Slave system, and Estimator (τ̂_sd) interact with the environment; the positions θm and θs are exchanged between the two sides.]
Control design
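As a generic, self-contained illustration of the fast nonsingular terminal sliding-mode idea named above, the sketch below drives a double integrator (not the APAM dynamics) onto a surface s = ė + λ1 e + λ2 |e|^{p/q} sign(e) with a two-term reaching law; all gains are hypothetical.

```python
import math

def sgn(x):
    return (x > 0) - (x < 0)

# Hypothetical gains for the illustrative fast-NTSM surface and reaching law.
lam1, lam2, p, q = 1.0, 1.0, 5, 7
k1, k2 = 5.0, 2.0
dt = 1e-3

e, e_dot = 1.0, 0.0        # initial tracking error for the toy plant e'' = u
for _ in range(int(5.0 / dt)):
    ae = max(abs(e), 1e-9)  # guard the fractional power near e = 0
    s = e_dot + lam1 * e + lam2 * ae ** (p / q) * sgn(e)
    # equivalent control cancels the surface derivative terms; the reaching
    # law -k1*s - k2*|s|^(1/2)*sgn(s) gives finite-time convergence of s
    u = (-lam1 * e_dot
         - lam2 * (p / q) * ae ** (p / q - 1) * e_dot
         - k1 * s - k2 * math.sqrt(abs(s)) * sgn(s))
    e_dot += dt * u
    e += dt * e_dot

print(abs(e) < 0.05)
```

Once s reaches zero, the error obeys ė = −λ1 e − λ2 |e|^{p/q} sign(e), whose terminal term dominates near the origin and produces the finite-time convergence claimed for the tracking errors.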
20-May-21 | 47Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Validation results

Parameters:
- Proposed control: λi = 8, pi = 5, qi = 7, ςi = 10⁻⁴, νi = 9/11, εi = 0.1, ρ1i = 10, ρ2i = 2
- PID control: KP = 38, KI = 25, KD = 0.15
- Observers: for NDO, Lh = Le = 10; for RTOB, β = 350 rad/s; for AFOB, κi(0) = 0.1, λi = 10, ηi = 0, m = 0.1, a = 10

Scenarios:
- The free-space situation: θd(t) = 5π(1 − 0.18t) sin(0.72πt)
- The contact-and-recovery situation: soft environment Ke = 500 N·mm⁻¹, Be = 4 N·s·mm⁻¹; stiff environment Ke = 2800 N·mm⁻¹, Be = 20 N·s·mm⁻¹

Fig. 7.1 Torque estimation performances: the RMSEs associated with NDO, RTOB, and the proposed algorithm are calculated as 0.211, 0.192, and 0.076, respectively.
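For reference, the RMSE figures above are root-mean-square errors between the estimated and reference torque over a trial; a minimal sketch with toy hypothetical samples (not the experimental data):

```python
import math

def rmse(est, ref):
    """Root-mean-square error between an estimate and a reference sequence."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(est, ref)) / len(ref))

# Toy samples purely for illustration of the metric.
est = [0.1, 0.5, 0.9]
ref = [0.0, 0.5, 1.0]
print(round(rmse(est, ref), 4))  # 0.0816
```

A lower RMSE against the ground-truth torque is what ranks the proposed AFOB (0.076) ahead of NDO (0.211) and RTOB (0.192) in Fig. 7.1.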
20-May-21 | 48Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Validation results

Fig. 7.2 Position tracking performances
Fig. 7.2 Position tracking error performances
Fig. 7.2 Transparency performance in "soft" contact
Fig. 7.2 Transparency error in soft contact
Fig. 7.2 Transparency performance in "stiff" contact
Fig. 7.2 Transparency error in stiff contact
20-May-21 | 49Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Summary
- Finite-time convergence of the tracking errors
- Elimination of uncertainty and noise effects in the force estimates obtained via the new AFOB
- Robust performance against uncertainties and disturbances in the system dynamics
- Stability of the overall system and transparency performance attained simultaneously
- Fast transient response

High transparency and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and across different working conditions.
20-May-21 | 50Presenter Cong Phat Vo
CONTENTS
1. Introduction
2. Modeling and platform setup based on the APAM configuration
3. Adaptive Gain ITSMC with TDE scheme
4. Adaptive Finite-Time Force Sensorless Control scheme
5. Model-free control for Optimal contact force in Unknown Environment
6. Optimized Human-Robot Interaction force control using IRL
7. Force Sensorless Reflecting Control for BHTS
8. Conclusion and Future works
20-May-21 | 51Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Conclusions
- Verified an effective position-tracking controller via the AITSMC with the TDE technique under various working conditions (uncertainties and disturbances are handled).
- Validated the significant effectiveness on finite-time force-tracking problems of the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures strong position- and force-tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding a highly effective position control.
- Achieved good transparency performance, with force feedback and position tracking attained simultaneously, by the new adaptive force estimation combined with the separate FNTSMC schemes.
20-May-21 | 52Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Future works
- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both the master and slave sides; the computational complexity should also be reduced, e.g., by reducing the data size.
- Based on the prototype developed in this thesis, it is possible to build a high-quality commercial teleoperation device for the rehabilitation field with a safer solution.
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
20-May-21 | 40Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
The performance index
( )T T T
m d d d h h h e et
J e Q e Q u Ru d
= + +
10 T TA P PA Q PB R BPminus= + + minus
The algebraic Riccati equation [83]
0
0
q d qd
e
h h hh
e
A e BeX u
B A
u KX
= = +
= minus
A new augmented state
IRL solutionRL solution
An optimal feeback control
To minimize
solve
The infinite-horizon quadratic cost
Control design
The IRL Bellman equation for time interval ΔT gt 0
Proof of convergence See in Appendix 62
1 2 3 4 5 6 7 8
( ) 1
( )
arg min ( ) ( )e
T
e eu t
u V t X t u R B PX
minus
= = minus
( )7 7( ( )) ( ) ( ) ( ( ))t T
T T T
tV X t X Q K RK X d V X t T
+
= + minus +
( )7 ( ( )) ( ) ( )T T T T
tV X t X Q K RK X d X PX
= + =
( ) ( ) ( )T TA BK P P A BK Q K RKminus + minus = minus +
where P is the real symmetric positive definite solution
( )( ) ( ) = ( ) ( )
( ) ( )
t tT T T T
t
T
X t PX t X Q K RK X d
X t t PX t t
+
+
+ + +
The IRL Bellman equation can be written as
20-May-21 | 41Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 61 Continuous-time linear policy iteration algorithm
1 Start with an admissible control input
0 0
1= minuseu K X
2 Policy evaluation given a control policy ui find the
Pi using off-policy Bellman equation
1 1
1
+ minus= minusi T i
eu R B P X
3 Policy Improvement update the control policy
Online implementation
Control design
1 2 3 4 5 6 7 8
( )( )( ) ( )= ( ) ( )
( ) ( )
t t TT i T i i T
t
T i
X t P X t X Q K RK X d
X t t P X t t
+
+
+ + +
Begin
1 2
1 2
1
( ) ( )
( ) ( ) ( )
( )
N i i i
Ti i N i
i T
t t t
K K K
p
minus
= = minus +
=
=
Solving for the cost using least squares
1|| ||i ip p minusminus
0
1 10 1 ( ) 0P i K A BK= = minus
Initialization
1 1i T i
eu R B P Xminus minus= minus
Policy update
End
No
Yes
1i irarr +
20-May-21 | 42Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 62 The tracking error of the trajectory of the robot system and
the prescribed impedance model
Fig 63 Reference trajectory versus the state of the robot systems
during and after learning
κe = 00025 S = 1 R = 10 T = 0001
A = minus4006 B = minus0002
υ = 09 γ = 09 δΔ = 065 ε = 01
ε-decay = 099 K = 10 β = 01
LQR solution
RL aproximation
Position controlKs = 20 F = 10 G = 1 μ1 = μ2 = 5 k = 01
= 100 κp = 20 κd = 10 and ke = 018 s
θd = 10sin(02πt) Md = 1 Kg Bd = 718
Nsmm-1 Kd =1576 Nmm-1 Kh = 0046
Parameters
Z
Simulation
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 43Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 64 The output trajectory and impedance model in the inner loop
Fig 65 The desired trajectory and output trajectory the outer loop
Fig 66 Interaction force
Perfectly performance of the human-
robot cooperation
A little deviation between the desired
and actual trajectory
The human interaction force is reduced
after learning and the optimal set of the
prescribed impedance model is found by
the outer-loop controller
Experiment
Validation results
Discussion
1 2 3 4 5 6 7 8
20-May-21 | 44Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Conclusion and Future works
Introduction
7
4
5
6
C
H
A
P
T
E
R
1
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Force Sensorless Reflecting Control for BHTS7
1 2 3 4 5 6 7 8
20-May-21 | 45Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
3 The effectiveness of the proposed control method is verified in
the different working conditions
2The stability and finite-time convergence characteristic of the closed-
loop control are theoretically analyzed by the Lyapunov approach
The first time the hybrid control based
on a fast finite-time NTSMC and AFOB1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 46Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
( )1 2id = minus +
1 1
2
= minus + = minus +
id
T
i i id i
T
i ia = minus
The external torque estimation
[ ]
1 2ˆ( ) = + + minus minus minus +iv
i i ir i i i i i i i i i i idM C K sign s s s
( )i i i is = +
[ ]
2
1 2
if 0 or 0 and ( )
( ) if 0 and
i
i i i i i i
i i
i i i i i i i i
s s
z z sign s
= =
+
An adaptive FSOB gain
where
where
The torque control design by style-FNTSMC
The modified FNTSM surface design
Master Slave
Environment
APAM Slave
system
Slave
Controller
Estimator
Oper
ator
Master
Controller
APAM Master
system
Estimator
Impendence
filter
+
++
-
+
-
+
+md
m s
s
sd
ˆsdˆmd
mf
m
d
Control design
1 2 3 4 5 6 7 8
20-May-21 | 47Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 71 Torque estimation performances
λi = 8 pi = 5 qi = 7 ςi = 10-4 νi = 911
εi = 01 ρ1i = 10 ρ2i = 2
Proposed control
PID control KP = 38 KI = 25 KD = 015
Observersfor NDO Lh = Le = 10 for RTOB β = 350 rads
for AFOB κi(0) = 01 λi = 10 ηi = 0 m = 01
a = 10
RMSEs associated with NDO RTOB and proposed algorithm
are calculated of 0211 0192 and 0076 respectively
Parameters
Scenarios- The free space situation θd (t) = 5π(1 - 018t)sin(072πt)
- The contact and recovery situation soft environment Ke = 500 Nmm-1 Be = 4 Nsmm-1
stiff environment Ke = 2800 Nmm-1 Be = 20 Nsmm-1
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 48Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 72 Position tracking performances
Fig 72 Position tracking error performances
Fig 72 Transparency performance in ldquosoftrdquo
Fig 72 Transparency error in soft contract
Fig 72 Transparency performance in ldquostiffrdquo
Fig 72 Transparency error in stiff contract
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 49Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Finite-time convergence of the
tracking errors
Eliminating the uncertainties the
noise effect in the obtained force
sensing via new AFOB
Robust performance against uncertainties
and disturbances in the system dynamics
Stability of the overall system and the
transparency performance can be
attained simultaneously
Fast transient response
Summary
The great transparency performance and global stability of the BHTS under the proposed
control scheme are achieved despite the uncertainties and in different working conditions
1 2 3 4 5 6 7 8
20-May-21 | 50Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Introduction
4
5
6
C
H
A
P
T
E
R
1
7
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Conclusion and Future works78
1 2 3 4 5 6 7 8
20-May-21 | 51Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Verified an effective position tracking controller via the AITSMC with TDE technique under
various working conditions (Its uncertainties and disturbance are handled)
Validated the significant effectiveness of finite-time force tracking problems based on the
FITSMC combined with a new adaptive gain and a friction-free disturbance (FSOB)
Based on the online learning ability of the RL approximation method the model-free controller
for optimal contact force ensures great position and force tracking performance
Optimized a prescribed impedance model parameters by the IRL approach to provide little
operator information in yielding highly effective position control purpose
Achieved good transparency performance with both force feedback and position tracking
simultaneously by the new adaptive force estimation combined the separately FNTSMC schemes
Conclutions
1 2 3 4 5 6 7 8
20-May-21 | 52Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Future works
To improve the control performance in joint-space the robot system model can be
improved by robust identification approaches
Effectiveness of the proposed algorithm can be further increased by developing an automatic
tune method for both the estimation and control gains
Apply optimized strategies on both master and slave sides Note also that the computational
complexity should be reduced by the data size
Based on the developed prototype in this thesis it is possible to develop a high-quality
commercial teleoperation device in the rehabilitation field with a safer solution
1 2 3 4 5 6 7 8
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
1 2 3 4 5 6 7 8
20-May-21 | 41Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 61 Continuous-time linear policy iteration algorithm
1 Start with an admissible control input
0 0
1= minuseu K X
2 Policy evaluation given a control policy ui find the
Pi using off-policy Bellman equation
1 1
1
+ minus= minusi T i
eu R B P X
3 Policy Improvement update the control policy
Online implementation
Control design
1 2 3 4 5 6 7 8
( )( )( ) ( )= ( ) ( )
( ) ( )
t t TT i T i i T
t
T i
X t P X t X Q K RK X d
X t t P X t t
+
+
+ + +
Begin
1 2
1 2
1
( ) ( )
( ) ( ) ( )
( )
N i i i
Ti i N i
i T
t t t
K K K
p
minus
= = minus +
=
=
Solving for the cost using least squares
1|| ||i ip p minusminus
0
1 10 1 ( ) 0P i K A BK= = minus
Initialization
1 1i T i
eu R B P Xminus minus= minus
Policy update
End
No
Yes
1i irarr +
20-May-21 | 42Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 62 The tracking error of the trajectory of the robot system and
the prescribed impedance model
Fig 63 Reference trajectory versus the state of the robot systems
during and after learning
κe = 00025 S = 1 R = 10 T = 0001
A = minus4006 B = minus0002
υ = 09 γ = 09 δΔ = 065 ε = 01
ε-decay = 099 K = 10 β = 01
LQR solution
RL aproximation
Position controlKs = 20 F = 10 G = 1 μ1 = μ2 = 5 k = 01
= 100 κp = 20 κd = 10 and ke = 018 s
θd = 10sin(02πt) Md = 1 Kg Bd = 718
Nsmm-1 Kd =1576 Nmm-1 Kh = 0046
Parameters
Z
Simulation
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 43Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 64 The output trajectory and impedance model in the inner loop
Fig 65 The desired trajectory and output trajectory the outer loop
Fig 66 Interaction force
Perfectly performance of the human-
robot cooperation
A little deviation between the desired
and actual trajectory
The human interaction force is reduced
after learning and the optimal set of the
prescribed impedance model is found by
the outer-loop controller
Experiment
Validation results
Discussion
1 2 3 4 5 6 7 8
20-May-21 | 44Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Conclusion and Future works
Introduction
7
4
5
6
C
H
A
P
T
E
R
1
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Force Sensorless Reflecting Control for BHTS7
1 2 3 4 5 6 7 8
20-May-21 | 45Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
3 The effectiveness of the proposed control method is verified in
the different working conditions
2The stability and finite-time convergence characteristic of the closed-
loop control are theoretically analyzed by the Lyapunov approach
The first time the hybrid control based
on a fast finite-time NTSMC and AFOB1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 46Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
( )1 2id = minus +
1 1
2
= minus + = minus +
id
T
i i id i
T
i ia = minus
The external torque estimation
[ ]
1 2ˆ( ) = + + minus minus minus +iv
i i ir i i i i i i i i i i idM C K sign s s s
( )i i i is = +
[ ]
2
1 2
if 0 or 0 and ( )
( ) if 0 and
i
i i i i i i
i i
i i i i i i i i
s s
z z sign s
= =
+
An adaptive FSOB gain
where
where
The torque control design by style-FNTSMC
The modified FNTSM surface design
Master Slave
Environment
APAM Slave
system
Slave
Controller
Estimator
Oper
ator
Master
Controller
APAM Master
system
Estimator
Impendence
filter
+
++
-
+
-
+
+md
m s
s
sd
ˆsdˆmd
mf
m
d
Control design
1 2 3 4 5 6 7 8
20-May-21 | 47Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 71 Torque estimation performances
λi = 8 pi = 5 qi = 7 ςi = 10-4 νi = 911
εi = 01 ρ1i = 10 ρ2i = 2
Proposed control
PID control KP = 38 KI = 25 KD = 015
Observersfor NDO Lh = Le = 10 for RTOB β = 350 rads
for AFOB κi(0) = 01 λi = 10 ηi = 0 m = 01
a = 10
RMSEs associated with NDO RTOB and proposed algorithm
are calculated of 0211 0192 and 0076 respectively
Parameters
Scenarios- The free space situation θd (t) = 5π(1 - 018t)sin(072πt)
- The contact and recovery situation soft environment Ke = 500 Nmm-1 Be = 4 Nsmm-1
stiff environment Ke = 2800 Nmm-1 Be = 20 Nsmm-1
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 48Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 72 Position tracking performances
Fig 72 Position tracking error performances
Fig 72 Transparency performance in ldquosoftrdquo
Fig 72 Transparency error in soft contract
Fig 72 Transparency performance in ldquostiffrdquo
Fig 72 Transparency error in stiff contract
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 49Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Finite-time convergence of the
tracking errors
Eliminating the uncertainties the
noise effect in the obtained force
sensing via new AFOB
Robust performance against uncertainties
and disturbances in the system dynamics
Stability of the overall system and the
transparency performance can be
attained simultaneously
Fast transient response
Summary
The great transparency performance and global stability of the BHTS under the proposed
control scheme are achieved despite the uncertainties and in different working conditions
1 2 3 4 5 6 7 8
20-May-21 | 50Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Introduction
4
5
6
C
H
A
P
T
E
R
1
7
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Conclusion and Future works78
1 2 3 4 5 6 7 8
20-May-21 | 51Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Verified an effective position tracking controller via the AITSMC with TDE technique under
various working conditions (Its uncertainties and disturbance are handled)
Validated the significant effectiveness of finite-time force tracking problems based on the
FITSMC combined with a new adaptive gain and a friction-free disturbance (FSOB)
Based on the online learning ability of the RL approximation method the model-free controller
for optimal contact force ensures great position and force tracking performance
Optimized a prescribed impedance model parameters by the IRL approach to provide little
operator information in yielding highly effective position control purpose
Achieved good transparency performance with both force feedback and position tracking
simultaneously by the new adaptive force estimation combined the separately FNTSMC schemes
Conclutions
1 2 3 4 5 6 7 8
20-May-21 | 52Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Future works
To improve the control performance in joint-space the robot system model can be
improved by robust identification approaches
Effectiveness of the proposed algorithm can be further increased by developing an automatic
tune method for both the estimation and control gains
Apply optimized strategies on both master and slave sides Note also that the computational
complexity should be reduced by the data size
Based on the developed prototype in this thesis it is possible to develop a high-quality
commercial teleoperation device in the rehabilitation field with a safer solution
1 2 3 4 5 6 7 8
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
1 2 3 4 5 6 7 8
20-May-21 | 42Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 62 The tracking error of the trajectory of the robot system and
the prescribed impedance model
Fig 63 Reference trajectory versus the state of the robot systems
during and after learning
κe = 00025 S = 1 R = 10 T = 0001
A = minus4006 B = minus0002
υ = 09 γ = 09 δΔ = 065 ε = 01
ε-decay = 099 K = 10 β = 01
LQR solution
RL aproximation
Position controlKs = 20 F = 10 G = 1 μ1 = μ2 = 5 k = 01
= 100 κp = 20 κd = 10 and ke = 018 s
θd = 10sin(02πt) Md = 1 Kg Bd = 718
Nsmm-1 Kd =1576 Nmm-1 Kh = 0046
Parameters
Z
Simulation
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 43Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 64 The output trajectory and impedance model in the inner loop
Fig 65 The desired trajectory and output trajectory the outer loop
Fig 66 Interaction force
Perfectly performance of the human-
robot cooperation
A little deviation between the desired
and actual trajectory
The human interaction force is reduced
after learning and the optimal set of the
prescribed impedance model is found by
the outer-loop controller
Experiment
Validation results
Discussion
1 2 3 4 5 6 7 8
20-May-21 | 44Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Conclusion and Future works
Introduction
7
4
5
6
C
H
A
P
T
E
R
1
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Force Sensorless Reflecting Control for BHTS7
1 2 3 4 5 6 7 8
20-May-21 | 45Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
3 The effectiveness of the proposed control method is verified in
the different working conditions
2The stability and finite-time convergence characteristic of the closed-
loop control are theoretically analyzed by the Lyapunov approach
The first time the hybrid control based
on a fast finite-time NTSMC and AFOB1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 46Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
( )1 2id = minus +
1 1
2
= minus + = minus +
id
T
i i id i
T
i ia = minus
The external torque estimation
[ ]
1 2ˆ( ) = + + minus minus minus +iv
i i ir i i i i i i i i i i idM C K sign s s s
( )i i i is = +
[ ]
2
1 2
if 0 or 0 and ( )
( ) if 0 and
i
i i i i i i
i i
i i i i i i i i
s s
z z sign s
= =
+
An adaptive FSOB gain
where
where
The torque control design by style-FNTSMC
The modified FNTSM surface design
Master Slave
Environment
APAM Slave
system
Slave
Controller
Estimator
Oper
ator
Master
Controller
APAM Master
system
Estimator
Impendence
filter
+
++
-
+
-
+
+md
m s
s
sd
ˆsdˆmd
mf
m
d
Control design
1 2 3 4 5 6 7 8
20-May-21 | 47Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 71 Torque estimation performances
λi = 8 pi = 5 qi = 7 ςi = 10-4 νi = 911
εi = 01 ρ1i = 10 ρ2i = 2
Proposed control
PID control KP = 38 KI = 25 KD = 015
Observersfor NDO Lh = Le = 10 for RTOB β = 350 rads
for AFOB κi(0) = 01 λi = 10 ηi = 0 m = 01
a = 10
RMSEs associated with NDO RTOB and proposed algorithm
are calculated of 0211 0192 and 0076 respectively
Parameters
Scenarios- The free space situation θd (t) = 5π(1 - 018t)sin(072πt)
- The contact and recovery situation soft environment Ke = 500 Nmm-1 Be = 4 Nsmm-1
stiff environment Ke = 2800 Nmm-1 Be = 20 Nsmm-1
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 48Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig 72 Position tracking performances
Fig 72 Position tracking error performances
Fig 72 Transparency performance in ldquosoftrdquo
Fig 72 Transparency error in soft contract
Fig 72 Transparency performance in ldquostiffrdquo
Fig 72 Transparency error in stiff contract
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 49Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Finite-time convergence of the
tracking errors
Eliminating the uncertainties the
noise effect in the obtained force
sensing via new AFOB
Robust performance against uncertainties
and disturbances in the system dynamics
Stability of the overall system and the
transparency performance can be
attained simultaneously
Fast transient response
Summary
The great transparency performance and global stability of the BHTS under the proposed
control scheme are achieved despite the uncertainties and in different working conditions
1 2 3 4 5 6 7 8
20-May-21 | 50Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Force Sensorless Reflecting Control for BHTS
Introduction
4
5
6
C
H
A
P
T
E
R
1
7
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Conclusion and Future works78
1 2 3 4 5 6 7 8
20-May-21 | 51Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Verified an effective position tracking controller via the AITSMC with TDE technique under
various working conditions (Its uncertainties and disturbance are handled)
Validated the significant effectiveness of finite-time force tracking problems based on the
FITSMC combined with a new adaptive gain and a friction-free disturbance (FSOB)
Based on the online learning ability of the RL approximation method the model-free controller
for optimal contact force ensures great position and force tracking performance
Optimized a prescribed impedance model parameters by the IRL approach to provide little
operator information in yielding highly effective position control purpose
Achieved good transparency performance with both force feedback and position tracking
simultaneously by the new adaptive force estimation combined the separately FNTSMC schemes
Conclutions
1 2 3 4 5 6 7 8
20-May-21 | 52Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Future works
To improve the control performance in joint-space the robot system model can be
improved by robust identification approaches
Effectiveness of the proposed algorithm can be further increased by developing an automatic
tune method for both the estimation and control gains
Apply optimized strategies on both master and slave sides Note also that the computational
complexity should be reduced by the data size
Based on the developed prototype in this thesis it is possible to develop a high-quality
commercial teleoperation device in the rehabilitation field with a safer solution
1 2 3 4 5 6 7 8
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
1 2 3 4 5 6 7 8
20-May-21 | 43Presenter Cong Phat Vo
6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL
Fig 64 The output trajectory and impedance model in the inner loop
Fig 65 The desired trajectory and output trajectory the outer loop
Fig 66 Interaction force
Perfectly performance of the human-
robot cooperation
A little deviation between the desired
and actual trajectory
The human interaction force is reduced
after learning and the optimal set of the
prescribed impedance model is found by
the outer-loop controller
Experiment
Validation results
Discussion
1 2 3 4 5 6 7 8
20-May-21 | 44Presenter Cong Phat Vo
CONTENTS
Adaptive Finite-Time Force Sensorless Control scheme
Model-free control for Optimal contact force in Unknown Environment
Optimized Human-Robot Interaction force control using IRL
Conclusion and Future works
Introduction
7
4
5
6
C
H
A
P
T
E
R
1
8
Modeling and platform setup based on the APAM configuration2
Adaptive Gain ITSMC with TDE scheme3
Force Sensorless Reflecting Control for BHTS7
1 2 3 4 5 6 7 8
20-May-21 | 45Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
3 The effectiveness of the proposed control method is verified in
the different working conditions
2The stability and finite-time convergence characteristic of the closed-
loop control are theoretically analyzed by the Lyapunov approach
The first time the hybrid control based
on a fast finite-time NTSMC and AFOB1
Proposed control idea
1 2 3 4 5 6 7 8
20-May-21 | 46Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
The control design comprises:
- the external torque estimation (via the AFOB)
- an adaptive FSOB gain
- the modified FNTSM surface design
- the torque control design in the FNTSMC style
Block diagram of the BHTS: the operator acts on the APAM master system (master controller with torque estimator); master and slave sides exchange position and estimated-torque signals over the channel; the slave controller with its estimator drives the APAM slave system in contact with the environment; an impedance filter shapes the reflected signals.
Control design
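The FNTSMC torque law is built on a nonsingular terminal sliding surface. A generic illustration of the idea on a double integrator x'' = u + d(t), with hypothetical gains and a plain switching term in place of the thesis's AFOB-compensated FNTSMC design:

```python
import numpy as np

# Illustrative nonsingular terminal sliding-mode regulator (sketch only;
# gains, exponents, and the disturbance are hypothetical).
p, q, beta = 5, 3, 1.0   # surface exponents, p/q in (1, 2); surface slope
D_eta = 0.8              # switching gain > disturbance bound |d| <= 0.3
dt, T = 1e-3, 15.0

x, v = 1.0, 0.0          # initial tracking error and its rate
xs = []
for k in range(int(T / dt)):
    t = k * dt
    d = 0.3 * np.sin(t)                                    # matched disturbance
    s = x + (1.0 / beta) * abs(v) ** (p / q) * np.sign(v)  # NTSM surface
    # equivalent term + switching term (standard NTSMC structure)
    u = (-beta * (q / p) * abs(v) ** (2 - p / q) * np.sign(v)
         - D_eta * np.sign(s))
    v += (u + d) * dt
    x += v * dt
    xs.append(x)
```

On the surface s = 0 the error obeys x' = -|x|^(q/p)·sign(x), which reaches zero in finite time; the switching term rejects the bounded disturbance during the reaching phase.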
1 2 3 4 5 6 7 8
20-May-21 | 47Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig. 7.1 Torque estimation performances
Parameters
- Proposed control: λi = 8, pi = 5, qi = 7, ςi = 10⁻⁴, νi = 9/11, εi = 0.1, ρ1i = 10, ρ2i = 2
- PID control: KP = 38, KI = 25, KD = 0.15
- Observers: for NDO, Lh = Le = 10; for RTOB, β = 350 rad/s; for AFOB, κi(0) = 0.1, λi = 10, ηi = 0, m = 0.1, a = 10
- RMSEs associated with the NDO, RTOB, and proposed algorithm are 0.211, 0.192, and 0.076, respectively
Scenarios
- Free-space situation: θd(t) = 5π(1 − 0.18t)sin(0.72πt)
- Contact and recovery situation: soft environment Ke = 500 N·mm⁻¹, Be = 4 N·s·mm⁻¹; stiff environment Ke = 2800 N·mm⁻¹, Be = 20 N·s·mm⁻¹
Validation results
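The quoted RMSEs compare each observer's torque estimate against the reference over the test trajectory. A sketch of how such figures are computed, using the slide's free-space trajectory and a spring-damper contact model; the estimates below are synthetic noise, not the experimental data:

```python
import numpy as np

def theta_d(t):
    """Free-space desired trajectory from the test scenario."""
    return 5 * np.pi * (1 - 0.18 * t) * np.sin(0.72 * np.pi * t)

def contact_force(x, xdot, Ke, Be):
    """Spring-damper environment model F_e = Ke*x + Be*xdot during contact
    (penetration x > 0), zero otherwise."""
    return (Ke * x + Be * xdot) if x > 0 else 0.0

def rmse(estimate, truth):
    return float(np.sqrt(np.mean((np.asarray(estimate) - np.asarray(truth)) ** 2)))

t = np.linspace(0.0, 5.0, 1001)
truth = theta_d(t)
rng = np.random.default_rng(0)
noisy = truth + rng.normal(0.0, 0.2, t.size)    # stand-in for a coarser observer
better = truth + rng.normal(0.0, 0.07, t.size)  # stand-in for a finer observer
```

Here `rmse(better, truth)` comes out smaller than `rmse(noisy, truth)`, mirroring how the proposed algorithm's 0.076 compares against the NDO's 0.211 and the RTOB's 0.192 on the slide.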
1 2 3 4 5 6 7 8
20-May-21 | 48Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Fig. 7.2 Position tracking performances and tracking errors; transparency performances and errors in the "soft" and "stiff" contact situations
Validation results
1 2 3 4 5 6 7 8
20-May-21 | 49Presenter Cong Phat Vo
7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS
Summary
- Finite-time convergence of the tracking errors
- Elimination of uncertainties and noise effects in the obtained force sensing via the new AFOB
- Robust performance against uncertainties and disturbances in the system dynamics
- Stability of the overall system and transparency performance attained simultaneously
- Fast transient response
High transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and across different working conditions.
1 2 3 4 5 6 7 8
20-May-21 | 50Presenter Cong Phat Vo
CONTENTS
1. Introduction
2. Modeling and platform setup based on the APAM configuration
3. Adaptive Gain ITSMC with TDE scheme
4. Adaptive Finite-Time Force Sensorless Control scheme
5. Model-free control for Optimal contact force in Unknown Environment
6. Optimized Human-Robot Interaction force control using IRL
7. Force Sensorless Reflecting Control for BHTS
8. Conclusion and Future works
1 2 3 4 5 6 7 8
20-May-21 | 51Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions; uncertainties and disturbances are handled.
- Validated the effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures good position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while achieving highly effective position control.
- Achieved good transparency performance, with force feedback and position tracking attained simultaneously, by combining the new adaptive force estimation with separate FNTSMC schemes on each side.
1 2 3 4 5 6 7 8
20-May-21 | 52Presenter Cong Phat Vo
8 CONCLUSION AND FUTURE WORKS
Future works
- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both the master and slave sides; the computational complexity should also be kept low as the data size grows.
- Based on the prototype developed in this thesis, a high-quality commercial teleoperation device for the rehabilitation field can be developed with a safer solution.
1 2 3 4 5 6 7 8
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter: Cong Phat Vo
Department of Mechanical Engineering, University of Ulsan, Korea
Thank you for attending
1 2 3 4 5 6 7 8
tune method for both the estimation and control gains
Apply optimized strategies on both master and slave sides Note also that the computational
complexity should be reduced by the data size
Based on the developed prototype in this thesis it is possible to develop a high-quality
commercial teleoperation device in the rehabilitation field with a safer solution
1 2 3 4 5 6 7 8
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
1 2 3 4 5 6 7 8
20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter Cong Phat Vo
Department of Mechanical Engineering University of Ulsan Korea
Thank you for attending
1 2 3 4 5 6 7 8