
Page 1: A PRESENTATION OF DOCTORAL DISSERTATION

20-May-21 | 1

A PRESENTATION OF DOCTORAL DISSERTATION

Presenter: Cong Phat Vo
Ph.D. candidate, Department of Mechanical Engineering, University of Ulsan, Korea
Advisor: Professor Kyoung Kwan Ahn
Title: A Sensorless Reflecting Control for Bilateral Haptic Teleoperation System based on Pneumatic Artificial Muscle Actuators
Fluid Power Control and Machine Intelligence Laboratory

20-May-21 | 2 Presenter: Cong Phat Vo

CONTENTS

Chapter 1: Introduction
Chapter 2: Modeling and platform setup based on an APAM configuration
Chapter 3: Adaptive gain ITSMC with TDE scheme for the position control
Chapter 4: Adaptive finite-time force sensorless control scheme with TDE
Chapter 5: Model-free control for optimal contact force in an unknown environment
Chapter 6: Optimized human-robot interaction control using IRL
Chapter 7: Force sensorless reflecting control for the BHTS
Chapter 8: Conclusion and future works


20-May-21 | 4 Presenter: Cong Phat Vo

1. INTRODUCTION

Overview of the BHTS structure

Figure 1.1: The bilateral haptic teleoperation architecture: the operator drives a master haptic interface under a master controller; a communication channel links it to the slave teleoperator under a slave controller, which acts on the remote environment; a vision sensor and monitor close the visual loop.

Robustness: the controllers are required to be robustly stable with respect to a specified set of uncertainties introduced by the different components; the operator, the environment, the communication channel, and the sensors all introduce uncertainties.

20-May-21 | 5 Presenter: Cong Phat Vo

1. INTRODUCTION

Overview of the BHTS structure (Figure 1.1, as above)

Performance: the main purpose is to provide the technical means to successfully perform a desired task in the environment, i.e., achieve high task performance. It is evaluated through physically accessible quantities such as task completion time, measurement error, applied forces, or dissipated energies.

20-May-21 | 6 Presenter: Cong Phat Vo

1. INTRODUCTION

Overview of the BHTS structure (Figure 1.1, as above)

Perception: it refers to the operator's feeling of being there. Ideally, the operator cannot distinguish the remote environment from a directly manipulated one; this presence is correlated with task performance in a positive causal way, implying that by improving the feeling of presence, task performance is improved as well.

20-May-21 | 7 Presenter: Cong Phat Vo

1. INTRODUCTION

Overview of the BHTS structure (Figure 1.1, as above)

Transparency: compared to perception, it is a quantitative objective. Transparency and robustness are conflicting objectives (a trade-off). A measure for transparency is fidelity, which describes the ability to exactly display the environment to the operator.

20-May-21 | 8 Presenter: Cong Phat Vo

1. INTRODUCTION

Overview of the BHTS structure (Figure 1.1, as above)

The four objectives (robustness, performance, perception, transparency) are classified, evaluated, and achieved with respect to the environment condition, the operator's characteristics, and the communication channel. Representative approaches:
- Variable Impedance Control [5, 6]
- Model-mediated Teleoperation [40]
- Movement Estimation [78, 79]
- Passivity [27], Prediction [39], ...

20-May-21 | 9 Presenter: Cong Phat Vo

1. INTRODUCTION

PAM actuator

Fig. 1.2: (a) Structure and (b) working principle of the PAM: a rubber hose wrapped in loose-weave fibers with fitting connections; pressurizing the hose makes it contract, producing a stroke between the un-pressured and pressured states.

Application examples: 2-joint manipulator [16, 17], forceps manipulators [20], KNEXO robot [22], wearable haptic device [23].

20-May-21 | 10 Presenter: Cong Phat Vo

1. INTRODUCTION

Research Objectives

01. Propose a position tracking controller: an integral terminal-style SMC is designed with the combination of an adaptive gain and the TDE technique.
02. Propose a finite-time force controller: an adaptive-gain fast ITSMC-TDE scheme is established with a friction-free disturbance technique in finite time.
03. Propose model-free control for the slave-environment interaction: applying an RL approximation method to learn the (minimal) optimal contact force online.
04. Propose an optimal impedance for the human-master interaction: using IRL to optimize the prescribed impedance model parameters to assist the human with minimum effort.
05. Propose force sensorless reflecting control for bilateral haptic teleoperation: a new AFOB is proposed to estimate the force signals, achieving great transparency (position and force feedback).


20-May-21 | 12 Presenter: Cong Phat Vo

2. MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION

Mathematical modeling

Fig. 2.1: A cross-sectional view of the simplified geometric model (a braid thread of length b wound in n turns, so b relates to the muscle length L and diameter D through b^2 = L^2 + (n*pi*D)^2).

The tensile force of each PAM [7]:

F = -P dV/dL = P (3L^2 - b^2) / (4*pi*n^2)

The motion equation of the antagonistic pair acting on the plate:

M x'' + B x' = -F1 + F2

which is written in the state form

x'' = f0(x) + g0(x) u + d

where f0(x) and g0(x) are the nominal values of f(x) and g(x) obtained from the geometric force model and the damping term, and d is the total unknown disturbance, d = (f(x) - f0(x)) + (g(x) - g0(x)) u + w.

Table 2.1: Parameters of the studied PAM
Parameter | Description                    | Value | Unit
Ps        | Supply pressure                | 6.5   | bar
B         | Viscous damping coefficient    | 50    | N*s/mm
M         | Mass of the plate              | 0.5   | kg
L0        | Original length of the PAM     | 175   | mm
n         | Number of turns of the thread  | 104   | -
b         | Thread length                  | 200   | mm
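The geometric force model above can be made concrete with a minimal sketch, assuming consistent SI units (meters and pascals); the turn count used here is illustrative, not the Table 2.1 value:

```python
import math

def pam_force(P, L, b, n):
    """Tensile force of a McKibben-type PAM from the geometric model:
    F = P * (3L^2 - b^2) / (4*pi*n^2), with P in Pa, L and b in meters."""
    return P * (3.0 * L ** 2 - b ** 2) / (4.0 * math.pi * n ** 2)

# The force vanishes at full theoretical contraction, L = b / sqrt(3),
# and decreases monotonically as the muscle contracts (L shrinks).
```

Note the model predicts zero force exactly where 3L^2 = b^2, which is why a real PAM's usable stroke stops well short of that length.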

20-May-21 | 13 Presenter: Cong Phat Vo

2. MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION

Experimental modeling

The approximated PAM model, based on its pressure P and contraction strain, is a three-element (spring-damper-force) model with pressure-affine coefficients:

M_k x'' + b_k(P) x' + k_k(P) x = f_k(P)

b_k(P) = b_k1 P + b_k0
k_k(P) = k_k1 P + k_k0
f_k(P) = f_k1 P + f_k0

Table 2.2: List of APAM model parameters
Spring elements:   k_k1 = 117.32 N/m per bar,   k_k0 = 68 N/m
Damping elements:  b_k1 = 9.62 N*s/m per bar,   b_k0 = 0.85 N*s/m
Force elements:    f_k1 = 91.22 N per bar,      f_k0 = 4.73 N

Fig. 2.3: Experimental model for certain separate pressures (force [N], up to roughly 500 N, versus contraction length [mm] and pressure [bar], 1-6 bar).

Fig. 2.2: Block diagram of the experimental system: PAM 1 and PAM 2 coupled by a pulley and plate with load, a linear encoder, proportional pressure regulator valves, and PCI-1711 / PCI-QUAD04 cards exchanging the pressure, control, and position signals.
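A small sketch of the pressure-affine three-element model, using the Table 2.2 values as transcribed (units as listed there, pressure in bar):

```python
def affine(c1, c0):
    """Pressure-affine coefficient map c(P) = c1 * P + c0 (P in bar)."""
    return lambda P: c1 * P + c0

# Table 2.2 values as transcribed above
k_k = affine(117.32, 68.0)   # spring element
b_k = affine(9.62, 0.85)     # damping element
f_k = affine(91.22, 4.73)    # force element

# The static force term at 6 bar comes out near 552 N, consistent in
# magnitude with the ~500 N peak of Fig. 2.3 at the highest pressures.
force_at_6_bar = f_k(6.0)
```

This affine-in-pressure structure is what makes the later controller designs tractable: each coefficient is a one-parameter lookup rather than a full nonlinear map.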

20-May-21 | 14 Presenter: Cong Phat Vo

2. MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION

Platform setup

Fig. 2.5: The APAM-based (a) master and (b) slave subsystem configurations. (a) Front and back views of the master: pneumatic muscle actuators on a pulley, proportional pressure regulator valves, an optical encoder, a control stick, load cells, and an interactive load cell. (b) The slave: the APAM system (PAM and pulley) with a proportional valve and a load cell contacting a spring-damper environment; the position, control, stiffness/damping, and contact force signals are exchanged with the PC and PCI cards in the control box.

20-May-21 | 15 Presenter: Cong Phat Vo

2. MODELING OF THE BILATERAL TELEOPERATION SYSTEM

Platform setup

Fig. 2.6: Photograph of the experimental apparatus of the overall BHTS: the master robot with its control stick, the slave robot, and the environment.

Table 2.3: Specifications of the experimental devices
PAM            | Festo MAS-10-200N-AA-MC-O; nominal length 200 mm; inside diameter 10 mm; max pressure 8 bar
PPR valve      | Festo VPPM-8L-L-1-G14-0L8H; max pressure 8 bar
DAQ cards      | ADVANTECH PCI-1711 (AI/AO, 12-bit resolution); MEASUREMENT PCI-QUAD04
Linear encoder | Type Rational WTB5-0500MM; resolution 0.005 mm


20-May-21 | 17 Presenter: Cong Phat Vo

3. AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Proposed control idea

1. For the first time, the AITSMC-TDE scheme is proposed for a PAM system.
2. The stability of the controlled system is theoretically analyzed.
3. The control effectiveness is verified by experimental trials in different challenging working conditions (comparing several controllers).

20-May-21 | 18 Presenter: Cong Phat Vo

3. AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Control design

Block diagram of the proposed control approach: the reference trajectory (x1d, x2d) is transferred to error coordinates, passed through the terminal sliding surface and then the integral terminal sliding surface s; the control law u combines a nominal control signal un and a robust control signal us whose gain follows an adaptation law, and a time-delay estimator feeds the disturbance estimate d-hat back from the PAM system states (x1, x2).

A terminal sliding surface is chosen on the tracking errors e1, e2:
s2 = e2 + k1 e1 + k2 e1^(gamma1/gamma2)   (exponent notation reconstructed)

A proposed integral sliding surface s(t) is designed on s2 such that s(0) = 0, which eliminates the reaching phase.

An adaptive law updates the robust switching gain eta3-hat online (driven by sign(|s|) with the parameters mu, delta, sigma), and the robust control signal applies -g0^{-1}(x) times the adaptive switching term eta3-hat sign(s) plus a term proportional to s (notation reconstructed).

Proof: see more detail in Appendices 3.2 and 3.3.

20-May-21 | 19 Presenter: Cong Phat Vo

3. AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Control design (continued)

The auxiliary (error) variables are defined as
e1 = x1 - x1d,  e2 = e1' = x2 - x2d

The nominal control signal is designed from the nominal model as
un = g0^{-1}(x) [ x2d' - f0(x) - k1 e2 - k2 (gamma1/gamma2) e1^(gamma1/gamma2 - 1) e2 ]   (notation reconstructed)

The lumped disturbance is estimated by time-delay estimation (TDE), using the signals one sample delay td in the past:
d-hat(t) = x2'(t - td) - f0(x(t - td)) - g0(x(t - td)) u(t - td)

so the control law can be rewritten as
u(t) = un + us - g0^{-1}(x) d-hat

Proof: see more detail in Appendices 3.1 and 3.4-3.5.
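The TDE idea above can be sketched in discrete time. This is a minimal illustration, not the dissertation's controller: the nominal dynamics f0, g0 and the disturbance are illustrative, and the estimate d-hat simply lags the true disturbance by one sample:

```python
import math

# Plant: x'' = f0(x, v) + g0*u + d(t), with d(t) unknown to the controller.
f0 = lambda x, v: -2.0 * v                        # illustrative nominal dynamics
g0 = 1.0
d_true = lambda t: 0.5 * math.sin(math.pi * t)    # slowly varying disturbance

def run_tde(steps=2000, dt=1e-3):
    """Simulate TDE compensation and return the worst estimation error
    after a short warm-up; it is bounded by the disturbance's change per sample."""
    x = v = 0.0
    d_hat = 0.0
    worst = 0.0
    for k in range(steps):
        t = k * dt
        u = -d_hat / g0                         # cancel the estimated disturbance
        a = f0(x, v) + g0 * u + d_true(t)       # true acceleration this step
        if k > 10:
            worst = max(worst, abs(d_hat - d_true(t)))
        # TDE update: next step's estimate built from this step's signals,
        # mirroring d-hat(t) = x2'(t - td) - f0(x(t - td)) - g0 u(t - td)
        d_hat = a - f0(x, v) - g0 * u
        v += a * dt
        x += v * dt
    return worst
```

Because the estimate is exact up to a one-sample delay, the residual error scales with dt times the disturbance's rate of change, which is why TDE pairs well with a fast sampling rate.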

20-May-21 | 20 Presenter: Cong Phat Vo

3. AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Experimental validation

Parameters
for the ITSMC: eta1 = 3, eta2 = 1.5, eta3 = 2.5, kappa1 = kappa3 = 10, kappa2 = kappa4 = 2, gamma1 = gamma2 = 5/7, k1 = 2, k2 = 5.5, alpha = 9/11
for the adaptive gain: mu = 10, delta = 0.05, sigma = 0.1
PID control, case study 3: KP = 1.5, KI = 1.8, KD = 0.05; other cases: KP = 3.5, KI = 2.8, KD = 0.1

Case study 1 (no load): xd (mm) = 8 sin(0.4*pi*t)
Case study 2 (no load): xd (mm) = 8 sin(pi*t)
Case study 3 (no load): a multistep trajectory
Case study 4 (with load): xd (mm) = 8 sin(pi*t), in turn with a 2-kg load and a 7-kg load

Fig. 3.1: Performance response in case study 1. Fig. 3.2: Performance response in case study 2.

20-May-21 | 21 Presenter: Cong Phat Vo

3. AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Experimental validation (continued)

Fig. 3.3: Performance response in case study 3. Fig. 3.4: Performance response in case study 4.

Table 3.1: A quantitative comparison between the tracking performances
Case                        | Metric | PID    | ITSMC  | Proposed
Sine 0.2 Hz, 8 mm           | L2[e]  | 0.4102 | 0.1737 | 0.1574
                            | L2[u]  | 0.9961 | 1.0270 | 0.9602
Sine 1 Hz, 8 mm             | L2[e]  | 1.1845 | 0.4996 | 0.3788
                            | L2[u]  | 1.1284 | 1.1152 | 1.0900
Multistep                   | L2[e]  | 0.8587 | 0.6846 | 0.6368
                            | L2[u]  | 0.7670 | 0.7333 | 0.7247
Sine 1 Hz, 8 mm, 2-kg load  | L2[e]  | 1.2578 | 0.8314 | 0.4779
                            | L2[u]  | 1.1414 | 1.1506 | 1.1287
Sine 1 Hz, 8 mm, 7-kg load  | L2[e]  | 1.5937 | 1.0827 | 0.5067
                            | L2[u]  | 1.1613 | 1.1729 | 1.1584
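The slide does not spell out the normalization behind L2[e] and L2[u]; a common, assumed form is the time-normalized L2 measure (an RMS value), which is what the comparison likely reports:

```python
import math

def l2_measure(signal, dt):
    """Assumed form of L2[.]: sqrt((1/T) * integral of x(t)^2 dt),
    i.e. the RMS value of the sampled signal over its duration T."""
    T = len(signal) * dt
    return math.sqrt(sum(x * x for x in signal) * dt / T)
```

Read jointly: a lower L2[e] at a comparable L2[u] means better tracking without extra control effort, which is the pattern the Proposed column shows.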

20-May-21 | 22 Presenter: Cong Phat Vo

3. AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Summary
- Robust performance against uncertainties and disturbances
- Finite-time convergence of the tracking errors
- Reduced chattering phenomenon
- Elimination of the reaching phase
- Fast transient response

Next works will be directed to control using the force properties.


20-May-21 | 24 Presenter: Cong Phat Vo

4. ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Proposed control idea

1. The proposed FSOB scheme obtains the force information with noise and static friction eliminated.
2. The FITSMC guarantees finite-time convergence; a new adaptive gain and TDE are deployed online to handle uncertainties and the chattering phenomenon.
3. The reliability and superiority of the proposed control are demonstrated in different challenging working conditions.

20-May-21 | 25 Presenter: Cong Phat Vo

4. ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Control design

Block diagram: the reference trajectory and the environment configuration feed the integral terminal sliding surface; the control law with its adaptive law drives the plant + environment (APAM system), while a force-based time-delay estimator returns the lumped disturbance estimate d-hat and the force observer returns the estimated force tau-e-hat.

The FITSM surface is defined on the force error ef:
s(t) = ef + integral from 0 to t of [ zeta1 ef + zeta2 |ef|^v sign(ef) ] d-tau   (coefficient notation reconstructed)

The control law combines the TDE-estimated lumped disturbance d-d-hat, a feedback term Kd on the surface dynamics, and an adaptive switching term whose gain law raises the gain while |s| is large and shrinks it near the surface (notation reconstructed). The force itself is obtained without a sensor by a force observer based on LeD [40].

Proof: see Appendices 4.1-4.3.

20-May-21 | 26 Presenter: Cong Phat Vo

4. ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Experimental validation

Parameters
for the ITSMC: eta1 = 3, eta2 = 1.5, eta3 = 2.5, kappa1 = kappa3 = 10, kappa2 = kappa4 = 2, gamma1 = gamma2 = 5/7, k1 = 2, k2 = 5.5, v = 9/11
for the adaptive gain: mu = 10, delta = 0.05, sigma = 0.1
observers, for the FSOB: omega = 500 rad/s, xi = 0.001, alpha1 = 20, alpha2 = 12, alpha3 = 5, alpha4 = 15
PID control, scenario 3: KP = 1.5, KI = 1.8, KD = 0.05; other scenarios: KP = 3.5, KI = 2.8, KD = 0.1

Scenario 1: tau_d (N) = 7.5 sin(0.2*pi*t - pi/2) + 62.5
Scenario 2: tau_d (N) = 7.5 sin(0.8*pi*t - pi/2) + 62.5
Scenario 3: a multistep trajectory with an LPF bandwidth of 0.2 rad/s

Fig. 4.1: Performance response in Scenario 1. Fig. 4.2: Performance response in Scenario 2.

20-May-21 | 27 Presenter: Cong Phat Vo

4. ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Experimental validation (continued)

Fig. 4.3: Performance response in Scenario 3. Fig. 4.4: Control gain.

Table 4.1: An RMSE comparison between the tracking error performances
Scenario    | PID    | ISMC   | FITSMC-TDE | Proposed
Sine 0.1 Hz | 2.1050 | 1.2152 | 0.9440     | 0.5158
Sine 0.4 Hz | 3.2054 | 1.8045 | 2.0632     | 1.9807
Multistep   | 2.6508 | 2.3548 | 2.0618     | 1.7685

Discussion
- Improved control performance (fast response, high accuracy, and robustness)
- Guaranteed finite-time convergence
- Online cancellation of the lumped uncertainties via the TDE solution and the switching gain in the FITSMC
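For completeness, the RMSE metric behind Table 4.1 is the standard root-mean-square error over the sampled tracking error:

```python
import math

def rmse(ref, actual):
    """Root-mean-square error between a reference and a measured trajectory."""
    assert len(ref) == len(actual), "trajectories must be sampled identically"
    n = len(ref)
    return math.sqrt(sum((r - a) ** 2 for r, a in zip(ref, actual)) / n)
```

Unlike the peak error, RMSE weights the whole run, so a controller with brief transients but tight steady-state tracking (the Proposed column's pattern) scores well.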


20-May-21 | 29 Presenter: Cong Phat Vo

5. MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Proposed control idea

1. The RL method is used to learn the optimal desired force online.
2. The stability of the closed-loop control is theoretically demonstrated by the Lyapunov approach.
3. The approach is verified to be effective for robot interaction control in unknown environments.

20-May-21 | 30 Presenter: Cong Phat Vo

5. MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Control design

Robot-specific inner loop: precise position control of the APAM system. Task-specific outer loop: a prescribed impedance model driven by an RL-based approximation, closed through the environment configuration via the force error ef and the position references x_r, x_f.

The environment is modeled as a spring-damper system [81]:
Fe(t) = -Ce x'(t) + Ke (xe - x(t))

The optimal impedance model relates the regulation error er and the contact force through M er'' + C er' + K er driven by the force error (notation reconstructed), with the impedance reference Zr(e(t)) = Fd(t).

The position control in the task space is a PI law on the force error:
x_f = KPf ef(t) + KIf * integral from 0 to t of ef(tau) d-tau

The regulation error dynamics are written in state form:
er'(t) = A er(t) + B Fd(t)

20-May-21 | 31 Presenter: Cong Phat Vo

5. MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Implementation procedure: LQR solution

The main objective of this work is to determine the optimal desired force Fd, i.e., the minimum force that minimizes the position error er:
Fd(t) = kappa_e er(t)

The discounted cost function:
Js(er(t), Fd(t)) = integral from t to infinity of ( er^T S er + Fd^T R Fd ) d-tau

The algebraic Riccati equation (discrete-time form, notation reconstructed):
A^T P A + S - P - A^T P B (B^T P B + R)^{-1} B^T P A = 0

The feedback control gain:
kappa_e = -(B^T P B + R)^{-1} B^T P A

This is an offline solution: it requires the environment model (A, B).
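Assuming the discrete-time Riccati form above, the offline LQR step can be sketched for the scalar case (a and b here are illustrative stand-ins for the identified environment model; S = 1 and R = 10 follow the parameter slide):

```python
def dare_scalar(a, b, s, r, iters=1000):
    """Fixed-point (value) iteration of the scalar discrete-time Riccati equation:
    P = a*P*a + s - (a*P*b)^2 / (b*P*b + r)."""
    p = s
    for _ in range(iters):
        p = a * p * a + s - (a * p * b) ** 2 / (b * p * b + r)
    return p

def feedback_gain(a, b, s, r):
    """kappa_e = -(b*P*b + r)^{-1} * b*P*a, so that F_d = kappa_e * e_r."""
    p = dare_scalar(a, b, s, r)
    return -(b * p * a) / (b * p * b + r)
```

The closed-loop factor a + b*kappa_e then lies inside the unit circle, which is the property the RL and IRL steps must reproduce without knowing a and b.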

20-May-21 | 32 Presenter: Cong Phat Vo

5. MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Implementation procedure: RL solution

Solving the model-based ARE is impossible for all time in an unknown environment, so the optimal desired force Fd(t) = kappa_e er(t) is sought through the value function instead.

The discounted cost function:
Vs(er(t)) = integral from t to infinity of e^{-gamma(tau - t)} r(er(tau), Fd(tau)) d-tau,
where r(er, Fd) = er^T S er + Fd^T R Fd

The Hamilton-Jacobi-Bellman equation (notation reconstructed):
min over Fd of [ dVs(er(t))/dt - gamma Vs + r(er, Fd) ] = 0

which leads to the new algebraic Riccati equation (notation reconstructed):
er^T ( A^T P + P A + S - gamma P ) er + 2 er^T P B Fd + Fd^T R Fd = 0

The optimal desired force:
Fd(t) = -H22^{-1} H21 er(t),  with Q(er(t), Fd(t)) = Vs(er(t)) = X^T H X, X = [er; Fd]

This is an online solution, using off-policy RL to find the solution of the given new ARE.

20-May-21 | 33 Presenter: Cong Phat Vo

5. MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Implementation procedure: IRL approximation

The Q-value function is approximated with N normalized radial basis functions:
Q-hat(er, Fd) = sum over i = 1..N of theta_i phi_i(er, Fd)
phi_i(er, Fd) = psi_i / sum over j of psi_j,
psi_i(er, Fd) = exp( -(1/2) [er - c_i^r; Fd - c_i^d]^T Sigma_i^{-1} [er - c_i^r; Fd - c_i^d] )   (notation reconstructed)

The reward (utility) function:
r(t) = er(t)^T S er(t) + Fd(t)^T R Fd(t)

Q-function update law based on IRL, over an interval Delta-t:
Q(er(t), Fd(t)) = integral from t to t + Delta-t of e^{-gamma(tau - t)} r(er(tau), Fd(tau)) d-tau + e^{-gamma Delta-t} Q(er(t + Delta-t), Fd(t + Delta-t))

The parameters theta are updated by a normalized gradient law that minimizes the temporal-difference error of this equation (update law reconstructed in outline), and the learned Q-function yields Fd(t) = kappa_e er(t).
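The RBF approximation and TD update above can be sketched in one dimension. This is an illustrative reduction, not the dissertation's learner: scalar state, hand-picked centers and width, and a plain gradient step shrinking the interval Bellman (TD) error:

```python
import math

def rbf_features(x, centers, width):
    """Normalized Gaussian radial basis functions phi_i(x), summing to 1."""
    psis = [math.exp(-((x - c) ** 2) / (2.0 * width ** 2)) for c in centers]
    total = sum(psis)
    return [p / total for p in psis]

def td_update(theta, phi_t, phi_next, reward_int, gamma_dt, alpha):
    """One gradient step on the interval Bellman equation:
    delta = reward_int + gamma_dt * Q(t + dt) - Q(t), with Q = theta . phi."""
    q_t = sum(w * p for w, p in zip(theta, phi_t))
    q_next = sum(w * p for w, p in zip(theta, phi_next))
    delta = reward_int + gamma_dt * q_next - q_t
    theta = [w + alpha * delta * p for w, p in zip(theta, phi_t)]
    return theta, delta
```

Repeated updates on observed (state, reward, next-state) intervals drive the TD error toward zero, which is the practical meaning of "solving the new ARE online".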

20-May-21 | 34 Presenter: Cong Phat Vo

5. MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Validation results

Parameters
LQR solution: kappa_e = 0.0025, S = 1, R = 10, T = 0.001, A = -4.006, B = -0.002
RL approximation: upsilon = 0.9, gamma = 0.9, delta-Delta = 0.65, epsilon = 0.1, epsilon-decay = 0.99, K = 10, beta = 0.1
Position/force control: KPf = 0.001, KIf = 0.02, lambda = 0.2, eta-Delta = 0.1, rho1 = 0.5, rho2 = 2.0, v = 9/11, alpha = 5/7
Environment: Ce = 0.5 N*s*mm^-1, Ke = 2 N*mm^-1, r0 = 165 mm, xd = 39.917 mm, xe = 35 mm

Fig. 5.1: RL in unknown environments during the learning process. Fig. 5.2: Learning curves.

20-May-21 | 35 Presenter: Cong Phat Vo

5. MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Validation results (continued)

Fig. 5.3: Desired force/position tracking response.

With the LQR solution, three steps are required to complete the experiment:
1. identification of the environment;
2. obtaining the ARE solution using the environmental estimate;
3. the complete experiment for the position/force controller.

The IRL approximation, by contrast, achieves near-optimal performance without knowledge of the environment dynamics.


20-May-21 | 37 Presenter: Cong Phat Vo

6. OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Proposed control idea

1. Find the optimal parameters of the impedance model to assure motion tracking and assist the human with minimum effort to perform the task in real time.
2. The stability of the closed-loop control is theoretically demonstrated.
3. The handling of unknown human dynamics, and the optimal overall human-robot system efficiency, are verified in experimental trials.

20-May-21 | 38 Presenter: Cong Phat Vo

6. OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design

Robot-specific inner loop: a torque control law drives the APAM-based haptic interface so its output theta tracks the prescribed impedance model output theta_m. Task-specific outer loop: an IRL-based optimal law tunes the prescribed impedance model against the human operator's torque tau_h (gain Kh), with an NN functional approximation handling the unknown dynamics.

The prescribed impedance model:
theta_m / tau_h = Kh / ( Md s^2 + Cd s + Kd )   (notation reconstructed)

The control torque in the task space combines a PI-like term Ks ( e-dot + mu1 e + mu2 * integral of e d-tau ) on the tracking error with the NN approximation Z-hat_q of the unknown continuous function and a bounding term K_B (notation reconstructed in outline).

The NN approximation Z-hat_q = W-hat^T sigma(q) is maintained by adaptation laws of sigma-modification type, W-hat' = F sigma e_h^T - k F ||e_h|| W-hat (notation reconstructed).

20-May-21 | 39 Presenter: Cong Phat Vo

6. OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design (continued): IRL based on the LQR solution

The human transfer function [99], whose purpose is to map tau_h to the desired error ed:
G(s) = (s + phi) / (kp s + ke)   (notation reconstructed)

so the human model is put in state form e_h' = A_h e_h + B_h ed, with A_h, B_h built from ke, kp, and phi (notation reconstructed), and ed = [ed^T, ed'^T]^T.

A new augmented state X collects the impedance-model tracking error eq and the human state e_h:
X' = A X + B u_e,  with the feedback u_e = -K X

20-May-21 | 40 Presenter: Cong Phat Vo

6. OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design (continued)

To minimize the infinite-horizon quadratic cost (performance index)
Jm = integral from t to infinity of ( ed^T Qd ed + e_h^T Qh e_h + u_e^T R u_e ) d-tau

solve the algebraic Riccati equation [83]
0 = A^T P + P A + Q - P B R^{-1} B^T P

which gives the optimal feedback control
u_e = arg min over u_e of V(t, X(t)) = -R^{-1} B^T P X

The value function V(X(t)) = integral from t to infinity of X^T (Q + K^T R K) X d-tau = X^T P X satisfies
(A - BK)^T P + P (A - BK) = -(Q + K^T R K)
where P is the real symmetric positive-definite solution. The IRL Bellman equation for a time interval Delta-T > 0 can then be written as
X^T(t) P X(t) = integral from t to t + Delta-T of X^T (Q + K^T R K) X d-tau + X^T(t + Delta-T) P X(t + Delta-T)

Proof of convergence: see Appendix 6.2.

20-May-21 | 41 Presenter: Cong Phat Vo

6. OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Online implementation

Fig. 6.1: Continuous-time linear policy iteration algorithm.

1. Initialization: P^0 = 0, i = 1, and an admissible control input u_e^0 = -K^1 X with K^1 such that A - BK^1 is stable.
2. Policy evaluation: given the control policy u^i, find P^i from the off-policy IRL Bellman equation
X^T(t) P^i X(t) = integral from t to t + Delta-T of X^T (Q + (K^i)^T R K^i) X d-tau + X^T(t + Delta-T) P^i X(t + Delta-T),
solving for the cost p^i by least squares over measured trajectory data.
3. Policy improvement: update the control policy u^{i+1} = -R^{-1} B^T P^i X, i.e., K^{i+1} = R^{-1} B^T P^i.
Repeat (i -> i + 1) until ||p^i - p^{i-1}|| falls below a tolerance, then stop.
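A model-based stand-in for the data-driven loop in Fig. 6.1 is Kleinman's policy iteration, sketched here for the scalar case with illustrative a, b, q, r (the IRL version replaces the explicit Lyapunov solve in the evaluation step with the least-squares Bellman fit over trajectory data):

```python
def policy_iteration_lqr(a, b, q, r, k0, iters=50):
    """Kleinman's policy iteration for the scalar continuous-time LQR.
    Policy evaluation solves the Lyapunov equation
    2*(a - b*k)*p = -(q + r*k*k); policy improvement sets k = b*p/r.
    The initial k0 must be stabilizing: a - b*k0 < 0."""
    k = k0
    for _ in range(iters):
        p = (q + r * k * k) / (2.0 * (b * k - a))   # policy evaluation
        k = b * p / r                               # policy improvement
    return p, k
```

At convergence the fixed point satisfies the scalar CARE 2*a*p + q - (b*p)^2 / r = 0, so for a = b = q = r = 1 the iteration lands on p = 1 + sqrt(2), the known closed-form solution.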

20-May-21 | 42 Presenter: Cong Phat Vo

6. OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Validation results: simulation

Parameters
LQR solution: kappa_e = 0.0025, S = 1, R = 10, T = 0.001, A = -4.006, B = -0.002
RL approximation: upsilon = 0.9, gamma = 0.9, delta-Delta = 0.65, epsilon = 0.1, epsilon-decay = 0.99, K = 10, beta = 0.1
Position control: Ks = 20, F = 10, G = 1, mu1 = mu2 = 5, k = 0.1, Z = 100, kappa_p = 20, kappa_d = 10, ke = 0.18 s
Impedance model: theta_d = 10 sin(0.2*pi*t), Md = 1 kg, Bd = 7.18 N*s*mm^-1, Kd = 15.76 N*mm^-1, Kh = 0.046

Fig. 6.2: The tracking error between the trajectory of the robot system and the prescribed impedance model. Fig. 6.3: Reference trajectory versus the state of the robot system during and after learning.

20-May-21 | 43 Presenter: Cong Phat Vo

6. OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Validation results: experiment

Fig. 6.4: The output trajectory and impedance model in the inner loop. Fig. 6.5: The desired trajectory and output trajectory in the outer loop. Fig. 6.6: Interaction force.

Discussion
- Near-perfect performance of the human-robot cooperation
- A small deviation between the desired and actual trajectories
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller


20-May-21 | 45 Presenter: Cong Phat Vo

7. FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Proposed control idea

1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed.
2. The stability and the finite-time convergence characteristic of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified in different working conditions.

20-May-21 | 46 Presenter: Cong Phat Vo

7. FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Control design

Block diagram: the operator drives the APAM master system through the master controller; master and slave exchange the positions theta_m, theta_s and the estimated torques tau-hat_md, tau-hat_sd over the channel; the slave controller drives the APAM slave system against the environment, an impedance filter shapes the reflected force f_m, and an estimator (AFOB) on each side replaces the force sensors.

The external torque on each side i in {m, s} is estimated by the adaptive force observer (AFOB): a first-order observer driven by the estimation residual with an adaptive observer gain kappa_i (notation reconstructed); the adaptive gain grows while the sliding variable is away from zero and is frozen otherwise, per the stated switching law on s_i and z_i.

The torque control is designed in FNTSMC style on the modified fast nonsingular terminal sliding-mode surface
s_i = e_i' + lambda_i e_i^(p_i/q_i)   (notation reconstructed)
with a control law combining the model terms (M_i, C_i, K_i), the estimated torque tau-hat_id, and the robust terms rho_1i sign(s_i)|s_i|^(nu_i) + rho_2i s_i (notation reconstructed).

20-May-21 | 47Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Parameters
- Proposed control: λi = 8, pi = 5, qi = 7, ςi = 10^-4, νi = 9/11, εi = 0.1, ρ1i = 10, ρ2i = 2
- PID control: KP = 38, KI = 25, KD = 0.15
- Observers: for NDO, Lh = Le = 10; for RTOB, β = 350 rad/s; for AFOB, κi(0) = 0.1, λi = 10, ηi = 0, m = 0.1, a = 10

Scenarios
- Free space: θd(t) = 5π(1 - 0.18t)sin(0.72πt)
- Contact and recovery: soft environment with Ke = 500 N/mm, Be = 4 N·s/mm; stiff environment with Ke = 2800 N/mm, Be = 20 N·s/mm

Fig. 7.1: Torque estimation performances. The RMSEs associated with the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.

Validation results
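The "soft" and "stiff" environments are characterized only by their stiffness and damping (Ke, Be). A common spring-damper contact model consistent with those parameters is sketched below; the contact location xe and the sample states are my illustrative assumptions, not values from the slide.

```python
def contact_force(x, xdot, xe, Ke, Be):
    """Spring-damper environment model: force only while penetrating (x > xe).

    Units as on the slide: x, xe in mm; Ke in N/mm; Be in N*s/mm.
    (This simple sketch does not clamp a negative pull-off force.)
    """
    if x <= xe:
        return 0.0
    return Ke * (x - xe) + Be * xdot

# 1 mm penetration at 1 mm/s, hypothetical contact location xe = 35 mm
soft  = contact_force(36.0, 1.0, 35.0, 500.0, 4.0)    # 500*1 + 4*1  = 504.0 N
stiff = contact_force(36.0, 1.0, 35.0, 2800.0, 20.0)  # 2800 + 20    = 2820.0 N
```

The same penetration produces a force more than five times larger in the stiff environment, which is why transparency is harder to maintain in the "stiff" contact scenario.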


7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig. 7.2: Position tracking performances and errors, and transparency performances and errors in "soft" and "stiff" contact.

Validation results
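The tracking and transparency errors in these comparisons are summarized as RMSE values (as quoted for the torque estimation, 0.211 / 0.192 / 0.076). A minimal helper for that metric, my illustration rather than the author's code:

```python
import math

def rmse(estimates, truths):
    """Root-mean-square error between two equal-length sequences."""
    assert len(estimates) == len(truths) and len(truths) > 0
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(estimates, truths))
                     / len(estimates))

rmse([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])  # 0.0 for a perfect estimate
```

A lower RMSE over the same trial indicates the estimator or controller tracks the reference signal more closely, which is how the NDO, RTOB, and proposed observer are ranked here.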


7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Summary
- Finite-time convergence of the tracking errors
- Fast transient response
- Elimination of uncertainties and noise effects in the obtained force sensing via the new AFOB
- Robust performance against uncertainties and disturbances in the system dynamics
- Stability of the overall system and transparency performance are attained simultaneously

Great transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite uncertainties and across different working conditions.



8 CONCLUSION AND FUTURE WORKS

Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions; system uncertainties and disturbances are handled.
- Validated the significant effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, yielding highly effective position control while requiring little operator information.
- Achieved good transparency with both force feedback and position tracking simultaneously by the new adaptive force estimation combined with the separate FNTSMC schemes.


8 CONCLUSION AND FUTURE WORKS

Future works
- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both the master and slave sides; the computational complexity should also be reduced as the data size grows.
- Based on the prototype developed in this thesis, a high-quality commercial teleoperation device for the rehabilitation field, with a safer solution, can be developed.


A PRESENTATION OF DOCTORAL DISSERTATION

Presenter: Cong Phat Vo

Department of Mechanical Engineering, University of Ulsan, Korea

Thank you for attending

1 2 3 4 5 6 7 8

Page 2: A PRESENTATION OF DOCTORAL DISSERTATION

20-May-21 | 2Presenter Cong Phat Vo

CONTENTS

Modeling and platform setup based on a APAM configuration

Adaptive Gain ITSMC with TDE scheme for the position control

Adaptive Finite-Time Force Sensorless Control scheme with TDE

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction control using IRL

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

2

7

3

4

5

6

C

H

A

P

T

E

R

1

7

8

20-May-21 | 3Presenter Cong Phat Vo

CONTENTS

Modeling and platform setup based on the APAM configuration

Adaptive Gain ITSMC with TDE scheme

Adaptive Finite-Time Force Sensorless Control scheme with TDE

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force using IRL

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

2

7

3

4

5

6

C

H

A

P

T

E

R 7

8

Introduction1

1 2 3 4 5 6 7 8

20-May-21 | 4Presenter Cong Phat Vo

hellip

Performence

Perception

Transparency

1 INTRODUCTION

Overview of the BHTS structure

Slave

control

Master

control

Co

mm

un

icatio

n c

ha

nn

el

Monitor

Operator

Master

Haptic interface

Slave

teleoperator

Remote

environment

Vision

sensor

hf

m

m

x

x

sdw

s

s

x

x

efmdwm

m

x

x

Figure 11 The bilateral haptic teleoperation architecture

1 2 3 4 5 6 7 8

RobustnessThe controllers are required to be robustly stable with respect to a

specified set of uncertainties introduced by the different components

(operator environment communication channel as well as sensors

introduce uncertainties)

20-May-21 | 5Presenter Cong Phat Vo

hellip

Performence

Perception

Transparency

1 INTRODUCTION

Overview of the BHTS structure

Slave

control

Master

control

Co

mm

un

icatio

n c

ha

nn

el

Monitor

Operator

Master

Haptic interface

Slave

teleoperator

Remote

environment

Vision

sensor

hf

m

m

x

x

sdw

s

s

x

x

efmdwm

m

x

x

Figure 11 The bilateral haptic teleoperation architecture

1 2 3 4 5 6 7 8

RobustnessThe main purpose is to provide technical means to successfully perform a

desired task in the environment (achieved a high task performance)

For evaluation physically accessible quantities (task completion time

measurement error applied forces or dissipated energies)

20-May-21 | 6Presenter Cong Phat Vo

hellip

Transparency

1 INTRODUCTION

Overview of the BHTS structure

Slave

control

Master

control

Co

mm

un

icatio

n c

ha

nn

el

Monitor

Operator

Master

Haptic interface

Slave

teleoperator

Remote

environment

Vision

sensor

hf

m

m

x

x

sdw

s

s

x

x

efmdwm

m

x

x

Figure 11 The bilateral haptic teleoperation architecture

1 2 3 4 5 6 7 8

Robustness

PerformenceIt refers to the operatorrsquos feeling of being there Ideally the operator can

distinguish a remote environment that presence is correlated with task

performance in a positive causal way It implies that by improving the feeling

of presence task performance is improved as well

Perception

20-May-21 | 7Presenter Cong Phat Vo

hellip

Performence

Perception

Transparency

1 INTRODUCTION

Overview of the BHTS structure

Slave

control

Master

control

Co

mm

un

icatio

n c

ha

nn

el

Monitor

Operator

Master

Haptic interface

Slave

teleoperator

Remote

environment

Vision

sensor

hf

m

m

x

x

sdw

s

s

x

x

efmdwm

m

x

x

Figure 11 The bilateral haptic teleoperation architecture

1 2 3 4 5 6 7 8

RobustnessCompared to perception it is a quantitative objective Transparency and

robustness are conflicting objectives (trade-off) A measure for

transparency is fidelity It describes the ability to exactly display the

environment to the operator

20-May-21 | 8Presenter Cong Phat Vo

1 INTRODUCTION

Overview of the BHTS structure

Slave

control

Master

control

Co

mm

un

icatio

n c

ha

nn

el

Monitor

Operator

Master

Haptic interface

Slave

teleoperator

Remote

environment

Vision

sensor

hf

m

m

x

x

sdw

s

s

x

x

efmdwm

m

x

x

Figure 11 The bilateral haptic teleoperation architecture

Variable Impedance Control [5 6]

Model-mediated Teleoperation [40]

Movement Estimation [78 79]

Passivity [27] Prediction [39] hellip

Robustness

Performence

Perception

Transparency

Environment condition

Operatorrsquos characteristics

Communication channel

1 2 3 4 5 6 7 8

classify

evaluate

achieve

20-May-21 | 9Presenter Cong Phat Vo

1 INTRODUCTION

PAM actuator

Fig 12 (a) Structure and (b) working principle of the PAM

KNEXO robot [22] Wearable Haptic Device [23]2-joint manipulator [16 17] Forcepsrsquo manipulators [20]

Rubber hose Fitting connectionLoose-weave

fibersUn-Pressured state

Pressured state

Stroke

(a) (b)

Rubber hoseLoose-weave

fibers

Fitting

connection

1 2 3 4 5 6 7 8

20-May-21 | 10Presenter Cong Phat Vo

1 INTRODUCTION

Research Objectives

Propose Position

Tracking Controller

An integral terminal-style SMC is designed with the combination of an adaptive gain and TDE technique

0102

Propose Model-free Control for

slave-environment interaction03

Propose Force Sensorless refleting

control for bilateral haptic teleoperation

Propose Finite-Time

Force Controller

An adaptive gain Fast ITSMC-TDE scheme is established with a friction-free disturbance technique in finite-time

Using IRL to optimize the prescribed impedance model parameters to assist minimum human effort

Propose optimal impedance for

human-master interaction

05

04Applying RL approximation method to learning the optimal contact force (minimal) online

The new AFOB is proposed to estimate the force signalsAchieving great transparency (position force feedback)

1 2 3 4 5 6 7 8

20-May-21 | 11Presenter Cong Phat Vo

CONTENTS

Adaptive Gain ITSMC with TDE scheme

Adaptive Finite-Time Force Sensorless Control scheme with TDE

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction control using IRL

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

3

4

5

6

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

1 2 3 4 5 6 7 8

20-May-21 | 12Presenter Cong Phat Vo

2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION

Mathematical modeling

n turns

nπD

bL

1 t

urn

s

D

2 2

2

3

4

dV dV d b LF P P P

dL dL d n

minus= minus = minus = minus

The tensile force of each PAM [7]

Fig 21 A cross-sectional view of the simplified geometric model

The motion equation of the system

1 2Mx Bx F F+ = minus +

Parameter Description Value UnitPs Supply pressure 65 bar

B Viscous damping coefficient 50 Nsmm

M Mass of the plate 05 Kg

L0 Original length of the PAM 175 mm

n Number of turns of the thread 104

b Thread length 200 mm

Table 21 Parameters of the studied PAM

The motion equation of the system

0 0( ) ( )x f x x g x u d= + +

where f0(x) g0(x) are the nominal value of f(x)

g(x) d is a total of unknown disturbances

( )2 2 2

00 0

2 2

0 0

33 ( ) ( )

2

( ( ) ( )) ( ( ) ( ))

L x bP L x Bxf x g x

n M M n M

d f x x f x x g x g x u

+ minus = minus minus = = minus + minus +

1 2 3 4 5 6 7 8

20-May-21 | 13Presenter Cong Phat Vo

2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION

Experimental modeling

The approximated PAM model based on its

pressure and contraction strain

1 0

1 0

1 0

( ) ( ) ( )

( )

( ) ( 1 2)

( )

k k k k k k k k k k

k k k k k

k k k k k

k k k k k

M b P k P f P

b P b P b

k P k P k k

f P f P f

+ + =

= +

= + = = +

Parameter Value Units

Spring

elements

kk1 11732 Nmbar

kk0 68 Nm

Damping

elements

bk1 962 Nmsbar

bk0 085 Nms

Force

elements

fk1 9122 Nbar

fk0 473 N

0-100

6

0

100

55

200

Fo

rce

[N

]

Contraction length

[mm]

300

400

4

Pressure [bar]

500

103 2 151

Fig 23 Experimental model for certain separate pressure

Table 22 List of APAM model parameters

Load

PAM 1 PAM 2Plate

Linear encoder

Proportional pressure

regulator valves

Card

PCI-1711

Card

PCI-QUAD04

Pressure signal

Control signal

Position signal

Pulley

Fig 22 Block diagram of the experimental system

1 2 3 4 5 6 7 8

20-May-21 | 14Presenter Cong Phat Vo

2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION

Platform setup

Fig 25 The APAM-based (a) master and (b) slave subsystem configuration

Proportional

pressure

regulator

valves

Optical

encoder

Control stick

Loadcells

Pneumatic

Muscle

Actuators Interactive

Loadcell

Pulley

Front viewBack view(a)

Proportional valve

Loadcell DamperSpring

Environment

APAM system

PAMPulley

PositionSignal

ControlSignal

StiffnessDamping

Contact Force Signal

PCPCI Cards

Control box

(b)

1 2 3 4 5 6 7 8

20-May-21 | 15Presenter Cong Phat Vo

2 MODELING OF THE BILATERAL TELEOPERATION SYSTEM

Slave robot

Environment

Master robot

Control stick

Device Description

PAM

Festo MAS-10-200N-AA-MC-O

Nominal length 200 mm

Inside diameter 10 mm

Max pressure 8 bar

PPR ValveFesto VPPM-8L-L-1-G14-0L8H

Max pressure 8 bar

DAQ Card

ADVANTECH PCI-1711

AIAO 12 bits (resolution)

MEASUREMENT PCI-QUAD04

Linear

encoder

Type Rational WTB5-0500MM

Resolution 0005 mm

Table 23 Specifications of the exp devices

Fig 26 Photograph of the experimental apparatus of the overview BHTS

Platform setup

1 2 3 4 5 6 7 8

20-May-21 | 16Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

4

5

6

C

H

A

P

T

E

R

1

7

8

Adaptive Gain ITSMC with TDE scheme for the position control3

Modeling and platform setup based on the APAM configuration2

1 2 3 4 5 6 7 8

20-May-21 | 17Presenter Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Proposed control idea

3The control effectiveness is verified by experimental trials in

different challenging work conditions (compare several controllers)

2

The stability of the controlled system is theoretically analyzed

The first time the AITSMC-TDE scheme

is proposed for a PAM system1

1 2 3 4 5 6 7 8

20-May-21 | 18Presenter Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

2 1 1 2 1s e k e k e= + +

( ) ( ) (( ) ( ) )

0

1

0 1 2 2 1 2

0 1 2 0 1 1

t

t

n d

s t s t k e k e e

f x x g x u x d

minus= minus minus +

+ + minus

( ) ( )( ) ( ) 31

0 1 3 3 4ˆ

su t g x sign

minus = minus + +

( )( ) ( )

( )3 3 0

3

ˆ ˆ1ˆ or

( )mt t

tsign t

= +=

+ =

A terminal sliding surface is chosen

A proposed integral sliding surface is designed

An adaptive law for the robust gain

A robust control signal can be redesigned

Proof See more detail in Appendix 32 33

1 2d dx x

Time-delay

Estimator

The PAM system

Control law

Robust

control signal

Adaptation

law

Nominal

control signal

Transfer

Coordinate Integral Terminal

Sliding Surface

Terminal

Sliding Surface

Reference

Trajectory

d

The proposed

control approach

u

1 2x x

su

1 2x x

s

1

2

nu

Control design

1 2 3 4 5 6 7 8

20-May-21 | 19Presenter Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

( ) ( ) ( )( )( )( ) ( )

2 0 1 2

0 1

ˆ

d d d

d d

d x t t f x t t x t t

g x t t u t t

= minus minus minus minus

minus minus minus

( ) ( )1

n su t u u g x dminus

= + minus

The control law can be rewritten

where

Proof See more detail in Appendix 31 34-35

Control design

1 1 1 1

2 2 1

de x x

x

= = minus = minus

( )( ) ( )1 2

1

0 1 1 1 0 1 2

1

1 1 1 1 1 2 2 2 2

n du g x x f x x

minus

minus

= minus minus

minus minus minus minus

The auxiliary variables are defined

The nominal control signal can be designed as

1 2d dx x

Time-delay

Estimator

The PAM system

Control law

Robust

control signal

Adaptation

law

Nominal

control signal

Transfer

Coordinate Integral Terminal

Sliding Surface

Terminal

Sliding Surface

Reference

Trajectory

d

The proposed

control approach

u

1 2x x

su

1 2x x

s

1

2

nu

1 2 3 4 5 6 7 8

20-May-21 | 20Presenter Cong Phat Vo

Parameters

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Experimental validation

for ITSMC η1 = 3 η2 = 15 η3 = 25 κ1 = κ3 = 10

κ2 = κ4 = 2 γ1 = γ2 = 57 k1 = 2 k2 = 55 α = 911

for adaptative gain μ = 10 δ = 005 σ = 01

for case study 3 KP = 15 KI = 18 KD = 005

for other case KP = 35 KI = 28 KD = 01

Proposed control

PID control

Fig 31 Performance response in case study 1

Fig 32 Performance response in case study 2

Case study 1 (no load) xd (mm) = 8sin(04πt)

Case study 2 (no load) xd (mm) = 8sin(πt)

Case study 3 (no load) A multistep trajectory

Case study 4 (with load) xd (mm) = 8sin(πt)

in turn 2-kg-load and 7-kg-load

1 2 3 4 5 6 7 8

20-May-21 | 21Presenter Cong Phat Vo

Fig 33 Performance response in case study 3

Fig 34 Performance response in case study 4

Controller PID ITSMC Proposed

Sine 02 Hz ndash 8 mmL2[e] 04102 01737 01574

L2[u] 09961 10270 09602

Sine 1 Hz ndash 8 mmL2[e] 11845 04996 03788

L2[u] 11284 11152 10900

MultistepL2[e] 08587 06846 06368

L2[u] 07670 07333 07247

Sine 1 Hz ndash 8 mm

with 2-kg-load

L2[e] 12578 08314 04779

L2[u] 11414 11506 11287

Sine 1 Hz ndash 8 mm

with 7-kg-load

L2[e] 15937 10827 05067

L2[u] 11613 11729 11584

Table 31 A quantitative comparison between the tracking performances

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Case study 1 (no load) xd (mm) = 8sin(04πt)

Case study 2 (no load) xd (mm) = 8sin(πt)

Case study 3 (no load) A multistep trajectory

Case study 4 (with load) xd (mm) = 8sin(πt)

in turn 2-kg-load and 7-kg-load

Experimental validation

1 2 3 4 5 6 7 8

20-May-21 | 22Presenter Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Summary

Robust performance against uncertainties and disturbances1 2d dx x

Time-delay

Estimator

The PAM system

Control law

Robust

control signal

Adaptation

law

Nominal

control signal

Transfer

Coordinate Integral Terminal

Sliding Surface

Terminal

Sliding Surface

Reference

Trajectory

d

The proposed

control approach

u

1 2x x

su

1 2x x

s

1

2

nu

Finite-time convergence of the tracking errors

Reducing the chattering phenomenon

Eliminating the reaching phase

Fast transient response

Next works will be directed to the control using the force properties

1 2 3 4 5 6 7 8

20-May-21 | 23Presenter Cong Phat Vo

CONTENTS

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

5

6

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Adaptive Finite-Time Force Sensorless Control scheme4

1 2 3 4 5 6 7 8

20-May-21 | 24Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

3 Verified the reliability and superiority of the proposed control

demonstrated in different challenging work conditions

2

The FITSMC guarantees the finite convergence new adaptive gain and TDE

are deployed to handle online (uncertainties and chattering phenomenon)

The proposed FSOB scheme obtains

force information eliminated noise and

static friction

1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 25Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

( )[ ]

1 20

( ) ( ( )) = + +t

v

f f fe e sign e d

( ) ( )[ 1]5

5

4

ˆ

+

minus= minus

v

signsign

1 1 [ ]

1 2

[ ]

4 5

ˆ( ) ( ) ( )

ˆ ( )

v

c d f f

v

u t d t K e sign e

sign

minus minus= + + +

+ +

1ˆ ˆ ˆ( ) ( ) ( ) ( )minus minus = minus minus minusd d c e dd t d t t u t t K t t

The FITSM surface is defined

The control law can be rewritten

2

2

ˆ ˆ ˆ2ˆ

+ += e e e e e e

e

e

An adaptive switching gain law An estimated lumped disturbance Force observer based-LeD [40]

APAM system

Force based

Time-delay

Estimator

Integral Terminal

Sliding SurfaceControl law

Adaptive law

Environment

Configuration

Reference

Trajectoryu

d

ˆe

d

+

+

The proposed control

approach

2k

Plant + Environment

Force Observer

e

Proof See in Appendix 41 ndash 43

Control design

1 2 3 4 5 6 7 8

20-May-21 | 26Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

for ITSMC η1 = 3 η2 = 15 η3 = 25 κ1 = κ3 = 10

κ2 = κ4 = 2 γ1 = γ2 = 57 k1 = 2 k2 = 55 v = 911

for adaptative gain μ = 10 δ = 005 σ = 01

for case study 3 KP = 15 KI = 18 KD = 005

for other case KP = 35 KI = 28 KD = 01

Proposed control

PID control

Fig 42 Performance response in Scenario 2

Scenario 1 τd (N) = 75sin(02πt- π2)+625

Scenario 2 τd (N) = 75sin(08πt- π2)+625

Scenario 3 A multistep trajectory with a LPFrsquos bandwidth

of 02(rads)

Observersfor FSOB ω = 500 rads ξ = 0001 α1 = 20 α2 = 12

α3 = 5 α4 = 15

Fig 42 Performance response in Scenario 1

Parameters

Experimental validation

1 2 3 4 5 6 7 8

20-May-21 | 27Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Fig 43 Performance response in Scenario 3

Fig 44 Control gain

Controller PID ISMC FITSMC-TDE Proposed

Sine 01 Hz 21050 12152 09440 05158

Sine 04 Hz 32054 18045 20632 19807

Multistep 26508 23548 20618 17685

Table 41 A RMSE comparison between the tracking error performances

Experimental validation

- Improved the control performances (fast response high

accuracy and robustness)

- Guaranteed the finite convergence

- Cancelling the lumped uncertainties via TDE solution and

the switching gain in the FITSMC online

Discussion

1 2 3 4 5 6 7 8

20-May-21 | 28Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

4

6

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Model-free control for Optimal contact force in Unknown Environment5

1 2 3 4 5 6 7 8

20-May-21 | 29Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

3 Verified effective for robot interaction control in

unknown environments

2 The stability of the closed-loop control is theoretically demonstrated

by the Lyapunov approach

The RL method is used to learn on-line

the optimal desired force1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 30Presenter Cong Phat Vo

( ) ( ) ( ( ))= minus + minuse e e eF t C x t K x x t

( ( )) ( )r dZ e t F t= [ ]

1 2

( ) ( ) ( )

( ) ( )

= + +

minus minus minus +

r r r

v

e

Me t Ce t Ke t

sign F t

The force control

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The environment is modeled as

a spring-damper system [81]

The optimal impedance model The position control in task space

( ) ( ) ( )

= + f Pf f If f

t

x s K e t K e d

Task-specific Outer-loopRobot-specific Inner-loop

Precise position

control

Prescribed

Impedance Model

Environment

Configuration

APAM system

xx

d

Force controlx

rx

fe

r

Fe

Fd

ef

-

+ -+ -

+

RL-based

approximation

ϕ

Control design

1 2 3 4 5 6 7 8

( ) ( ) ( )r r de t Ae t BF t= +

20-May-21 | 31Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

LQR solution

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

( ) ( )=d e rF t e t

The discounted cost function

( ) ( )( ) ( ) ( ) ( ) ( ) ( )

= minus +T T

s r d r r d dt

J e t F t e Se F RF d

The Algebraic Riccati equation

0+ + + =T T

eA PA S P A PB

The feedback control gain

( )1

1 minus

minus= minus +T T

e B PB R B PA

IRL approximationRL solution

Implementation procedure

1 2 3 4 5 6 7 8

offline soliton

20-May-21 | 32Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

LQR solution RL solution

impossible for all time

The Hamilton-Jacobi-Bellman equation

( ) ( ) ( )min ( ) ( ) ( ) 0

minus minus + minus =

d

t

s rtF

V e t r t r e dt

The new Algebraic Riccati equation

( )( ) ( )

2 ( ) ( ) ( ) ( ) 0

T T

r r

T T

r d d d

e t A P PA S P e t

e t PBF t F t RF t

+ + minus

+ + =

The optimal desired force 1

22 21( ) ( )d rF t H H e tminus= minus

( ) ( ) ( ) ( ) ( )= = T

r d s rQ e t F t V e t X HX

IRL approximation

The discounted cost function

( ) ( )

( )

( )( ) ( ) ( )

where ( ) ( ) ( ) ( ) ( ) ( )

t

s r r dt

T T

r d r r d d

V e t r e F e d

r e F e Se F RF

minus minus=

= +

Implementation procedure

1 2 3 4 5 6 7 8

( ) ( )=d e rF t e t

online soliton using the off-policy RL to

find the solution of the given new ARE

20-May-21 | 33Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

LQR solution RL solution

minimize temporal difference error

The approximator of the Q-value function

( )( ) ( )1

ˆ N

r d i r d i

i

Q e F e F =

=

The parameter update law

11( ) exp

2

T

i r d r i d i i r i d ie F e c F c e c F c minus = minus minus minus minus minus

( )

( ) ( ) ( ) ( )

ˆ ( ) ( ) ( )1( ) ( )

( )

=

= minus +

r d

t t t t

Q e t F t tt t

t t

IRL approximation

Q-function update law based on IRL

( )

( )

( )( ) ( ) ( )

( ) ( )

t tt

r dt

t

r d

Q e t F t r e d

e Q e t t F t t

+

minus minus

minus

=

+ + +

( ) ( ) ( ) ( ) ( )T T

r r d dr t e t Se t F t RF t= +

The reward function of the utility function

Implementation procedure

1 2 3 4 5 6 7 8

1

( )( )

( )

i r di r d N

i r di

e Fe F

e F

=

=

( ) ( )=d e rF t e t

20-May-21 | 34Presenter Cong Phat Vo

Fig 51 RL in unknown environments during the

learning process

Fig 52 Learning curves

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

κe = 00025 S = 1 R = 10 T = 0001 A = minus4006

B = minus0002

υ = 09 γ = 09 δΔ = 065 ε = 01 ε-decay = 099

K = 10 β = 01

LQR solution

RL aproximation

PositionForce controlKPf = 0001 KIf = 002 λ = 02 ηΔ = 01 ρ1 = 05

ρ2 = 20 v = 911 α = 57

EnvironmentCe = 05 Nsmm-1 Ke = 2 N mm-1 r0 = 165 mm

xd = 39917 mm xe = 35 mm

Parameters

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 35Presenter Cong Phat Vo

Fig 53 Desired forceposition tracking response

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Three steps are required to complete

the experiment

1-the identification of the environment

2-obtain the ARE solution using the

environmental estimation

3-the complete experiment for the

positionforce controller

IRL approximation

Achieving near-optimal performance without

knowledge of the environment dynamics

LQR solution

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 36Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

4

5

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Optimized Human-Robot Interaction force control using IRL6

1 2 3 4 5 6 7 8

20-May-21 | 37Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

3 The unknown human dynamics as well as the optimal overall

human-robot system efficiency is verified in experimental trials

2

Find the optimal parameters of the impedance model to

assure motion tracking and also assist the human with

minimum effort to perform the task in real time

The stability of the closed-loop control is theoretically

demonstrated

1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 38Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

( )1 20

ˆ ( )

ˆ (|| || )

= + + +

+ + minus

t

s h h h

q B h h

K e e e d

K Z Z K

The control torque in task space

1 2

1 2 1

2 1 2

ˆ ˆˆ ( ) ( )

ˆ ˆˆ ˆ ˆ || ||

ˆˆ ˆ ˆ( ) || ||

=

= minus minus = minus

T T

T T T

h h h

T

h h

q W W q

W F F W q kF W

W Gq W kG W

2

h hm

d d d

K

M s C s K

=

+ +

An approximation of the continuous functionThe prescribed

impedance model

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

Control design

1 2 3 4 5 6 7 8

20-May-21 | 39Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

1

1

[ ]

minus

minus

= +

= minus

=

=

h h h h d

h e

h e p d

T T T

d d d

A B e

A k

B k

e e e

( )1

+=

+

p d

e

sG s

k s

The human transfer function [99] Purpose τh rarr ed

0

0

= + = +

= minus

q d qd e

e

h h hh e

A e Be X AX Buu

B A u KX

A new augmented state

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

IRL based on LQR solution

Control design

1 2 3 4 5 6 7 8

20-May-21 | 40Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

The performance index

( )T T T

m d d d h h h e et

J e Q e Q u Ru d

= + +

10 T TA P PA Q PB R BPminus= + + minus

The algebraic Riccati equation [83]

0

0

q d qd

e

h h hh

e

A e BeX u

B A

u KX

= = +

= minus

A new augmented state

IRL solutionRL solution

An optimal feeback control

To minimize

solve

The infinite-horizon quadratic cost

Control design

The IRL Bellman equation for time interval ΔT gt 0

Proof of convergence See in Appendix 62

1 2 3 4 5 6 7 8

( ) 1

( )

arg min ( ) ( )e

T

e eu t

u V t X t u R B PX

minus

= = minus

( )7 7( ( )) ( ) ( ) ( ( ))t T

T T T

tV X t X Q K RK X d V X t T

+

= + minus +

( )7 ( ( )) ( ) ( )T T T T

tV X t X Q K RK X d X PX

= + =

( ) ( ) ( )T TA BK P P A BK Q K RKminus + minus = minus +

where P is the real symmetric positive definite solution

( )( ) ( ) = ( ) ( )

( ) ( )

t tT T T T

t

T

X t PX t X Q K RK X d

X t t PX t t

+

+

+ + +

The IRL Bellman equation can be written as

20-May-21 | 41Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 61 Continuous-time linear policy iteration algorithm

1 Start with an admissible control input

0 0

1= minuseu K X

2 Policy evaluation given a control policy ui find the

Pi using off-policy Bellman equation

1 1

1

+ minus= minusi T i

eu R B P X

3 Policy Improvement update the control policy

Online implementation

Control design

1 2 3 4 5 6 7 8

( )( )( ) ( )= ( ) ( )

( ) ( )

t t TT i T i i T

t

T i

X t P X t X Q K RK X d

X t t P X t t

+

+

+ + +

Begin

1 2

1 2

1

( ) ( )

( ) ( ) ( )

( )

N i i i

Ti i N i

i T

t t t

K K K

p

minus

= = minus +

=

=

Solving for the cost using least squares

1|| ||i ip p minusminus

0

1 10 1 ( ) 0P i K A BK= = minus

Initialization

1 1i T i

eu R B P Xminus minus= minus

Policy update

End

No

Yes

1i irarr +

20-May-21 | 42Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 62 The tracking error of the trajectory of the robot system and

the prescribed impedance model

Fig 63 Reference trajectory versus the state of the robot systems

during and after learning

κe = 00025 S = 1 R = 10 T = 0001

A = minus4006 B = minus0002

υ = 09 γ = 09 δΔ = 065 ε = 01

ε-decay = 099 K = 10 β = 01

LQR solution

RL aproximation

Position controlKs = 20 F = 10 G = 1 μ1 = μ2 = 5 k = 01

= 100 κp = 20 κd = 10 and ke = 018 s

θd = 10sin(02πt) Md = 1 Kg Bd = 718

Nsmm-1 Kd =1576 Nmm-1 Kh = 0046

Parameters

Z

Simulation

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 43 Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Experiment: validation results

Fig 6.4 The output trajectory and impedance model in the inner loop

Fig 6.5 The desired trajectory and output trajectory in the outer loop

Fig 6.6 Interaction force

Discussion
- Excellent performance of the human-robot cooperation
- A small deviation between the desired and actual trajectories
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller

20-May-21 | 44 Presenter: Cong Phat Vo

CONTENTS
1. Introduction
2. Modeling and platform setup based on the APAM configuration
3. Adaptive Gain ITSMC with TDE scheme
4. Adaptive Finite-Time Force Sensorless Control scheme
5. Model-free control for Optimal contact force in Unknown Environment
6. Optimized Human-Robot Interaction force control using IRL
7. Force Sensorless Reflecting Control for BHTS
8. Conclusion and Future works

20-May-21 | 45 Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Proposed control idea
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed.
2. The stability and finite-time convergence characteristics of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified under different working conditions.

20-May-21 | 46 Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Control design
- The modified FNTSM surface s_i (i = m, s for the master and slave sides) is designed from the position tracking error and its derivative.
- The torque control law is designed in FNTSMC style: it combines the nominal dynamics terms (M_i, C_i, K_i), fast and terminal reaching terms (gains ρ_{1i}, ρ_{2i}, fractional exponent ν_i), a sign(s_i) switching term, and the disturbance estimate τ̂_id.
- An adaptive FSOB gain κ_i is updated by a two-case law conditioned on the sliding variable s_i.
- The external torque estimation τ̂_id on each side is obtained from the adaptive force observer (AFOB).

[Block diagram: Operator → Master controller → APAM master system ↔ communication ↔ Slave controller → APAM slave system → Environment, with an estimator on each side and an impedance filter; the exchanged signals are θ_md, θ_m, θ_s, θ_sd and the estimates τ̂_md, τ̂_sd.]
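The finite-time sliding-mode idea behind the torque controller can be illustrated with a generic nonsingular terminal sliding-mode sketch on a double-integrator joint. This is an illustrative stand-in, not the thesis's exact FNTSMC/AFOB law; the gains, exponent, and disturbance below are hypothetical:

```python
import math

def sgn(x):
    # Signum without the 0-division pitfalls of copysign at zero
    return (x > 0) - (x < 0)

# Joint model theta_dd = u + d with unknown bounded disturbance |d| <= 0.5.
# Nonsingular terminal surface: s = e + (1/beta)*|de|^(p/q)*sgn(de), 1 < p/q < 2
p_q, beta, k_sw = 5.0 / 3.0, 1.0, 1.0      # k_sw chosen > max |d|

e, de = 1.0, 0.0                            # regulation to theta_d = 0
dt, steps = 1e-3, 6000
for n in range(steps):
    d = 0.5 * math.sin(n * dt)              # unknown disturbance
    s = e + (1.0 / beta) * abs(de) ** p_q * sgn(de)
    # Equivalent term cancels the surface drift; switching term forces s -> 0,
    # after which e -> 0 in finite time along de = -(beta*e)^(q/p)
    u = (-beta * (1.0 / p_q) * abs(de) ** (2.0 - p_q) * sgn(de)
         - k_sw * sgn(s))
    dde = u + d
    e += de * dt
    de += dde * dt

print(abs(e) < 0.05)   # True: the tracking error has converged near zero
```

In the thesis's scheme, the fixed switching gain k_sw is replaced by the adaptive gain κ_i and the disturbance d is additionally estimated by the AFOB, so the switching term only has to dominate the estimation residual.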

20-May-21 | 47 Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results

Fig 7.1 Torque estimation performances

Parameters
- Proposed control: λi = 8, pi = 5, qi = 7, ςi = 10⁻⁴, νi = 9/11, εi = 0.1, ρ1i = 10, ρ2i = 2
- PID control: KP = 38, KI = 25, KD = 0.15
- Observers: for NDO, Lh = Le = 10; for RTOB, β = 350 rad/s; for AFOB, κi(0) = 0.1, λi = 10, ηi = 0, m = 0.1, a = 10

Scenarios
- The free-space situation: θd(t) = 5π(1 − 0.18t)sin(0.72πt)
- The contact and recovery situation: soft environment Ke = 500 N·mm⁻¹, Be = 4 N·s·mm⁻¹; stiff environment Ke = 2800 N·mm⁻¹, Be = 20 N·s·mm⁻¹

The RMSEs associated with the NDO, the RTOB, and the proposed algorithm are calculated as 0.211, 0.192, and 0.076, respectively.
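The RMSE figures used for the observer comparison are computed in the standard way. A minimal sketch over synthetic error traces (illustrative data only, not the thesis measurements):

```python
import math

def rmse(err):
    # Root-mean-square error over a sampled estimation-error sequence
    return math.sqrt(sum(x * x for x in err) / len(err))

# Synthetic traces standing in for the torque-estimation errors of two observers
coarse = [0.3 * math.sin(0.01 * k) for k in range(1000)]
fine = [0.1 * math.sin(0.01 * k) for k in range(1000)]
print(rmse(fine) < rmse(coarse))   # True: the better observer has the lower RMSE
```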

20-May-21 | 48 Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results

Fig 7.2 Position tracking performances
Fig 7.3 Position tracking error performances
Fig 7.4 Transparency performance in "soft" contact
Fig 7.5 Transparency error in "soft" contact
Fig 7.6 Transparency performance in "stiff" contact
Fig 7.7 Transparency error in "stiff" contact

20-May-21 | 49 Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Summary
- Finite-time convergence of the tracking errors
- Fast transient response
- Elimination of the uncertainties and noise effects in the obtained force sensing via the new AFOB
- Robust performance against uncertainties and disturbances in the system dynamics
- Stability of the overall system and transparency performance are attained simultaneously

Great transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and across the different working conditions.

20-May-21 | 50 Presenter: Cong Phat Vo

CONTENTS
1. Introduction
2. Modeling and platform setup based on the APAM configuration
3. Adaptive Gain ITSMC with TDE scheme
4. Adaptive Finite-Time Force Sensorless Control scheme
5. Model-free control for Optimal contact force in Unknown Environment
6. Optimized Human-Robot Interaction force control using IRL
7. Force Sensorless Reflecting Control for BHTS
8. Conclusion and Future works

20-May-21 | 51 Presenter: Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions (its uncertainties and disturbances are handled).
- Validated the significant effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while achieving highly effective position control.
- Achieved good transparency with both force feedback and position tracking simultaneously by the new adaptive force estimation combined with separate FNTSMC schemes.

20-May-21 | 52 Presenter: Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Future works
- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both the master and slave sides; the computational complexity should also be reduced with respect to the data size.
- Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device for the rehabilitation field with a safer solution.

20-May-21 | 53

A PRESENTATION OF DOCTORAL DISSERTATION

Presenter: Cong Phat Vo
Department of Mechanical Engineering, University of Ulsan, Korea

Thank you for attending!

Page 3: A PRESENTATION OF DOCTORAL DISSERTATION

20-May-21 | 3Presenter Cong Phat Vo

CONTENTS

Modeling and platform setup based on the APAM configuration

Adaptive Gain ITSMC with TDE scheme

Adaptive Finite-Time Force Sensorless Control scheme with TDE

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force using IRL

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

2

7

3

4

5

6

C

H

A

P

T

E

R 7

8

Introduction1

1 2 3 4 5 6 7 8

20-May-21 | 4Presenter Cong Phat Vo

hellip

Performence

Perception

Transparency

1 INTRODUCTION

Overview of the BHTS structure

Slave

control

Master

control

Co

mm

un

icatio

n c

ha

nn

el

Monitor

Operator

Master

Haptic interface

Slave

teleoperator

Remote

environment

Vision

sensor

hf

m

m

x

x

sdw

s

s

x

x

efmdwm

m

x

x

Figure 11 The bilateral haptic teleoperation architecture

1 2 3 4 5 6 7 8

RobustnessThe controllers are required to be robustly stable with respect to a

specified set of uncertainties introduced by the different components

(operator environment communication channel as well as sensors

introduce uncertainties)

20-May-21 | 5Presenter Cong Phat Vo

hellip

Performence

Perception

Transparency

1 INTRODUCTION

Overview of the BHTS structure

Slave

control

Master

control

Co

mm

un

icatio

n c

ha

nn

el

Monitor

Operator

Master

Haptic interface

Slave

teleoperator

Remote

environment

Vision

sensor

hf

m

m

x

x

sdw

s

s

x

x

efmdwm

m

x

x

Figure 11 The bilateral haptic teleoperation architecture

1 2 3 4 5 6 7 8

RobustnessThe main purpose is to provide technical means to successfully perform a

desired task in the environment (achieved a high task performance)

For evaluation physically accessible quantities (task completion time

measurement error applied forces or dissipated energies)

20-May-21 | 6Presenter Cong Phat Vo

hellip

Transparency

1 INTRODUCTION

Overview of the BHTS structure

Slave

control

Master

control

Co

mm

un

icatio

n c

ha

nn

el

Monitor

Operator

Master

Haptic interface

Slave

teleoperator

Remote

environment

Vision

sensor

hf

m

m

x

x

sdw

s

s

x

x

efmdwm

m

x

x

Figure 11 The bilateral haptic teleoperation architecture

1 2 3 4 5 6 7 8

Robustness

PerformenceIt refers to the operatorrsquos feeling of being there Ideally the operator can

distinguish a remote environment that presence is correlated with task

performance in a positive causal way It implies that by improving the feeling

of presence task performance is improved as well

Perception

20-May-21 | 7Presenter Cong Phat Vo

hellip

Performence

Perception

Transparency

1 INTRODUCTION

Overview of the BHTS structure

Slave

control

Master

control

Co

mm

un

icatio

n c

ha

nn

el

Monitor

Operator

Master

Haptic interface

Slave

teleoperator

Remote

environment

Vision

sensor

hf

m

m

x

x

sdw

s

s

x

x

efmdwm

m

x

x

Figure 11 The bilateral haptic teleoperation architecture

1 2 3 4 5 6 7 8

RobustnessCompared to perception it is a quantitative objective Transparency and

robustness are conflicting objectives (trade-off) A measure for

transparency is fidelity It describes the ability to exactly display the

environment to the operator

20-May-21 | 8Presenter Cong Phat Vo

1 INTRODUCTION

Overview of the BHTS structure

Slave

control

Master

control

Co

mm

un

icatio

n c

ha

nn

el

Monitor

Operator

Master

Haptic interface

Slave

teleoperator

Remote

environment

Vision

sensor

hf

m

m

x

x

sdw

s

s

x

x

efmdwm

m

x

x

Figure 11 The bilateral haptic teleoperation architecture

Variable Impedance Control [5 6]

Model-mediated Teleoperation [40]

Movement Estimation [78 79]

Passivity [27] Prediction [39] hellip

Robustness

Performence

Perception

Transparency

Environment condition

Operatorrsquos characteristics

Communication channel

1 2 3 4 5 6 7 8

classify

evaluate

achieve

20-May-21 | 9Presenter Cong Phat Vo

1 INTRODUCTION

PAM actuator

Fig 12 (a) Structure and (b) working principle of the PAM

KNEXO robot [22] Wearable Haptic Device [23]2-joint manipulator [16 17] Forcepsrsquo manipulators [20]

Rubber hose Fitting connectionLoose-weave

fibersUn-Pressured state

Pressured state

Stroke

(a) (b)

Rubber hoseLoose-weave

fibers

Fitting

connection

1 2 3 4 5 6 7 8

20-May-21 | 10Presenter Cong Phat Vo

1 INTRODUCTION

Research Objectives

Propose Position

Tracking Controller

An integral terminal-style SMC is designed with the combination of an adaptive gain and TDE technique

0102

Propose Model-free Control for

slave-environment interaction03

Propose Force Sensorless refleting

control for bilateral haptic teleoperation

Propose Finite-Time

Force Controller

An adaptive gain Fast ITSMC-TDE scheme is established with a friction-free disturbance technique in finite-time

Using IRL to optimize the prescribed impedance model parameters to assist minimum human effort

Propose optimal impedance for

human-master interaction

05

04Applying RL approximation method to learning the optimal contact force (minimal) online

The new AFOB is proposed to estimate the force signalsAchieving great transparency (position force feedback)

1 2 3 4 5 6 7 8

20-May-21 | 11Presenter Cong Phat Vo

CONTENTS

Adaptive Gain ITSMC with TDE scheme

Adaptive Finite-Time Force Sensorless Control scheme with TDE

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction control using IRL

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

3

4

5

6

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

1 2 3 4 5 6 7 8

20-May-21 | 12Presenter Cong Phat Vo

2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION

Mathematical modeling

n turns

nπD

bL

1 t

urn

s

D

2 2

2

3

4

dV dV d b LF P P P

dL dL d n

minus= minus = minus = minus

The tensile force of each PAM [7]

Fig 21 A cross-sectional view of the simplified geometric model

The motion equation of the system

1 2Mx Bx F F+ = minus +

Parameter Description Value UnitPs Supply pressure 65 bar

B Viscous damping coefficient 50 Nsmm

M Mass of the plate 05 Kg

L0 Original length of the PAM 175 mm

n Number of turns of the thread 104

b Thread length 200 mm

Table 21 Parameters of the studied PAM

The motion equation of the system

0 0( ) ( )x f x x g x u d= + +

where f0(x) g0(x) are the nominal value of f(x)

g(x) d is a total of unknown disturbances

( )2 2 2

00 0

2 2

0 0

33 ( ) ( )

2

( ( ) ( )) ( ( ) ( ))

L x bP L x Bxf x g x

n M M n M

d f x x f x x g x g x u

+ minus = minus minus = = minus + minus +

1 2 3 4 5 6 7 8

20-May-21 | 13Presenter Cong Phat Vo

2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION

Experimental modeling

The approximated PAM model based on its

pressure and contraction strain

1 0

1 0

1 0

( ) ( ) ( )

( )

( ) ( 1 2)

( )

k k k k k k k k k k

k k k k k

k k k k k

k k k k k

M b P k P f P

b P b P b

k P k P k k

f P f P f

+ + =

= +

= + = = +

Parameter Value Units

Spring

elements

kk1 11732 Nmbar

kk0 68 Nm

Damping

elements

bk1 962 Nmsbar

bk0 085 Nms

Force

elements

fk1 9122 Nbar

fk0 473 N

0-100

6

0

100

55

200

Fo

rce

[N

]

Contraction length

[mm]

300

400

4

Pressure [bar]

500

103 2 151

Fig 23 Experimental model for certain separate pressure

Table 22 List of APAM model parameters

Load

PAM 1 PAM 2Plate

Linear encoder

Proportional pressure

regulator valves

Card

PCI-1711

Card

PCI-QUAD04

Pressure signal

Control signal

Position signal

Pulley

Fig 22 Block diagram of the experimental system

1 2 3 4 5 6 7 8

20-May-21 | 14Presenter Cong Phat Vo

2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION

Platform setup

Fig 25 The APAM-based (a) master and (b) slave subsystem configuration

Proportional

pressure

regulator

valves

Optical

encoder

Control stick

Loadcells

Pneumatic

Muscle

Actuators Interactive

Loadcell

Pulley

Front viewBack view(a)

Proportional valve

Loadcell DamperSpring

Environment

APAM system

PAMPulley

PositionSignal

ControlSignal

StiffnessDamping

Contact Force Signal

PCPCI Cards

Control box

(b)

1 2 3 4 5 6 7 8

20-May-21 | 15Presenter Cong Phat Vo

2 MODELING OF THE BILATERAL TELEOPERATION SYSTEM

Slave robot

Environment

Master robot

Control stick

Device Description

PAM

Festo MAS-10-200N-AA-MC-O

Nominal length 200 mm

Inside diameter 10 mm

Max pressure 8 bar

PPR ValveFesto VPPM-8L-L-1-G14-0L8H

Max pressure 8 bar

DAQ Card

ADVANTECH PCI-1711

AIAO 12 bits (resolution)

MEASUREMENT PCI-QUAD04

Linear

encoder

Type Rational WTB5-0500MM

Resolution 0005 mm

Table 23 Specifications of the exp devices

Fig 26 Photograph of the experimental apparatus of the overview BHTS

Platform setup

1 2 3 4 5 6 7 8

20-May-21 | 16Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

4

5

6

C

H

A

P

T

E

R

1

7

8

Adaptive Gain ITSMC with TDE scheme for the position control3

Modeling and platform setup based on the APAM configuration2

1 2 3 4 5 6 7 8

20-May-21 | 17Presenter Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Proposed control idea

3The control effectiveness is verified by experimental trials in

different challenging work conditions (compare several controllers)

2

The stability of the controlled system is theoretically analyzed

The first time the AITSMC-TDE scheme

is proposed for a PAM system1

1 2 3 4 5 6 7 8

20-May-21 | 18Presenter Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

2 1 1 2 1s e k e k e= + +

( ) ( ) (( ) ( ) )

0

1

0 1 2 2 1 2

0 1 2 0 1 1

t

t

n d

s t s t k e k e e

f x x g x u x d

minus= minus minus +

+ + minus

( ) ( )( ) ( ) 31

0 1 3 3 4ˆ

su t g x sign

minus = minus + +

( )( ) ( )

( )3 3 0

3

ˆ ˆ1ˆ or

( )mt t

tsign t

= +=

+ =

A terminal sliding surface is chosen

A proposed integral sliding surface is designed

An adaptive law for the robust gain

A robust control signal can be redesigned

Proof See more detail in Appendix 32 33

1 2d dx x

Time-delay

Estimator

The PAM system

Control law

Robust

control signal

Adaptation

law

Nominal

control signal

Transfer

Coordinate Integral Terminal

Sliding Surface

Terminal

Sliding Surface

Reference

Trajectory

d

The proposed

control approach

u

1 2x x

su

1 2x x

s

1

2

nu

Control design

1 2 3 4 5 6 7 8

20-May-21 | 19Presenter Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

( ) ( ) ( )( )( )( ) ( )

2 0 1 2

0 1

ˆ

d d d

d d

d x t t f x t t x t t

g x t t u t t

= minus minus minus minus

minus minus minus

( ) ( )1

n su t u u g x dminus

= + minus

The control law can be rewritten

where

Proof See more detail in Appendix 31 34-35

Control design

1 1 1 1

2 2 1

de x x

x

= = minus = minus

( )( ) ( )1 2

1

0 1 1 1 0 1 2

1

1 1 1 1 1 2 2 2 2

n du g x x f x x

minus

minus

= minus minus

minus minus minus minus

The auxiliary variables are defined

The nominal control signal can be designed as

1 2d dx x

Time-delay

Estimator

The PAM system

Control law

Robust

control signal

Adaptation

law

Nominal

control signal

Transfer

Coordinate Integral Terminal

Sliding Surface

Terminal

Sliding Surface

Reference

Trajectory

d

The proposed

control approach

u

1 2x x

su

1 2x x

s

1

2

nu

1 2 3 4 5 6 7 8

20-May-21 | 20Presenter Cong Phat Vo

Parameters

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Experimental validation

for ITSMC η1 = 3 η2 = 15 η3 = 25 κ1 = κ3 = 10

κ2 = κ4 = 2 γ1 = γ2 = 57 k1 = 2 k2 = 55 α = 911

for adaptative gain μ = 10 δ = 005 σ = 01

for case study 3 KP = 15 KI = 18 KD = 005

for other case KP = 35 KI = 28 KD = 01

Proposed control

PID control

Fig 31 Performance response in case study 1

Fig 32 Performance response in case study 2

Case study 1 (no load) xd (mm) = 8sin(04πt)

Case study 2 (no load) xd (mm) = 8sin(πt)

Case study 3 (no load) A multistep trajectory

Case study 4 (with load) xd (mm) = 8sin(πt)

in turn 2-kg-load and 7-kg-load

1 2 3 4 5 6 7 8

20-May-21 | 21Presenter Cong Phat Vo

Fig 33 Performance response in case study 3

Fig 34 Performance response in case study 4

Controller PID ITSMC Proposed

Sine 02 Hz ndash 8 mmL2[e] 04102 01737 01574

L2[u] 09961 10270 09602

Sine 1 Hz ndash 8 mmL2[e] 11845 04996 03788

L2[u] 11284 11152 10900

MultistepL2[e] 08587 06846 06368

L2[u] 07670 07333 07247

Sine 1 Hz ndash 8 mm

with 2-kg-load

L2[e] 12578 08314 04779

L2[u] 11414 11506 11287

Sine 1 Hz ndash 8 mm

with 7-kg-load

L2[e] 15937 10827 05067

L2[u] 11613 11729 11584

Table 31 A quantitative comparison between the tracking performances

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Case study 1 (no load) xd (mm) = 8sin(04πt)

Case study 2 (no load) xd (mm) = 8sin(πt)

Case study 3 (no load) A multistep trajectory

Case study 4 (with load) xd (mm) = 8sin(πt)

in turn 2-kg-load and 7-kg-load

Experimental validation

1 2 3 4 5 6 7 8

20-May-21 | 22Presenter Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Summary

Robust performance against uncertainties and disturbances1 2d dx x

Time-delay

Estimator

The PAM system

Control law

Robust

control signal

Adaptation

law

Nominal

control signal

Transfer

Coordinate Integral Terminal

Sliding Surface

Terminal

Sliding Surface

Reference

Trajectory

d

The proposed

control approach

u

1 2x x

su

1 2x x

s

1

2

nu

Finite-time convergence of the tracking errors

Reducing the chattering phenomenon

Eliminating the reaching phase

Fast transient response

Next works will be directed to the control using the force properties

1 2 3 4 5 6 7 8

20-May-21 | 23Presenter Cong Phat Vo

CONTENTS

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

5

6

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Adaptive Finite-Time Force Sensorless Control scheme4

1 2 3 4 5 6 7 8

20-May-21 | 24Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

3 Verified the reliability and superiority of the proposed control

demonstrated in different challenging work conditions

2

The FITSMC guarantees the finite convergence new adaptive gain and TDE

are deployed to handle online (uncertainties and chattering phenomenon)

The proposed FSOB scheme obtains

force information eliminated noise and

static friction

1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 25Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

( )[ ]

1 20

( ) ( ( )) = + +t

v

f f fe e sign e d

( ) ( )[ 1]5

5

4

ˆ

+

minus= minus

v

signsign

1 1 [ ]

1 2

[ ]

4 5

ˆ( ) ( ) ( )

ˆ ( )

v

c d f f

v

u t d t K e sign e

sign

minus minus= + + +

+ +

1ˆ ˆ ˆ( ) ( ) ( ) ( )minus minus = minus minus minusd d c e dd t d t t u t t K t t

The FITSM surface is defined

The control law can be rewritten

2

2

ˆ ˆ ˆ2ˆ

+ += e e e e e e

e

e

An adaptive switching gain law An estimated lumped disturbance Force observer based-LeD [40]

APAM system

Force based

Time-delay

Estimator

Integral Terminal

Sliding SurfaceControl law

Adaptive law

Environment

Configuration

Reference

Trajectoryu

d

ˆe

d

+

+

The proposed control

approach

2k

Plant + Environment

Force Observer

e

Proof See in Appendix 41 ndash 43

Control design

1 2 3 4 5 6 7 8

20-May-21 | 26Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

for ITSMC η1 = 3 η2 = 15 η3 = 25 κ1 = κ3 = 10

κ2 = κ4 = 2 γ1 = γ2 = 57 k1 = 2 k2 = 55 v = 911

for adaptative gain μ = 10 δ = 005 σ = 01

for case study 3 KP = 15 KI = 18 KD = 005

for other case KP = 35 KI = 28 KD = 01

Proposed control

PID control

Fig 42 Performance response in Scenario 2

Scenario 1 τd (N) = 75sin(02πt- π2)+625

Scenario 2 τd (N) = 75sin(08πt- π2)+625

Scenario 3 A multistep trajectory with a LPFrsquos bandwidth

of 02(rads)

Observersfor FSOB ω = 500 rads ξ = 0001 α1 = 20 α2 = 12

α3 = 5 α4 = 15

Fig 42 Performance response in Scenario 1

Parameters

Experimental validation

1 2 3 4 5 6 7 8

20-May-21 | 27Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Fig 43 Performance response in Scenario 3

Fig 44 Control gain

Controller PID ISMC FITSMC-TDE Proposed

Sine 01 Hz 21050 12152 09440 05158

Sine 04 Hz 32054 18045 20632 19807

Multistep 26508 23548 20618 17685

Table 41 A RMSE comparison between the tracking error performances

Experimental validation

- Improved the control performances (fast response high

accuracy and robustness)

- Guaranteed the finite convergence

- Cancelling the lumped uncertainties via TDE solution and

the switching gain in the FITSMC online

Discussion

1 2 3 4 5 6 7 8

20-May-21 | 28Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

4

6

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Model-free control for Optimal contact force in Unknown Environment5

1 2 3 4 5 6 7 8

20-May-21 | 29Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

3 Verified effective for robot interaction control in

unknown environments

2 The stability of the closed-loop control is theoretically demonstrated

by the Lyapunov approach

The RL method is used to learn on-line

the optimal desired force1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 30Presenter Cong Phat Vo

( ) ( ) ( ( ))= minus + minuse e e eF t C x t K x x t

( ( )) ( )r dZ e t F t= [ ]

1 2

( ) ( ) ( )

( ) ( )

= + +

minus minus minus +

r r r

v

e

Me t Ce t Ke t

sign F t

The force control

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The environment is modeled as

a spring-damper system [81]

The optimal impedance model The position control in task space

( ) ( ) ( )

= + f Pf f If f

t

x s K e t K e d

Task-specific Outer-loopRobot-specific Inner-loop

Precise position

control

Prescribed

Impedance Model

Environment

Configuration

APAM system

xx

d

Force controlx

rx

fe

r

Fe

Fd

ef

-

+ -+ -

+

RL-based

approximation

ϕ

Control design

1 2 3 4 5 6 7 8

( ) ( ) ( )r r de t Ae t BF t= +

20-May-21 | 31Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

LQR solution

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

( ) ( )=d e rF t e t

The discounted cost function

( ) ( )( ) ( ) ( ) ( ) ( ) ( )

= minus +T T

s r d r r d dt

J e t F t e Se F RF d

The Algebraic Riccati equation

0+ + + =T T

eA PA S P A PB

The feedback control gain

( )1

1 minus

minus= minus +T T

e B PB R B PA

IRL approximationRL solution

Implementation procedure

1 2 3 4 5 6 7 8

offline soliton

20-May-21 | 32Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

LQR solution RL solution

impossible for all time

The Hamilton-Jacobi-Bellman equation

( ) ( ) ( )min ( ) ( ) ( ) 0

minus minus + minus =

d

t

s rtF

V e t r t r e dt

The new Algebraic Riccati equation

( )( ) ( )

2 ( ) ( ) ( ) ( ) 0

T T

r r

T T

r d d d

e t A P PA S P e t

e t PBF t F t RF t

+ + minus

+ + =

The optimal desired force 1

22 21( ) ( )d rF t H H e tminus= minus

( ) ( ) ( ) ( ) ( )= = T

r d s rQ e t F t V e t X HX

IRL approximation

The discounted cost function

( ) ( )

( )

( )( ) ( ) ( )

where ( ) ( ) ( ) ( ) ( ) ( )

t

s r r dt

T T

r d r r d d

V e t r e F e d

r e F e Se F RF

minus minus=

= +

Implementation procedure

1 2 3 4 5 6 7 8

( ) ( )=d e rF t e t

online soliton using the off-policy RL to

find the solution of the given new ARE

20-May-21 | 33Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

LQR solution RL solution

minimize temporal difference error

The approximator of the Q-value function

( )( ) ( )1

ˆ N

r d i r d i

i

Q e F e F =

=

The parameter update law

11( ) exp

2

T

i r d r i d i i r i d ie F e c F c e c F c minus = minus minus minus minus minus

( )

( ) ( ) ( ) ( )

ˆ ( ) ( ) ( )1( ) ( )

( )

=

= minus +

r d

t t t t

Q e t F t tt t

t t

IRL approximation

Q-function update law based on IRL

( )

( )

( )( ) ( ) ( )

( ) ( )

t tt

r dt

t

r d

Q e t F t r e d

e Q e t t F t t

+

minus minus

minus

=

+ + +

( ) ( ) ( ) ( ) ( )T T

r r d dr t e t Se t F t RF t= +

The reward function of the utility function

Implementation procedure

1 2 3 4 5 6 7 8

1

( )( )

( )

i r di r d N

i r di

e Fe F

e F

=

=

( ) ( )=d e rF t e t

20-May-21 | 34Presenter Cong Phat Vo

Fig 51 RL in unknown environments during the

learning process

Fig 52 Learning curves

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

κe = 00025 S = 1 R = 10 T = 0001 A = minus4006

B = minus0002

υ = 09 γ = 09 δΔ = 065 ε = 01 ε-decay = 099

K = 10 β = 01

LQR solution

RL aproximation

PositionForce controlKPf = 0001 KIf = 002 λ = 02 ηΔ = 01 ρ1 = 05

ρ2 = 20 v = 911 α = 57

EnvironmentCe = 05 Nsmm-1 Ke = 2 N mm-1 r0 = 165 mm

xd = 39917 mm xe = 35 mm

Parameters

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 35Presenter Cong Phat Vo

Fig 53 Desired forceposition tracking response

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Three steps are required to complete

the experiment

1-the identification of the environment

2-obtain the ARE solution using the

environmental estimation

3-the complete experiment for the

positionforce controller

IRL approximation

Achieving near-optimal performance without

knowledge of the environment dynamics

LQR solution

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 36Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

4

5

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Optimized Human-Robot Interaction force control using IRL6

1 2 3 4 5 6 7 8

20-May-21 | 37Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

3 The unknown human dynamics as well as the optimal overall

human-robot system efficiency is verified in experimental trials

2

Find the optimal parameters of the impedance model to

assure motion tracking and also assist the human with

minimum effort to perform the task in real time

The stability of the closed-loop control is theoretically

demonstrated

1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 38Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

( )1 20

ˆ ( )

ˆ (|| || )

= + + +

+ + minus

t

s h h h

q B h h

K e e e d

K Z Z K

The control torque in task space

1 2

1 2 1

2 1 2

ˆ ˆˆ ( ) ( )

ˆ ˆˆ ˆ ˆ || ||

ˆˆ ˆ ˆ( ) || ||

=

= minus minus = minus

T T

T T T

h h h

T

h h

q W W q

W F F W q kF W

W Gq W kG W

2

h hm

d d d

K

M s C s K

=

+ +

An approximation of the continuous functionThe prescribed

impedance model

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

Control design

1 2 3 4 5 6 7 8

20-May-21 | 39Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

1

1

[ ]

minus

minus

= +

= minus

=

=

h h h h d

h e

h e p d

T T T

d d d

A B e

A k

B k

e e e

( )1

+=

+

p d

e

sG s

k s

The human transfer function [99] Purpose τh rarr ed

0

0

= + = +

= minus

q d qd e

e

h h hh e

A e Be X AX Buu

B A u KX

A new augmented state

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

IRL based on LQR solution

Control design

1 2 3 4 5 6 7 8

20-May-21 | 40Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design.

The performance index (infinite-horizon quadratic cost) to minimize:
  Jm = ∫_t^∞ ( edᵀ Qd ed + τhᵀ Qh τh + ueᵀ R ue ) dσ

RL solution: solve the algebraic Riccati equation [83]
  0 = AᵀP + PA + Q − PB R⁻¹ BᵀP,
which gives the optimal feedback control
  ue = arg min_{ue} V(t, X(t)) = −R⁻¹ Bᵀ P X.

IRL solution: the value function
  V(X(t)) = ∫_t^∞ Xᵀ(Q + KᵀRK)X dσ = XᵀPX
satisfies, for any time interval ΔT > 0, the IRL Bellman equation
  V(X(t)) = ∫_t^{t+ΔT} Xᵀ(Q + KᵀRK)X dσ + V(X(t + ΔT)),
where P is the real symmetric positive-definite solution of the Lyapunov equation
  (A − BK)ᵀP + P(A − BK) = −(Q + KᵀRK).
The IRL Bellman equation can therefore be written as
  Xᵀ(t) P X(t) = ∫_t^{t+ΔT} Xᵀ(Q + KᵀRK)X dσ + Xᵀ(t + ΔT) P X(t + ΔT).

Proof of convergence: see Appendix 6.2.
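The interval form of the Bellman equation can be checked numerically on a scalar system: solve the Lyapunov equation for a fixed stabilizing gain, then verify that the value at x(t) equals the cost accumulated over [t, t+ΔT] plus the value at x(t+ΔT). A minimal sketch, with illustrative scalars A, B, K, Q, R (not the dissertation's identified values):

```python
import numpy as np

def lyap_P(A, B, K, Q, R):
    # Solve (A-B*K)*P + P*(A-B*K) = -(Q + K*R*K) for scalar dynamics.
    Ac = A - B * K
    return (Q + K * R * K) / (-2.0 * Ac)

def bellman_rhs(A, B, K, Q, R, P, x0, dT, n=20001):
    # Cost accumulated over [0, dT] along the closed loop, plus the value at dT.
    Ac = A - B * K
    t = np.linspace(0.0, dT, n)
    x = x0 * np.exp(Ac * t)                 # closed-loop trajectory
    w = (Q + K * R * K) * x**2              # running cost X'(Q + K'RK)X
    running = np.sum((w[:-1] + w[1:]) / 2.0) * (t[1] - t[0])
    return running + P * x[-1] ** 2         # + V(X(t + dT))

A, B, K, Q, R = -1.0, 1.0, 2.0, 1.0, 1.0    # illustrative scalars
P = lyap_P(A, B, K, Q, R)                   # value kernel: V(x) = P * x**2
lhs = P * 1.0 ** 2                          # V(x(t)) with x(t) = 1
rhs = bellman_rhs(A, B, K, Q, R, P, x0=1.0, dT=0.5)
```

Because P solves the Lyapunov equation, lhs and rhs agree to within the integration error, for any interval length ΔT.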


Online implementation.

Fig. 6.1 Continuous-time linear policy iteration algorithm:
1. Start with an admissible control input ue⁰ = −K¹X
   (initialization: P⁰ = 0, i = 1, K¹ chosen so that A − BK¹ is stable).
2. Policy evaluation: given a control policy uⁱ, find Pⁱ using the off-policy IRL Bellman equation
     Xᵀ(t) Pⁱ X(t) = ∫_t^{t+ΔT} Xᵀ(Q + (Kⁱ)ᵀRKⁱ)X dσ + Xᵀ(t + ΔT) Pⁱ X(t + ΔT),
   solving for the cost kernel pⁱ by least squares over N sampled intervals.
3. Policy improvement: update the control policy
     ue^{i+1} = −R⁻¹ Bᵀ Pⁱ X.
Repeat (i → i + 1) until ||pⁱ − p^{i−1}|| ≤ ε, then end.
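The evaluation/improvement loop above is Kleinman's policy iteration; for a scalar plant each Lyapunov solve is closed-form, so its convergence to the ARE solution can be sketched directly (illustrative scalars, not the slide's model):

```python
import math

def policy_iteration_lqr(A, B, Q, R, K0, iters=30):
    """Kleinman policy iteration for the scalar continuous-time LQR.

    Policy evaluation solves the Lyapunov equation
        2*(A - B*K)*P = -(Q + K*R*K)
    and policy improvement sets K <- (1/R)*B*P.
    K0 must stabilize the system (A - B*K0 < 0).
    """
    K = K0
    for _ in range(iters):
        Ac = A - B * K
        assert Ac < 0, "policy must remain stabilizing"
        P = (Q + K * R * K) / (-2.0 * Ac)   # policy evaluation
        K = B * P / R                       # policy improvement
    return P, K

P, K = policy_iteration_lqr(A=-1.0, B=1.0, Q=1.0, R=1.0, K0=1.0)
# For these numbers the ARE is -2P + 1 - P**2 = 0, so P* = sqrt(2) - 1.
```

The IRL variant on the slide replaces the model-based Lyapunov solve with a least-squares fit of Pⁱ from measured trajectory intervals, but the outer iteration is the same.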


Validation results (simulation).

Fig. 6.2 The tracking error between the trajectory of the robot system and the prescribed impedance model.
Fig. 6.3 Reference trajectory versus the state of the robot system during and after learning.

Parameters:
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002.
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1.
- Position control: Ks = 20, F = 10, G = 1, μ1 = μ2 = 5, k = 0.1, Z̄ = 100, κp = 20, κd = 10, ke = 0.18 s.
- Impedance/human model: θd = 10sin(0.2πt), Md = 1 kg, Bd = 718 N·s·mm⁻¹, Kd = 1576 N·mm⁻¹, Kh = 0.046.


Validation results (experiment).

Fig. 6.4 The output trajectory and the impedance model in the inner loop.
Fig. 6.5 The desired trajectory and the output trajectory in the outer loop.
Fig. 6.6 Interaction force.

Discussion:
- Near-perfect performance of the human-robot cooperation.
- Only a small deviation between the desired and actual trajectories.
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller.


7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Proposed control idea:
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB.
2. The stability and finite-time convergence characteristics of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified under different working conditions.


Control design.

The external torque estimation: the external torque τ̂id on each side (i = m, s) is reconstructed by a first-order observer driven by the joint dynamics, with an adaptive FSOB gain κi switched by a piecewise law on the sliding variable si (one rate while off the surface, another inside a small band), so that no force/torque sensor is needed.

The modified FNTSM surface design:
  si = ėi + λi ζ(ei),
where ζ(·) is a fast terminal attractor with exponent pi/qi.

The torque control design (FNTSMC style):
  τi = Mi θ̈ir + Ci θ̇i + Ki θi − ρ1i |si|^{νi} sign(si) − ρ2i si + τ̂id

[Block diagram of the BHTS: Operator → Master controller → APAM master system, and Environment ← APAM slave system ← Slave controller, each side with its torque estimator; an impedance filter couples the two sides; signals θmd, θm, θs, θsd, τ̂sd, τ̂md, fm, d.]

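The slide's FNTSMC law is only partially recoverable, but the underlying mechanism can be sketched with a textbook nonsingular terminal SMC on a disturbed double integrator (Feng-style surface with the fractional exponent on the velocity; all gains illustrative, not the dissertation's):

```python
import numpy as np

def ntsm_regulate(x0, v0, T=10.0, dt=1e-3):
    """Nonsingular terminal SMC for a disturbed double integrator x'' = u + d.

    Surface: s = x + (1/beta)*|v|**(p/q)*sign(v), with 1 < p/q < 2, which
    avoids the singularity of putting the fractional power on the position.
    Generic sketch only, not the dissertation's exact FNTSMC law.
    """
    beta, pq = 1.0, 5.0 / 3.0
    rho, phi = 1.0, 0.01                      # switching gain, boundary layer
    x, v = x0, v0
    for i in range(int(T / dt)):
        t = i * dt
        d = 0.3 * np.sin(2.0 * t)             # bounded matched disturbance
        s = x + (1.0 / beta) * np.sign(v) * abs(v) ** pq
        # equivalent control cancels the v term of s-dot; sat() replaces sign()
        u_eq = -beta * (1.0 / pq) * np.sign(v) * abs(v) ** (2.0 - pq)
        u = u_eq - rho * np.clip(s / phi, -1.0, 1.0)
        v += (u + d) * dt
        x += v * dt
    return x, v

x_end, v_end = ntsm_regulate(1.0, 0.0)
```

After the reaching phase, the state slides on s = 0 and converges to a small neighborhood of the origin despite the disturbance.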


Validation results.

Fig. 7.1 Torque estimation performances.

Parameters:
- Proposed control: λi = 8, pi = 5, qi = 7, ςi = 10⁻⁴, νi = 9/11, εi = 0.1, ρ1i = 10, ρ2i = 2.
- PID control: KP = 38, KI = 25, KD = 0.15.
- Observers: for the NDO, Lh = Le = 10; for the RTOB, β = 350 rad/s; for the AFOB, κi(0) = 0.1, λi = 10, ηi = 0, m = 0.1, a = 10.

Scenarios:
- Free space: θd(t) = 5π(1 − 0.18t)sin(0.72πt).
- Contact and recovery: soft environment Ke = 500 N·mm⁻¹, Be = 4 N·s·mm⁻¹; stiff environment Ke = 2800 N·mm⁻¹, Be = 20 N·s·mm⁻¹.

The RMSEs associated with the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.


Validation results.

Fig. 7.2 Position tracking performances.
Fig. 7.2 Position tracking error performances.
Fig. 7.2 Transparency performance in "soft" contact.
Fig. 7.2 Transparency error in "soft" contact.
Fig. 7.2 Transparency performance in "stiff" contact.
Fig. 7.2 Transparency error in "stiff" contact.


Summary:
- Finite-time convergence of the tracking errors.
- Elimination of uncertainties and noise effects in the obtained force sensing via the new AFOB.
- Robust performance against uncertainties and disturbances in the system dynamics.
- Stability of the overall system and transparency performance attained simultaneously.
- Fast transient response.

Great transparency performance and global stability of the BHTS are achieved under the proposed control scheme, despite the uncertainties and across different working conditions.


8 CONCLUSION AND FUTURE WORKS

Conclusions:
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions (uncertainties and disturbances are handled).
- Validated the significant effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding highly effective position control.
- Achieved good transparency with both force feedback and position tracking simultaneously, through the new adaptive force estimation combined with separate FNTSMC schemes.

Future works:
- To improve control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both master and slave sides; the computational complexity should also be reduced with the data size.
- Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device for the rehabilitation field with a safer solution.

A PRESENTATION OF DOCTORAL DISSERTATION

Presenter: Cong Phat Vo
Department of Mechanical Engineering, University of Ulsan, Korea

Thank you for attending

Page 4: A PRESENTATION OF DOCTORAL DISSERTATION

1 INTRODUCTION

Overview of the BHTS structure.

[Figure 1.1 The bilateral haptic teleoperation architecture: Operator ↔ master haptic interface ↔ communication channel ↔ slave teleoperator ↔ remote environment, with master and slave controllers, a monitor, and a vision sensor; signals fh, xm, wsd, xs, fe, wmd.]

Evaluation criteria: Performance, Perception, Transparency, Robustness.

Robustness: the controllers are required to be robustly stable with respect to a specified set of uncertainties introduced by the different components (the operator, the environment, the communication channel, and the sensors all introduce uncertainties).

Performance: the main purpose is to provide the technical means to successfully perform a desired task in the environment (achieving high task performance). For evaluation, physically accessible quantities are used (task completion time, measurement error, applied forces, or dissipated energies).

Perception: it refers to the operator's feeling of being there. Ideally, the operator cannot distinguish the remote environment from a local one; presence is correlated with task performance in a positive, causal way. This implies that by improving the feeling of presence, task performance is improved as well.

Transparency: compared to perception, it is a quantitative objective. Transparency and robustness are conflicting objectives (a trade-off). A measure for transparency is fidelity, which describes the ability to exactly display the environment to the operator.

Approaches in the literature are classified by the environment condition, the operator's characteristics, and the communication channel; evaluated against robustness, performance, perception, and transparency; and achieved through Variable Impedance Control [5, 6], Model-mediated Teleoperation [40], Movement Estimation [78, 79], Passivity [27], Prediction [39], and others.

PAM actuator.

Fig. 1.2 (a) Structure and (b) working principle of the PAM: a rubber hose wrapped in loose-weave fibers with fitting connections; pressurizing the hose contracts the actuator by a stroke.

Applications: 2-joint manipulator [16, 17], forceps manipulators [20], KNEXO robot [22], wearable haptic device [23].

Research Objectives:
01 Propose a position tracking controller: an integral terminal-style SMC designed with the combination of an adaptive gain and the TDE technique.
02 Propose a finite-time force controller: an adaptive-gain fast ITSMC-TDE scheme established with a friction-free disturbance technique in finite time.
03 Propose model-free control for the slave-environment interaction: applying an RL approximation method to learn the (minimal) optimal contact force online.
04 Propose an optimal impedance for the human-master interaction: using IRL to optimize the prescribed impedance model parameters, assisting with minimum human effort.
05 Propose force-sensorless reflecting control for bilateral haptic teleoperation: a new AFOB is proposed to estimate the force signals, achieving great transparency (position and force feedback).


2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION

Mathematical modeling.

Fig. 2.1 A cross-sectional view of the simplified geometric model (a thread of length b wound in n turns, nπD, around a hose of diameter D and length L).

The tensile force of each PAM [7]:
  F = −P dV/dL = P (3L² − b²) / (4πn²)

The motion equation of the system:
  M ẍ + B ẋ = −F₁ + F₂

Table 2.1 Parameters of the studied PAM
  Ps  Supply pressure                6.5  bar
  B   Viscous damping coefficient    50   N·s/mm
  M   Mass of the plate              0.5  kg
  L0  Original length of the PAM     175  mm
  n   Number of turns of the thread  104
  b   Thread length                  200  mm

The motion equation in state-space form:
  ẋ = f₀(x) + g₀(x)u + d,
where f₀(x), g₀(x) are the nominal values of f(x), g(x), with
  f₀(x) = [3(L₀ − x₁)² − b²] P₀ / (4πn²M) − B x₂ / M,
  g₀(x) = 3(L₀ − x₁)² / (4πn²M),
and d is the total of unknown disturbances,
  d = (f(x) − f₀(x)) + (g(x) − g₀(x))u.
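The static McKibben-muscle force law on this slide can be wrapped as a small helper; the pressure and geometry values below are illustrative, loosely based on Table 2.1, not measured data:

```python
import math

def pam_tensile_force(P, L, b, n):
    """Static McKibben-muscle tensile force F = P*(3*L**2 - b**2)/(4*pi*n**2).

    P: gauge pressure [Pa], L: current muscle length [m],
    b: thread length [m], n: number of turns of the thread.
    """
    return P * (3.0 * L**2 - b**2) / (4.0 * math.pi * n**2)

# At constant pressure the force falls as the muscle contracts (L shrinks),
# reaching zero at the theoretical full-contraction length L = b/sqrt(3).
F_long = pam_tensile_force(P=5e5, L=0.195, b=0.200, n=1.04)
F_short = pam_tensile_force(P=5e5, L=0.180, b=0.200, n=1.04)
F_zero = pam_tensile_force(P=5e5, L=0.200 / math.sqrt(3), b=0.200, n=1.04)
```

This monotone force-contraction characteristic is what the antagonistic pair in the motion equation (−F₁ + F₂) exploits to drive the plate both ways.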

Experimental modeling.

The approximated PAM model, based on its pressure and contraction strain (k = 1, 2 for the two muscles):
  Mk ε̈k + bk(Pk) ε̇k + kk(Pk) εk = fk(Pk),
  bk(Pk) = bk1 Pk + bk0,
  kk(Pk) = kk1 Pk + kk0,
  fk(Pk) = fk1 Pk + fk0.

Table 2.2 List of APAM model parameters
  Spring elements:   kk1 = 11732 N/m·bar,  kk0 = 68 N/m
  Damping elements:  bk1 = 962 N·s/m·bar,  bk0 = 0.85 N·s/m
  Force elements:    fk1 = 9122 N/bar,     fk0 = 473 N

Fig. 2.2 Block diagram of the experimental system: two PAMs drive a plate over a pulley against a load; a linear encoder returns the position signal through a PCI-QUAD04 card, and a PCI-1711 card sends control signals to the proportional pressure regulator valves.

Fig. 2.3 Experimental model for certain separate pressures (force [N] versus contraction length [mm] at pressures from 1 to 6 bar).
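The affine-in-pressure structure of the experimental model is easy to exercise in simulation. The coefficients below are placeholders chosen only to illustrate the structure (the extraction of Table 2.2 leaves the decimal placement uncertain), not the dissertation's identified numbers:

```python
def pam_step_response(P_bar, T=2.0, dt=1e-4):
    """Simulate M*eps'' + b(P)*eps' + k(P)*eps = f(P) at constant pressure.

    All coefficients are illustrative placeholders in SI-like units.
    Returns the settled contraction, which should approach f(P)/k(P).
    """
    M = 0.5                                   # moving-plate mass [kg]
    b = 9.62 * P_bar + 0.85                   # damping, affine in pressure
    k = 1173.2 * P_bar + 68.0                 # stiffness, affine in pressure
    f = 91.22 * P_bar + 4.73                  # force term, affine in pressure
    eps, deps = 0.0, 0.0
    for _ in range(int(T / dt)):
        ddeps = (f - b * deps - k * eps) / M  # second-order dynamics
        deps += ddeps * dt
        eps += deps * dt
    return eps

eps_inf = pam_step_response(P_bar=4.0)
expected = (91.22 * 4.0 + 4.73) / (1173.2 * 4.0 + 68.0)   # static gain f/k
```

Because both the restoring stiffness and the force term grow with pressure, the settled contraction is the ratio f(P)/k(P), which is how the linear model reproduces the pressure-contraction curves of Fig. 2.3.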

Platform setup.

Fig. 2.5 The APAM-based (a) master and (b) slave subsystem configurations.
(a) Master: a control stick on a pulley driven by two pneumatic muscle actuators, with proportional pressure regulator valves, an optical encoder, loadcells, and an interactive loadcell (front and back views).
(b) Slave: an APAM system whose pulley-driven PAM acts against a spring-damper environment instrumented with a loadcell; position, control, stiffness/damping, and contact force signals are exchanged with the PC through PCI cards and a control box.

Platform setup.

Fig. 2.6 Photograph of the experimental apparatus of the overall BHTS: master robot with control stick, slave robot, and environment.

Table 2.3 Specifications of the experimental devices
  PAM             Festo MAS-10-200N-AA-MC-O; nominal length 200 mm; inside diameter 10 mm; max pressure 8 bar
  PPR valve       Festo VPPM-8L-L-1-G14-0L8H; max pressure 8 bar
  DAQ cards       ADVANTECH PCI-1711 (AI/AO, 12-bit resolution); MEASUREMENT PCI-QUAD04
  Linear encoder  Rational WTB5-0500MM; resolution 0.005 mm


3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Proposed control idea:
1. For the first time, the AITSMC-TDE scheme is proposed for a PAM system.
2. The stability of the controlled system is theoretically analyzed.
3. The control effectiveness is verified by experimental trials in different challenging working conditions (comparison with several controllers).

Control design.

A terminal sliding surface is chosen:
  st = ė1 + k1 e1 + k2 |e1|^α sign(e1)

A proposed integral sliding surface is designed from st, its initial value st(0) (which eliminates the reaching phase), and the nominal dynamics f0(x1, x2) + g0(x1)un − ẍ1d accumulated in an integral term over [0, t].

A robust control signal can be designed as
  us(t) = −g0(x1)⁻¹ [ (η3 + κ̂3) sign(s) + η4 s ]

An adaptive law for the robust gain:
  κ̂̇3 = (1/μ) |s| sign(|s| − δ), with a small bound σ keeping the gain positive.

Proof: see Appendix 3.2, 3.3.

[Block diagram: reference trajectory → transfer coordinate → terminal sliding surface → integral terminal sliding surface → control law (nominal control signal plus robust control signal with adaptation law) → the PAM system, with a time-delay estimator feeding d̂ back.]

Control design.

The auxiliary (error) variables are defined as e1 = x1 − x1d and e2 = ė1 = x2 − ẋ1d.

The nominal control signal can be designed as
  un = g0(x1)⁻¹ [ ẍ1d − f0(x1, x2) − k1 e2 − k2 α |e1|^{α−1} e2 ]

The time-delay estimation of the lumped disturbance:
  d̂ = ẋ2(t − td) − f0(x(t − td)) − g0(x(t − td)) u(t − td)

The control law can then be rewritten as
  u(t) = un + us − g0(x1)⁻¹ d̂

Proof: see Appendix 3.1, 3.4-3.5.
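The TDE idea above—estimate the lumped disturbance from one-sample-old measurements and cancel it—can be sketched on a first-order plant. The plant, nominal model, and gains are illustrative (not the PAM model); the classical stability condition |1 − g/g0| < 1 holds for the numbers used:

```python
import numpy as np

def run_tde_tracking(T=5.0, dt=1e-3):
    """Time-delay estimation on x' = f(x) + g*u + d with a wrong nominal model.

    TDE uses values one sample old:
        d_hat(t) = x'(t-dt) - f0(x(t-dt)) - g0*u(t-dt)
    so d_hat absorbs both the model mismatch (f vs f0, g vs g0) and the
    external disturbance. Returns the final tracking error magnitude.
    """
    g_true, g0 = 2.0, 1.5                     # |1 - g/g0| = 1/3 < 1: stable TDE
    k = 5.0                                   # feedback gain
    x, x_prev, u_prev = 0.0, 0.0, 0.0
    for i in range(int(T / dt)):
        t = i * dt
        xd, dxd = np.sin(t), np.cos(t)        # reference and its rate
        dx_prev = (x - x_prev) / dt           # delayed derivative of x
        d_hat = dx_prev - (-x_prev) - g0 * u_prev   # nominal f0(x) = -x
        u = (dxd - k * (x - xd) - (-x) - d_hat) / g0
        d_ext = 0.5 * np.sin(3.0 * t)         # external disturbance
        dx = -2.0 * x + g_true * u + d_ext    # true f(x) = -2x
        x_prev, x = x, x + dx * dt
        u_prev = u
    return abs(x - np.sin(T))

err = run_tde_tracking()
```

Despite the deliberately wrong f0 and g0, the delayed estimate cancels almost all of the lumped disturbance, leaving only an O(dt) residual.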

Experimental validation.

Parameters:
- Proposed control (ITSMC): η1 = 3, η2 = 15, η3 = 25, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1 = γ2 = 5/7, k1 = 2, k2 = 55, α = 9/11; adaptive gain: μ = 10, δ = 0.05, σ = 0.1.
- PID control: KP = 15, KI = 18, KD = 0.05 for case study 3; KP = 35, KI = 28, KD = 0.1 for the other cases.

Case studies:
1 (no load): xd (mm) = 8sin(0.4πt)
2 (no load): xd (mm) = 8sin(πt)
3 (no load): a multistep trajectory
4 (with load): xd (mm) = 8sin(πt), in turn with a 2-kg load and a 7-kg load

Fig. 3.1 Performance response in case study 1.
Fig. 3.2 Performance response in case study 2.

Fig. 3.3 Performance response in case study 3.
Fig. 3.4 Performance response in case study 4.

Table 3.1 A quantitative comparison between the tracking performances
  Controller                          PID      ITSMC    Proposed
  Sine 0.2 Hz - 8 mm          L2[e]   0.4102   0.1737   0.1574
                              L2[u]   0.9961   1.0270   0.9602
  Sine 1 Hz - 8 mm            L2[e]   1.1845   0.4996   0.3788
                              L2[u]   1.1284   1.1152   1.0900
  Multistep                   L2[e]   0.8587   0.6846   0.6368
                              L2[u]   0.7670   0.7333   0.7247
  Sine 1 Hz - 8 mm, 2-kg load L2[e]   1.2578   0.8314   0.4779
                              L2[u]   1.1414   1.1506   1.1287
  Sine 1 Hz - 8 mm, 7-kg load L2[e]   1.5937   1.0827   0.5067
                              L2[u]   1.1613   1.1729   1.1584
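Table 3.1 summarizes each run by L2 indices of the error and control signals. The slide does not define the metric; a plausible reading is an RMS-type scalar, sketched here for illustration:

```python
import numpy as np

def l2_index(signal, dt):
    """Normalized L2 index: sqrt of the time-averaged squared signal.

    A plausible reading of the L2[e], L2[u] metrics in Table 3.1 (the slide
    does not define them); an RMS-type scalar summary of a whole record.
    """
    signal = np.asarray(signal, dtype=float)
    T = dt * len(signal)
    return np.sqrt(np.sum(signal**2) * dt / T)

t = np.arange(0.0, 10.0, 1e-3)
e = 0.5 * np.sin(2 * np.pi * 1.0 * t)   # a 1 Hz error signal, amplitude 0.5
rms = l2_index(e, 1e-3)                 # RMS of a sinusoid: amplitude/sqrt(2)
```

Such a time-averaged index makes runs of different lengths (sine versus multistep) comparable in a single number.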

Summary:
- Robust performance against uncertainties and disturbances.
- Finite-time convergence of the tracking errors.
- Reduced chattering phenomenon.
- Elimination of the reaching phase.
- Fast transient response.

Next works will be directed to control using the force properties.


4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Proposed control idea:
1. The proposed FSOB scheme obtains force information with noise and static friction eliminated.
2. The FITSMC guarantees finite-time convergence; a new adaptive gain and the TDE are deployed online to handle uncertainties and the chattering phenomenon.
3. The reliability and superiority of the proposed control are demonstrated in different challenging working conditions.

Control design.

The FITSM surface is defined as
  s(t) = ef(t) + ∫0^t [ η1 ef + η2 |ef|^v sign(ef) ] dσ

The control law can be written as
  uc(t) = K⁻¹ [ d̂(t) + η1 ef + η2 |ef|^v sign(ef) + (η4 + κ̂5) |s|^v sign(s) ]

An estimated lumped disturbance d̂(t) is obtained by TDE from the one-sample-delayed control input uc(t − td) and the delayed estimated force dynamics.

An adaptive switching gain law drives κ̂5 with |s| (growing while off the surface, shrinking inside a small band) to suppress chattering online.

Force observer based on LeD [40] (second-order tracking filter):
  τ̂̈e + 2ξω τ̂̇e + ω² τ̂e = ω² τe

Proof: see Appendix 4.1-4.3.

[Block diagram: reference trajectory and environment configuration → integral terminal sliding surface → control law with adaptive law and force-based time-delay estimator → APAM system; a force observer over the plant plus environment returns τ̂e.]
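The second-order tracking-filter core of such a force observer is easy to exercise. Sketch below: ω matches the slide's 500 rad/s, but ξ is raised to 0.7 for a well-damped demo (the slide lists ξ = 0.001); the step input is made up:

```python
import numpy as np

def force_observer(tau_e, dt, omega=500.0, xi=0.7):
    """Second-order tracking filter:
        tau_hat'' + 2*xi*omega*tau_hat' + omega**2*tau_hat = omega**2*tau_e
    the low-pass core of a sensorless force observer; it tracks the true
    force while attenuating high-frequency noise above omega.
    """
    tau_hat, dtau_hat = 0.0, 0.0
    out = np.empty_like(tau_e)
    for i, target in enumerate(tau_e):
        ddtau = omega**2 * (target - tau_hat) - 2.0 * xi * omega * dtau_hat
        dtau_hat += ddtau * dt
        tau_hat += dtau_hat * dt
        out[i] = tau_hat
    return out

dt = 1e-4
t = np.arange(0.0, 1.0, dt)
true_force = np.where(t > 0.2, 5.0, 0.0)    # a 5 N step at t = 0.2 s
est = force_observer(true_force, dt)
```

With ω = 500 rad/s the estimate settles within roughly 4/(ξω) seconds of the step, which is why the observer can feed the finite-time force loop without a physical sensor.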

Experimental validation.

Parameters:
- Proposed control (FITSMC): η1 = 3, η2 = 15, η3 = 25, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1 = γ2 = 5/7, k1 = 2, k2 = 55, v = 9/11; adaptive gain: μ = 10, δ = 0.05, σ = 0.1.
- PID control: KP = 15, KI = 18, KD = 0.05 for scenario 3; KP = 35, KI = 28, KD = 0.1 for the other scenarios.
- Observers: for the FSOB, ω = 500 rad/s, ξ = 0.001, α1 = 20, α2 = 12, α3 = 5, α4 = 15.

Scenarios:
1: τd (N) = 7.5sin(0.2πt − π/2) + 6.25
2: τd (N) = 7.5sin(0.8πt − π/2) + 6.25
3: a multistep trajectory with an LPF bandwidth of 0.2 rad/s

Fig. 4.1 Performance response in Scenario 1.
Fig. 4.2 Performance response in Scenario 2.

Fig. 4.3 Performance response in Scenario 3.
Fig. 4.4 Control gain.

Table 4.1 An RMSE comparison between the tracking error performances
  Controller    PID      ISMC     FITSMC-TDE   Proposed
  Sine 0.1 Hz   2.1050   1.2152   0.9440       0.5158
  Sine 0.4 Hz   3.2054   1.8045   2.0632       1.9807
  Multistep     2.6508   2.3548   2.0618       1.7685

Discussion:
- Improved control performance (fast response, high accuracy, and robustness).
- Guaranteed finite-time convergence.
- Lumped uncertainties cancelled online via the TDE solution and the switching gain in the FITSMC.


1 2 3 4 5 6 7 8

20-May-21 | 29Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

3 Verified effective for robot interaction control in

unknown environments

2 The stability of the closed-loop control is theoretically demonstrated

by the Lyapunov approach

The RL method is used to learn on-line

the optimal desired force1

Proposed control idea

1 2 3 4 5 6 7 8

Control design: robot-specific inner loop and task-specific outer loop.

The environment is modeled as a spring-damper system [81]:
  Fe(t) = −Ce ẋ(t) + Ke (xe − x(t))

The optimal impedance model maps the position error to the desired force, Z(er(t)) = Fd(t), with error dynamics ėr(t) = A er(t) + B Fd(t).

The force control shapes the error dynamics M ër + C ėr + K er through a terminal-SMC-style law with a |·|^v sign(·) robust term acting on the contact force error eF.

The position control in task space (PI action on the force error ef):
  xf(s) = KPf ef(t) + KIf ∫0^t ef dσ

[Block diagram: the prescribed impedance model and an RL-based approximation in the outer loop generate xr from Fd and eF; the force control and the precise position control drive the APAM system against the environment configuration; signals xd, xr, x, ef, Fe, Fd, φ.]
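The outer PI force-to-position loop against an elastic environment can be sketched end-to-end; all gains, the environment stiffness, and the assumption of an ideal inner position loop are illustrative, not the dissertation's setup:

```python
def pi_force_tracking(F_d=5.0, T=5.0, dt=1e-3):
    """Regulate the contact force against an elastic wall via a PI
    force-to-position law x = x_e + K_Pf*e_f + K_If*integral(e_f),
    assuming the inner loop tracks the commanded position instantly.
    Illustrative numbers only. Returns the settled contact force.
    """
    K_e, x_e = 2.0, 35.0                      # wall stiffness and location
    K_Pf, K_If = 0.05, 2.0                    # PI gains on the force error
    integ, F_e = 0.0, 0.0
    for _ in range(int(T / dt)):
        e_f = F_d - F_e                       # force tracking error
        integ += e_f * dt
        x = x_e + K_Pf * e_f + K_If * integ   # commanded = actual position
        F_e = K_e * max(x - x_e, 0.0)         # elastic contact force
    return F_e

F_settled = pi_force_tracking()
```

The integral action removes the steady-state force error, so the contact force converges to the desired value without knowing Ke; what the RL layer on the next slides adds is the choice of that desired force itself.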

Implementation procedure.

The main objective of this work is to determine the optimal desired force Fd: the minimum force that minimizes the position error er.

LQR solution (offline):
The feedback law Fd(t) = κe er(t) minimizes the discounted cost function
  Js(er(t), Fd(t)) = ∫_t^∞ ( erᵀ S er + Fdᵀ R Fd ) dσ.
The (discrete-time) algebraic Riccati equation
  0 = AᵀPA + S − P + AᵀPB κe
gives the feedback control gain
  κe = −(BᵀPB + R)⁻¹ BᵀPA.
This is an offline solution: it requires the environment model (A, B).
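The offline gain can be obtained by iterating the Riccati difference equation to its fixed point. The slide identifies the environment's A and B; the scalars below are friendlier illustrative numbers so the iteration converges quickly:

```python
def dlqr_gain(A, B, S, R, iters=500):
    """Scalar discrete-time LQR: iterate the Riccati difference equation
        P <- S + A*P*A - (A*P*B)**2 / (B*P*B + R)
    to its fixed point, then return the feedback gain
        kappa_e = -(B*P*B + R)**-1 * B*P*A.
    """
    P = S
    for _ in range(iters):
        P = S + A * P * A - (A * P * B) ** 2 / (B * P * B + R)
    kappa = -(B * P * A) / (B * P * B + R)
    return P, kappa

P, kappa = dlqr_gain(A=0.95, B=0.1, S=1.0, R=1.0)   # illustrative scalars
residual = 1.0 + 0.95 * P * 0.95 - (0.95 * P * 0.1) ** 2 / (0.1 * P * 0.1 + 1.0) - P
```

At the fixed point the residual of the Riccati equation vanishes and the closed loop A + B·κe is stable; the RL/IRL machinery on the next slides reaches the same gain without knowing A and B.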

Implementation procedure.

RL solution (online): use off-policy RL to find the solution of the new ARE, since solving the LQR offline is impossible when the environment model is unknown.

The discounted cost function (value):
  Vs(er(t)) = ∫_t^∞ e^{−γ(σ−t)} r(er(σ), Fd(σ)) dσ,
  where r(er, Fd) = erᵀ S er + Fdᵀ R Fd.

The Hamilton-Jacobi-Bellman equation:
  min_{Fd} ( r(t) − γ Vs(er(t)) + V̇s(er(t)) ) = 0

The new algebraic Riccati equation:
  erᵀ(t) ( AᵀP + PA + S − γP ) er(t) + 2 erᵀ(t) P B Fd(t) + Fdᵀ(t) R Fd(t) = 0

With the quadratic form Q(er(t), Fd(t)) = Vs(er(t)) = Xᵀ H X, the optimal desired force is
  Fd(t) = −H22⁻¹ H21 er(t).

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

LQR solution RL solution

minimize temporal difference error

The approximator of the Q-value function

( )( ) ( )1

ˆ N

r d i r d i

i

Q e F e F =

=

The parameter update law

11( ) exp

2

T

i r d r i d i i r i d ie F e c F c e c F c minus = minus minus minus minus minus

( )

( ) ( ) ( ) ( )

ˆ ( ) ( ) ( )1( ) ( )

( )

=

= minus +

r d

t t t t

Q e t F t tt t

t t

IRL approximation

Q-function update law based on IRL

( )

( )

( )( ) ( ) ( )

( ) ( )

t tt

r dt

t

r d

Q e t F t r e d

e Q e t t F t t

+

minus minus

minus

=

+ + +

( ) ( ) ( ) ( ) ( )T T

r r d dr t e t Se t F t RF t= +

The reward function of the utility function

Implementation procedure

1 2 3 4 5 6 7 8

1

( )( )

( )

i r di r d N

i r di

e Fe F

e F

=

=

( ) ( )=d e rF t e t
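The normalized-Gaussian function approximator can be sketched with a stripped-down stand-in for the TD update: an LMS regression of the quadratic utility onto the features. The center grid, width, and learning rate are made up (the slide gives none of them):

```python
import numpy as np

rng = np.random.default_rng(0)

def phi(z, centers, width=0.5):
    # Normalized Gaussian features over (e_r, F_d) pairs.
    d2 = np.sum((z - centers) ** 2, axis=1)
    p = np.exp(-0.5 * d2 / width**2)
    return p / np.sum(p)

# 5x5 grid of centers on [-1, 1]^2 (an assumed layout)
g = np.linspace(-1.0, 1.0, 5)
centers = np.array([[a, b] for a in g for b in g])
w = np.zeros(len(centers))
S, R, beta = 1.0, 1.0, 0.2

# LMS fit of the quadratic utility r = S*e_r**2 + R*F_d**2 onto the features;
# the real scheme would instead step on the interval TD error.
for _ in range(30000):
    z = rng.uniform(-1.0, 1.0, size=2)
    target = S * z[0] ** 2 + R * z[1] ** 2
    f = phi(z, centers)
    w += beta * (target - w @ f) * f

z_test = np.array([0.3, -0.2])
q_hat = w @ phi(z_test, centers)
q_true = S * 0.3 ** 2 + R * (-0.2) ** 2     # 0.09 + 0.04 = 0.13
```

Normalizing the features (they sum to one) keeps the linear-in-weights model well conditioned, which is why the slide uses ψ̄i rather than raw Gaussians.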

Validation results (simulation).

Fig. 5.1 RL in unknown environments during the learning process.
Fig. 5.2 Learning curves.

Parameters:
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002.
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1.
- Position/force control: KPf = 0.001, KIf = 0.02, λ = 0.2, ηΔ = 0.1, ρ1 = 0.5, ρ2 = 20, v = 9/11, α = 5/7.
- Environment: Ce = 0.5 N·s·mm⁻¹, Ke = 2 N·mm⁻¹, r0 = 165 mm, xd = 39.917 mm, xe = 35 mm.

Validation results.

Fig. 5.3 Desired force/position tracking response.

Three steps are required to complete the experiment:
1. identification of the environment;
2. obtaining the ARE solution using the environmental estimation (LQR solution);
3. the complete experiment with the position/force controller (IRL approximation).

The IRL approximation achieves near-optimal performance without knowledge of the environment dynamics.


6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Proposed control idea:
1. The stability of the closed-loop control is theoretically demonstrated.
2. Find the optimal parameters of the impedance model that assure motion tracking and assist the human with minimum effort to perform the task in real time.
3. The unknown human dynamics, as well as the optimal overall human-robot system efficiency, are verified in experimental trials.

1 2 3 4 5 6 7 8

20-May-21 | 38Presenter Cong Phat Vo

Control design (robot-specific inner loop, task-specific outer loop)
The control torque in task space: K_s-weighted feedback on the error e_h, its derivative, and its integral ∫₀ᵗ e_h dτ, plus the NN compensation ĥ(q) and a robust bounding term in ‖Ẑ‖.
An approximation of the continuous function: ĥ(q) = Ŵ^T σ(q), with σ-modification update laws of the form Ŵ̇ = F σ(q)(·)^T − k F ‖·‖ Ŵ that keep the weight estimates bounded.
The prescribed impedance model:
θ_m(s) = K_h / (M_d s² + C_d s + K_d) · τ_h(s)
(Block diagram: human operator → prescribed impedance model θ_d → torque control law → APAM-based haptic interface θ_m; an NN functional approximation and an IRL-based optimal law adjust K_h in the outer loop; signals τ_h, e_h, e_d, ξ.)

IRL based on the LQR solution
The human transfer function [99], mapping τ_h → e_d, is a first-order lag:
G(s) = k_p / (1 + k_e s)
so the human-response error obeys
ė_d = A_h e_d + B_h τ_h, with A_h = −k_e⁻¹ and B_h = k_e⁻¹ k_p
A new augmented state X stacks the robot tracking error e_q and the human-response error e_d:
Ẋ = A X + B u_e, with feedback u_e = −K X
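If the human response is indeed a first-order lag of this form (an assumption here; the symbols k_p and k_e follow the slide, but their values below are illustrative), the mapping τ_h → e_d can be checked with a short Euler discretization:

```python
# Euler-discretized human response model: e_d' = (-e_d + k_p * tau_h) / k_e
k_p, k_e = 0.5, 0.18      # illustrative gains (k_e matches the parameters slide)
dt, tau_h = 1e-3, 2.0     # step size and a constant operator torque
e_d = 0.0
for _ in range(20000):    # simulate 20 s, far beyond the time constant k_e
    e_d += dt * (-e_d + k_p * tau_h) / k_e
print(e_d)                # settles near the DC gain k_p * tau_h
```

The steady state equals G(0)·τ_h = k_p·τ_h, which is the sanity check the lag model should pass.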

The performance index (infinite-horizon quadratic cost) to minimize:
J_m = ∫_t^∞ (e_d^T Q_d e_d + τ_h^T Q_h τ_h + u_e^T R u_e) dτ
The algebraic Riccati equation [83]:
0 = A^T P + P A + Q − P B R⁻¹ B^T P
The optimal feedback control:
u_e = arg min_{u_e} V(t, X(t)) = −R⁻¹ B^T P X
The value function:
V(X(t)) = ∫_t^∞ X^T (Q + K^T R K) X dτ = X^T P X
where P is the real symmetric positive-definite solution of
(A − BK)^T P + P(A − BK) = −(Q + K^T R K)
The IRL Bellman equation for a time interval ΔT > 0:
V(X(t)) = ∫_t^{t+ΔT} X^T (Q + K^T R K) X dτ + V(X(t + ΔT))
which can be written as
X^T(t) P X(t) = ∫_t^{t+ΔT} X^T (Q + K^T R K) X dτ + X^T(t+ΔT) P X(t+ΔT)
Proof of convergence: see Appendix 6.2.

Online implementation
Fig. 6.1 Continuous-time linear policy iteration algorithm
1. Start with an admissible control input u_e^0 = −K^1 X.
2. Policy evaluation: given a control policy u^i, find P^i from the off-policy Bellman equation
X^T(t) P^i X(t) = ∫_t^{t+ΔT} X^T (Q + (K^i)^T R K^i) X dτ + X^T(t+ΔT) P^i X(t+ΔT)
solving for the cost parameters p^i by least squares.
3. Policy improvement: update the control policy
u^{i+1} = −R⁻¹ B^T P^i X
Repeat (i → i + 1) until ‖p^i − p^{i−1}‖ ≤ ε.
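A model-based sketch of these three steps is Kleinman's policy iteration; the IRL version reaches the same fixed point without knowing A, by fitting the Bellman equation by least squares instead of solving a Lyapunov equation. The system matrices below are illustrative, not the dissertation's:

```python
import numpy as np

def lyap(Acl, Qbar):
    # Solve Acl^T P + P Acl = -Qbar via Kronecker vectorization (column-major vec)
    n = Acl.shape[0]
    M = np.kron(np.eye(n), Acl.T) + np.kron(Acl.T, np.eye(n))
    P = np.linalg.solve(M, -Qbar.flatten(order="F")).reshape(n, n, order="F")
    return 0.5 * (P + P.T)   # enforce symmetry

A = np.array([[0.0, 1.0], [-1.0, -2.0]])   # open-loop stable, so K0 = 0 is admissible
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.eye(1)

K = np.zeros((1, 2))
for _ in range(20):
    P = lyap(A - B @ K, Q + K.T @ R @ K)   # policy evaluation
    K = np.linalg.solve(R, B.T @ P)        # policy improvement: K = R^{-1} B^T P

# P now solves the ARE: A^T P + P A + Q - P B R^{-1} B^T P = 0
residual = A.T @ P + P @ A + Q - P @ B @ np.linalg.solve(R, B.T @ P)
print(np.abs(residual).max())
```

Starting from any stabilizing gain, the iteration converges monotonically to the unique stabilizing ARE solution, which is exactly the convergence claim of the slide's algorithm.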

Validation results (simulation)
Fig. 6.2 The tracking error between the robot system trajectory and the prescribed impedance model
Fig. 6.3 Reference trajectory versus the state of the robot system during and after learning
Parameters
- LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δ_Δ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: K_s = 20, F = 10, G = 1, μ₁ = μ₂ = 5, k = 0.1, Z = 100, κ_p = 20, κ_d = 10, k_e = 0.18 s
- Impedance model: θ_d = 10 sin(0.2πt), M_d = 1 kg, B_d = 7.18 N·s·mm⁻¹, K_d = 15.76 N·mm⁻¹, K_h = 0.046

Validation results (experiment)
Fig. 6.4 The output trajectory and impedance model in the inner loop
Fig. 6.5 The desired trajectory and output trajectory in the outer loop
Fig. 6.6 Interaction force
Discussion
- Near-perfect performance of the human-robot cooperation.
- Only a small deviation between the desired and actual trajectories.
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller.


7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Proposed control idea
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed.
2. The stability and finite-time convergence of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified under different working conditions.

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

( )1 2id = minus +

1 1

2

= minus + = minus +

id

T

i i id i

T

i ia = minus

The external torque estimation

[ ]

1 2ˆ( ) = + + minus minus minus +iv

i i ir i i i i i i i i i i idM C K sign s s s

( )i i i is = +

[ ]

2

1 2

if 0 or 0 and ( )

( ) if 0 and

i

i i i i i i

i i

i i i i i i i i

s s

z z sign s

= =

+

An adaptive FSOB gain

where

where

The torque control design by style-FNTSMC

The modified FNTSM surface design

Master Slave

Environment

APAM Slave

system

Slave

Controller

Estimator

Oper

ator

Master

Controller

APAM Master

system

Estimator

Impendence

filter

+

++

-

+

-

+

+md

m s

s

sd

ˆsdˆmd

mf

m

d

Control design

1 2 3 4 5 6 7 8
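The AFOB's core idea, a first-order torque observer whose gain adapts with the innovation, can be sketched generically. Everything here, including the adaptation rule and all values, is an illustrative assumption, not the dissertation's law:

```python
# Generic first-order torque observer: tau_hat' = kappa * (tau_meas - tau_hat),
# with the gain kappa growing on a large innovation (a simple adaptive rule).
dt = 1e-3
tau_true = 4.0                      # constant external torque to recover
kappa, kappa_max = 0.1, 50.0        # initial and capped observer gain
tau_hat = 0.0
for _ in range(5000):               # 5 s of simulated estimation
    innov = tau_true - tau_hat
    kappa = min(kappa_max, kappa + 10.0 * abs(innov) * dt)   # adapt the gain
    tau_hat += dt * kappa * innov
print(round(tau_hat, 3))
```

As the gain grows, the estimation time constant shrinks, so the estimate locks onto the external torque quickly, which is the qualitative behavior the slide claims for the adaptive gain.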

Validation results
Fig. 7.1 Torque estimation performance
Parameters
- Proposed control: λ_i = 8, p_i = 5, q_i = 7, ς_i = 10⁻⁴, ν_i = 9/11, ε_i = 0.1, ρ_{1i} = 10, ρ_{2i} = 2
- PID control: K_P = 3.8, K_I = 2.5, K_D = 0.15
- Observers: for NDO, L_h = L_e = 10; for RTOB, β = 350 rad/s; for AFOB, κ_i(0) = 0.1, λ_i = 10, η_i = 0, m = 0.1, a = 10
Scenarios
- Free space: θ_d(t) = 5π(1 − 0.18t) sin(0.72πt)
- Contact and recovery: soft environment K_e = 500 N·mm⁻¹, B_e = 4 N·s·mm⁻¹; stiff environment K_e = 2800 N·mm⁻¹, B_e = 20 N·s·mm⁻¹
RMSEs for NDO, RTOB, and the proposed algorithm: 0.211, 0.192, and 0.076, respectively.

Validation results (Fig. 7.2)
- Position tracking performance and tracking error
- Transparency performance and error in "soft" contact
- Transparency performance and error in "stiff" contact

Summary
- Finite-time convergence of the tracking errors
- Fast transient response
- Robust performance against uncertainties and disturbances in the system dynamics
- The uncertainties and the noise in the obtained force sensing are eliminated via the new AFOB
- Stability of the overall system and the transparency performance are attained simultaneously
Great transparency performance and global stability of the BHTS are achieved under the proposed control scheme, despite the uncertainties and across different working conditions.


8 CONCLUSION AND FUTURE WORKS

Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions (uncertainties and disturbances are handled).
- Validated the effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures excellent position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding highly effective position control.
- Achieved good transparency with both force feedback and position tracking simultaneously, by combining the new adaptive force estimation with separate FNTSMC schemes.

Future works
- To improve control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both the master and slave sides; the computational complexity should also be reduced with the data size.
- Based on the prototype developed in this thesis, a high-quality commercial teleoperation device for the rehabilitation field can be developed with a safer solution.

20-May-21 | 53
A PRESENTATION OF DOCTORAL DISSERTATION
Presenter: Cong Phat Vo
Department of Mechanical Engineering, University of Ulsan, Korea
Thank you for attending!

1 INTRODUCTION
Overview of the BHTS structure
Figure 1.1 The bilateral haptic teleoperation architecture: the operator drives the master haptic interface; master and slave controls exchange position and force/energy signals (x_m, w_m, w_sd, x_s, f_h, f_e) over a communication channel; the slave teleoperator acts on the remote environment, which is observed by a vision sensor and displayed on a monitor.
Four objectives are distinguished: Performance, Perception, Transparency, and Robustness.
Performance: the main purpose is to provide technical means to successfully perform a desired task in the environment (achieve high task performance). It is evaluated by physically accessible quantities (task completion time, measurement error, applied forces, or dissipated energies).

Perception: refers to the operator's feeling of being there. Presence is correlated with task performance in a positive, causal way; it implies that by improving the feeling of presence, task performance is improved as well.

Transparency: compared to perception, it is a quantitative objective. Transparency and robustness are conflicting objectives (a trade-off). A measure for transparency is fidelity, which describes the ability to exactly display the environment to the operator.

These objectives are classified by environment condition, operator's characteristics, and communication channel, and are achieved with methods such as Variable Impedance Control [5, 6], Model-mediated Teleoperation [40], Movement Estimation [78, 79], Passivity [27], Prediction [39], and others.

PAM actuator
Fig. 1.2 (a) Structure and (b) working principle of the PAM: a rubber hose in loose-weave fibers between fitting connections; in the pressurized state the muscle contracts, producing a stroke.
Applications: 2-joint manipulator [16, 17], forceps manipulators [20], KNEXO robot [22], Wearable Haptic Device [23].

Research Objectives
01 Propose a position tracking controller: an integral terminal-style SMC combined with an adaptive gain and the TDE technique.
02 Propose a finite-time force controller: an adaptive-gain fast ITSMC-TDE scheme with a friction-free disturbance technique, in finite time.
03 Propose model-free control for the slave-environment interaction: an RL approximation method learns the optimal (minimal) contact force online.
04 Propose optimal impedance for the human-master interaction: IRL optimizes the prescribed impedance model parameters to assist with minimum human effort.
05 Propose force-sensorless reflecting control for bilateral haptic teleoperation: a new AFOB estimates the force signals, achieving great transparency (position and force feedback).


2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION

Mathematical modeling
Fig. 2.1 A cross-sectional view of the simplified geometric model (rubber hose of length L wrapped by a thread of length b in n turns around diameter D).
The tensile force of each PAM [7]:
F = −P dV/dL = P (3L² − b²) / (4πn²)
The motion equation of the system:
M ẍ + B ẋ = −F₁ + F₂
rewritten in state-space form as
ẋ = f₀(x) + g₀(x) u + d
where f₀(x), g₀(x) are the nominal values of f(x), g(x) derived from the force model, and d = (f(x) − f₀(x)) + (g(x) − g₀(x)) u + d̄ lumps the unknown disturbances.
Table 2.1 Parameters of the studied PAM
- P_s, supply pressure: 6.5 bar
- B, viscous damping coefficient: 50 N·s/mm
- M, mass of the plate: 0.5 kg
- L₀, original length of the PAM: 175 mm
- n, number of turns of the thread: 104
- b, thread length: 200 mm
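The tensile-force formula can be checked numerically. A minimal sketch (SI units are assumed here, and the parameter values are simply the Table 2.1 entries used for illustration):

```python
import math

def pam_force(P, L, b, n):
    """Static tensile force of a McKibben PAM: F = P(3L^2 - b^2)/(4*pi*n^2).

    P: gauge pressure [Pa], L: current muscle length [m],
    b: thread length [m], n: number of thread turns.
    """
    return P * (3.0 * L**2 - b**2) / (4.0 * math.pi * n**2)

# Force vanishes at the natural length L = b/sqrt(3) and grows with pressure.
print(pam_force(6.5e5, 0.175, 0.2, 104))
```

The zero-force length b/√3 and the monotone growth with pressure are the two qualitative features of the model worth verifying before fitting it to the experimental data of the next slide.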

Experimental modeling
The approximated PAM model, based on its pressure and contraction strain:
M ẍ_k + b_k(P_k) ẋ_k + k_k(P_k) x_k = f_k(P_k), k = 1, 2
with pressure-affine coefficients
b_k(P_k) = b_{k1} P_k + b_{k0}, k_k(P_k) = k_{k1} P_k + k_{k0}, f_k(P_k) = f_{k1} P_k + f_{k0}
Table 2.2 List of APAM model parameters
- Spring elements: k_{k1} = 117.32 N·m⁻¹·bar⁻¹, k_{k0} = 6.8 N·m⁻¹
- Damping elements: b_{k1} = 9.62 N·s·m⁻¹·bar⁻¹, b_{k0} = 0.85 N·s·m⁻¹
- Force elements: f_{k1} = 91.22 N·bar⁻¹, f_{k0} = 47.3 N
Fig. 2.3 Experimental model at separate pressures (force [N], up to 500 N, versus contraction length [mm] and pressure [bar])
Fig. 2.2 Block diagram of the experimental system: two PAMs on a pulley drive the plate and load; proportional pressure regulator valves receive the control signal through a PCI-1711 card, and a linear encoder returns the position signal through a PCI-QUAD04 card.

Platform setup
Fig. 2.5 The APAM-based (a) master and (b) slave subsystem configurations: (a) control stick, loadcells, optical encoder, interactive loadcell, pneumatic muscle actuators, pulley, and proportional pressure regulator valves (front and back views); (b) the APAM slave system with pulley, PAM, loadcell, spring-damper environment, proportional valve, and a control box with PC/PCI cards handling the position, control, stiffness/damping, and contact force signals.
Fig. 2.6 Photograph of the experimental apparatus of the overall BHTS (master robot with control stick, slave robot with environment).
Table 2.3 Specifications of the experimental devices
- PAM: Festo MAS-10-200N-AA-MC-O; nominal length 200 mm, inside diameter 10 mm, max pressure 8 bar
- PPR valve: Festo VPPM-8L-L-1-G14-0L8H; max pressure 8 bar
- DAQ cards: ADVANTECH PCI-1711 (AI/AO, 12-bit resolution) and MEASUREMENT PCI-QUAD04
- Linear encoder: Rational WTB5-0500MM; resolution 0.005 mm


3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Proposed control idea
1. For the first time, the AITSMC-TDE scheme is proposed for a PAM system.
2. The stability of the controlled system is theoretically analyzed.
3. The control effectiveness is verified by experimental trials in different challenging working conditions (comparison against several controllers).

Control design
- A terminal sliding surface is chosen: s₂ = ė₁ + k₁ e₁ + k₂ |e₁|^α sign(e₁).
- A proposed integral sliding surface is designed: s(t) = s₂(t) − s₂(t₀) − ∫_{t₀}^t (nominal surface dynamics built from f₀, g₀, uₙ, and ẋ_{2d}) dτ, so that s(t₀) = 0 and the reaching phase is eliminated.
- A robust control signal: u_s(t) = −g₀⁻¹(x₁) (η₃ s + (μ̂ + κ₃|s|^{γ₁} + κ₄|s|^{γ₂}) sign(s)).
- An adaptive law for the robust gain μ̂: the gain increases with |s| outside a boundary layer δ and leaks (σ-modification) inside it.
(Block diagram: reference trajectory → transfer coordinate → terminal and integral terminal sliding surfaces → nominal, robust, and adaptation blocks → control law u → the PAM system, with the time-delay estimator feeding back d̂.)
Proof: see Appendix 3.2, 3.3.

The lumped disturbance is estimated by the TDE from signals delayed by t_d:
d̂(t) = ẋ₂(t − t_d) − f₀(x₁(t − t_d), x₂(t − t_d)) − g₀(x₁(t − t_d)) u(t − t_d)
The auxiliary (error) variables are defined as ε₁ = e₁ = x₁ − x_{1d} and ε₂ = ė₁, and the nominal control signal is designed from the nominal model as
uₙ = g₀⁻¹(x₁) (ẋ_{2d} − f₀(x₁, x₂) − k₁ ε₂ − k₂ d/dt(|ε₁|^α sign(ε₁)))
The control law can be rewritten as
u(t) = uₙ + u_s − g₀⁻¹(x₁) d̂
Proof: see Appendix 3.1, 3.4-3.5.
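A minimal numerical sketch of the TDE idea: with one-sample-delayed measurements, the lumped disturbance is recovered from the nominal model alone. The plant, the nominal dynamics f₀, the gain g₀, the outer-loop gains, and the constant disturbance are all illustrative assumptions, not the dissertation's values:

```python
# TDE: d_hat(t) = x2dot(t - td) - f0(x(t - td)) - g0 * u(t - td), td = one sample
dt = 1e-3
g0 = 2.0
f0 = lambda x1, x2: -5.0 * x1 - 2.0 * x2   # nominal dynamics
d_true = 3.0                               # unknown constant lumped disturbance

x1, x2 = 0.0, 0.0
prev = None            # (x1, x2, u) stored from the previous sample
d_hat, u = 0.0, 0.0
for k in range(2000):  # 2 s of simulation
    if prev is not None:
        p1, p2, pu = prev
        x2dot = (x2 - p2) / dt                   # delayed derivative estimate
        d_hat = x2dot - f0(p1, p2) - g0 * pu     # TDE of the lumped disturbance
    # disturbance-compensated control driving the state to the origin
    u = (-f0(x1, x2) - d_hat - 3.0 * x2 - 4.0 * x1) / g0
    prev = (x1, x2, u)
    # true plant: x1' = x2, x2' = f0 + g0*u + d_true (Euler step)
    x2dot_true = f0(x1, x2) + g0 * u + d_true
    x1, x2 = x1 + dt * x2, x2 + dt * x2dot_true
print(d_hat, abs(x1))
```

After a single sample the estimate matches the constant disturbance, and the compensated closed loop stays near the origin, which is the cancellation mechanism the slide relies on.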

Experimental validation
Parameters
- Proposed control (ITSMC): η₁ = 3, η₂ = 1.5, η₃ = 2.5, κ₁ = κ₃ = 10, κ₂ = κ₄ = 2, γ₁ = γ₂ = 5/7, k₁ = 2, k₂ = 5.5, α = 9/11
- Adaptive gain: μ = 10, δ = 0.05, σ = 0.1
- PID control: K_P = 1.5, K_I = 1.8, K_D = 0.05 for case study 3; K_P = 3.5, K_I = 2.8, K_D = 0.1 otherwise
Case studies
1 (no load): x_d (mm) = 8 sin(0.4πt)
2 (no load): x_d (mm) = 8 sin(πt)
3 (no load): a multistep trajectory
4 (with load): x_d (mm) = 8 sin(πt), in turn with a 2-kg and a 7-kg load
Fig. 3.1 Performance response in case study 1; Fig. 3.2 Performance response in case study 2.

Fig. 3.3 Performance response in case study 3; Fig. 3.4 Performance response in case study 4.
Table 3.1 A quantitative comparison of the tracking performances (PID / ITSMC / Proposed)
- Sine 0.2 Hz, 8 mm: L2[e] 0.4102 / 0.1737 / 0.1574; L2[u] 0.9961 / 1.0270 / 0.9602
- Sine 1 Hz, 8 mm: L2[e] 1.1845 / 0.4996 / 0.3788; L2[u] 1.1284 / 1.1152 / 1.0900
- Multistep: L2[e] 0.8587 / 0.6846 / 0.6368; L2[u] 0.7670 / 0.7333 / 0.7247
- Sine 1 Hz, 8 mm, 2-kg load: L2[e] 1.2578 / 0.8314 / 0.4779; L2[u] 1.1414 / 1.1506 / 1.1287
- Sine 1 Hz, 8 mm, 7-kg load: L2[e] 1.5937 / 1.0827 / 0.5067; L2[u] 1.1613 / 1.1729 / 1.1584

Summary
- Robust performance against uncertainties and disturbances
- Finite-time convergence of the tracking errors
- Reduced chattering phenomenon
- Elimination of the reaching phase
- Fast transient response
Next, the work is directed toward control using the force properties.


4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Proposed control idea
1. The proposed FSOB scheme obtains force information with noise and static friction eliminated.
2. The FITSMC guarantees finite-time convergence; a new adaptive gain and the TDE are deployed online to handle the uncertainties and the chattering phenomenon.
3. The reliability and superiority of the proposed control are demonstrated in different challenging working conditions.

Control design
- The FITSM surface is defined on the force error e_f: s(t) = e_f + ∫₀ᵗ (λ₁ e_f + λ₂ |e_f|^v sign(e_f)) dτ.
- The control law can be rewritten as u_c(t) = d̂_d(t) plus feedback on (e_f, s) with fractional-power |s|^v reaching terms.
- An adaptive switching gain law tunes the reaching gain online from |s|, suppressing chattering.
- An estimated lumped disturbance d̂_d(t) is obtained by the TDE from the delayed control input and force response.
- The contact force itself is obtained without a sensor by a force observer based on the LeD [40].
(Block diagram: reference trajectory and environment configuration → integral terminal sliding surface → control law with adaptive law → the APAM system; the force-based time-delay estimator and the force observer close the loop.)
Proof: see Appendix 4.1-4.3.
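Evaluating an integral terminal sliding variable of this general form is straightforward. A sketch, assuming the reconstructed surface s = e_f + ∫(λ₁ e_f + λ₂ |e_f|^v sign(e_f)) dτ with illustrative gains (the symbol names are assumptions, not the dissertation's):

```python
import math

def itsm_surface(e_hist, dt, lam1, lam2, v):
    """Integral terminal sliding variable along a sampled force-error history."""
    integ, out = 0.0, []
    for e in e_hist:
        # fractional-power term |e|^v * sign(e); it is 0 at e = 0 since v > 0
        integ += dt * (lam1 * e + lam2 * (abs(e) ** v) * math.copysign(1.0, e))
        out.append(e + integ)
    return out

# Zero error keeps the surface identically zero; a held error makes |s| grow,
# which is what forces the reaching/adaptive terms to act.
flat = itsm_surface([0.0] * 100, 1e-2, 0.5, 2.0, 9 / 11)
ramp = itsm_surface([1.0] * 100, 1e-2, 0.5, 2.0, 9 / 11)
```

The second property is the useful one: the integral keeps accumulating until the force error is actually driven to zero, so steady-state force offsets cannot persist on the surface.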

Experimental validation
Parameters
- Proposed control (FITSMC): η₁ = 3, η₂ = 1.5, η₃ = 2.5, κ₁ = κ₃ = 10, κ₂ = κ₄ = 2, γ₁ = γ₂ = 5/7, k₁ = 2, k₂ = 5.5, v = 9/11
- Adaptive gain: μ = 10, δ = 0.05, σ = 0.1
- Observers (FSOB): ω = 500 rad/s, ξ = 0.001, α₁ = 20, α₂ = 12, α₃ = 5, α₄ = 15
- PID control: K_P = 1.5, K_I = 1.8, K_D = 0.05 for scenario 3; K_P = 3.5, K_I = 2.8, K_D = 0.1 otherwise
Scenarios
1: τ_d (N) = 7.5 sin(0.2πt − π/2) + 6.25
2: τ_d (N) = 7.5 sin(0.8πt − π/2) + 6.25
3: a multistep trajectory with an LPF bandwidth of 0.2 rad/s
Fig. 4.1 Performance response in scenario 1; Fig. 4.2 Performance response in scenario 2.

Fig. 4.3 Performance response in scenario 3; Fig. 4.4 Control gain.
Table 4.1 An RMSE comparison of the tracking error performances (PID / ISMC / FITSMC-TDE / Proposed)
- Sine 0.1 Hz: 2.1050 / 1.2152 / 0.9440 / 0.5158
- Sine 0.4 Hz: 3.2054 / 1.8045 / 2.0632 / 1.9807
- Multistep: 2.6508 / 2.3548 / 2.0618 / 1.7685
Discussion
- Improved control performance: fast response, high accuracy, and robustness
- Guaranteed finite-time convergence
- The lumped uncertainties are cancelled online via the TDE solution and the switching gain in the FITSMC


5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Proposed control idea
1. The RL method is used to learn the optimal desired force online.
2. The stability of the closed-loop control is theoretically demonstrated by the Lyapunov approach.
3. The approach is verified to be effective for robot interaction control in unknown environments.

Control design (robot-specific inner loop, task-specific outer loop)
The environment is modeled as a spring-damper system [81]:
F_e(t) = −C_e ẋ(t) + K_e (x_e − x(t))
The optimal impedance model maps the position error e_r to the desired force, Z(e_r(t)) = F_d(t); the force control enforces
M ë_r + C ė_r + K e_r = F_e(t) − (ρ₁ and ρ₂ |·|^v sign(·) finite-time robust terms)
The position control in task space is a PI law on the force error e_f:
x_f(t) = K_Pf e_f(t) + K_If ∫₀ᵗ e_f(τ) dτ
The resulting error dynamics are linear:
ė_r(t) = A e_r(t) + B F_d(t)
(Block diagram: prescribed impedance model and force control in the outer loop, with an RL-based approximation supplying φ; precise position control of the APAM system in the inner loop.)
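The outer loop above can be sketched with the slide's own environment and PI values; the quasi-static stepping scheme, the starting pose, and the target force are assumptions made only for the demonstration:

```python
# Spring-damper environment (slide values): F_e = -C_e*xdot + K_e*(x_e - x)
# PI force-to-position loop: x_f = K_Pf*e_f + K_If * integral(e_f) dt
C_e, K_e, x_e = 0.5, 2.0, 35.0           # N·s/mm, N/mm, mm (from the slides)
K_Pf, K_If, dt = 0.001, 0.02, 1e-3       # PI gains from the parameters slide

def env_force(x, xdot=0.0):
    return -C_e * xdot + K_e * (x_e - x)

F_d, x, integ = -2.0, 34.0, 0.0          # desired pushing force, initial pose
for _ in range(10000):                   # quasi-static stepping, xdot ~ 0
    e_f = F_d - env_force(x)
    integ += dt * e_f
    x -= K_Pf * e_f + K_If * integ       # move against the force error
print(x, e_f)
```

Pressing 1 mm past the contact point x_e yields exactly K_e·1 mm = 2 N, so the loop settles at x = 36 mm; the integral term is what removes the steady-state force error that a P-only loop would leave.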

LQR solution
The main objective of this work is to determine the optimal desired force F_d, the minimum force that minimizes the position error e_r:
F_d(t) = κ_e e_r(t)
The discounted cost function:
J_s(e_r(t), F_d(t)) = ∫_t^∞ (e_r^T S e_r + F_d^T R F_d) dτ
The algebraic Riccati equation:
A^T P A + S − P + A^T P B κ_e = 0
The feedback control gain:
κ_e = −(B^T P B + R)⁻¹ B^T P A
Implementation procedure: this is an offline solution; the RL/IRL approximation that follows replaces it online.

RL solution
An offline LQR solution is impossible to maintain for all time in an unknown environment. The Hamilton-Jacobi-Bellman equation:
min_{F_d} [ V̇_s(e_r(t)) − γ V_s(e_r(t)) + r(t) ] = 0
The new algebraic Riccati equation:
e_r^T(t) (A^T P + P A + S − γP) e_r(t) + 2 e_r^T(t) P B F_d(t) + F_d^T(t) R F_d(t) = 0
The optimal desired force:
F_d(t) = −H₂₂⁻¹ H₂₁ e_r(t)
where Q(e_r(t), F_d(t)) = V_s(e_r(t)) = X^T H X, with X = [e_r^T, F_d^T]^T.
IRL approximation: the discounted cost function
V_s(e_r(t)) = ∫_t^∞ e^{−γ(τ − t)} r(e_r, F_d) dτ, where r(e_r, F_d) = e_r^T S e_r + F_d^T R F_d
Implementation procedure: an online solution, using off-policy RL to find the solution of the given new ARE.

IRL approximation
The Q-value function is approximated by N normalized Gaussian basis functions:
Q̂(e_r, F_d) = Σ_{i=1}^N θ_i ψ_i(e_r, F_d)
with
φ_i(e_r, F_d) = exp(−½ [e_r − c_i^r; F_d − c_i^d]^T Σ_i⁻¹ [e_r − c_i^r; F_d − c_i^d]), ψ_i = φ_i / Σ_{j=1}^N φ_j
The Q-function update law based on IRL, over an interval Δt:
Q(e_r(t), F_d(t)) = ∫_t^{t+Δt} e^{−γ(τ − t)} r(e_r, F_d) dτ + e^{−γΔt} Q(e_r(t + Δt), F_d(t + Δt))
The reward (utility) function:
r(t) = e_r^T(t) S e_r(t) + F_d^T(t) R F_d(t)
The parameter update law performs a normalized gradient descent that minimizes the temporal-difference error, and the learned policy again takes the linear form F_d(t) = κ_e e_r(t).
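A sketch of the normalized-Gaussian Q̂ and one TD update on the discounted Bellman target; the centers, widths, learning rate, and the transition sample are all illustrative, and the plain semi-gradient step shown stands in for the slide's normalized update:

```python
import numpy as np

rng = np.random.default_rng(0)
centers = rng.uniform(-1, 1, size=(25, 2))   # feature centers over (e_r, F_d)
sigma = 0.5                                  # shared Gaussian width

def features(e_r, F_d):
    z = np.array([e_r, F_d])
    phi = np.exp(-0.5 * np.sum((z - centers) ** 2, axis=1) / sigma**2)
    return phi / phi.sum()                   # normalized Gaussian basis

S, R, gamma_dt = 1.0, 10.0, 0.99             # e^{-gamma*dt}: discount per step
theta = np.zeros(25)

def td_update(theta, sample, alpha=0.5):
    (e_r, F_d), (e_r2, F_d2), reward = sample
    phi = features(e_r, F_d)
    target = reward + gamma_dt * theta @ features(e_r2, F_d2)
    delta = target - theta @ phi             # temporal-difference error
    return theta + alpha * delta * phi, delta

# one transition with reward r = e_r^2 * S + F_d^2 * R
sample = ((0.4, -0.1), (0.35, -0.08), 0.4**2 * S + 0.1**2 * R)
theta, d0 = td_update(theta, sample)
```

Because the features are normalized to sum to one, the weights θ_i stay interpretable as local Q-values near each center, which is the usual reason for the normalization in the slide's approximator.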

Validation results
Fig. 5.1 RL in unknown environments during the learning process
Fig. 5.2 Learning curves
Parameters
- LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δ_Δ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1

PositionForce controlKPf = 0001 KIf = 002 λ = 02 ηΔ = 01 ρ1 = 05

ρ2 = 20 v = 911 α = 57

EnvironmentCe = 05 Nsmm-1 Ke = 2 N mm-1 r0 = 165 mm

xd = 39917 mm xe = 35 mm

Parameters

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 35Presenter Cong Phat Vo

Fig 53 Desired forceposition tracking response

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Three steps are required to complete

the experiment

1-the identification of the environment

2-obtain the ARE solution using the

environmental estimation

3-the complete experiment for the

positionforce controller

IRL approximation

Achieving near-optimal performance without

knowledge of the environment dynamics

LQR solution

Validation results

1 2 3 4 5 6 7 8

CONTENTS

1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Proposed control idea
1. Find the optimal parameters of the impedance model to assure motion tracking and to assist the human in performing the task in real time with minimum effort.
2. The stability of the closed-loop control is theoretically demonstrated.
3. The unknown human dynamics, as well as the optimal overall human-robot system efficiency, are verified in experimental trials.

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design (robot-specific inner loop / task-specific outer loop)

The control torque in task space combines an error-feedback term, an NN compensation term, and a robust term:

  τ = K_s ( e'_h + β_1 e_h + β_2 ∫_0^t e_h dτ ) + q^_h(Z) + τ_r

where the robust term τ_r is built from the gain K_B and the norm ||Z^|| to dominate the NN approximation error.

An approximation of the continuous (unknown human-related) function:

  q^_h = W^_T Φ(q)

with weight adaptation of the form W^'_1 = F( · - k W^_1 ) and W^'_2 = G( · - k W^_2 ), that is, gradient updates driven by the tracking error with a leakage term -k W^ for robustness.

The prescribed impedance model (from the human torque τ_h to θ_m):

  K_h / ( M_d s^2 + C_d s + K_d )

Block diagram: the Human Operator drives the torque control law and the APAM-based haptic interface; the prescribed impedance model, the NN functional approximation, and the IRL-based optimal law exchange the signals τ_h, θ_d, θ_m, e_h, e_d, ξ, with K_h scaling the human torque.
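Reading the prescribed impedance model as M_d θ''_m + C_d θ'_m + K_d θ_m = K_h τ_h (my reconstruction of the garbled slide), it can be integrated numerically; M_d = 1 kg, C_d = B_d = 7.18, K_d = 15.76, K_h = 0.046 follow the chapter-6 parameter table, and the solver itself is only an illustrative sketch:

```python
def simulate_impedance(tau_h, dt=1e-3, Md=1.0, Cd=7.18, Kd=15.76, Kh=0.046):
    """Semi-implicit Euler integration of Md*th'' + Cd*th' + Kd*th = Kh*tau_h."""
    th, dth, out = 0.0, 0.0, []
    for tau in tau_h:
        ddth = (Kh * tau - Cd * dth - Kd * th) / Md
        dth += dt * ddth
        th += dt * dth
        out.append(th)
    return out

# a constant human torque settles at the static gain Kh/Kd
traj = simulate_impedance([1.0] * 20000)   # 20 s
print(round(traj[-1], 5))  # 0.00292
```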

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design: IRL based on the LQR solution

The human operator is modeled by the first-order transfer function of [99], parameterized by a gain k_e and a pole p_d, and realized in state-space form as

  ξ'_h = A_h ξ_h + B_h e_d

where A_h and B_h are formed from k_e and p_d, and e_d = [e_d^T, e'_d^T]^T. The purpose is to capture the mapping τ_h → e_d.

A new augmented state X stacks the tracking-error dynamics and the human-model state:

  X' = A X + B u_e,   u_e = -K X

where A and B collect (A_q, B_q) from the error dynamics and (A_h, B_h) from the human model; the optimal gain K is then found by IRL based on the LQR solution.

Block diagram: the Human Operator drives the torque control law and the APAM-based haptic interface; the prescribed impedance model, the NN functional approximation, and the IRL-based optimal law exchange the signals τ_h, θ_d, θ_m, e_h, e_d, ξ.

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design: from the RL solution to the IRL solution

The performance index (infinite-horizon quadratic cost) to minimize:

  J_m = ∫_t^∞ ( e_d^T Q_d e_d + ξ_h^T Q_h ξ_h + u_e^T R u_e ) dτ

The optimal feedback control:

  u_e = arg min_{u_e(t)} V(t, X(t)) = -R^{-1} B^T P X

where P is the real symmetric positive-definite solution of the algebraic Riccati equation [83]:

  0 = A^T P + P A + Q - P B R^{-1} B^T P

The value function satisfies

  V(X(t)) = ∫_t^∞ X^T ( Q + K^T R K ) X dτ = X^T P X

with  (A - B K)^T P + P (A - B K) = -( Q + K^T R K ).

The IRL Bellman equation for a time interval ΔT > 0:

  V(X(t)) = ∫_t^{t+ΔT} X^T ( Q + K^T R K ) X dτ + V(X(t + ΔT))

which can be written as

  X^T(t) P X(t) = ∫_t^{t+ΔT} X^T ( Q + K^T R K ) X dτ + X^T(t+ΔT) P X(t+ΔT)

Proof of convergence: see Appendix 6.2.
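In the scalar case the ARE above has a closed form, which makes the LQR step easy to sanity-check; the values a = -4.006, b = -0.002, q = 1, r = 10 are borrowed from the chapter-5 parameter table purely for illustration:

```python
import math

def solve_scalar_are(a, b, q, r):
    """Positive root p of the scalar ARE 0 = 2*a*p + q - (b*p)**2 / r
    for dynamics x' = a*x + b*u with cost q*x^2 + r*u^2.

    Written as q / (sqrt(a^2 + b^2*q/r) - a) to stay numerically stable
    when the control authority b is tiny, as it is here."""
    return q / (math.sqrt(a * a + b * b * q / r) - a)

def lqr_gain(a, b, q, r):
    # u = -k*x with k = r^-1 * b * p (scalar form of K = R^-1 B^T P)
    return b * solve_scalar_are(a, b, q, r) / r

# illustrative values borrowed from the chapter-5 parameter table
a, b, q, r = -4.006, -0.002, 1.0, 10.0
print(round(solve_scalar_are(a, b, q, r), 4))  # 0.1248
```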

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design: online implementation
1. Start with an admissible control input u_e^0 = -K^1 X.
2. Policy evaluation: given a control policy u^i, find P^i using the off-policy Bellman equation

  X^T(t) P^i X(t) = ∫_t^{t+ΔT} X^T ( Q + (K^i)^T R K^i ) X dτ + X^T(t+ΔT) P^i X(t+ΔT)

3. Policy improvement: update the control policy as u_e^{i+1} = -R^{-1} B^T P^i X.

Fig 61 Continuous-time linear policy iteration algorithm: initialize P^0 = 0, i = 1, and K^1 such that A - B K^1 is stable; at each iteration the cost parameters of P^i are solved by least squares from sampled data; if ||p^i - p^{i-1}|| is below the threshold the loop ends, otherwise i → i+1 and the policy is updated as u^{i+1} = -R^{-1} B^T P^i X.
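For a scalar model-based case, the loop of Fig 61 reduces to Kleinman's policy iteration: policy evaluation solves a Lyapunov equation (the model-free IRL version replaces this step with the least-squares fit to sampled data), and policy improvement applies u = -R^{-1} B^T P X. A sketch, with the same illustrative scalar values as before:

```python
import math

def policy_iteration_scalar(a, b, q, r, k0=0.0, tol=1e-12, max_iter=50):
    """Kleinman policy iteration for x' = a*x + b*u with cost q*x^2 + r*u^2.

    Requires an initially stabilizing gain (a - b*k0 < 0). Policy evaluation
    solves the scalar Lyapunov equation 2*(a - b*k)*p = -(q + k*k*r); policy
    improvement sets k <- b*p/r (scalar form of K = R^-1 B^T P)."""
    k, p_prev = k0, float("inf")
    for _ in range(max_iter):
        a_cl = a - b * k
        assert a_cl < 0, "policy must stay stabilizing"
        p = (q + k * k * r) / (-2.0 * a_cl)   # policy evaluation
        if abs(p - p_prev) <= tol:
            break
        p_prev, k = p, b * p / r              # policy improvement
    return p, k

a, b, q, r = -4.006, -0.002, 1.0, 10.0
p, k = policy_iteration_scalar(a, b, q, r)
p_star = q / (math.sqrt(a * a + b * b * q / r) - a)  # closed-form ARE root
print(abs(p - p_star) < 1e-9)  # True
```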

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Validation results: simulation parameters
- LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = -4.006, B = -0.002
- RL approximation: υ = 0.9, γ = 0.9, δ_Δ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: K_s = 20, F = 10, G = 1, μ_1 = μ_2 = 5, k = 0.1, Z̄ = 100, κ_p = 20, κ_d = 10, and k_e = 0.18 s
- Model: θ_d = 10 sin(0.2πt), M_d = 1 kg, B_d = 7.18 Ns·mm^-1, K_d = 15.76 N·mm^-1, K_h = 0.046

Fig 62 The tracking error between the trajectory of the robot system and the prescribed impedance model
Fig 63 Reference trajectory versus the state of the robot system during and after learning

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Validation results: experiment and discussion
- Near-perfect performance of the human-robot cooperation.
- Only a small deviation between the desired and actual trajectories.
- The human interaction force is reduced after learning, and the optimal set of prescribed-impedance-model parameters is found by the outer-loop controller.

Fig 64 The output trajectory and impedance model in the inner loop
Fig 65 The desired trajectory and output trajectory in the outer loop
Fig 66 Interaction force

CONTENTS

1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Proposed control idea
1. For the first time, a hybrid control scheme based on a fast finite-time NTSMC and an AFOB is proposed.
2. The stability and finite-time convergence of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified under different working conditions.

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Control design

The modified FNTSM surface is designed as

  s_i = e'_i + λ_i ς(e_i)

where the switching function ς(·) takes the fractional-power form |e_i|^{ν_i} sign(e_i) when s'_i = 0, or when s_i ≠ 0 and |e_i| is not small, and switches to the polynomial form z_1 e_i + z_2 |e_i|^2 sign(e_i) when s_i ≠ 0 and |e_i| is small; the polynomial branch removes the singularity of the fractional power near the origin.

The torque control in FNTSMC style combines TDE-based compensation of the nominal dynamics (M̄_i, C̄_i, K̄_i), a fast reaching law of the form -( ρ_1i s_i + ρ_2i |s_i|^{ν_i} sign(s_i) ) with an adaptive switching gain, and the external-torque estimate τ̂_id.

The external torque τ̂_id is estimated without force sensors by a first-order observer; its adaptive FSOB gain is updated online.

Block diagram: on the master side, the Operator drives the Master Controller and the APAM master system, with an estimator producing τ̂_md; on the slave side, the Slave Controller drives the APAM slave system in contact with the Environment, with an estimator producing τ̂_sd; the two sides exchange position/torque signals through an impedance filter.
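A sketch of the nonsingular switching function ς(·) described above; the continuity coefficients z_1 = (2 - ν)δ^{ν-1} and z_2 = (ν - 1)δ^{ν-2} are a common textbook choice, not taken from the slides, while λ_i = 8 and ν_i = 9/11 come from the parameter table:

```python
def fntsm_sigma(e, nu=9/11, delta=0.1):
    """Nonsingular switching function: |e|^nu * sign(e) away from the origin,
    a quadratic polynomial inside |e| < delta so the derivative stays bounded.
    z1, z2 are chosen so that value and slope match at |e| = delta."""
    z1 = (2.0 - nu) * delta ** (nu - 1.0)
    z2 = (nu - 1.0) * delta ** (nu - 2.0)
    sgn = 1.0 if e >= 0.0 else -1.0
    if abs(e) >= delta:
        return abs(e) ** nu * sgn
    return z1 * e + z2 * abs(e) ** 2 * sgn

def fntsm_surface(e, de, lam=8.0):
    # s_i = e'_i + lambda_i * sigma(e_i), with lambda_i = 8 from the table
    return de + lam * fntsm_sigma(e)
```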

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results: parameters
- Proposed control: λ_i = 8, p_i = 5, q_i = 7, ς_i = 10^-4, ν_i = 9/11, ε_i = 0.1, ρ_1i = 10, ρ_2i = 2
- PID control: K_P = 3.8, K_I = 2.5, K_D = 0.15
- Observers: for the NDO, L_h = L_e = 10; for the RTOB, β = 350 rad/s; for the AFOB, κ_i(0) = 0.1, λ_i = 10, η_i = 0, m = 0.1, a = 10

Scenarios:
- Free space: θ_d(t) = 5π(1 - 0.18t) sin(0.72πt)
- Contact and recovery: soft environment K_e = 500 N·mm^-1, B_e = 4 Ns·mm^-1; stiff environment K_e = 2800 N·mm^-1, B_e = 20 Ns·mm^-1

The RMSEs associated with the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.

Fig 71 Torque estimation performances
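The RMSE figures quoted above follow the usual definition, sketched below with made-up sample values:

```python
import math

def rmse(estimates, reference):
    """Root-mean-square error between an estimated and a reference signal."""
    assert len(estimates) == len(reference)
    return math.sqrt(sum((a - b) ** 2
                         for a, b in zip(estimates, reference)) / len(estimates))

print(rmse([1.0, 2.0, 4.0], [1.0, 2.0, 2.0]))  # sqrt(4/3) ≈ 1.1547
```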

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results
Fig 72 Position tracking performances
Fig 72 Position tracking error performances
Fig 72 Transparency performance in "soft" contact
Fig 72 Transparency error in "soft" contact
Fig 72 Transparency performance in "stiff" contact
Fig 72 Transparency error in "stiff" contact

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Summary
- Finite-time convergence of the tracking errors.
- Fast transient response.
- Elimination of the uncertainties and of the noise effect in the obtained force sensing via the new AFOB.
- Robust performance against uncertainties and disturbances in the system dynamics.
- Stability of the overall system and transparency performance attained simultaneously.

Great transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and across different working conditions.

CONTENTS

1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works

8 CONCLUSION AND FUTURE WORKS

Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions; its uncertainties and disturbances are handled.
- Validated the significant effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding a highly effective position control.
- Achieved good transparency performance, with both force feedback and position tracking simultaneously, by the new adaptive force estimation combined with the separate FNTSMC schemes.

8 CONCLUSION AND FUTURE WORKS

Future works
- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both the master and slave sides; the computational complexity should also be reduced with respect to the data size.
- Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device for the rehabilitation field with a safer solution.

A PRESENTATION OF DOCTORAL DISSERTATION

Presenter: Cong Phat Vo
Department of Mechanical Engineering, University of Ulsan, Korea

Thank you for attending!

Page 6: A PRESENTATION OF DOCTORAL DISSERTATION

20-May-21 | 6Presenter Cong Phat Vo

hellip

Transparency

1 INTRODUCTION

Overview of the BHTS structure

Slave

control

Master

control

Co

mm

un

icatio

n c

ha

nn

el

Monitor

Operator

Master

Haptic interface

Slave

teleoperator

Remote

environment

Vision

sensor

hf

m

m

x

x

sdw

s

s

x

x

efmdwm

m

x

x

Figure 11 The bilateral haptic teleoperation architecture

1 2 3 4 5 6 7 8

Robustness

PerformenceIt refers to the operatorrsquos feeling of being there Ideally the operator can

distinguish a remote environment that presence is correlated with task

performance in a positive causal way It implies that by improving the feeling

of presence task performance is improved as well

Perception

20-May-21 | 7Presenter Cong Phat Vo

hellip

Performence

Perception

Transparency

1 INTRODUCTION

Overview of the BHTS structure

Slave

control

Master

control

Co

mm

un

icatio

n c

ha

nn

el

Monitor

Operator

Master

Haptic interface

Slave

teleoperator

Remote

environment

Vision

sensor

hf

m

m

x

x

sdw

s

s

x

x

efmdwm

m

x

x

Figure 11 The bilateral haptic teleoperation architecture

1 2 3 4 5 6 7 8

RobustnessCompared to perception it is a quantitative objective Transparency and

robustness are conflicting objectives (trade-off) A measure for

transparency is fidelity It describes the ability to exactly display the

environment to the operator

20-May-21 | 8Presenter Cong Phat Vo

1 INTRODUCTION

Overview of the BHTS structure

Slave

control

Master

control

Co

mm

un

icatio

n c

ha

nn

el

Monitor

Operator

Master

Haptic interface

Slave

teleoperator

Remote

environment

Vision

sensor

hf

m

m

x

x

sdw

s

s

x

x

efmdwm

m

x

x

Figure 11 The bilateral haptic teleoperation architecture

Variable Impedance Control [5 6]

Model-mediated Teleoperation [40]

Movement Estimation [78 79]

Passivity [27] Prediction [39] hellip

Robustness

Performence

Perception

Transparency

Environment condition

Operatorrsquos characteristics

Communication channel

1 2 3 4 5 6 7 8

classify

evaluate

achieve

20-May-21 | 9Presenter Cong Phat Vo

1 INTRODUCTION

PAM actuator

Fig 12 (a) Structure and (b) working principle of the PAM

KNEXO robot [22] Wearable Haptic Device [23]2-joint manipulator [16 17] Forcepsrsquo manipulators [20]

Rubber hose Fitting connectionLoose-weave

fibersUn-Pressured state

Pressured state

Stroke

(a) (b)

Rubber hoseLoose-weave

fibers

Fitting

connection

1 2 3 4 5 6 7 8

20-May-21 | 10Presenter Cong Phat Vo

1 INTRODUCTION

Research Objectives

Propose Position

Tracking Controller

An integral terminal-style SMC is designed with the combination of an adaptive gain and TDE technique

0102

Propose Model-free Control for

slave-environment interaction03

Propose Force Sensorless refleting

control for bilateral haptic teleoperation

Propose Finite-Time

Force Controller

An adaptive gain Fast ITSMC-TDE scheme is established with a friction-free disturbance technique in finite-time

Using IRL to optimize the prescribed impedance model parameters to assist minimum human effort

Propose optimal impedance for

human-master interaction

05

04Applying RL approximation method to learning the optimal contact force (minimal) online

The new AFOB is proposed to estimate the force signalsAchieving great transparency (position force feedback)

1 2 3 4 5 6 7 8

20-May-21 | 11Presenter Cong Phat Vo

CONTENTS

Adaptive Gain ITSMC with TDE scheme

Adaptive Finite-Time Force Sensorless Control scheme with TDE

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction control using IRL

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

3

4

5

6

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

1 2 3 4 5 6 7 8

20-May-21 | 12Presenter Cong Phat Vo

2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION

Mathematical modeling

n turns

nπD

bL

1 t

urn

s

D

2 2

2

3

4

dV dV d b LF P P P

dL dL d n

minus= minus = minus = minus

The tensile force of each PAM [7]

Fig 21 A cross-sectional view of the simplified geometric model

The motion equation of the system

1 2Mx Bx F F+ = minus +

Parameter Description Value UnitPs Supply pressure 65 bar

B Viscous damping coefficient 50 Nsmm

M Mass of the plate 05 Kg

L0 Original length of the PAM 175 mm

n Number of turns of the thread 104

b Thread length 200 mm

Table 21 Parameters of the studied PAM

The motion equation of the system

0 0( ) ( )x f x x g x u d= + +

where f0(x) g0(x) are the nominal value of f(x)

g(x) d is a total of unknown disturbances

( )2 2 2

00 0

2 2

0 0

33 ( ) ( )

2

( ( ) ( )) ( ( ) ( ))

L x bP L x Bxf x g x

n M M n M

d f x x f x x g x g x u

+ minus = minus minus = = minus + minus +

1 2 3 4 5 6 7 8

20-May-21 | 13Presenter Cong Phat Vo

2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION

Experimental modeling

The approximated PAM model based on its

pressure and contraction strain

1 0

1 0

1 0

( ) ( ) ( )

( )

( ) ( 1 2)

( )

k k k k k k k k k k

k k k k k

k k k k k

k k k k k

M b P k P f P

b P b P b

k P k P k k

f P f P f

+ + =

= +

= + = = +

Parameter Value Units

Spring

elements

kk1 11732 Nmbar

kk0 68 Nm

Damping

elements

bk1 962 Nmsbar

bk0 085 Nms

Force

elements

fk1 9122 Nbar

fk0 473 N

0-100

6

0

100

55

200

Fo

rce

[N

]

Contraction length

[mm]

300

400

4

Pressure [bar]

500

103 2 151

Fig 23 Experimental model for certain separate pressure

Table 22 List of APAM model parameters

Load

PAM 1 PAM 2Plate

Linear encoder

Proportional pressure

regulator valves

Card

PCI-1711

Card

PCI-QUAD04

Pressure signal

Control signal

Position signal

Pulley

Fig 22 Block diagram of the experimental system

1 2 3 4 5 6 7 8

20-May-21 | 14Presenter Cong Phat Vo

2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION

Platform setup

Fig 25 The APAM-based (a) master and (b) slave subsystem configuration

Proportional

pressure

regulator

valves

Optical

encoder

Control stick

Loadcells

Pneumatic

Muscle

Actuators Interactive

Loadcell

Pulley

Front viewBack view(a)

Proportional valve

Loadcell DamperSpring

Environment

APAM system

PAMPulley

PositionSignal

ControlSignal

StiffnessDamping

Contact Force Signal

PCPCI Cards

Control box

(b)

1 2 3 4 5 6 7 8

20-May-21 | 15Presenter Cong Phat Vo

2 MODELING OF THE BILATERAL TELEOPERATION SYSTEM

Slave robot

Environment

Master robot

Control stick

Device Description

PAM

Festo MAS-10-200N-AA-MC-O

Nominal length 200 mm

Inside diameter 10 mm

Max pressure 8 bar

PPR ValveFesto VPPM-8L-L-1-G14-0L8H

Max pressure 8 bar

DAQ Card

ADVANTECH PCI-1711

AIAO 12 bits (resolution)

MEASUREMENT PCI-QUAD04

Linear

encoder

Type Rational WTB5-0500MM

Resolution 0005 mm

Table 23 Specifications of the exp devices

Fig 26 Photograph of the experimental apparatus of the overview BHTS

Platform setup

1 2 3 4 5 6 7 8

20-May-21 | 16Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

4

5

6

C

H

A

P

T

E

R

1

7

8

Adaptive Gain ITSMC with TDE scheme for the position control3

Modeling and platform setup based on the APAM configuration2

1 2 3 4 5 6 7 8

20-May-21 | 17Presenter Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Proposed control idea

3The control effectiveness is verified by experimental trials in

different challenging work conditions (compare several controllers)

2

The stability of the controlled system is theoretically analyzed

The first time the AITSMC-TDE scheme

is proposed for a PAM system1

1 2 3 4 5 6 7 8

20-May-21 | 18Presenter Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

2 1 1 2 1s e k e k e= + +

( ) ( ) (( ) ( ) )

0

1

0 1 2 2 1 2

0 1 2 0 1 1

t

t

n d

s t s t k e k e e

f x x g x u x d

minus= minus minus +

+ + minus

( ) ( )( ) ( ) 31

0 1 3 3 4ˆ

su t g x sign

minus = minus + +

( )( ) ( )

( )3 3 0

3

ˆ ˆ1ˆ or

( )mt t

tsign t

= +=

+ =

A terminal sliding surface is chosen

A proposed integral sliding surface is designed

An adaptive law for the robust gain

A robust control signal can be redesigned

Proof See more detail in Appendix 32 33

1 2d dx x

Time-delay

Estimator

The PAM system

Control law

Robust

control signal

Adaptation

law

Nominal

control signal

Transfer

Coordinate Integral Terminal

Sliding Surface

Terminal

Sliding Surface

Reference

Trajectory

d

The proposed

control approach

u

1 2x x

su

1 2x x

s

1

2

nu

Control design

1 2 3 4 5 6 7 8

20-May-21 | 19Presenter Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

( ) ( ) ( )( )( )( ) ( )

2 0 1 2

0 1

ˆ

d d d

d d

d x t t f x t t x t t

g x t t u t t

= minus minus minus minus

minus minus minus

( ) ( )1

n su t u u g x dminus

= + minus

The control law can be rewritten

where

Proof See more detail in Appendix 31 34-35

Control design

1 1 1 1

2 2 1

de x x

x

= = minus = minus

( )( ) ( )1 2

1

0 1 1 1 0 1 2

1

1 1 1 1 1 2 2 2 2

n du g x x f x x

minus

minus

= minus minus

minus minus minus minus

The auxiliary variables are defined

The nominal control signal can be designed as

1 2d dx x

Time-delay

Estimator

The PAM system

Control law

Robust

control signal

Adaptation

law

Nominal

control signal

Transfer

Coordinate Integral Terminal

Sliding Surface

Terminal

Sliding Surface

Reference

Trajectory

d

The proposed

control approach

u

1 2x x

su

1 2x x

s

1

2

nu

1 2 3 4 5 6 7 8

20-May-21 | 20Presenter Cong Phat Vo

Parameters

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Experimental validation

for ITSMC η1 = 3 η2 = 15 η3 = 25 κ1 = κ3 = 10

κ2 = κ4 = 2 γ1 = γ2 = 57 k1 = 2 k2 = 55 α = 911

for adaptative gain μ = 10 δ = 005 σ = 01

for case study 3 KP = 15 KI = 18 KD = 005

for other case KP = 35 KI = 28 KD = 01

Proposed control

PID control

Fig 31 Performance response in case study 1

Fig 32 Performance response in case study 2

Case study 1 (no load) xd (mm) = 8sin(04πt)

Case study 2 (no load) xd (mm) = 8sin(πt)

Case study 3 (no load) A multistep trajectory

Case study 4 (with load) xd (mm) = 8sin(πt)

in turn 2-kg-load and 7-kg-load

1 2 3 4 5 6 7 8

20-May-21 | 21Presenter Cong Phat Vo

Fig 33 Performance response in case study 3

Fig 34 Performance response in case study 4

Controller PID ITSMC Proposed

Sine 02 Hz ndash 8 mmL2[e] 04102 01737 01574

L2[u] 09961 10270 09602

Sine 1 Hz ndash 8 mmL2[e] 11845 04996 03788

L2[u] 11284 11152 10900

MultistepL2[e] 08587 06846 06368

L2[u] 07670 07333 07247

Sine 1 Hz ndash 8 mm

with 2-kg-load

L2[e] 12578 08314 04779

L2[u] 11414 11506 11287

Sine 1 Hz ndash 8 mm

with 7-kg-load

L2[e] 15937 10827 05067

L2[u] 11613 11729 11584

Table 31 A quantitative comparison between the tracking performances

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Case study 1 (no load) xd (mm) = 8sin(04πt)

Case study 2 (no load) xd (mm) = 8sin(πt)

Case study 3 (no load) A multistep trajectory

Case study 4 (with load) xd (mm) = 8sin(πt)

in turn 2-kg-load and 7-kg-load

Experimental validation

1 2 3 4 5 6 7 8

20-May-21 | 22Presenter Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Summary

Robust performance against uncertainties and disturbances1 2d dx x

Time-delay

Estimator

The PAM system

Control law

Robust

control signal

Adaptation

law

Nominal

control signal

Transfer

Coordinate Integral Terminal

Sliding Surface

Terminal

Sliding Surface

Reference

Trajectory

d

The proposed

control approach

u

1 2x x

su

1 2x x

s

1

2

nu

Finite-time convergence of the tracking errors

Reducing the chattering phenomenon

Eliminating the reaching phase

Fast transient response

Next works will be directed to the control using the force properties

1 2 3 4 5 6 7 8

20-May-21 | 23Presenter Cong Phat Vo

CONTENTS

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

5

6

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Adaptive Finite-Time Force Sensorless Control scheme4

1 2 3 4 5 6 7 8

20-May-21 | 24Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

3 Verified the reliability and superiority of the proposed control

demonstrated in different challenging work conditions

2

The FITSMC guarantees the finite convergence new adaptive gain and TDE

are deployed to handle online (uncertainties and chattering phenomenon)

The proposed FSOB scheme obtains

force information eliminated noise and

static friction

1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 25Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

( )[ ]

1 20

( ) ( ( )) = + +t

v

f f fe e sign e d

( ) ( )[ 1]5

5

4

ˆ

+

minus= minus

v

signsign

1 1 [ ]

1 2

[ ]

4 5

ˆ( ) ( ) ( )

ˆ ( )

v

c d f f

v

u t d t K e sign e

sign

minus minus= + + +

+ +

1ˆ ˆ ˆ( ) ( ) ( ) ( )minus minus = minus minus minusd d c e dd t d t t u t t K t t

The FITSM surface is defined

The control law can be rewritten

2

2

ˆ ˆ ˆ2ˆ

+ += e e e e e e

e

e

An adaptive switching gain law An estimated lumped disturbance Force observer based-LeD [40]

APAM system

Force based

Time-delay

Estimator

Integral Terminal

Sliding SurfaceControl law

Adaptive law

Environment

Configuration

Reference

Trajectoryu

d

ˆe

d

+

+

The proposed control

approach

2k

Plant + Environment

Force Observer

e

Proof See in Appendix 41 ndash 43

Control design

1 2 3 4 5 6 7 8

20-May-21 | 26Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

for ITSMC η1 = 3 η2 = 15 η3 = 25 κ1 = κ3 = 10

κ2 = κ4 = 2 γ1 = γ2 = 57 k1 = 2 k2 = 55 v = 911

for adaptative gain μ = 10 δ = 005 σ = 01

for case study 3 KP = 15 KI = 18 KD = 005

for other case KP = 35 KI = 28 KD = 01

Proposed control

PID control

Fig 42 Performance response in Scenario 2

Scenario 1 τd (N) = 75sin(02πt- π2)+625

Scenario 2 τd (N) = 75sin(08πt- π2)+625

Scenario 3 A multistep trajectory with a LPFrsquos bandwidth

of 02(rads)

Observersfor FSOB ω = 500 rads ξ = 0001 α1 = 20 α2 = 12

α3 = 5 α4 = 15

Fig 42 Performance response in Scenario 1

Parameters

Experimental validation

1 2 3 4 5 6 7 8

20-May-21 | 27Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Fig 43 Performance response in Scenario 3

Fig 44 Control gain

Controller PID ISMC FITSMC-TDE Proposed

Sine 01 Hz 21050 12152 09440 05158

Sine 04 Hz 32054 18045 20632 19807

Multistep 26508 23548 20618 17685

Table 41 A RMSE comparison between the tracking error performances

Experimental validation

- Improved the control performances (fast response high

accuracy and robustness)

- Guaranteed the finite convergence

- Cancelling the lumped uncertainties via TDE solution and

the switching gain in the FITSMC online

Discussion

1 2 3 4 5 6 7 8

20-May-21 | 28Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

4

6

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Model-free control for Optimal contact force in Unknown Environment5

1 2 3 4 5 6 7 8

20-May-21 | 29Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

3 Verified effective for robot interaction control in

unknown environments

2 The stability of the closed-loop control is theoretically demonstrated

by the Lyapunov approach

The RL method is used to learn on-line

the optimal desired force1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 30Presenter Cong Phat Vo

( ) ( ) ( ( ))= minus + minuse e e eF t C x t K x x t

( ( )) ( )r dZ e t F t= [ ]

1 2

( ) ( ) ( )

( ) ( )

= + +

minus minus minus +

r r r

v

e

Me t Ce t Ke t

sign F t

The force control

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The environment is modeled as

a spring-damper system [81]

The optimal impedance model The position control in task space

( ) ( ) ( )

= + f Pf f If f

t

x s K e t K e d

Task-specific Outer-loopRobot-specific Inner-loop

Precise position

control

Prescribed

Impedance Model

Environment

Configuration

APAM system

xx

d

Force controlx

rx

fe

r

Fe

Fd

ef

-

+ -+ -

+

RL-based

approximation

ϕ

Control design

1 2 3 4 5 6 7 8

( ) ( ) ( )r r de t Ae t BF t= +

20-May-21 | 31Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

LQR solution

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

( ) ( )=d e rF t e t

The discounted cost function

( ) ( )( ) ( ) ( ) ( ) ( ) ( )

= minus +T T

s r d r r d dt

J e t F t e Se F RF d

The Algebraic Riccati equation

0+ + + =T T

eA PA S P A PB

The feedback control gain

( )1

1 minus

minus= minus +T T

e B PB R B PA

IRL approximationRL solution

Implementation procedure

1 2 3 4 5 6 7 8

offline soliton

20-May-21 | 32Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

LQR solution RL solution

impossible for all time

The Hamilton-Jacobi-Bellman equation

( ) ( ) ( )min ( ) ( ) ( ) 0

minus minus + minus =

d

t

s rtF

V e t r t r e dt

The new Algebraic Riccati equation

( )( ) ( )

2 ( ) ( ) ( ) ( ) 0

T T

r r

T T

r d d d

e t A P PA S P e t

e t PBF t F t RF t

+ + minus

+ + =

The optimal desired force 1

22 21( ) ( )d rF t H H e tminus= minus

( ) ( ) ( ) ( ) ( )= = T

r d s rQ e t F t V e t X HX

IRL approximation

The discounted cost function

( ) ( )

( )

( )( ) ( ) ( )

where ( ) ( ) ( ) ( ) ( ) ( )

t

s r r dt

T T

r d r r d d

V e t r e F e d

r e F e Se F RF

minus minus=

= +

Implementation procedure

1 2 3 4 5 6 7 8

( ) ( )=d e rF t e t

online soliton using the off-policy RL to

find the solution of the given new ARE

20-May-21 | 33Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

LQR solution RL solution

minimize temporal difference error

The approximator of the Q-value function

( )( ) ( )1

ˆ N

r d i r d i

i

Q e F e F =

=

The parameter update law

11( ) exp

2

T

i r d r i d i i r i d ie F e c F c e c F c minus = minus minus minus minus minus

( )

( ) ( ) ( ) ( )

ˆ ( ) ( ) ( )1( ) ( )

( )

=

= minus +

r d

t t t t

Q e t F t tt t

t t

IRL approximation

Q-function update law based on IRL

( )

( )

( )( ) ( ) ( )

( ) ( )

t tt

r dt

t

r d

Q e t F t r e d

e Q e t t F t t

+

minus minus

minus

=

+ + +

( ) ( ) ( ) ( ) ( )T T

r r d dr t e t Se t F t RF t= +

The reward function of the utility function

Implementation procedure

1 2 3 4 5 6 7 8

1

( )( )

( )

i r di r d N

i r di

e Fe F

e F

=

=

( ) ( )=d e rF t e t

20-May-21 | 34Presenter Cong Phat Vo

Fig 51 RL in unknown environments during the

learning process

Fig 52 Learning curves

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

κe = 00025 S = 1 R = 10 T = 0001 A = minus4006

B = minus0002

υ = 09 γ = 09 δΔ = 065 ε = 01 ε-decay = 099

K = 10 β = 01

LQR solution

RL aproximation

PositionForce controlKPf = 0001 KIf = 002 λ = 02 ηΔ = 01 ρ1 = 05

ρ2 = 20 v = 911 α = 57

EnvironmentCe = 05 Nsmm-1 Ke = 2 N mm-1 r0 = 165 mm

xd = 39917 mm xe = 35 mm

Parameters

Validation results

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Validation results

Fig. 5.3 Desired force/position tracking response

Three steps are required to complete the experiment:
1. The identification of the environment.
2. Obtaining the ARE solution using the environmental estimation (LQR solution).
3. The complete experiment for the position/force controller (IRL approximation).

Achieving near-optimal performance without knowledge of the environment dynamics.

CONTENTS
1. Introduction
2. Modeling and platform setup based on the APAM configuration
3. Adaptive Gain ITSMC with TDE scheme
4. Adaptive Finite-Time Force Sensorless Control scheme
5. Model-free control for Optimal contact force in Unknown Environment
6. Optimized Human-Robot Interaction force control using IRL
7. Force Sensorless Reflecting Control for BHTS
8. Conclusion and Future works
(Next: Chapter 6.)

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Proposed control idea
1. The stability of the closed-loop control is theoretically demonstrated.
2. Find the optimal parameters of the impedance model to assure motion tracking and also to assist the human with minimum effort to perform the task in real time.
3. The unknown human dynamics as well as the optimal overall human-robot system efficiency are verified in experimental trials.

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design (robot-specific inner loop, task-specific outer loop)

The control torque in task space, τ_s, combines a stabilizing feedback of the tracking error e_h (gains K_s, K_q, K_B) with the NN estimate \hat{Z} of the unknown human-robot dynamics.

An approximation of the continuous function: the unknown dynamics are approximated by an NN, \hat{q}_h = \hat{W}^T \varphi(q), whose weight estimates \hat{W}_1, \hat{W}_2 are updated online with gains F, G, k.

The prescribed impedance model:

\frac{\theta_m(s)}{\tau_h(s)} = \frac{K_h}{M_d s^2 + C_d s + K_d}

[Block diagram: the Human Operator supplies τ_h to the Prescribed Impedance Model, which outputs θ_d; the torque control law drives the APAM-based haptic interface to θ_m, assisted by the NN functional approximation; an IRL-based optimal law adapts the impedance parameters from e_d.]

Control design
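A minimal sketch of the prescribed impedance response, assuming the second-order model above with the simulation-slide values Md = 1, Kd = 15.76, Kh = 0.046 (the slide's Bd = 7.18 taken as the damping Cd; units assumed consistent) and a hypothetical constant human torque:

```python
# Prescribed impedance model: Md*th'' + Cd*th' + Kd*th = Kh*tau_h
# (Md, Cd = the slide's Bd, Kd, Kh from the simulation slide; units assumed consistent)
Md, Cd, Kd, Kh = 1.0, 7.18, 15.76, 0.046
tau_h = 10.0                          # hypothetical constant human torque

dt, T = 1e-3, 10.0
th = dth = 0.0
for _ in range(int(T / dt)):          # forward-Euler integration
    ddth = (Kh * tau_h - Cd * dth - Kd * th) / Md
    th, dth = th + dt * dth, dth + dt * ddth

print(th, Kh * tau_h / Kd)            # settles at the static gain (Kh/Kd)*tau_h
```

The outer-loop IRL in the chapter adapts exactly these M_d, C_d, K_d values; this sketch only shows what a fixed parameter set does to a constant torque.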


6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design (robot-specific inner loop, task-specific outer loop): IRL based on the LQR solution

The human transfer function [99] (purpose: τ_h → e_d) is modeled as a first-order filter

G(s) = \frac{k_p + k_d s}{k_e + s}

realized in state space as \dot{e}_h = A_h e_h + B_h e_d, with A_h, B_h determined by k_e, k_p, k_d and the error vector e_d = [e_d^T\; \dot{e}_d^T]^T.

A new augmented state X stacks the impedance-model error e_q and the human state e_h, giving

\dot{X} = AX + Bu_e, \qquad u_e = -KX

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

The performance index (infinite-horizon quadratic cost) to minimize:

J_m = \int_t^{\infty} \big( e_d^T Q_d e_d + e_h^T Q_h e_h + u_e^T R u_e \big)\, d\tau

solved through the algebraic Riccati equation [83]:

0 = A^T P + PA + Q - PBR^{-1}B^T P

with the augmented state \dot{X} = AX + Bu_e, u_e = -KX, and the optimal feedback control

u_e = \arg\min_{u_e} V\big(t, X(t)\big) = -R^{-1} B^T P X

The IRL Bellman equation for a time interval \Delta T > 0:

V\big(X(t)\big) = \int_t^{t+\Delta T} X^T \big(Q + K^T R K\big) X \, d\tau + V\big(X(t+\Delta T)\big)

where V(X(t)) = \int_t^{\infty} X^T (Q + K^T R K) X \, d\tau = X^T P X, and P is the real symmetric positive-definite solution of the Lyapunov equation

(A - BK)^T P + P(A - BK) = -\big(Q + K^T R K\big)

Hence the IRL Bellman equation can be written as

X^T(t)\, P\, X(t) = \int_t^{t+\Delta t} X^T \big(Q + K^T R K\big) X \, d\tau + X^T(t+\Delta t)\, P\, X(t+\Delta t)

Proof of convergence: see Appendix 6.2.
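As a numerical sanity check of this Bellman identity (on a small assumed 2-state system, not the thesis plant), one can verify that X^T P X equals the finite-horizon cost plus the terminal value along the closed loop:

```python
import numpy as np

# Assumed 2-state closed loop x' = (A - B K) x; P solves the Lyapunov equation
# (A-BK)^T P + P(A-BK) = -(Q + K^T R K), so V(x) = x^T P x.
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.array([[1.0]])
K = np.array([[1.0, 1.0]])                         # a stabilizing gain
Ac, Qc = A - B @ K, Q + K.T @ R @ K

# Solve the Lyapunov equation by row-major vectorization
n = 2
M = np.kron(Ac.T, np.eye(n)) + np.kron(np.eye(n), Ac.T)
P = np.linalg.solve(M, -Qc.reshape(-1)).reshape(n, n)

# Check the Bellman identity over [0, dT]: x0'P x0 = integral cost + terminal value
x0 = np.array([1.0, -0.5])
x, dt, dT, cost = x0.copy(), 1e-4, 0.5, 0.0
for _ in range(int(dT / dt)):
    cost += dt * x @ Qc @ x                        # running cost x^T(Q+K^T R K)x
    x = x + dt * (Ac @ x)                          # Euler step of the closed loop
lhs, rhs = x0 @ P @ x0, cost + x @ P @ x
print(lhs, rhs)                                    # approximately equal
```

The identity holds for any interval length dT, which is exactly what the online least-squares evaluation exploits.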

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig. 6.1 Continuous-time linear policy iteration algorithm

Online implementation:
1. Initialization: start with an admissible control input u_e^0 = -K^1 X, with P^0 = 0, i = 1, and K^1 chosen so that (A - BK^1) is stable.
2. Policy evaluation: given the control policy u^i, find P^i from the off-policy Bellman equation

X^T(t)\, P^i X(t) = \int_t^{t+\Delta t} X^T \big(Q + (K^i)^T R K^i\big) X \, d\tau + X^T(t+\Delta t)\, P^i X(t+\Delta t)

solving for the cost p^i by least squares.
3. Policy improvement: update the control policy as

u^{i+1} = -R^{-1} B^T P^i X

then set i → i+1 and repeat until \|p^i - p^{i-1}\| \le \epsilon.
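The iteration above can be sketched in model-based form (Kleinman's policy iteration, which the online least-squares version approximates without using A); the plant below is an assumed toy example:

```python
import numpy as np

def lyap(Ac, Qc):
    """Solve Ac^T P + P Ac = -Qc via row-major vectorization."""
    n = Ac.shape[0]
    M = np.kron(Ac.T, np.eye(n)) + np.kron(np.eye(n), Ac.T)
    return np.linalg.solve(M, -Qc.reshape(-1)).reshape(n, n)

# Policy iteration: evaluate P^i for policy K^i, improve K^{i+1} = R^{-1} B^T P^i.
A = np.array([[0.0, 1.0], [1.0, -1.0]])    # assumed unstable toy plant
B = np.array([[0.0], [1.0]])
Q = np.eye(2)                              # state weight; R = I for simplicity
K = np.array([[3.0, 3.0]])                 # admissible (stabilizing) initial policy

for i in range(20):
    P = lyap(A - B @ K, Q + K.T @ K)       # policy evaluation (R = I)
    K_new = B.T @ P                        # policy improvement, R^{-1} = I
    if np.linalg.norm(K_new - K) < 1e-10:  # stop when ||K^i - K^{i-1}|| is tiny
        break
    K = K_new

# The limit P satisfies the ARE: A^T P + P A + Q - P B R^{-1} B^T P = 0
residual = A.T @ P + P @ A + Q - P @ B @ B.T @ P
print(np.abs(residual).max())              # ~0
```

Each evaluation step is a linear (Lyapunov) solve, which is why the online version can replace it by least squares over measured trajectory data.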

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Validation results (simulation)

Fig. 6.2 The tracking error of the trajectory of the robot system and the prescribed impedance model
Fig. 6.3 Reference trajectory versus the state of the robot systems during and after learning

Parameters:
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: Ks = 20, F = 10, G = 1, μ1 = μ2 = 5, k = 0.1, Z = 100, κp = 20, κd = 10, ke = 0.18 s
- Simulation: θd = 10 sin(0.2πt), Md = 1 kg, Bd = 7.18 Ns·mm⁻¹, Kd = 15.76 N·mm⁻¹, Kh = 0.046

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Validation results (experiment)

Fig. 6.4 The output trajectory and impedance model in the inner loop
Fig. 6.5 The desired trajectory and output trajectory in the outer loop
Fig. 6.6 Interaction force

Discussion:
- Near-perfect performance of the human-robot cooperation.
- Only a small deviation between the desired and actual trajectory.
- The human interaction force is reduced after learning, and the optimal set of the prescribed impedance model is found by the outer-loop controller.

CONTENTS (next: Chapter 7, Force Sensorless Reflecting Control for BHTS)

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Proposed control idea
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed.
2. The stability and finite-time convergence characteristics of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified under different working conditions.

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Control design

The external torque estimation: an adaptive force sensorless observer (AFOB) reconstructs the operator and environment torques \hat{\tau}_{id} (i = m, s) from the drive signals, with an adaptive observer gain that switches according to the sliding variable s_i.

The modified FNTSM surface s_i is designed from the tracking error and its fractional power, and the torque control on each side is an FNTSMC-style law combining the nominal dynamics M_i, C_i, K_i with the robust switching terms \rho_{1i}\,\mathrm{sign}(s_i) and \rho_{2i}|s_i|^{\nu_i}\,\mathrm{sign}(s_i) and the estimate \hat{\tau}_{id}.

[Block diagram: Operator ↔ Master controller + estimator ↔ APAM master system; Environment ↔ Slave controller + estimator ↔ APAM slave system; master and slave are coupled through an impedance filter, and the estimates \hat{\tau}_{md}, \hat{\tau}_{sd} replace force sensors.]

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results

Fig. 7.1 Torque estimation performances

Parameters:
- Proposed control: λi = 8, pi = 5, qi = 7, ςi = 10⁻⁴, νi = 9/11, εi = 0.1, ρ1i = 10, ρ2i = 2
- PID control: KP = 3.8, KI = 2.5, KD = 0.15
- Observers: for the NDO, Lh = Le = 10; for the RTOB, β = 350 rad/s; for the AFOB, κi(0) = 0.1, λi = 10, ηi = 0, m = 0.1, a = 10

Scenarios:
- The free-space situation: θd(t) = 5π(1 − 0.18t) sin(0.72πt)
- The contact-and-recovery situation: soft environment Ke = 500 N·mm⁻¹, Be = 4 Ns·mm⁻¹; stiff environment Ke = 2800 N·mm⁻¹, Be = 20 Ns·mm⁻¹

The RMSEs associated with the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results

Fig. 7.2 Position tracking performances
Fig. 7.3 Position tracking error performances
Fig. 7.4 Transparency performance in "soft" contact
Fig. 7.5 Transparency error in "soft" contact
Fig. 7.6 Transparency performance in "stiff" contact
Fig. 7.7 Transparency error in "stiff" contact

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Summary
- Finite-time convergence of the tracking errors.
- Eliminating the uncertainties and the noise effect in the obtained force sensing via the new AFOB.
- Robust performance against uncertainties and disturbances in the system dynamics.
- Stability of the overall system and the transparency performance can be attained simultaneously.
- Fast transient response.

The great transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and in different working conditions.

CONTENTS (next: Chapter 8, Conclusion and Future works)

8 CONCLUSION AND FUTURE WORKS

Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions (its uncertainties and disturbances are handled).
- Validated the significant effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding a highly effective position control.
- Achieved good transparency with both force feedback and position tracking simultaneously by the new adaptive force estimation combined with the separate FNTSMC schemes.

8 CONCLUSION AND FUTURE WORKS

Future works
- To improve the control performance in joint space, the robot system model can be improved by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both master and slave sides; note also that the computational complexity should be reduced with respect to the data size.
- Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device in the rehabilitation field with a safer solution.

A PRESENTATION OF DOCTORAL DISSERTATION
Presenter: Cong Phat Vo
Department of Mechanical Engineering, University of Ulsan, Korea

Thank you for attending!

1 INTRODUCTION

Overview of the BHTS structure

[Figure 1.1 The bilateral haptic teleoperation architecture: the Operator acts on the Master haptic interface (x_m, f_h), coupled through the communication channel to the Slave teleoperator (x_s, f_e) in the remote environment; a vision sensor and monitor close the visual loop.]

Control objectives: Performance, Perception (Transparency), and Robustness. Compared to perception, robustness is a quantitative objective. Transparency and robustness are conflicting objectives (a trade-off). A measure for transparency is fidelity: it describes the ability to exactly display the environment to the operator.

1 INTRODUCTION

Overview of the BHTS structure (Figure 1.1, continued): the objectives are classified by the environment condition, the operator's characteristics, and the communication channel; they are evaluated through perception (transparency) and performance, and achieved by approaches such as:
- Variable Impedance Control [5], [6]
- Model-mediated Teleoperation [40]
- Movement Estimation [78], [79]
- Passivity [27], Prediction [39], …

1 INTRODUCTION

PAM actuator

Fig. 1.2 (a) Structure and (b) working principle of the PAM: a rubber hose wrapped in loose-weave fibers with fitting connections; pressurizing the hose contracts it along its stroke.

Applications: 2-joint manipulator [16], [17], forceps manipulators [20], KNEXO robot [22], wearable haptic device [23].

1 INTRODUCTION

Research Objectives
01. Propose a Position Tracking Controller: an integral terminal-style SMC is designed with the combination of an adaptive gain and the TDE technique.
02. Propose a Finite-Time Force Controller: an adaptive-gain fast ITSMC-TDE scheme is established with a friction-free disturbance technique in finite time.
03. Propose Model-free Control for the slave-environment interaction: applying an RL approximation method to learn the optimal (minimal) contact force online.
04. Propose an optimal impedance for the human-master interaction: using IRL to optimize the prescribed impedance model parameters to assist with minimum human effort.
05. Propose Force Sensorless reflecting control for bilateral haptic teleoperation: a new AFOB is proposed to estimate the force signals, achieving great transparency (position and force feedback).

CONTENTS (next: Chapter 2, Modeling and platform setup based on the APAM configuration)

2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION

Mathematical modeling

The tensile force of each PAM [7], for a braid of thread length b wound in n turns (Fig. 2.1):

F = -P\frac{dV}{dL} = \frac{3L^2 - b^2}{4\pi n^2}\, P

Fig. 2.1 A cross-sectional view of the simplified geometric model

The motion equation of the system:

M\ddot{x} + B\dot{x} = -F_1 + F_2

Table 2.1 Parameters of the studied PAM
- Ps, supply pressure: 6.5 bar
- B, viscous damping coefficient: 5.0 Ns/mm
- M, mass of the plate: 0.5 kg
- L0, original length of the PAM: 175 mm
- n, number of turns of the thread: 104
- b, thread length: 200 mm

The motion equation of the system in control-affine form:

\ddot{x} = f_0(x) + g_0(x)u + d

where f_0(x), g_0(x) are the nominal values of f(x), g(x), built from the geometric term 3(L_0 - x)^2 - b^2, the nominal pressure P_0, and the damping term B\dot{x}/M, and d collects the unknown disturbances:

d = \big(f(x, \dot{x}) - f_0(x, \dot{x})\big) + \big(g(x) - g_0(x)\big)u + \delta
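A quick sketch of the tensile-force formula with the table's thread length and assumed values for the remaining quantities (turn count and pressure below are illustration values, not the identified ones):

```python
import math

def pam_force(P, L, b, n):
    """Tensile force of a McKibben PAM: F = P * (3 L^2 - b^2) / (4 pi n^2)."""
    return P * (3.0 * L**2 - b**2) / (4.0 * math.pi * n**2)

b = 0.200      # thread length [m], from Table 2.1
n = 3.0        # number of thread turns (assumed value for illustration)
P = 500e3      # gauge pressure [Pa] (5 bar, assumed)

F_long = pam_force(P, 0.175, b, n)    # near the original length L0 = 175 mm
F_short = pam_force(P, 0.120, b, n)   # strongly contracted
print(F_long, F_short)                # pulling force drops as the PAM contracts
```

The force vanishes at L = b/√3, which is why a PAM can only pull over a limited stroke; the antagonistic pair in the motion equation exploits this.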


2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION

Experimental modeling

The approximated PAM model is based on its pressure P_k and contraction strain \varepsilon_k (k = 1, 2):

M_k \ddot{\varepsilon}_k + b_k(P_k)\dot{\varepsilon}_k + k_k(P_k)\varepsilon_k = f_k(P_k)

with pressure-affine coefficients

b_k(P_k) = b_{k1}P_k + b_{k0}, \qquad k_k(P_k) = k_{k1}P_k + k_{k0}, \qquad f_k(P_k) = f_{k1}P_k + f_{k0}

Table 2.2 List of APAM model parameters
- Spring elements: k_{k1} = 117.32 N/m/bar, k_{k0} = 68 N/m
- Damping elements: b_{k1} = 9.62 Ns/m/bar, b_{k0} = 0.85 Ns/m
- Force elements: f_{k1} = 91.22 N/bar, f_{k0} = 4.73 N

Fig. 2.3 Experimental model for certain separate pressures (force [N] versus contraction length [mm], at pressures from 1 to 6 bar)
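The three-element model above can be sketched as a simple simulation; the restored decimal placements of the coefficients and the mass Mk below are assumptions for illustration only:

```python
# Three-element APAM model: Mk*eps'' + bk(P)*eps' + kk(P)*eps = fk(P),
# with pressure-affine coefficients. Decimal placement of the Table 2.2
# coefficients and the mass Mk are assumptions for illustration.
kk1, kk0 = 117.32, 68.0      # spring elements
bk1, bk0 = 9.62, 0.85        # damping elements
fk1, fk0 = 91.22, 4.73       # force elements
Mk = 0.5                     # moving mass (assumed)

def simulate(P, T=5.0, dt=1e-4):
    """Forward-Euler contraction response for a constant pressure P [bar]."""
    b, k, f = bk1 * P + bk0, kk1 * P + kk0, fk1 * P + fk0
    eps = deps = 0.0
    for _ in range(int(T / dt)):
        ddeps = (f - b * deps - k * eps) / Mk
        eps, deps = eps + dt * deps, eps * 0.0 + deps + dt * ddeps
    return eps               # settles at the static contraction f/k

print(simulate(3.0), simulate(5.0))   # higher pressure -> larger contraction
```

The static contraction f_k(P)/k_k(P) grows with pressure, matching the trend shown in Fig. 2.3.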

[Fig. 2.2 Block diagram of the experimental system: a PC with PCI-1711 and PCI-QUAD04 cards sends the control signal to proportional pressure regulator valves driving PAM 1 and PAM 2 over a pulley carrying the plate and load; a linear encoder returns the position signal.]

2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION

Platform setup

[Fig. 2.5 The APAM-based (a) master and (b) slave subsystem configurations. (a) Front/back views of the master: a control stick on a pulley driven by two pneumatic muscle actuators, with loadcells, an interactive loadcell, an optical encoder, and proportional pressure regulator valves. (b) The slave: an APAM system contacting a spring-damper environment through a loadcell, with PC/PCI cards and a control box exchanging the position, control, stiffness/damping, and contact force signals.]

2 MODELING OF THE BILATERAL TELEOPERATION SYSTEM

Platform setup

Fig. 2.6 Photograph of the experimental apparatus of the overall BHTS (master robot with control stick, slave robot, and environment)

Table 2.3 Specifications of the experimental devices
- PAM: Festo MAS-10-200N-AA-MC-O; nominal length 200 mm; inside diameter 10 mm; max pressure 8 bar
- PPR valve: Festo VPPM-8L-L-1-G14-0L8H; max pressure 8 bar
- DAQ cards: ADVANTECH PCI-1711 (AI/AO, 12-bit resolution); MEASUREMENT PCI-QUAD04
- Linear encoder: type Rational WTB5-0500MM; resolution 0.005 mm

CONTENTS (next: Chapter 3, Adaptive Gain ITSMC with TDE scheme for the position control)

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Proposed control idea
1. For the first time, the AITSMC-TDE scheme is proposed for a PAM system.
2. The stability of the controlled system is theoretically analyzed.
3. The control effectiveness is verified by experimental trials in different challenging working conditions (comparison with several controllers).

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Control design

A terminal sliding surface is chosen on the tracking errors:

s_2 = e_2 + k_1 e_1 + k_2\, \mathrm{sig}^{\gamma_1}(e_1)

A proposed integral terminal sliding surface s(t) augments s_2 with the integral of the nominal dynamics f_0(x_1, x_2) + g_0(x_1)u and the reference acceleration, which eliminates the reaching phase (s(0) = 0).

A robust control signal is designed on the surface, with an adaptive law for the robust gain \hat{\kappa}_3:

u_s(t) = -g_0^{-1}(x_1)\big(\eta_3 s + (\hat{\kappa}_3 + \kappa_4)\,\mathrm{sign}(s)\big)

Proof: see Appendix 3.2, 3.3.

[Block diagram of the proposed control approach: reference trajectory x_{1d}, x_{2d} → transfer coordinate → terminal and integral terminal sliding surfaces → control law (nominal signal u_n + robust signal u_s with the adaptation law) → the PAM system, with a time-delay estimator feeding back \hat{d}.]

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Control design (continued)

The auxiliary variables are defined as e_1 = x_1 - x_{1d}, \; e_2 = \dot{e}_1 = x_2 - x_{2d}.

The nominal control signal is designed from the nominal dynamics:

u_n = g_0^{-1}(x_1)\big(\dot{x}_{2d} - f_0(x_1, x_2) - k_1 e_2 - k_2\,\mathrm{sig}^{\gamma_1}(e_1)\big)

The lumped disturbance is estimated by time-delay estimation with delay t_d:

\hat{d} = \dot{x}_2(t - t_d) - f_0\big(x(t - t_d)\big) - g_0\big(x(t - t_d)\big)\, u(t - t_d)

so the control law can be rewritten as

u(t) = u_n + u_s - g^{-1}(x)\hat{d}

Proof: see Appendix 3.1, 3.4-3.5.
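The TDE idea can be sketched on a toy plant: the lumped disturbance is approximated by its value one sample earlier, recovered from the delayed acceleration and input (all values below are hypothetical):

```python
import math

# Toy plant xdd = f0 + g0*u + d(t). The TDE approximates d(t) by its value one
# sample earlier: d_hat(t) = xdd(t - td) - f0 - g0*u(t - td). Values hypothetical.
f0, g0 = -1.0, 2.0
d_true = lambda t: 0.5 * math.sin(0.5 * t)    # slowly varying disturbance
u = lambda t: math.sin(t)                     # known control input

td = 1e-3                                     # one-sample delay
errs = []
for k in range(1, 2000):
    t = k * td
    xdd_delayed = f0 + g0 * u(t - td) + d_true(t - td)   # "measured" delayed accel
    d_hat = xdd_delayed - f0 - g0 * u(t - td)            # TDE estimate of d(t)
    errs.append(abs(d_hat - d_true(t)))

print(max(errs))   # bounded by the one-step variation of d(t)
```

The estimation error is only the change of d over one sampling period, which is why a short delay t_d makes the TDE effective against slowly varying lumped uncertainties.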

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Experimental validation

Parameters:
- Proposed control: for the ITSMC, η1 = 3, η2 = 1.5, η3 = 2.5, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1 = γ2 = 5/7, k1 = 2, k2 = 5.5, α = 9/11; for the adaptive gain, μ = 10, δ = 0.05, σ = 0.1
- PID control: for case study 3, KP = 1.5, KI = 1.8, KD = 0.05; for the other cases, KP = 3.5, KI = 2.8, KD = 0.1

Case studies:
- Case study 1 (no load): xd (mm) = 8 sin(0.4πt)
- Case study 2 (no load): xd (mm) = 8 sin(πt)
- Case study 3 (no load): a multistep trajectory
- Case study 4 (with load): xd (mm) = 8 sin(πt), in turn with a 2-kg load and a 7-kg load

Fig. 3.1 Performance response in case study 1
Fig. 3.2 Performance response in case study 2

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Fig. 3.3 Performance response in case study 3
Fig. 3.4 Performance response in case study 4

Table 3.1 A quantitative comparison between the tracking performances

  Trajectory                    Metric   PID      ITSMC    Proposed
  Sine 0.2 Hz - 8 mm            L2[e]    0.4102   0.1737   0.1574
                                L2[u]    0.9961   1.0270   0.9602
  Sine 1 Hz - 8 mm              L2[e]    1.1845   0.4996   0.3788
                                L2[u]    1.1284   1.1152   1.0900
  Multistep                     L2[e]    0.8587   0.6846   0.6368
                                L2[u]    0.7670   0.7333   0.7247
  Sine 1 Hz - 8 mm, 2-kg load   L2[e]    1.2578   0.8314   0.4779
                                L2[u]    1.1414   1.1506   1.1287
  Sine 1 Hz - 8 mm, 7-kg load   L2[e]    1.5937   1.0827   0.5067
                                L2[u]    1.1613   1.1729   1.1584

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Summary
- Robust performance against uncertainties and disturbances.
- Finite-time convergence of the tracking errors.
- Reducing the chattering phenomenon.
- Eliminating the reaching phase.
- Fast transient response.

Next works will be directed to the control using the force properties.

CONTENTS (next: Chapter 4, Adaptive Finite-Time Force Sensorless Control scheme)

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Proposed control idea
1. The proposed FSOB scheme obtains force information with eliminated noise and static friction.
2. The FITSMC guarantees finite-time convergence; a new adaptive gain and the TDE are deployed to handle uncertainties and the chattering phenomenon online.
3. The reliability and superiority of the proposed control are demonstrated in different challenging working conditions.

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Control design

The FITSM surface is defined on the force tracking error e_f:

s(t) = e_f(t) + \int_0^t \big(\eta_1 e_f + \eta_2\,\mathrm{sig}^{v}(e_f)\big)\, d\tau

An adaptive switching gain law raises the switching gain while the surface is unconverged and shrinks it afterwards; the lumped disturbance is estimated online by the force-based TDE from the delayed control input u_c(t - t_d) and the delayed force estimate; and the contact force \hat{\tau}_e itself is obtained by a force observer based on the LeD [40] instead of a force sensor.

The control law can be rewritten as

u_c(t) = \hat{d}(t) + K^{-1}\big(\eta_1 e_f + \eta_2\,\mathrm{sig}^{v}(e_f) + \kappa_4 s + \hat{\kappa}_5\,\mathrm{sig}^{v}(s)\big)

Proof: see Appendix 4.1-4.3.

[Block diagram: reference trajectory and environment configuration → force error e_f → integral terminal sliding surface → control law with the adaptive law → APAM system (plant + environment), with the force-based time-delay estimator and the force observer feeding back \hat{d} and \hat{\tau}_e.]

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Experimental validation

Parameters:
- Proposed control: for the ITSMC, η1 = 3, η2 = 1.5, η3 = 2.5, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1 = γ2 = 5/7, k1 = 2, k2 = 5.5, v = 9/11; for the adaptive gain, μ = 10, δ = 0.05, σ = 0.1
- PID control: for case study 3, KP = 1.5, KI = 1.8, KD = 0.05; for the other cases, KP = 3.5, KI = 2.8, KD = 0.1
- Observers: for the FSOB, ω = 500 rad/s, ξ = 0.001, α1 = 20, α2 = 12, α3 = 5, α4 = 15

Scenarios:
- Scenario 1: τd (N) = 7.5 sin(0.2πt − π/2) + 6.25
- Scenario 2: τd (N) = 7.5 sin(0.8πt − π/2) + 6.25
- Scenario 3: a multistep trajectory with an LPF bandwidth of 0.2 rad/s

Fig. 4.1 Performance response in Scenario 1
Fig. 4.2 Performance response in Scenario 2

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Fig. 4.3 Performance response in Scenario 3
Fig. 4.4 Control gain

Table 4.1 An RMSE comparison between the tracking error performances

  Scenario      PID      ISMC     FITSMC-TDE   Proposed
  Sine 0.1 Hz   2.1050   1.2152   0.9440       0.5158
  Sine 0.4 Hz   3.2054   1.8045   2.0632       1.9807
  Multistep     2.6508   2.3548   2.0618       1.7685

Discussion:
- Improved control performances (fast response, high accuracy, and robustness).
- Guaranteed finite-time convergence.
- Cancelled the lumped uncertainties via the TDE solution and the online switching gain in the FITSMC.

CONTENTS (next: Chapter 5, Model-free control for Optimal contact force in Unknown Environment)

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Proposed control idea
1. The RL method is used to learn the optimal desired force online.
2. The stability of the closed-loop control is theoretically demonstrated by the Lyapunov approach.
3. The approach is verified as effective for robot interaction control in unknown environments.

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Control design (robot-specific inner loop, task-specific outer loop)

The environment is modeled as a spring-damper system [81]:

F_e(t) = -C_e \dot{x}(t) + K_e\big(x_e - x(t)\big)

The optimal impedance model maps the position error to the desired force, Z\big(e_r(t)\big) = F_d(t), and the force control shapes the closed-loop error dynamics M\ddot{e}_r + C\dot{e}_r + Ke_r with robust \mathrm{sig}^{v} terms against F_e(t).

The precise position control in task space tracks the force-corrected reference through a PI force loop:

x_f(t) = K_{Pf}\, e_f(t) + K_{If} \int_0^t e_f(\tau)\, d\tau

[Block diagram: the RL-based approximation supplies F_d; the prescribed impedance model turns e_F = F_d - F_e into x_r; the precise position control drives the APAM system against the environment configuration.]

The error dynamics are written in state-space form as

\dot{e}_r(t) = Ae_r(t) + BF_d(t)
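A minimal sketch of this environment model with the validation-slide parameters (sign convention exactly as in the equation above):

```python
# Spring-damper environment: Fe = -Ce*x_dot + Ke*(xe - x), slide parameters (mm units)
Ce, Ke, xe = 0.5, 2.0, 35.0

def contact_force(x, x_dot):
    return -Ce * x_dot + Ke * (xe - x)

# The force vanishes at the surface with no motion; pressing past the surface
# (x > xe) produces a nonzero reaction force with the sign set by this convention.
print(contact_force(35.0, 0.0), contact_force(39.917, 0.0))
```

With the slide's desired position x_d = 39.917 mm beyond the surface at x_e = 35 mm, the contact force is nonzero at the target, which is what the RL loop trades off against the position error.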

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Implementation procedure: LQR solution (offline)

The main objective is to determine the optimal desired force F_d, i.e., the minimum force that minimizes the position error e_r, as the feedback

F_d(t) = \kappa_e e_r(t)

The discounted cost function:

J_s\big(e_r(t), F_d(t)\big) = \int_t^{\infty} e^{-\gamma(\tau - t)}\big(e_r^T S e_r + F_d^T R F_d\big)\, d\tau

The algebraic Riccati equation and the feedback control gain:

A^T PA + S - P + A^T PB\kappa_e = 0, \qquad \kappa_e = -\big(B^T PB + R\big)^{-1} B^T PA

This is an offline solution: it requires the environment model.
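The offline solution can be sketched for the scalar error dynamics; Ad and Bd below come from a simple Euler discretization of the slide's A, B with sampling time T, which is an assumption about how the discrete model is obtained:

```python
# Scalar discrete-time LQR for e_r(k+1) = Ad*e_r(k) + Bd*Fd(k).
# Ad, Bd come from an assumed Euler discretization of A = -4.006, B = -0.002
# with sampling time T = 0.001; S and R are the slide's weights.
S, R, T = 1.0, 10.0, 0.001
Ad, Bd = 1.0 + (-4.006) * T, (-0.002) * T

P = S
for _ in range(10000):                 # fixed-point iteration of the ARE
    ke = -(Bd * P * Ad) / (Bd * P * Bd + R)
    P_next = S + Ad * P * Ad + Ad * P * Bd * ke
    if abs(P_next - P) < 1e-12:
        P = P_next
        break
    P = P_next
ke = -(Bd * P * Ad) / (Bd * P * Bd + R)

print(P, ke)    # Riccati fixed point and the feedback gain Fd = ke * e_r
```

This fixed-point view is what the RL/IRL approximation in the next slides replaces by online data, so that the environment-dependent A, B never need to be identified explicitly.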

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Implementation procedure: RL solution. With an unknown environment, the offline LQR solution is impossible for all time.

The discounted cost (value) function:

V_s\big(e_r(t)\big) = \int_t^{\infty} e^{-\gamma(\tau - t)}\, r\big(e_r(\tau), F_d(\tau)\big)\, d\tau, \qquad r\big(e_r, F_d\big) = e_r^T S e_r + F_d^T R F_d

The Hamilton-Jacobi-Bellman equation:

\min_{F_d}\Big( \dot{V}_s\big(e_r(t)\big) - \gamma V_s\big(e_r(t)\big) + r(t) \Big) = 0

The new algebraic Riccati equation:

e_r^T(t)\big(A^T P + PA + S - \gamma P\big)e_r(t) + 2e_r^T(t)PBF_d(t) + F_d^T(t)RF_d(t) = 0

The optimal desired force:

F_d(t) = -H_{22}^{-1} H_{21}\, e_r(t), \qquad Q\big(e_r(t), F_d(t)\big) = V_s\big(e_r(t)\big) = X^T H X

IRL approximation: an online solution using off-policy RL to find the solution of the given new ARE.

20-May-21 | 33Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

LQR solution RL solution

minimize temporal difference error

The approximator of the Q-value function

( )( ) ( )1

ˆ N

r d i r d i

i

Q e F e F =

=

The parameter update law

11( ) exp

2

T

i r d r i d i i r i d ie F e c F c e c F c minus = minus minus minus minus minus

( )

( ) ( ) ( ) ( )

ˆ ( ) ( ) ( )1( ) ( )

( )

=

= minus +

r d

t t t t

Q e t F t tt t

t t

IRL approximation

Q-function update law based on IRL

( )

( )

( )( ) ( ) ( )

( ) ( )

t tt

r dt

t

r d

Q e t F t r e d

e Q e t t F t t

+

minus minus

minus

=

+ + +

( ) ( ) ( ) ( ) ( )T T

r r d dr t e t Se t F t RF t= +

The reward function of the utility function

Implementation procedure

1 2 3 4 5 6 7 8

1

( )( )

( )

i r di r d N

i r di

e Fe F

e F

=

=

( ) ( )=d e rF t e t

20-May-21 | 34Presenter Cong Phat Vo

Fig 51 RL in unknown environments during the

learning process

Fig 52 Learning curves

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

κe = 00025 S = 1 R = 10 T = 0001 A = minus4006

B = minus0002

υ = 09 γ = 09 δΔ = 065 ε = 01 ε-decay = 099

K = 10 β = 01

LQR solution

RL aproximation

PositionForce controlKPf = 0001 KIf = 002 λ = 02 ηΔ = 01 ρ1 = 05

ρ2 = 20 v = 911 α = 57

EnvironmentCe = 05 Nsmm-1 Ke = 2 N mm-1 r0 = 165 mm

xd = 39917 mm xe = 35 mm

Parameters

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 35Presenter Cong Phat Vo

Fig 53 Desired forceposition tracking response

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Three steps are required to complete

the experiment

1-the identification of the environment

2-obtain the ARE solution using the

environmental estimation

3-the complete experiment for the

positionforce controller

IRL approximation

Achieving near-optimal performance without

knowledge of the environment dynamics

LQR solution

Validation results

1 2 3 4 5 6 7 8


20-May-21 | 37Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Proposed control idea
1. Find the optimal parameters of the impedance model to assure motion tracking and to assist the human in performing the task in real time with minimum effort.
2. The stability of the closed-loop control is theoretically demonstrated.
3. The unknown human dynamics as well as the optimal overall human-robot system efficiency are verified in experimental trials.


20-May-21 | 38Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Robot-specific inner loop / Task-specific outer loop

The control torque in task space:
  τ_s = K_s( ė_h + μ_1 e_h + μ_2 ∫_0^t e_h dτ ) + q̂(·) + K_B(‖Ẑ‖) Ẑ − K_h ξ

An approximation of the continuous function q(·) (NN functional approximation with weight update laws):
  q̂(·) = Ŵ_1ᵀ σ( Ŵ_2ᵀ q̄ )
  Ŵ̇_1 = F σ̂ rᵀ − k F ‖r‖ Ŵ_1,   Ŵ̇_2 = G q̄ ( σ̂′ᵀ Ŵ_1 r )ᵀ − k G ‖r‖ Ŵ_2

The prescribed impedance model:
  θ_m(s) = K_h / ( M_d s² + C_d s + K_d ) · τ_h(s)

[Control diagram: Human Operator → Prescribed Impedance Model → torque control law → APAM-based haptic interface, with the NN functional approximation and the IRL-based optimal law adapting the loop; signals τ_h, θ_d, θ_m, e_h, ξ, e_d]
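As a quick sanity check on the prescribed impedance model, here is a minimal simulation sketch. The parameter values are taken from the chapter-6 parameter slide (M_d = 1, C_d = B_d = 7.18, K_d = 15.76, K_h = 0.046); the constant forcing and time step are my own choices.

```python
# Prescribed impedance model: M_d*th'' + C_d*th' + K_d*th = K_h*tau_h
M_d, C_d, K_d, K_h = 1.0, 7.18, 15.76, 0.046

def simulate(tau_h, dt=1e-3, t_end=10.0):
    """Semi-implicit Euler response of the model to a constant human torque."""
    th, dth = 0.0, 0.0
    for _ in range(int(t_end / dt)):
        ddth = (K_h * tau_h - C_d * dth - K_d * th) / M_d
        dth += dt * ddth
        th += dt * dth
    return th
```

For a constant τ_h the model output settles at (K_h / K_d)·τ_h, i.e. the static stiffness of the prescribed impedance.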

Control design


20-May-21 | 39Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Robot-specific inner loop / Task-specific outer loop

The human transfer function [99] (purpose: map τ_h → e_d):
  G_h(s) = ( κ_p + κ_d s ) / ( k_e s + 1 )

written in state-space form:
  τ̇_h = A_h τ_h + B_h ē_d,   A_h = −k_e⁻¹,   B_h = k_e⁻¹ [κ_p, κ_d],   ē_d = [e_d, ė_d]ᵀ

A new augmented state X = [ē_dᵀ, τ_h]ᵀ:
  Ẋ = A X + B u_e,   A = [ A_q  B_q ; B_h  A_h ],   B = [ B_e ; 0 ],   u_e = −K X
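Assuming the state-space reading of the human transfer function above (A_h = −1/k_e, B_h = [κ_p, κ_d]/k_e, with the gains κ_p = 20, κ_d = 10, k_e = 0.18 s quoted on the chapter-6 parameter slide), a small simulation sketch:

```python
import numpy as np

# Human operator model tau_h = (kappa_p + kappa_d*s)/(k_e*s + 1) * e_d,
# realized as tau_h' = A_h*tau_h + B_h @ [e_d, e_d'].
kappa_p, kappa_d, k_e = 20.0, 10.0, 0.18
A_h = -1.0 / k_e
B_h = np.array([kappa_p, kappa_d]) / k_e

def simulate(e_d, de_d, dt=1e-4, t_end=3.0):
    """Forward-Euler response of the human torque to a constant tracking error."""
    tau_h = 0.0
    for _ in range(int(t_end / dt)):
        tau_h += dt * (A_h * tau_h + B_h @ np.array([e_d, de_d]))
    return tau_h
```

For a constant e_d with ė_d = 0 the torque settles at κ_p·e_d, the DC gain of G_h(s).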


IRL based on LQR solution

Control design


20-May-21 | 40Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design

The performance index (the infinite-horizon quadratic cost) to minimize:
  J_m = ∫_t^∞ ( ē_dᵀ Q_d ē_d + τ_hᵀ Q_h τ_h + u_eᵀ R u_e ) dτ

With the augmented state X = [ē_dᵀ, τ_h]ᵀ, Ẋ = AX + Bu_e, the cost takes the form
  V(X(t)) = ∫_t^∞ Xᵀ( Q + Kᵀ R K ) X dτ = Xᵀ P X

RL solution: solve the algebraic Riccati equation [83]
  0 = Aᵀ P + P A + Q − P B R⁻¹ Bᵀ P

for the optimal feedback control
  u_e = arg min_{u_e(t)} V(t, X(t)) = −R⁻¹ Bᵀ P X

where P is the real symmetric positive definite solution of the Lyapunov equation
  (A − BK)ᵀ P + P (A − BK) = −( Q + Kᵀ R K )

IRL solution: the IRL Bellman equation for a time interval ΔT > 0,
  V(X(t)) = ∫_t^{t+ΔT} Xᵀ( Q + Kᵀ R K ) X dτ + V(X(t + ΔT))

can be written as
  Xᵀ(t) P X(t) = ∫_t^{t+ΔT} Xᵀ( Q + Kᵀ R K ) X dτ + Xᵀ(t+ΔT) P X(t+ΔT)

Proof of convergence: see Appendix 6.2
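The ARE/Bellman relationship above can be checked numerically on a scalar stand-in plant; the values of a, b, Q, R below are illustrative, not the thesis system.

```python
import math

# Scalar plant x' = a*x + b*u with cost integrand Q*x^2 + R*u^2.
a, b, Q, R = -1.0, 2.0, 1.0, 1.0

# ARE 0 = 2aP + Q - P^2 b^2 / R  ->  positive root.
P = (a + math.sqrt(a * a + b * b * Q / R)) * R / (b * b)
K = b * P / R            # optimal gain, u = -K x
a_c = a - b * K          # closed-loop pole (stable)

def bellman_residual(x0=1.0, T=0.5, n=200000):
    """IRL Bellman check: x0^2*P should equal the running cost over
    [0, T] plus the terminal value x(T)^2 * P (up to Euler error)."""
    dt = T / n
    cost, x = 0.0, x0
    for _ in range(n):
        cost += dt * (Q + K * K * R) * x * x
        x += dt * a_c * x
    return x0 * x0 * P - (cost + x * x * P)
```

The residual vanishes up to discretization error, which is exactly the identity the IRL scheme exploits to learn P without knowing A.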

20-May-21 | 41Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design

Online implementation (Fig 6.1, continuous-time linear policy iteration algorithm):

1. Start with an admissible (stabilizing) control input u_e⁰ = −K¹X; initialize P⁰ = 0, i = 1.
2. Policy evaluation: given a control policy uⁱ, find Pⁱ from the off-policy Bellman equation
   Xᵀ(t) Pⁱ X(t) = ∫_t^{t+ΔT} Xᵀ( Q + (Kⁱ)ᵀ R Kⁱ ) X dτ + Xᵀ(t+ΔT) Pⁱ X(t+ΔT),
   solving for the cost parameters pⁱ by least squares.
3. Policy improvement: update the control policy u_e^{i+1} = −R⁻¹ Bᵀ Pⁱ X.
4. If ‖pⁱ − pⁱ⁻¹‖ falls below a tolerance, stop; otherwise set i → i + 1 and return to step 2.
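The policy-iteration loop can be sketched for a scalar plant, where the Lyapunov-equation policy-evaluation step has a closed form (this is the model-based Kleinman form of the iteration; the IRL version replaces the Lyapunov solve with the least-squares Bellman fit, and the plant values here are illustrative):

```python
import math

# Scalar plant and weights (illustrative; note a > 0, open-loop unstable).
a, b, Q, R = 1.0, 1.0, 1.0, 1.0

def policy_iteration(K0=2.0, tol=1e-10, max_iter=50):
    """Policy evaluation via the Lyapunov equation 2(a - b*K)*P = -(Q + K^2*R),
    then policy improvement K <- b*P/R, repeated until P converges."""
    K, P_prev = K0, None
    for _ in range(max_iter):
        assert a - b * K < 0, "policy must be admissible (stabilizing)"
        P = (Q + K * K * R) / (2.0 * (b * K - a))   # Lyapunov solve
        K = b * P / R                               # policy improvement
        if P_prev is not None and abs(P - P_prev) < tol:
            break
        P_prev = P
    return P, K
```

Starting from any stabilizing gain, the iterates converge (quadratically) to the ARE root P* = (a + sqrt(a² + b²Q/R))·R/b².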

20-May-21 | 42Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 6.2 The tracking error between the trajectory of the robot system and the prescribed impedance model

Fig 6.3 Reference trajectory versus the state of the robot system during and after learning

Validation results (simulation)

Parameters
  LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
  RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
  Position control: K_s = 20, F = 10, G = 1, μ_1 = μ_2 = 5, k = 0.1, Z = 100, κ_p = 20, κ_d = 10, k_e = 0.18 s
  Impedance model: θ_d = 10 sin(0.2πt), M_d = 1 kg, B_d = 7.18 N·s·mm⁻¹, K_d = 15.76 N·mm⁻¹, K_h = 0.046


20-May-21 | 43Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 6.4 The output trajectory and impedance model in the inner loop

Fig 6.5 The desired trajectory and output trajectory in the outer loop

Fig 6.6 Interaction force

Validation results (experiment)

Discussion
- Nearly perfect performance of the human-robot cooperation
- Only a small deviation between the desired and actual trajectories
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller



20-May-21 | 45Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Proposed control idea
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed.
2. The stability and finite-time convergence characteristic of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified under different working conditions.


20-May-21 | 46Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Control design

The modified FNTSM surface design:
  s_i = ė_i + λ_i ζ(e_i),  where
  ζ(e_i) = |e_i|^{p_i/q_i} sign(e_i),       if s̄_i = 0, or s̄_i ≠ 0 and |e_i| ≥ ς_i
  ζ(e_i) = z_{1i} e_i + z_{2i} |e_i| e_i,   if s̄_i ≠ 0 and |e_i| < ς_i

The torque control design by FNTSMC:
  u_i(t) = M_i θ̈_{ir} + C_i θ̇_i + K_i θ_i − τ̂_{id} − ρ_{1i} s_i − ρ_{2i} |s_i|^{ν_i} sign(s_i)

The external torque estimation (first-order observer form):
  τ̂_{id} = ξ_i + λ_i M_i θ̇_i,   ξ̇_i = −λ_i ( ξ_i + λ_i M_i θ̇_i ) − λ_i ( u_i − C_i θ̇_i − K_i θ_i )
with an adaptive FSOB gain κ_i updated online.
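The piecewise surface term can be sketched as follows. The exponent p/q = 5/7 and the threshold ς = 10⁻⁴ come from the parameter slide; the particular choice of z1, z2 (matching value and slope of the fractional-power term at |e| = ς) is my assumption about the modified design, made so the switch is smooth.

```python
import math

# Modified fast nonsingular terminal sliding surface (sketch):
# s = de + lam * zeta(e), with a polynomial patch near e = 0 that
# avoids the singularity of the fractional-power term's derivative.
lam, p, q, eps = 8.0, 5.0, 7.0, 1e-4

def zeta(e):
    """Fractional-power term, replaced by a C^1-matched polynomial inside |e| < eps."""
    r = p / q
    if abs(e) >= eps:
        return math.copysign(abs(e) ** r, e)
    z1 = (2.0 - r) * eps ** (r - 1.0)   # value/slope continuity at |e| = eps
    z2 = (r - 1.0) * eps ** (r - 2.0)
    return z1 * e + z2 * e * abs(e)

def surface(e, de):
    return de + lam * zeta(e)
```

With these z1, z2, both ζ and dζ/de are continuous at the switching boundary, so the surface stays smooth while the fractional power still yields finite-time convergence away from the origin.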

[Teleoperation diagram: Operator ↔ Master controller → APAM master system, and Slave controller → APAM slave system ↔ Environment, each side with its torque estimator, linked by the communication channel with an impedance filter shaping the reflected force; signals θ_m, θ_s, τ̂_md, τ̂_sd, f_m]

Control design


20-May-21 | 47Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig 7.1 Torque estimation performances

Parameters
  Proposed control: λ_i = 8, p_i = 5, q_i = 7, ς_i = 10⁻⁴, ν_i = 9/11, ε_i = 0.1, ρ_1i = 10, ρ_2i = 2
  PID control: K_P = 38, K_I = 25, K_D = 0.15
  Observers: for NDO, L_h = L_e = 10; for RTOB, β = 350 rad/s; for AFOB, κ_i(0) = 0.1, λ_i = 10, η_i = 0, m = 0.1, a = 10
  Scenarios: the free space situation, θ_d(t) = 5π(1 − 0.18t) sin(0.72πt); the contact and recovery situation, soft environment K_e = 500 N·mm⁻¹, B_e = 4 N·s·mm⁻¹, stiff environment K_e = 2800 N·mm⁻¹, B_e = 20 N·s·mm⁻¹

The RMSEs associated with the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.

Validation results


20-May-21 | 48Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig 7.2 Position tracking performances

Fig 7.2 Position tracking error performances

Fig 7.2 Transparency performance in "soft" contact

Fig 7.2 Transparency error in "soft" contact

Fig 7.2 Transparency performance in "stiff" contact

Fig 7.2 Transparency error in "stiff" contact

Validation results


20-May-21 | 49Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Summary
- Finite-time convergence of the tracking errors
- Elimination of the uncertainties and noise effects in the obtained force sensing via the new AFOB
- Robust performance against uncertainties and disturbances in the system dynamics
- Stability of the overall system and the transparency performance attained simultaneously
- Fast transient response

The great transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and under different working conditions.



20-May-21 | 51Presenter Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions; its uncertainties and disturbances are handled.
- Validated the significant effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding a highly effective position control purpose.
- Achieved good transparency performance, with both force feedback and position tracking simultaneously, by the new adaptive force estimation combined with separate FNTSMC schemes.


20-May-21 | 52Presenter Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Future works

- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both the master and slave sides; note also that the computational complexity should be reduced by limiting the data size.
- Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device for the rehabilitation field with a safer solution.


20-May-21 | 53

A PRESENTATION OF DOCTORAL DISSERTATION

Presenter Cong Phat Vo

Department of Mechanical Engineering University of Ulsan Korea

Thank you for attending



20-May-21 | 33Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

LQR solution RL solution

minimize temporal difference error

The approximator of the Q-value function

( )( ) ( )1

ˆ N

r d i r d i

i

Q e F e F =

=

The parameter update law

11( ) exp

2

T

i r d r i d i i r i d ie F e c F c e c F c minus = minus minus minus minus minus

( )

( ) ( ) ( ) ( )

ˆ ( ) ( ) ( )1( ) ( )

( )

=

= minus +

r d

t t t t

Q e t F t tt t

t t

IRL approximation

Q-function update law based on IRL

( )

( )

( )( ) ( ) ( )

( ) ( )

t tt

r dt

t

r d

Q e t F t r e d

e Q e t t F t t

+

minus minus

minus

=

+ + +

( ) ( ) ( ) ( ) ( )T T

r r d dr t e t Se t F t RF t= +

The reward function of the utility function

Implementation procedure

1 2 3 4 5 6 7 8

1

( )( )

( )

i r di r d N

i r di

e Fe F

e F

=

=

( ) ( )=d e rF t e t

20-May-21 | 34Presenter Cong Phat Vo

Fig 51 RL in unknown environments during the

learning process

Fig 52 Learning curves

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

κe = 00025 S = 1 R = 10 T = 0001 A = minus4006

B = minus0002

υ = 09 γ = 09 δΔ = 065 ε = 01 ε-decay = 099

K = 10 β = 01

LQR solution

RL aproximation

PositionForce controlKPf = 0001 KIf = 002 λ = 02 ηΔ = 01 ρ1 = 05

ρ2 = 20 v = 911 α = 57

EnvironmentCe = 05 Nsmm-1 Ke = 2 N mm-1 r0 = 165 mm

xd = 39917 mm xe = 35 mm

Parameters

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 35Presenter Cong Phat Vo

Fig 53 Desired forceposition tracking response

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Three steps are required to complete

the experiment

1-the identification of the environment

2-obtain the ARE solution using the

environmental estimation

3-the complete experiment for the

positionforce controller

IRL approximation

Achieving near-optimal performance without

knowledge of the environment dynamics

LQR solution

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 36Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

4

5

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Optimized Human-Robot Interaction force control using IRL6

1 2 3 4 5 6 7 8

20-May-21 | 37Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

3 The unknown human dynamics as well as the optimal overall

human-robot system efficiency is verified in experimental trials

2

Find the optimal parameters of the impedance model to

assure motion tracking and also assist the human with

minimum effort to perform the task in real time

The stability of the closed-loop control is theoretically

demonstrated

1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 38Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design: robot-specific inner loop and task-specific outer loop

The control torque in task space combines feedback on the tracking error e_h with an NN functional approximation of the unknown continuous human-arm dynamics; the NN weight matrices W^ are updated online by adaptive laws.

The prescribed impedance model, from human torque τ_h to model trajectory θ_m:
θ_m(s)/τ_h(s) = K_h / (M_d s^2 + C_d s + K_d)

Block diagram: Human Operator → torque control law → APAM-based haptic interface in the inner loop; prescribed impedance model, NN functional approximation, and IRL-based optimal law in the outer loop (signals: τ_h, θ, θ_d, θ_m, e_h, e_d, ξ; gain K_h).
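The prescribed impedance model is a second-order filter from the human torque to the model trajectory. A minimal time-domain sketch, assuming the parameter values from the later validation slide with decimals restored (M_d = 1, C_d = 7.18, K_d = 15.76, K_h = 0.046):

```python
# Time response of the prescribed impedance model
#   M_d*th'' + C_d*th' + K_d*th = K_h*tau_h
# via forward Euler. Parameter values are restored-decimal readings from the
# validation slide and are assumptions, not a verified setup.

M_d, C_d, K_d, K_h = 1.0, 7.18, 15.76, 0.046

def simulate(tau_h, dt=1e-3, T=20.0):
    """Integrate the impedance model for a constant human torque tau_h."""
    th, w = 0.0, 0.0                       # model angle and rate
    for _ in range(int(T / dt)):
        a = (K_h * tau_h - C_d * w - K_d * th) / M_d
        th, w = th + dt * w, w + dt * a
    return th

theta_m = simulate(tau_h=1.0)              # settles near K_h / K_d
```

For a constant torque the model settles at K_h/K_d, which is how the outer loop shapes how much trajectory deviation a given human effort produces.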


6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design: IRL based on the LQR solution (task-specific outer loop)

The human transfer function [99] (purpose: τ_h → e_d) is realized as a first-order state model on the human state ξ_h:
ξ_h' = A_h ξ_h + B_h e_d

A new augmented state X, stacking the impedance-model error dynamics e_d' = A_q e_d + B_q ξ_h and the human state ξ_h, obeys
X' = A X + B u_e, with the feedback control u_e = -K X.


6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design

The infinite-horizon quadratic cost (performance index) to minimize:
J_m = ∫_t^∞ ( e_d^T Q_d e_d + ξ_h^T Q_h ξ_h + u_e^T R u_e ) dτ

RL solution: with the augmented state X' = A X + B u_e and u_e = -K X, solve the algebraic Riccati equation [83]
0 = A^T P + P A + Q - P B R^-1 B^T P,
which gives the optimal feedback control
u_e = arg min_{u_e} V(t, X(t)) = -R^-1 B^T P X.

The value function satisfies
V(X(t)) = ∫_t^∞ X^T (Q + K^T R K) X dτ = X^T P X,
where P is the real symmetric positive-definite solution of
(A - B K)^T P + P (A - B K) = -(Q + K^T R K).

IRL solution: for a time interval ΔT > 0, the IRL Bellman equation
V(X(t)) = ∫_t^{t+ΔT} X^T (Q + K^T R K) X dτ + V(X(t + ΔT))
can be written as
X(t)^T P X(t) = ∫_t^{t+ΔT} X^T (Q + K^T R K) X dτ + X(t + ΔT)^T P X(t + ΔT).

Proof of convergence: see Appendix 6.2.
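The IRL Bellman identity can be checked numerically along an optimal trajectory. A scalar sketch (A = -1, B = 0.5, Q = R = 1 are arbitrary illustrative values, not the thesis setup):

```python
# Numerical check of the scalar IRL Bellman identity
#   P x(t)^2 = int_t^{t+dT} (Q + K^2 R) x^2 dtau + P x(t+dT)^2
# along the optimal closed loop. Values are illustrative assumptions.
import numpy as np
from scipy.linalg import solve_continuous_are

A, B, Q, R = -1.0, 0.5, 1.0, 1.0
P = solve_continuous_are([[A]], [[B]], [[Q]], [[R]])[0, 0]
K = B * P / R                         # optimal gain, u = -K x

Acl = A - B * K                       # closed loop: x' = Acl * x
x0, dT = 2.0, 0.7
x1 = x0 * np.exp(Acl * dT)            # state after the interval

w = Q + K * R * K                     # integrand weight Q + K^T R K
integral = w * x0**2 * (np.exp(2 * Acl * dT) - 1) / (2 * Acl)

lhs = P * x0**2                       # V(x(t))
rhs = integral + P * x1**2            # running cost plus V(x(t+dT))
```

The two sides agree to machine precision, which is exactly the property the online algorithm exploits: P can be fitted from sampled data over [t, t+ΔT] without knowing A.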


6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design: online implementation

Fig. 6.1 Continuous-time linear policy iteration algorithm:
1. Initialization: start with an admissible control input u_e^0 = -K^0 X, with P^0 = 0, i = 1, and K^0 chosen so that A - B K^0 is stable.
2. Policy evaluation: given the control policy u^i, find P^i from the off-policy Bellman equation
X(t)^T P^i X(t) = ∫_t^{t+ΔT} X^T (Q + (K^i)^T R K^i) X dτ + X(t + ΔT)^T P^i X(t + ΔT),
solving for the cost parameters p^i by least squares over N sampled intervals.
3. Policy improvement: update the control policy u^{i+1} = -R^-1 B^T P^i X.
Set i → i + 1 and repeat steps 2-3 until ||p^i - p^{i-1}|| ≤ ε.
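When the model (A, B) is known, the evaluation/improvement loop of Fig. 6.1 reduces to the classical Kleinman iteration, with the least-squares Bellman fit replaced by a Lyapunov solve. A minimal sketch on an illustrative second-order system (the matrices are assumptions, not the dissertation's model):

```python
# Model-based analogue of the Fig. 6.1 loop (Kleinman policy iteration):
# policy evaluation via a Lyapunov equation, then policy improvement
# K = R^-1 B^T P. (A, B, Q, R) are illustrative assumptions.
import numpy as np
from scipy.linalg import solve_continuous_are, solve_continuous_lyapunov

A = np.array([[0.0, 1.0], [-2.0, -3.0]])   # stable, so K = 0 is admissible
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

K = np.zeros((1, 2))
for i in range(20):
    Acl = A - B @ K
    # Policy evaluation: (A - B K)^T P + P (A - B K) = -(Q + K^T R K)
    P = solve_continuous_lyapunov(Acl.T, -(Q + K.T @ R @ K))
    K_new = np.linalg.solve(R, B.T @ P)    # policy improvement
    if np.linalg.norm(K_new - K) < 1e-10:  # ||p_i - p_{i-1}|| stopping test
        break
    K = K_new

P_are = solve_continuous_are(A, B, Q, R)   # the iterates converge to this P
```

The IRL version replaces the Lyapunov solve with the least-squares fit of P^i from measured trajectory segments, so no knowledge of A is needed.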


6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Validation results: simulation

Fig. 6.2 Tracking error between the robot-system trajectory and the prescribed impedance model
Fig. 6.3 Reference trajectory versus the state of the robot system during and after learning

Parameters
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = -4.006, B = -0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: Ks = 20, F = 10, G = 1, μ1 = μ2 = 5, k = 0.1, ||Z|| = 100, κp = 20, κd = 10, ke = 0.18 s
- Impedance model: θd = 10 sin(0.2πt), Md = 1 kg, Bd = 7.18 N·s/mm, Kd = 15.76 N/mm, Kh = 0.046


6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Validation results: experiment

Fig. 6.4 The output trajectory and impedance model in the inner loop
Fig. 6.5 The desired trajectory and output trajectory in the outer loop
Fig. 6.6 Interaction force

Discussion
- Near-perfect performance of the human-robot cooperation.
- A small deviation between the desired and actual trajectory.
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller.


CONTENTS

1. Introduction
2. Modeling and platform setup based on the APAM configuration
3. Adaptive Gain ITSMC with TDE scheme
4. Adaptive Finite-Time Force Sensorless Control scheme
5. Model-free control for Optimal contact force in Unknown Environment
6. Optimized Human-Robot Interaction force control using IRL
7. Force Sensorless Reflecting Control for BHTS
8. Conclusion and Future works


7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Proposed control idea
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed.
2. The stability and finite-time convergence characteristics of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified in different working conditions.


7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Control design

The external torque estimation: an adaptive force observer (AFOB) reconstructs the operator and environment torques τ^_md, τ^_sd on the master and slave sides without force sensors; its observer gain κ_i is adapted online only while the sliding variable stays outside a small boundary layer.

The modified FNTSM surface design: on each side i ∈ {m, s}, a fast nonsingular terminal sliding surface of the form s_i = e_i' + λ_i e_i^{p_i/q_i} is defined on the tracking error e_i.

The torque control design by style-FNTSMC: the control law combines the nominal dynamics terms (M_i, C_i, K_i), the estimated external torque τ^_id, and the robust reaching terms ρ_1i sign(s_i) + ρ_2i s_i.

Block diagram: Operator → Master controller → APAM master system (with estimator), coupled to the Slave controller → APAM slave system (with estimator) → Environment through an impedance filter; the estimated torques τ^_md, τ^_sd are exchanged to reflect forces between the two sides.
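A generic fast terminal sliding-mode loop of the kind described above can be sketched on a double-integrator stand-in for the joint error dynamics (the gains λ = 8, p/q = 5/7, ρ1 = 10, ρ2 = 2 follow the parameter slide; the plant itself is an assumption, not the APAM model):

```python
# Fast terminal sliding-mode sketch: s = de + lam*sig(e, p/q),
# u = -(rho1*sign(s) + rho2*s), on double-integrator error dynamics.
# Gains follow the parameter slide; the plant is an illustrative assumption.
import math

def sig(x, a):
    """Signed fractional power |x|^a * sign(x) (keeps results real)."""
    return math.copysign(abs(x) ** a, x)

lam, pq, rho1, rho2 = 8.0, 5 / 7, 10.0, 2.0
e, de, dt = 1.0, 0.0, 1e-4            # initial tracking error and rate
for _ in range(int(5.0 / dt)):        # 5 s of simulated time
    s = de + lam * sig(e, pq)         # FNTSM surface
    u = -(rho1 * math.copysign(1.0, s) + rho2 * s)
    e, de = e + dt * de, de + dt * u  # double-integrator error dynamics
```

Once the surface s = 0 is reached, the error obeys e' = -λ e^{p/q}, whose fractional power is what yields finite-time (rather than merely exponential) convergence.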


7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results

Fig. 7.1 Torque estimation performances

Parameters
- Proposed control: λi = 8, pi = 5, qi = 7, ςi = 10^-4, νi = 9/11, εi = 0.1, ρ1i = 10, ρ2i = 2
- PID control: KP = 38, KI = 25, KD = 0.15
- Observers: for NDO, Lh = Le = 10; for RTOB, β = 350 rad/s; for AFOB, κi(0) = 0.1, λi = 10, ηi = 0, m = 0.1, a = 10

The RMSEs associated with the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.

Scenarios
- Free space: θd(t) = 5π(1 - 0.18t) sin(0.72πt)
- Contact and recovery: soft environment Ke = 500 N/mm, Be = 4 N·s/mm; stiff environment Ke = 2800 N/mm, Be = 20 N·s/mm
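The RMSE figures quoted above use the standard root-mean-square error over the recorded estimation error. A minimal helper (the signals below are placeholders, not the experiment's data):

```python
# Root-mean-square error between an estimated and a reference signal,
# as used to compare NDO, RTOB, and the proposed observer. Sample data
# here are made-up placeholders, not the experiment's recordings.
import math

def rmse(estimate, reference):
    """RMSE of two equal-length sequences."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(estimate, reference))
                     / len(reference))

error = rmse([0.0, 1.1, 2.1], [0.0, 1.0, 2.0])   # placeholder signals
```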


7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results

Fig. 7.2 (a) Position tracking performance; (b) position tracking error; (c) transparency performance in "soft" contact; (d) transparency error in soft contact; (e) transparency performance in "stiff" contact; (f) transparency error in stiff contact


7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Summary
- Finite-time convergence of the tracking errors
- Elimination of the uncertainties and noise effects in the obtained force sensing via the new AFOB
- Robust performance against uncertainties and disturbances in the system dynamics
- Stability of the overall system and transparency performance attained simultaneously
- Fast transient response

Great transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and in different working conditions.


CONTENTS

1. Introduction
2. Modeling and platform setup based on the APAM configuration
3. Adaptive Gain ITSMC with TDE scheme
4. Adaptive Finite-Time Force Sensorless Control scheme
5. Model-free control for Optimal contact force in Unknown Environment
6. Optimized Human-Robot Interaction force control using IRL
7. Force Sensorless Reflecting Control for BHTS
8. Conclusion and Future works


8 CONCLUSION AND FUTURE WORKS

Conclusions
- Verified an effective position tracking controller via the AITSMC with TDE technique under various working conditions; its uncertainties and disturbances are handled.
- Validated the effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures good position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding highly effective position control.
- Achieved good transparency performance, with both force feedback and position tracking simultaneously, by the new adaptive force estimation combined with the separate FNTSMC schemes.


8 CONCLUSION AND FUTURE WORKS

Future works
- To improve the control performance in joint space, the robot system model can be improved by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both master and slave sides; the computational complexity should also be reduced with the data size.
- Based on the prototype developed in this thesis, a high-quality commercial teleoperation device for the rehabilitation field can be developed with a safer solution.


20-May-21 | 53

A PRESENTATION OF DOCTORAL DISSERTATION

Presenter: Cong Phat Vo

Department of Mechanical Engineering, University of Ulsan, Korea

Thank you for attending


EnvironmentCe = 05 Nsmm-1 Ke = 2 N mm-1 r0 = 165 mm

xd = 39917 mm xe = 35 mm

Parameters

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 35Presenter Cong Phat Vo

Fig 53 Desired forceposition tracking response

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Three steps are required to complete

the experiment

1-the identification of the environment

2-obtain the ARE solution using the

environmental estimation

3-the complete experiment for the

positionforce controller

IRL approximation

Achieving near-optimal performance without

knowledge of the environment dynamics

LQR solution

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 36Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

4

5

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Optimized Human-Robot Interaction force control using IRL6

1 2 3 4 5 6 7 8

20-May-21 | 37Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

3 The unknown human dynamics as well as the optimal overall

human-robot system efficiency is verified in experimental trials

2

Find the optimal parameters of the impedance model to

assure motion tracking and also assist the human with

minimum effort to perform the task in real time

The stability of the closed-loop control is theoretically

demonstrated

1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 38Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

( )1 20

ˆ ( )

ˆ (|| || )

= + + +

+ + minus

t

s h h h

q B h h

K e e e d

K Z Z K

The control torque in task space

1 2

1 2 1

2 1 2

ˆ ˆˆ ( ) ( )

ˆ ˆˆ ˆ ˆ || ||

ˆˆ ˆ ˆ( ) || ||

=

= minus minus = minus

T T

T T T

h h h

T

h h

q W W q

W F F W q kF W

W Gq W kG W

2

h hm

d d d

K

M s C s K

=

+ +

An approximation of the continuous functionThe prescribed

impedance model

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

Control design

1 2 3 4 5 6 7 8

20-May-21 | 39Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

1

1

[ ]

minus

minus

= +

= minus

=

=

h h h h d

h e

h e p d

T T T

d d d

A B e

A k

B k

e e e

( )1

+=

+

p d

e

sG s

k s

The human transfer function [99] Purpose τh rarr ed

0

0

= + = +

= minus

q d qd e

e

h h hh e

A e Be X AX Buu

B A u KX

A new augmented state

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

IRL based on LQR solution

Control design

1 2 3 4 5 6 7 8

20-May-21 | 40Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design

The performance index (infinite-horizon quadratic cost) to minimize:
J_m = ∫_t^∞ (e_dᵀ Q_d e_d + ξ_hᵀ Q_h ξ_h + u_eᵀ R u_e) dτ

is solved through the algebraic Riccati equation [83]:
0 = Aᵀ P + P A + Q − P B R⁻¹ Bᵀ P

The optimal feedback control:
u_e = arg min_{u_e} V(t, X(t)) = −R⁻¹ Bᵀ P X

With the augmented state X (Ẋ = AX + Bu_e, u_e = −KX), the value function satisfies
V(X(t)) = ∫_t^∞ Xᵀ(Q + Kᵀ R K) X dτ = Xᵀ P X,
where P is the real symmetric positive-definite solution of the Lyapunov equation
(A − BK)ᵀ P + P (A − BK) = −(Q + Kᵀ R K).

The IRL Bellman equation for a time interval ΔT > 0:
V(X(t)) = ∫_t^{t+ΔT} Xᵀ(Q + Kᵀ R K) X dτ + V(X(t + ΔT)),
which can be written as
Xᵀ(t) P X(t) = ∫_t^{t+ΔT} Xᵀ(Q + Kᵀ R K) X dτ + Xᵀ(t + ΔT) P X(t + ΔT).

Proof of convergence: see Appendix 6.2.
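The interval Bellman identity can be sanity-checked on a scalar system, where the closed-loop trajectory has a closed form. A minimal sketch (all numbers hypothetical, not the thesis values):

```python
import math

def bellman_residual(a, b, q, r, k, p, T=0.5, x0=1.0):
    """Residual of the interval Bellman identity
        x(t)^2 p = integral_t^{t+T} (q + r k^2) x(tau)^2 dtau + x(t+T)^2 p
    along the closed loop x' = (a - b k) x, evaluated in closed form.
    It vanishes exactly when p solves the Lyapunov equation
        2 (a - b k) p = -(q + r k^2)."""
    ac = a - b * k                         # closed-loop pole
    xT = x0 * math.exp(ac * T)             # x(t + T)
    integral = (q + r * k * k) * x0 * x0 * (math.exp(2 * ac * T) - 1) / (2 * ac)
    return x0 * x0 * p - integral - xT * xT * p

# Scalar toy problem: a = -1, b = 1, q = r = 1, policy k = 0.5.
# Lyapunov solution: p = (q + r k^2) / (-2 (a - b k)) = 1.25 / 3.
print(bellman_residual(-1.0, 1.0, 1.0, 1.0, 0.5, 1.25 / 3.0))
```

With the Lyapunov-consistent p the residual is zero (up to floating-point error); any other p leaves a nonzero gap, which is exactly what the IRL least-squares step exploits.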

20-May-21 | 41Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 6.1. Continuous-time linear policy-iteration algorithm.

1. Start with an admissible (stabilizing) control input u_e⁰ = −K¹X (initialization: P⁰ = 0, i = 1, with (A − BK¹) stable).
2. Policy evaluation: given the control policy uⁱ, find Pⁱ from the off-policy Bellman equation
Xᵀ(t) Pⁱ X(t) = ∫_t^{t+ΔT} Xᵀ(Q + (Kⁱ)ᵀ R Kⁱ) X dτ + Xᵀ(t + ΔT) Pⁱ X(t + ΔT),
solving for the cost parameters pⁱ by least squares over N samples.
3. Policy improvement: update the control policy u_e^{i+1} = −R⁻¹ Bᵀ Pⁱ X.

Online implementation: repeat steps 2-3 (i → i + 1) until ‖pⁱ − p^{i−1}‖ ≤ ε, then stop.

20-May-21 | 42Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 6.2. Tracking error between the trajectory of the robot system and the prescribed impedance model.
Fig 6.3. Reference trajectory versus the state of the robot system during and after learning.

Validation results (simulation). Parameters:
- LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: K_s = 20, F = 10, G = 1, μ₁ = μ₂ = 5, k = 0.1, Z = 100, κ_p = 20, κ_d = 10, and k_e = 0.18 s
- Impedance/human model: θ_d = 10 sin(0.2πt), M_d = 1 kg, B_d = 7.18 N·s·mm⁻¹, K_d = 15.76 N·mm⁻¹, K_h = 0.046

20-May-21 | 43Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 6.4. The output trajectory and the impedance model in the inner loop.
Fig 6.5. The desired trajectory and the output trajectory in the outer loop.
Fig 6.6. Interaction force.

Validation results (experiment). Discussion:
- Near-perfect performance of the human-robot cooperation.
- Only a small deviation between the desired and actual trajectories.
- The human interaction force is reduced after learning, and the optimal set of prescribed-impedance-model parameters is found by the outer-loop controller.

20-May-21 | 44Presenter Cong Phat Vo

CONTENTS

1. Introduction
2. Modeling and platform setup based on the APAM configuration
3. Adaptive Gain ITSMC with TDE scheme for the position control
4. Adaptive Finite-Time Force Sensorless Control scheme with TDE
5. Model-free control for Optimal contact force in Unknown Environment
6. Optimized Human-Robot Interaction force control using IRL
7. Force Sensorless Reflecting Control for BHTS
8. Conclusion and Future works

20-May-21 | 45Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Proposed control idea:
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is developed.
2. The stability and finite-time convergence of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified in different working conditions.

20-May-21 | 46Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Control design

The modified FNTSM surface s_i is defined for each side i ∈ {m, s} (master, slave) from the tracking error and its derivative (gain λ_i).

The torque control design in FNTSMC style: the control torque combines the nominal robot dynamics terms (M_i, C_i, K_i), the estimated external torque τ̂_id, and fast finite-time reaching terms ρ_{1i}|s_i|^{ν_i} sign(s_i) + ρ_{2i} s_i.

The external torque estimation: τ̂_id is produced by a first-order observer driven by the filtered system dynamics, so no force/torque sensor is required.

An adaptive FSOB gain: the observer gain κ_i is switched piecewise, increasing while s_i stays away from the surface and frozen or reduced otherwise, with threshold parameters μ_i and ε_i.

Block diagram: on the master side, the operator drives the APAM master system through the master controller and its estimator (signals θ_m, τ̂_md, f_m); on the slave side, the slave controller and its estimator drive the APAM slave system against the environment (signals θ_s, θ_sd, τ̂_sd); an impedance filter shapes the torques reflected between the two sides.

20-May-21 | 47Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig 7.1. Torque estimation performances.

Validation results. Parameters:
- Proposed control: λ_i = 8, p_i = 5, q_i = 7, ς_i = 10⁻⁴, ν_i = 9/11, ε_i = 0.1, ρ_{1i} = 10, ρ_{2i} = 2
- PID control: K_P = 38, K_I = 25, K_D = 0.15
- Observers: for the NDO, L_h = L_e = 10; for the RTOB, β = 350 rad/s; for the AFOB, κ_i(0) = 0.1, λ_i = 10, η_i = 0, m = 0.1, a = 10

Scenarios:
- Free space: θ_d(t) = 5π(1 − 0.18t) sin(0.72πt)
- Contact and recovery: soft environment K_e = 500 N·mm⁻¹, B_e = 4 N·s·mm⁻¹; stiff environment K_e = 2800 N·mm⁻¹, B_e = 20 N·s·mm⁻¹

The RMSEs associated with the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.

20-May-21 | 48Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results:
Fig 7.2. Position tracking performances.
Fig 7.3. Position tracking error performances.
Fig 7.4. Transparency performance in "soft" contact.
Fig 7.5. Transparency error in "soft" contact.
Fig 7.6. Transparency performance in "stiff" contact.
Fig 7.7. Transparency error in "stiff" contact.

20-May-21 | 49Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Summary:
- Finite-time convergence of the tracking errors.
- Uncertainties and noise effects in the recovered force signal are eliminated via the new AFOB.
- Robust performance against uncertainties and disturbances in the system dynamics.
- Stability of the overall system and transparency performance are attained simultaneously.
- Fast transient response.

Great transparency performance and global stability of the BHTS are achieved under the proposed control scheme despite the uncertainties and across the different working conditions.


20-May-21 | 51Presenter Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Conclusions:
- Verified an effective position-tracking controller via the AITSMC with the TDE technique under various working conditions; its uncertainties and disturbances are handled.
- Validated the significant effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position- and force-tracking performance.
- Optimized the prescribed-impedance-model parameters by the IRL approach, requiring little operator information while achieving a highly effective position-control purpose.
- Achieved good transparency with both force feedback and position tracking simultaneously by the new adaptive force estimation combined with separate FNTSMC schemes.

20-May-21 | 52Presenter Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Future works

- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both master and slave sides; the computational complexity should also be reduced by limiting the data size.
- Based on the prototype developed in this thesis, a high-quality commercial teleoperation device for the rehabilitation field can be developed with a safer solution.

20-May-21 | 53

A PRESENTATION OF DOCTORAL DISSERTATION

Presenter Cong Phat Vo

Department of Mechanical Engineering University of Ulsan Korea

Thank you for attending


Page 10: A PRESENTATION OF DOCTORAL DISSERTATION

20-May-21 | 10Presenter Cong Phat Vo

1 INTRODUCTION

Research Objectives

01. Propose a position tracking controller: an integral terminal-style SMC is designed with the combination of an adaptive gain and the TDE technique.
02. Propose a finite-time force controller: an adaptive-gain fast ITSMC-TDE scheme is established with a friction-free disturbance technique in finite time.
03. Propose model-free control for the slave-environment interaction: applying an RL approximation method to learn the optimal (minimal) contact force online.
04. Propose an optimal impedance for the human-master interaction: using IRL to optimize the prescribed impedance model parameters so as to assist with minimum human effort.
05. Propose force-sensorless reflecting control for bilateral haptic teleoperation: a new AFOB is proposed to estimate the force signals, achieving great transparency (position and force feedback).


20-May-21 | 12Presenter Cong Phat Vo

2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION

Mathematical modeling

Fig 2.1. A cross-sectional view of the simplified geometric model (a braided shell of n turns, thread length b, diameter D; one turn winds nπD).

The tensile force of each PAM [7], from the virtual-work relation:
F = −P dV/dL = P (3L² − b²) / (4πn²)

The motion equation of the system:
M ẍ + B ẋ = −F₁ + F₂

Table 2.1. Parameters of the studied PAM:
| Parameter | Description | Value | Unit |
| P_s | Supply pressure | 6.5 | bar |
| B | Viscous damping coefficient | 50 | N·s/mm |
| M | Mass of the plate | 0.5 | kg |
| L₀ | Original length of the PAM | 175 | mm |
| n | Number of turns of the thread | 104 | — |
| b | Thread length | 200 | mm |

In state-space form, the motion equation becomes
ẋ₂ = f₀(x) + g₀(x) u + d,
where f₀(x), g₀(x) are the nominal values of f(x), g(x) (functions of the contraction length L₀ − x, the thread length b, the turn count n, and the damping B/M), and
d = (f(x) − f₀(x)) + (g(x) − g₀(x)) u + (unmodeled effects)
is the total of the unknown disturbances.
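The static McKibben force relation above is easy to evaluate directly; a minimal sketch in SI units (the geometry values below are illustrative, not the studied PAM's):

```python
import math

def pam_tensile_force(P, L, b, n):
    """McKibben-PAM static tensile force from the virtual-work relation
    F = -P dV/dL = P (3 L^2 - b^2) / (4 pi n^2); SI units (Pa, m -> N)."""
    return P * (3.0 * L ** 2 - b ** 2) / (4.0 * math.pi * n ** 2)

# The force vanishes at the natural length L = b / sqrt(3) and grows as the
# muscle is stretched toward L = b (illustrative geometry: b = 0.2 m, n = 10).
b, n = 0.20, 10.0
print(pam_tensile_force(5e5, b / math.sqrt(3), b, n))  # ~0 at natural length
```

This captures the qualitative behavior used in the thesis: at fixed pressure the available force drops monotonically as the muscle contracts, reaching zero at the fully contracted (natural) length.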

20-May-21 | 13Presenter Cong Phat Vo

2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION

Experimental modeling

The approximated PAM model, based on its pressure and contraction strain, is a pressure-dependent spring-damper-force (three-element) system:
M ẍ_k + b_k(P_k) ẋ_k + k_k(P_k) x_k = f_k(P_k),  (k = 1, 2)
with coefficients affine in pressure:
b_k(P_k) = b_{k1} P_k + b_{k0},  k_k(P_k) = k_{k1} P_k + k_{k0},  f_k(P_k) = f_{k1} P_k + f_{k0}.

Table 2.2. List of APAM model parameters:
| Element | Parameter | Value | Units |
| Spring | k_{k1} | 117.32 | N·m⁻¹·bar⁻¹ |
| Spring | k_{k0} | 0.68 | N·m⁻¹ |
| Damping | b_{k1} | 9.62 | N·m⁻¹·s·bar⁻¹ |
| Damping | b_{k0} | 0.85 | N·m⁻¹·s |
| Force | f_{k1} | 91.22 | N·bar⁻¹ |
| Force | f_{k0} | 4.73 | N |

Fig 2.3. Experimental model for certain separate pressures (force [N] versus contraction length [mm] at pressures from 1 to 6 bar; force up to roughly 500 N).

Fig 2.2. Block diagram of the experimental system: PAM 1 and PAM 2 drive a plate over a pulley under load; proportional pressure regulator valves receive the pressure signal; a PCI-1711 card outputs the control signal and a PCI-QUAD04 card reads the linear-encoder position signal.
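The three-element model above is a linear second-order system for each fixed pressure, so its response can be sketched with a forward-Euler integration. The coefficient values mirror Table 2.2 (decimal placement reconstructed, so treat them as illustrative):

```python
def simulate_pam(P, dt=1e-3, t_end=2.0, M=0.5,
                 bk=(9.62, 0.85), kk=(117.32, 0.68), fk=(91.22, 4.73)):
    """Forward-Euler simulation of the pressure-dependent three-element model
    M x'' + b(P) x' + k(P) x = f(P), with b, k, f affine in pressure P [bar].
    Returns the contraction x after t_end seconds (steady state ~ f(P)/k(P))."""
    b = bk[0] * P + bk[1]
    k = kk[0] * P + kk[1]
    f = fk[0] * P + fk[1]
    x, v = 0.0, 0.0
    for _ in range(int(t_end / dt)):
        a = (f - b * v - k * x) / M    # Newton's second law for the plate
        v += a * dt
        x += v * dt
    return x

print(simulate_pam(3.0))  # settles near f(3)/k(3)
```

The simulation settles at the static balance f(P)/k(P), reproducing the structure of Fig 2.3: higher pressure gives both a larger force offset and a stiffer spring term.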

20-May-21 | 14Presenter Cong Phat Vo

2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION

Platform setup

Fig 2.5. The APAM-based (a) master and (b) slave subsystem configurations.
(a) Master: control stick, loadcells, pneumatic muscle actuators, pulley, optical encoder, interactive loadcell, and proportional pressure regulator valves (front and back views).
(b) Slave: the APAM system (PAM and pulley) contacts an environment built from a spring, a damper, and a loadcell through a proportional valve; position, control, stiffness/damping, and contact-force signals are exchanged with the PC through PCI cards in a control box.

20-May-21 | 15Presenter Cong Phat Vo

2 MODELING OF THE BILATERAL TELEOPERATION SYSTEM

Platform setup

Fig 2.6. Photograph of the experimental apparatus: overview of the BHTS (master robot with control stick, slave robot, and environment).

Table 2.3. Specifications of the experimental devices:
- PAM: Festo MAS-10-200N-AA-MC-O; nominal length 200 mm, inside diameter 10 mm, max pressure 8 bar
- PPR valve: Festo VPPM-8L-L-1-G14-0L8H; max pressure 8 bar
- DAQ cards: ADVANTECH PCI-1711 (AI/AO, 12-bit resolution); MEASUREMENT PCI-QUAD04
- Linear encoder: Rational WTB5-0500MM; resolution 0.005 mm


20-May-21 | 17Presenter Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Proposed control idea

1. For the first time, the AITSMC-TDE scheme is proposed for a PAM system.
2. The stability of the controlled system is theoretically analyzed.
3. The control effectiveness is verified by experimental trials in different challenging working conditions, comparing several controllers.

20-May-21 | 18Presenter Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Control design

A terminal sliding surface is chosen on the tracking errors, s = e₂ + k₁ e₁ + k₂ (fractional-power term in e₁), giving finite-time error convergence on the surface.

A proposed integral terminal sliding surface is then designed by integrating the closed-loop error dynamics so that the surface is zero at t = 0, which eliminates the reaching phase.

A robust control signal is designed as u_s = −g₀⁻¹(x) (η̂₃ + κ₃(·) + κ₄) sign(·), with an adaptive law for the robust gain η̂₃ driven by the distance to the surface, so no a-priori disturbance bound is needed.

Proof: see Appendix 3.2 and 3.3 for more detail.

Block diagram of the proposed control approach: reference trajectory (x_{1d}, x_{2d}) → transfer coordinate → terminal sliding surface → integral terminal sliding surface → control law u = u_n + u_s (nominal plus robust signal, with the adaptation law) → the PAM system, with a time-delay estimator feeding back the lumped-disturbance estimate.

20-May-21 | 19Presenter Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

The auxiliary (error) variables are defined as
e₁ = x₁ − x_{1d},  e₂ = ė₁ = x₂ − x_{2d}.

The nominal control signal is designed to cancel the known dynamics:
u_n = g₀⁻¹(x₁, x₂) [ẋ_{2d} − f₀(x₁, x₂) − (surface-feedback terms in e₁, e₂)].

The time-delay estimation of the lumped disturbance uses the most recent sample of the measured dynamics:
d̂(t) = ẋ₂(t − t_d) − f₀(x(t − t_d)) − g₀(x(t − t_d)) u(t − t_d),

so the control law can be rewritten as
u(t) = u_n + u_s − g₀⁻¹(x) d̂.

Proof: see Appendix 3.1 and 3.4-3.5 for more detail.

(The block diagram is the same as on the previous slide.)
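The TDE idea, estimating the current lumped disturbance from the previous sample of acceleration and input, can be shown on a toy double integrator. Everything below (plant, gains, disturbance) is a hypothetical illustration of the mechanism, not the thesis system:

```python
import math

def tde_demo(dt=1e-3, steps=2000):
    """Closed-loop demo of time-delay estimation:
        d_hat(t) = x''(t - td) - f0(t - td) - g0 * u(t - td),
    here with f0 = 0, g0 = 1, td = one sample. The estimate lags the true
    disturbance by exactly one sample, so slow disturbances are cancelled."""
    g0 = 1.0
    x = v = 0.0
    u_prev = a_prev = 0.0
    err = 0.0
    for i in range(steps):
        t = i * dt
        d = math.sin(2 * math.pi * t)     # unknown lumped disturbance
        d_hat = a_prev - g0 * u_prev      # TDE from the previous sample
        u = -d_hat - 2.0 * v - 5.0 * x    # cancel d_hat + PD regulation
        a = g0 * u + d                    # plant: x'' = g0 u + d
        v += a * dt
        x += v * dt
        err = abs(d - d_hat)              # estimation error this step
        u_prev, a_prev = u, a
    return err

print(tde_demo())  # residual error ~ |d(t) - d(t - dt)|, i.e. very small
```

The residual error equals the one-sample change of the disturbance, which is the well-known limitation of TDE: it cancels slowly varying lumped terms almost perfectly and degrades only for fast disturbance dynamics.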

20-May-21 | 20Presenter Cong Phat Vo

Parameters

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Experimental validation

Proposed control: for the ITSMC, η₁ = 3, η₂ = 15, η₃ = 25, κ₁ = κ₃ = 10, κ₂ = κ₄ = 2, γ₁ = γ₂ = 5/7, k₁ = 2, k₂ = 55, α = 9/11; for the adaptive gain, μ = 10, δ = 0.05, σ = 0.1.
PID control: for case study 3, K_P = 15, K_I = 18, K_D = 0.05; for the other cases, K_P = 35, K_I = 28, K_D = 0.1.

Case studies:
1 (no load): x_d (mm) = 8 sin(0.4πt)
2 (no load): x_d (mm) = 8 sin(πt)
3 (no load): a multistep trajectory
4 (with load): x_d (mm) = 8 sin(πt), in turn with a 2-kg load and a 7-kg load

Fig 3.1. Performance response in case study 1.
Fig 3.2. Performance response in case study 2.

20-May-21 | 21Presenter Cong Phat Vo

Fig 3.3. Performance response in case study 3.
Fig 3.4. Performance response in case study 4.

Table 3.1. A quantitative comparison between the tracking performances (L₂ norms):
| Case | Metric | PID | ITSMC | Proposed |
| Sine 0.2 Hz, 8 mm | L₂[e] | 0.4102 | 0.1737 | 0.1574 |
| | L₂[u] | 0.9961 | 1.0270 | 0.9602 |
| Sine 1 Hz, 8 mm | L₂[e] | 1.1845 | 0.4996 | 0.3788 |
| | L₂[u] | 1.1284 | 1.1152 | 1.0900 |
| Multistep | L₂[e] | 0.8587 | 0.6846 | 0.6368 |
| | L₂[u] | 0.7670 | 0.7333 | 0.7247 |
| Sine 1 Hz, 8 mm, 2-kg load | L₂[e] | 1.2578 | 0.8314 | 0.4779 |
| | L₂[u] | 1.1414 | 1.1506 | 1.1287 |
| Sine 1 Hz, 8 mm, 7-kg load | L₂[e] | 1.5937 | 1.0827 | 0.5067 |
| | L₂[u] | 1.1613 | 1.1729 | 1.1584 |

20-May-21 | 22Presenter Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Summary:
- Robust performance against uncertainties and disturbances.
- Finite-time convergence of the tracking errors.
- Reduced chattering phenomenon.
- Eliminated reaching phase.
- Fast transient response.

Next works will be directed to control using the force properties.


20-May-21 | 24Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Proposed control idea:
1. The proposed FSOB scheme obtains force information with noise and static friction eliminated.
2. The FITSMC guarantees finite-time convergence; a new adaptive gain and the TDE are deployed to handle uncertainties and the chattering phenomenon online.
3. The reliability and superiority of the proposed control are demonstrated in different challenging working conditions.

20-May-21 | 25Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Control design

The FITSM surface is defined with a fractional-power integral term on the force tracking error e_f:
σ(t) = ė_f + ∫₀ᵗ [λ₁ e_f + λ₂ |e_f|^v sign(e_f)] dτ,  0 < v < 1,
which yields finite-time convergence of e_f once σ = 0.

The control law u_c(t) combines the TDE disturbance estimate d̂, surface feedback with gain K, and fractional-power sign terms (gains ρ̂₄, ρ̂₅); an adaptive switching-gain law raises the switching gain while |σ| grows and lowers it near the surface, avoiding gain overestimation. The lumped disturbance is estimated one sample back, d̂(t) built from d̂(t − t_d), u_c(t − t_d), and the measured force error. The force observer follows the LeD structure [40] (friction-free FSOB).

Proof: see Appendix 4.1-4.3.

Block diagram of the proposed control approach: reference trajectory → integral terminal sliding surface → control law (with the adaptive law) → APAM system in contact with the environment configuration; a force-based time-delay estimator and the force observer (plant plus environment) close the loop.
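The role of the fractional exponent 0 < v < 1 in the surface can be seen on the scalar reaching law x' = -k |x|^v sign(x): unlike the exponential (v = 1) case, the state hits zero in finite time t* = |x(0)|^(1-v) / (k (1-v)). A minimal sketch with hypothetical k:

```python
def finite_time_settle(x0=1.0, k=2.0, v=9/11, dt=1e-4, t_max=10.0):
    """Integrate x' = -k |x|^v sign(x) until |x| drops below 1e-6.
    For 0 < v < 1 this happens in finite time; for v = 1 the decay is
    only exponential and takes much longer to reach the same threshold."""
    x, t = x0, 0.0
    while abs(x) > 1e-6 and t < t_max:
        x -= k * (abs(x) ** v) * (1.0 if x > 0 else -1.0) * dt
        t += dt
    return t

print(finite_time_settle())          # ~2.5 s with v = 9/11
print(finite_time_settle(v=1.0))     # much slower for the linear law
```

With v = 9/11 (the exponent used throughout the thesis) the settling time stays close to the analytic bound 11/4 s, while the v = 1 law needs roughly ln(1e6)/k seconds to cross the same threshold.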

20-May-21 | 26Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Experimental validation. Parameters:
- Proposed control: for the ITSMC, η₁ = 3, η₂ = 15, η₃ = 25, κ₁ = κ₃ = 10, κ₂ = κ₄ = 2, γ₁ = γ₂ = 5/7, k₁ = 2, k₂ = 55, v = 9/11; for the adaptive gain, μ = 10, δ = 0.05, σ = 0.1
- PID control: for case study 3, K_P = 15, K_I = 18, K_D = 0.05; for the other cases, K_P = 35, K_I = 28, K_D = 0.1
- Observers: for the FSOB, ω = 500 rad/s, ξ = 0.001, α₁ = 20, α₂ = 12, α₃ = 5, α₄ = 15

Scenarios:
1: τ_d (N) = 7.5 sin(0.2πt − π/2) + 6.25
2: τ_d (N) = 7.5 sin(0.8πt − π/2) + 6.25
3: a multistep trajectory with an LPF bandwidth of 0.2 rad/s

Fig 4.1. Performance response in scenario 1.
Fig 4.2. Performance response in scenario 2.

20-May-21 | 27Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Fig 4.3. Performance response in scenario 3.
Fig 4.4. Control gain.

Table 4.1. An RMSE comparison between the tracking-error performances:
| Scenario | PID | ISMC | FITSMC-TDE | Proposed |
| Sine 0.1 Hz | 2.1050 | 1.2152 | 0.9440 | 0.5158 |
| Sine 0.4 Hz | 3.2054 | 1.8045 | 2.0632 | 1.9807 |
| Multistep | 2.6508 | 2.3548 | 2.0618 | 1.7685 |

Discussion:
- Improved control performance: fast response, high accuracy, and robustness.
- Guaranteed finite-time convergence.
- The lumped uncertainties are cancelled online via the TDE solution and the switching gain in the FITSMC.


20-May-21 | 29Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Proposed control idea:
1. The RL method is used to learn the optimal desired force online.
2. The stability of the closed-loop control is theoretically demonstrated by the Lyapunov approach.
3. The approach is verified to be effective for robot interaction control in unknown environments.

20-May-21 | 30Presenter Cong Phat Vo

Control design (robot-specific inner loop / task-specific outer loop)

The environment is modeled as a spring-damper system [81]:
F_e(t) = −C_e ẋ(t) + K_e (x_e − x(t)).

The optimal impedance model maps the position error to the desired force, Z(e_r(t)) = F_d(t), and the force control imposes impedance error dynamics in M, C, K on e_r, driven by the contact force F_e with a fractional-power sign(·) robust term.

The position control in task space adds a PI correction on the force error:
x_f(t) = K_{Pf} e_f(t) + K_{If} ∫₀ᵗ e_f(τ) dτ.

The resulting force-error dynamics are linear:
ė_r(t) = A e_r(t) + B F_d(t).

Block diagram: in the task-specific outer loop, the prescribed impedance model and the RL-based approximation (output ϕ) generate the reference x_r from F_d and F_e (error e_F); in the robot-specific inner loop, precise position control drives the APAM system against the environment configuration, tracking x_r + x_f toward x_d.
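The interplay of the PI force loop and the spring-like environment can be sketched with an idealized inner position loop (modeled here as a first-order lag, which is an assumption; gains echo the parameter list later in this chapter):

```python
def force_pi_demo(Fd=5.0, Ke=2.0, xe=35.0,
                  Kpf=0.001, Kif=0.02, dt=1e-3, t_end=150.0):
    """Outer PI force loop x_f = Kpf*e_f + Kif*int(e_f) commanding position
    against a quasi-static spring environment F_e = Ke * (x - xe); the sign
    convention (positive F_e when pressing past xe) and the first-order
    inner-loop model (bandwidth lam) are illustrative assumptions."""
    lam = 20.0                     # hypothetical inner-loop bandwidth [rad/s]
    x, integ, Fe = xe, 0.0, 0.0
    for _ in range(int(t_end / dt)):
        Fe = Ke * (x - xe)         # contact force from the environment spring
        ef = Fd - Fe               # force tracking error
        integ += ef * dt
        xcmd = xe + Kpf * ef + Kif * integ
        x += lam * (xcmd - x) * dt # inner loop tracks the commanded position
    return Fe

print(force_pi_demo())  # converges toward the desired contact force Fd
```

The integral term drives the force error to zero, so the contact force settles at F_d with a time constant of roughly (1 + K_e K_{Pf}) / (K_e K_{If}) seconds; the RL machinery on the next slides then chooses F_d itself optimally.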

20-May-21 | 31Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Implementation procedure: LQR solution (offline)

The main objective of this work is to determine the optimal desired force F_d, i.e. the minimum force that minimizes the position error e_r:
F_d(t) = κ_e e_r(t).

The discounted cost function:
J_s(e_r(t), F_d(t)) = ∫_t^∞ e^{−γ(τ−t)} (e_rᵀ S e_r + F_dᵀ R F_d) dτ.

The algebraic Riccati equation (discrete-time form):
P = Aᵀ P A + S − Aᵀ P B (Bᵀ P B + R)⁻¹ Bᵀ P A,

with the feedback control gain
κ_e = −(Bᵀ P B + R)⁻¹ Bᵀ P A.

This is an offline solution: it requires knowledge of the model (A, B).
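The offline Riccati solution and gain can be computed by simple fixed-point iteration; a scalar sketch, assuming the listed A, B are continuous-time values discretized by Euler at the sample time T (an assumption about the slide's convention):

```python
def dlqr_scalar(ad, bd, s, r, iters=10000):
    """Iterate the discrete-time Riccati recursion for scalar dynamics
    e[k+1] = ad*e[k] + bd*F[k]:
        P <- ad*P*ad + s - (ad*P*bd)^2 / (bd*P*bd + r),
    then return the optimal gain F = k * e with
        k = -(ad*P*bd) / (bd*P*bd + r)."""
    p = s
    for _ in range(iters):
        p = ad * p * ad + s - (ad * p * bd) ** 2 / (bd * p * bd + r)
    k = -(ad * p * bd) / (bd * p * bd + r)
    return k, p

# Euler discretization of the scalar error dynamics with the slide's values
T = 0.001
ad, bd = 1.0 + (-4.006) * T, (-0.002) * T
k, p = dlqr_scalar(ad, bd, s=1.0, r=10.0)
print(k, p)
```

With |ad| < 1 the recursion converges to the unique stabilizing P; the point of the following slides is that the same P (and hence κ_e) can instead be found from data when A and B are unknown.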

20-May-21 | 32Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Implementation procedure: RL solution (online)

The LQR solution requires the model (A, B), which is impossible to know for all time; the RL solution removes this requirement while still determining the optimal desired force F_d = κ_e e_r minimizing the position error e_r.

The discounted value function:
V_s(e_r(t)) = ∫_t^∞ e^{−γ(τ−t)} r(e_r(τ), F_d(τ)) dτ,
where r(e_r, F_d) = e_rᵀ S e_r + F_dᵀ R F_d.

The Hamilton-Jacobi-Bellman equation:
min_{F_d} [ V̇_s(e_r(t)) − γ V_s(e_r(t)) + r(t) ] = 0,

which yields the new algebraic Riccati equation
e_rᵀ(t) (Aᵀ P + P A + S − γP) e_r(t) + 2 e_rᵀ(t) P B F_d(t) + F_dᵀ(t) R F_d(t) = 0.

Defining the quadratic form Q(e_r(t), F_d(t)) = V_s(e_r(t)) = Xᵀ H X with X = [e_rᵀ F_dᵀ]ᵀ, the optimal desired force is
F_d(t) = −H₂₂⁻¹ H₂₁ e_r(t).

This is an online solution: off-policy RL finds the solution of the given new ARE from measured data.

20-May-21 | 33Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Implementation procedure: IRL approximation

The Q-value function is approximated by N normalized radial basis functions:
Q̂(e_r, F_d) = Σ_{i=1}^N θ_i ϕ̄_i(e_r, F_d),
with Gaussian kernels
ϕ_i(e_r, F_d) = exp(−½ [e_r − c_i; F_d − c_i]ᵀ Σ_i⁻¹ [e_r − c_i; F_d − c_i])
normalized as ϕ̄_i(e_r, F_d) = ϕ_i(e_r, F_d) / Σ_{j=1}^N ϕ_j(e_r, F_d).

The Q-function update law based on IRL, over an interval Δt:
Q(e_r(t), F_d(t)) = ∫_t^{t+Δt} e^{−γ(τ−t)} r(e_r(τ)) dτ + e^{−γΔt} Q(e_r(t + Δt), F_d(t + Δt)),
with the reward (utility) function
r(t) = e_rᵀ(t) S e_r(t) + F_dᵀ(t) R F_d(t).

The parameter update law adjusts θ along β δ(t) ∂Q̂/∂θ with a normalizing denominator, minimizing the temporal-difference error δ(t) of the IRL equation; the learned policy replaces the model-based gain in F_d(t) = κ_e e_r(t).
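The normalized-RBF approximation and the TD-style parameter step can be sketched compactly (centers, width, and the simplified unnormalized gradient step below are hypothetical choices, not the thesis settings):

```python
import math

def nrbf(z, centers, width=1.0):
    """Normalized Gaussian RBF features over z = (e_r, F_d), mirroring the
    phibar_i = phi_i / sum_j phi_j construction on the slide."""
    raw = [math.exp(-sum((zi - ci) ** 2 for zi, ci in zip(z, c))
                    / (2.0 * width ** 2)) for c in centers]
    s = sum(raw)
    return [r / s for r in raw]

def td_step(theta, z, reward, z_next, centers, gamma=0.9, beta=0.1):
    """One gradient step on Q(z) = theta . phibar(z) reducing the
    temporal-difference error delta = r + gamma * Q(z') - Q(z)."""
    f, f_next = nrbf(z, centers), nrbf(z_next, centers)
    q = sum(t * a for t, a in zip(theta, f))
    q_next = sum(t * a for t, a in zip(theta, f_next))
    delta = reward + gamma * q_next - q
    return [t + beta * delta * a for t, a in zip(theta, f)]

centers = [(0.0, 0.0), (1.0, 1.0)]
theta = td_step([0.0, 0.0], (0.0, 0.0), 1.0, (1.0, 1.0), centers)
print(theta)
```

Because the features are normalized they sum to one everywhere, so each update redistributes credit smoothly among nearby centers; repeating the step along trajectories drives Q̂ toward the IRL fixed point on the slide.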

20-May-21 | 34Presenter Cong Phat Vo

Fig 5.1. RL in unknown environments during the learning process.
Fig 5.2. Learning curves.

Validation results. Parameters:
- LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position/force control: K_{Pf} = 0.001, K_{If} = 0.02, λ = 0.2, ηΔ = 0.1, ρ₁ = 0.5, ρ₂ = 20, v = 9/11, α = 5/7
- Environment: C_e = 0.5 N·s·mm⁻¹, K_e = 2 N·mm⁻¹, r₀ = 165 mm, x_d = 39.917 mm, x_e = 35 mm

20-May-21 | 35Presenter Cong Phat Vo

Fig 5.3. Desired force/position tracking responses (LQR solution versus IRL approximation).

Validation results. Three steps are required to complete the experiment:
1. Identification of the environment.
2. Obtaining the ARE solution using the environment estimate.
3. The complete experiment with the position/force controller.

The IRL approximation achieves near-optimal performance without knowledge of the environment dynamics.


20-May-21 | 37Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Proposed control idea:
1. The stability of the closed-loop control is theoretically demonstrated.
2. The optimal parameters of the impedance model are found so as to assure motion tracking and also assist the human with minimum effort, in real time.
3. The unknown human dynamics, as well as the optimal overall human-robot system efficiency, is verified in experimental trials.

20-May-21 | 38Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

( )1 20

ˆ ( )

ˆ (|| || )

= + + +

+ + minus

t

s h h h

q B h h

K e e e d

K Z Z K

The control torque in task space

1 2

1 2 1

2 1 2

ˆ ˆˆ ( ) ( )

ˆ ˆˆ ˆ ˆ || ||

ˆˆ ˆ ˆ( ) || ||

=

= minus minus = minus

T T

T T T

h h h

T

h h

q W W q

W F F W q kF W

W Gq W kG W

2

h hm

d d d

K

M s C s K

=

+ +

An approximation of the continuous functionThe prescribed

impedance model

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

Control design

1 2 3 4 5 6 7 8

20-May-21 | 39Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

1

1

[ ]

minus

minus

= +

= minus

=

=

h h h h d

h e

h e p d

T T T

d d d

A B e

A k

B k

e e e

( )1

+=

+

p d

e

sG s

k s

The human transfer function [99] Purpose τh rarr ed

0

0

= + = +

= minus

q d qd e

e

h h hh e

A e Be X AX Buu

B A u KX

A new augmented state

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

IRL based on LQR solution

Control design

1 2 3 4 5 6 7 8

20-May-21 | 40Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

The performance index

( )T T T

m d d d h h h e et

J e Q e Q u Ru d

= + +

10 T TA P PA Q PB R BPminus= + + minus

The algebraic Riccati equation [83]

0

0

q d qd

e

h h hh

e

A e BeX u

B A

u KX

= = +

= minus

A new augmented state

IRL solutionRL solution

An optimal feeback control

To minimize

solve

The infinite-horizon quadratic cost

Control design

The IRL Bellman equation for time interval ΔT gt 0

Proof of convergence See in Appendix 62

1 2 3 4 5 6 7 8

( ) 1

( )

arg min ( ) ( )e

T

e eu t

u V t X t u R B PX

minus

= = minus

( )7 7( ( )) ( ) ( ) ( ( ))t T

T T T

tV X t X Q K RK X d V X t T

+

= + minus +

( )7 ( ( )) ( ) ( )T T T T

tV X t X Q K RK X d X PX

= + =

( ) ( ) ( )T TA BK P P A BK Q K RKminus + minus = minus +

where P is the real symmetric positive definite solution

( )( ) ( ) = ( ) ( )

( ) ( )

t tT T T T

t

T

X t PX t X Q K RK X d

X t t PX t t

+

+

+ + +

The IRL Bellman equation can be written as

20-May-21 | 41Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 61 Continuous-time linear policy iteration algorithm

1 Start with an admissible control input

0 0

1= minuseu K X

2 Policy evaluation given a control policy ui find the

Pi using off-policy Bellman equation

1 1

1

+ minus= minusi T i

eu R B P X

3 Policy Improvement update the control policy

Online implementation

Control design

1 2 3 4 5 6 7 8

( )( )( ) ( )= ( ) ( )

( ) ( )

t t TT i T i i T

t

T i

X t P X t X Q K RK X d

X t t P X t t

+

+

+ + +

Begin

1 2

1 2

1

( ) ( )

( ) ( ) ( )

( )

N i i i

Ti i N i

i T

t t t

K K K

p

minus

= = minus +

=

=

Solving for the cost using least squares

1|| ||i ip p minusminus

0

1 10 1 ( ) 0P i K A BK= = minus

Initialization

1 1i T i

eu R B P Xminus minus= minus

Policy update

End

No

Yes

1i irarr +

20-May-21 | 42Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 62 The tracking error of the trajectory of the robot system and

the prescribed impedance model

Fig 63 Reference trajectory versus the state of the robot systems

during and after learning

κe = 00025 S = 1 R = 10 T = 0001

A = minus4006 B = minus0002

υ = 09 γ = 09 δΔ = 065 ε = 01

ε-decay = 099 K = 10 β = 01

LQR solution

RL aproximation

Position controlKs = 20 F = 10 G = 1 μ1 = μ2 = 5 k = 01

= 100 κp = 20 κd = 10 and ke = 018 s

θd = 10sin(02πt) Md = 1 Kg Bd = 718

Nsmm-1 Kd =1576 Nmm-1 Kh = 0046

Parameters

Z

Simulation

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 43 | Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Validation results: experiment

Fig. 6.4 The output trajectory and impedance model in the inner loop

Fig. 6.5 The desired trajectory and output trajectory in the outer loop

Fig. 6.6 Interaction force

Discussion
- Near-perfect performance of the human-robot cooperation
- Only a small deviation between the desired and actual trajectories
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller


20-May-21 | 45 | Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Proposed control idea
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed.
2. The stability and finite-time convergence of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified under different working conditions.

20-May-21 | 46 | Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Control design
- The external torque estimation on each side i (master/slave) is obtained without force sensors.
- An adaptive FSOB gain is updated by a piecewise switching law driven by the sliding variable s_i.
- The modified FNTSM surface s_i is designed from the tracking error and its derivative.
- The torque control is designed in FNTSMC style, combining the model terms M_i, C_i, K_i, the estimated external torque τ̂_id, and reaching terms of the form ρ_{1i} s_i + ρ_{2i} |s_i|^{ν_i} sign(s_i).

Block diagram: the operator drives the master controller and the APAM master system, whose estimator produces τ̂_md; an impedance filter shapes the command sent to the slave controller, the APAM slave system, and its estimator (τ̂_sd), which interacts with the environment; the estimated torques are reflected back between the master and slave sides.
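As a generic illustration of a terminal sliding surface with a reaching law of the form named above (a minimal sketch on a hypothetical double-integrator error dynamic with made-up gains and disturbance, not the dissertation's controller):

```python
import math

def sgn(x):
    return (x > 0) - (x < 0)

# Terminal sliding surface s = de + lam*|e|^(p/q)*sgn(e) and reaching law
# u = -rho1*s - rho2*|s|^v*sgn(s) on error dynamics e_ddot = u + d.
# Plant, gains, and disturbance are illustrative, not the thesis values.
lam, p_over_q, v = 8.0, 5.0 / 7.0, 9.0 / 11.0
rho1, rho2 = 10.0, 2.0
dt = 1e-3
e, de = 0.5, 0.0                       # initial tracking error and rate
for k in range(int(5.0 / dt)):
    s = de + lam * abs(e) ** p_over_q * sgn(e)
    d = 0.2 * math.sin(2 * math.pi * k * dt)   # bounded matched disturbance
    u = -rho1 * s - rho2 * abs(s) ** v * sgn(s)
    de += (u + d) * dt                 # Euler step of the error dynamics
    e += de * dt
print(abs(e) < 0.05)                   # error driven into a small band
```

The fractional exponents (5/7 and 9/11, ratios of odd integers) are what give the surface its finite-time character while keeping sgn(e)-symmetric behavior.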

20-May-21 | 47 | Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results

Parameters
- Proposed control: λi = 8, pi = 5, qi = 7, ςi = 10^-4, νi = 9/11, εi = 0.1, ρ1i = 10, ρ2i = 2
- PID control: KP = 38, KI = 25, KD = 0.15
- Observers: for the NDO, Lh = Le = 10; for the RTOB, β = 350 rad/s; for the AFOB, κi(0) = 0.1, λi = 10, ηi = 0, m = 0.1, a = 10

Scenarios
- The free-space situation: θd(t) = 5π(1 − 0.18t) sin(0.72πt)
- The contact and recovery situation: soft environment Ke = 500 N/mm, Be = 4 Ns/mm; stiff environment Ke = 2800 N/mm, Be = 20 Ns/mm

Fig. 7.1 Torque estimation performances

The RMSEs associated with the NDO, the RTOB, and the proposed algorithm are calculated as 0.211, 0.192, and 0.076, respectively.
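The "soft" and "stiff" environments above are spring-damper contact models; a minimal numeric sketch of the resulting contact force (the penetration and rate values are made up for illustration):

```python
# Contact force f = Ke*(x - xe) + Be*xdot for the soft and stiff
# environments listed above; penetration and rate are hypothetical.
Ke_soft, Be_soft = 500.0, 4.0          # N/mm, Ns/mm
Ke_stiff, Be_stiff = 2800.0, 20.0
pen, rate = 0.01, 0.002                # mm of penetration, mm/s
f_soft = Ke_soft * pen + Be_soft * rate
f_stiff = Ke_stiff * pen + Be_stiff * rate
print(round(f_soft, 3), round(f_stiff, 3))   # 5.008 28.04
```

The same penetration thus produces a contact force over five times larger in the stiff environment, which is why transparency is evaluated separately for the two cases below.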

20-May-21 | 48 | Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results

Fig. 7.2 Position tracking performances
Fig. 7.2 Position tracking error performances
Fig. 7.2 Transparency performance in "soft" contact
Fig. 7.2 Transparency error in "soft" contact
Fig. 7.2 Transparency performance in "stiff" contact
Fig. 7.2 Transparency error in "stiff" contact

20-May-21 | 49 | Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Summary
- Finite-time convergence of the tracking errors
- Elimination of the uncertainties and of the noise effect in the obtained force sensing via the new AFOB
- Robust performance against uncertainties and disturbances in the system dynamics
- Stability of the overall system and the transparency performance are attained simultaneously
- Fast transient response

High transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and across the different working conditions.


20-May-21 | 51 | Presenter: Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions; its uncertainties and disturbances are handled.
- Validated the significant effectiveness on finite-time force tracking problems of the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding a highly effective position control.
- Achieved good transparency performance, with both force feedback and position tracking simultaneously, by combining the new adaptive force estimation with the separate FNTSMC schemes.

20-May-21 | 52 | Presenter: Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Future works
- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both the master and slave sides; the computational complexity should also be reduced with respect to the data size.
- Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device for the rehabilitation field with a safer solution.

20-May-21 | 53

A PRESENTATION OF DOCTORAL DISSERTATION

Presenter: Cong Phat Vo
Department of Mechanical Engineering, University of Ulsan, Korea

Thank you for attending



Achieved good transparency performance with both force feedback and position tracking

simultaneously by the new adaptive force estimation combined the separately FNTSMC schemes

Conclutions

1 2 3 4 5 6 7 8

20-May-21 | 52Presenter Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Future works

To improve the control performance in joint-space the robot system model can be

improved by robust identification approaches

Effectiveness of the proposed algorithm can be further increased by developing an automatic

tune method for both the estimation and control gains

Apply optimized strategies on both master and slave sides Note also that the computational

complexity should be reduced by the data size

Based on the developed prototype in this thesis it is possible to develop a high-quality

commercial teleoperation device in the rehabilitation field with a safer solution

1 2 3 4 5 6 7 8

20-May-21 | 53

A PRESENTATION OF DOCTORAL DISSERTATION

Presenter Cong Phat Vo

Department of Mechanical Engineering University of Ulsan Korea

Thank you for attending

1 2 3 4 5 6 7 8

Page 12: A PRESENTATION OF DOCTORAL DISSERTATION

20-May-21 | 12Presenter Cong Phat Vo

2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION

Mathematical modeling

The tensile force of each PAM [7], derived from the braid geometry (muscle length L, thread length b, n thread turns, braid diameter D with nπD as the unwound circumference):

    F = −P dV/dL = P (3L² − b²) / (4πn²)

Fig 21 A cross-sectional view of the simplified geometric model

The motion equation of the system

    Mẍ + Bẋ = −F1 + F2

Table 21 Parameters of the studied PAM
Parameter  Description                    Value  Unit
Ps         Supply pressure                6.5    bar
B          Viscous damping coefficient    50     N·s/mm
M          Mass of the plate              0.5    kg
L0         Original length of the PAM     175    mm
n          Number of turns of the thread  104
b          Thread length                  200    mm

The motion equation of the system in state-space form:

    ẍ = f0(x, ẋ) + g0(x)u + d

where f0(x, ẋ), g0(x) are the nominal values of f(x, ẋ), g(x) and d is the total of unknown disturbances:

    f0(x, ẋ) = [3(L0 − x)² − b²] P0 / (4πn²M) − Bẋ/M
    g0(x) = 3(L0 − x)² / (2πn²M)
    d = [f(x, ẋ) − f0(x, ẋ)] + [g(x) − g0(x)]u + d0
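As a quick numeric sanity check on the tensile-force relation above, here is a minimal sketch; the pressure, length, and turn count are illustrative values, not the Table 2.1 parameters:

```python
import math

def pam_force(P, L, b, n):
    """Static tensile force of a braided PAM from the virtual-work relation
    F = -P dV/dL = P (3 L^2 - b^2) / (4 pi n^2),
    with gauge pressure P, muscle length L, thread length b, n thread turns."""
    return P * (3.0 * L**2 - b**2) / (4.0 * math.pi * n**2)

# The force vanishes at full contraction, L = b / sqrt(3), and is positive
# (pulling) for longer lengths.
assert abs(pam_force(5e5, 0.2 / math.sqrt(3), 0.2, 10.0)) < 1e-6
assert pam_force(5e5, 0.175, 0.2, 10.0) > 0
```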


20-May-21 | 13Presenter Cong Phat Vo

2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION

Experimental modeling

The approximated PAM model based on its

pressure and contraction strain

    M ẍk + bk(Pk) ẋk + kk(Pk) xk = fk(Pk),   (k = 1, 2)

where the coefficients are affine in the pressure:

    bk(Pk) = bk1 Pk + bk0,   kk(Pk) = kk1 Pk + kk0,   fk(Pk) = fk1 Pk + fk0

Parameter          Symbol  Value  Units
Spring elements    kk1     11732  N/m per bar
                   kk0     68     N/m
Damping elements   bk1     962    N·s/m per bar
                   bk0     0.85   N·s/m
Force elements     fk1     9122   N/bar
                   fk0     473    N
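The behavior of this second-order, pressure-dependent model can be sketched with a short forward-Euler simulation; the numeric coefficients below are placeholders for illustration, not the identified Table 2.2 values:

```python
def simulate_pam(P, M=0.5, bk=(9.62, 0.85), kk=(117.32, 6.8), fk=(91.22, 4.73),
                 dt=1e-3, t_end=2.0):
    """Integrate M x'' + b(P) x' + k(P) x = f(P) with affine-in-pressure
    coefficients b(P) = bk1*P + bk0, k(P) = kk1*P + kk0, f(P) = fk1*P + fk0."""
    b = bk[0] * P + bk[1]
    k = kk[0] * P + kk[1]
    f = fk[0] * P + fk[1]
    x, v = 0.0, 0.0
    for _ in range(int(t_end / dt)):     # forward-Euler integration
        a = (f - b * v - k * x) / M
        x, v = x + dt * v, v + dt * a
    return x

# The contraction settles at the static value f(P) / k(P).
x_ss = simulate_pam(P=3.0)
```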

Fig 23 Experimental model for certain separate pressures (surface plot: force [N], 0 to 500, versus contraction length [mm] and pressure [bar], 1 to 6)

Table 22 List of APAM model parameters

Fig 22 Block diagram of the experimental system: PAM 1 and PAM 2 drive the plate over a pulley against the load; a linear encoder returns the position signal through the PCI-QUAD04 card, and the proportional pressure regulator valves receive the control signal and report the pressure signal through the PCI-1711 card


20-May-21 | 14Presenter Cong Phat Vo

2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION

Platform setup

Fig 25 The APAM-based (a) master and (b) slave subsystem configuration. (a) Master (front and back views): control stick, pneumatic muscle actuators on a pulley, loadcells and an interactive loadcell, optical encoder, and proportional pressure regulator valves. (b) Slave: APAM system (PAM and pulley) with loadcell, a spring-damper environment (stiffness/damping), a proportional valve, and a PC with PCI cards and control box exchanging position, control, and contact-force signals.


20-May-21 | 15Presenter Cong Phat Vo

2 MODELING OF THE BILATERAL TELEOPERATION SYSTEM

Table 23 Specifications of the experimental devices
  PAM: Festo MAS-10-200N-AA-MC-O (nominal length 200 mm, inside diameter 10 mm, max pressure 8 bar)
  PPR valve: Festo VPPM-8L-L-1-G14-0L8H (max pressure 8 bar)
  DAQ cards: ADVANTECH PCI-1711 (AI/AO, 12-bit resolution); MEASUREMENT PCI-QUAD04
  Linear encoder: Rational WTB5-0500MM (resolution 0.005 mm)

Fig 26 Photograph of the experimental apparatus of the overall BHTS (master robot with control stick, slave robot, environment)

Platform setup


20-May-21 | 16Presenter Cong Phat Vo

CONTENTS
1. Introduction
2. Modeling and platform setup based on the APAM configuration
3. Adaptive Gain ITSMC with TDE scheme for the position control
4. Adaptive Finite-Time Force Sensorless Control scheme
5. Model-free control for Optimal contact force in Unknown Environment
6. Optimized Human-Robot Interaction force control using IRL
7. Force Sensorless Reflecting Control for BHTS
8. Conclusion and Future works

20-May-21 | 17Presenter Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Proposed control idea

1. For the first time, the AITSMC-TDE scheme is proposed for a PAM system.
2. The stability of the controlled system is theoretically analyzed.
3. The control effectiveness is verified by experimental trials in different challenging working conditions (compared with several controllers).

20-May-21 | 18Presenter Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Control design
A terminal sliding surface is chosen:
    s = e2 + k1 e1 + k2 sig(e1)^γ1
A proposed integral sliding surface is designed so that σ(t0) = 0, which eliminates the reaching phase:
    σ(t) = s(t) − s(t0) − ∫_{t0}^{t} [nominal closed-loop dynamics of s along f0, g0, and u_n] dτ
A robust control signal can be redesigned as
    u_s(t) = −g0(x1)^{−1} [(η3 + κ̂3) sign(σ) + η4 σ]
An adaptive law raises the robust gain κ̂3 at rate μ|σ| while |σ| lies outside a small boundary layer δ and lets it decay otherwise.
(Proof: see Appendix 3.2, 3.3)
Block diagram of the proposed control approach: reference trajectory (x1d, x2d) → transfer coordinate (e1, e2) → terminal and integral terminal sliding surfaces → nominal control signal u_n plus robust control signal u_s (with adaptation law) minus the time-delay estimate d̂ → control law u → the PAM system (x1, x2).

20-May-21 | 19Presenter Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

The control law can be rewritten as
    u(t) = u_n + u_s − g0(x1)^{−1} d̂
where the time-delay estimate of the lumped disturbance is
    d̂(t) = ẋ2(t − t_d) − f0(x1(t − t_d), x2(t − t_d)) − g0(x1(t − t_d)) u(t − t_d)
(Proof: see Appendix 3.1, 3.4-3.5)

Control design
The auxiliary error variables are defined as e1 = x1 − x1d, e2 = ė1.
The nominal control signal can be designed as
    u_n = g0(x1)^{−1} [ẍ1d − f0(x1, x2) − k1 e2 − k2 γ1 |e1|^{γ1−1} ė1]

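The TDE step above can be sketched on a toy scalar plant: the lumped disturbance is read off from the previous sample's acceleration and cancelled in the control. All dynamics and gains here are illustrative, not the PAM model:

```python
def run_tde(dt=1e-3, steps=2000, g0=2.0):
    """Plant: xdd = f0(x, v) + g0*u + d, with d unknown to the controller.
    TDE:   d_hat(t) = xdd(t - dt) - f0(x, v) - g0 * u(t - dt)."""
    f0 = lambda x, v: -3.0 * x - 0.5 * v        # nominal dynamics (assumed)
    d_true = lambda t: 0.8 + 0.3 * (t > 1.0)    # unknown step disturbance
    x = v = u_prev = xdd_prev = 0.0
    d_hat = 0.0
    for i in range(steps):
        d_hat = xdd_prev - f0(x, v) - g0 * u_prev           # TDE update
        u = (-f0(x, v) - d_hat - 20.0 * x - 9.0 * v) / g0   # cancel + PD
        xdd = f0(x, v) + g0 * u + d_true(i * dt)            # true plant
        x, v = x + dt * v, v + dt * xdd
        u_prev, xdd_prev = u, xdd
    return d_hat, d_true(steps * dt)

d_hat, d_act = run_tde()   # the estimate tracks the unknown disturbance
```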

20-May-21 | 20Presenter Cong Phat Vo

Parameters

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Experimental validation

Proposed control: for ITSMC, η1 = 3, η2 = 15, η3 = 25, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1 = γ2 = 5/7, k1 = 2, k2 = 55, α = 9/11; for the adaptive gain, μ = 10, δ = 0.05, σ = 0.1
PID control: for case study 3, KP = 15, KI = 18, KD = 0.05; for the other cases, KP = 35, KI = 28, KD = 0.1

Fig 31 Performance response in case study 1

Fig 32 Performance response in case study 2

Case study 1 (no load): xd (mm) = 8sin(0.4πt)
Case study 2 (no load): xd (mm) = 8sin(πt)
Case study 3 (no load): a multistep trajectory
Case study 4 (with load): xd (mm) = 8sin(πt), in turn with a 2-kg load and a 7-kg load

20-May-21 | 21Presenter Cong Phat Vo

Fig 33 Performance response in case study 3

Fig 34 Performance response in case study 4

Table 31 A quantitative comparison between the tracking performances

Controller                    PID      ITSMC    Proposed
Sine 0.2 Hz - 8 mm    L2[e]   0.4102   0.1737   0.1574
                      L2[u]   0.9961   1.0270   0.9602
Sine 1 Hz - 8 mm      L2[e]   1.1845   0.4996   0.3788
                      L2[u]   1.1284   1.1152   1.0900
Multistep             L2[e]   0.8587   0.6846   0.6368
                      L2[u]   0.7670   0.7333   0.7247
Sine 1 Hz, 2-kg load  L2[e]   1.2578   0.8314   0.4779
                      L2[u]   1.1414   1.1506   1.1287
Sine 1 Hz, 7-kg load  L2[e]   1.5937   1.0827   0.5067
                      L2[u]   1.1613   1.1729   1.1584
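The L2[·] indices in the table can be recomputed from logged samples; a minimal sketch, assuming L2 denotes the RMS value of the sampled signal (the exact normalization used in the thesis is not shown on this slide):

```python
import math

def l2_index(samples):
    """Discrete L2 performance index: root-mean-square of the samples."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

# A constant signal c has index |c|; mixed samples average quadratically.
assert abs(l2_index([2.0] * 100) - 2.0) < 1e-12
assert abs(l2_index([3.0, 4.0]) - math.sqrt(12.5)) < 1e-12
```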

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Case study 1 (no load): xd (mm) = 8sin(0.4πt)
Case study 2 (no load): xd (mm) = 8sin(πt)
Case study 3 (no load): a multistep trajectory
Case study 4 (with load): xd (mm) = 8sin(πt), in turn with a 2-kg load and a 7-kg load
Experimental validation

20-May-21 | 22Presenter Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Summary

- Robust performance against uncertainties and disturbances
- Finite-time convergence of the tracking errors
- Reducing the chattering phenomenon
- Eliminating the reaching phase
- Fast transient response
Next work will be directed to control exploiting the force properties.

20-May-21 | 23Presenter Cong Phat Vo

CONTENTS
1. Introduction
2. Modeling and platform setup based on the APAM configuration
3. Adaptive Gain ITSMC with TDE scheme for the position control
4. Adaptive Finite-Time Force Sensorless Control scheme
5. Model-free control for Optimal contact force in Unknown Environment
6. Optimized Human-Robot Interaction force control using IRL
7. Force Sensorless Reflecting Control for BHTS
8. Conclusion and Future works

20-May-21 | 24Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Proposed control idea
1. The proposed FSOB scheme obtains force information with noise and static friction eliminated.
2. The FITSMC guarantees finite-time convergence; a new adaptive gain and TDE are deployed to handle uncertainties and the chattering phenomenon online.
3. The reliability and superiority of the proposed control are verified in different challenging working conditions.

20-May-21 | 25Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Control design
The FITSM surface is defined as
    s_f(t) = ė_f(t) + ∫_0^t [ρ1 e_f + ρ2 |e_f|^v sign(e_f)] dτ
The control law u_c(t) combines three parts: the time-delay estimate d̂(t) of the lumped disturbance (computed from the delayed control input and delayed force response), finite-time feedback in e_f and s_f, and an adaptive switching-gain law that tunes the robust gain online. The external force τ̂_e is obtained from the LeD-based force observer [40] instead of a force sensor.
Block diagram: reference trajectory → integral terminal sliding surface → control law u with adaptive law and force-based time-delay estimator → APAM system in contact with the environment → force observer feeding back τ̂_e.
(Proof: see Appendix 4.1-4.3)
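The fractional-power term sig(·)^v that appears in these finite-time surfaces (with the exponent read as v = 9/11 from the parameter list) can be sketched as:

```python
def sig(x, v):
    """sig(x)^v = |x|^v * sign(x): odd, continuous at 0 for 0 < v < 1,
    and larger than |x| near the origin (the finite-time property)."""
    return abs(x) ** v * (1 if x > 0 else -1 if x < 0 else 0)

v = 9.0 / 11.0
assert sig(-0.5, v) == -sig(0.5, v)       # odd symmetry
assert abs(sig(1e-4, v)) > 1e-4           # |x|^v > |x| for 0 < |x| < 1
```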


20-May-21 | 26Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Experimental validation
Parameters
  Proposed control: for ITSMC, η1 = 3, η2 = 15, η3 = 25, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1 = γ2 = 5/7, k1 = 2, k2 = 55, v = 9/11; for the adaptive gain, μ = 10, δ = 0.05, σ = 0.1
  PID control: for case study 3, KP = 15, KI = 18, KD = 0.05; for the other cases, KP = 35, KI = 28, KD = 0.1
  Observers: for FSOB, ω = 500 rad/s, ξ = 0.001, α1 = 20, α2 = 12, α3 = 5, α4 = 15
Scenarios
  Scenario 1: τd (N) = 7.5sin(0.2πt − π/2) + 6.25
  Scenario 2: τd (N) = 7.5sin(0.8πt − π/2) + 6.25
  Scenario 3: a multistep trajectory with an LPF bandwidth of 0.2 rad/s
Fig 42 Performance response in Scenario 1
Fig 42 Performance response in Scenario 2

20-May-21 | 27Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Fig 43 Performance response in Scenario 3

Fig 44 Control gain

Table 41 An RMSE comparison between the tracking error performances

Controller    PID      ISMC     FITSMC-TDE   Proposed
Sine 0.1 Hz   2.1050   1.2152   0.9440       0.5158
Sine 0.4 Hz   3.2054   1.8045   2.0632       1.9807
Multistep     2.6508   2.3548   2.0618       1.7685

Experimental validation

Discussion
- Improved control performance (fast response, high accuracy, and robustness)
- Guaranteed finite-time convergence
- Cancelled the lumped uncertainties online via the TDE solution and the switching gain in the FITSMC

20-May-21 | 28Presenter Cong Phat Vo

CONTENTS
1. Introduction
2. Modeling and platform setup based on the APAM configuration
3. Adaptive Gain ITSMC with TDE scheme for the position control
4. Adaptive Finite-Time Force Sensorless Control scheme
5. Model-free control for Optimal contact force in Unknown Environment
6. Optimized Human-Robot Interaction force control using IRL
7. Force Sensorless Reflecting Control for BHTS
8. Conclusion and Future works

20-May-21 | 29Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Proposed control idea
1. The RL method is used to learn the optimal desired force online.
2. The stability of the closed-loop control is theoretically demonstrated by the Lyapunov approach.
3. The approach is verified to be effective for robot interaction control in unknown environments.

20-May-21 | 30Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The environment is modeled as a spring-damper system [81]:
    Fe(t) = Ce ẋ(t) + Ke (x(t) − xe)
The optimal impedance model maps the position error er to the desired force, Z(er(t)) = Fd(t), through second-order error dynamics M ër(t) + C ėr(t) + K er(t) driven by the contact force Fe(t).
The position control in task space adds a PI correction on the force error ef:
    xf(t) = KPf ef(t) + KIf ∫_0^t ef(τ) dτ

Control design, block diagram: a robot-specific inner loop performs precise position control of the APAM system in contact with the environment; a task-specific outer loop runs the prescribed impedance model with an RL-based approximation that generates the desired force Fd from the position error er, corrected by the force control xf.

    ėr(t) = A er(t) + B Fd(t)
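The spring-damper environment with the chapter's values (Ke = 2 N/mm, Ce = 0.5 N·s/mm, contact surface at xe = 35 mm) can be sketched as below; the free-space deadzone and the sign convention are assumptions:

```python
def contact_force(x, x_dot, x_e=35.0, K_e=2.0, C_e=0.5):
    """Contact force of a spring-damper environment (N, with x in mm):
    zero in free space, stiffness plus damping once x penetrates x_e."""
    if x <= x_e:
        return 0.0
    return K_e * (x - x_e) + C_e * x_dot

assert contact_force(30.0, 1.0) == 0.0      # free space: no force
assert contact_force(40.0, 0.0) == 10.0     # 2 N/mm * 5 mm penetration
```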

20-May-21 | 31Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

LQR solution

The main objective of this work is to determine the optimal desired force Fd, the minimum force that minimizes the position error er.

Implementation procedure: LQR solution (an offline solution)
The feedback law: Fd(t) = κe er(t)
The discounted cost function:
    Js(er(t), Fd(t)) = ∫_t^∞ e^{−γ(τ−t)} [er^T(τ) S er(τ) + Fd^T(τ) R Fd(τ)] dτ
The Algebraic Riccati equation:
    A^T P A + S − P + A^T P B κe = 0
The feedback control gain:
    κe = −(B^T P B + R)^{−1} B^T P A

20-May-21 | 32Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

Implementation procedure: RL solution (an online solution using off-policy RL to find the solution of the given new ARE; the offline LQR solution is impossible for all time when the dynamics are unknown)
The discounted cost (value) function:
    Vs(er(t)) = ∫_t^∞ e^{−γ(τ−t)} r(er(τ), Fd(τ)) dτ,  where r(er, Fd) = er^T S er + Fd^T R Fd
The Q-value function: Q(er(t), Fd(t)) = Vs(er(t)) = X^T H X, with X = [er^T, Fd^T]^T
The Hamilton-Jacobi-Bellman equation:
    min_{Fd} [ V̇s(er(t)) − γ Vs(er(t)) + r(er(t), Fd(t)) ] = 0
The new Algebraic Riccati equation:
    er^T(t) (A^T P + P A + S − γP) er(t) + 2 er^T(t) P B Fd(t) + Fd^T(t) R Fd(t) = 0
The optimal desired force: Fd(t) = −H22^{−1} H21 er(t), i.e., Fd(t) = κe er(t)

20-May-21 | 33Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

Implementation procedure: IRL approximation (minimizing the temporal-difference error)
The approximator of the Q-value function uses N normalized Gaussian features:
    Q̂(er, Fd) = Σ_{i=1}^{N} θi φ̄i(er, Fd)
    φi(er, Fd) = exp(−(1/2) [er − c_ri, Fd − c_di] Σi^{−1} [er − c_ri, Fd − c_di]^T)
    φ̄i = φi / Σ_{j=1}^{N} φj(er, Fd)
The Q-function update law based on IRL (a Bellman equation over the interval Δt):
    Q(er(t), Fd(t)) = ∫_t^{t+Δt} e^{−γ(τ−t)} r(τ) dτ + e^{−γΔt} Q(er(t+Δt), Fd(t+Δt))
The reward (utility) function: r(t) = er^T(t) S er(t) + Fd^T(t) R Fd(t)
The parameter update law adjusts θ by gradient descent on the temporal-difference error, and the learned policy is Fd(t) = κe er(t).
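The normalized Gaussian features behind the Q-function approximator can be sketched as follows; the centers and the isotropic width are illustrative assumptions:

```python
import math

def rbf_features(z, centers, sigma=1.0):
    """Normalized Gaussian basis: phi_i(z) = exp(-|z - c_i|^2 / (2 sigma^2)),
    then divided by the sum over all features so the outputs sum to 1."""
    w = [math.exp(-sum((a - b) ** 2 for a, b in zip(z, c)) / (2 * sigma ** 2))
         for c in centers]
    s = sum(w)
    return [wi / s for wi in w]

def q_hat(z, centers, theta, sigma=1.0):
    """Linear-in-parameters Q estimate: sum_i theta_i * phi_bar_i(z)."""
    return sum(t * p for t, p in zip(theta, rbf_features(z, centers, sigma)))

feats = rbf_features((0.0, 0.0), [(0.0, 0.0), (1.0, 1.0)])
assert abs(sum(feats) - 1.0) < 1e-12 and feats[0] > feats[1]
```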

20-May-21 | 34Presenter Cong Phat Vo

Fig 51 RL in unknown environments during the learning process
Fig 52 Learning curves

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Parameters
  LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
  RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
  Position/force control: KPf = 0.001, KIf = 0.02, λ = 0.2, ηΔ = 0.1, ρ1 = 0.5, ρ2 = 20, v = 9/11, α = 5/7
  Environment: Ce = 0.5 N·s·mm⁻¹, Ke = 2 N·mm⁻¹, r0 = 165 mm, xd = 39.917 mm, xe = 35 mm
Validation results

20-May-21 | 35Presenter Cong Phat Vo

Fig 53 Desired force/position tracking response

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Three steps are required to complete the experiment:
1. Identification of the environment
2. Obtaining the ARE solution (LQR solution) using the environmental estimate
3. The complete experiment for the position/force controller
The IRL approximation achieves near-optimal performance without knowledge of the environment dynamics.
Validation results

20-May-21 | 36Presenter Cong Phat Vo

CONTENTS
1. Introduction
2. Modeling and platform setup based on the APAM configuration
3. Adaptive Gain ITSMC with TDE scheme for the position control
4. Adaptive Finite-Time Force Sensorless Control scheme
5. Model-free control for Optimal contact force in Unknown Environment
6. Optimized Human-Robot Interaction force control using IRL
7. Force Sensorless Reflecting Control for BHTS
8. Conclusion and Future works

20-May-21 | 37Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Proposed control idea
1. Find the optimal parameters of the impedance model to assure motion tracking and to assist the human in performing the task with minimum effort in real time.
2. The stability of the closed-loop control is theoretically demonstrated.
3. The unknown human dynamics, as well as the optimal overall human-robot system efficiency, are verified in experimental trials.

20-May-21 | 38Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design
Robot-specific inner loop: the control torque in task space combines proportional, integral, and derivative feedback on the model-following error eh (gain Ks) with an NN functional approximation Ẑ(q) = Ŵ^T σ(q) that compensates the unknown human-robot dynamics; the NN weights Ŵ are adapted online by update laws driven by the tracking error.
Task-specific outer loop: the prescribed impedance model
    θm(s)/τh(s) = Kh / (Md s² + Cd s + Kd)
shapes the model trajectory θm from the interaction torque τh, and the Optimal Law-based IRL tunes the gain Kh.
Block diagram: human operator → haptic interface based on APAM → torque control law with NN functional approximation; the prescribed impedance model and the Optimal Law-based IRL close the outer loop through θd, θm, eh, and τh.

20-May-21 | 39Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

The model-following error dynamics: ėh = Ah eh + Bh ed, with Ah and Bh determined by the human and impedance parameters.
The human transfer function [99] (purpose: shape τh → ed): a first-order model Ge(s) with parameters κp, κd, and ke maps the human torque to the adjustable reference error ed.
A new augmented state X = [ed^T, eh^T]^T yields
    Ẋ = A X + B ue,  ue = −K X
IRL based on LQR solution
Control design

20-May-21 | 40Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design
The performance index (the infinite-horizon quadratic cost) to minimize:
    Jm = ∫_t^∞ (ed^T Qd ed + eh^T Qh eh + ue^T R ue) dτ
For the augmented state Ẋ = A X + B ue with ue = −K X, solve the algebraic Riccati equation [83]:
    0 = A^T P + P A + Q − P B R^{−1} B^T P
The optimal feedback control:
    ue* = argmin_{ue} V(t, X(t)) = −R^{−1} B^T P X
The value function satisfies V(X(t)) = ∫_t^∞ X^T (Q + K^T R K) X dτ = X^T P X, where P is the real symmetric positive-definite solution of
    (A − B K)^T P + P (A − B K) = −(Q + K^T R K)
The IRL Bellman equation for a time interval ΔT > 0 can be written as
    X^T(t) P X(t) = ∫_t^{t+ΔT} X^T (Q + K^T R K) X dτ + X^T(t+ΔT) P X(t+ΔT)
(Proof of convergence: see Appendix 6.2)

20-May-21 | 41Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 61 Continuous-time linear policy iteration algorithm
Online implementation:
1. Initialization: P^0 = 0, i = 1; start with an admissible control input ue^0 = −K^1 X, where K^1 makes A − B K^1 stable.
2. Policy evaluation: given a control policy u^i, find P^i from the off-policy IRL Bellman equation
       X^T(t) P^i X(t) = ∫_t^{t+ΔT} X^T (Q + (K^i)^T R K^i) X dτ + X^T(t+ΔT) P^i X(t+ΔT)
   solving for the cost by least squares.
3. Policy improvement: update the control policy ue^{i+1} = −R^{−1} B^T P^i X.
4. If ||p^i − p^{i−1}|| is below a threshold, stop; otherwise set i → i + 1 and return to step 2.
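The loop of Fig 6.1 can be sketched for a scalar system, using the chapter's scalar pair A = −4.006, B = −0.002 (the Q and R values here are illustrative); in the scalar case the policy-evaluation step solves the Lyapunov equation exactly instead of by least squares:

```python
def policy_iteration(A, B, Q, R, K0=0.0, tol=1e-12, iters=50):
    """Kleinman policy iteration for a scalar continuous-time LQR.
    Evaluation: 2 (A - B K) P = -(Q + R K^2); improvement: K <- B P / R."""
    K = K0                                   # admissible: A - B*K0 must be stable
    for _ in range(iters):
        Ac = A - B * K
        assert Ac < 0, "policy must be stabilizing"
        P = (Q + R * K * K) / (-2.0 * Ac)    # scalar Lyapunov solve
        K_new = B * P / R                    # policy improvement
        if abs(K_new - K) < tol:
            break
        K = K_new
    return P, K

P, K = policy_iteration(A=-4.006, B=-0.002, Q=1.0, R=10.0)
# At convergence P solves the scalar ARE: 2 A P - B^2 P^2 / R + Q = 0.
assert abs(2 * -4.006 * P - (-0.002) ** 2 * P * P / 10.0 + 1.0) < 1e-6
```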

20-May-21 | 42Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 62 The tracking error of the trajectory of the robot system and the prescribed impedance model
Fig 63 Reference trajectory versus the state of the robot systems during and after learning
Validation results (simulation)
Parameters
  LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
  RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
  Position control: Ks = 20, F = 10, G = 1, μ1 = μ2 = 5, k = 0.1, a gain of 100, κp = 20, κd = 10, ke = 0.18 s
  Prescribed model: θd = 10sin(0.2πt), Md = 1 kg, Bd = 718 N·s·mm⁻¹, Kd = 1576 N·mm⁻¹, Kh = 0.046

20-May-21 | 43Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 64 The output trajectory and impedance model in the inner loop
Fig 65 The desired trajectory and output trajectory in the outer loop
Fig 66 Interaction force
Validation results (experiment): discussion
- The human-robot cooperation performs very well.
- Only a small deviation remains between the desired and actual trajectories.
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller.

20-May-21 | 44Presenter Cong Phat Vo

CONTENTS
1. Introduction
2. Modeling and platform setup based on the APAM configuration
3. Adaptive Gain ITSMC with TDE scheme for the position control
4. Adaptive Finite-Time Force Sensorless Control scheme
5. Model-free control for Optimal contact force in Unknown Environment
6. Optimized Human-Robot Interaction force control using IRL
7. Force Sensorless Reflecting Control for BHTS
8. Conclusion and Future works

20-May-21 | 45Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Proposed control idea
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed.
2. The stability and finite-time convergence characteristics of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified in different working conditions.

20-May-21 | 46Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Control design
The external torque estimation: an adaptive force/torque observer (AFOB) reconstructs the operator and environment torques τ̂md, τ̂sd for the master and slave APAM subsystems, with an adaptive observer gain κi that switches between two rates (ρ1i, ρ2i) according to the sliding variable si through a piecewise adaptation law.
The torque control design by FNTSMC style: the modified FNTSM surface is
    si = ėi + λi sig(ei)^{pi/qi}
and the control torque takes the form
    τi = Mi θ̈ir + Ci θ̇i + Ki θi + τ̂id − ρ1i si − ρ2i sig(si)^{νi}
Block diagram (master side / slave side): operator → master controller with estimator → APAM master system; an impedance filter shapes the reference between the two sides; slave controller with estimator → APAM slave system → environment; the estimated torques τ̂md, τ̂sd close the bilateral loop.

20-May-21 | 47Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig 71 Torque estimation performances
Parameters
  Proposed control: λi = 8, pi = 5, qi = 7, ςi = 10⁻⁴, νi = 9/11, εi = 0.1, ρ1i = 10, ρ2i = 2
  PID control: KP = 38, KI = 25, KD = 0.15
  Observers: for NDO, Lh = Le = 10; for RTOB, β = 350 rad/s; for AFOB, κi(0) = 0.1, λi = 10, ηi = 0, m = 0.1, a = 10
Scenarios
  - The free-space situation: θd(t) = 5π(1 − 0.18t)sin(0.72πt)
  - The contact-and-recovery situation: soft environment Ke = 500 N·mm⁻¹, Be = 4 N·s·mm⁻¹; stiff environment Ke = 2800 N·mm⁻¹, Be = 20 N·s·mm⁻¹
Validation results: the RMSEs associated with the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.

20-May-21 | 48Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results
Fig 72 Position tracking performances
Fig 72 Position tracking error performances
Fig 72 Transparency performance in "soft" contact
Fig 72 Transparency error in soft contact
Fig 72 Transparency performance in "stiff" contact
Fig 72 Transparency error in stiff contact

20-May-21 | 49Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Summary
- Finite-time convergence of the tracking errors
- Elimination of the uncertainties and of the noise effect in the obtained force sensing via the new AFOB
- Robust performance against uncertainties and disturbances in the system dynamics
- Stability of the overall system and transparency performance attained simultaneously
- Fast transient response
The great transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and in different working conditions.

20-May-21 | 50Presenter Cong Phat Vo

CONTENTS
1. Introduction
2. Modeling and platform setup based on the APAM configuration
3. Adaptive Gain ITSMC with TDE scheme for the position control
4. Adaptive Finite-Time Force Sensorless Control scheme
5. Model-free control for Optimal contact force in Unknown Environment
6. Optimized Human-Robot Interaction force control using IRL
7. Force Sensorless Reflecting Control for BHTS
8. Conclusion and Future works

20-May-21 | 51Presenter Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions (uncertainties and disturbances are handled).
- Validated the significant effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding highly effective position control.
- Achieved good transparency with both force feedback and position tracking simultaneously by the new adaptive force estimation combined with the separate FNTSMC schemes.

20-May-21 | 52Presenter Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Future works
- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both master and slave sides; the computational complexity should also be reduced with the data size.
- Based on the prototype developed in this thesis, a high-quality commercial teleoperation device for the rehabilitation field can be developed with a safer solution.

20-May-21 | 53

A PRESENTATION OF DOCTORAL DISSERTATION

Presenter Cong Phat Vo

Department of Mechanical Engineering University of Ulsan Korea

Thank you for attending

1 2 3 4 5 6 7 8

Page 13: A PRESENTATION OF DOCTORAL DISSERTATION

20-May-21 | 13Presenter Cong Phat Vo

2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION

Experimental modeling

The approximated PAM model, based on its pressure and contraction strain:

    M x''_k + b_k(P_k) x'_k + k_k(P_k) x_k = f_k(P_k),   (k = 1, 2)
    b_k(P_k) = b_k1 P_k + b_k0
    k_k(P_k) = k_k1 P_k + k_k0
    f_k(P_k) = f_k1 P_k + f_k0

Table 2.2 List of APAM model parameters

    Spring elements    k_k1 = 11732 N/m/bar;   k_k0 = 68 N/m
    Damping elements   b_k1 = 962 N·s/m/bar;   b_k0 = 0.85 N·s/m
    Force elements     f_k1 = 9122 N/bar;      f_k0 = 473 N

Fig. 2.3 Experimental model for certain separate pressures (force [N] versus contraction length [mm] at several supply pressures [bar]).

Fig. 2.2 Block diagram of the experimental system: two PAMs (PAM 1, PAM 2) act through a plate and pulley carrying the load; a linear encoder returns the position signal via a PCI-QUAD04 card, and a PCI-1711 card sends the control signal to the proportional pressure regulator valves, which provide the pressure signal.
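The three-element PAM model above is a second-order system with pressure-affine coefficients, so its behavior at a fixed pressure can be sketched by direct integration. The following is a minimal illustration with placeholder coefficient values (the decimal scaling of the dissertation's identified parameters is not recoverable from the slide, so these numbers are assumptions):

```python
# Illustrative simulation of the three-element PAM model
# M*x'' + b(P)*x' + k(P)*x = f(P), with pressure-affine coefficients.
# Coefficient values are placeholders, not the dissertation's identified ones.

def simulate_pam(P, M=1.0, b1=9.62, b0=0.85, k1=117.32, k0=6.8,
                 f1=91.22, f0=4.73, dt=1e-4, t_end=5.0):
    """Euler-integrate the PAM dynamics at constant pressure P [bar]."""
    b = b1 * P + b0      # damping coefficient b(P)
    k = k1 * P + k0      # stiffness coefficient k(P)
    f = f1 * P + f0      # force term f(P)
    x, v = 0.0, 0.0
    for _ in range(int(t_end / dt)):
        a = (f - b * v - k * x) / M
        v += a * dt
        x += v * dt
    return x

# At steady state (x' = x'' = 0) the model predicts x = f(P) / k(P),
# which is what the contraction converges to under constant pressure.
x_ss = simulate_pam(3.0)
```

This is the mechanism behind Fig. 2.3: each constant pressure selects a stiffness k(P) and force f(P), hence a different force-contraction operating line.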


2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION

Platform setup

Fig. 2.5 The APAM-based (a) master and (b) slave subsystem configurations. (a) Master, front and back views: control stick, optical encoder, loadcells, pneumatic muscle actuators, interactive loadcell, pulley, and proportional pressure regulator valves. (b) Slave: APAM system (PAM and pulley), proportional valve, loadcell, spring-damper environment (stiffness/damping), and a PC with PCI cards in a control box exchanging the position, control, and contact-force signals.


2 MODELING OF THE BILATERAL TELEOPERATION SYSTEM

Fig. 2.6 Photograph of the experimental apparatus of the overall BHTS: master robot with control stick, slave robot, and environment.

Table 2.3 Specifications of the experimental devices

    PAM:            Festo MAS-10-200N-AA-MC-O; nominal length 200 mm; inside diameter 10 mm; max pressure 8 bar
    PPR valve:      Festo VPPM-8L-L-1-G14-0L8H; max pressure 8 bar
    DAQ cards:      ADVANTECH PCI-1711 (AI/AO, 12-bit resolution); MEASUREMENT PCI-QUAD04
    Linear encoder: Rational WTB5-0500MM; resolution 0.005 mm

Platform setup


CONTENTS

Chapter 1  Introduction
Chapter 2  Modeling and platform setup based on the APAM configuration
Chapter 3  Adaptive Gain ITSMC with TDE scheme for the position control
Chapter 4  Adaptive Finite-Time Force Sensorless Control scheme
Chapter 5  Model-free control for Optimal contact force in Unknown Environment
Chapter 6  Optimized Human-Robot Interaction force control using IRL
Chapter 7  Force Sensorless Reflecting Control for BHTS
Chapter 8  Conclusion and Future works


3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Proposed control idea

1. For the first time, the AITSMC-TDE scheme is proposed for a PAM system.
2. The stability of the controlled system is theoretically analyzed.
3. The control effectiveness is verified by experimental trials under different challenging working conditions, in comparison with several controllers.


3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Control design (the equation blocks on this slide were garbled in extraction; only their structure is recoverable):

- A terminal sliding surface s is chosen on the tracking errors e_1, e_2 with gains k_1, k_2.
- A proposed integral sliding surface s_0(t) is designed, embedding the nominal dynamics f(x_1, x_2) + g(x_1)u and the lumped disturbance d.
- An adaptive law updates the robust switching gain online.
- A robust control signal u_s with a sign(s) switching term is redesigned through g^(-1)(x_1) using the adaptive gain.

Proof: see Appendix 3.2, 3.3.

Block diagram of the proposed control approach: the reference trajectory (x_1d, x_2d) passes through a coordinate transfer to the terminal and integral terminal sliding surfaces; the control law u combines the nominal control signal u_n, the robust control signal u_s with its adaptation law, and a time-delay estimator, and drives the PAM system (states x_1, x_2).


3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Control design

The auxiliary (tracking-error) variables are defined as e_1 = x_1 - x_1d and e_2 = x_2 - x_2d.

The control law can be rewritten as

    u(t) = u_n + u_s - g^(-1)(x_1) d_hat

where the lumped disturbance is estimated by time-delay estimation from signals one delay t_d in the past:

    d_hat(t) = x'_2(t - t_d) - f(x_1(t - t_d), x_2(t - t_d)) - g(x_1(t - t_d)) u(t - t_d)

The nominal control signal u_n is designed from g^(-1)(x_1), the desired acceleration, f(x_1, x_2), and the sliding-surface gains (exact expression garbled in extraction).

Proof: see Appendix 3.1, 3.4-3.5.

(The slide repeats the block diagram of the proposed control approach from the previous slide.)
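The time-delay estimation step above can be illustrated on a toy scalar second-order plant: the lumped disturbance is recovered from the one-step-delayed acceleration, nominal dynamics, and input, then cancelled in the control. This sketch uses assumed toy dynamics and gains, not the dissertation's PAM model:

```python
# Illustrative time-delay estimation (TDE): for a plant x'' = f_nom + g*u + d,
# the lumped disturbance is estimated from one-step-delayed signals,
# d_hat(t) = x''(t-L) - f_nom(t-L) - g*u(t-L), and cancelled in the control.
import math

def run_tde(steps=2000, dt=1e-3, g=1.0):
    x, v = 0.0, 0.0
    d_hat = 0.0
    prev = None          # (acceleration, f_nom, u) from one step back
    errs = []
    for k in range(steps):
        t = k * dt
        d_true = 2.0 + math.sin(2.0 * t)   # slowly varying true disturbance
        f_nom = -3.0 * v - 5.0 * x         # known nominal dynamics
        if prev is not None:
            a_prev, f_prev, u_prev = prev
            d_hat = a_prev - f_prev - g * u_prev   # TDE update
            errs.append(abs(d_hat - d_true))
        u = -d_hat / g                     # cancel the estimated disturbance
        a = f_nom + g * u + d_true
        prev = (a, f_nom, u)
        v += a * dt
        x += v * dt
    return errs

errs = run_tde()
```

The estimation error is bounded by how much the disturbance changes over one delay, which is why TDE works well for slowly varying lumped uncertainties.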


3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Experimental validation

Parameters
- Proposed control (ITSMC): η1 = 3, η2 = 15, η3 = 25, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1 = γ2 = 57, k1 = 2, k2 = 55, α = 911; adaptive gain: μ = 10, δ = 0.05, σ = 0.1
- PID control: for case study 3, KP = 15, KI = 18, KD = 0.05; for the other cases, KP = 35, KI = 28, KD = 0.1

Case studies
- Case study 1 (no load): xd (mm) = 8sin(0.4πt)
- Case study 2 (no load): xd (mm) = 8sin(πt)
- Case study 3 (no load): a multistep trajectory
- Case study 4 (with load): xd (mm) = 8sin(πt), in turn with a 2-kg and a 7-kg load

Fig. 3.1 Performance response in case study 1
Fig. 3.2 Performance response in case study 2


3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Experimental validation

Fig. 3.3 Performance response in case study 3
Fig. 3.4 Performance response in case study 4

Table 3.1 A quantitative comparison between the tracking performances

    Case                              Metric   PID      ITSMC    Proposed
    Sine 0.2 Hz, 8 mm                 L2[e]    0.4102   0.1737   0.1574
                                      L2[u]    0.9961   1.0270   0.9602
    Sine 1 Hz, 8 mm                   L2[e]    1.1845   0.4996   0.3788
                                      L2[u]    1.1284   1.1152   1.0900
    Multistep                         L2[e]    0.8587   0.6846   0.6368
                                      L2[u]    0.7670   0.7333   0.7247
    Sine 1 Hz, 8 mm, 2-kg load        L2[e]    1.2578   0.8314   0.4779
                                      L2[u]    1.1414   1.1506   1.1287
    Sine 1 Hz, 8 mm, 7-kg load        L2[e]    1.5937   1.0827   0.5067
                                      L2[u]    1.1613   1.1729   1.1584
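The L2[e] and L2[u] entries in Table 3.1 summarize a sampled error or input signal by a single scalar. A common choice is the root-mean-square value; whether the dissertation normalizes by sample count in exactly this way is an assumption of this sketch:

```python
# RMS-style L2 metric over a sampled signal, as commonly used for the
# L2[e] / L2[u] entries of tracking-performance tables. The exact
# normalization used in the dissertation is an assumption here.
import math

def l2_metric(samples):
    """Root-mean-square of a sampled signal."""
    return math.sqrt(sum(s * s for s in samples) / len(samples))

# Example: the case-study-1 reference 8*sin(0.4*pi*t) sampled at 100 Hz
# over an integer number of periods has RMS 8/sqrt(2).
sine = [8.0 * math.sin(0.4 * math.pi * 0.01 * k) for k in range(10000)]
```

A lower L2[e] at comparable L2[u] means better tracking for similar control effort, which is how the table ranks the three controllers.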


3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Summary

- Robust performance against uncertainties and disturbances
- Finite-time convergence of the tracking errors
- Reduced chattering phenomenon
- Elimination of the reaching phase
- Fast transient response

Next, the work is directed to control using the force properties.



4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Proposed control idea

1. The proposed FSOB scheme obtains force information with measurement noise and static friction eliminated.
2. The FITSMC guarantees finite-time convergence; a new adaptive gain and the TDE are deployed to handle uncertainties and the chattering phenomenon online.
3. The reliability and superiority of the proposed control are demonstrated under different challenging working conditions.


4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Control design (the equation blocks on this slide were garbled in extraction; only their structure is recoverable):

- The FITSM surface σ(t) is defined on the force tracking error e_f, combining e_f, an integral term, and a fractional-power term sign(e_f)|e_f|^v.
- The control law u_c(t) combines the estimated lumped disturbance d_hat(t), a feedback gain K on the force error, and adaptive fractional-power switching terms.
- An adaptive switching gain law updates the robust gain online; the lumped disturbance d_hat is estimated by the force-based time-delay estimator; the force signal itself is obtained by a force observer based on the LeD [40], without a force sensor.

Proof: see Appendix 4.1-4.3.

Block diagram of the proposed control approach: the reference trajectory and environment configuration feed the integral terminal sliding surface and the control law (with its adaptive law); the force-based time-delay estimator and the force observer close the loop around the plant plus environment (APAM system).


4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Experimental validation

Parameters
- Proposed control (FITSMC): η1 = 3, η2 = 15, η3 = 25, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1 = γ2 = 57, k1 = 2, k2 = 55, v = 911; adaptive gain: μ = 10, δ = 0.05, σ = 0.1
- PID control: for scenario 3, KP = 15, KI = 18, KD = 0.05; for the other scenarios, KP = 35, KI = 28, KD = 0.1
- Observers (FSOB): ω = 500 rad/s, ξ = 0.001, α1 = 20, α2 = 12, α3 = 5, α4 = 15

Scenarios
- Scenario 1: τd (N) = 75sin(0.2πt - π/2) + 625
- Scenario 2: τd (N) = 75sin(0.8πt - π/2) + 625
- Scenario 3: a multistep trajectory with an LPF bandwidth of 0.2 rad/s

Fig. 4.1 Performance response in Scenario 1
Fig. 4.2 Performance response in Scenario 2


4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Fig. 4.3 Performance response in Scenario 3
Fig. 4.4 Control gain

Table 4.1 An RMSE comparison between the tracking error performances

    Scenario      PID      ISMC     FITSMC-TDE   Proposed
    Sine 0.1 Hz   2.1050   1.2152   0.9440       0.5158
    Sine 0.4 Hz   3.2054   1.8045   2.0632       1.9807
    Multistep     2.6508   2.3548   2.0618       1.7685

Discussion
- Improved control performance: fast response, high accuracy, and robustness
- Guaranteed finite-time convergence
- Online cancellation of the lumped uncertainties via the TDE solution and the switching gain in the FITSMC



5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Proposed control idea

1. The RL method is used to learn the optimal desired force online.
2. The stability of the closed-loop control is theoretically demonstrated by the Lyapunov approach.
3. The approach is verified to be effective for robot interaction control in unknown environments.


5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Control design

The environment is modeled as a spring-damper system [81]:

    F_e(t) = -C_e x'(t) + K_e (x_e - x(t))

The optimal impedance model shapes the impedance error e_r, with impedance parameters M, C, K, a fractional-power switching term, and the environment force F_e (exact expression garbled in extraction); the resulting error dynamics used for learning are

    e'_r(t) = A e_r(t) + B F_d(t)

with the force reference Z(e_r(t)) = F_d(t). The position control in task space is a PI law on the force error e_f:

    x_f(t) = K_Pf e_f(t) + K_If ∫_0^t e_f(τ) dτ

Block diagram: a task-specific outer loop (prescribed impedance model with an RL-based approximation of the desired force F_d) wraps a robot-specific inner loop (precise position control of the APAM system against the environment configuration), exchanging the signals x, x_r, e_f, e_r, F_e, and F_d.

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

LQR solution (offline)

The main objective of this work is to determine the optimal desired force F_d, i.e. the minimum force that minimizes the position error e_r:

    F_d(t) = φ_e e_r(t)

The discounted cost function:

    J_s(e_r(t), F_d(t)) = ∫_t^∞ e^(-γ(τ-t)) (e_r^T S e_r + F_d^T R F_d) dτ

The algebraic Riccati equation:

    A^T P A + S - P + A^T P B φ_e = 0

The feedback control gain:

    φ_e = -(B^T P B + R)^(-1) B^T P A

This is an offline solution; the RL solution and IRL approximation (implementation procedure below) replace it with online learning.
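The offline Riccati solution on this slide can be computed by simple fixed-point iteration. The sketch below uses the scalar error dynamics and weights listed later in the parameter slide (A = -4.006, B = -0.002, S = 1, R = 10); treating the slide's gain formula in this discrete Riccati form, and ignoring the discount factor, are assumptions of the sketch:

```python
# Fixed-point solution of the Riccati equation of the slide,
# A'PA + S - P + A'PB*phi = 0 with phi = -(B'PB + R)^(-1) B'PA,
# for scalar error dynamics. Values are the slide's listed parameters;
# the discrete form and the omitted discount are assumptions.

def solve_riccati(A=-4.006, B=-0.002, S=1.0, R=10.0, iters=200):
    P = S
    for _ in range(iters):
        phi = -(B * P * B + R) ** -1 * B * P * A   # feedback gain
        P = A * P * A + S + A * P * B * phi        # Riccati recursion
    return P, -(B * P * B + R) ** -1 * B * P * A

P, phi = solve_riccati()
# The optimal desired force is then F_d(t) = phi * e_r(t).
```

The resulting gain makes the closed-loop error dynamics contractive, which is the "minimum force that minimizes e_r" trade-off set by S and R.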


5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

RL solution

The main objective remains to determine the optimal desired force F_d minimizing the position error e_r. Recomputing the LQR solution is impossible for all time, since it requires the model for every operating condition; the RL solution finds it online.

The Hamilton-Jacobi-Bellman equation is solved for the discounted value function V_s(e_r(t)) (exact form garbled in extraction).

The new algebraic Riccati equation:

    e_r^T(t) (A^T P + P A + S - γP) e_r(t) + 2 e_r^T(t) P B F_d(t) + F_d^T(t) R F_d(t) = 0

The optimal desired force:

    F_d(t) = -H_22^(-1) H_21 e_r(t)

where the Q-value function Q(e_r(t), F_d(t)) = V_s(e_r(t)) = X^T H X, with X the joint state-action vector.

IRL approximation: the discounted cost function is

    V_s(e_r(t)) = ∫_t^∞ e^(-γ(τ-t)) r(e_r(τ), F_d(τ)) dτ,  where r(e_r, F_d) = e_r^T S e_r + F_d^T R F_d

This yields an online solution, using off-policy RL to find the solution of the given new ARE (implementation procedure below).


5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

IRL approximation

The Q-value function is approximated with N features so as to minimize the temporal-difference error:

    Q_hat(e_r, F_d) = Σ_{i=1..N} θ_i φ_i(e_r, F_d)

with Gaussian features normalized over the feature set:

    ψ_i(e_r, F_d) = exp( -(1/2) [e_r - c_i^r, F_d - c_i^d] Σ_i^(-1) [e_r - c_i^r, F_d - c_i^d]^T )
    φ_i(e_r, F_d) = ψ_i(e_r, F_d) / Σ_j ψ_j(e_r, F_d)

The parameter update law is a normalized gradient-descent update on θ driven by the temporal-difference error (exact normalization garbled in extraction).

Q-function update law based on IRL (the IRL Bellman equation over an interval Δt):

    Q(e_r(t), F_d(t)) = ∫_t^{t+Δt} e^(-γ(τ-t)) r(e_r(τ), F_d(τ)) dτ + e^(-γΔt) Q(e_r(t+Δt), F_d(t+Δt))

The reward (utility) function:

    r(t) = e_r^T(t) S e_r(t) + F_d^T(t) R F_d(t)

The optimal desired force remains F_d(t) = φ_e e_r(t).
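The normalized-Gaussian approximator above can be sketched directly; the centers and shared width here are illustrative choices, not the dissertation's tuned values, and isotropic Gaussians stand in for the general covariance Σ_i:

```python
# Sketch of the normalized-Gaussian Q-function approximator:
# Q_hat(e_r, F_d) = sum_i theta_i * phi_i(e_r, F_d), with Gaussian features
# normalized to sum to one. Centers/width are illustrative assumptions.
import math

def features(e_r, F_d, centers, width=1.0):
    """Normalized Gaussian features over the joint (e_r, F_d) input."""
    psi = [math.exp(-((e_r - cr) ** 2 + (F_d - cd) ** 2) / (2 * width ** 2))
           for cr, cd in centers]
    total = sum(psi)
    return [p / total for p in psi]

def q_hat(e_r, F_d, theta, centers):
    return sum(t * p for t, p in zip(theta, features(e_r, F_d, centers)))

# A small 3x3 grid of centers over the (e_r, F_d) plane.
centers = [(r, d) for r in (-1.0, 0.0, 1.0) for d in (-1.0, 0.0, 1.0)]
theta = [0.0] * len(centers)
```

The normalization makes each φ_i a partition-of-unity weight, so θ_i acts as a local estimate of the Q-value near center i, which is what the TD update adjusts.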


5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Validation results

Fig. 5.1 RL in unknown environments during the learning process
Fig. 5.2 Learning curves

Parameters
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = -4.006, B = -0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position/force control: KPf = 0.001, KIf = 0.02, λ = 0.2, ηΔ = 0.1, ρ1 = 0.5, ρ2 = 20, v = 911, α = 57
- Environment: Ce = 0.5 Ns·mm⁻¹, Ke = 2 N·mm⁻¹, r0 = 165 mm, xd = 39.917 mm, xe = 35 mm


5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Validation results

Fig. 5.3 Desired force/position tracking response

Three steps are required to complete the experiment:
1. identification of the environment;
2. obtaining the ARE solution (LQR solution) using the environmental estimate;
3. running the complete experiment with the position/force controller.

The IRL approximation achieves near-optimal performance without knowledge of the environment dynamics.



6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Proposed control idea

1. Find the optimal parameters of the impedance model that assure motion tracking while assisting the human with minimum effort to perform the task in real time.
2. The stability of the closed-loop control is theoretically demonstrated.
3. The unknown human dynamics, as well as the optimal overall human-robot system efficiency, is verified in experimental trials.


6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design: a task-specific outer loop around a robot-specific inner loop.

The control torque τ_s in task space combines the gains K_s, K_q, K_B on the tracking errors e_h with the NN functional approximation W_hat (exact expression garbled in extraction). The continuous unknown dynamics are approximated by an NN, with weight update laws for W_hat driven by the tracking error (also garbled in extraction).

The prescribed impedance model:

    θ_m / τ_h = K_h / (M_d s² + C_d s + K_d)

Block diagram: the human operator's torque τ_h passes through the prescribed impedance model, whose parameters are optimized by the IRL-based optimal law with gain K_h, to produce the desired trajectory θ_d; the torque control law drives the APAM-based haptic interface (output θ_m), with the NN functional approximation compensating the robot dynamics and the error e_d = θ_d - θ_m closing the loop.


6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design: IRL based on the LQR solution (task-specific outer loop, robot-specific inner loop).

The human transfer function [99] maps the human torque to the desired error, τ_h → e_d, as a first-order model G(s) with gains κ_p, κ_d and pole parameter k_e (exact form garbled in extraction). Writing the human model in state-space form e'_h = A_h e_h + B_h e_d and stacking it with the tracking-error state e_d = [e_q^T, e'_q^T]^T gives a new augmented state:

    X' = A X + B u_e,   u_e = -K X

(The slide repeats the outer/inner-loop block diagram of the previous slide.)


6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design

The performance index to minimize is the infinite-horizon quadratic cost:

    J_m = ∫_t^∞ ( e_d^T Q_d e_d + e_h^T Q_h e_h + u_e^T R u_e ) dτ

With the augmented state X' = AX + Bu_e, the optimal feedback control is

    u_e = argmin_{u_e} V(t, X(t)) = -R^(-1) B^T P X

where P is the real symmetric positive definite solution of the algebraic Riccati equation [83]:

    0 = A^T P + P A + Q - P B R^(-1) B^T P

The value function satisfies

    V(X(t)) = ∫_t^∞ X^T (Q + K^T R K) X dτ = X^T P X

together with the Lyapunov equation (A - BK)^T P + P(A - BK) = -(Q + K^T R K).

The IRL Bellman equation for a time interval ΔT > 0:

    V(X(t)) = ∫_t^{t+ΔT} X^T (Q + K^T R K) X dτ + V(X(t + ΔT))

which can be written as

    X^T(t) P X(t) = ∫_t^{t+ΔT} X^T (Q + K^T R K) X dτ + X^T(t+ΔT) P X(t+ΔT)

Proof of convergence: see Appendix 6.2.


6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig. 6.1 Continuous-time linear policy iteration algorithm

Online implementation:
1. Start with an admissible control input u_e^0 = -K^1 X (initialization: P^0 = 0, i = 1, K^1 chosen so that A - BK^1 is stable).
2. Policy evaluation: given a control policy u^i, find P^i from the off-policy IRL Bellman equation

       X^T(t) P^i X(t) = ∫_t^{t+ΔT} X^T (Q + (K^i)^T R K^i) X dτ + X^T(t+ΔT) P^i X(t+ΔT)

   solving for the cost parameters p^i by least squares over N sampled intervals.
3. Policy improvement: update the control policy

       u_e^{i+1} = -R^(-1) B^T P^i X

4. If ||p^i - p^{i-1}|| ≤ ε, stop; otherwise set i → i + 1 and repeat from step 2.
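The evaluate/improve loop of Fig. 6.1 can be sketched in model-based form (Kleinman's algorithm), where the policy-evaluation step is a Lyapunov solve instead of the data-driven least squares of the off-policy IRL variant. The scalar plant and weights below are illustrative assumptions, not the dissertation's system:

```python
# Model-based policy iteration (Kleinman's algorithm) for a scalar plant
# x' = a*x + b*u with cost integral of q*x^2 + r*u^2. The IRL variant of
# Fig. 6.1 replaces the Lyapunov solve with least squares on measured data.
# a, b, q, r are illustrative values.

def policy_iteration(a=-1.0, b=1.0, q=1.0, r=1.0, k0=0.0,
                     tol=1e-12, max_iter=100):
    k = k0                                   # initial stabilizing gain (a - b*k0 < 0)
    p_prev = float("inf")
    p = None
    for _ in range(max_iter):
        # Policy evaluation: solve 2*(a - b*k)*P = -(q + r*k*k) for P
        p = -(q + r * k * k) / (2.0 * (a - b * k))
        # Policy improvement: u = -k*x with k = b*P/r
        k = b * p / r
        if abs(p - p_prev) <= tol:           # ||p^i - p^(i-1)|| <= eps check
            break
        p_prev = p
    return p, k

p_star, k_star = policy_iteration()
```

For these values the iterates converge to the ARE solution P* = sqrt(2) - 1 with gain k* = P*, and each improved policy stays stabilizing, which is the property the convergence proof (Appendix 6.2) relies on.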


6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Validation results (simulation)

Fig. 6.2 The tracking error between the trajectory of the robot system and the prescribed impedance model
Fig. 6.3 Reference trajectory versus the state of the robot system during and after learning

Parameters
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = -4.006, B = -0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: Ks = 20, F = 10, G = 1, μ1 = μ2 = 5, k = 0.1, Z = 100, κp = 20, κd = 10, ke = 0.18 s
- Impedance model: θd = 10sin(0.2πt), Md = 1 kg, Bd = 718 Ns·mm⁻¹, Kd = 1576 N·mm⁻¹, Kh = 0.046


6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig. 6.4 The output trajectory and impedance model in the inner loop
Fig. 6.5 The desired trajectory and output trajectory in the outer loop
Fig. 6.6 Interaction force

Discussion (experiment)
- Near-perfect performance of the human-robot cooperation.
- Only a small deviation between the desired and actual trajectories.
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller.



7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Proposed control idea

1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed.
2. The stability and finite-time convergence of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified under different working conditions.


7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Control design (the equation blocks on this slide were garbled in extraction; only their structure is recoverable):

- External torque estimation: an adaptive force observer (AFOB) reconstructs the external torque τ̂_id on each side (i = m for master, i = s for slave) from the observer states, without force sensors.
- An adaptive FSOB gain κ_i is updated by a switching law driven by the observer sliding variable z_i and sign(s_i).
- The modified FNTSM surface s_i combines the tracking error e_i and its derivative through the gain λ_i and a fractional power p_i/q_i.
- The torque control law (FNTSMC-style) combines the model terms M_i, C_i, K_i on the reference acceleration, the estimated torque τ̂_id, and switching terms with gains ρ_1i, ρ_2i and exponent ν_i.

Block diagram (master-slave): the operator drives the master controller and the APAM master system; an impedance filter shapes the slave reference θ_sd; the slave controller drives the APAM slave system against the environment; estimators on both sides provide τ̂_md and τ̂_sd for force reflection.


7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig 71 Torque estimation performances

λi = 8 pi = 5 qi = 7 ςi = 10-4 νi = 911

εi = 01 ρ1i = 10 ρ2i = 2

Proposed control

PID control KP = 38 KI = 25 KD = 015

Observersfor NDO Lh = Le = 10 for RTOB β = 350 rads

for AFOB κi(0) = 01 λi = 10 ηi = 0 m = 01

a = 10

RMSEs associated with NDO RTOB and proposed algorithm

are calculated of 0211 0192 and 0076 respectively

Parameters

Scenarios- The free space situation θd (t) = 5π(1 - 018t)sin(072πt)

- The contact and recovery situation soft environment Ke = 500 Nmm-1 Be = 4 Nsmm-1

stiff environment Ke = 2800 Nmm-1 Be = 20 Nsmm-1

Validation results


7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig. 7.2 Position tracking performances
Fig. 7.3 Position tracking error performances
Fig. 7.4 Transparency performance in "soft" contact
Fig. 7.5 Transparency error in "soft" contact
Fig. 7.6 Transparency performance in "stiff" contact
Fig. 7.7 Transparency error in "stiff" contact


7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Summary

- Finite-time convergence of the tracking errors
- Elimination of the uncertainties and of the noise effect in the obtained force sensing via the new AFOB
- Robust performance against uncertainties and disturbances in the system dynamics
- Stability of the overall system and transparency performance attained simultaneously
- Fast transient response

The great transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and across different working conditions.



8 CONCLUSION AND FUTURE WORKS

Conclusions

- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions; its uncertainties and disturbances are handled.
- Validated the significant effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures excellent position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding highly effective position control.
- Achieved good transparency performance, with force feedback and position tracking attained simultaneously, by the new adaptive force estimation combined with separate FNTSMC schemes.


8 CONCLUSION AND FUTURE WORKS

Future works

- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both the master and slave sides; the computational complexity should also be reduced with respect to the data size.
- Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device for the rehabilitation field with a safer solution.


A PRESENTATION OF DOCTORAL DISSERTATION

Presenter: Cong Phat Vo
Department of Mechanical Engineering, University of Ulsan, Korea

Thank you for attending.

1 2 3 4 5 6 7 8

Page 14: A PRESENTATION OF DOCTORAL DISSERTATION

20-May-21 | 14Presenter Cong Phat Vo

2 MODELING AND PLATFORM SETUP BASED ON THE APAM CONFIGURATION

Platform setup

Fig 25 The APAM-based (a) master and (b) slave subsystem configuration

Proportional

pressure

regulator

valves

Optical

encoder

Control stick

Loadcells

Pneumatic

Muscle

Actuators Interactive

Loadcell

Pulley

Front viewBack view(a)

Proportional valve

Loadcell DamperSpring

Environment

APAM system

PAMPulley

PositionSignal

ControlSignal

StiffnessDamping

Contact Force Signal

PCPCI Cards

Control box

(b)

1 2 3 4 5 6 7 8

20-May-21 | 15Presenter Cong Phat Vo

2 MODELING OF THE BILATERAL TELEOPERATION SYSTEM

Slave robot

Environment

Master robot

Control stick

Device Description

PAM

Festo MAS-10-200N-AA-MC-O

Nominal length 200 mm

Inside diameter 10 mm

Max pressure 8 bar

PPR ValveFesto VPPM-8L-L-1-G14-0L8H

Max pressure 8 bar

DAQ Card

ADVANTECH PCI-1711

AIAO 12 bits (resolution)

MEASUREMENT PCI-QUAD04

Linear

encoder

Type Rational WTB5-0500MM

Resolution 0005 mm

Table 23 Specifications of the exp devices

Fig 26 Photograph of the experimental apparatus of the overview BHTS

Platform setup

1 2 3 4 5 6 7 8

20-May-21 | 16Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

4

5

6

C

H

A

P

T

E

R

1

7

8

Adaptive Gain ITSMC with TDE scheme for the position control3

Modeling and platform setup based on the APAM configuration2

1 2 3 4 5 6 7 8

20-May-21 | 17Presenter Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Proposed control idea

3The control effectiveness is verified by experimental trials in

different challenging work conditions (compare several controllers)

2

The stability of the controlled system is theoretically analyzed

The first time the AITSMC-TDE scheme

is proposed for a PAM system1

1 2 3 4 5 6 7 8

20-May-21 | 18Presenter Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

2 1 1 2 1s e k e k e= + +

( ) ( ) (( ) ( ) )

0

1

0 1 2 2 1 2

0 1 2 0 1 1

t

t

n d

s t s t k e k e e

f x x g x u x d

minus= minus minus +

+ + minus

( ) ( )( ) ( ) 31

0 1 3 3 4ˆ

su t g x sign

minus = minus + +

( )( ) ( )

( )3 3 0

3

ˆ ˆ1ˆ or

( )mt t

tsign t

= +=

+ =

A terminal sliding surface is chosen

A proposed integral sliding surface is designed

An adaptive law for the robust gain

A robust control signal can be redesigned

Proof See more detail in Appendix 32 33

1 2d dx x

Time-delay

Estimator

The PAM system

Control law

Robust

control signal

Adaptation

law

Nominal

control signal

Transfer

Coordinate Integral Terminal

Sliding Surface

Terminal

Sliding Surface

Reference

Trajectory

d

The proposed

control approach

u

1 2x x

su

1 2x x

s

1

2

nu

Control design

1 2 3 4 5 6 7 8

20-May-21 | 19Presenter Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

( ) ( ) ( )( )( )( ) ( )

2 0 1 2

0 1

ˆ

d d d

d d

d x t t f x t t x t t

g x t t u t t

= minus minus minus minus

minus minus minus

( ) ( )1

n su t u u g x dminus

= + minus

The control law can be rewritten

where

Proof See more detail in Appendix 31 34-35

Control design

1 1 1 1

2 2 1

de x x

x

= = minus = minus

( )( ) ( )1 2

1

0 1 1 1 0 1 2

1

1 1 1 1 1 2 2 2 2

n du g x x f x x

minus

minus

= minus minus

minus minus minus minus

The auxiliary variables are defined

The nominal control signal can be designed as

1 2d dx x

Time-delay

Estimator

The PAM system

Control law

Robust

control signal

Adaptation

law

Nominal

control signal

Transfer

Coordinate Integral Terminal

Sliding Surface

Terminal

Sliding Surface

Reference

Trajectory

d

The proposed

control approach

u

1 2x x

su

1 2x x

s

1

2

nu

1 2 3 4 5 6 7 8

20-May-21 | 20Presenter Cong Phat Vo

Parameters

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Experimental validation

for ITSMC η1 = 3 η2 = 15 η3 = 25 κ1 = κ3 = 10

κ2 = κ4 = 2 γ1 = γ2 = 57 k1 = 2 k2 = 55 α = 911

for adaptative gain μ = 10 δ = 005 σ = 01

for case study 3 KP = 15 KI = 18 KD = 005

for other case KP = 35 KI = 28 KD = 01

Proposed control

PID control

Fig 31 Performance response in case study 1

Fig 32 Performance response in case study 2

Case study 1 (no load) xd (mm) = 8sin(04πt)

Case study 2 (no load) xd (mm) = 8sin(πt)

Case study 3 (no load) A multistep trajectory

Case study 4 (with load) xd (mm) = 8sin(πt)

in turn 2-kg-load and 7-kg-load

1 2 3 4 5 6 7 8

20-May-21 | 21 | Presenter: Cong Phat Vo

Fig. 3.3: Performance response in case study 3

Fig. 3.4: Performance response in case study 4

Table 3.1: A quantitative comparison between the tracking performances

Trajectory                    Index   PID     ITSMC   Proposed
Sine 0.2 Hz – 8 mm            L2[e]   0.4102  0.1737  0.1574
                              L2[u]   0.9961  1.0270  0.9602
Sine 1 Hz – 8 mm              L2[e]   1.1845  0.4996  0.3788
                              L2[u]   1.1284  1.1152  1.0900
Multistep                     L2[e]   0.8587  0.6846  0.6368
                              L2[u]   0.7670  0.7333  0.7247
Sine 1 Hz – 8 mm, 2-kg load   L2[e]   1.2578  0.8314  0.4779
                              L2[u]   1.1414  1.1506  1.1287
Sine 1 Hz – 8 mm, 7-kg load   L2[e]   1.5937  1.0827  0.5067
                              L2[u]   1.1613  1.1729  1.1584
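The L2[·] entries above are read here as the RMS value of the error or control signal over a trial, which is an assumed definition; a one-function sketch:

```python
import math

# L2 performance index as an RMS value over sampled data; this is an
# assumed reading of the L2[e] / L2[u] entries in the table above.
def l2_index(samples):
    return math.sqrt(sum(v * v for v in samples) / len(samples))

print(l2_index([1.0, -1.0, 1.0, -1.0]))   # 1.0
```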

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Case study 1 (no load): xd (mm) = 8sin(0.4πt)
Case study 2 (no load): xd (mm) = 8sin(πt)
Case study 3 (no load): a multistep trajectory
Case study 4 (with load): xd (mm) = 8sin(πt), in turn with a 2-kg and a 7-kg load

Experimental validation


20-May-21 | 22 | Presenter: Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Summary

Robust performance against uncertainties and disturbances

Finite-time convergence of the tracking errors

Reducing the chattering phenomenon

Eliminating the reaching phase

Fast transient response

Future work will be directed to control using the force properties.


20-May-21 | 23 | Presenter: Cong Phat Vo

CONTENTS

1. Introduction
2. Modeling and platform setup based on the APAM configuration
3. Adaptive Gain ITSMC with TDE scheme
4. Adaptive Finite-Time Force Sensorless Control scheme
5. Model-free control for Optimal contact force in Unknown Environment
6. Optimized Human-Robot Interaction force control using IRL
7. Force Sensorless Reflecting Control for BHTS
8. Conclusion and Future works

20-May-21 | 24 | Presenter: Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Proposed control idea:
1. The proposed FSOB scheme obtains force information with noise and static friction eliminated.
2. The FITSMC guarantees finite-time convergence; a new adaptive gain and the TDE are deployed online to handle uncertainties and the chattering phenomenon.
3. The reliability and superiority of the proposed control are demonstrated in different challenging working conditions.


20-May-21 | 25 | Presenter: Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Control design:

The FITSM surface is defined as

  s(t) = e_f(t) + ∫₀ᵗ [λ1·e_f(τ) + λ2·|e_f(τ)|^ν sign(e_f(τ))] dτ

The control law can be rewritten as the sum of the desired-force feedforward, the estimated lumped disturbance d̂, and finite-time switching terms in e_f whose gain follows an adaptive switching-gain law. The lumped disturbance is estimated by time delay from the control input and the response one sample in the past, and the force signal itself comes from the LeD-based force observer (FSOB) [40] rather than from a force/torque sensor.

[Block diagram: reference trajectory → integral terminal sliding surface → control law (adaptive law + time-delay estimator) → APAM plant + environment, with the force observer closing the loop.]

Proof: see Appendices 4.1–4.3.


20-May-21 | 26 | Presenter: Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Parameters:
- for ITSMC: η1 = 3, η2 = 15, η3 = 25, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1 = γ2 = 5/7, k1 = 2, k2 = 5.5, ν = 9/11
- for adaptive gain: μ = 10, δ = 0.05, σ = 0.1
- for observers (FSOB): ω = 500 rad/s, ξ = 0.001, α1 = 20, α2 = 12, α3 = 5, α4 = 15
- for PID in Scenario 3: KP = 1.5, KI = 1.8, KD = 0.05; for the other scenarios: KP = 3.5, KI = 2.8, KD = 0.1

Scenarios:
- Scenario 1: τd (N) = 7.5sin(0.2πt − π/2) + 6.25
- Scenario 2: τd (N) = 7.5sin(0.8πt − π/2) + 6.25
- Scenario 3: a multistep trajectory with an LPF bandwidth of 0.2 rad/s

Fig. 4.1: Performance response in Scenario 1 (proposed control vs. PID control)

Fig. 4.2: Performance response in Scenario 2

Experimental validation


20-May-21 | 27 | Presenter: Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Fig. 4.3: Performance response in Scenario 3

Fig. 4.4: Control gain

Experimental validation

Table 4.1: An RMSE comparison between the tracking error performances

Trajectory    PID     ISMC    FITSMC-TDE  Proposed
Sine 0.1 Hz   2.1050  1.2152  0.9440      0.5158
Sine 0.4 Hz   3.2054  1.8045  2.0632      1.9807
Multistep     2.6508  2.3548  2.0618      1.7685

Discussion:
- Improved control performance (fast response, high accuracy, and robustness)
- Guaranteed finite-time convergence
- Cancelled the lumped uncertainties online via the TDE solution and the switching gain in the FITSMC



20-May-21 | 29 | Presenter: Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Proposed control idea:
1. The RL method is used to learn the optimal desired force online.
2. The stability of the closed-loop control is theoretically demonstrated by the Lyapunov approach.
3. The approach is verified to be effective for robot interaction control in unknown environments.


20-May-21 | 30 | Presenter: Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Control design (robot-specific inner loop + task-specific outer loop):

The environment is modeled as a spring-damper system [81]:

  F_e(t) = −C_e·ẋ(t) + K_e·(x_e − x(t))

The optimal (prescribed) impedance model maps the position error to the desired force, Z(e_r(t)) = F_d(t), with error dynamics

  ė_r(t) = A·e_r(t) + B·F_d(t)

The force control (outer loop) converts the force error into a position correction in task space through a PI law:

  x_f(t) = K_Pf·e_f(t) + K_If·∫₀ᵗ e_f(τ) dτ

A precise position controller (inner loop) then tracks the corrected reference on the APAM system, while an RL-based approximation learns the optimal desired force F_d.

[Block diagram: prescribed impedance model and RL-based approximation in the outer loop; force control, position control, and the APAM system in the inner loop; the environment in the feedback path.]
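A minimal sketch of the two outer-loop pieces above: the spring-damper contact force and the PI force-to-position law. All gains and environment constants are illustrative placeholders, not the dissertation's identified values:

```python
# Spring-damper environment model and PI force-to-position outer loop.
# Gains and environment constants below are illustrative placeholders.
def environment_force(x, x_dot, Ke=2.0, Ce=0.5, xe=3.5):
    """F_e = -Ce*x_dot + Ke*(xe - x)."""
    return -Ce * x_dot + Ke * (xe - x)

def pi_force_to_position(e_f, integral_e_f, KPf=0.001, KIf=0.02):
    """x_f = KPf*e_f + KIf*integral(e_f): position correction from force error."""
    return KPf * e_f + KIf * integral_e_f

# One outer-loop step: force error between desired and measured contact force.
Fd = 1.0
Fe = environment_force(x=3.0, x_dot=0.0)   # 2.0 * (3.5 - 3.0) = 1.0
e_f = Fd - Fe                               # zero error -> no correction
print(pi_force_to_position(e_f, integral_e_f=0.0))
```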

20-May-21 | 31 | Presenter: Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

LQR solution (offline):

The main objective of this work is to determine the optimal desired force F_d, i.e., the minimum force that minimizes the position error e_r, as a state feedback

  F_d(t) = ω_e·e_r(t)

The discounted cost function:

  J_s(e_r(t), F_d(t)) = ∫ₜ^∞ γ^(τ−t) [e_rᵀ(τ)·S·e_r(τ) + F_dᵀ(τ)·R·F_d(τ)] dτ

The algebraic Riccati equation:

  AᵀPA + S − P + AᵀPB·ω_e = 0

The feedback control gain:

  ω_e = −(BᵀPB + R)⁻¹·BᵀPA

Implementation procedure: this is an offline solution, which the RL/IRL approximation replaces with an online one.
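The offline route can be sketched numerically: iterate the Riccati recursion to a fixed point, then form the gain. The scalar A, B, S, R below are illustrative, and the discount is omitted for simplicity:

```python
# Offline LQR sketch for scalar error dynamics e_{k+1} = A e_k + B F_k,
# matching the gain formula above. Values are illustrative placeholders.
A, B, S, R = 0.9, 0.1, 1.0, 10.0

P = S
for _ in range(200):                        # fixed-point iteration on the ARE
    w_e = -(B * P * A) / (B * P * B + R)    # feedback gain omega_e
    P = S + A * P * A + A * P * B * w_e     # A'PA + S + A'PB*w_e = P

w_e = -(B * P * A) / (B * P * B + R)
print(w_e, abs(A + B * w_e) < 1.0)          # gain and closed-loop stability check
```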

20-May-21 | 32Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The main objective of this work is to determine the optimal desired force F_d(t) = ω_e·e_r(t), the minimum force that minimizes the position error e_r.

RL solution: solving the Riccati equation exactly requires the environment model and is impossible for all time, so the value function is learned instead.

The discounted value function:

  V_s(e_r(t)) = ∫ₜ^∞ e^(−λ(τ−t))·r(e_r(τ), F_d(τ)) dτ,  where r(e_r, F_d) = e_rᵀ·S·e_r + F_dᵀ·R·F_d

The Hamilton-Jacobi-Bellman equation:

  min over F_d of [ V̇_s(e_r(t)) − λ·V_s(e_r(t)) + r(t) ] = 0

The new algebraic Riccati equation:

  e_rᵀ(t)(AᵀP + PA + S − λP)e_r(t) + 2·e_rᵀ(t)·P·B·F_d(t) + F_dᵀ(t)·R·F_d(t) = 0

The optimal desired force:

  F_d(t) = −H₂₂⁻¹·H₂₁·e_r(t),  with Q(e_r(t), F_d(t)) = V_s(e_r(t)) = Xᵀ·H·X, X = [e_rᵀ, F_dᵀ]ᵀ

IRL approximation / implementation procedure: an online solution, using off-policy RL to find the solution of the given new ARE.

20-May-21 | 33 | Presenter: Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The main objective, determining the optimal desired force F_d = ω_e·e_r, is met by approximating the Q-value function and minimizing the temporal-difference error (IRL approximation of the LQR/RL solution).

The approximator of the Q-value function:

  Q̂(e_r, F_d) = Σᵢ₌₁ᴺ θᵢ·φ̄ᵢ(e_r, F_d)

with Gaussian basis functions

  φᵢ(e_r, F_d) = exp(−½·[e_r − c_{r,i}; F_d − c_{d,i}]ᵀ·Σᵢ⁻¹·[e_r − c_{r,i}; F_d − c_{d,i}])

normalized as φ̄ᵢ(e_r, F_d) = φᵢ(e_r, F_d) / Σⱼ₌₁ᴺ φⱼ(e_r, F_d).

Q-function update law based on IRL (the IRL Bellman equation over an interval Δt):

  Q(e_r(t), F_d(t)) = ∫ₜ^(t+Δt) e^(−λ(τ−t))·r(τ) dτ + e^(−λΔt)·Q(e_r(t + Δt), F_d(t + Δt))

The reward (utility) function:

  r(t) = e_rᵀ(t)·S·e_r(t) + F_dᵀ(t)·R·F_d(t)

The parameter update law adjusts θ by normalized gradient descent on the temporal-difference error between the two sides of this Bellman equation.
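A sketch of the normalized Gaussian-RBF Q approximator with a one-step temporal-difference update. The centers, widths, learning rate, discount, and the transition used below are all illustrative, not the dissertation's values:

```python
import math

# Normalized Gaussian-RBF Q approximator and a one-step TD update.
# Centers/widths/learning rate are illustrative placeholders.
centers = [(-1.0, -0.5), (0.0, 0.0), (1.0, 0.5)]   # (c_r, c_d) pairs
theta = [0.0, 0.0, 0.0]
sigma2, alpha, gamma = 1.0, 0.5, 0.9

def features(er, fd):
    phi = [math.exp(-0.5 * ((er - cr) ** 2 + (fd - cd) ** 2) / sigma2)
           for cr, cd in centers]
    s = sum(phi)
    return [p / s for p in phi]                    # normalized RBFs

def q_hat(er, fd):
    return sum(t * p for t, p in zip(theta, features(er, fd)))

def td_update(er, fd, reward, er_next, fd_next):
    """Move Q_hat(er, fd) toward reward + gamma * Q_hat(er', fd')."""
    delta = reward + gamma * q_hat(er_next, fd_next) - q_hat(er, fd)
    for i, p in enumerate(features(er, fd)):
        theta[i] += alpha * delta * p
    return delta

# Repeated updates on one transition shrink the TD error:
first = td_update(0.2, 0.1, 1.0, 0.1, 0.05)
for _ in range(300):
    last = td_update(0.2, 0.1, 1.0, 0.1, 0.05)
print(abs(last) < abs(first))
```

The normalization step mirrors the φ̄ᵢ = φᵢ/Σφⱼ construction above; the dissertation's IRL version additionally integrates the discounted reward over Δt rather than using a single-sample reward.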

20-May-21 | 34 | Presenter: Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Fig. 5.1: RL in unknown environments during the learning process

Fig. 5.2: Learning curves

Validation results; parameters:
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position/force control: KPf = 0.001, KIf = 0.02, λ = 0.2, ηΔ = 0.1, ρ1 = 0.5, ρ2 = 2.0, ν = 9/11, α = 5/7
- Environment: Ce = 0.5 N·s·mm⁻¹, Ke = 2 N·mm⁻¹, r0 = 16.5 mm, xd = 39.917 mm, xe = 35 mm


20-May-21 | 35 | Presenter: Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Fig. 5.3: Desired force/position tracking response

Validation results; three steps are required to complete the experiment:
1. identification of the environment;
2. obtaining the ARE solution using the environmental estimate (LQR solution);
3. the complete experiment for the position/force controller (IRL approximation).

The IRL approximation achieves near-optimal performance without knowledge of the environment dynamics.



20-May-21 | 37 | Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Proposed control idea:
1. Find the optimal parameters of the impedance model to assure motion tracking and to assist the human with minimum effort in performing the task in real time.
2. The stability of the closed-loop control is theoretically demonstrated.
3. The unknown human dynamics, as well as the optimal overall human-robot system efficiency, are verified in experimental trials.


20-May-21 | 38 | Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design (robot-specific inner loop + task-specific outer loop):

The control torque in task space consists of feedback on the impedance-model tracking error e_q, an integral term, and neural-network (NN) compensation of the unknown human/robot dynamics, where the unknown continuous function is approximated as Ẑ = Ŵᵀσ(q) with weight update laws Ŵ̇ driven by the tracking error.

The prescribed impedance model (from human torque τ_h to model trajectory θ_m):

  θ_m(s) / τ_h(s) = K_h / (M_d·s² + C_d·s + K_d)

[Block diagram: Human Operator → prescribed impedance model → torque control law → haptic interface based on APAM; an NN functional approximation supports the inner loop, and an IRL-based optimal law adapts the gain K_h in the outer loop.]


20-May-21 | 39 | Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design (task-specific outer loop, IRL based on the LQR solution):

The human transfer function [99] maps the human torque τ_h to the tracking error e_d (purpose: τ_h → e_d):

  G(s) = (κ_p + κ_d·s) / (1 + k_e·s)

A new augmented state X is formed from the impedance-model error e_q and the human-model error e_d, so that the coupled human-robot dynamics take the linear form

  Ẋ = Ā·X + B̄·u_e,  with u_e = −K·X

to which the IRL-based LQR solution is applied.

[Block diagram: as on the previous slide, with the optimal-law IRL block adapting K_h.]


20-May-21 | 40 | Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design (IRL based on the LQR solution):

The performance index (infinite-horizon quadratic cost) to minimize:

  J_m = ∫ₜ^∞ (e_dᵀ·Q_d·e_d + τ_hᵀ·Q_h·τ_h + u_eᵀ·R·u_e) dτ

With the augmented state X (Ẋ = Ā·X + B̄·u_e, u_e = −K·X), the optimal feedback control is

  u_e = argmin over u_e of V(t, X(t)) = −R⁻¹·B̄ᵀ·P·X

obtained by solving the algebraic Riccati equation [83]:

  0 = ĀᵀP + PĀ + Q − P·B̄·R⁻¹·B̄ᵀ·P

The value function satisfies

  V(X(t)) = ∫ₜ^∞ Xᵀ(Q + Kᵀ·R·K)X dτ = Xᵀ·P·X

with

  (Ā − B̄K)ᵀ·P + P·(Ā − B̄K) = −(Q + Kᵀ·R·K)

where P is the real symmetric positive definite solution.

The IRL Bellman equation for a time interval ΔT > 0 can be written as

  Xᵀ(t)·P·X(t) = ∫ₜ^(t+ΔT) Xᵀ(Q + Kᵀ·R·K)X dτ + Xᵀ(t + ΔT)·P·X(t + ΔT)

Proof of convergence: see Appendix 6.2.

20-May-21 | 41 | Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig. 6.1: Continuous-time linear policy iteration algorithm

Online implementation:
1. Start with an admissible control input u_e⁰ = −K¹X (initialization: i = 1, with Ā − B̄K¹ stable).
2. Policy evaluation: given the control policy uⁱ, find Pⁱ from the off-policy IRL Bellman equation
   Xᵀ(t)·Pⁱ·X(t) = ∫ₜ^(t+ΔT) Xᵀ(Q + (Kⁱ)ᵀ·R·Kⁱ)X dτ + Xᵀ(t + ΔT)·Pⁱ·X(t + ΔT),
   solving for the cost parameters pⁱ by least squares.
3. Policy improvement: update the control policy
   u^(i+1) = −R⁻¹·B̄ᵀ·Pⁱ·X.
Set i → i + 1 and repeat until ‖pⁱ − p^(i−1)‖ falls below a tolerance; then stop.
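The evaluate-improve loop above can be sketched in its model-based (Kleinman) form for a scalar continuous-time LQR; the IRL variant evaluates each policy from measured data by least squares instead of the Lyapunov step used here. The values a, b, q, r are illustrative (a < 0, so k = 0 is admissible):

```python
# Kleinman-style policy iteration for a scalar continuous-time LQR,
# mirroring the evaluate/improve loop above. Values are illustrative.
a, b, q, r = -1.0, 1.0, 1.0, 1.0

k = 0.0                                   # step 1: admissible initial policy
for i in range(20):
    a_c = a - b * k                       # closed-loop dynamics under u = -k*x
    p = (q + r * k * k) / (-2.0 * a_c)    # step 2: policy evaluation (Lyapunov)
    k_new = b * p / r                     # step 3: policy improvement
    if abs(k_new - k) < 1e-12:
        break
    k = k_new

# The converged p satisfies the scalar ARE 2*a*p + q - (p*b)^2/r = 0:
print(k, 2.0 * a * p + q - (p * b) ** 2 / r)
```

With these numbers the iteration converges in a handful of steps to p = √2 − 1, illustrating the quadratic convergence that makes policy iteration attractive online.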

20-May-21 | 42 | Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig. 6.2: The tracking error between the trajectory of the robot system and the prescribed impedance model

Fig. 6.3: Reference trajectory versus the state of the robot system during and after learning

Validation results (simulation); parameters:
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: Ks = 20, F = 10, G = 1, μ1 = μ2 = 5, k = 0.1, Z̄ = 100, κp = 20, κd = 10, ke = 0.18 s
- Impedance model and human: θd = 10sin(0.2πt), Md = 1 kg, Bd = 7.18 N·s·mm⁻¹, Kd = 15.76 N·mm⁻¹, Kh = 0.046


20-May-21 | 43 | Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig. 6.4: The output trajectory and impedance model in the inner loop

Fig. 6.5: The desired trajectory and output trajectory in the outer loop

Fig. 6.6: Interaction force

Discussion (experimental validation results):
- Near-perfect performance of the human-robot cooperation
- Only a small deviation between the desired and actual trajectories
- The human interaction force is reduced after learning, and the optimal set of prescribed-impedance-model parameters is found by the outer-loop controller



20-May-21 | 45 | Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Proposed control idea:
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed.
2. The stability and finite-time convergence characteristics of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified in different working conditions.


20-May-21 | 46 | Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Control design:

The external torque estimation: each side i ∈ {m, s} runs an adaptive force/sensorless observer (AFOB) whose gain κ_i is adapted online, yielding the estimate τ̂_id of the external torque from the observer state.

The modified FNTSM surface design:

  s_i = ė_i + λ_i·Φ(e_i)

with a piecewise switching function Φ that avoids the terminal-sliding singularity:

  Φ(e_i) = |e_i|^(p_i/q_i)·sign(e_i)   if ṡ_i = 0, or s_i ≠ 0 and |e_i| ≥ ς_i
  Φ(e_i) = z₁·e_i + z₂·e_i²·sign(e_i)  otherwise

The torque control design in FNTSMC style:

  τ_i = M_i·θ̈_ir + C_i·θ̇_i + K_i·θ_i − τ̂_id + ρ_1i·s_i + ρ_2i·|s_i|^(ν_i)·sign(s_i)

[Block diagram: Operator → master controller → APAM master system; slave controller → APAM slave system → environment; adaptive torque estimators on both sides and an impedance filter close the bilateral loop between τ̂_md and τ̂_sd.]

20-May-21 | 47 | Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig. 7.1: Torque estimation performances

Parameters:
- Proposed control: λi = 8, pi = 5, qi = 7, ςi = 10⁻⁴, νi = 9/11, εi = 0.1, ρ1i = 10, ρ2i = 2
- PID control: KP = 3.8, KI = 2.5, KD = 0.15
- Observers: for the NDO, Lh = Le = 10; for the RTOB, β = 350 rad/s; for the AFOB, κi(0) = 0.1, λi = 10, ηi = 0, m = 0.1, a = 10

The RMSEs associated with the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.

Scenarios:
- Free space: θd(t) = 5π(1 − 0.18t)sin(0.72πt)
- Contact and recovery: soft environment Ke = 500 N·mm⁻¹, Be = 4 N·s·mm⁻¹; stiff environment Ke = 2800 N·mm⁻¹, Be = 20 N·s·mm⁻¹

Validation results


20-May-21 | 48 | Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig. 7.2: Position tracking performances

Fig. 7.3: Position tracking error performances

Fig. 7.4: Transparency performance in "soft" contact

Fig. 7.5: Transparency error in soft contact

Fig. 7.6: Transparency performance in "stiff" contact

Fig. 7.7: Transparency error in stiff contact


20-May-21 | 49 | Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Summary:
- Finite-time convergence of the tracking errors
- Elimination of uncertainties and of the noise effect in the obtained force sensing via the new AFOB
- Robust performance against uncertainties and disturbances in the system dynamics
- Stability of the overall system and transparency performance attained simultaneously
- Fast transient response

Great transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and across different working conditions.



20-May-21 | 51 | Presenter: Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Conclusions:
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions (its uncertainties and disturbances are handled).
- Validated the significant effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding a highly effective position control.
- Achieved good transparency performance with both force feedback and position tracking simultaneously by the new adaptive force estimation combined with separate FNTSMC schemes.


20-May-21 | 52 | Presenter: Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Future works:
- To improve the control performance in joint space, the robot system model can be improved by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both the master and slave sides; note also that the computational complexity should be reduced with the data size.
- Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device for the rehabilitation field with a safer solution.


20-May-21 | 53

A PRESENTATION OF DOCTORAL DISSERTATION

Presenter: Cong Phat Vo

Department of Mechanical Engineering, University of Ulsan, Korea

Thank you for attending!


20-May-21 | 15 | Presenter: Cong Phat Vo

2 MODELING OF THE BILATERAL TELEOPERATION SYSTEM

Platform setup: master robot with control stick, slave robot, and environment.

Table 2.3: Specifications of the experimental devices

Device          Description
PAM             Festo MAS-10-200N-AA-MC-O; nominal length 200 mm; inside diameter 10 mm; max pressure 8 bar
PPR valve       Festo VPPM-8L-L-1-G14-0L8H; max pressure 8 bar
DAQ card        ADVANTECH PCI-1711, AI/AO 12-bit resolution; MEASUREMENT PCI-QUAD04
Linear encoder  Rational WTB5-0500MM; resolution 0.005 mm

Fig. 2.6: Photograph of the experimental apparatus of the overall BHTS



20-May-21 | 17 | Presenter: Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Proposed control idea:
1. For the first time, the AITSMC-TDE scheme is proposed for a PAM system.
2. The stability of the controlled system is theoretically analyzed.
3. The control effectiveness is verified by experimental trials in different challenging working conditions (comparing several controllers).


20-May-21 | 18 | Presenter: Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Control design:

A terminal sliding surface is chosen:

  s₂ = ė1 + k1·e1 + k2·|e1|^γ·sign(e1)

A proposed integral sliding surface is designed (so that s(t₀) = 0, eliminating the reaching phase):

  s(t) = s₂(t) − s₂(t₀) − ∫ from t₀ to t of [k1·ė1 + k2·(d/dτ)(|e1|^γ sign(e1)) + f0(x1, x2) + g0(x1)·u_n − ẍ1d] dτ

A robust control signal can be redesigned as a switching term on s with an adaptively tuned gain, of the form

  u_s(t) = −g0⁻¹(x1)·[κ3·s + κ4·|s|^α·sign(s) + β̂·sign(s)]

An adaptive law for the robust gain raises β̂ with |s| and freezes it inside a small boundary layer (the parameters μ, δ, σ in the list on the next slide).

[Block diagram: reference trajectory → coordinate transfer → terminal / integral terminal sliding surfaces → nominal + robust control signals with adaptation law → the PAM system, with a time-delay estimator in the feedback path.]

Proof: see Appendices 3.2 and 3.3.

20-May-21 | 19Presenter Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

( ) ( ) ( )( )( )( ) ( )

2 0 1 2

0 1

ˆ

d d d

d d

d x t t f x t t x t t

g x t t u t t

= minus minus minus minus

minus minus minus

( ) ( )1

n su t u u g x dminus

= + minus

The control law can be rewritten

where

Proof See more detail in Appendix 31 34-35

Control design

1 1 1 1

2 2 1

de x x

x

= = minus = minus

( )( ) ( )1 2

1

0 1 1 1 0 1 2

1

1 1 1 1 1 2 2 2 2

n du g x x f x x

minus

minus

= minus minus

minus minus minus minus

The auxiliary variables are defined

The nominal control signal can be designed as

1 2d dx x

Time-delay

Estimator

The PAM system

Control law

Robust

control signal

Adaptation

law

Nominal

control signal

Transfer

Coordinate Integral Terminal

Sliding Surface

Terminal

Sliding Surface

Reference

Trajectory

d

The proposed

control approach

u

1 2x x

su

1 2x x

s

1

2

nu

1 2 3 4 5 6 7 8

20-May-21 | 20Presenter Cong Phat Vo

Parameters

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Experimental validation

for ITSMC η1 = 3 η2 = 15 η3 = 25 κ1 = κ3 = 10

κ2 = κ4 = 2 γ1 = γ2 = 57 k1 = 2 k2 = 55 α = 911

for adaptative gain μ = 10 δ = 005 σ = 01

for case study 3 KP = 15 KI = 18 KD = 005

for other case KP = 35 KI = 28 KD = 01

Proposed control

PID control

Fig 31 Performance response in case study 1

Fig 32 Performance response in case study 2

Case study 1 (no load) xd (mm) = 8sin(04πt)

Case study 2 (no load) xd (mm) = 8sin(πt)

Case study 3 (no load) A multistep trajectory

Case study 4 (with load) xd (mm) = 8sin(πt)

in turn 2-kg-load and 7-kg-load

1 2 3 4 5 6 7 8

20-May-21 | 21Presenter Cong Phat Vo

Fig 33 Performance response in case study 3

Fig 34 Performance response in case study 4

Controller PID ITSMC Proposed

Sine 02 Hz ndash 8 mmL2[e] 04102 01737 01574

L2[u] 09961 10270 09602

Sine 1 Hz ndash 8 mmL2[e] 11845 04996 03788

L2[u] 11284 11152 10900

MultistepL2[e] 08587 06846 06368

L2[u] 07670 07333 07247

Sine 1 Hz ndash 8 mm

with 2-kg-load

L2[e] 12578 08314 04779

L2[u] 11414 11506 11287

Sine 1 Hz ndash 8 mm

with 7-kg-load

L2[e] 15937 10827 05067

L2[u] 11613 11729 11584

Table 31 A quantitative comparison between the tracking performances

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Case study 1 (no load) xd (mm) = 8sin(04πt)

Case study 2 (no load) xd (mm) = 8sin(πt)

Case study 3 (no load) A multistep trajectory

Case study 4 (with load) xd (mm) = 8sin(πt)

in turn 2-kg-load and 7-kg-load

Experimental validation

1 2 3 4 5 6 7 8

20-May-21 | 22Presenter Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Summary

Robust performance against uncertainties and disturbances1 2d dx x

Time-delay

Estimator

The PAM system

Control law

Robust

control signal

Adaptation

law

Nominal

control signal

Transfer

Coordinate Integral Terminal

Sliding Surface

Terminal

Sliding Surface

Reference

Trajectory

d

The proposed

control approach

u

1 2x x

su

1 2x x

s

1

2

nu

Finite-time convergence of the tracking errors

Reducing the chattering phenomenon

Eliminating the reaching phase

Fast transient response

Next works will be directed to the control using the force properties

1 2 3 4 5 6 7 8

20-May-21 | 23Presenter Cong Phat Vo

CONTENTS

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

5

6

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Adaptive Finite-Time Force Sensorless Control scheme4

1 2 3 4 5 6 7 8

20-May-21 | 24Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

3 Verified the reliability and superiority of the proposed control

demonstrated in different challenging work conditions

2

The FITSMC guarantees the finite convergence new adaptive gain and TDE

are deployed to handle online (uncertainties and chattering phenomenon)

The proposed FSOB scheme obtains

force information eliminated noise and

static friction

1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 25Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

( )[ ]

1 20

( ) ( ( )) = + +t

v

f f fe e sign e d

( ) ( )[ 1]5

5

4

ˆ

+

minus= minus

v

signsign

1 1 [ ]

1 2

[ ]

4 5

ˆ( ) ( ) ( )

ˆ ( )

v

c d f f

v

u t d t K e sign e

sign

minus minus= + + +

+ +

1ˆ ˆ ˆ( ) ( ) ( ) ( )minus minus = minus minus minusd d c e dd t d t t u t t K t t

The FITSM surface is defined

The control law can be rewritten

2

2

ˆ ˆ ˆ2ˆ

+ += e e e e e e

e

e

An adaptive switching gain law An estimated lumped disturbance Force observer based-LeD [40]

APAM system

Force based

Time-delay

Estimator

Integral Terminal

Sliding SurfaceControl law

Adaptive law

Environment

Configuration

Reference

Trajectoryu

d

ˆe

d

+

+

The proposed control

approach

2k

Plant + Environment

Force Observer

e

Proof See in Appendix 41 ndash 43

Control design

1 2 3 4 5 6 7 8

20-May-21 | 26Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

for ITSMC η1 = 3 η2 = 15 η3 = 25 κ1 = κ3 = 10

κ2 = κ4 = 2 γ1 = γ2 = 57 k1 = 2 k2 = 55 v = 911

for adaptative gain μ = 10 δ = 005 σ = 01

for case study 3 KP = 15 KI = 18 KD = 005

for other case KP = 35 KI = 28 KD = 01

Proposed control

PID control

Fig 42 Performance response in Scenario 2

Scenario 1 τd (N) = 75sin(02πt- π2)+625

Scenario 2 τd (N) = 75sin(08πt- π2)+625

Scenario 3 A multistep trajectory with a LPFrsquos bandwidth

of 02(rads)

Observersfor FSOB ω = 500 rads ξ = 0001 α1 = 20 α2 = 12

α3 = 5 α4 = 15

Fig 42 Performance response in Scenario 1

Parameters

Experimental validation

1 2 3 4 5 6 7 8

20-May-21 | 27Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Fig 43 Performance response in Scenario 3

Fig 44 Control gain

Controller PID ISMC FITSMC-TDE Proposed

Sine 01 Hz 21050 12152 09440 05158

Sine 04 Hz 32054 18045 20632 19807

Multistep 26508 23548 20618 17685

Table 41 A RMSE comparison between the tracking error performances

Experimental validation

- Improved the control performances (fast response high

accuracy and robustness)

- Guaranteed the finite convergence

- Cancelling the lumped uncertainties via TDE solution and

the switching gain in the FITSMC online

Discussion

1 2 3 4 5 6 7 8

20-May-21 | 28Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

4

6

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Model-free control for Optimal contact force in Unknown Environment5

1 2 3 4 5 6 7 8

20-May-21 | 29Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

3 Verified effective for robot interaction control in

unknown environments

2 The stability of the closed-loop control is theoretically demonstrated

by the Lyapunov approach

The RL method is used to learn on-line

the optimal desired force1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 30Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Control design

The environment is modeled as a spring-damper system [81]:
Fe(t) = -Ce ẋ(t) + Ke (xe - x(t))

The optimal impedance model relates the impedance error to the desired force, Z(er(t)) = Fd(t), and yields the prescribed-impedance error dynamics:
ėr(t) = A er(t) + B Fd(t)

The force control drives the impedance error dynamics M ër(t) + C ėr(t) + K er(t) to zero using a robust switching term -ρ1(·) - ρ2 sign(·)^v together with the environment force Fe(t).

The position control in task space:
xf(t) = KPf ef(t) + KIf ∫₀ᵗ ef(τ) dτ

Architecture: a robot-specific inner loop (precise position control of the APAM system) and a task-specific outer loop (prescribed impedance model, environment configuration, and RL-based approximation that supplies the optimal desired force Fd to the force control).
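A minimal numerical sketch of the spring-damper environment model above; the constants mirror the parameter slide (Ce = 0.5 Ns/mm, Ke = 2 N/mm, xe = 35 mm), while the evaluation points are illustrative:

```python
# Spring-damper environment force: Fe = -Ce*xdot + Ke*(xe - x)
CE, KE, XE = 0.5, 2.0, 35.0  # Ns/mm, N/mm, mm

def environment_force(x, xdot):
    """Contact force exerted by the modeled environment at position x (mm)."""
    return -CE * xdot + KE * (XE - x)

print(environment_force(34.0, 0.0))  # 1 mm penetration at rest -> 2.0 N
```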


5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Implementation procedure: LQR solution (offline)

The main objective of this work is to determine the optimal desired force Fd, i.e., the minimum force that minimizes the position error er:
Fd(t) = κe er(t)

The discounted cost function:
Js(er(t), Fd(t)) = ∫ₜ^∞ e^(-γ(τ-t)) (erᵀ S er + Fdᵀ R Fd) dτ

The algebraic Riccati equation:
Aᵀ P A + S - γP + Aᵀ P B κe = 0

The feedback control gain:
κe = -(Bᵀ P B + R)⁻¹ Bᵀ P A

This is an offline solution: it requires full knowledge of the error dynamics (A, B).
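The offline LQR step can be sketched for a scalar system via Riccati value iteration; A, B here are illustrative numbers (not the identified plant), and the discount factor γ is omitted for simplicity:

```python
# Scalar discrete-time LQR: iterate the Riccati recursion, then form the
# feedback gain kappa_e = -(B P B + R)^-1 B P A as on the slide.
A, B, S, R = 0.9, 0.1, 1.0, 10.0

P = 0.0
for _ in range(500):
    P = S + A * P * A - (A * P * B) ** 2 / (B * P * B + R)

kappa_e = -B * P * A / (B * P * B + R)          # feedback control gain
residual = S + A * P * A + A * P * B * kappa_e - P  # ARE residual (gamma = 1)
print(abs(residual) < 1e-9, kappa_e < 0)
```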


5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Implementation procedure: RL solution (online)

The offline gain Fd(t) = κe er(t) is impossible to compute for all time when the dynamics are unknown, so an online solution uses off-policy RL to find the solution of a new ARE.

The discounted cost (value) function:
Vs(er(t)) = ∫ₜ^∞ e^(-γ(τ-t)) r(er, Fd) dτ, where r(er, Fd) = erᵀ S er + Fdᵀ R Fd

The Hamilton-Jacobi-Bellman equation:
min над Fd [ V̇s(er(t)) - γ Vs(er(t)) + r(t) ] = 0

The new algebraic Riccati equation:
erᵀ(t)(Aᵀ P + P A + S - γP) er(t) + 2 erᵀ(t) P B Fd(t) + Fdᵀ(t) R Fd(t) = 0

The optimal desired force:
Fd(t) = -H₂₂⁻¹ H₂₁ er(t), where Q(er(t), Fd(t)) = Vs(er(t)) = Xᵀ H X with X = [erᵀ, Fdᵀ]ᵀ


5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Implementation procedure: IRL approximation (minimize the temporal-difference error)

The approximator of the Q-value function:
Q̂(er, Fd) = Σᵢ₌₁ᴺ ϑᵢ φ̄ᵢ(er, Fd)

Gaussian basis functions:
φᵢ(er, Fd) = exp( -½ [er - cri, Fd - cdi] Σᵢ⁻¹ [er - cri, Fd - cdi]ᵀ ),
normalized as φ̄ᵢ(er, Fd) = φᵢ(er, Fd) / Σᵢ₌₁ᴺ φᵢ(er, Fd)

The Q-function update law based on IRL:
Q(er(t), Fd(t)) = ∫ₜ^(t+Δt) e^(-γ(τ-t)) r(er, Fd) dτ + e^(-γΔt) Q(er(t+Δt), Fd(t+Δt))

The reward (utility) function:
r(t) = erᵀ(t) S er(t) + Fdᵀ(t) R Fd(t)

The parameters ϑ are updated by a normalized gradient-descent law that minimizes the temporal-difference error between Q̂(er(t), Fd(t)) and its IRL target, and the learned Q-function yields the desired force Fd(t) = κe er(t).
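A compact sketch of the normalized Gaussian-basis Q approximator above; the centers, width, and weights are illustrative placeholders, not the learned values:

```python
import math

# Normalized Gaussian radial-basis approximation of Q(e_r, F_d).
CENTERS = [(-1.0, -1.0), (0.0, 0.0), (1.0, 1.0)]  # (c_ri, c_di), illustrative
SIGMA2 = 0.5          # shared isotropic variance, stands in for Sigma_i
THETA = [0.3, 0.1, 0.6]  # weights theta_i, illustrative

def q_hat(e_r, f_d):
    phi = [math.exp(-0.5 * ((e_r - cr) ** 2 + (f_d - cd) ** 2) / SIGMA2)
           for cr, cd in CENTERS]
    total = sum(phi)
    phi_bar = [p / total for p in phi]      # normalized basis functions
    return sum(t * p for t, p in zip(THETA, phi_bar))

# The normalized basis is a convex combination, so Q_hat is bounded by THETA.
print(min(THETA) <= q_hat(0.2, -0.1) <= max(THETA))
```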


5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Validation results

Parameters
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = -4.006, B = -0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position/force control: KPf = 0.001, KIf = 0.02, λ = 0.2, ηΔ = 0.1, ρ1 = 0.5, ρ2 = 20, v = 9/11, α = 5/7
- Environment: Ce = 0.5 Ns·mm⁻¹, Ke = 2 N·mm⁻¹, r0 = 16.5 mm, xd = 39.917 mm, xe = 35 mm

Fig. 5.1 RL in unknown environments during the learning process
Fig. 5.2 Learning curves


5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Validation results

Fig. 5.3 Desired force/position tracking response

Three steps are required to complete the experiment:
1. identification of the environment;
2. obtaining the ARE solution using the environmental estimate (LQR solution);
3. the complete experiment for the position/force controller (IRL approximation).

The IRL approximation achieves near-optimal performance without knowledge of the environment dynamics.


CONTENTS (next: Chapter 6, Optimized Human-Robot Interaction force control using IRL)


6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Proposed control idea
1. The stability of the closed-loop control is theoretically demonstrated.
2. The optimal parameters of the impedance model are found to assure motion tracking and to assist the human with minimum effort to perform the task in real time.
3. The unknown human dynamics, as well as the optimal overall human-robot system efficiency, are verified in experimental trials.


6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design: robot-specific inner loop and task-specific outer loop

The control torque in task space τs combines a Ks-weighted feedback on the impedance error eh (with integral action) and an NN-based compensation of the unknown dynamics.

An approximation of the continuous function (NN, linear in the weights):
f(q) = Ŵᵀ φ(q),
with the weights Ŵ adapted at gains F and G plus a σ-modification leakage term k‖·‖Ŵ that keeps them bounded.

The prescribed impedance model:
θm(s) = Kh τh(s) / (Md s² + Cd s + Kd)

Block diagram: the human operator applies τh; the prescribed impedance model generates the reference θm; the torque control law with the NN functional approximation drives the APAM-based haptic interface; the IRL-based optimal law tunes the impedance-model parameters from the tracking error ed.


6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design: IRL based on the LQR solution

Purpose: shape the mapping τh → ed.

The human transfer function [99] is modeled as a first-order system:
G(s) = pd / (ke s + 1),
so the human-error state satisfies
ėh = Ah eh + Bh ēd, with Ah = -ke⁻¹, Bh = ke⁻¹ pd, and ēd = [edᵀ ėdᵀ]ᵀ.

Stacking the tracking error ēd and the human error eh gives a new augmented state X with linear dynamics
Ẋ = A X + B ue, ue = -K X.

(Block diagram as on the previous slide.)


6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design: RL and IRL solutions

The performance index to minimize (infinite-horizon quadratic cost):
Jm = ∫ₜ^∞ ( ēdᵀ Qd ēd + ehᵀ Qh eh + ueᵀ R ue ) dτ

with the augmented state dynamics Ẋ = AX + Bue, ue = -KX.

RL solution: solve the algebraic Riccati equation [83]
0 = Aᵀ P + P A + Q - P B R⁻¹ Bᵀ P,
which gives the optimal feedback control
ue = arg min V(t, X(t)) = -R⁻¹ Bᵀ P X.

IRL solution: the value function
V(X(t)) = ∫ₜ^∞ Xᵀ (Q + Kᵀ R K) X dτ = Xᵀ P X,
where P is the real symmetric positive-definite solution of the Lyapunov equation
(A - BK)ᵀ P + P (A - BK) = -(Q + Kᵀ R K).

The IRL Bellman equation for a time interval ΔT > 0:
V(X(t)) = ∫ₜ^(t+ΔT) Xᵀ (Q + Kᵀ R K) X dτ + V(X(t + ΔT)),
which can be written as
Xᵀ(t) P X(t) = ∫ₜ^(t+ΔT) Xᵀ (Q + Kᵀ R K) X dτ + Xᵀ(t+ΔT) P X(t+ΔT).

Proof of convergence: see Appendix 6.2.
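The interval Bellman identity can be checked numerically for a scalar closed loop; a, b, q, r, and K below are illustrative values, not the dissertation's identified parameters:

```python
# Scalar closed loop xdot = (a - b*K) x with stage cost (qc + K^2 r) x^2.
a, b, qc, r, K = -1.0, 1.0, 1.0, 1.0, 0.5
acl = a - b * K                      # closed-loop pole, must be negative
P = -(qc + K * K * r) / (2.0 * acl)  # Lyapunov: 2*acl*P = -(qc + K^2 r)

x0, dT, N = 2.0, 0.5, 100000
dt = dT / N
integral, x = 0.0, x0
for _ in range(N):                   # integrate stage cost over [t, t+dT]
    integral += (qc + K * K * r) * x * x * dt
    x += acl * x * dt                # forward-Euler state propagation

lhs = P * x0 * x0                    # V(X(t))
rhs = integral + P * x * x           # running cost + V(X(t+dT))
print(abs(lhs - rhs) < 1e-2)
```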


6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig. 6.1 Continuous-time linear policy iteration algorithm

Online implementation
1. Start with an admissible control input ue⁰ = -K¹ X (initialization: P⁰ = 0, i = 1, K¹ such that A - BK¹ is Hurwitz).
2. Policy evaluation: given a control policy uⁱ, find Pⁱ from the off-policy Bellman equation
   Xᵀ(t) Pⁱ X(t) = ∫ₜ^(t+ΔT) Xᵀ (Q + (Kⁱ)ᵀ R Kⁱ) X dτ + Xᵀ(t+ΔT) Pⁱ X(t+ΔT),
   solving for the cost parameters pⁱ by least squares.
3. Policy improvement: update the control policy ueⁱ⁺¹ = -R⁻¹ Bᵀ Pⁱ X.
Set i → i+1 and repeat until ‖pⁱ - pⁱ⁻¹‖ falls below a tolerance.
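The policy iteration loop above can be sketched for a scalar system, where policy evaluation reduces to a scalar Lyapunov equation and the CARE solution is known in closed form (a, b, q, r are illustrative values):

```python
import math

# Scalar continuous-time LQR policy iteration (Kleinman-style sketch).
a, b, q, r = 1.0, 1.0, 1.0, 1.0

K = 2.0 * a / b + 1.0     # admissible initial gain: a - b*K < 0
for _ in range(20):
    # Policy evaluation: (a-bK)P + P(a-bK) = -(q + K^2 r)
    P = (q + K * K * r) / (2.0 * (b * K - a))
    # Policy improvement: u = -R^-1 B^T P x  =>  K = b P / r
    K = b * P / r

# Closed-form CARE solution for comparison: 2aP - b^2 P^2 / r + q = 0
P_care = (a + math.sqrt(a * a + b * b * q / r)) * r / (b * b)
print(abs(P - P_care) < 1e-9)
```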


6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Validation results (simulation)

Parameters
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = -4.006, B = -0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: Ks = 20, F = 10, G = 1, μ1 = μ2 = 5, k = 0.1, Z̄ = 100, κp = 20, κd = 10, ke = 0.18 s
- Model: θd = 10 sin(0.2πt), Md = 1 kg, Bd = 7.18 Ns·mm⁻¹, Kd = 15.76 N·mm⁻¹, Kh = 0.046

Fig. 6.2 Tracking error between the robot-system trajectory and the prescribed impedance model
Fig. 6.3 Reference trajectory versus the state of the robot system during and after learning


6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Validation results (experiment)

Fig. 6.4 The output trajectory and impedance model in the inner loop
Fig. 6.5 The desired trajectory and output trajectory in the outer loop
Fig. 6.6 Interaction force

Discussion
- Excellent performance of the human-robot cooperation
- Only a small deviation between the desired and actual trajectories
- The human interaction force is reduced after learning, and the optimal set of prescribed-impedance-model parameters is found by the outer-loop controller


CONTENTS (next: Chapter 7, Force Sensorless Reflecting Control for BHTS)


7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Proposed control idea
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed.
2. The stability and finite-time convergence characteristic of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified under different working conditions.


7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Control design

External torque estimation: on each side i ∈ {m, s}, the external torque τ̂id is recovered without a force sensor by passing the torque residual of the identified dynamics through a first-order estimator whose gain is adapted online (adaptive FSOB gain); the switching adaptation zi is active only outside a small boundary around the sliding surface.

The modified FNTSM surface combines linear and fractional-power error terms (parameters λi, ςi, pi/qi):
si = ėi + λi ei + ςi |ei|^(pi/qi) sign(ei)

The torque control design by fast-NTSMC compensates the estimated external torque and adds robust reaching terms:
τi = Mi θ̈ir + Ci θ̇i + Ki θi - τ̂id - ρ1i si - ρ2i |si|^(νi) sign(si)

Bilateral architecture: the operator drives the master controller and the APAM master system, while the environment interacts with the APAM slave system and slave controller; each side has its own estimator, and the estimated torques τ̂md and τ̂sd are exchanged through an impedance filter to reflect forces between the two sides.
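The sensorless estimation idea (recovering the external torque from the dynamics residual) can be sketched for a scalar joint; the first-order filter and plant numbers below are illustrative, not the dissertation's AFOB design:

```python
# Sensorless external-torque estimate: low-pass filter the residual
# r = M*theta_ddot + C*theta_dot - tau  (scalar joint, illustrative values).
M, C = 1.0, 0.5
LAM = 20.0                 # observer bandwidth (rad/s)
dt, T = 1e-4, 2.0

theta = theta_dot = tau_hat = 0.0
tau_ext = 1.5              # constant unknown external torque to recover
for _ in range(int(T / dt)):
    tau = 2.0              # actuator torque (arbitrary constant command)
    theta_ddot = (tau + tau_ext - C * theta_dot) / M
    # The residual equals the true external torque when the model is exact.
    residual = M * theta_ddot + C * theta_dot - tau
    tau_hat += LAM * (residual - tau_hat) * dt   # first-order filter
    theta_dot += theta_ddot * dt
    theta += theta_dot * dt

print(abs(tau_hat - tau_ext) < 1e-3)
```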


7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results

Fig. 7.1 Torque estimation performances

Parameters
- Proposed control: λi = 8, pi = 5, qi = 7, ςi = 10⁻⁴, νi = 9/11, εi = 0.1, ρ1i = 10, ρ2i = 2
- PID control: KP = 38, KI = 25, KD = 0.15
- Observers: for the NDO, Lh = Le = 10; for the RTOB, β = 350 rad/s; for the AFOB, κi(0) = 0.1, λi = 10, ηi = 0, m = 0.1, a = 10

Scenarios
- Free space: θd(t) = 5π(1 - 0.18t) sin(0.72πt)
- Contact and recovery: soft environment Ke = 500 N·mm⁻¹, Be = 4 Ns·mm⁻¹; stiff environment Ke = 2800 N·mm⁻¹, Be = 20 Ns·mm⁻¹

The RMSEs of the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.


7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results

Fig. 7.2 Position tracking performances
Fig. 7.2 Position tracking error performances
Fig. 7.2 Transparency performance in "soft" contact
Fig. 7.2 Transparency error in "soft" contact
Fig. 7.2 Transparency performance in "stiff" contact
Fig. 7.2 Transparency error in "stiff" contact


7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Summary
- Finite-time convergence of the tracking errors
- Elimination of the uncertainties and the noise effect in the obtained force sensing via the new AFOB
- Robust performance against uncertainties and disturbances in the system dynamics
- Stability of the overall system and the transparency performance are attained simultaneously
- Fast transient response

Great transparency performance and global stability of the BHTS are achieved under the proposed control scheme, despite the uncertainties and across different working conditions.


CONTENTS (next: Chapter 8, Conclusion and Future works)


8 CONCLUSION AND FUTURE WORKS

Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions (uncertainties and disturbances are handled).
- Validated the significant effectiveness on finite-time force tracking problems of the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures excellent position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding a highly effective position control.
- Achieved good transparency performance, with both force feedback and position tracking simultaneously, by the new adaptive force estimation combined with separate FNTSMC schemes.


8 CONCLUSION AND FUTURE WORKS

Future works
- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both the master and slave sides; the computational complexity should also be reduced with the data size.
- Based on the prototype developed in this thesis, a high-quality commercial teleoperation device for the rehabilitation field, with a safer solution, can be developed.


A PRESENTATION OF DOCTORAL DISSERTATION

Presenter: Cong Phat Vo
Department of Mechanical Engineering, University of Ulsan, Korea

Thank you for attending.


CONTENTS (next: Chapter 3, Adaptive Gain ITSMC with TDE scheme for the position control)


3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Proposed control idea
1. For the first time, the AITSMC-TDE scheme is proposed for a PAM system.
2. The stability of the controlled system is theoretically analyzed.
3. The control effectiveness is verified by experimental trials under different challenging working conditions (comparison with several controllers).


3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Control design

A terminal sliding surface is chosen on the tracking error, of the form
s₂ = ė₁ + k₁ e₁ + k₂ sig(e₁)^γ.

A proposed integral terminal sliding surface is designed as
s₀(t) = s(t) - s(t₀) - ∫ₜ₀ᵗ [ k₁ ė₁ + k₂ d/dτ(sig(e₁)^γ) + f₀(x₁, x₂) + g₀(x₁) uₙ - ẍ₁d ] dτ,
which eliminates the reaching phase since s₀(t₀) = 0.

A robust control signal can be redesigned as
us(t) = -ĝ₀(x₁)⁻¹ ( η₃ s₀ + η̂₄ sign(s₀) ),
with an adaptive law that tunes the robust gain η̂₄ online from |s₀| (with dead-zone δ and leakage σ), so the gain need not be chosen conservatively.

Proof: see Appendix 3.2, 3.3.

Block diagram: the reference trajectory is passed through a coordinate transfer into the terminal and integral terminal sliding surfaces; the nominal control signal uₙ and the robust control signal us (with its adaptation law) are summed into the control law u, the TDE compensates the lumped disturbance d̂, and u drives the PAM system.


3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Control design

The auxiliary variables are defined as e₁ = x₁ - x₁d and e₂ = ė₁ = x₂ - ẋ₁d.

The nominal control signal can be designed as
uₙ = g₀(x₁)⁻¹ ( ẍ₁d - f₀(x₁, x₂) - k₁ ė₁ - k₂ d/dt(sig(e₁)^γ) ).

The time-delay estimate of the lumped disturbance uses signals delayed by td:
d̂₂(x, t) = ẋ₂(t - td) - f₀(x(t - td)) - g₀(x₁(t - td)) u(t - td).

The control law can be rewritten as
u(t) = uₙ + us - g₀(x₁)⁻¹ d̂.

Proof: see Appendix 3.1, 3.4-3.5.

(Block diagram as on the previous slide.)
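The TDE idea above, using the plant signals one sample in the past as the disturbance estimate, can be sketched for a scalar system; the plant and disturbance below are illustrative:

```python
import math

# Time-delay estimation: d_hat(t) = xdot2(t-L) - f0(t-L) - g0*u(t-L),
# for a scalar plant xdot2 = f0(x2) + g0*u + d (illustrative numbers).
g0, dt = 2.0, 1e-3
x2 = 0.0
prev = None           # (xdot2, f0, u) from the previous sample
max_err = 0.0
for k in range(5000):
    t = k * dt
    d = 0.8 * math.sin(0.5 * t)   # slowly varying true disturbance
    u = 0.1                       # arbitrary constant input
    f0 = -x2
    xdot2 = f0 + g0 * u + d
    if prev is not None:
        d_hat = prev[0] - prev[1] - g0 * prev[2]   # one-sample-delayed TDE
        max_err = max(max_err, abs(d_hat - d))
    prev = (xdot2, f0, u)
    x2 += xdot2 * dt

# The estimation error is bounded by one sample of the disturbance variation.
print(max_err < 1e-3)
```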


3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Experimental validation

Parameters
- Proposed control: for the ITSMC, η1 = 3, η2 = 15, η3 = 25, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1 = γ2 = 5/7, k1 = 2, k2 = 55, α = 9/11; for the adaptive gain, μ = 10, δ = 0.05, σ = 0.1
- PID control: for case study 3, KP = 15, KI = 18, KD = 0.05; for the other cases, KP = 35, KI = 28, KD = 0.1

Case studies
- Case study 1 (no load): xd (mm) = 8 sin(0.4πt)
- Case study 2 (no load): xd (mm) = 8 sin(πt)
- Case study 3 (no load): a multistep trajectory
- Case study 4 (with load): xd (mm) = 8 sin(πt), in turn with a 2-kg and a 7-kg load

Fig. 3.1 Performance response in case study 1
Fig. 3.2 Performance response in case study 2


Experimental validation

Fig. 3.3 Performance response in case study 3
Fig. 3.4 Performance response in case study 4

Table 3.1 A quantitative comparison between the tracking performances

Trajectory                        Metric   PID      ITSMC    Proposed
Sine 0.2 Hz - 8 mm                L2[e]    0.4102   0.1737   0.1574
                                  L2[u]    0.9961   1.0270   0.9602
Sine 1 Hz - 8 mm                  L2[e]    1.1845   0.4996   0.3788
                                  L2[u]    1.1284   1.1152   1.0900
Multistep                         L2[e]    0.8587   0.6846   0.6368
                                  L2[u]    0.7670   0.7333   0.7247
Sine 1 Hz - 8 mm, 2-kg load       L2[e]    1.2578   0.8314   0.4779
                                  L2[u]    1.1414   1.1506   1.1287
Sine 1 Hz - 8 mm, 7-kg load       L2[e]    1.5937   1.0827   0.5067
                                  L2[u]    1.1613   1.1729   1.1584


3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Summary
- Robust performance against uncertainties and disturbances
- Finite-time convergence of the tracking errors
- Reduced chattering phenomenon
- Elimination of the reaching phase
- Fast transient response

The next work is directed toward control using the force properties.


CONTENTS (next: Chapter 4, Adaptive Finite-Time Force Sensorless Control scheme)


4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Proposed control idea
1. The proposed FSOB scheme obtains force information with the noise and static friction eliminated.
2. The FITSMC guarantees finite-time convergence; a new adaptive gain and the TDE are deployed to handle uncertainties and the chattering phenomenon online.
3. The reliability and superiority of the proposed control are demonstrated under different challenging working conditions.


4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Control design

The FITSM surface is defined on the force error e_f:
σ(t) = e_f + ∫₀ᵗ ( κ₁ e_f + κ₂ |e_f|^v sign(e_f) ) dτ

An adaptive switching gain law tunes the robust gain online: the gain grows only while the trajectory is outside a boundary layer around σ = 0, which limits chattering.

The control law can be rewritten as the sum of the TDE term d̂(t), the surface feedback κ₁ e_f + κ₂ |e_f|^v sign(e_f), and the adaptive switching term on σ; the estimated lumped disturbance d̂(t) is reconstructed from the one-sample-delayed control input and the measured response (TDE).

Force observer based on the LeD [40]: a second-order low-pass structure with natural frequency ω and damping ξ recovers the environment torque τ̂e from the drive dynamics without a force sensor.

Block diagram: the reference trajectory and the environment configuration feed the control law; the force-based time-delay estimator and the force observer provide d̂ and τ̂e, the adaptive law tunes the switching gain, and the control u drives the APAM system.

Proof: see Appendix 4.1-4.3.
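The FITSM surface above can be evaluated numerically from a force-error trace; a sketch with illustrative gains (κ₁, κ₂, and v here are placeholders, not the tuned values):

```python
# Integral terminal sliding surface on a force error e_f:
#   sigma(t) = e_f(t) + integral_0^t (k1*e_f + k2*|e_f|^v*sign(e_f)) dtau
def sig_pow(e, v):
    """Signed fractional power |e|^v * sign(e)."""
    return (abs(e) ** v) * (1.0 if e >= 0 else -1.0)

def fitsm_surface(e_trace, dt, k1=2.0, k2=1.0, v=9.0 / 11.0):
    integ, sigma = 0.0, []
    for e in e_trace:
        integ += (k1 * e + k2 * sig_pow(e, v)) * dt
        sigma.append(e + integ)
    return sigma

trace = [0.5, 0.4, 0.3, 0.2, 0.1]   # illustrative decaying force error (N)
print(fitsm_surface(trace, dt=0.01)[0])
```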


4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Experimental validation

Parameters
- Proposed control: for the ITSMC, η1 = 3, η2 = 15, η3 = 25, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1 = γ2 = 5/7, k1 = 2, k2 = 55, v = 9/11; for the adaptive gain, μ = 10, δ = 0.05, σ = 0.1
- PID control: for case study 3, KP = 15, KI = 18, KD = 0.05; for the other cases, KP = 35, KI = 28, KD = 0.1
- Observers: for the FSOB, ω = 500 rad/s, ξ = 0.001, α1 = 20, α2 = 12, α3 = 5, α4 = 15

Scenarios
- Scenario 1: τd (N) = 7.5 sin(0.2πt - π/2) + 6.25
- Scenario 2: τd (N) = 7.5 sin(0.8πt - π/2) + 6.25
- Scenario 3: a multistep trajectory with an LPF bandwidth of 0.2 rad/s

Fig. 4.1 Performance response in Scenario 1
Fig. 4.2 Performance response in Scenario 2

1 2 3 4 5 6 7 8

20-May-21 | 27Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Fig 43 Performance response in Scenario 3

Fig 44 Control gain

Controller PID ISMC FITSMC-TDE Proposed

Sine 01 Hz 21050 12152 09440 05158

Sine 04 Hz 32054 18045 20632 19807

Multistep 26508 23548 20618 17685

Table 41 A RMSE comparison between the tracking error performances

Experimental validation

- Improved the control performances (fast response high

accuracy and robustness)

- Guaranteed the finite convergence

- Cancelling the lumped uncertainties via TDE solution and

the switching gain in the FITSMC online

Discussion

1 2 3 4 5 6 7 8

20-May-21 | 28Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

4

6

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Model-free control for Optimal contact force in Unknown Environment5

1 2 3 4 5 6 7 8

20-May-21 | 29Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

3 Verified effective for robot interaction control in

unknown environments

2 The stability of the closed-loop control is theoretically demonstrated

by the Lyapunov approach

The RL method is used to learn on-line

the optimal desired force1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 30Presenter Cong Phat Vo

( ) ( ) ( ( ))= minus + minuse e e eF t C x t K x x t

( ( )) ( )r dZ e t F t= [ ]

1 2

( ) ( ) ( )

( ) ( )

= + +

minus minus minus +

r r r

v

e

Me t Ce t Ke t

sign F t

The force control

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The environment is modeled as

a spring-damper system [81]

The optimal impedance model The position control in task space

( ) ( ) ( )

= + f Pf f If f

t

x s K e t K e d

Task-specific Outer-loopRobot-specific Inner-loop

Precise position

control

Prescribed

Impedance Model

Environment

Configuration

APAM system

xx

d

Force controlx

rx

fe

r

Fe

Fd

ef

-

+ -+ -

+

RL-based

approximation

ϕ

Control design

1 2 3 4 5 6 7 8

( ) ( ) ( )r r de t Ae t BF t= +

20-May-21 | 31Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

LQR solution

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

( ) ( )=d e rF t e t

The discounted cost function

( ) ( )( ) ( ) ( ) ( ) ( ) ( )

= minus +T T

s r d r r d dt

J e t F t e Se F RF d

The Algebraic Riccati equation

0+ + + =T T

eA PA S P A PB

The feedback control gain

( )1

1 minus

minus= minus +T T

e B PB R B PA

IRL approximationRL solution

Implementation procedure

1 2 3 4 5 6 7 8

offline soliton

20-May-21 | 32Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

LQR solution RL solution

impossible for all time

The Hamilton-Jacobi-Bellman equation

( ) ( ) ( )min ( ) ( ) ( ) 0

minus minus + minus =

d

t

s rtF

V e t r t r e dt

The new Algebraic Riccati equation

( )( ) ( )

2 ( ) ( ) ( ) ( ) 0

T T

r r

T T

r d d d

e t A P PA S P e t

e t PBF t F t RF t

+ + minus

+ + =

The optimal desired force 1

22 21( ) ( )d rF t H H e tminus= minus

( ) ( ) ( ) ( ) ( )= = T

r d s rQ e t F t V e t X HX

IRL approximation

The discounted cost function

( ) ( )

( )

( )( ) ( ) ( )

where ( ) ( ) ( ) ( ) ( ) ( )

t

s r r dt

T T

r d r r d d

V e t r e F e d

r e F e Se F RF

minus minus=

= +

Implementation procedure

1 2 3 4 5 6 7 8

( ) ( )=d e rF t e t

online soliton using the off-policy RL to

find the solution of the given new ARE

20-May-21 | 33Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

LQR solution RL solution

minimize temporal difference error

The approximator of the Q-value function

( )( ) ( )1

ˆ N

r d i r d i

i

Q e F e F =

=

The parameter update law

11( ) exp

2

T

i r d r i d i i r i d ie F e c F c e c F c minus = minus minus minus minus minus

( )

( ) ( ) ( ) ( )

ˆ ( ) ( ) ( )1( ) ( )

( )

=

= minus +

r d

t t t t

Q e t F t tt t

t t

IRL approximation

Q-function update law based on IRL

( )

( )

( )( ) ( ) ( )

( ) ( )

t tt

r dt

t

r d

Q e t F t r e d

e Q e t t F t t

+

minus minus

minus

=

+ + +

( ) ( ) ( ) ( ) ( )T T

r r d dr t e t Se t F t RF t= +

The reward function of the utility function

Implementation procedure

1 2 3 4 5 6 7 8

1

( )( )

( )

i r di r d N

i r di

e Fe F

e F

=

=

( ) ( )=d e rF t e t

20-May-21 | 34Presenter Cong Phat Vo

Fig 51 RL in unknown environments during the

learning process

Fig 52 Learning curves

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

κe = 00025 S = 1 R = 10 T = 0001 A = minus4006

B = minus0002

υ = 09 γ = 09 δΔ = 065 ε = 01 ε-decay = 099

K = 10 β = 01

LQR solution

RL aproximation

PositionForce controlKPf = 0001 KIf = 002 λ = 02 ηΔ = 01 ρ1 = 05

ρ2 = 20 v = 911 α = 57

EnvironmentCe = 05 Nsmm-1 Ke = 2 N mm-1 r0 = 165 mm

xd = 39917 mm xe = 35 mm

Parameters

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 35Presenter Cong Phat Vo

Fig 53 Desired forceposition tracking response

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Three steps are required to complete

the experiment

1-the identification of the environment

2-obtain the ARE solution using the

environmental estimation

3-the complete experiment for the

positionforce controller

IRL approximation

Achieving near-optimal performance without

knowledge of the environment dynamics

LQR solution

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 36Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

4

5

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Optimized Human-Robot Interaction force control using IRL6

1 2 3 4 5 6 7 8

20-May-21 | 37Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

3 The unknown human dynamics as well as the optimal overall

human-robot system efficiency is verified in experimental trials

2

Find the optimal parameters of the impedance model to

assure motion tracking and also assist the human with

minimum effort to perform the task in real time

The stability of the closed-loop control is theoretically

demonstrated

1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 38Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

( )1 20

ˆ ( )

ˆ (|| || )

= + + +

+ + minus

t

s h h h

q B h h

K e e e d

K Z Z K

The control torque in task space

1 2

1 2 1

2 1 2

ˆ ˆˆ ( ) ( )

ˆ ˆˆ ˆ ˆ || ||

ˆˆ ˆ ˆ( ) || ||

=

= minus minus = minus

T T

T T T

h h h

T

h h

q W W q

W F F W q kF W

W Gq W kG W

2

h hm

d d d

K

M s C s K

=

+ +

An approximation of the continuous functionThe prescribed

impedance model

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

Control design

1 2 3 4 5 6 7 8

20-May-21 | 39Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

1

1

[ ]

minus

minus

= +

= minus

=

=

h h h h d

h e

h e p d

T T T

d d d

A B e

A k

B k

e e e

( )1

+=

+

p d

e

sG s

k s

The human transfer function [99] Purpose τh rarr ed

0

0

= + = +

= minus

q d qd e

e

h h hh e

A e Be X AX Buu

B A u KX

A new augmented state

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

IRL based on LQR solution

Control design

1 2 3 4 5 6 7 8

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design: RL solution versus IRL solution

The infinite-horizon quadratic performance index to minimize:
  J_m = ∫_t^∞ ( e_dᵀ Q_d e_d + τ_hᵀ Q_h τ_h + u_eᵀ R u_e ) dτ

With the augmented state Ẋ = AX + Bu_e and u_e = −KX, the LQR solution solves the algebraic Riccati equation [83]:
  0 = AᵀP + PA + Q − PB R⁻¹ BᵀP

An optimal feedback control:
  u_e = arg min_{u_e} V(t, X(t)) = −R⁻¹ Bᵀ P X

The value function satisfies
  V(X(t)) = ∫_t^∞ Xᵀ(Q + KᵀRK)X dτ = XᵀPX,
where P is the real symmetric positive-definite solution of
  (A − BK)ᵀP + P(A − BK) = −(Q + KᵀRK)

The IRL Bellman equation for a time interval ΔT > 0:
  V(X(t)) = ∫_t^{t+ΔT} Xᵀ(Q + KᵀRK)X dτ + V(X(t + ΔT))
which can be written as
  Xᵀ(t) P X(t) = ∫_t^{t+ΔT} Xᵀ(Q + KᵀRK)X dτ + Xᵀ(t+ΔT) P X(t+ΔT)

Proof of convergence: see Appendix 6.2.

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Online implementation (Fig. 6.1 Continuous-time linear policy iteration algorithm)

1. Initialization: start with an admissible control input u_e⁰ = −K¹X (P⁰ = 0, i = 1, K¹ chosen so that A − BK¹ is stable).

2. Policy evaluation: given the control policy uⁱ, find Pⁱ from the off-policy IRL Bellman equation
   Xᵀ(t) Pⁱ X(t) = ∫_t^{t+ΔT} Xᵀ(Q + (Kⁱ)ᵀ R Kⁱ) X dτ + Xᵀ(t+ΔT) Pⁱ X(t+ΔT),
   solving for the cost parameter vector pⁱ by least squares over N sampled intervals.

3. Policy improvement: update the control policy
   u_e^{i+1} = −R⁻¹ Bᵀ Pⁱ X

4. Convergence check: if ‖pⁱ − pⁱ⁻¹‖ is below a tolerance, stop; otherwise set i → i + 1 and repeat from step 2.
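The policy-iteration loop above can be sketched in scalar form. This is a hedged, model-based sketch: a closed-form Lyapunov solve stands in for the data-driven least-squares evaluation of the slide's off-policy step, and A, B, Q, R are illustrative values, not the dissertation's identified parameters.

```python
# Scalar continuous-time policy iteration (evaluation + improvement).
# Model-based Lyapunov solve replaces the data-driven least-squares
# step of the off-policy IRL version; all values are illustrative.
import math

def policy_iteration(A, B, Q, R, K1, tol=1e-10, max_iter=50):
    K = K1  # admissible initial policy: A - B*K must be stable
    P_prev = 0.0
    for _ in range(max_iter):
        Ac = A - B * K
        assert Ac < 0, "policy must keep the closed loop stable"
        # Policy evaluation: solve (A-BK)P + P(A-BK) = -(Q + K*R*K)
        P = -(Q + R * K * K) / (2.0 * Ac)
        # Policy improvement: u = -R^{-1} B P x
        K = B * P / R
        if abs(P - P_prev) < tol:
            break
        P_prev = P
    return P, K

# Closed-form scalar ARE check: 0 = 2AP + Q - P^2 B^2 / R
A, B, Q, R = -1.0, 1.0, 1.0, 1.0
P_pi, K_pi = policy_iteration(A, B, Q, R, K1=0.0)
P_are = (A + math.sqrt(A * A + B * B * Q / R)) * R / (B * B)
print(P_pi, P_are)  # both ≈ 0.41421 (√2 − 1)
```

Each iteration keeps the closed loop stable and the cost P decreases monotonically to the ARE solution, which is the convergence property the slide's flowchart checks via ‖pⁱ − pⁱ⁻¹‖.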

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Validation results: simulation

Parameters
- LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δ_Δ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: K_s = 20, F = 10, G = 1, μ₁ = μ₂ = 5, k = 0.1, Z̄ = 100, κ_p = 20, κ_d = 10, and k_e = 0.18 s
- Impedance model: θ_d = 10 sin(0.2πt), M_d = 1 kg, B_d = 7.18 N·s·mm⁻¹, K_d = 15.76 N·mm⁻¹, K_h = 0.046

Fig. 6.2 The tracking error between the trajectory of the robot system and the prescribed impedance model.
Fig. 6.3 Reference trajectory versus the state of the robot system during and after learning.

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Validation results: experiment

Fig. 6.4 The output trajectory and the impedance model in the inner loop.
Fig. 6.5 The desired trajectory and the output trajectory of the outer loop.
Fig. 6.6 Interaction force.

Discussion
- Excellent performance of the human-robot cooperation.
- Only a small deviation between the desired and actual trajectories.
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller.


7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Proposed control idea

1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed.

2. The stability and finite-time convergence of the closed-loop control are theoretically analyzed by the Lyapunov approach.

3. The effectiveness of the proposed control method is verified in different working conditions.

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Control design

Block diagram: the operator drives the APAM master system and the environment loads the APAM slave system; on each side i ∈ {m, s}, an estimator reconstructs the external torque without a force sensor (τ̂_md on the master, τ̂_sd on the slave), the estimates are exchanged through an impedance filter, and separate master and slave controllers close the bilateral loop.

- The external torque estimation: an observer driven by the joint state and the applied torque recovers the external torque d_i, with an adaptive FSOB gain that switches between a growth law and a decay law z_i sign(·) according to the sliding variable s_i and the threshold ε_i, so the observer stays fast without amplifying noise.
- The modified FNTSM surface s_i combines the tracking error and its derivative with fractional-power terms (powers p_i, q_i) for fast finite-time convergence.
- The torque control design (FNTSMC-style) applies the nominal dynamics M_i, C_i, K_i along the reference acceleration, compensates the estimated external torque d̂_id, and adds the robust reaching terms ρ_{1i} sign(s_i)|s_i|^{ν_i} and ρ_{2i} s_i.
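The sensorless torque estimation can be sketched with a fixed-gain momentum observer on a 1-DOF joint. This is a simplified stand-in for the adaptive AFOB above: the constant gain L replaces the switched adaptive gain, and M, L, and the torque values are illustrative assumptions.

```python
# Momentum-observer sketch of sensorless external-torque estimation on
# a 1-DOF joint M*qdd = tau + d_ext. A fixed observer gain L stands in
# for the adaptive AFOB gain; M, L, and the torques are assumptions.

def simulate(M=0.5, L=50.0, d_ext=2.0, dt=1e-4, T=1.0):
    q_dot = 0.0
    integ = 0.0      # running integral of (tau + d_hat)
    d_hat = 0.0
    for _ in range(int(T / dt)):
        tau = 1.0    # constant commanded torque (illustrative)
        q_dd = (tau + d_ext) / M
        q_dot += q_dd * dt
        integ += (tau + d_hat) * dt
        d_hat = L * (M * q_dot - integ)   # residual converges to d_ext
    return d_hat

print(round(simulate(), 3))  # → 2.0 (the 2.0 N·m external torque)
```

The residual obeys ḋ̂ = L(d_ext − d̂), a first-order lag with rate L; the adaptive gain of the AFOB effectively schedules this rate online.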

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results

Parameters
- Proposed control: λ_i = 8, p_i = 5, q_i = 7, ς_i = 10⁻⁴, ν_i = 9/11, ε_i = 0.1, ρ_{1i} = 10, ρ_{2i} = 2
- PID control: K_P = 38, K_I = 25, K_D = 0.15
- Observers: for the NDO, L_h = L_e = 10; for the RTOB, β = 350 rad/s; for the AFOB, κ_i(0) = 0.1, λ_i = 10, η_i = 0, m = 0.1, a = 10

Scenarios
- Free space: θ_d(t) = 5π(1 − 0.18t) sin(0.72πt)
- Contact and recovery: soft environment K_e = 500 N·mm⁻¹, B_e = 4 N·s·mm⁻¹; stiff environment K_e = 2800 N·mm⁻¹, B_e = 20 N·s·mm⁻¹

Fig. 7.1 Torque estimation performances. The RMSEs of the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results

Fig. 7.2 Position tracking performances.
Fig. 7.3 Position tracking error performances.
Fig. 7.4 Transparency performance in "soft" contact.
Fig. 7.5 Transparency error in "soft" contact.
Fig. 7.6 Transparency performance in "stiff" contact.
Fig. 7.7 Transparency error in "stiff" contact.

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Summary
- Finite-time convergence of the tracking errors.
- Fast transient response.
- Elimination of the uncertainties and of the noise effect in the obtained force sensing via the new AFOB.
- Robust performance against uncertainties and disturbances in the system dynamics.
- Stability of the overall system and transparency performance are attained simultaneously.

Great transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and in different working conditions.


8 CONCLUSION AND FUTURE WORKS

Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions; its uncertainties and disturbances are handled.
- Validated the significant effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding a highly effective position control purpose.
- Achieved good transparency performance with both force feedback and position tracking simultaneously by the new adaptive force estimation combined with separate FNTSMC schemes.

8 CONCLUSION AND FUTURE WORKS

Future works
- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both the master and slave sides; the computational complexity should also be reduced by limiting the data size.
- Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device for the rehabilitation field with a safer solution.

A PRESENTATION OF DOCTORAL DISSERTATION

Presenter: Cong Phat Vo
Department of Mechanical Engineering, University of Ulsan, Korea

Thank you for attending.


3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Proposed control idea

1. For the first time, the AITSMC-TDE scheme is proposed for a PAM system.

2. The stability of the controlled system is theoretically analyzed.

3. The control effectiveness is verified by experimental trials in different challenging working conditions, in comparison with several controllers.

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Control design

Block diagram: the reference trajectory x_d is transferred to error coordinates and passed through a terminal sliding surface and then an integral terminal sliding surface; the control law u applied to the PAM system is the sum of a nominal control signal, a robust control signal, and an adaptation law, while a time-delay estimator reconstructs the lumped uncertainty from the delayed states x₁, x₂ and the delayed input.

- A terminal sliding surface s is chosen from the tracking errors e₁ and e₂ = ė₁ with the gains k₁ and k₂.
- A proposed integral sliding surface s₀(t) augments s(t) with the integral over [0, t] of the surface dynamics and the nominal model f₀(x₁, x₂) + g₀(x₁)u − ẍ_{1d}, so that s₀(0) = 0 and the reaching phase is eliminated.
- The robust control signal can be redesigned as u_s(t) = −g₀⁻¹(x₁)(η̂₃ + η₄) sign(s₀).
- An adaptive law for the robust gain raises η̂₃ at the rate μ⁻¹|s₀| while the sliding variable is large and decays it through σ otherwise, so the switching gain is not overestimated.

Proof: see Appendix 3.2 and 3.3 for more detail.

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Control design

The auxiliary variables are defined as e₁ = x₁ − x_{1d} and e₂ = ė₁ = x₂ − ẋ_{1d}.

The nominal control signal inverts the nominal model,
  u_n = g₀⁻¹(x₁) [ ẍ_{1d} − f₀(x₁, x₂) − (sliding-surface feedback in e₁ and e₂ through the gains k₁, k₂, η₁, η₂ and the powers γ₁, γ₂) ]

The lumped uncertainty is estimated by time delay with the delay time t_d:
  d̂(t) = ẋ₂(t − t_d) − f₀(x₁(t − t_d), x₂(t − t_d)) − g₀(x₁(t − t_d)) u(t − t_d)

The control law can be rewritten as
  u(t) = u_n + u_s − g₀⁻¹(x₁) d̂

Proof: see Appendix 3.1 and 3.4-3.5 for more detail.
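The TDE step above can be sketched on a scalar plant: the lumped term at time t is read off from the acceleration and input one sample earlier. The plant, disturbance, and feedback gains here are illustrative assumptions, not the PAM model.

```python
# Time-delay estimation (TDE) sketch on x'' = g0*u + d: with the
# nominal model f0 = 0, d_hat(t) = a(t - t_d) - g0*u(t - t_d), i.e.
# the true lumped term delayed by one sample. Gains are illustrative.
import math

def run(dt=1e-3, T=4.0, g0=1.0):
    x = v = 0.0
    u_prev = a_prev = 0.0
    err = 0.0
    for k in range(int(T / dt)):
        t = k * dt
        d_hat = a_prev - g0 * u_prev          # TDE of the lumped term
        e = x - math.sin(t)                   # tracking error vs x_d
        de = v - math.cos(t)
        u = (-math.sin(t) - 10.0 * e - 5.0 * de - d_hat) / g0
        d = 2.0 + 0.5 * x                     # true lumped disturbance
        a = g0 * u + d                        # actual acceleration
        v += a * dt
        x += v * dt
        u_prev, a_prev = u, a
        err = abs(e)
    return err

print(run() < 1e-2)  # → True: error stays small once TDE cancels d
```

Because d̂ equals the true lumped term delayed by one sample, the closed loop sees only the one-step increment of d, which is why TDE works without any disturbance model.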

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Experimental validation

Parameters
- Proposed control: for the ITSMC, η₁ = 3, η₂ = 15, η₃ = 25, κ₁ = κ₃ = 10, κ₂ = κ₄ = 2, γ₁ = γ₂ = 5/7, k₁ = 2, k₂ = 5.5, α = 9/11; for the adaptive gain, μ = 10, δ = 0.05, σ = 0.1
- PID control: K_P = 15, K_I = 18, K_D = 0.05 for case study 3; K_P = 35, K_I = 28, K_D = 0.1 for the other cases

Case studies
- Case study 1 (no load): x_d (mm) = 8 sin(0.4πt)
- Case study 2 (no load): x_d (mm) = 8 sin(πt)
- Case study 3 (no load): a multistep trajectory
- Case study 4 (with load): x_d (mm) = 8 sin(πt), in turn with a 2-kg load and a 7-kg load

Fig. 3.1 Performance response in case study 1.
Fig. 3.2 Performance response in case study 2.

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Experimental validation

Fig. 3.3 Performance response in case study 3.
Fig. 3.4 Performance response in case study 4.

Table 3.1 A quantitative comparison between the tracking performances

  Trajectory                     Metric   PID      ITSMC    Proposed
  Sine 0.2 Hz - 8 mm             L2[e]    0.4102   0.1737   0.1574
                                 L2[u]    0.9961   1.0270   0.9602
  Sine 1 Hz - 8 mm               L2[e]    1.1845   0.4996   0.3788
                                 L2[u]    1.1284   1.1152   1.0900
  Multistep                      L2[e]    0.8587   0.6846   0.6368
                                 L2[u]    0.7670   0.7333   0.7247
  Sine 1 Hz - 8 mm, 2-kg load    L2[e]    1.2578   0.8314   0.4779
                                 L2[u]    1.1414   1.1506   1.1287
  Sine 1 Hz - 8 mm, 7-kg load    L2[e]    1.5937   1.0827   0.5067
                                 L2[u]    1.1613   1.1729   1.1584
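The L2[e] and L2[u] entries in Table 3.1 are scalar summaries of the whole trial. Assuming L2[·] denotes the RMS value of the sampled signal (an assumption; the slide does not define it), the metric is:

```python
# RMS summary metric, a common reading of the L2[.] columns in
# Table 3.1 (assumed definition); data below is synthetic.
import math

def l2(signal):
    """Root-mean-square of a sampled signal."""
    return math.sqrt(sum(v * v for v in signal) / len(signal))

# A sine error of amplitude a has RMS a/sqrt(2) over whole periods;
# e.g. an amplitude of ~0.2226 mm would give the table's 0.1574 entry.
n = 10000  # 10 s at 1 kHz = two full 0.2 Hz periods
e = [0.2226 * math.sin(2 * math.pi * 0.2 * k / 1000.0) for k in range(n)]
print(round(l2(e), 4))  # → 0.1574
```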

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Summary
- Robust performance against uncertainties and disturbances.
- Finite-time convergence of the tracking errors.
- Reduction of the chattering phenomenon.
- Elimination of the reaching phase.
- Fast transient response.

Next works will be directed to control using the force properties.


4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Proposed control idea

1. The proposed FSOB scheme obtains the force information with noise and static friction eliminated.

2. The FITSMC guarantees finite-time convergence; a new adaptive gain and the TDE are deployed to handle the uncertainties and the chattering phenomenon online.

3. The reliability and superiority of the proposed control are demonstrated in different challenging working conditions.

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Control design

Block diagram: the reference force trajectory and the environment configuration feed an integral terminal sliding surface; the control law for the APAM system sums the surface feedback, an adaptive law, and a force-based time-delay estimate of the lumped disturbance d, while the contact force itself is recovered by a force observer (based on LeD [40]) instead of a force sensor.

- The FITSM surface is defined on the force tracking error e_f as
    σ(t) = e_f + ∫₀ᵗ [ λ₁ e_f + λ₂ |e_f|^v sign(e_f) ] dτ
- The control law combines the estimated lumped disturbance d̂_d, the surface feedback through the gain K, and an adaptive fractional-power switching term with the gains α₄, α₅.
- An adaptive switching-gain law grows with the surface magnitude and decays when σ is small, avoiding gain overestimation.
- The lumped disturbance estimate d̂_d(t) is built by time delay from d̂_d(t − t_d), the control u_c(t − t_d), and the delayed observed force.

Proof: see Appendix 4.1-4.3.
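The adaptive switching-gain idea can be sketched as follows. The grow/leak structure mirrors the description above, while the exact law and the constants are illustrative assumptions (the chapter-3 slide lists μ = 10, δ = 0.05, σ = 0.1 for its adaptive gain).

```python
# Sketch of an adaptive switching-gain law: the robust gain grows
# while the sliding variable is outside a small band (dead zone delta)
# and leaks away inside it, so it is only as large as the disturbance
# requires. mu, delta, sigma are illustrative assumptions.

def adapt_gain(s_traj, eta0=0.0, mu=10.0, delta=0.05, sigma=0.1, dt=1e-3):
    eta = eta0
    hist = []
    for s in s_traj:
        if abs(s) > delta:
            eta_dot = abs(s) / mu      # grow while |s| is large
        else:
            eta_dot = -sigma * eta     # leak inside the band
        eta = max(0.0, eta + eta_dot * dt)
        hist.append(eta)
    return hist

# |s| large for 1 s, then inside the band for 1 s
traj = [0.5] * 1000 + [0.01] * 1000
h = adapt_gain(traj)
print(round(h[999], 3), h[1999] < h[999])  # → 0.05 True
```

Because the gain stops growing once the sliding variable enters the band, the switching amplitude stays near the minimum needed, which is what reduces chattering relative to a fixed conservative gain.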

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Experimental validation

Parameters
- Proposed control: for the ITSMC, η₁ = 3, η₂ = 15, η₃ = 25, κ₁ = κ₃ = 10, κ₂ = κ₄ = 2, γ₁ = γ₂ = 5/7, k₁ = 2, k₂ = 5.5, v = 9/11; for the adaptive gain, μ = 10, δ = 0.05, σ = 0.1
- PID control: K_P = 15, K_I = 18, K_D = 0.05 for scenario 3; K_P = 35, K_I = 28, K_D = 0.1 for the other scenarios
- Observers: for the FSOB, ω = 500 rad/s, ξ = 0.001, α₁ = 20, α₂ = 12, α₃ = 5, α₄ = 15

Scenarios
- Scenario 1: τ_d (N) = 7.5 sin(0.2πt − π/2) + 6.25
- Scenario 2: τ_d (N) = 7.5 sin(0.8πt − π/2) + 6.25
- Scenario 3: a multistep trajectory with an LPF bandwidth of 0.2 rad/s

Fig. 4.1 Performance response in Scenario 1.
Fig. 4.2 Performance response in Scenario 2.

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Experimental validation

Fig. 4.3 Performance response in Scenario 3.
Fig. 4.4 Control gain.

Table 4.1 An RMSE comparison between the tracking error performances

  Trajectory     PID      ISMC     FITSMC-TDE   Proposed
  Sine 0.1 Hz    2.1050   1.2152   0.9440       0.5158
  Sine 0.4 Hz    3.2054   1.8045   2.0632       1.9807
  Multistep      2.6508   2.3548   2.0618       1.7685

Discussion
- Improved control performance: fast response, high accuracy, and robustness.
- Guaranteed finite-time convergence.
- Online cancellation of the lumped uncertainties via the TDE solution and the switching gain in the FITSMC.


5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Proposed control idea

1. The RL method is used to learn the optimal desired force online.

2. The stability of the closed-loop control is theoretically demonstrated by the Lyapunov approach.

3. The approach is verified to be effective for robot interaction control in unknown environments.

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Control design: robot-specific inner loop and task-specific outer loop

Block diagram: the outer loop compares the desired force F_d (produced by the RL-based approximation ϕ from the impedance error) with the environment force F_e; the force error e_f passes through the force control to give the reference position x_r, which the precise position control of the APAM system tracks against the environment configuration.

The environment is modeled as a spring-damper system [81]:
  F_e(t) = −C_e ẋ(t) + K_e (x_e − x(t))

The optimal impedance model maps the position error e_r to the desired force, F_d(t) = Z(e_r(t)), with the error dynamics written as
  ė_r(t) = A e_r(t) + B F_d(t)

The force control enforces a target impedance M ë(t) + C ė(t) + K e(t) driven by F_e(t) with fractional-power robust terms (gains ρ₁, ρ₂ and power v), and the position control in task space applies the PI correction
  x_f(t) = K_Pf e_f(t) + K_If ∫₀ᵗ e_f(τ) dτ
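The spring-damper environment model can be sketched directly. Parameter values follow the chapter's parameter list (C_e = 0.5 N·s·mm⁻¹, K_e = 2 N·mm⁻¹, x_e = 35 mm); the clipping to zero in free space is an assumption about how loss of contact is handled.

```python
# Sketch of the spring-damper environment F_e = -Ce*xdot + Ke*(xe - x)
# past the contact point xe, zero in free space. Values follow the
# chapter's parameter list; free-space clipping is an assumption.

def contact_force(x, xdot, Ke=2.0, Ce=0.5, xe=35.0):
    if x <= xe:                       # free space: no interaction force
        return 0.0
    return -Ce * xdot + Ke * (xe - x)  # restoring force, negative into the wall

print(contact_force(34.0, 0.0))  # → 0.0  (free space)
print(contact_force(36.0, 0.0))  # → -2.0 (1 mm penetration at 2 N/mm)
```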

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Implementation procedure: LQR solution (an offline solution)

The main objective of this work is to determine the optimal desired force F_d, i.e., the minimum force that minimizes the position error e_r, through the feedback law F_d(t) = ϕ_e e_r(t).

The discounted cost function:
  J_s(e_r(t), F_d(t)) = ∫_t^∞ e^{−γ(τ−t)} ( e_rᵀ S e_r + F_dᵀ R F_d ) dτ

Solving the algebraic Riccati equation
  0 = AᵀPA + S − P − AᵀPB (BᵀPB + R)⁻¹ BᵀPA
gives the feedback control gain
  ϕ_e = −(BᵀPB + R)⁻¹ BᵀPA

This offline solution requires the environment model (A, B); the RL and IRL solutions that follow remove this requirement.

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Implementation procedure: RL solution

Since the environment dynamics are unknown, the LQR solution is impossible for all time; the RL solution instead finds F_d(t) = ϕ_e e_r(t) online, using the off-policy RL to find the solution of the given new ARE.

The discounted value function:
  V_s(e_r(t)) = ∫_t^∞ e^{−γ(τ−t)} r(e_r(τ), F_d(τ)) dτ,
  where r(e_r, F_d) = e_rᵀ S e_r + F_dᵀ R F_d

The Hamilton-Jacobi-Bellman equation:
  min_{F_d} [ r(e_r, F_d) − γ V_s(e_r) + V̇_s(e_r) ] = 0

The new algebraic Riccati equation:
  e_rᵀ(AᵀP + PA + S − γP) e_r + 2 e_rᵀ P B F_d + F_dᵀ R F_d = 0

Writing the value in Q-function form, Q(e_r(t), F_d(t)) = V_s(e_r(t)) = XᵀHX with X = [e_rᵀ, F_dᵀ]ᵀ, gives the optimal desired force
  F_d(t) = −H₂₂⁻¹ H₂₁ e_r(t)

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Implementation procedure: IRL approximation

The approximator of the Q-value function uses N basis functions:
  Q̂(e_r, F_d) = Σ_{i=1}^N θ_i ψ_i(e_r, F_d)
where each feature is a normalized Gaussian,
  φ_i(e_r, F_d) = exp( −½ [e_r − c_i^r, F_d − c_i^d] Σ_i⁻¹ [e_r − c_i^r, F_d − c_i^d]ᵀ ),
  ψ_i = φ_i / Σ_{j=1}^N φ_j

The parameter update law adjusts θ by gradient descent to minimize the temporal-difference error of the approximated Q-value.

The Q-function update law based on IRL:
  Q(e_r(t), F_d(t)) = ∫_t^{t+ΔT} e^{−γ(τ−t)} r(e_r(τ), F_d(τ)) dτ + e^{−γΔT} Q(e_r(t+ΔT), F_d(t+ΔT))

The reward function of the utility:
  r(t) = e_rᵀ(t) S e_r(t) + F_dᵀ(t) R F_d(t)

The optimal desired force is then recovered as F_d(t) = ϕ_e e_r(t).
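The normalized-Gaussian Q-approximator above can be sketched as follows; the centers, shared width, and weights are illustrative assumptions (the dissertation's centers c_i and covariances Σ_i are not given on the slide).

```python
# Sketch of Q_hat(e_r, F_d) = sum_i theta_i * psi_i(e_r, F_d) with
# normalized Gaussian features over the (error, force) pair.
# Centers, width, and weights are illustrative assumptions.
import math

CENTERS = [(-1.0, -1.0), (0.0, 0.0), (1.0, 1.0)]  # (c_r, c_d) pairs
WIDTH = 0.5                                       # shared isotropic variance

def features(e_r, F_d):
    phi = [math.exp(-0.5 * ((e_r - cr) ** 2 + (F_d - cd) ** 2) / WIDTH)
           for cr, cd in CENTERS]
    s = sum(phi)
    return [p / s for p in phi]       # normalized: features sum to 1

def q_hat(theta, e_r, F_d):
    return sum(t * p for t, p in zip(theta, features(e_r, F_d)))

psi = features(0.1, 0.2)
print(abs(sum(psi) - 1.0) < 1e-12)    # → True (normalization holds)
```

The normalization makes Q̂ a convex combination of the weights at any point, which keeps the temporal-difference gradient update well conditioned.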

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Validation results

Parameters
- LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δ_Δ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position/force control: K_Pf = 0.001, K_If = 0.02, λ = 0.2, η_Δ = 0.1, ρ₁ = 0.5, ρ₂ = 2.0, v = 9/11, α = 5/7
- Environment: C_e = 0.5 N·s·mm⁻¹, K_e = 2 N·mm⁻¹, r₀ = 16.5 mm, x_d = 39.917 mm, x_e = 35 mm

Fig. 5.1 RL in unknown environments during the learning process.
Fig. 5.2 Learning curves.

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Validation results

Fig. 5.3 Desired force/position tracking response.

With the LQR solution, three steps are required to complete the experiment:
1. identification of the environment;
2. obtaining the ARE solution using the environmental estimation;
3. the complete experiment for the position/force controller.

With the IRL approximation, near-optimal performance is achieved without knowledge of the environment dynamics.


20-May-21 | 37Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

3 The unknown human dynamics as well as the optimal overall

human-robot system efficiency is verified in experimental trials

2

Find the optimal parameters of the impedance model to

assure motion tracking and also assist the human with

minimum effort to perform the task in real time

The stability of the closed-loop control is theoretically

demonstrated

1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 38Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

( )1 20

ˆ ( )

ˆ (|| || )

= + + +

+ + minus

t

s h h h

q B h h

K e e e d

K Z Z K

The control torque in task space

1 2

1 2 1

2 1 2

ˆ ˆˆ ( ) ( )

ˆ ˆˆ ˆ ˆ || ||

ˆˆ ˆ ˆ( ) || ||

=

= minus minus = minus

T T

T T T

h h h

T

h h

q W W q

W F F W q kF W

W Gq W kG W

2

h hm

d d d

K

M s C s K

=

+ +

An approximation of the continuous functionThe prescribed

impedance model

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

Control design

1 2 3 4 5 6 7 8

20-May-21 | 39Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

1

1

[ ]

minus

minus

= +

= minus

=

=

h h h h d

h e

h e p d

T T T

d d d

A B e

A k

B k

e e e

( )1

+=

+

p d

e

sG s

k s

The human transfer function [99] Purpose τh rarr ed

0

0

= + = +

= minus

q d qd e

e

h h hh e

A e Be X AX Buu

B A u KX

A new augmented state

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

IRL based on LQR solution

Control design

1 2 3 4 5 6 7 8

20-May-21 | 40Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

The performance index

( )T T T

m d d d h h h e et

J e Q e Q u Ru d

= + +

10 T TA P PA Q PB R BPminus= + + minus

The algebraic Riccati equation [83]

0

0

q d qd

e

h h hh

e

A e BeX u

B A

u KX

= = +

= minus

A new augmented state

IRL solutionRL solution

An optimal feeback control

To minimize

solve

The infinite-horizon quadratic cost

Control design

The IRL Bellman equation for time interval ΔT gt 0

Proof of convergence See in Appendix 62

1 2 3 4 5 6 7 8

( ) 1

( )

arg min ( ) ( )e

T

e eu t

u V t X t u R B PX

minus

= = minus

( )7 7( ( )) ( ) ( ) ( ( ))t T

T T T

tV X t X Q K RK X d V X t T

+

= + minus +

( )7 ( ( )) ( ) ( )T T T T

tV X t X Q K RK X d X PX

= + =

( ) ( ) ( )T TA BK P P A BK Q K RKminus + minus = minus +

where P is the real symmetric positive definite solution

( )( ) ( ) = ( ) ( )

( ) ( )

t tT T T T

t

T

X t PX t X Q K RK X d

X t t PX t t

+

+

+ + +

The IRL Bellman equation can be written as

20-May-21 | 41Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 61 Continuous-time linear policy iteration algorithm

1 Start with an admissible control input

0 0

1= minuseu K X

2 Policy evaluation given a control policy ui find the

Pi using off-policy Bellman equation

1 1

1

+ minus= minusi T i

eu R B P X

3 Policy Improvement update the control policy

Online implementation

Control design

1 2 3 4 5 6 7 8

( )( )( ) ( )= ( ) ( )

( ) ( )

t t TT i T i i T

t

T i

X t P X t X Q K RK X d

X t t P X t t

+

+

+ + +

Begin

1 2

1 2

1

( ) ( )

( ) ( ) ( )

( )

N i i i

Ti i N i

i T

t t t

K K K

p

minus

= = minus +

=

=

Solving for the cost using least squares

1|| ||i ip p minusminus

0

1 10 1 ( ) 0P i K A BK= = minus

Initialization

1 1i T i

eu R B P Xminus minus= minus

Policy update

End

No

Yes

1i irarr +

20-May-21 | 42Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 62 The tracking error of the trajectory of the robot system and

the prescribed impedance model

Fig 63 Reference trajectory versus the state of the robot systems

during and after learning

κe = 00025 S = 1 R = 10 T = 0001

A = minus4006 B = minus0002

υ = 09 γ = 09 δΔ = 065 ε = 01

ε-decay = 099 K = 10 β = 01

LQR solution

RL aproximation

Position controlKs = 20 F = 10 G = 1 μ1 = μ2 = 5 k = 01

= 100 κp = 20 κd = 10 and ke = 018 s

θd = 10sin(02πt) Md = 1 Kg Bd = 718

Nsmm-1 Kd =1576 Nmm-1 Kh = 0046

Parameters

Z

Simulation

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 43Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 64 The output trajectory and impedance model in the inner loop

Fig 65 The desired trajectory and output trajectory the outer loop

Fig 66 Interaction force

Perfectly performance of the human-

robot cooperation

A little deviation between the desired

and actual trajectory

The human interaction force is reduced

after learning and the optimal set of the

prescribed impedance model is found by

the outer-loop controller

Experiment

Validation results

Discussion

1 2 3 4 5 6 7 8

20-May-21 | 44Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Conclusion and Future works

Introduction

7

4

5

6

C

H

A

P

T

E

R

1

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Force Sensorless Reflecting Control for BHTS7

1 2 3 4 5 6 7 8

20-May-21 | 45Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

3 The effectiveness of the proposed control method is verified in

the different working conditions

2The stability and finite-time convergence characteristic of the closed-

loop control are theoretically analyzed by the Lyapunov approach

The first time the hybrid control based

on a fast finite-time NTSMC and AFOB1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 46Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

( )1 2id = minus +

1 1

2

= minus + = minus +

id

T

i i id i

T

i ia = minus

The external torque estimation

[ ]

1 2ˆ( ) = + + minus minus minus +iv

i i ir i i i i i i i i i i idM C K sign s s s

( )i i i is = +

[ ]

2

1 2

if 0 or 0 and ( )

( ) if 0 and

i

i i i i i i

i i

i i i i i i i i

s s

z z sign s

= =

+

An adaptive FSOB gain

where

where

The torque control design by style-FNTSMC

The modified FNTSM surface design

Master Slave

Environment

APAM Slave

system

Slave

Controller

Estimator

Oper

ator

Master

Controller

APAM Master

system

Estimator

Impendence

filter

+

++

-

+

-

+

+md

m s

s

sd

ˆsdˆmd

mf

m

d

Control design

1 2 3 4 5 6 7 8

20-May-21 | 47Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig 71 Torque estimation performances

λi = 8 pi = 5 qi = 7 ςi = 10-4 νi = 911

εi = 01 ρ1i = 10 ρ2i = 2

Proposed control

PID control KP = 38 KI = 25 KD = 015

Observersfor NDO Lh = Le = 10 for RTOB β = 350 rads

for AFOB κi(0) = 01 λi = 10 ηi = 0 m = 01

a = 10

RMSEs associated with NDO RTOB and proposed algorithm

are calculated of 0211 0192 and 0076 respectively

Parameters

Scenarios- The free space situation θd (t) = 5π(1 - 018t)sin(072πt)

- The contact and recovery situation soft environment Ke = 500 Nmm-1 Be = 4 Nsmm-1

stiff environment Ke = 2800 Nmm-1 Be = 20 Nsmm-1

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 48Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig 72 Position tracking performances

Fig 72 Position tracking error performances

Fig 72 Transparency performance in ldquosoftrdquo

Fig 72 Transparency error in soft contract

Fig 72 Transparency performance in ldquostiffrdquo

Fig 72 Transparency error in stiff contract

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 49Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Finite-time convergence of the

tracking errors

Eliminating the uncertainties the

noise effect in the obtained force

sensing via new AFOB

Robust performance against uncertainties

and disturbances in the system dynamics

Stability of the overall system and the

transparency performance can be

attained simultaneously

Fast transient response

Summary

The great transparency performance and global stability of the BHTS under the proposed

control scheme are achieved despite the uncertainties and in different working conditions

1 2 3 4 5 6 7 8

20-May-21 | 50Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Introduction

4

5

6

C

H

A

P

T

E

R

1

7

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Conclusion and Future works78

1 2 3 4 5 6 7 8

20-May-21 | 51Presenter Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions (uncertainties and disturbances are handled).
- Validated the effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance.
- Optimized the prescribed impedance model parameters via the IRL approach, requiring little operator information while yielding highly effective position control.
- Achieved good transparency with both force feedback and position tracking simultaneously by the new adaptive force estimation combined with separate FNTSMC schemes.


Future works
- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both the master and slave sides; the computational complexity should also be reduced with respect to the data size.
- Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device for the rehabilitation field with a safer solution.


A PRESENTATION OF DOCTORAL DISSERTATION
Presenter: Cong Phat Vo
Department of Mechanical Engineering, University of Ulsan, Korea
Thank you for attending.

1 2 3 4 5 6 7 8

Page 18: A PRESENTATION OF DOCTORAL DISSERTATION

20-May-21 | 18Presenter Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

2 1 1 2 1s e k e k e= + +

( ) ( ) (( ) ( ) )

0

1

0 1 2 2 1 2

0 1 2 0 1 1

t

t

n d

s t s t k e k e e

f x x g x u x d

minus= minus minus +

+ + minus

( ) ( )( ) ( ) 31

0 1 3 3 4ˆ

su t g x sign

minus = minus + +

( )( ) ( )

( )3 3 0

3

ˆ ˆ1ˆ or

( )mt t

tsign t

= +=

+ =

A terminal sliding surface is chosen

A proposed integral sliding surface is designed

An adaptive law for the robust gain

A robust control signal can be redesigned

Proof See more detail in Appendix 32 33

1 2d dx x

Time-delay

Estimator

The PAM system

Control law

Robust

control signal

Adaptation

law

Nominal

control signal

Transfer

Coordinate Integral Terminal

Sliding Surface

Terminal

Sliding Surface

Reference

Trajectory

d

The proposed

control approach

u

1 2x x

su

1 2x x

s

1

2

nu

Control design

1 2 3 4 5 6 7 8

20-May-21 | 19Presenter Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

( ) ( ) ( )( )( )( ) ( )

2 0 1 2

0 1

ˆ

d d d

d d

d x t t f x t t x t t

g x t t u t t

= minus minus minus minus

minus minus minus

( ) ( )1

n su t u u g x dminus

= + minus

The control law can be rewritten

where

Proof See more detail in Appendix 31 34-35

Control design

1 1 1 1

2 2 1

de x x

x

= = minus = minus

( )( ) ( )1 2

1

0 1 1 1 0 1 2

1

1 1 1 1 1 2 2 2 2

n du g x x f x x

minus

minus

= minus minus

minus minus minus minus

The auxiliary variables are defined

The nominal control signal can be designed as

1 2d dx x

Time-delay

Estimator

The PAM system

Control law

Robust

control signal

Adaptation

law

Nominal

control signal

Transfer

Coordinate Integral Terminal

Sliding Surface

Terminal

Sliding Surface

Reference

Trajectory

d

The proposed

control approach

u

1 2x x

su

1 2x x

s

1

2

nu

1 2 3 4 5 6 7 8

20-May-21 | 20Presenter Cong Phat Vo

Parameters

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Experimental validation

for ITSMC η1 = 3 η2 = 15 η3 = 25 κ1 = κ3 = 10

κ2 = κ4 = 2 γ1 = γ2 = 57 k1 = 2 k2 = 55 α = 911

for adaptative gain μ = 10 δ = 005 σ = 01

for case study 3 KP = 15 KI = 18 KD = 005

for other case KP = 35 KI = 28 KD = 01

Proposed control

PID control

Fig 31 Performance response in case study 1

Fig 32 Performance response in case study 2

Case study 1 (no load) xd (mm) = 8sin(04πt)

Case study 2 (no load) xd (mm) = 8sin(πt)

Case study 3 (no load) A multistep trajectory

Case study 4 (with load) xd (mm) = 8sin(πt)

in turn 2-kg-load and 7-kg-load

1 2 3 4 5 6 7 8

20-May-21 | 21Presenter Cong Phat Vo

Fig 33 Performance response in case study 3

Fig 34 Performance response in case study 4

Controller PID ITSMC Proposed

Sine 02 Hz ndash 8 mmL2[e] 04102 01737 01574

L2[u] 09961 10270 09602

Sine 1 Hz ndash 8 mmL2[e] 11845 04996 03788

L2[u] 11284 11152 10900

MultistepL2[e] 08587 06846 06368

L2[u] 07670 07333 07247

Sine 1 Hz ndash 8 mm

with 2-kg-load

L2[e] 12578 08314 04779

L2[u] 11414 11506 11287

Sine 1 Hz ndash 8 mm

with 7-kg-load

L2[e] 15937 10827 05067

L2[u] 11613 11729 11584

Table 31 A quantitative comparison between the tracking performances

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Case study 1 (no load) xd (mm) = 8sin(04πt)

Case study 2 (no load) xd (mm) = 8sin(πt)

Case study 3 (no load) A multistep trajectory

Case study 4 (with load) xd (mm) = 8sin(πt)

in turn 2-kg-load and 7-kg-load

Experimental validation

1 2 3 4 5 6 7 8

20-May-21 | 22Presenter Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Summary

Robust performance against uncertainties and disturbances1 2d dx x

Time-delay

Estimator

The PAM system

Control law

Robust

control signal

Adaptation

law

Nominal

control signal

Transfer

Coordinate Integral Terminal

Sliding Surface

Terminal

Sliding Surface

Reference

Trajectory

d

The proposed

control approach

u

1 2x x

su

1 2x x

s

1

2

nu

Finite-time convergence of the tracking errors

Reducing the chattering phenomenon

Eliminating the reaching phase

Fast transient response

Next works will be directed to the control using the force properties

1 2 3 4 5 6 7 8

20-May-21 | 23Presenter Cong Phat Vo

CONTENTS

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

5

6

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Adaptive Finite-Time Force Sensorless Control scheme4

1 2 3 4 5 6 7 8

20-May-21 | 24Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

3 Verified the reliability and superiority of the proposed control

demonstrated in different challenging work conditions

2

The FITSMC guarantees the finite convergence new adaptive gain and TDE

are deployed to handle online (uncertainties and chattering phenomenon)

The proposed FSOB scheme obtains

force information eliminated noise and

static friction

1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 25Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

( )[ ]

1 20

( ) ( ( )) = + +t

v

f f fe e sign e d

( ) ( )[ 1]5

5

4

ˆ

+

minus= minus

v

signsign

1 1 [ ]

1 2

[ ]

4 5

ˆ( ) ( ) ( )

ˆ ( )

v

c d f f

v

u t d t K e sign e

sign

minus minus= + + +

+ +

1ˆ ˆ ˆ( ) ( ) ( ) ( )minus minus = minus minus minusd d c e dd t d t t u t t K t t

The FITSM surface is defined

The control law can be rewritten

2

2

ˆ ˆ ˆ2ˆ

+ += e e e e e e

e

e

An adaptive switching gain law An estimated lumped disturbance Force observer based-LeD [40]

APAM system

Force based

Time-delay

Estimator

Integral Terminal

Sliding SurfaceControl law

Adaptive law

Environment

Configuration

Reference

Trajectoryu

d

ˆe

d

+

+

The proposed control

approach

2k

Plant + Environment

Force Observer

e

Proof See in Appendix 41 ndash 43

Control design

1 2 3 4 5 6 7 8

20-May-21 | 26Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

for ITSMC η1 = 3 η2 = 15 η3 = 25 κ1 = κ3 = 10

κ2 = κ4 = 2 γ1 = γ2 = 57 k1 = 2 k2 = 55 v = 911

for adaptative gain μ = 10 δ = 005 σ = 01

for case study 3 KP = 15 KI = 18 KD = 005

for other case KP = 35 KI = 28 KD = 01

Proposed control

PID control

Fig 42 Performance response in Scenario 2

Scenario 1 τd (N) = 75sin(02πt- π2)+625

Scenario 2 τd (N) = 75sin(08πt- π2)+625

Scenario 3 A multistep trajectory with a LPFrsquos bandwidth

of 02(rads)

Observersfor FSOB ω = 500 rads ξ = 0001 α1 = 20 α2 = 12

α3 = 5 α4 = 15

Fig 42 Performance response in Scenario 1

Parameters

Experimental validation

1 2 3 4 5 6 7 8

20-May-21 | 27Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Fig 43 Performance response in Scenario 3

Fig 44 Control gain

Controller PID ISMC FITSMC-TDE Proposed

Sine 01 Hz 21050 12152 09440 05158

Sine 04 Hz 32054 18045 20632 19807

Multistep 26508 23548 20618 17685

Table 41 A RMSE comparison between the tracking error performances

Experimental validation

- Improved the control performances (fast response high

accuracy and robustness)

- Guaranteed the finite convergence

- Cancelling the lumped uncertainties via TDE solution and

the switching gain in the FITSMC online

Discussion

1 2 3 4 5 6 7 8

20-May-21 | 28Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

4

6

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Model-free control for Optimal contact force in Unknown Environment5

1 2 3 4 5 6 7 8

20-May-21 | 29Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

3 Verified effective for robot interaction control in

unknown environments

2 The stability of the closed-loop control is theoretically demonstrated

by the Lyapunov approach

The RL method is used to learn on-line

the optimal desired force1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 30Presenter Cong Phat Vo

( ) ( ) ( ( ))= minus + minuse e e eF t C x t K x x t

( ( )) ( )r dZ e t F t= [ ]

1 2

( ) ( ) ( )

( ) ( )

= + +

minus minus minus +

r r r

v

e

Me t Ce t Ke t

sign F t

The force control

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The environment is modeled as

a spring-damper system [81]

The optimal impedance model The position control in task space

( ) ( ) ( )

= + f Pf f If f

t

x s K e t K e d

Task-specific Outer-loopRobot-specific Inner-loop

Precise position

control

Prescribed

Impedance Model

Environment

Configuration

APAM system

xx

d

Force controlx

rx

fe

r

Fe

Fd

ef

-

+ -+ -

+

RL-based

approximation

ϕ

Control design

1 2 3 4 5 6 7 8

( ) ( ) ( )r r de t Ae t BF t= +

20-May-21 | 31Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

LQR solution

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

( ) ( )=d e rF t e t

The discounted cost function

( ) ( )( ) ( ) ( ) ( ) ( ) ( )

= minus +T T

s r d r r d dt

J e t F t e Se F RF d

The Algebraic Riccati equation

0+ + + =T T

eA PA S P A PB

The feedback control gain

( )1

1 minus

minus= minus +T T

e B PB R B PA

IRL approximationRL solution

Implementation procedure

1 2 3 4 5 6 7 8

offline soliton

20-May-21 | 32Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

LQR solution RL solution

impossible for all time

The Hamilton-Jacobi-Bellman equation

( ) ( ) ( )min ( ) ( ) ( ) 0

minus minus + minus =

d

t

s rtF

V e t r t r e dt

The new Algebraic Riccati equation

( )( ) ( )

2 ( ) ( ) ( ) ( ) 0

T T

r r

T T

r d d d

e t A P PA S P e t

e t PBF t F t RF t

+ + minus

+ + =

The optimal desired force 1

22 21( ) ( )d rF t H H e tminus= minus

( ) ( ) ( ) ( ) ( )= = T

r d s rQ e t F t V e t X HX

IRL approximation

The discounted cost function

( ) ( )

( )

( )( ) ( ) ( )

where ( ) ( ) ( ) ( ) ( ) ( )

t

s r r dt

T T

r d r r d d

V e t r e F e d

r e F e Se F RF

minus minus=

= +

Implementation procedure

1 2 3 4 5 6 7 8

( ) ( )=d e rF t e t

online soliton using the off-policy RL to

find the solution of the given new ARE

20-May-21 | 33Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

LQR solution RL solution

minimize temporal difference error

The approximator of the Q-value function

( )( ) ( )1

ˆ N

r d i r d i

i

Q e F e F =

=

The parameter update law

11( ) exp

2

T

i r d r i d i i r i d ie F e c F c e c F c minus = minus minus minus minus minus

( )

( ) ( ) ( ) ( )

ˆ ( ) ( ) ( )1( ) ( )

( )

=

= minus +

r d

t t t t

Q e t F t tt t

t t

IRL approximation

Q-function update law based on IRL

( )

( )

( )( ) ( ) ( )

( ) ( )

t tt

r dt

t

r d

Q e t F t r e d

e Q e t t F t t

+

minus minus

minus

=

+ + +

( ) ( ) ( ) ( ) ( )T T

r r d dr t e t Se t F t RF t= +

The reward function of the utility function

Implementation procedure

1 2 3 4 5 6 7 8

1

( )( )

( )

i r di r d N

i r di

e Fe F

e F

=

=

( ) ( )=d e rF t e t

20-May-21 | 34Presenter Cong Phat Vo

Fig 51 RL in unknown environments during the

learning process

Fig 52 Learning curves

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

κe = 00025 S = 1 R = 10 T = 0001 A = minus4006

B = minus0002

υ = 09 γ = 09 δΔ = 065 ε = 01 ε-decay = 099

K = 10 β = 01

LQR solution

RL aproximation

PositionForce controlKPf = 0001 KIf = 002 λ = 02 ηΔ = 01 ρ1 = 05

ρ2 = 20 v = 911 α = 57

EnvironmentCe = 05 Nsmm-1 Ke = 2 N mm-1 r0 = 165 mm

xd = 39917 mm xe = 35 mm

Parameters

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 35Presenter Cong Phat Vo

Fig 53 Desired forceposition tracking response

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Three steps are required to complete

the experiment

1-the identification of the environment

2-obtain the ARE solution using the

environmental estimation

3-the complete experiment for the

positionforce controller

IRL approximation

Achieving near-optimal performance without

knowledge of the environment dynamics

LQR solution

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 36Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

4

5

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Optimized Human-Robot Interaction force control using IRL6

1 2 3 4 5 6 7 8

20-May-21 | 37Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

3 The unknown human dynamics as well as the optimal overall

human-robot system efficiency is verified in experimental trials

2

Find the optimal parameters of the impedance model to

assure motion tracking and also assist the human with

minimum effort to perform the task in real time

The stability of the closed-loop control is theoretically

demonstrated

1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 38Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

( )1 20

ˆ ( )

ˆ (|| || )

= + + +

+ + minus

t

s h h h

q B h h

K e e e d

K Z Z K

The control torque in task space

1 2

1 2 1

2 1 2

ˆ ˆˆ ( ) ( )

ˆ ˆˆ ˆ ˆ || ||

ˆˆ ˆ ˆ( ) || ||

=

= minus minus = minus

T T

T T T

h h h

T

h h

q W W q

W F F W q kF W

W Gq W kG W

2

h hm

d d d

K

M s C s K

=

+ +

An approximation of the continuous functionThe prescribed

impedance model

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

Control design

1 2 3 4 5 6 7 8

20-May-21 | 39Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

1

1

[ ]

minus

minus

= +

= minus

=

=

h h h h d

h e

h e p d

T T T

d d d

A B e

A k

B k

e e e

( )1

+=

+

p d

e

sG s

k s

The human transfer function [99] Purpose τh rarr ed

0

0

= + = +

= minus

q d qd e

e

h h hh e

A e Be X AX Buu

B A u KX

A new augmented state

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

IRL based on LQR solution

Control design

1 2 3 4 5 6 7 8

20-May-21 | 40Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

The performance index

( )T T T

m d d d h h h e et

J e Q e Q u Ru d

= + +

10 T TA P PA Q PB R BPminus= + + minus

The algebraic Riccati equation [83]

0

0

q d qd

e

h h hh

e

A e BeX u

B A

u KX

= = +

= minus

A new augmented state

IRL solutionRL solution

An optimal feeback control

To minimize

solve

The infinite-horizon quadratic cost

Control design

The IRL Bellman equation for time interval ΔT gt 0

Proof of convergence See in Appendix 62

1 2 3 4 5 6 7 8

( ) 1

( )

arg min ( ) ( )e

T

e eu t

u V t X t u R B PX

minus

= = minus

( )7 7( ( )) ( ) ( ) ( ( ))t T

T T T

tV X t X Q K RK X d V X t T

+

= + minus +

( )7 ( ( )) ( ) ( )T T T T

tV X t X Q K RK X d X PX

= + =

( ) ( ) ( )T TA BK P P A BK Q K RKminus + minus = minus +

where P is the real symmetric positive definite solution

( )( ) ( ) = ( ) ( )

( ) ( )

t tT T T T

t

T

X t PX t X Q K RK X d

X t t PX t t

+

+

+ + +

The IRL Bellman equation can be written as

20-May-21 | 41Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 61 Continuous-time linear policy iteration algorithm

1 Start with an admissible control input

0 0

1= minuseu K X

2 Policy evaluation given a control policy ui find the

Pi using off-policy Bellman equation

1 1

1

+ minus= minusi T i

eu R B P X

3 Policy Improvement update the control policy

Online implementation

Control design

1 2 3 4 5 6 7 8

( )( )( ) ( )= ( ) ( )

( ) ( )

t t TT i T i i T

t

T i

X t P X t X Q K RK X d

X t t P X t t

+

+

+ + +

Begin

1 2

1 2

1

( ) ( )

( ) ( ) ( )

( )

N i i i

Ti i N i

i T

t t t

K K K

p

minus

= = minus +

=

=

Solving for the cost using least squares

1|| ||i ip p minusminus

0

1 10 1 ( ) 0P i K A BK= = minus

Initialization

1 1i T i

eu R B P Xminus minus= minus

Policy update

End

No

Yes

1i irarr +

20-May-21 | 42Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 62 The tracking error of the trajectory of the robot system and

the prescribed impedance model

Fig 63 Reference trajectory versus the state of the robot systems

during and after learning

κe = 00025 S = 1 R = 10 T = 0001

A = minus4006 B = minus0002

υ = 09 γ = 09 δΔ = 065 ε = 01

ε-decay = 099 K = 10 β = 01

LQR solution

RL aproximation

Position controlKs = 20 F = 10 G = 1 μ1 = μ2 = 5 k = 01

= 100 κp = 20 κd = 10 and ke = 018 s

θd = 10sin(02πt) Md = 1 Kg Bd = 718

Nsmm-1 Kd =1576 Nmm-1 Kh = 0046

Parameters

Z

Simulation

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 43Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 64 The output trajectory and impedance model in the inner loop

Fig 65 The desired trajectory and output trajectory the outer loop

Fig 66 Interaction force

Perfectly performance of the human-

robot cooperation

A little deviation between the desired

and actual trajectory

The human interaction force is reduced

after learning and the optimal set of the

prescribed impedance model is found by

the outer-loop controller

Experiment

Validation results

Discussion

1 2 3 4 5 6 7 8

20-May-21 | 44Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Conclusion and Future works

Introduction

7

4

5

6

C

H

A

P

T

E

R

1

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Force Sensorless Reflecting Control for BHTS7

1 2 3 4 5 6 7 8

20-May-21 | 45Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

3 The effectiveness of the proposed control method is verified in

the different working conditions

2The stability and finite-time convergence characteristic of the closed-

loop control are theoretically analyzed by the Lyapunov approach

The first time the hybrid control based

on a fast finite-time NTSMC and AFOB1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 46Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

( )1 2id = minus +

1 1

2

= minus + = minus +

id

T

i i id i

T

i ia = minus

The external torque estimation

[ ]

1 2ˆ( ) = + + minus minus minus +iv

i i ir i i i i i i i i i i idM C K sign s s s

( )i i i is = +

[ ]

2

1 2

if 0 or 0 and ( )

( ) if 0 and

i

i i i i i i

i i

i i i i i i i i

s s

z z sign s

= =

+

An adaptive FSOB gain

where

where

The torque control design by style-FNTSMC

The modified FNTSM surface design

Master Slave

Environment

APAM Slave

system

Slave

Controller

Estimator

Oper

ator

Master

Controller

APAM Master

system

Estimator

Impendence

filter

+

++

-

+

-

+

+md

m s

s

sd

ˆsdˆmd

mf

m

d

Control design

1 2 3 4 5 6 7 8

20-May-21 | 47Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig 71 Torque estimation performances

λi = 8 pi = 5 qi = 7 ςi = 10-4 νi = 911

εi = 01 ρ1i = 10 ρ2i = 2

Proposed control

PID control KP = 38 KI = 25 KD = 015

Observersfor NDO Lh = Le = 10 for RTOB β = 350 rads

for AFOB κi(0) = 01 λi = 10 ηi = 0 m = 01

a = 10

RMSEs associated with NDO RTOB and proposed algorithm

are calculated of 0211 0192 and 0076 respectively

Parameters

Scenarios- The free space situation θd (t) = 5π(1 - 018t)sin(072πt)

- The contact and recovery situation soft environment Ke = 500 Nmm-1 Be = 4 Nsmm-1

stiff environment Ke = 2800 Nmm-1 Be = 20 Nsmm-1

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 48Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig 72 Position tracking performances

Fig 72 Position tracking error performances

Fig 72 Transparency performance in ldquosoftrdquo

Fig 72 Transparency error in soft contract

Fig 72 Transparency performance in ldquostiffrdquo

Fig 72 Transparency error in stiff contract

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 49Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Finite-time convergence of the

tracking errors

Eliminating the uncertainties the

noise effect in the obtained force

sensing via new AFOB

Robust performance against uncertainties

and disturbances in the system dynamics

Stability of the overall system and the

transparency performance can be

attained simultaneously

Fast transient response

Summary

The great transparency performance and global stability of the BHTS under the proposed

control scheme are achieved despite the uncertainties and in different working conditions

1 2 3 4 5 6 7 8

20-May-21 | 50Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Introduction

4

5

6

C

H

A

P

T

E

R

1

7

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Conclusion and Future works78

1 2 3 4 5 6 7 8

20-May-21 | 51Presenter Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Verified an effective position tracking controller via the AITSMC with TDE technique under

various working conditions (Its uncertainties and disturbance are handled)

Validated the significant effectiveness of finite-time force tracking problems based on the

FITSMC combined with a new adaptive gain and a friction-free disturbance (FSOB)

Based on the online learning ability of the RL approximation method the model-free controller

for optimal contact force ensures great position and force tracking performance

Optimized a prescribed impedance model parameters by the IRL approach to provide little

operator information in yielding highly effective position control purpose

Achieved good transparency performance with both force feedback and position tracking

simultaneously by the new adaptive force estimation combined the separately FNTSMC schemes

Conclutions

1 2 3 4 5 6 7 8

20-May-21 | 52Presenter Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Future works

To improve the control performance in joint-space the robot system model can be

improved by robust identification approaches

Effectiveness of the proposed algorithm can be further increased by developing an automatic

tune method for both the estimation and control gains

Apply optimized strategies on both master and slave sides Note also that the computational

complexity should be reduced by the data size

Based on the developed prototype in this thesis it is possible to develop a high-quality

commercial teleoperation device in the rehabilitation field with a safer solution

1 2 3 4 5 6 7 8

20-May-21 | 53

A PRESENTATION OF DOCTORAL DISSERTATION

Presenter Cong Phat Vo

Department of Mechanical Engineering University of Ulsan Korea

Thank you for attending

1 2 3 4 5 6 7 8

Page 19: A PRESENTATION OF DOCTORAL DISSERTATION

20-May-21 | 19Presenter Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

( ) ( ) ( )( )( )( ) ( )

2 0 1 2

0 1

ˆ

d d d

d d

d x t t f x t t x t t

g x t t u t t

= minus minus minus minus

minus minus minus

( ) ( )1

n su t u u g x dminus

= + minus

The control law can be rewritten

where

Proof See more detail in Appendix 31 34-35

Control design

1 1 1 1

2 2 1

de x x

x

= = minus = minus

( )( ) ( )1 2

1

0 1 1 1 0 1 2

1

1 1 1 1 1 2 2 2 2

n du g x x f x x

minus

minus

= minus minus

minus minus minus minus

The auxiliary variables are defined

The nominal control signal can be designed as

1 2d dx x

Time-delay

Estimator

The PAM system

Control law

Robust

control signal

Adaptation

law

Nominal

control signal

Transfer

Coordinate Integral Terminal

Sliding Surface

Terminal

Sliding Surface

Reference

Trajectory

d

The proposed

control approach

u

1 2x x

su

1 2x x

s

1

2

nu

1 2 3 4 5 6 7 8

20-May-21 | 20Presenter Cong Phat Vo

Parameters

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Experimental validation

for ITSMC η1 = 3 η2 = 15 η3 = 25 κ1 = κ3 = 10

κ2 = κ4 = 2 γ1 = γ2 = 57 k1 = 2 k2 = 55 α = 911

for adaptative gain μ = 10 δ = 005 σ = 01

for case study 3 KP = 15 KI = 18 KD = 005

for other case KP = 35 KI = 28 KD = 01

Proposed control

PID control

Fig 31 Performance response in case study 1

Fig 32 Performance response in case study 2

Case study 1 (no load) xd (mm) = 8sin(04πt)

Case study 2 (no load) xd (mm) = 8sin(πt)

Case study 3 (no load) A multistep trajectory

Case study 4 (with load) xd (mm) = 8sin(πt)

in turn 2-kg-load and 7-kg-load

1 2 3 4 5 6 7 8

20-May-21 | 21Presenter Cong Phat Vo

Fig 33 Performance response in case study 3

Fig 34 Performance response in case study 4

Controller PID ITSMC Proposed

Sine 02 Hz ndash 8 mmL2[e] 04102 01737 01574

L2[u] 09961 10270 09602

Sine 1 Hz ndash 8 mmL2[e] 11845 04996 03788

L2[u] 11284 11152 10900

MultistepL2[e] 08587 06846 06368

L2[u] 07670 07333 07247

Sine 1 Hz ndash 8 mm

with 2-kg-load

L2[e] 12578 08314 04779

L2[u] 11414 11506 11287

Sine 1 Hz ndash 8 mm

with 7-kg-load

L2[e] 15937 10827 05067

L2[u] 11613 11729 11584

Table 31 A quantitative comparison between the tracking performances

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Case study 1 (no load) xd (mm) = 8sin(04πt)

Case study 2 (no load) xd (mm) = 8sin(πt)

Case study 3 (no load) A multistep trajectory

Case study 4 (with load) xd (mm) = 8sin(πt)

in turn 2-kg-load and 7-kg-load

Experimental validation

1 2 3 4 5 6 7 8

20-May-21 | 22Presenter Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Summary

Robust performance against uncertainties and disturbances1 2d dx x

Time-delay

Estimator

The PAM system

Control law

Robust

control signal

Adaptation

law

Nominal

control signal

Transfer

Coordinate Integral Terminal

Sliding Surface

Terminal

Sliding Surface

Reference

Trajectory

d

The proposed

control approach

u

1 2x x

su

1 2x x

s

1

2

nu

Finite-time convergence of the tracking errors

Reducing the chattering phenomenon

Eliminating the reaching phase

Fast transient response

Next works will be directed to the control using the force properties

1 2 3 4 5 6 7 8

20-May-21 | 23Presenter Cong Phat Vo

CONTENTS

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

5

6

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Adaptive Finite-Time Force Sensorless Control scheme4

1 2 3 4 5 6 7 8

20-May-21 | 24Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

3 Verified the reliability and superiority of the proposed control

demonstrated in different challenging work conditions

2

The FITSMC guarantees the finite convergence new adaptive gain and TDE

are deployed to handle online (uncertainties and chattering phenomenon)

The proposed FSOB scheme obtains

force information eliminated noise and

static friction

1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 25Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

( )[ ]

1 20

( ) ( ( )) = + +t

v

f f fe e sign e d

( ) ( )[ 1]5

5

4

ˆ

+

minus= minus

v

signsign

1 1 [ ]

1 2

[ ]

4 5

ˆ( ) ( ) ( )

ˆ ( )

v

c d f f

v

u t d t K e sign e

sign

minus minus= + + +

+ +

1ˆ ˆ ˆ( ) ( ) ( ) ( )minus minus = minus minus minusd d c e dd t d t t u t t K t t

The FITSM surface is defined

The control law can be rewritten

2

2

ˆ ˆ ˆ2ˆ

+ += e e e e e e

e

e

An adaptive switching gain law An estimated lumped disturbance Force observer based-LeD [40]

APAM system

Force based

Time-delay

Estimator

Integral Terminal

Sliding SurfaceControl law

Adaptive law

Environment

Configuration

Reference

Trajectoryu

d

ˆe

d

+

+

The proposed control

approach

2k

Plant + Environment

Force Observer

e

Proof See in Appendix 41 ndash 43

Control design

1 2 3 4 5 6 7 8

20-May-21 | 26Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Experimental validation
Parameters
- ITSMC: η1 = 3, η2 = 15, η3 = 25, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1 = γ2 = 5/7, k1 = 2, k2 = 5.5, v = 9/11
- Adaptive gain: μ = 10, δ = 0.05, σ = 0.1
- FSOB observer: ω = 500 rad/s, ξ = 0.001, α1 = 20, α2 = 12, α3 = 5, α4 = 15
- PID control (case study 3): KP = 15, KI = 18, KD = 0.05; other cases: KP = 35, KI = 28, KD = 0.1
Scenarios
- Scenario 1: τd (N) = 7.5 sin(0.2πt − π/2) + 62.5
- Scenario 2: τd (N) = 7.5 sin(0.8πt − π/2) + 62.5
- Scenario 3: a multistep trajectory filtered by an LPF with a bandwidth of 0.2 rad/s
Fig. 4.1 Performance response in Scenario 1; Fig. 4.2 Performance response in Scenario 2.

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Experimental validation
Fig. 4.3 Performance response in Scenario 3; Fig. 4.4 Control gain.

Table 4.1 RMSE comparison of the tracking-error performances
| Controller  | PID    | ISMC   | FITSMC-TDE | Proposed |
| Sine 0.1 Hz | 2.1050 | 1.2152 | 0.9440     | 0.5158   |
| Sine 0.4 Hz | 3.2054 | 1.8045 | 2.0632     | 1.9807   |
| Multistep   | 2.6508 | 2.3548 | 2.0618     | 1.7685   |

Discussion
- Improved control performance: fast response, high accuracy, and robustness.
- Guaranteed finite-time convergence.
- The lumped uncertainties are cancelled online via the TDE solution and the switching gain in the FITSMC.

CONTENTS — next: Chapter 5, Model-free control for Optimal contact force in Unknown Environment.

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Proposed control idea
1. The RL method is used to learn the optimal desired force online.
2. The stability of the closed-loop control is theoretically demonstrated by the Lyapunov approach.
3. The approach is verified to be effective for robot interaction control in unknown environments.

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Control design
The environment is modeled as a spring–damper system [81]:
$F_e(t) = -C_e \dot x(t) + K_e \left( x_e - x(t) \right)$
The optimal impedance model $Z\left(e_r(t)\right) = F_d(t)$ shapes the impedance error $e_r$ through the force-control dynamics $M \ddot e_r(t) + C \dot e_r(t) + K e_r(t)$ together with a robust switching term and the environment force $F_e(t)$; the resulting error dynamics are written in state-space form as
$\dot e_r(t) = A e_r(t) + B F_d(t)$
The position control in task space is a PI law on the force error $e_f$:
$x_f(t) = K_{Pf}\, e_f(t) + K_{If} \int_0^t e_f(\tau)\, d\tau$
Block diagram (slide figure): a robot-specific inner loop (precise position control of the APAM system) and a task-specific outer loop (prescribed impedance model with an RL-based approximation of the desired force), acting on the environment.
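The outer force loop above can be sketched against the slide's spring–damper environment. This is a hedged toy: the inner position loop is idealized as tracking its command exactly, and the desired force and initial position are assumptions (the PI gains and environment constants are the slide's values):

```python
import numpy as np

# Environment: F_e = -Ce*x' + Ke*(xe - x); contact force pressed on it is -F_e.
Ce, Ke, xe = 0.5, 2.0, 35.0      # damping [Ns/mm], stiffness [N/mm], rest pos [mm]
KPf, KIf = 0.001, 0.02           # PI force gains (parameter slide)
Fd = 10.0                        # illustrative desired contact force [N]
dt, T = 1e-3, 20.0

x, x_prev, integ, Fc = 36.0, 36.0, 0.0, 0.0   # start slightly into contact
for _ in range(int(T / dt)):
    x_dot = (x - x_prev) / dt
    Fc = Ke * (x - xe) + Ce * x_dot           # contact force on the environment
    ef = Fd - Fc                              # force error
    integ += ef * dt
    x_prev = x
    # x_f = KPf*ef + KIf*∫ef is the position correction; the ideal inner
    # position loop moves the robot by exactly that amount.
    x = x + KPf * ef + KIf * integ
print(f"contact force {Fc:.3f} N vs desired {Fd} N, x = {x:.3f} mm")
```

With Ke = 2 N/mm the loop settles where the penetration x − xe equals Fd/Ke = 5 mm, i.e. x ≈ 40 mm.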

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Implementation procedure — LQR solution
The main objective of this work is to determine the optimal desired force $F_d$, i.e. the minimum force that minimizes the position error $e_r$:
$F_d(t) = \kappa_e e_r(t)$
The discounted cost function:
$J_s\left(e_r(t), F_d(t)\right) = \int_t^{\infty} e^{-\gamma(\tau - t)} \left( e_r^{T} S e_r + F_d^{T} R F_d \right) d\tau$
The algebraic Riccati equation:
$A^{T} P A + S - P + A^{T} P B \kappa_e = 0$
The feedback control gain:
$\kappa_e = -\left( B^{T} P B + R \right)^{-1} B^{T} P A$
This is an offline solution (it requires knowledge of the model).
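The offline LQR step above can be sketched by iterating the Riccati map to a fixed point and then forming the slide's gain expression. The system matrices are illustrative scalars (not the identified environment model), and the discount factor is omitted for simplicity:

```python
import numpy as np

# Discrete-form Riccati iteration matching the slide's gain
# kappa_e = -(B'PB + R)^{-1} B'PA.  A, B, S, R are illustrative.
A = np.array([[0.95]]); B = np.array([[0.1]])
S = np.array([[1.0]]);  R = np.array([[10.0]])

P = np.zeros_like(S)
for _ in range(500):                          # value iteration on the Riccati map
    K = -np.linalg.solve(B.T @ P @ B + R, B.T @ P @ A)
    # P = A'PA + S - A'PB (B'PB + R)^{-1} B'PA, written with the gain K:
    P = A.T @ P @ A + S + A.T @ P @ B @ K
kappa_e = -np.linalg.solve(B.T @ P @ B + R, B.T @ P @ A)
print("P =", P, " kappa_e =", kappa_e)
```

At convergence P satisfies the slide's ARE and the closed loop A + B·kappa_e is stable.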

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Implementation procedure — RL solution
Solving the model-based ARE is impossible for all time, so the discounted value function is used instead:
$V_s\left(e_r(t)\right) = \int_t^{\infty} e^{-\gamma(\tau - t)}\, r\left(e_r(\tau), F_d(\tau)\right) d\tau$, where $r\left(e_r, F_d\right) = e_r^{T} S e_r + F_d^{T} R F_d$
The Hamilton–Jacobi–Bellman equation:
$\min_{F_d} \left( \dot V_s\left(e_r(t)\right) - \gamma V_s\left(e_r(t)\right) + r(t) \right) = 0$
The new algebraic Riccati equation:
$e_r^{T}(t) \left( A^{T} P + P A + S - \gamma P \right) e_r(t) + 2 e_r^{T}(t) P B F_d(t) + F_d^{T}(t) R F_d(t) = 0$
Defining the Q-function $Q\left(e_r(t), F_d(t)\right) = V_s\left(e_r(t)\right) = X^{T} H X$ with $X = [e_r^{T}\; F_d^{T}]^{T}$, the optimal desired force is
$F_d(t) = -H_{22}^{-1} H_{21}\, e_r(t)$
This is an online solution, using off-policy RL to find the solution of the given new ARE.
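The extraction of the optimal force from the Q-function kernel can be checked in a few lines. The H matrix below is an illustrative 2×2 kernel (n = m = 1), not one learned from data:

```python
import numpy as np

# Q(e_r, F_d) = X^T H X with X = [e_r; F_d].  Setting dQ/dF_d = 0 gives
# F_d* = -H22^{-1} H21 e_r, the slide's optimal desired force.
H = np.array([[3.0, 0.4],
              [0.4, 2.0]])                  # symmetric, H22 > 0
H21, H22 = H[1:, :1], H[1:, 1:]

e_r = np.array([[1.5]])
Fd_star = -np.linalg.solve(H22, H21) @ e_r  # -> [[-0.3]] for this H

def Q(Fd):
    X = np.vstack([e_r, Fd])
    return float(X.T @ H @ X)

# Perturbing F_d away from Fd_star can only increase Q.
print(Q(Fd_star), Q(Fd_star + 0.1), Q(Fd_star - 0.1))
```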

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Implementation procedure — IRL approximation
The Q-value function is approximated by N normalized Gaussian features:
$\hat Q\left(e_r, F_d\right) = \sum_{i=1}^{N} \omega_i\, \phi_i\left(e_r, F_d\right)$,
$\bar\phi_i\left(e_r, F_d\right) = \exp\left( -\tfrac{1}{2} \left\lVert \left( e_r - c_i^{e},\; F_d - c_i^{F} \right) \right\rVert_{\Sigma_i^{-1}}^{2} \right)$, $\quad \phi_i = \bar\phi_i \Big/ \sum_{j=1}^{N} \bar\phi_j$
The reward function (utility): $r(t) = e_r^{T}(t) S e_r(t) + F_d^{T}(t) R F_d(t)$
The Q-function update law based on IRL, over an interval $\Delta t$:
$Q\left(e_r(t), F_d(t)\right) = \int_t^{t+\Delta t} e^{-\gamma(\tau - t)}\, r\left(e_r(\tau), F_d(\tau)\right) d\tau + e^{-\gamma \Delta t}\, Q\left(e_r(t+\Delta t), F_d(t+\Delta t)\right)$
The parameter update law adjusts the weights $\omega_i$ by normalized gradient descent so as to minimize the temporal-difference error between the two sides of this equation.
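The normalized-RBF approximation with a TD-style gradient update can be sketched as follows. To keep the sketch checkable, the Bellman target is replaced by a known quadratic reward (an assumption); the centers, widths, and learning rate are illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)

# Normalized Gaussian features over z = (e_r, F_d), as on the slide.
centers = np.array([[er, fd] for er in (-1, 0, 1) for fd in (-1, 0, 1)], float)
sigma = 0.8

def phi(z):
    g = np.exp(-0.5 * np.sum((z - centers) ** 2, axis=1) / sigma**2)
    return g / g.sum()                        # partition-of-unity features

w = np.zeros(len(centers))
S, R, lr = 1.0, 0.1, 0.5
for _ in range(20000):
    z = rng.uniform(-1, 1, 2)                 # random (e_r, F_d) sample
    target = S * z[0] ** 2 + R * z[1] ** 2    # stand-in for the Bellman target
    f = phi(z)
    td = target - w @ f                       # temporal-difference error
    w += lr * td * f                          # gradient step on 0.5*td^2

z_test = np.array([0.5, -0.5])
print("Q_hat:", w @ phi(z_test), " target:", S * 0.25 + R * 0.25)
```

In the actual scheme the `target` is the integrated discounted reward plus the discounted next Q-value, evaluated from measured data rather than a known function.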

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Validation results
Parameters
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −400.6, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position/force control: KPf = 0.001, KIf = 0.02, λ = 0.2, ηΔ = 0.1, ρ1 = 0.5, ρ2 = 20, v = 9/11, α = 5/7
- Environment: Ce = 0.5 Ns·mm⁻¹, Ke = 2 N·mm⁻¹, r0 = 16.5 mm, xd = 39.917 mm, xe = 35 mm
Fig. 5.1 RL in unknown environments during the learning process; Fig. 5.2 Learning curves.

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Validation results
Three steps are required to complete the experiment:
1. identification of the environment;
2. obtaining the ARE solution (LQR) from the environmental estimate;
3. the complete experiment with the position/force controller (IRL approximation).
The IRL approximation achieves near-optimal performance without knowledge of the environment dynamics.
Fig. 5.3 Desired force/position tracking response.

CONTENTS — next: Chapter 6, Optimized Human-Robot Interaction force control using IRL.

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Proposed control idea
1. Find the optimal parameters of the impedance model to assure motion tracking and to assist the human with minimum effort to perform the task in real time.
2. The stability of the closed-loop control is theoretically demonstrated.
3. The unknown human dynamics, as well as the optimal overall human-robot system efficiency, are verified in experimental trials.

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design (robot-specific inner loop / task-specific outer loop)
The prescribed impedance model maps the human torque $\tau_h$ to the model trajectory $\theta_m$ through
$\dfrac{\theta_m}{\tau_h} = \dfrac{K_h}{M_d s^{2} + C_d s + K_d}$
The control torque in task space combines a PID-like term in the tracking error $e_h$ (gain $K_s$) with a robust term and a neural-network (NN) functional approximation $\hat W^{T} Z$ of the unknown continuous robot dynamics; the NN weights $\hat W_1$, $\hat W_2$ are updated online by leakage-type adaptation laws (gains $F$, $G$, $k$).
Block diagram (slide figure): Human operator → prescribed impedance model (with the optimal law based on IRL, gain $K_h$) → torque control law with NN functional approximation → haptic interface based on the APAM; signals $\tau_h$, $\theta_d$, $\theta_m$, $e_h$, $\xi$, $e_d$.

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design — IRL based on the LQR solution
The human operator is modeled by the transfer function $G(s)$ [99] relating the human torque $\tau_h$ to the tracking error $e_d$ (parameters $\kappa_p$, $\kappa_d$, $k_e$); in state-space form, $\dot e_h = A_h e_h + B_h e_d$, where $e_d = [e_d^{T}\; \dot e_d^{T}]^{T}$. The purpose is the mapping $\tau_h \rightarrow e_d$.
A new augmented state $X$ stacks the impedance-model error $e_d$ and the human state $e_h$, giving the linear dynamics
$\dot X = \bar A X + \bar B u_e$, with the feedback control $u_e = -K X$
(the block diagram is the same as on the previous slide).

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design — RL solution and IRL solution
To minimize the performance index (infinite-horizon quadratic cost)
$J_m = \int_t^{\infty} \left( e_d^{T} Q_d e_d + e_h^{T} Q_h e_h + u_e^{T} R u_e \right) d\tau$
over the augmented dynamics $\dot X = \bar A X + \bar B u_e$, $u_e = -K X$, solve the algebraic Riccati equation [83]
$0 = A^{T} P + P A + Q - P B R^{-1} B^{T} P$
The optimal feedback control is
$u_e = \arg\min_{u_e} V\left(t, X(t)\right) = -R^{-1} B^{T} P X$
The value of a policy $u = -KX$ satisfies
$V\left(X(t)\right) = \int_t^{\infty} X^{T} \left( Q + K^{T} R K \right) X\, d\tau = X^{T} P X$,
where $P$ is the real symmetric positive-definite solution of $\left(A - BK\right)^{T} P + P \left(A - BK\right) = -\left( Q + K^{T} R K \right)$.
The IRL Bellman equation for a time interval $\Delta T > 0$ can then be written as
$X^{T}(t)\, P\, X(t) = \int_t^{t+\Delta T} X^{T} \left( Q + K^{T} R K \right) X\, d\tau + X^{T}(t+\Delta T)\, P\, X(t+\Delta T)$
Proof of convergence: see Appendix 6.2.

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Online implementation (Fig. 6.1, continuous-time linear policy iteration)
1. Initialization: set $P^{0} = 0$, $i = 1$, and pick $K^{1}$ such that $A - BK^{1}$ is stable; start with the admissible control input $u_e^{0} = -K^{1} X$.
2. Policy evaluation: given the control policy $u^{i}$, find $P^{i}$ from the off-policy Bellman equation
$X^{T}(t)\, P^{i} X(t) = \int_t^{t+\Delta t} X^{T} \left( Q + (K^{i})^{T} R K^{i} \right) X\, d\tau + X^{T}(t+\Delta t)\, P^{i}\, X(t+\Delta t)$,
solving for the cost with least squares over N data windows $[t, t+\Delta t]$.
3. Policy improvement: update the control policy as $u_e^{i+1} = -R^{-1} B^{T} P^{i} X$; if $\lVert p^{i} - p^{i-1} \rVert \le \epsilon$, stop; otherwise set $i \rightarrow i+1$ and repeat from step 2.
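The iteration of Fig. 6.1 can be sketched in its model-based (Kleinman) form, where the data-driven least-squares evaluation step is replaced by solving the policy Lyapunov equation directly; A, B, Q, R are illustrative, not the dissertation's identified human-robot model:

```python
import numpy as np

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2); R = np.array([[1.0]])

def lyap(Acl, Rhs):
    """Solve Acl' P + P Acl = -Rhs via the Kronecker/vec identity."""
    n = Acl.shape[0]
    M = np.kron(np.eye(n), Acl.T) + np.kron(Acl.T, np.eye(n))
    return np.linalg.solve(M, -Rhs.reshape(-1)).reshape(n, n)

K = np.zeros((1, 2))                     # admissible: this A is already stable
for i in range(20):
    Acl = A - B @ K
    P = lyap(Acl, Q + K.T @ R @ K)       # policy evaluation
    K_new = np.linalg.solve(R, B.T @ P)  # policy improvement
    if np.linalg.norm(K_new - K) < 1e-10:
        break
    K = K_new
print("P =", P, "\nK =", K)
```

At convergence P satisfies the continuous-time ARE of the previous slide; the IRL version reaches the same fixed point from measured state data without using A.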

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Validation results — simulation
Parameters
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −400.6, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: Ks = 20, F = 10, G = 1, μ1 = μ2 = 5, k = 0.1, Z = 100, κp = 20, κd = 10, ke = 0.18 s
- Impedance model: θd = 10 sin(0.2πt), Md = 1 kg, Bd = 7.18 Ns·mm⁻¹, Kd = 15.76 N·mm⁻¹, Kh = 0.046
Fig. 6.2 Tracking error between the robot trajectory and the prescribed impedance model; Fig. 6.3 Reference trajectory versus the state of the robot system during and after learning.

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Validation results — experiment
Fig. 6.4 Output trajectory and impedance model in the inner loop; Fig. 6.5 Desired and output trajectories in the outer loop; Fig. 6.6 Interaction force.
Discussion
- Near-perfect performance of the human-robot cooperation.
- Only a small deviation between the desired and actual trajectories.
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance-model parameters is found by the outer-loop controller.

CONTENTS — next: Chapter 7, Force Sensorless Reflecting Control for BHTS.

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Proposed control idea
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is introduced.
2. The stability and finite-time convergence characteristics of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified under different working conditions.

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Control design
On each side i ∈ {master, slave}, the external torque is estimated by an FSOB whose observer gain is adapted online (an adaptive FSOB), so that neither side needs a force/torque sensor.
The modified FNTSM surface $s_i$ combines the tracking error and its derivative with a fractional-power term; near the origin, the fractional-power term is replaced by a quadratic polynomial in the error so as to avoid the singularity while keeping fast finite-time convergence.
The torque control law is designed FNTSMC-style: the model-based terms ($\hat M_i$, $C_i$, $K_i$), the estimated external torque, and fast finite-time reaching terms $-\rho_{1i} s_i - \rho_{2i} \lvert s_i \rvert^{\nu_i} \mathrm{sign}(s_i)$.
Block diagram (slide figure): Operator → master controller → APAM master system, and environment → slave controller → APAM slave system, with a torque estimator on each side and an impedance filter shaping the reflected torques $\hat d_m$, $\hat d_s$ exchanged between master and slave.
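The sensorless torque-estimation idea can be sketched with a minimal fixed-gain external-torque observer on a 1-DOF joint. This is a stand-in for the slide's adaptive FSOB (the adaptive gain law is omitted), and the joint parameters, input, and step disturbance are assumptions:

```python
import numpy as np

# 1-DOF joint: J*th_ddot = u - b*th_dot + tau_ext.  A momentum-style observer
# integrates the model residual; its gain L trades speed against noise.
J, b = 0.05, 0.2
L = 50.0
dt, T = 1e-4, 3.0

th_dot, p_hat, tau_hat = 0.0, 0.0, 0.0
err = []
for k in range(int(T / dt)):
    t = k * dt
    u = 0.5 * np.sin(2 * np.pi * t)            # arbitrary commanded torque
    tau_ext = 1.0 if t > 1.0 else 0.0          # step external torque at t = 1 s
    th_ddot = (u - b * th_dot + tau_ext) / J
    th_dot += th_ddot * dt
    # Observer for the momentum p = J*th_dot:  p_hat' = u - b*th_dot + tau_hat,
    # tau_hat = L*(p - p_hat)  ->  tau_hat tracks tau_ext with time constant 1/L.
    p_hat += (u - b * th_dot + tau_hat) * dt
    tau_hat = L * (J * th_dot - p_hat)
    err.append(abs(tau_hat - tau_ext))
print(f"|tau_hat - tau_ext| at t = T: {err[-1]:.4f}")
```

The adaptive FSOB additionally adjusts the observer gain online; here a fixed L = 50 already recovers the 1 N step within about 0.1 s.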

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results
Parameters
- Proposed control: λi = 8, pi = 5, qi = 7, ςi = 10⁻⁴, νi = 9/11, εi = 0.1, ρ1i = 10, ρ2i = 2
- PID control: KP = 38, KI = 25, KD = 0.15
- Observers: NDO Lh = Le = 10; RTOB β = 350 rad/s; AFOB κi(0) = 0.1, λi = 10, ηi = 0, m = 0.1, a = 10
Scenarios
- Free space: θd(t) = 5π(1 − 0.18t) sin(0.72πt)
- Contact and recovery: soft environment Ke = 500 N·mm⁻¹, Be = 4 Ns·mm⁻¹; stiff environment Ke = 2800 N·mm⁻¹, Be = 20 Ns·mm⁻¹
Fig. 7.1 Torque estimation performances: the RMSEs of the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results (Fig. 7.2): position tracking performance and tracking error; transparency performance and transparency error in "soft" contact; transparency performance and transparency error in "stiff" contact.

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Summary
- Finite-time convergence of the tracking errors.
- Fast transient response.
- The new AFOB eliminates the uncertainties and the noise effect in the obtained force sensing.
- Robust performance against uncertainties and disturbances in the system dynamics.
- Stability of the overall system and the transparency performance are attained simultaneously.
Great transparency performance and global stability of the BHTS are achieved under the proposed control scheme, despite the uncertainties and across different working conditions.

CONTENTS — next: Chapter 8, Conclusion and Future works.

8 CONCLUSION AND FUTURE WORKS

Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions; the model uncertainties and disturbances are handled.
- Validated the significant effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures excellent position and force tracking performance.
- Optimized the prescribed impedance-model parameters by the IRL approach, requiring little operator information while yielding a highly effective position control purpose.
- Achieved good transparency performance with simultaneous force feedback and position tracking by combining the new adaptive force estimation with separate FNTSMC schemes on the two sides.

8 CONCLUSION AND FUTURE WORKS

Future works
- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both master and slave sides; the computational complexity should also be reduced with the data size.
- Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device for the rehabilitation field with a safer solution.

A PRESENTATION OF DOCTORAL DISSERTATION

Presenter Cong Phat Vo

Department of Mechanical Engineering University of Ulsan Korea

Thank you for attending

1 2 3 4 5 6 7 8


4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

3 Verified the reliability and superiority of the proposed control

demonstrated in different challenging work conditions

2

The FITSMC guarantees the finite convergence new adaptive gain and TDE

are deployed to handle online (uncertainties and chattering phenomenon)

The proposed FSOB scheme obtains

force information eliminated noise and

static friction

1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 25Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

( )[ ]

1 20

( ) ( ( )) = + +t

v

f f fe e sign e d

( ) ( )[ 1]5

5

4

ˆ

+

minus= minus

v

signsign

1 1 [ ]

1 2

[ ]

4 5

ˆ( ) ( ) ( )

ˆ ( )

v

c d f f

v

u t d t K e sign e

sign

minus minus= + + +

+ +

1ˆ ˆ ˆ( ) ( ) ( ) ( )minus minus = minus minus minusd d c e dd t d t t u t t K t t

The FITSM surface is defined

The control law can be rewritten

2

2

ˆ ˆ ˆ2ˆ

+ += e e e e e e

e

e

An adaptive switching gain law An estimated lumped disturbance Force observer based-LeD [40]

APAM system

Force based

Time-delay

Estimator

Integral Terminal

Sliding SurfaceControl law

Adaptive law

Environment

Configuration

Reference

Trajectoryu

d

ˆe

d

+

+

The proposed control

approach

2k

Plant + Environment

Force Observer

e

Proof See in Appendix 41 ndash 43

Control design

1 2 3 4 5 6 7 8

20-May-21 | 26Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

for ITSMC η1 = 3 η2 = 15 η3 = 25 κ1 = κ3 = 10

κ2 = κ4 = 2 γ1 = γ2 = 57 k1 = 2 k2 = 55 v = 911

for adaptative gain μ = 10 δ = 005 σ = 01

for case study 3 KP = 15 KI = 18 KD = 005

for other case KP = 35 KI = 28 KD = 01

Proposed control

PID control

Fig 42 Performance response in Scenario 2

Scenario 1 τd (N) = 75sin(02πt- π2)+625

Scenario 2 τd (N) = 75sin(08πt- π2)+625

Scenario 3 A multistep trajectory with a LPFrsquos bandwidth

of 02(rads)

Observersfor FSOB ω = 500 rads ξ = 0001 α1 = 20 α2 = 12

α3 = 5 α4 = 15

Fig 42 Performance response in Scenario 1

Parameters

Experimental validation

1 2 3 4 5 6 7 8

20-May-21 | 27Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Fig 43 Performance response in Scenario 3

Fig 44 Control gain

Controller PID ISMC FITSMC-TDE Proposed

Sine 01 Hz 21050 12152 09440 05158

Sine 04 Hz 32054 18045 20632 19807

Multistep 26508 23548 20618 17685

Table 41 A RMSE comparison between the tracking error performances

Experimental validation

- Improved the control performances (fast response high

accuracy and robustness)

- Guaranteed the finite convergence

- Cancelling the lumped uncertainties via TDE solution and

the switching gain in the FITSMC online

Discussion

1 2 3 4 5 6 7 8

20-May-21 | 28Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

4

6

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Model-free control for Optimal contact force in Unknown Environment5

1 2 3 4 5 6 7 8

20-May-21 | 29Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

3 Verified effective for robot interaction control in

unknown environments

2 The stability of the closed-loop control is theoretically demonstrated

by the Lyapunov approach

The RL method is used to learn on-line

the optimal desired force1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 30Presenter Cong Phat Vo

( ) ( ) ( ( ))= minus + minuse e e eF t C x t K x x t

( ( )) ( )r dZ e t F t= [ ]

1 2

( ) ( ) ( )

( ) ( )

= + +

minus minus minus +

r r r

v

e

Me t Ce t Ke t

sign F t

The force control

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The environment is modeled as

a spring-damper system [81]

The optimal impedance model The position control in task space

( ) ( ) ( )

= + f Pf f If f

t

x s K e t K e d

Task-specific Outer-loopRobot-specific Inner-loop

Precise position

control

Prescribed

Impedance Model

Environment

Configuration

APAM system

xx

d

Force controlx

rx

fe

r

Fe

Fd

ef

-

+ -+ -

+

RL-based

approximation

ϕ

Control design

1 2 3 4 5 6 7 8

( ) ( ) ( )r r de t Ae t BF t= +

20-May-21 | 31Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

LQR solution

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

( ) ( )=d e rF t e t

The discounted cost function

( ) ( )( ) ( ) ( ) ( ) ( ) ( )

= minus +T T

s r d r r d dt

J e t F t e Se F RF d

The Algebraic Riccati equation

0+ + + =T T

eA PA S P A PB

The feedback control gain

( )1

1 minus

minus= minus +T T

e B PB R B PA

IRL approximationRL solution

Implementation procedure

1 2 3 4 5 6 7 8

offline soliton

20-May-21 | 32Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

LQR solution RL solution

impossible for all time

The Hamilton-Jacobi-Bellman equation

( ) ( ) ( )min ( ) ( ) ( ) 0

minus minus + minus =

d

t

s rtF

V e t r t r e dt

The new Algebraic Riccati equation

( )( ) ( )

2 ( ) ( ) ( ) ( ) 0

T T

r r

T T

r d d d

e t A P PA S P e t

e t PBF t F t RF t

+ + minus

+ + =

The optimal desired force 1

22 21( ) ( )d rF t H H e tminus= minus

( ) ( ) ( ) ( ) ( )= = T

r d s rQ e t F t V e t X HX

IRL approximation

The discounted cost function

( ) ( )

( )

( )( ) ( ) ( )

where ( ) ( ) ( ) ( ) ( ) ( )

t

s r r dt

T T

r d r r d d

V e t r e F e d

r e F e Se F RF

minus minus=

= +

Implementation procedure

1 2 3 4 5 6 7 8

( ) ( )=d e rF t e t

online soliton using the off-policy RL to

find the solution of the given new ARE

20-May-21 | 33Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

LQR solution RL solution

minimize temporal difference error

The approximator of the Q-value function

( )( ) ( )1

ˆ N

r d i r d i

i

Q e F e F =

=

The parameter update law

11( ) exp

2

T

i r d r i d i i r i d ie F e c F c e c F c minus = minus minus minus minus minus

( )

( ) ( ) ( ) ( )

ˆ ( ) ( ) ( )1( ) ( )

( )

=

= minus +

r d

t t t t

Q e t F t tt t

t t

IRL approximation

Q-function update law based on IRL

( )

( )

( )( ) ( ) ( )

( ) ( )

t tt

r dt

t

r d

Q e t F t r e d

e Q e t t F t t

+

minus minus

minus

=

+ + +

( ) ( ) ( ) ( ) ( )T T

r r d dr t e t Se t F t RF t= +

The reward function of the utility function

Implementation procedure

1 2 3 4 5 6 7 8

1

( )( )

( )

i r di r d N

i r di

e Fe F

e F

=

=

( ) ( )=d e rF t e t

20-May-21 | 34Presenter Cong Phat Vo

Fig 51 RL in unknown environments during the

learning process

Fig 52 Learning curves

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

κe = 00025 S = 1 R = 10 T = 0001 A = minus4006

B = minus0002

υ = 09 γ = 09 δΔ = 065 ε = 01 ε-decay = 099

K = 10 β = 01

LQR solution

RL aproximation

PositionForce controlKPf = 0001 KIf = 002 λ = 02 ηΔ = 01 ρ1 = 05

ρ2 = 20 v = 911 α = 57

EnvironmentCe = 05 Nsmm-1 Ke = 2 N mm-1 r0 = 165 mm

xd = 39917 mm xe = 35 mm

Parameters

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 35Presenter Cong Phat Vo

Fig 53 Desired forceposition tracking response

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Three steps are required to complete

the experiment

1-the identification of the environment

2-obtain the ARE solution using the

environmental estimation

3-the complete experiment for the

positionforce controller

IRL approximation

Achieving near-optimal performance without

knowledge of the environment dynamics

LQR solution

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 36Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

4

5

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Optimized Human-Robot Interaction force control using IRL6

1 2 3 4 5 6 7 8

20-May-21 | 37Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

3 The unknown human dynamics as well as the optimal overall

human-robot system efficiency is verified in experimental trials

2

Find the optimal parameters of the impedance model to

assure motion tracking and also assist the human with

minimum effort to perform the task in real time

The stability of the closed-loop control is theoretically

demonstrated

1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 38Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

( )1 20

ˆ ( )

ˆ (|| || )

= + + +

+ + minus

t

s h h h

q B h h

K e e e d

K Z Z K

The control torque in task space

1 2

1 2 1

2 1 2

ˆ ˆˆ ( ) ( )

ˆ ˆˆ ˆ ˆ || ||

ˆˆ ˆ ˆ( ) || ||

=

= minus minus = minus

T T

T T T

h h h

T

h h

q W W q

W F F W q kF W

W Gq W kG W

2

h hm

d d d

K

M s C s K

=

+ +

An approximation of the continuous functionThe prescribed

impedance model

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

Control design

1 2 3 4 5 6 7 8

20-May-21 | 39Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

1

1

[ ]

minus

minus

= +

= minus

=

=

h h h h d

h e

h e p d

T T T

d d d

A B e

A k

B k

e e e

( )1

+=

+

p d

e

sG s

k s

The human transfer function [99] Purpose τh rarr ed

0

0

= + = +

= minus

q d qd e

e

h h hh e

A e Be X AX Buu

B A u KX

A new augmented state

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

IRL based on LQR solution

Control design

1 2 3 4 5 6 7 8

20-May-21 | 40  Presenter: Cong Phat Vo

6. OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design.

The performance index (infinite-horizon quadratic cost) to minimize:
  J_m = ∫_t^∞ ( e_d^T Q_d e_d + e_h^T Q_h e_h + u_e^T R u_e ) dτ

A new augmented state: Ẋ = A X + B u_e, with feedback u_e = −K X.

LQR solution: solve the algebraic Riccati equation [83]
  0 = A^T P + P A + Q − P B R^{−1} B^T P

RL / IRL solution: the optimal feedback control is
  u_e = arg min_{u_e} V(t, X(t)) = −R^{−1} B^T P X

The value function satisfies
  V(X(t)) = ∫_t^∞ X^T (Q + K^T R K) X dτ = X^T P X
and the IRL Bellman equation for a time interval ΔT > 0:
  V(X(t)) = ∫_t^{t+ΔT} X^T (Q + K^T R K) X dτ + V(X(t + ΔT))
where P is the real symmetric positive-definite solution of
  (A − BK)^T P + P (A − BK) = −(Q + K^T R K)

The IRL Bellman equation can therefore be written as
  X^T(t) P X(t) = ∫_t^{t+ΔT} X^T (Q + K^T R K) X dτ + X^T(t+ΔT) P X(t+ΔT)

Proof of convergence: see Appendix 6.2.
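When the model (A, B) is known, the LQR gain from the algebraic Riccati equation above can be computed offline; a short SciPy sketch, using an illustrative 2-state system rather than the thesis matrices:

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Offline LQR: solve 0 = A^T P + P A + Q - P B R^{-1} B^T P, then K = R^{-1} B^T P.
# The matrices below are illustrative stand-ins, not the thesis model.
A = np.array([[0.0, 1.0],
              [-2.0, -3.0]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)
R = np.array([[1.0]])

P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)  # optimal feedback gain: u = -K x

# Sanity checks: P solves the ARE and the closed loop A - B K is stable.
residual = A.T @ P + P @ A + Q - P @ B @ K
closed_loop_poles = np.linalg.eigvals(A - B @ K)
```

This is the "offline solution" branch; the IRL branch avoids needing A by evaluating the Bellman equation along measured trajectories.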

20-May-21 | 41  Presenter: Cong Phat Vo

6. OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design: online implementation.

Fig. 6.1. Continuous-time linear policy iteration algorithm:
1. Start with an admissible control input u_e^0 = −K^1 X.
2. Policy evaluation: given a control policy u^i, find P^i using the off-policy Bellman equation
   X^T(t) P^i X(t) = ∫_t^{t+ΔT} X^T (Q + (K^i)^T R K^i) X dτ + X^T(t+ΔT) P^i X(t+ΔT)
3. Policy improvement: update the control policy u^{i+1} = −R^{−1} B^T P^i X.

[Flowchart: initialization P^1 = 0, i = 1, with K^1 chosen so that (A − BK^1) is stable; solve for the cost P^i using least squares over measured data; policy update u^i = −R^{−1} B^T P^{i−1} X; iterate i → i + 1 until ||p^i − p^{i−1}|| falls below a tolerance, then end.]
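The loop in Fig. 6.1 can be sketched in its model-based form (Kleinman iteration), where policy evaluation is a Lyapunov solve; the thesis's IRL variant instead fits P^i by least squares on measured trajectory data so that A need not be known. The system below is an illustrative stand-in.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov

def policy_iteration(A, B, Q, R, K0, iters=50, tol=1e-10):
    """Model-based counterpart of Fig. 6.1 (Kleinman iteration):
    evaluate P^i for the policy u = -K^i x via a Lyapunov equation,
    improve K^{i+1} = R^{-1} B^T P^i, stop when ||P^i - P^{i-1}|| is small."""
    K, P_prev = K0, None
    for _ in range(iters):
        Ac = A - B @ K                    # closed-loop matrix under u = -K x
        Qi = Q + K.T @ R @ K              # stage cost of the current policy
        # Policy evaluation: Ac^T P + P Ac = -(Q + K^T R K)
        P = solve_continuous_lyapunov(Ac.T, -Qi)
        K = np.linalg.solve(R, B.T @ P)   # policy improvement
        if P_prev is not None and np.linalg.norm(P - P_prev) < tol:
            break
        P_prev = P
    return P, K

# Illustrative stand-in system (A itself is stable, so K0 = 0 is admissible).
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.array([[1.0]])
P, K = policy_iteration(A, B, Q, R, np.zeros((1, 2)))
```

At convergence P satisfies the algebraic Riccati equation, matching the offline LQR solution.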

20-May-21 | 42  Presenter: Cong Phat Vo

6. OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Validation results: simulation.

Parameters:
- LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δ_Δ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: K_s = 20, F = 10, G = 1, μ1 = μ2 = 5, k = 0.1, Z = 100, κ_p = 20, κ_d = 10, and k_e = 0.18 s
- Reference and impedance model: θ_d = 10 sin(0.2πt), M_d = 1 kg, B_d = 7.18 N·s·mm⁻¹, K_d = 15.76 N·mm⁻¹, K_h = 0.046

Fig. 6.2. Tracking error between the trajectory of the robot system and the prescribed impedance model.
Fig. 6.3. Reference trajectory versus the state of the robot system during and after learning.

20-May-21 | 43  Presenter: Cong Phat Vo

6. OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Validation results: experiment.

Fig. 6.4. The output trajectory and impedance model in the inner loop.
Fig. 6.5. The desired trajectory and output trajectory in the outer loop.
Fig. 6.6. Interaction force.

Discussion:
- Excellent performance of the human-robot cooperation.
- Only a slight deviation between the desired and actual trajectories.
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller.


20-May-21 | 45  Presenter: Cong Phat Vo

7. FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Proposed control idea:
1. For the first time, a hybrid control is built on a fast finite-time NTSMC and an AFOB.
2. The stability and finite-time convergence characteristics of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified under different working conditions.

20-May-21 | 46  Presenter: Cong Phat Vo

7. FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Control design:
- The external torque estimation on each side i, with an adaptive FSOB gain that switches on the sliding variable s_i [equations garbled in extraction].
- The torque control design in FNTSMC style, driving the modified FNTSM surface s_i [exact surface and control law garbled in extraction].

[Block diagram: bilateral haptic teleoperation. Operator → Master Controller → APAM master system with estimator τ̂_md; Environment → APAM slave system → Slave Controller with estimator τ̂_sd; the master and slave sides are coupled through an impedance filter.]
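The terminal-sliding-mode idea behind the FNTSM surface can be illustrated on a plain double integrator. This is a hedged toy sketch, not the thesis FNTSMC: the exponent 5/3 and gains are illustrative, and a boundary-layer sat() stands in for sign() to soften chattering.

```python
import math

def spow(x, a):
    """Sign-preserving power |x|^a * sign(x), used in terminal sliding surfaces."""
    return math.copysign(abs(x) ** a, x)

def sat(x):
    """Boundary-layer approximation of sign(x)."""
    return max(-1.0, min(1.0, x))

# Nonsingular terminal sliding mode on x1' = x2, x2' = u, with surface
#   s = x1 + spow(x2, p/q),  p/q = 5/3  (odd p, q with 1 < p/q < 2).
# Toy gains; illustrative only.
p_q, k, phi, dt = 5.0 / 3.0, 2.0, 0.01, 1e-3
x1, x2 = 1.0, 0.0
for _ in range(10000):  # 10 s
    s = x1 + spow(x2, p_q)
    # equivalent term cancels x2 in s-dot; switching term drives s to zero
    u = -(1.0 / p_q) * spow(x2, 2.0 - p_q) - k * sat(s / phi)
    x2 += u * dt
    x1 += x2 * dt
# The state slides along s = 0 into a small neighborhood of the origin.
```

On the surface s = 0 the state obeys x2 = -spow(x1, q/p), which reaches the origin in finite time; that is the property the slide's finite-time claims rest on.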

20-May-21 | 47  Presenter: Cong Phat Vo

7. FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results.

Fig. 7.1. Torque estimation performances.

Parameters:
- Proposed control: λ_i = 8, p_i = 5, q_i = 7, ς_i = 10⁻⁴, ν_i = 9/11, ε_i = 0.1, ρ_1i = 10, ρ_2i = 2
- PID control: K_P = 38, K_I = 25, K_D = 0.15
- Observers: for the NDO, L_h = L_e = 10; for the RTOB, β = 350 rad/s; for the AFOB, κ_i(0) = 0.1, λ_i = 10, η_i = 0, m = 0.1, a = 10

Scenarios:
- Free-space situation: θ_d(t) = 5π(1 − 0.18t) sin(0.72πt)
- Contact and recovery situation: soft environment K_e = 500 N·mm⁻¹, B_e = 4 N·s·mm⁻¹; stiff environment K_e = 2800 N·mm⁻¹, B_e = 20 N·s·mm⁻¹

The RMSEs associated with the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.
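The RMSE figures quoted above follow the usual root-mean-square definition; a small reference helper (the data here are toy values, not the experimental signals):

```python
import math

def rmse(estimates, truth):
    """Root-mean-square error between an estimated and a true signal."""
    return math.sqrt(sum((e - t) ** 2 for e, t in zip(estimates, truth)) / len(truth))

# Toy check: a constant 0.5 bias yields an RMSE of exactly 0.5.
truth = [0.1 * k for k in range(10)]
biased = [t + 0.5 for t in truth]
# rmse(biased, truth) -> 0.5
```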

20-May-21 | 48  Presenter: Cong Phat Vo

7. FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results.

Fig. 7.2. Position tracking performances.
Fig. 7.2. Position tracking error performances.
Fig. 7.2. Transparency performance in "soft" contact.
Fig. 7.2. Transparency error in "soft" contact.
Fig. 7.2. Transparency performance in "stiff" contact.
Fig. 7.2. Transparency error in "stiff" contact.

20-May-21 | 49  Presenter: Cong Phat Vo

7. FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Summary:
- Finite-time convergence of the tracking errors.
- Fast transient response.
- The new AFOB eliminates the uncertainties and the noise effect in the obtained force sensing.
- Robust performance against uncertainties and disturbances in the system dynamics.
- Stability of the overall system and transparency performance are attained simultaneously.

Great transparency performance and global stability of the BHTS are achieved under the proposed control scheme, despite the uncertainties and across different working conditions.


20-May-21 | 51  Presenter: Cong Phat Vo

8. CONCLUSION AND FUTURE WORKS

Conclusions:
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions; its uncertainties and disturbances are handled.
- Validated the significant effectiveness on finite-time force tracking problems of the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding a highly effective position control.
- Achieved good transparency performance, with force feedback and position tracking attained simultaneously, by the new adaptive force estimation combined with separate FNTSMC schemes.

20-May-21 | 52  Presenter: Cong Phat Vo

8. CONCLUSION AND FUTURE WORKS

Future works:
- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both the master and slave sides; the computational complexity should also be reduced with the data size.
- Based on the prototype developed in this thesis, it is possible to build a high-quality commercial teleoperation device for the rehabilitation field with a safer solution.

20-May-21 | 53

A PRESENTATION OF DOCTORAL DISSERTATION

Presenter: Cong Phat Vo
Department of Mechanical Engineering, University of Ulsan, Korea

Thank you for attending.


Page 22: A PRESENTATION OF DOCTORAL DISSERTATION

20-May-21 | 22Presenter Cong Phat Vo

3 AN ADAPTIVE GAIN ITSMC-TDE SCHEME FOR THE POSITION CONTROL

Summary

Robust performance against uncertainties and disturbances1 2d dx x

Time-delay

Estimator

The PAM system

Control law

Robust

control signal

Adaptation

law

Nominal

control signal

Transfer

Coordinate Integral Terminal

Sliding Surface

Terminal

Sliding Surface

Reference

Trajectory

d

The proposed

control approach

u

1 2x x

su

1 2x x

s

1

2

nu

Finite-time convergence of the tracking errors

Reducing the chattering phenomenon

Eliminating the reaching phase

Fast transient response

Next works will be directed to the control using the force properties

1 2 3 4 5 6 7 8

20-May-21 | 23Presenter Cong Phat Vo

CONTENTS

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

5

6

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Adaptive Finite-Time Force Sensorless Control scheme4

1 2 3 4 5 6 7 8

20-May-21 | 24Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

3 Verified the reliability and superiority of the proposed control

demonstrated in different challenging work conditions

2

The FITSMC guarantees the finite convergence new adaptive gain and TDE

are deployed to handle online (uncertainties and chattering phenomenon)

The proposed FSOB scheme obtains

force information eliminated noise and

static friction

1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 25 Presenter: Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Control design

- The FITSM surface is defined as
  s(t) = ė_f(t) + ∫_0^t [κ1 e_f(τ) + κ2 sign(e_f(τ))^v] dτ
- The control law combines a nominal term on this surface with the TDE-based estimate d̂_d(t) of the lumped disturbance, reconstructed from the one-step-delayed control input u_c(t − Δt) and the delayed tracking error.
- An adaptive switching-gain law suppresses chattering online, and the LeD-based force observer [40] supplies the contact-force estimate without a force sensor.

[Block diagram: Reference Trajectory → Integral Terminal Sliding Surface → Control law → APAM system (Plant + Environment); the Time-delay Estimator, Adaptive law, and Force Observer close the loop.]

Proof: see Appendix 4.1 - 4.3.
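The TDE idea above, estimating the lumped disturbance from the one-sample-delayed input and acceleration, can be sketched on a generic 1-DOF plant. This is an illustrative toy model, not the dissertation's APAM dynamics; the masses, gains, and disturbance signal are assumptions chosen only to show the mechanism.

```python
import numpy as np

def simulate_pam_like(use_tde=True, steps=5000, dt=0.001):
    """1-DOF plant m*xdd = u + d(t) with unknown disturbance d,
    controlled by PD + feedforward, optionally with TDE compensation.
    Returns the steady-state mean absolute tracking error."""
    m, m_bar = 1.2, 1.0            # true vs. nominal mass (model mismatch)
    kp, kd = 400.0, 40.0           # PD gains (illustrative)
    x, v = 0.0, 0.0
    u_prev, acc_prev = 0.0, 0.0
    errs = []
    w = 2.0 * np.pi
    for k in range(steps):
        t = k * dt
        xr = 0.05 * np.sin(w * t)                  # reference trajectory
        vr = 0.05 * w * np.cos(w * t)
        ar = -0.05 * w * w * np.sin(w * t)
        d = 3.0 * np.sin(5.0 * t) + 1.0            # unknown lumped disturbance
        # TDE: reconstruct the lumped disturbance from the one-sample-delayed
        # input and measured acceleration, d_hat(t) ~ m_bar*xdd(t-dt) - u(t-dt)
        d_hat = (m_bar * acc_prev - u_prev) if use_tde else 0.0
        e, edot = xr - x, vr - v
        u = m_bar * (ar + kp * e + kd * edot) - d_hat
        acc = (u + d) / m                          # true plant response
        v += acc * dt
        x += v * dt
        u_prev, acc_prev = u, acc
        errs.append(abs(e))
    return float(np.mean(errs[-1000:]))
```

Running both variants shows the point of the scheme: with TDE the residual error is dominated by the one-sample lag of the disturbance estimate, so it is far smaller than the uncompensated error.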

20-May-21 | 26 Presenter: Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Experimental validation

Parameters
- ITSMC: η1 = 3, η2 = 15, η3 = 25, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1 = γ2 = 5/7, k1 = 2, k2 = 55, v = 9/11
- Adaptive gain: μ = 10, δ = 0.05, σ = 0.1
- PID control, case study 3: KP = 15, KI = 18, KD = 0.05; other cases: KP = 35, KI = 28, KD = 0.1
- FSOB observer: ω = 500 rad/s, ξ = 0.001, α1 = 20, α2 = 12, α3 = 5, α4 = 15

Scenarios
- Scenario 1: τd (N) = 7.5 sin(0.2πt − π/2) + 62.5
- Scenario 2: τd (N) = 7.5 sin(0.8πt − π/2) + 62.5
- Scenario 3: a multistep trajectory shaped by an LPF with 0.2 rad/s bandwidth

Fig. 4.2 Performance response in Scenario 1
Fig. 4.2 Performance response in Scenario 2
(Proposed control vs. PID control)

20-May-21 | 27 Presenter: Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Experimental validation

Fig. 4.3 Performance response in Scenario 3
Fig. 4.4 Control gain

Table 4.1 RMSE comparison of the tracking-error performances

Controller   | PID    | ISMC   | FITSMC-TDE | Proposed
Sine 0.1 Hz  | 2.1050 | 1.2152 | 0.9440     | 0.5158
Sine 0.4 Hz  | 3.2054 | 1.8045 | 2.0632     | 1.9807
Multistep    | 2.6508 | 2.3548 | 2.0618     | 1.7685

Discussion
- Improved control performance: fast response, high accuracy, and robustness
- Guaranteed finite-time convergence
- Online cancellation of the lumped uncertainties via the TDE solution and the switching gain in the FITSMC
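The figures of Table 4.1 are the standard root-mean-square error metric over the tracking-error signal. A minimal sketch of how such a comparison is computed; the reference and response signals below are illustrative stand-ins, not the experimental data:

```python
import numpy as np

def rmse(reference, measured):
    """Root-mean-square error between a reference trajectory and a response."""
    e = np.asarray(reference) - np.asarray(measured)
    return float(np.sqrt(np.mean(e ** 2)))

# Illustrative comparison: a 0.1 Hz sine reference vs. two simulated responses.
t = np.linspace(0.0, 10.0, 1001)
ref = 7.5 * np.sin(0.2 * np.pi * t)
resp_good = ref + 0.3 * np.sin(2 * np.pi * t)   # small residual ripple
resp_poor = ref + 1.0 * np.sin(2 * np.pi * t)   # larger residual ripple
```

A sinusoidal residual of amplitude a contributes an RMSE of about a/√2 over whole periods, which is a quick sanity check on values like those in the table.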


20-May-21 | 29 Presenter: Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Proposed control idea

1. The RL method is used to learn the optimal desired force online.
2. The stability of the closed-loop control is theoretically demonstrated by the Lyapunov approach.
3. The approach is verified to be effective for robot interaction control in unknown environments.

20-May-21 | 30 Presenter: Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Control design

- The environment is modeled as a spring-damper system [81]:
  F_e(t) = −C_e ẋ(t) + K_e (x_e − x(t))
- The optimal impedance model maps the position error to the desired force, Z(e_r(t)) = F_d(t), with error dynamics ė_r(t) = A e_r(t) + B F_d(t).
- The force-control loop enforces the prescribed impedance dynamics M ë_r(t) + C ė_r(t) + K e_r(t), with a sign(·)^v robust term and the measured contact force F_e(t).
- The position control in task space is a PI law on the force error e_f:
  x_f(s) = K_Pf e_f(t) + K_If ∫_0^t e_f(τ) dτ

[Block diagram: a robot-specific inner loop (precise position control of the APAM system) and a task-specific outer loop (prescribed impedance model with RL-based approximation of F_d), connected through the environment.]

20-May-21 | 31 Presenter: Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Implementation procedure

The main objective of this work is to determine the optimal desired force F_d, i.e. the minimum force that minimizes the position error e_r.

LQR solution (offline solution)
- Feedback form: F_d(t) = κ_e e_r(t)
- The discounted cost function:
  J_s(e_r(t), F_d(t)) = ∫_t^∞ e^{−γ(τ−t)} (e_rᵀ S e_r + F_dᵀ R F_d) dτ
- The algebraic Riccati equation: AᵀPA + S − P + AᵀPB κ_e = 0
- The feedback control gain: κ_e = −(BᵀPB + R)⁻¹ BᵀPA
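The gain κ_e = −(BᵀPB + R)⁻¹BᵀPA is the standard discrete-time LQR gain, obtained from the fixed point of the Riccati recursion. A minimal sketch; the 1-D error-dynamics matrices below are assumptions for illustration, not the identified environment model:

```python
import numpy as np

def dlqr(A, B, S, R, iters=2000):
    """Discrete-time LQR: iterate the Riccati recursion to a fixed point P,
    then form K = (B'PB + R)^-1 B'PA, so that F_d = -K e_r."""
    P = np.array(S, dtype=float)
    for _ in range(iters):
        K = np.linalg.solve(B.T @ P @ B + R, B.T @ P @ A)
        P = A.T @ P @ A + S - A.T @ P @ B @ K   # Riccati recursion step
    return P, K

# Illustrative 1-D error dynamics e_r(k+1) = a*e_r(k) + b*F_d(k)
A = np.array([[0.96]]); B = np.array([[0.002]])
S = np.array([[1.0]]);  R = np.array([[10.0]])
P, K = dlqr(A, B, S, R)
# The fixed point satisfies the ARE  A'PA + S - P - A'PB K = 0
care_resid = A.T @ P @ A + S - P - A.T @ P @ B @ K
```

With K = −κ_e this residual is term-by-term the slide's Riccati equation, which is a convenient check that an offline solution is correct before switching to the online RL version.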

20-May-21 | 32 Presenter: Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Implementation procedure

The main objective of this work is to determine the optimal desired force F_d, i.e. the minimum force that minimizes the position error e_r.

RL solution (an exact model-based LQR solution is impossible to maintain for all time; instead, an online solution uses off-policy RL to find the solution of the new ARE)

- The discounted cost function:
  V_s(e_r(t)) = ∫_t^∞ e^{−γ(τ−t)} r(e_r(τ), F_d(τ)) dτ,
  where r(e_r, F_d) = e_rᵀ S e_r + F_dᵀ R F_d
- The Hamilton-Jacobi-Bellman equation:
  min_{F_d} [ V̇_s(e_r(t)) − γ V_s(e_r(t)) + r(t) ] = 0
- The new algebraic Riccati equation:
  e_rᵀ(t)(AᵀP + PA + S − γP) e_r(t) + 2 e_rᵀ(t) P B F_d(t) + F_dᵀ(t) R F_d(t) = 0
- The Q-value function: Q(e_r(t), F_d(t)) = V_s(e_r(t)) = Xᵀ H X, with X = [e_rᵀ, F_dᵀ]ᵀ
- The optimal desired force: F_d(t) = −H22⁻¹ H21 e_r(t)

20-May-21 | 33 Presenter: Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Implementation procedure

The main objective of this work is to determine the optimal desired force F_d, i.e. the minimum force that minimizes the position error e_r.

IRL approximation (the parameters are updated to minimize the temporal-difference error)

- The approximator of the Q-value function:
  Q̂(e_r, F_d) = Σ_{i=1}^N θ_i φ̄_i(e_r, F_d)
- Gaussian basis functions, normalized over the N centers:
  φ_i(e_r, F_d) = exp(−½ [e_r − c_ri, F_d − c_di] Σ_i⁻¹ [e_r − c_ri, F_d − c_di]ᵀ),
  φ̄_i = φ_i / Σ_j φ_j
- The reward (utility) function: r(t) = e_rᵀ(t) S e_r(t) + F_dᵀ(t) R F_d(t)
- Q-function update law based on IRL:
  Q(e_r(t), F_d(t)) = ∫_t^{t+Δt} e^{−γ(τ−t)} r(e_r(τ), F_d(τ)) dτ + e^{−γΔt} Q(e_r(t+Δt), F_d(t+Δt))
- The parameter update law is a normalized gradient step on θ that drives the temporal-difference error of this Bellman identity to zero; the desired force F_d(t) = κ_e e_r(t) is then read off from the learned Q̂.
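The normalized-Gaussian approximator above is a linear-in-parameters model, so the TD-style update is just a gradient step on the weights θ. A minimal sketch fitting a known quadratic value surface (a stand-in target, since the true Q of the contact task is not available here); centers, width, and learning rate are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

centers = np.linspace(-1.0, 1.0, 11)    # basis centers c_i
width = 0.2                             # shared Gaussian width

def features(x):
    """Normalized Gaussian basis: phi_bar_i = phi_i / sum_j phi_j."""
    phi = np.exp(-0.5 * ((x - centers) / width) ** 2)
    return phi / phi.sum()

def target_q(x):                        # stand-in for the unknown Q surface
    return 3.0 * x ** 2

theta = np.zeros_like(centers)
beta = 0.2                              # learning rate
for _ in range(30000):
    x = rng.uniform(-1.0, 1.0)
    f = features(x)
    err = target_q(x) - theta @ f       # TD-style prediction error
    theta += beta * err * f             # gradient step on the linear weights
```

In the actual IRL scheme the target is not a known function but the integral-reward-plus-discounted-next-Q quantity from the Bellman identity; the structure of the update is the same.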

20-May-21 | 34 Presenter: Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Validation results

Parameters
- LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = −40.06, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position/Force control: K_Pf = 0.001, K_If = 0.02, λ = 0.2, ηΔ = 0.1, ρ1 = 0.5, ρ2 = 20, v = 9/11, α = 5/7
- Environment: C_e = 0.5 Ns·mm⁻¹, K_e = 2 N·mm⁻¹, r0 = 16.5 mm, x_d = 39.917 mm, x_e = 35 mm

Fig. 5.1 RL in unknown environments during the learning process
Fig. 5.2 Learning curves

20-May-21 | 35 Presenter: Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Validation results

Three steps are required to complete the experiment:
1. identification of the environment;
2. obtaining the ARE solution (LQR) from the environmental estimate;
3. the complete experiment for the position/force controller with the IRL approximation.

The IRL approximation achieves near-optimal performance without knowledge of the environment dynamics.

Fig. 5.3 Desired force/position tracking response


20-May-21 | 37 Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Proposed control idea

1. Find the optimal parameters of the impedance model to assure motion tracking and to assist the human with minimum effort to perform the task in real time.
2. The stability of the closed-loop control is theoretically demonstrated.
3. Handling of the unknown human dynamics, as well as the optimal overall human-robot system efficiency, is verified in experimental trials.

20-May-21 | 38 Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design: robot-specific inner loop and task-specific outer loop

- The prescribed impedance model, from the human torque τ_h to the model trajectory θ_m:
  θ_m(s)/τ_h(s) = K_h / (M_d s² + C_d s + K_d)
- The control torque in task space combines tracking-error feedback (K_s term with an integral of the error), an NN-based compensation of the unknown robot dynamics, and adaptive robust terms bounded through ||Z_B||.
- An approximation of the continuous robot dynamics uses an NN, f(q) = Wᵀξ(q); the weight estimates Ŵ1, Ŵ2 are updated by adaptive laws with norm-damping (k||·||) terms.

[Block diagram: Human Operator → Prescribed Impedance Model → Torque control law → APAM-based haptic interface; an IRL-based optimal law tunes K_h in the outer loop, and the NN functional approximation closes the inner loop.]

20-May-21 | 39 Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design: IRL based on the LQR solution (task-specific outer loop)

- The human is modeled by a first-order transfer function G(s) with gains k_p, k_d, and k_e [99]; its purpose is to map the human torque τ_h to the desired error e_d.
- The human-model state obeys ė_h = A_h e_h + B_h e_d, with A_h and B_h formed from k_e, k_p, and k_d, and e_d = [e_dᵀ, ė_dᵀ]ᵀ.
- A new augmented state X stacks the impedance-model error e_q and the human state e_h, giving Ẋ = AX + B u_e with the feedback u_e = −KX.

[Block diagram: Human Operator → Prescribed Impedance Model → Torque control law → APAM-based haptic interface, with the IRL-based optimal law and the NN functional approximation as on the previous slide.]

20-May-21 | 40 Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design: RL and IRL solutions

- The augmented state Ẋ = AX + B u_e with feedback u_e = −KX.
- The infinite-horizon quadratic performance index to minimize:
  J_m = ∫_t^∞ (e_dᵀ Q_d e_d + τ_hᵀ Q_h τ_h + u_eᵀ R u_e) dτ
- Solve the algebraic Riccati equation [83]:
  0 = AᵀP + PA + Q − P B R⁻¹ Bᵀ P
- An optimal feedback control:
  u_e = arg min_{u_e} V(t, X(t)) = −R⁻¹ Bᵀ P X
- The value function: V(X(t)) = ∫_t^∞ Xᵀ(Q + KᵀRK)X dτ = XᵀPX,
  where P is the real symmetric positive-definite solution of
  (A − BK)ᵀP + P(A − BK) = −(Q + KᵀRK)
- The IRL Bellman equation for a time interval ΔT > 0:
  V(X(t)) = ∫_t^{t+ΔT} Xᵀ(Q + KᵀRK)X dτ + V(X(t + ΔT)),
  which can be written as
  Xᵀ(t)PX(t) = ∫_t^{t+ΔT} Xᵀ(Q + KᵀRK)X dτ + Xᵀ(t+ΔT)PX(t+ΔT)

Proof of convergence: see Appendix 6.2.

20-May-21 | 41 Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design: online implementation

Fig. 6.1 Continuous-time linear policy iteration algorithm
1. Initialization: start with an admissible (stabilizing) control input u_e⁰ = −K1⁰X, with A − BK1⁰ stable.
2. Policy evaluation: given the control policy uⁱ, find Pⁱ from the off-policy IRL Bellman equation
   Xᵀ(t)PⁱX(t) = ∫_t^{t+ΔT} Xᵀ(Q + (Kⁱ)ᵀRKⁱ)X dτ + Xᵀ(t+ΔT)PⁱX(t+ΔT),
   solving for the cost by least squares over measured trajectory segments.
3. Policy improvement: update the control policy uⁱ⁺¹ = −R⁻¹BᵀPⁱX.
Repeat (i → i + 1) until ||pⁱ − pⁱ⁻¹|| falls below a tolerance.
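The loop of Fig. 6.1 has a well-known model-based equivalent (Kleinman's algorithm): the data-driven least-squares evaluation step is replaced by solving a Lyapunov equation for Pⁱ, and the improvement step is identical. A minimal sketch; the 2-state system matrices are illustrative assumptions, not the haptic-interface model:

```python
import numpy as np

def lyap_solve(Acl, Qbar):
    """Solve Acl' P + P Acl = -Qbar via the Kronecker-product identity."""
    n = Acl.shape[0]
    I = np.eye(n)
    M = np.kron(I, Acl.T) + np.kron(Acl.T, I)
    vecP = np.linalg.solve(M, -Qbar.flatten(order="F"))
    return vecP.reshape((n, n), order="F")

def policy_iteration(A, B, Q, R, K0, iters=30):
    """Model-based equivalent of the IRL policy iteration:
    evaluate each policy K_i via a Lyapunov equation, then improve."""
    K = K0
    for _ in range(iters):
        Acl = A - B @ K
        P = lyap_solve(Acl, Q + K.T @ R @ K)   # policy evaluation
        K = np.linalg.solve(R, B.T @ P)        # policy improvement
    return P, K

A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2); R = np.array([[1.0]])
K0 = np.zeros((1, 2))                          # A is stable, so K0 = 0 is admissible
P, K = policy_iteration(A, B, Q, R, K0)
# The iterates converge to the CARE solution: A'P + PA + Q - P B R^-1 B' P = 0
care_resid = A.T @ P + P @ A + Q - P @ B @ np.linalg.solve(R, B.T) @ P
```

The IRL version replaces `lyap_solve` with least squares on measured XᵀPX segments, which is exactly why no model of A is needed online.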

20-May-21 | 42 Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Validation results (simulation)

Parameters
- LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = −40.06, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: K_s = 20, F = 10, G = 1, μ1 = μ2 = 5, k = 0.1, Z̄ = 100, κ_p = 20, κ_d = 10, and k_e = 0.18 s
- Impedance model: θ_d = 10 sin(0.2πt), M_d = 1 kg, B_d = 7.18 Ns·mm⁻¹, K_d = 15.76 N·mm⁻¹, K_h = 0.046

Fig. 6.2 The tracking error between the trajectory of the robot system and the prescribed impedance model
Fig. 6.3 Reference trajectory versus the state of the robot system during and after learning

20-May-21 | 43 Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Validation results (experiment)

Fig. 6.4 The output trajectory and impedance model in the inner loop
Fig. 6.5 The desired trajectory and output trajectory in the outer loop
Fig. 6.6 Interaction force

Discussion
- Near-perfect performance of the human-robot cooperation
- Only a small deviation between the desired and actual trajectories
- The human interaction force is reduced after learning, and the optimal set of prescribed-impedance parameters is found by the outer-loop controller


20-May-21 | 45 Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Proposed control idea

1. For the first time, a hybrid control scheme combines a fast finite-time NTSMC with an AFOB.
2. The stability and finite-time convergence of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified in different working conditions.

20-May-21 | 46 Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Control design

- The external torque on each side i (master/slave) is estimated by an FSOB whose gain is adapted online: the adaptive observer gain κ_i switches according to the sliding variable s_i and the estimation residual.
- The modified FNTSM surface is designed as s_i = ė_i + λ_i ζ(e_i), where ζ(·) is a fast-terminal function chosen to avoid singularity.
- The torque control on each side follows the FNTSMC style: the nominal dynamics (M_i, C_i, K_i terms) plus a fast-terminal reaching law −ρ1i sign(s_i)^{ν_i} − ρ2i s_i and compensation of the estimated external torque τ̂_id.

[Block diagram: Operator → Master controller → APAM master system with its estimator; Environment → APAM slave system → Slave controller with its estimator; the two sides exchange positions and estimated torques (d̂_m, d̂_s) through an impedance filter.]
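The estimator blocks above reconstruct the external torque without a force sensor. A generic momentum-observer-style sketch for a single joint illustrates the principle; the first-order observer gain L and the plant numbers are assumptions, not the dissertation's adaptive FSOB law:

```python
import numpy as np

def torque_observer_demo(steps=20000, dt=0.0005, L=50.0):
    """First-order external-torque observer for J*qdd = u + tau_ext:
    tau_hat converges to tau_ext without any direct force measurement."""
    J = 0.01                   # joint inertia (illustrative)
    q = qd = 0.0
    tau_hat, z = 0.0, 0.0
    tau_ext = 0.8              # unknown external torque to be recovered
    for k in range(steps):
        t = k * dt
        u = 0.5 * np.sin(2.0 * np.pi * t)      # arbitrary control input
        qdd = (u + tau_ext) / J                # true plant
        qd += qdd * dt
        q += qd * dt
        # momentum observer: p = J*qd, zdot = u + tau_hat, tau_hat = L*(p - z)
        p = J * qd
        z += (u + tau_hat) * dt
        tau_hat = L * (p - z)
    return tau_hat
```

Since d(p − z)/dt = tau_ext − tau_hat, the estimate obeys a first-order error dynamic with time constant 1/L; the adaptive gain in the proposed AFOB plays the role of scheduling L against noise and stiction.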

20-May-21 | 47 Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results

Parameters
- Proposed control: λ_i = 8, p_i = 5, q_i = 7, ς_i = 10⁻⁴, ν_i = 9/11, ε_i = 0.1, ρ1i = 10, ρ2i = 2
- PID control: K_P = 38, K_I = 25, K_D = 0.15
- Observers: NDO L_h = L_e = 10; RTOB β = 350 rad/s; AFOB κ_i(0) = 0.1, λ_i = 10, η_i = 0, m = 0.1, a = 10

Scenarios
- Free-space situation: θ_d(t) = 5π(1 − 0.18t) sin(0.72πt)
- Contact and recovery situation: soft environment K_e = 500 N·mm⁻¹, B_e = 4 Ns·mm⁻¹; stiff environment K_e = 2800 N·mm⁻¹, B_e = 20 Ns·mm⁻¹

Fig. 7.1 Torque estimation performances: the RMSEs of the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.

20-May-21 | 48 Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results

Fig. 7.2 Position tracking performances
Fig. 7.2 Position tracking error performances
Fig. 7.2 Transparency performance in "soft" contact
Fig. 7.2 Transparency error in soft contact
Fig. 7.2 Transparency performance in "stiff" contact
Fig. 7.2 Transparency error in stiff contact

20-May-21 | 49 Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Summary
- Finite-time convergence of the tracking errors
- Elimination of the uncertainties and the noise effect in the obtained force sensing via the new AFOB
- Robust performance against uncertainties and disturbances in the system dynamics
- Stability of the overall system and transparency performance attained simultaneously
- Fast transient response

Great transparency performance and global stability of the BHTS are achieved under the proposed control scheme despite the uncertainties and across different working conditions.


20-May-21 | 51 Presenter: Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Conclusions
- Verified an effective position-tracking controller via the AITSMC with the TDE technique under various working conditions (its uncertainties and disturbances are handled).
- Validated the significant effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance.
- Optimized the prescribed-impedance-model parameters by the IRL approach, requiring little operator information while yielding a highly effective position control.
- Achieved good transparency with simultaneous force feedback and position tracking by the new adaptive force estimation combined with separate FNTSMC schemes on each side.

20-May-21 | 52 Presenter: Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Future works
- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both master and slave sides; the computational complexity should also be kept down as the data size grows.
- Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device for the rehabilitation field with a safer solution.

20-May-21 | 53

A PRESENTATION OF DOCTORAL DISSERTATION

Presenter: Cong Phat Vo
Department of Mechanical Engineering, University of Ulsan, Korea

Thank you for attending!

Page 23: A PRESENTATION OF DOCTORAL DISSERTATION

20-May-21 | 23Presenter Cong Phat Vo

CONTENTS

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

5

6

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Adaptive Finite-Time Force Sensorless Control scheme4

1 2 3 4 5 6 7 8

20-May-21 | 24Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

3 Verified the reliability and superiority of the proposed control

demonstrated in different challenging work conditions

2

The FITSMC guarantees the finite convergence new adaptive gain and TDE

are deployed to handle online (uncertainties and chattering phenomenon)

The proposed FSOB scheme obtains

force information eliminated noise and

static friction

1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 25Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

( )[ ]

1 20

( ) ( ( )) = + +t

v

f f fe e sign e d

( ) ( )[ 1]5

5

4

ˆ

+

minus= minus

v

signsign

1 1 [ ]

1 2

[ ]

4 5

ˆ( ) ( ) ( )

ˆ ( )

v

c d f f

v

u t d t K e sign e

sign

minus minus= + + +

+ +

1ˆ ˆ ˆ( ) ( ) ( ) ( )minus minus = minus minus minusd d c e dd t d t t u t t K t t

The FITSM surface is defined

The control law can be rewritten

2

2

ˆ ˆ ˆ2ˆ

+ += e e e e e e

e

e

An adaptive switching gain law An estimated lumped disturbance Force observer based-LeD [40]

APAM system

Force based

Time-delay

Estimator

Integral Terminal

Sliding SurfaceControl law

Adaptive law

Environment

Configuration

Reference

Trajectoryu

d

ˆe

d

+

+

The proposed control

approach

2k

Plant + Environment

Force Observer

e

Proof See in Appendix 41 ndash 43

Control design

1 2 3 4 5 6 7 8

20-May-21 | 26Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

for ITSMC η1 = 3 η2 = 15 η3 = 25 κ1 = κ3 = 10

κ2 = κ4 = 2 γ1 = γ2 = 57 k1 = 2 k2 = 55 v = 911

for adaptative gain μ = 10 δ = 005 σ = 01

for case study 3 KP = 15 KI = 18 KD = 005

for other case KP = 35 KI = 28 KD = 01

Proposed control

PID control

Fig 42 Performance response in Scenario 2

Scenario 1 τd (N) = 75sin(02πt- π2)+625

Scenario 2 τd (N) = 75sin(08πt- π2)+625

Scenario 3 A multistep trajectory with a LPFrsquos bandwidth

of 02(rads)

Observersfor FSOB ω = 500 rads ξ = 0001 α1 = 20 α2 = 12

α3 = 5 α4 = 15

Fig 42 Performance response in Scenario 1

Parameters

Experimental validation

1 2 3 4 5 6 7 8

20-May-21 | 27Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Fig 43 Performance response in Scenario 3

Fig 44 Control gain

Controller PID ISMC FITSMC-TDE Proposed

Sine 01 Hz 21050 12152 09440 05158

Sine 04 Hz 32054 18045 20632 19807

Multistep 26508 23548 20618 17685

Table 41 A RMSE comparison between the tracking error performances

Experimental validation

- Improved the control performances (fast response high

accuracy and robustness)

- Guaranteed the finite convergence

- Cancelling the lumped uncertainties via TDE solution and

the switching gain in the FITSMC online

Discussion

1 2 3 4 5 6 7 8

20-May-21 | 28Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

4

6

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Model-free control for Optimal contact force in Unknown Environment5

1 2 3 4 5 6 7 8

20-May-21 | 29Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

3 Verified effective for robot interaction control in

unknown environments

2 The stability of the closed-loop control is theoretically demonstrated

by the Lyapunov approach

The RL method is used to learn on-line

the optimal desired force1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 30Presenter Cong Phat Vo

( ) ( ) ( ( ))= minus + minuse e e eF t C x t K x x t

( ( )) ( )r dZ e t F t= [ ]

1 2

( ) ( ) ( )

( ) ( )

= + +

minus minus minus +

r r r

v

e

Me t Ce t Ke t

sign F t

The force control

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The environment is modeled as

a spring-damper system [81]

The optimal impedance model The position control in task space

( ) ( ) ( )

= + f Pf f If f

t

x s K e t K e d

Task-specific Outer-loopRobot-specific Inner-loop

Precise position

control

Prescribed

Impedance Model

Environment

Configuration

APAM system

xx

d

Force controlx

rx

fe

r

Fe

Fd

ef

-

+ -+ -

+

RL-based

approximation

ϕ

Control design

1 2 3 4 5 6 7 8

( ) ( ) ( )r r de t Ae t BF t= +

20-May-21 | 31Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

LQR solution

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

( ) ( )=d e rF t e t

The discounted cost function

( ) ( )( ) ( ) ( ) ( ) ( ) ( )

= minus +T T

s r d r r d dt

J e t F t e Se F RF d

The Algebraic Riccati equation

0+ + + =T T

eA PA S P A PB

The feedback control gain

( )1

1 minus

minus= minus +T T

e B PB R B PA

IRL approximationRL solution

Implementation procedure

1 2 3 4 5 6 7 8

offline soliton

20-May-21 | 32Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

LQR solution RL solution

impossible for all time

The Hamilton-Jacobi-Bellman equation

( ) ( ) ( )min ( ) ( ) ( ) 0

minus minus + minus =

d

t

s rtF

V e t r t r e dt

The new Algebraic Riccati equation

( )( ) ( )

2 ( ) ( ) ( ) ( ) 0

T T

r r

T T

r d d d

e t A P PA S P e t

e t PBF t F t RF t

+ + minus

+ + =

The optimal desired force 1

22 21( ) ( )d rF t H H e tminus= minus

( ) ( ) ( ) ( ) ( )= = T

r d s rQ e t F t V e t X HX

IRL approximation

The discounted cost function

( ) ( )

( )

( )( ) ( ) ( )

where ( ) ( ) ( ) ( ) ( ) ( )

t

s r r dt

T T

r d r r d d

V e t r e F e d

r e F e Se F RF

minus minus=

= +

Implementation procedure

1 2 3 4 5 6 7 8

( ) ( )=d e rF t e t

online soliton using the off-policy RL to

find the solution of the given new ARE

20-May-21 | 33Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

LQR solution RL solution

minimize temporal difference error

The approximator of the Q-value function

( )( ) ( )1

ˆ N

r d i r d i

i

Q e F e F =

=

The parameter update law

11( ) exp

2

T

i r d r i d i i r i d ie F e c F c e c F c minus = minus minus minus minus minus

( )

( ) ( ) ( ) ( )

ˆ ( ) ( ) ( )1( ) ( )

( )

=

= minus +

r d

t t t t

Q e t F t tt t

t t

IRL approximation

Q-function update law based on IRL

( )

( )

( )( ) ( ) ( )

( ) ( )

t tt

r dt

t

r d

Q e t F t r e d

e Q e t t F t t

+

minus minus

minus

=

+ + +

( ) ( ) ( ) ( ) ( )T T

r r d dr t e t Se t F t RF t= +

The reward function of the utility function

Implementation procedure

1 2 3 4 5 6 7 8

1

( )( )

( )

i r di r d N

i r di

e Fe F

e F

=

=

( ) ( )=d e rF t e t

20-May-21 | 34Presenter Cong Phat Vo

Fig 51 RL in unknown environments during the

learning process

Fig 52 Learning curves

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

κe = 00025 S = 1 R = 10 T = 0001 A = minus4006

B = minus0002

υ = 09 γ = 09 δΔ = 065 ε = 01 ε-decay = 099

K = 10 β = 01

LQR solution

RL aproximation

PositionForce controlKPf = 0001 KIf = 002 λ = 02 ηΔ = 01 ρ1 = 05

ρ2 = 20 v = 911 α = 57

EnvironmentCe = 05 Nsmm-1 Ke = 2 N mm-1 r0 = 165 mm

xd = 39917 mm xe = 35 mm

Parameters

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 35Presenter Cong Phat Vo

Fig 53 Desired forceposition tracking response

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Three steps are required to complete

the experiment

1-the identification of the environment

2-obtain the ARE solution using the

environmental estimation

3-the complete experiment for the

positionforce controller

IRL approximation

Achieving near-optimal performance without

knowledge of the environment dynamics

LQR solution

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 36Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

4

5

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Optimized Human-Robot Interaction force control using IRL6

1 2 3 4 5 6 7 8

20-May-21 | 37Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

3 The unknown human dynamics as well as the optimal overall

human-robot system efficiency is verified in experimental trials

2

Find the optimal parameters of the impedance model to

assure motion tracking and also assist the human with

minimum effort to perform the task in real time

The stability of the closed-loop control is theoretically

demonstrated

1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 38Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

( )1 20

ˆ ( )

ˆ (|| || )

= + + +

+ + minus

t

s h h h

q B h h

K e e e d

K Z Z K

The control torque in task space

1 2

1 2 1

2 1 2

ˆ ˆˆ ( ) ( )

ˆ ˆˆ ˆ ˆ || ||

ˆˆ ˆ ˆ( ) || ||

=

= minus minus = minus

T T

T T T

h h h

T

h h

q W W q

W F F W q kF W

W Gq W kG W

2

h hm

d d d

K

M s C s K

=

+ +

An approximation of the continuous functionThe prescribed

impedance model

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

Control design

1 2 3 4 5 6 7 8

20-May-21 | 39Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

1

1

[ ]

minus

minus

= +

= minus

=

=

h h h h d

h e

h e p d

T T T

d d d

A B e

A k

B k

e e e

( )1

+=

+

p d

e

sG s

k s

The human transfer function [99] Purpose τh rarr ed

0

0

= + = +

= minus

q d qd e

e

h h hh e

A e Be X AX Buu

B A u KX

A new augmented state

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

IRL based on LQR solution

Control design

1 2 3 4 5 6 7 8

20-May-21 | 40Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

The performance index

( )T T T

m d d d h h h e et

J e Q e Q u Ru d

= + +

10 T TA P PA Q PB R BPminus= + + minus

The algebraic Riccati equation [83]

0

0

q d qd

e

h h hh

e

A e BeX u

B A

u KX

= = +

= minus

A new augmented state

IRL solutionRL solution

An optimal feeback control

To minimize

solve

The infinite-horizon quadratic cost

Control design

The IRL Bellman equation for time interval ΔT gt 0

Proof of convergence See in Appendix 62

1 2 3 4 5 6 7 8

( ) 1

( )

arg min ( ) ( )e

T

e eu t

u V t X t u R B PX

minus

= = minus

( )7 7( ( )) ( ) ( ) ( ( ))t T

T T T

tV X t X Q K RK X d V X t T

+

= + minus +

( )7 ( ( )) ( ) ( )T T T T

tV X t X Q K RK X d X PX

= + =

( ) ( ) ( )T TA BK P P A BK Q K RKminus + minus = minus +

where P is the real symmetric positive definite solution

( )( ) ( ) = ( ) ( )

( ) ( )

t tT T T T

t

T

X t PX t X Q K RK X d

X t t PX t t

+

+

+ + +

The IRL Bellman equation can be written as

20-May-21 | 41Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 61 Continuous-time linear policy iteration algorithm

1 Start with an admissible control input

0 0

1= minuseu K X

2 Policy evaluation given a control policy ui find the

Pi using off-policy Bellman equation

1 1

1

+ minus= minusi T i

eu R B P X

3 Policy Improvement update the control policy

Online implementation

Control design

1 2 3 4 5 6 7 8

( )( )( ) ( )= ( ) ( )

( ) ( )

t t TT i T i i T

t

T i

X t P X t X Q K RK X d

X t t P X t t

+

+

+ + +

Begin

1 2

1 2

1

( ) ( )

( ) ( ) ( )

( )

N i i i

Ti i N i

i T

t t t

K K K

p

minus

= = minus +

=

=

Solving for the cost using least squares

1|| ||i ip p minusminus

0

1 10 1 ( ) 0P i K A BK= = minus

Initialization

1 1i T i

eu R B P Xminus minus= minus

Policy update

End

No

Yes

1i irarr +

20-May-21 | 42Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Validation results (simulation)

Fig 6.2 The tracking error between the trajectory of the robot system and the prescribed impedance model
Fig 6.3 Reference trajectory versus the state of the robot system during and after learning

Parameters
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = -4.006, B = -0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: Ks = 20, F = 10, G = 1, μ1 = μ2 = 5, k = 0.1, Z = 100, κp = 20, κd = 10, ke = 0.18
- Prescribed impedance model: θd = 10 sin(0.2πt), Md = 1 kg, Bd = 7.18 N·s·mm⁻¹, Kd = 15.76 N·mm⁻¹, Kh = 0.046

1 2 3 4 5 6 7 8

20-May-21 | 43Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Validation results (experiment)

Fig 6.4 The output trajectory and impedance model in the inner loop
Fig 6.5 The desired trajectory and output trajectory in the outer loop
Fig 6.6 Interaction force

Discussion
- Near-perfect performance of the human-robot cooperation.
- A small deviation between the desired and actual trajectories.
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller.

1 2 3 4 5 6 7 8

20-May-21 | 44Presenter Cong Phat Vo

CONTENTS

1. Introduction
2. Modeling and platform setup based on the APAM configuration
3. Adaptive Gain ITSMC with TDE scheme
4. Adaptive Finite-Time Force Sensorless Control scheme
5. Model-free control for Optimal contact force in Unknown Environment
6. Optimized Human-Robot Interaction force control using IRL
7. Force Sensorless Reflecting Control for BHTS
8. Conclusion and Future works

1 2 3 4 5 6 7 8

20-May-21 | 45Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Proposed control idea
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed.
2. The stability and finite-time convergence characteristics of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified in different working conditions.

1 2 3 4 5 6 7 8

20-May-21 | 46Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Control design

The external torque estimation: an adaptive force/torque observer (AFOB) reconstructs the external torque estimate τ̂_id on each side i ∈ {m, s} from the drive input and the nominal dynamics, with an adaptive FSOB gain κ_i that switches on the sign and magnitude of the sliding variable s_i (growing while |s_i| has not converged, held otherwise).

The modified FNTSM surface design: s_i = ė_i + λ_i sign(e_i)|e_i|^(p_i/q_i), with λ_i > 0 and p_i, q_i positive odd integers.

The torque control design (FNTSMC style):
τ_i = M_i θ̈_ir + C_i θ̇_i + K_i θ_i + τ̂_id - ρ_1i sign(s_i)|s_i|^(ν_i) - ρ_2i s_i

[Block diagram: operator → master controller and estimator → APAM master system, coupled through the communication channel to the slave controller and estimator → APAM slave system → environment, with an impedance filter shaping the reflected torque estimates τ̂_md and τ̂_sd.]

1 2 3 4 5 6 7 8
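The fractional-power term sign(s)|s|^ν in the torque control law is what yields finite-time rather than merely exponential convergence. A minimal sketch, with the exponent 9/11 from the parameter slide but an illustrative gain and initial condition:

```python
import numpy as np

# sig^v(s) = sign(s)*|s|^v: the fractional-power switching term of the
# finite-time sliding-mode laws; continuous at s = 0 for v in (0, 1).
def sig(s, v):
    return np.sign(s) * np.abs(s) ** v

# Euler simulation of s' = -rho * sig(s, v): the surface reaches zero in
# finite time; analytically t* = |s0|^(1-v) / (rho*(1-v)).
rho, v, dt = 2.0, 9 / 11, 1e-4
s_val, t = 1.0, 0.0
while s_val > 1e-9 and t < 10.0:
    s_val += dt * (-rho * sig(s_val, v))
    t += dt
t_star = 1.0 ** (1 - v) / (rho * (1 - v))   # = 2.75 s for these values
assert t < 3.0                              # converged in finite time
```

With a purely linear term (-rho*s) the surface only decays exponentially and never reaches zero exactly; the fractional power makes the reaching time bounded.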

20-May-21 | 47Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Parameters
- Proposed control: λi = 8, pi = 5, qi = 7, ςi = 10⁻⁴, νi = 9/11, εi = 0.1, ρ1i = 10, ρ2i = 2
- PID control: KP = 38, KI = 25, KD = 0.15
- Observers: for NDO, Lh = Le = 10; for RTOB, β = 350 rad/s; for AFOB, κi(0) = 0.1, λi = 10, ηi = 0, m = 0.1, a = 10
- Scenarios: the free-space situation, θd(t) = 5π(1 - 0.18t)sin(0.72πt); the contact-and-recovery situation with a soft environment (Ke = 500 N·mm⁻¹, Be = 4 N·s·mm⁻¹) and a stiff environment (Ke = 2800 N·mm⁻¹, Be = 20 N·s·mm⁻¹)

Validation results

Fig 7.1 Torque estimation performances. The RMSEs associated with the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.

1 2 3 4 5 6 7 8

20-May-21 | 48Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results

Fig 7.2 Position tracking performances
Fig 7.2 Position tracking error performances
Fig 7.2 Transparency performance in "soft" contact
Fig 7.2 Transparency error in "soft" contact
Fig 7.2 Transparency performance in "stiff" contact
Fig 7.2 Transparency error in "stiff" contact

1 2 3 4 5 6 7 8

20-May-21 | 49Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Summary
- Finite-time convergence of the tracking errors.
- Elimination of uncertainties and noise effects in the obtained force sensing via the new AFOB.
- Robust performance against uncertainties and disturbances in the system dynamics.
- Stability of the overall system and transparency performance are attained simultaneously.
- Fast transient response.

Great transparency performance and global stability of the BHTS are achieved under the proposed control scheme, despite uncertainties and across different working conditions.

1 2 3 4 5 6 7 8

20-May-21 | 50Presenter Cong Phat Vo

CONTENTS

1. Introduction
2. Modeling and platform setup based on the APAM configuration
3. Adaptive Gain ITSMC with TDE scheme
4. Adaptive Finite-Time Force Sensorless Control scheme
5. Model-free control for Optimal contact force in Unknown Environment
6. Optimized Human-Robot Interaction force control using IRL
7. Force Sensorless Reflecting Control for BHTS
8. Conclusion and Future works

1 2 3 4 5 6 7 8

20-May-21 | 51Presenter Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions (uncertainties and disturbances are handled).
- Validated the significant effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding highly effective position control.
- Achieved good transparency performance, with both force feedback and position tracking simultaneously, by the new adaptive force estimation combined with separate FNTSMC schemes.

1 2 3 4 5 6 7 8

20-May-21 | 52Presenter Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Future works

- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both master and slave sides; the computational complexity should also be reduced by limiting the data size.
- Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device for the rehabilitation field with a safer solution.

1 2 3 4 5 6 7 8

20-May-21 | 53

A PRESENTATION OF DOCTORAL DISSERTATION

Presenter Cong Phat Vo

Department of Mechanical Engineering University of Ulsan Korea

Thank you for attending

1 2 3 4 5 6 7 8

Page 24: A PRESENTATION OF DOCTORAL DISSERTATION

20-May-21 | 24Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Proposed control idea
1. The proposed FSOB scheme obtains force information with noise and static friction eliminated.
2. The FITSMC guarantees finite-time convergence; a new adaptive gain and TDE are deployed to handle uncertainties and the chattering phenomenon online.
3. The reliability and superiority of the proposed control are demonstrated in different challenging working conditions.

1 2 3 4 5 6 7 8

20-May-21 | 25Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Control design

The FITSM surface is defined as
σ(t) = ė_f(t) + ∫_0^t [κ_1 e_f + κ_2 sign(e_f)|e_f|^v] dτ.

The control law can be rewritten as the sum of the TDE-based estimate of the lumped disturbance d̂(t), a feedback term on e_f and sign(e_f)|e_f|^v through the gain K, and an adaptive switching term on the surface. The estimated lumped disturbance d̂(t) is obtained by time-delay estimation from the one-sample-delayed control input u_c(t - t_d) and delayed measurements. An adaptive switching gain law adjusts the switching gain online from the surface magnitude to suppress chattering; the force observer is based on LeD [40].

[Block diagram of the proposed control approach: reference trajectory → integral terminal sliding surface → control law (fed by the adaptive law and the force-based time-delay estimator) → APAM system in contact with the environment; a force observer feeds back the estimated contact force.]

Proof: see Appendix 4.1-4.3.

1 2 3 4 5 6 7 8
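A minimal sketch of the time-delay estimation (TDE) idea behind d̂(t): the lumped disturbance one sample ago is recovered from delayed measurements and reused as the current estimate. The first-order toy plant, gains, and disturbance below are illustrative, not the APAM dynamics:

```python
import numpy as np

# TDE sketch on a toy plant x' = u + d(t): recover the disturbance one
# sample back as d_hat(t) = (x(t) - x(t - L))/L - u(t - L), then cancel it.
dt = 1e-3
d_true = lambda t: 1.5 + 0.5 * np.sin(2 * np.pi * t)   # unknown disturbance
x, x_prev, u_prev = 0.0, 0.0, 0.0
errs = []
for k in range(2000):
    t = k * dt
    d_hat = (x - x_prev) / dt - u_prev if k > 0 else 0.0
    u = -2.0 * x - d_hat            # stabilizing feedback + TDE cancellation
    x_prev, u_prev = x, u
    x += dt * (u + d_true(t))       # Euler step of the plant
    if t > 0.5:                     # after the estimator has locked on
        errs.append(abs(d_hat - d_true(t - dt)))
max_err = max(errs)                 # one-sample-delayed estimate is accurate
assert max_err < 1e-6
```

The price of TDE is a one-sample lag in the estimate, which is why it is paired here with the finite-time surface and the adaptive switching gain that absorb the residual error.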

20-May-21 | 26Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Experimental validation — parameters
- Proposed control: for ITSMC, η1 = 3, η2 = 15, η3 = 25, κ1 = κ3 = 10, κ2 = κ4 = 2, γ1 = γ2 = 5/7, k1 = 2, k2 = 55, v = 9/11; for the adaptive gain, μ = 10, δ = 0.05, σ = 0.1
- PID control: for case study 3, KP = 15, KI = 18, KD = 0.05; for the other cases, KP = 35, KI = 28, KD = 0.1
- Observers: for FSOB, ω = 500 rad/s, ξ = 0.001, α1 = 20, α2 = 12, α3 = 5, α4 = 15
- Scenario 1: τd(N) = 7.5 sin(0.2πt - π/2) + 6.25
- Scenario 2: τd(N) = 7.5 sin(0.8πt - π/2) + 6.25
- Scenario 3: a multistep trajectory with an LPF bandwidth of 0.2 rad/s

Fig 4.2 Performance response in Scenario 1
Fig 4.2 Performance response in Scenario 2

1 2 3 4 5 6 7 8

20-May-21 | 27Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Fig 4.3 Performance response in Scenario 3
Fig 4.4 Control gain

Table 4.1 RMSE comparison of the tracking error performances

  Controller     PID      ISMC     FITSMC-TDE   Proposed
  Sine 0.1 Hz    2.1050   1.2152   0.9440       0.5158
  Sine 0.4 Hz    3.2054   1.8045   2.0632       1.9807
  Multistep      2.6508   2.3548   2.0618       1.7685

Discussion
- Improved control performance (fast response, high accuracy, and robustness).
- Guaranteed finite-time convergence.
- Cancelled the lumped uncertainties online via the TDE solution and the switching gain in the FITSMC.

1 2 3 4 5 6 7 8

20-May-21 | 28Presenter Cong Phat Vo

CONTENTS

1. Introduction
2. Modeling and platform setup based on the APAM configuration
3. Adaptive Gain ITSMC with TDE scheme
4. Adaptive Finite-Time Force Sensorless Control scheme
5. Model-free control for Optimal contact force in Unknown Environment
6. Optimized Human-Robot Interaction force control using IRL
7. Force Sensorless Reflecting Control for BHTS
8. Conclusion and Future works

1 2 3 4 5 6 7 8

20-May-21 | 29Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Proposed control idea
1. The RL method is used to learn online the optimal desired force.
2. The stability of the closed-loop control is theoretically demonstrated by the Lyapunov approach.
3. The approach is verified to be effective for robot interaction control in unknown environments.

1 2 3 4 5 6 7 8

20-May-21 | 30Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Control design

The environment is modeled as a spring-damper system [81]:
F_e(t) = -C_e ẋ(t) + K_e (x_e - x(t))

The optimal impedance model (task-specific outer loop): Z(e_r(t)) = F_d(t), with tracking-error dynamics
ė_r(t) = A e_r(t) + B F_d(t).

The force control (robot-specific inner loop) enforces the impedance relation M ë_r(t) + C ė_r(t) + K e_r(t) = F_e(t), augmented with a finite-time robust term of the form ρ_1 sign(·)|·|^v + ρ_2 (·).

The position control in task space (precise position control):
x_f(t) = K_Pf e_f(t) + K_If ∫_0^t e_f(τ) dτ

[Block diagram: prescribed impedance model (with RL-based approximation of the desired force F_d) → force control → precise position control → APAM system in contact with the environment.]

1 2 3 4 5 6 7 8
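The spring-damper environment model in code form — a sketch using the values listed on this chapter's parameter slide (Ce = 0.5 N·s/mm, Ke = 2 N/mm, xe = 35 mm, decimal separators restored) and the sign convention as written:

```python
# Spring-damper environment model: the contact force depends on the
# penetration (x - xe) and the approach velocity xdot.
Ce, Ke, xe = 0.5, 2.0, 35.0     # N*s/mm, N/mm, mm (from the parameter slide)

def contact_force(x, xdot):
    """F_e(t) = -Ce*xdot(t) + Ke*(xe - x(t))"""
    return -Ce * xdot + Ke * (xe - x)

# 1 mm of penetration while still moving inward:
f = contact_force(x=36.0, xdot=0.2)
# f = -0.5*0.2 + 2*(35 - 36) = -2.1 N, i.e. the force opposes penetration
```

The RL stage that follows never uses Ce and Ke directly; it only sees e_r and F_e, which is what makes the scheme model-free with respect to the environment.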

20-May-21 | 31Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Implementation procedure — LQR solution (offline solution)

The main objective of this work is to determine the optimal desired force F_d, the minimum force that minimizes the position error e_r:
F_d(t) = κ_e e_r(t)

The discounted cost function:
J_s(e_r(t), F_d(t)) = ∫_t^∞ e^(-γ(τ-t)) (e_r^T S e_r + F_d^T R F_d) dτ

The algebraic Riccati equation:
A^T P A + S - P + A^T P B κ_e = 0

The feedback control gain:
κ_e = -(B^T P B + R)^(-1) B^T P A

1 2 3 4 5 6 7 8
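A scalar sketch of this offline solution: iterate the Riccati recursion to its fixed point P and read off κe. The plant values a, b below are illustrative, not the identified environment/robot model:

```python
# Scalar discrete-time LQR: iterate the Riccati recursion to a fixed point,
# then kappa_e = -(B'PB + R)^-1 B'PA, matching the slide's gain formula.
a, b, S, R = 0.9, 0.1, 1.0, 10.0   # illustrative plant and weights
P = 0.0
for _ in range(10000):
    P_new = a * P * a + S - (a * P * b) ** 2 / (b * P * b + R)
    if abs(P_new - P) < 1e-14:
        P = P_new
        break
    P = P_new

kappa_e = -(a * P * b) / (b * P * b + R)   # optimal force F_d = kappa_e * e_r

# The fixed point satisfies the ARE:  A'PA + S - P + A'PB*kappa_e = 0
res = a * P * a + S - P + (a * P * b) * kappa_e
assert abs(res) < 1e-10
```

The catch, stated on the next slide, is that this requires A and B (the environment model); the RL solution replaces the recursion with data.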

20-May-21 | 32Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Implementation procedure — RL solution (an online solution, using off-policy RL to find the solution of the given new ARE; solving the LQR offline for all time is impossible)

The main objective is still the optimal desired force F_d, the minimum force that minimizes the position error e_r: F_d(t) = κ_e e_r(t).

The discounted cost (value) function:
V_s(e_r(t)) = ∫_t^∞ e^(-γ(τ-t)) r(e_r(τ), F_d(τ)) dτ, where r(e_r, F_d) = e_r^T S e_r + F_d^T R F_d.

The Hamilton-Jacobi-Bellman equation:
min_{F_d} ( r(t) + d/dt V_s(e_r(t)) - γ V_s(e_r(t)) ) = 0

The new algebraic Riccati equation:
e_r^T(t) (A^T P + P A + S - γP) e_r(t) + 2 e_r^T(t) P B F_d(t) + F_d^T(t) R F_d(t) = 0

Writing the value as a quadratic form Q(e_r(t), F_d(t)) = V_s(e_r(t)) = X^T H X with X = [e_r; F_d], the optimal desired force is
F_d(t) = -H_22^(-1) H_21 e_r(t).

1 2 3 4 5 6 7 8
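The partitioned-matrix formula F_d = -H22^(-1) H21 e_r quoted above is just the minimizer of the quadratic Q-function over F_d, which can be checked by brute force. The H used here is an arbitrary positive-definite example, not a learned one:

```python
import numpy as np

# Minimizing Q(X) = X' H X over F_d, with X = [e_r; F_d], gives
# F_d = -H22^-1 H21 e_r (the slide's optimal desired force).
H = np.array([[4.0, 1.0],
              [1.0, 2.0]])          # [[H11, H12], [H21, H22]], symmetric PD
H21, H22 = H[1, 0], H[1, 1]
e_r = 0.3
F_opt = -H21 / H22 * e_r            # = -0.15 for this H

# Brute-force check: F_opt minimizes Q over a fine grid of candidate forces
Fs = np.linspace(-1.0, 1.0, 20001)
Qs = H[0, 0] * e_r**2 + 2 * H21 * e_r * Fs + H22 * Fs**2
assert abs(Fs[np.argmin(Qs)] - F_opt) < 1e-3
```

Because only the H blocks are needed, learning Q (next slide) yields the optimal force without ever identifying A or B.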

20-May-21 | 33Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Implementation procedure — IRL approximation (minimizing the temporal-difference error)

The approximator of the Q-value function:
Q̂(e_r, F_d) = Σ_{i=1}^N ϑ_i φ̄_i(e_r, F_d)

with Gaussian basis functions
φ_i(e_r, F_d) = exp( -(1/2) [e_r - c_ri, F_d - c_di] Σ_i^(-1) [e_r - c_ri, F_d - c_di]^T ),
normalized as φ̄_i(e_r, F_d) = φ_i(e_r, F_d) / Σ_{i=1}^N φ_i(e_r, F_d).

The Q-function update law based on IRL:
Q(e_r(t), F_d(t)) = ∫_t^{t+Δt} e^(-γ(τ-t)) r(τ) dτ + e^(-γΔt) Q(e_r(t+Δt), F_d(t+Δt)),

with the reward (utility) function
r(t) = e_r^T(t) S e_r(t) + F_d^T(t) R F_d(t).

The parameter update law adjusts the weights ϑ by gradient descent on the squared temporal-difference error, and the learned policy gives F_d(t) = κ_e e_r(t).

1 2 3 4 5 6 7 8
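A minimal sketch of the normalized-Gaussian approximator structure quoted above; the grid of centers, the width, and the stand-in target function are all illustrative, and batch least squares replaces the online gradient update for simplicity:

```python
import numpy as np

# Normalized-Gaussian Q-function approximator:
#   Q_hat(e_r, F_d) = sum_i w_i * phibar_i(e_r, F_d).
g = np.linspace(-1.0, 1.0, 5)
centers = np.array([(c1, c2) for c1 in g for c2 in g])   # 25 RBF centers
sigma = 0.4

def features(e_r, F_d):
    z = np.array([e_r, F_d])
    phi = np.exp(-np.sum((centers - z) ** 2, axis=1) / (2 * sigma**2))
    return phi / np.sum(phi)            # normalized basis phibar_i

rng = np.random.default_rng(0)
X = rng.uniform(-1, 1, size=(400, 2))
y = X[:, 0] ** 2 + 0.5 * X[:, 1] ** 2   # stand-in quadratic "Q-function"
Phi = np.array([features(e, f) for e, f in X])
w, *_ = np.linalg.lstsq(Phi, y, rcond=None)   # fit weights by least squares

# Check the fit on fresh points inside the center grid
Xt = rng.uniform(-0.9, 0.9, size=(200, 2))
pred = np.array([features(e, f) @ w for e, f in Xt])
err = np.max(np.abs(pred - (Xt[:, 0] ** 2 + 0.5 * Xt[:, 1] ** 2)))
assert err < 0.3
```

Normalization makes the basis a partition of unity, so a modest number of local Gaussians can represent the smooth quadratic-like Q-surface the IRL update converges to.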

20-May-21 | 34Presenter Cong Phat Vo

Fig 5.1 RL in unknown environments during the learning process
Fig 5.2 Learning curves

Parameters
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = -4.006, B = -0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position/force control: KPf = 0.001, KIf = 0.02, λ = 0.2, ηΔ = 0.1, ρ1 = 0.5, ρ2 = 20, v = 9/11, α = 5/7
- Environment: Ce = 0.5 N·s·mm⁻¹, Ke = 2 N·mm⁻¹, r0 = 16.5 mm, xd = 39.917 mm, xe = 35 mm

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 35Presenter Cong Phat Vo

Validation results

Fig 5.3 Desired force/position tracking response

Three steps are required to complete the experiment:
1. identification of the environment;
2. obtaining the ARE solution using the environmental estimate (the LQR solution);
3. the complete experiment with the position/force controller (the IRL approximation), achieving near-optimal performance without knowledge of the environment dynamics.

1 2 3 4 5 6 7 8

20-May-21 | 36Presenter Cong Phat Vo

CONTENTS

1. Introduction
2. Modeling and platform setup based on the APAM configuration
3. Adaptive Gain ITSMC with TDE scheme
4. Adaptive Finite-Time Force Sensorless Control scheme
5. Model-free control for Optimal contact force in Unknown Environment
6. Optimized Human-Robot Interaction force control using IRL
7. Force Sensorless Reflecting Control for BHTS
8. Conclusion and Future works

1 2 3 4 5 6 7 8

20-May-21 | 37Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Proposed control idea
1. The stability of the closed-loop control is theoretically demonstrated.
2. The optimal parameters of the impedance model are found to assure motion tracking and to assist the human with minimum effort in performing the task in real time.
3. The unknown human dynamics, as well as the optimal overall human-robot system efficiency, is verified in experimental trials.

1 2 3 4 5 6 7 8

20-May-21 | 38Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design

Robot-specific inner loop: the control torque in task space combines a PID-like term on the tracking error e_h (gain K_s), the NN estimate of the human torque, and robust compensation terms (gains K_q, K_B).

An approximation of the continuous (unknown) human torque is obtained with a neural network, τ̂_h = Ŵ^T ξ(q), whose weight matrix Ŵ is adjusted by adaptive update laws.

Task-specific outer loop: the prescribed impedance model
θ_m(s) = K_h / (M_d s² + B_d s + K_d) · τ_h(s).

[Block diagram: human operator → haptic interface based on APAM → torque control law with NN functional approximation; the prescribed impedance model and the IRL-based optimal law generate the reference θ_d and the gain K_h.]

1 2 3 4 5 6 7 8
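The prescribed impedance model above is a second-order filter from human torque to reference motion; a minimal simulation sketch, using the parameter values read from this chapter's parameter slide with decimal separators restored (M_d = 1, B_d = 7.18, K_d = 15.76, K_h = 0.046) and an illustrative constant torque input:

```python
# Discretized response of the prescribed impedance model
#   M_d*theta'' + B_d*theta' + K_d*theta = K_h * tau_h
Md, Bd, Kd, Kh = 1.0, 7.18, 15.76, 0.046
dt, T = 1e-3, 5.0
theta, dtheta = 0.0, 0.0
tau_h = 10.0                        # constant human torque (illustrative)
for _ in range(int(T / dt)):
    ddtheta = (Kh * tau_h - Bd * dtheta - Kd * theta) / Md
    dtheta += dt * ddtheta          # semi-implicit Euler: velocity first,
    theta += dt * dtheta            # then position (stable for this dt)
# steady state: theta -> Kh * tau_h / Kd
assert abs(theta - Kh * tau_h / Kd) < 1e-3
```

Tuning (M_d, B_d, K_d, K_h) shapes how compliantly the reference yields to the operator; finding the optimal set is exactly what the IRL outer loop does.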

20-May-21 | 39Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design — IRL based on the LQR solution

The human operator is modeled by a transfer function G_e(s) [99] from the displayed tracking error to the applied torque (purpose: τ_h → e_d); in state-space form, ė_h = A_h e_h + B_h e_d, with A_h and B_h determined by the human gains k_e, k_p, k_d.

A new augmented state X stacks the robot tracking error e_q and the human-model state e_h, giving linear dynamics Ẋ = AX + B u_e with state feedback u_e = -KX.

[Block diagram as above: human operator, haptic interface based on APAM, torque control law, NN functional approximation, prescribed impedance model, and the IRL-based optimal law.]

1 2 3 4 5 6 7 8


Page 25: A PRESENTATION OF DOCTORAL DISSERTATION

20-May-21 | 25Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

( )[ ]

1 20

( ) ( ( )) = + +t

v

f f fe e sign e d

( ) ( )[ 1]5

5

4

ˆ

+

minus= minus

v

signsign

1 1 [ ]

1 2

[ ]

4 5

ˆ( ) ( ) ( )

ˆ ( )

v

c d f f

v

u t d t K e sign e

sign

minus minus= + + +

+ +

1ˆ ˆ ˆ( ) ( ) ( ) ( )minus minus = minus minus minusd d c e dd t d t t u t t K t t

The FITSM surface is defined

The control law can be rewritten

2

2

ˆ ˆ ˆ2ˆ

+ += e e e e e e

e

e

An adaptive switching gain law An estimated lumped disturbance Force observer based-LeD [40]

APAM system

Force based

Time-delay

Estimator

Integral Terminal

Sliding SurfaceControl law

Adaptive law

Environment

Configuration

Reference

Trajectoryu

d

ˆe

d

+

+

The proposed control

approach

2k

Plant + Environment

Force Observer

e

Proof See in Appendix 41 ndash 43

Control design

1 2 3 4 5 6 7 8

20-May-21 | 26Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

for ITSMC η1 = 3 η2 = 15 η3 = 25 κ1 = κ3 = 10

κ2 = κ4 = 2 γ1 = γ2 = 57 k1 = 2 k2 = 55 v = 911

for adaptative gain μ = 10 δ = 005 σ = 01

for case study 3 KP = 15 KI = 18 KD = 005

for other case KP = 35 KI = 28 KD = 01

Proposed control

PID control

Fig 42 Performance response in Scenario 2

Scenario 1 τd (N) = 75sin(02πt- π2)+625

Scenario 2 τd (N) = 75sin(08πt- π2)+625

Scenario 3 A multistep trajectory with a LPFrsquos bandwidth

of 02(rads)

Observersfor FSOB ω = 500 rads ξ = 0001 α1 = 20 α2 = 12

α3 = 5 α4 = 15

Fig 42 Performance response in Scenario 1

Parameters

Experimental validation

1 2 3 4 5 6 7 8

20-May-21 | 27Presenter Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Fig 43 Performance response in Scenario 3

Fig 44 Control gain

Controller PID ISMC FITSMC-TDE Proposed

Sine 01 Hz 21050 12152 09440 05158

Sine 04 Hz 32054 18045 20632 19807

Multistep 26508 23548 20618 17685

Table 41 A RMSE comparison between the tracking error performances

Experimental validation

- Improved the control performances (fast response high

accuracy and robustness)

- Guaranteed the finite convergence

- Cancelling the lumped uncertainties via TDE solution and

the switching gain in the FITSMC online

Discussion

1 2 3 4 5 6 7 8

20-May-21 | 28Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

4

6

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Model-free control for Optimal contact force in Unknown Environment5

1 2 3 4 5 6 7 8

20-May-21 | 29Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

3 Verified effective for robot interaction control in

unknown environments

2 The stability of the closed-loop control is theoretically demonstrated

by the Lyapunov approach

The RL method is used to learn on-line

the optimal desired force1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 30Presenter Cong Phat Vo

( ) ( ) ( ( ))= minus + minuse e e eF t C x t K x x t

( ( )) ( )r dZ e t F t= [ ]

1 2

( ) ( ) ( )

( ) ( )

= + +

minus minus minus +

r r r

v

e

Me t Ce t Ke t

sign F t

The force control

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The environment is modeled as

a spring-damper system [81]

The optimal impedance model The position control in task space

( ) ( ) ( )

= + f Pf f If f

t

x s K e t K e d

Task-specific Outer-loopRobot-specific Inner-loop

Precise position

control

Prescribed

Impedance Model

Environment

Configuration

APAM system

xx

d

Force controlx

rx

fe

r

Fe

Fd

ef

-

+ -+ -

+

RL-based

approximation

ϕ

Control design

1 2 3 4 5 6 7 8

( ) ( ) ( )r r de t Ae t BF t= +

20-May-21 | 31Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

LQR solution

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

( ) ( )=d e rF t e t

The discounted cost function

( ) ( )( ) ( ) ( ) ( ) ( ) ( )

= minus +T T

s r d r r d dt

J e t F t e Se F RF d

The Algebraic Riccati equation

0+ + + =T T

eA PA S P A PB

The feedback control gain

( )1

1 minus

minus= minus +T T

e B PB R B PA

IRL approximationRL solution

Implementation procedure

1 2 3 4 5 6 7 8

offline soliton

20-May-21 | 32Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Implementation procedure: RL solution

The main objective is unchanged: determine the optimal desired force $F_d(t) = \kappa_e e_r(t)$ that minimizes the position error e_r. Solving the LQR problem exactly is impossible for all time when the environment model is unknown.

The discounted cost function:

$V_s\big(e_r(t)\big) = \int_t^{\infty} e^{-\gamma(\tau - t)}\, r\big(e_r(\tau), F_d(\tau)\big)\, d\tau, \quad \text{where } r(e_r, F_d) = e_r^T S e_r + F_d^T R F_d$

The Hamilton-Jacobi-Bellman equation:

$\min_{F_d} \Big( \dot{V}_s\big(e_r(t)\big) - \gamma V_s\big(e_r(t)\big) + r(t) \Big) = 0$

The new algebraic Riccati equation:

$e_r^T(t)\big(A^T P + P A + S - \gamma P\big) e_r(t) + 2 e_r^T(t) P B F_d(t) + F_d^T(t) R F_d(t) = 0$

IRL approximation: writing the quadratic value as $Q\big(e_r(t), F_d(t)\big) = V_s\big(e_r(t)\big) = X^T H X$ with $X = [e_r^T \;\; F_d^T]^T$, the optimal desired force is

$F_d(t) = -H_{22}^{-1} H_{21}\, e_r(t)$

This is an online solution: off-policy RL is used to find the solution of the given new ARE without knowledge of the environment dynamics.
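The optimal force above is simply the minimizer of the quadratic Q-function over its F_d argument. A small numerical check in the scalar case, with an arbitrary (assumed) positive-definite H:

```python
# For Q(e_r, F_d) = X' H X with X = [e_r, F_d], the minimizer over F_d is
# F_d* = -H22^{-1} H21 e_r. Verify against a brute-force grid search.
# The matrix entries below are an arbitrary positive-definite example.

H11, H12, H22 = 4.0, 1.0, 2.0   # H21 = H12 by symmetry
e_r = 3.0

def Q(F_d):
    return H11 * e_r * e_r + 2.0 * H12 * e_r * F_d + H22 * F_d * F_d

closed_form = -H12 / H22 * e_r   # -H22^{-1} H21 e_r
grid_min = min((Q(f / 1000.0), f / 1000.0) for f in range(-5000, 5001))[1]
print(closed_form, grid_min)     # both evaluate to -1.5
```

This is why learning H (rather than the environment model) is enough to recover the optimal desired force.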

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Implementation procedure: IRL approximation

The approximator of the Q-value function (normalized Gaussian radial basis functions):

$\hat{Q}(e_r, F_d) = \sum_{i=1}^{N} \vartheta_i\, \phi_i(e_r, F_d), \qquad \phi_i(e_r, F_d) = \frac{\mu_i(e_r, F_d)}{\sum_{i=1}^{N} \mu_i(e_r, F_d)}$

with Gaussian kernels

$\mu_i(e_r, F_d) = \exp\Big( -\tfrac{1}{2}\, [\,e_r - c_{ri},\; F_d - c_{di}\,]\, \Sigma_i^{-1}\, [\,e_r - c_{ri},\; F_d - c_{di}\,]^T \Big)$

The parameters $\vartheta_i$ are updated by gradient descent so as to minimize the temporal-difference error between the two sides of the Q-function update law.

Q-function update law based on IRL:

$\hat{Q}\big(e_r(t), F_d(t)\big) = \int_t^{t+\Delta t} e^{-\gamma(\tau - t)}\, r\big(e_r(\tau), F_d(\tau)\big)\, d\tau + e^{-\gamma \Delta t}\, \hat{Q}\big(e_r(t+\Delta t), F_d(t+\Delta t)\big)$

The reward function (utility):

$r(t) = e_r^T(t) S e_r(t) + F_d^T(t) R F_d(t)$

The optimal desired force remains $F_d(t) = \kappa_e e_r(t)$.

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Validation results

Fig. 5.1 RL in unknown environments during the learning process. Fig. 5.2 Learning curves.

Parameters
- LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = -4.006, B = -0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position/force control: K_Pf = 0.001, K_If = 0.02, λ = 0.2, ηΔ = 0.1, ρ1 = 0.5, ρ2 = 20, v = 9/11, α = 5/7
- Environment: C_e = 0.5 N·s·mm⁻¹, K_e = 2 N·mm⁻¹, r_0 = 16.5 mm, x_d = 39.917 mm, x_e = 35 mm

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Validation results

Fig. 5.3 Desired force/position tracking response.

Three steps are required to complete the experiment:
1. identification of the environment (LQR solution);
2. obtaining the ARE solution using the environmental estimate;
3. the complete experiment for the position/force controller.

IRL approximation: near-optimal performance is achieved without knowledge of the environment dynamics.


6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Proposed control idea

1. Find the optimal parameters of the impedance model to assure motion tracking and also assist the human with minimum effort to perform the task in real time.
2. The stability of the closed-loop control is theoretically demonstrated.
3. The unknown human dynamics, as well as the optimal overall human-robot system efficiency, are verified in experimental trials.

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design: robot-specific inner loop / task-specific outer loop

[Block diagram: Human Operator (τ_h) -> Prescribed Impedance Model -> torque control law -> haptic interface based on APAM, with an NN functional approximation block, an optimal law based on IRL, gain K_h, integrator 1/s, and signals θ_d, θ_m, e_h, ξ, e_d.]

The control torque in task space combines feedback (gain K_s) on the impedance tracking error e_h and its integral, a neural-network compensation term, and a bounding robust term (gain K_B).

An approximation of the continuous function (the unknown robot dynamics):

$\hat{Z}(q, \dot{q}) = \hat{W}^T \phi(q)$

with adaptive update laws for the weight estimates $\hat{W}_1, \hat{W}_2$ that include σ-modification terms of the form $-k F \|\cdot\| \hat{W}$ to keep the estimates bounded.

The prescribed impedance model:

$\theta_m(s) = \frac{K_h}{M_d s^2 + C_d s + K_d}\, \tau_h(s)$

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design: IRL based on the LQR solution

The human transfer function [99], whose purpose is to map τ_h -> e_d:

$G_h(s) = \frac{\kappa_p + \kappa_d s}{1 + k_e s}$

or, in state-space form,

$\dot{\tau}_h = A_h \tau_h + B_h \bar{e}_d, \qquad A_h = -k_e^{-1}, \quad B_h = k_e^{-1} [\kappa_p \;\; \kappa_d], \quad \bar{e}_d = [e_d \;\; \dot{e}_d]^T$

A new augmented state X stacks the prescribed-impedance error dynamics and the human dynamics:

$\dot{X} = A X + B u_e, \qquad u_e = -K X$

[Same block diagram as on the previous slide.]

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design: RL solution and IRL solution

The performance index (infinite-horizon quadratic cost) to minimize:

$J_m = \int_t^{\infty} \big( e_d^T Q_d e_d + \tau_h^T Q_h \tau_h + u_e^T R u_e \big)\, d\tau$

With the augmented state $\dot{X} = A X + B u_e$, the optimal feedback control is

$u_e = \arg\min_{u_e} V\big(t, X(t)\big) = -R^{-1} B^T P X$

where P is the real symmetric positive-definite solution of the algebraic Riccati equation [83]:

$0 = A^T P + P A + Q - P B R^{-1} B^T P$

For a stabilizing gain K, the value function satisfies

$V\big(X(t)\big) = \int_t^{\infty} X^T \big( Q + K^T R K \big) X \, d\tau = X^T P X$

together with the Lyapunov equation

$(A - B K)^T P + P (A - B K) = -\big( Q + K^T R K \big)$

The IRL Bellman equation for a time interval ΔT > 0:

$V\big(X(t)\big) = \int_t^{t+\Delta T} X^T \big( Q + K^T R K \big) X \, d\tau + V\big(X(t+\Delta T)\big)$

which can be written as

$X^T(t) P X(t) = \int_t^{t+\Delta T} X^T \big( Q + K^T R K \big) X \, d\tau + X^T(t+\Delta T)\, P\, X(t+\Delta T)$

Proof of convergence: see Appendix 6.2.

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Online implementation

Fig. 6.1 Continuous-time linear policy iteration algorithm:

1. Initialization: start with an admissible control input $u_e^0 = -K^1 X$, with $P^0 = 0$, $i = 1$, and $K^1$ chosen so that $A - B K^1$ is stable.
2. Policy evaluation: given the control policy $u^i$, find $P^i$ from the off-policy IRL Bellman equation

$X^T(t) P^i X(t) = \int_t^{t+\Delta T} X^T \big( Q + (K^i)^T R K^i \big) X \, d\tau + X^T(t+\Delta T)\, P^i\, X(t+\Delta T)$

solving for the cost parameters $p^i$ by least squares over measured trajectory samples.
3. Policy improvement: update the control policy

$u_e^{i+1} = -R^{-1} B^T P^i X$

then set i -> i + 1 and repeat until $\| p^i - p^{i-1} \|$ falls below a tolerance.
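Steps 2 and 3 above are the classical policy iteration of Kleinman; a model-based scalar sketch is below, where the Lyapunov solve stands in for the trajectory least-squares step of the IRL version. The system values are illustrative assumptions:

```python
# Kleinman policy iteration for a scalar continuous-time LQR problem:
# repeat   (A - B*K) P + P (A - B*K) = -(Q + K^2 R)   ->   K = B*P/R.
# The iterates converge to the ARE solution 2AP + Q - P^2 B^2/R = 0.
# All values illustrative.

def policy_iteration(A, B, Q, R, K0, iters=50):
    K = K0                                   # must be stabilizing: A - B*K0 < 0
    for _ in range(iters):
        Ac = A - B * K
        P = (Q + K * K * R) / (-2.0 * Ac)    # scalar Lyapunov equation solve
        K = B * P / R                        # policy improvement
    return P, K

P, K = policy_iteration(A=1.0, B=1.0, Q=1.0, R=1.0, K0=2.0)
residual = 2.0 * P + 1.0 - P * P             # ARE residual: 2AP + Q - P^2 B^2/R
print(abs(residual) < 1e-9, 1.0 - K < 0)     # converged and stabilizing
```

For A = B = Q = R = 1 the ARE solution is P = 1 + sqrt(2), which the iteration reaches in a handful of steps.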

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Validation results: simulation

Fig. 6.2 The tracking error between the trajectory of the robot system and the prescribed impedance model. Fig. 6.3 Reference trajectory versus the state of the robot system during and after learning.

Parameters
- LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = -4.006, B = -0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: K_s = 20, F = 10, G = 1, μ1 = μ2 = 5, k = 0.1, Z̄ = 100, κ_p = 20, κ_d = 10, k_e = 0.18 s
- θ_d = 10 sin(0.2πt), M_d = 1 kg, B_d = 7.18 N·s·mm⁻¹, K_d = 15.76 N·mm⁻¹, K_h = 0.046

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Validation results: experiment

Fig. 6.4 The output trajectory and impedance model in the inner loop. Fig. 6.5 The desired trajectory and output trajectory in the outer loop. Fig. 6.6 Interaction force.

Discussion
- Near-perfect performance of the human-robot cooperation.
- Only a small deviation between the desired and actual trajectory.
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller.


7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Proposed control idea

1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed.
2. The stability and finite-time convergence of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified in different working conditions.

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Control design

[Block diagram: Operator -> Master Controller -> APAM Master system, and Environment -> Slave Controller -> APAM Slave system, each side with a torque Estimator, connected through an impedance filter; the estimated torques τ̂_md and τ̂_sd are exchanged between the master and slave sides.]

The external torque estimation for each side i ∈ {m, s} uses a second-order force-sensorless observer with an adaptive gain κ_i that switches on the observer error s_i and its sign, so the gain grows only while the estimation error remains outside a small bound.

The modified FNTSM surface is a fast nonsingular terminal sliding surface $s_i = \dot{e}_i + \lambda_i\, \varphi(e_i)$, where φ is a piecewise function of e_i (switching to a quadratic form near the origin) that avoids the terminal-sliding singularity.

The torque control design in the FNTSMC style:

$\tau_i = M_i \ddot{\theta}_{ir} + C_i \dot{\theta}_i + K_i \theta_i - \hat{\tau}_{id} - \rho_{1i}\, \mathrm{sign}(s_i)\, |s_i|^{\nu_i} - \rho_{2i}\, s_i$
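The sliding surface and the fast finite-time reaching law above can be illustrated with a minimal simulation on a double integrator; all gains and exponents here are illustrative assumptions, not the thesis values:

```python
# Reaching-law sketch: surface s = de + lam*e, driven by
#   ds = -rho1*sign(s)*|s|**v - rho2*s
# enforced through the control u of a double integrator dde = u.
# Gains and the exponent v are illustrative assumptions.

def simulate_smc(e=1.0, de=0.0, lam=2.0, rho1=2.0, rho2=3.0, v=9/11,
                 dt=1e-4, steps=50000):
    for _ in range(steps):
        s = de + lam * e
        sgn = (s > 0) - (s < 0)
        # choose u so that ds = u + lam*de follows the reaching law
        u = -lam * de - rho1 * sgn * abs(s) ** v - rho2 * s
        e += de * dt
        de += u * dt
    return e, de

e_end, de_end = simulate_smc()
print(abs(e_end) < 1e-2 and abs(de_end) < 1e-2)
```

Because the switching term scales with |s|^v, the control stays continuous at the surface, which is the practical reason this class of reaching laws reduces chattering compared with a pure sign(s) term.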

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results

Fig. 7.1 Torque estimation performances.

Parameters
- Proposed control: λ_i = 8, p_i = 5, q_i = 7, ς_i = 10⁻⁴, ν_i = 9/11, ε_i = 0.1, ρ_1i = 10, ρ_2i = 2
- PID control: K_P = 38, K_I = 25, K_D = 0.15
- Observers: for the NDO, L_h = L_e = 10; for the RTOB, β = 350 rad/s; for the AFOB, κ_i(0) = 0.1, λ_i = 10, η_i = 0, m = 0.1, a = 10

Scenarios
- Free space: θ_d(t) = 5π(1 - 0.18t) sin(0.72πt)
- Contact and recovery: soft environment K_e = 500 N·mm⁻¹, B_e = 4 N·s·mm⁻¹; stiff environment K_e = 2800 N·mm⁻¹, B_e = 20 N·s·mm⁻¹

The RMSEs associated with the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results

Fig. 7.2 Position tracking performances; position tracking error performances; transparency performance in "soft" contact; transparency error in soft contact; transparency performance in "stiff" contact; transparency error in stiff contact.

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Summary
- Finite-time convergence of the tracking errors.
- Elimination of the uncertainties and the noise effect in the obtained force sensing via the new AFOB.
- Robust performance against uncertainties and disturbances in the system dynamics.
- Stability of the overall system and transparency performance are attained simultaneously.
- Fast transient response.

Great transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and in different working conditions.


8 CONCLUSION AND FUTURE WORKS

Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions; the system's uncertainties and disturbances are handled.
- Validated the effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding highly effective position control.
- Achieved good transparency performance, with both force feedback and position tracking attained simultaneously, by the new adaptive force estimation combined with separate FNTSMC schemes.

8 CONCLUSION AND FUTURE WORKS

Future works
- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithms can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both the master and slave sides; the computational complexity should also be reduced with respect to the data size.
- Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device for the rehabilitation field with a safer solution.

20-May-21 | 53

A PRESENTATION OF DOCTORAL DISSERTATION

Presenter: Cong Phat Vo
Department of Mechanical Engineering, University of Ulsan, Korea

Thank you for attending.


5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The environment is modeled as

a spring-damper system [81]

The optimal impedance model The position control in task space

( ) ( ) ( )

= + f Pf f If f

t

x s K e t K e d

Task-specific Outer-loopRobot-specific Inner-loop

Precise position

control

Prescribed

Impedance Model

Environment

Configuration

APAM system

xx

d

Force controlx

rx

fe

r

Fe

Fd

ef

-

+ -+ -

+

RL-based

approximation

ϕ

Control design

1 2 3 4 5 6 7 8

( ) ( ) ( )r r de t Ae t BF t= +

20-May-21 | 31Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

LQR solution

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

( ) ( )=d e rF t e t

The discounted cost function

( ) ( )( ) ( ) ( ) ( ) ( ) ( )

= minus +T T

s r d r r d dt

J e t F t e Se F RF d

The Algebraic Riccati equation

0+ + + =T T

eA PA S P A PB

The feedback control gain

( )1

1 minus

minus= minus +T T

e B PB R B PA

IRL approximationRL solution

Implementation procedure

1 2 3 4 5 6 7 8

offline soliton

20-May-21 | 32Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

LQR solution RL solution

impossible for all time

The Hamilton-Jacobi-Bellman equation

( ) ( ) ( )min ( ) ( ) ( ) 0

minus minus + minus =

d

t

s rtF

V e t r t r e dt

The new Algebraic Riccati equation

( )( ) ( )

2 ( ) ( ) ( ) ( ) 0

T T

r r

T T

r d d d

e t A P PA S P e t

e t PBF t F t RF t

+ + minus

+ + =

The optimal desired force 1

22 21( ) ( )d rF t H H e tminus= minus

( ) ( ) ( ) ( ) ( )= = T

r d s rQ e t F t V e t X HX

IRL approximation

The discounted cost function

( ) ( )

( )

( )( ) ( ) ( )

where ( ) ( ) ( ) ( ) ( ) ( )

t

s r r dt

T T

r d r r d d

V e t r e F e d

r e F e Se F RF

minus minus=

= +

Implementation procedure

1 2 3 4 5 6 7 8

( ) ( )=d e rF t e t

online soliton using the off-policy RL to

find the solution of the given new ARE

20-May-21 | 33Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

LQR solution RL solution

minimize temporal difference error

The approximator of the Q-value function

( )( ) ( )1

ˆ N

r d i r d i

i

Q e F e F =

=

The parameter update law

11( ) exp

2

T

i r d r i d i i r i d ie F e c F c e c F c minus = minus minus minus minus minus

( )

( ) ( ) ( ) ( )

ˆ ( ) ( ) ( )1( ) ( )

( )

=

= minus +

r d

t t t t

Q e t F t tt t

t t

IRL approximation

Q-function update law based on IRL

( )

( )

( )( ) ( ) ( )

( ) ( )

t tt

r dt

t

r d

Q e t F t r e d

e Q e t t F t t

+

minus minus

minus

=

+ + +

( ) ( ) ( ) ( ) ( )T T

r r d dr t e t Se t F t RF t= +

The reward function of the utility function

Implementation procedure

1 2 3 4 5 6 7 8

1

( )( )

( )

i r di r d N

i r di

e Fe F

e F

=

=

( ) ( )=d e rF t e t

20-May-21 | 34Presenter Cong Phat Vo

Fig 51 RL in unknown environments during the

learning process

Fig 52 Learning curves

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

κe = 00025 S = 1 R = 10 T = 0001 A = minus4006

B = minus0002

υ = 09 γ = 09 δΔ = 065 ε = 01 ε-decay = 099

K = 10 β = 01

LQR solution

RL aproximation

PositionForce controlKPf = 0001 KIf = 002 λ = 02 ηΔ = 01 ρ1 = 05

ρ2 = 20 v = 911 α = 57

EnvironmentCe = 05 Nsmm-1 Ke = 2 N mm-1 r0 = 165 mm

xd = 39917 mm xe = 35 mm

Parameters

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 35Presenter Cong Phat Vo

Fig 53 Desired forceposition tracking response

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Three steps are required to complete

the experiment

1-the identification of the environment

2-obtain the ARE solution using the

environmental estimation

3-the complete experiment for the

positionforce controller

IRL approximation

Achieving near-optimal performance without

knowledge of the environment dynamics

LQR solution

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 36Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

4

5

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Optimized Human-Robot Interaction force control using IRL6

1 2 3 4 5 6 7 8

20-May-21 | 37Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

3 The unknown human dynamics as well as the optimal overall

human-robot system efficiency is verified in experimental trials

2

Find the optimal parameters of the impedance model to

assure motion tracking and also assist the human with

minimum effort to perform the task in real time

The stability of the closed-loop control is theoretically

demonstrated

1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 38Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

( )1 20

ˆ ( )

ˆ (|| || )

= + + +

+ + minus

t

s h h h

q B h h

K e e e d

K Z Z K

The control torque in task space

1 2

1 2 1

2 1 2

ˆ ˆˆ ( ) ( )

ˆ ˆˆ ˆ ˆ || ||

ˆˆ ˆ ˆ( ) || ||

=

= minus minus = minus

T T

T T T

h h h

T

h h

q W W q

W F F W q kF W

W Gq W kG W

2

h hm

d d d

K

M s C s K

=

+ +

An approximation of the continuous functionThe prescribed

impedance model

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

Control design

1 2 3 4 5 6 7 8

20-May-21 | 39Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

1

1

[ ]

minus

minus

= +

= minus

=

=

h h h h d

h e

h e p d

T T T

d d d

A B e

A k

B k

e e e

( )1

+=

+

p d

e

sG s

k s

The human transfer function [99] Purpose τh rarr ed

0

0

= + = +

= minus

q d qd e

e

h h hh e

A e Be X AX Buu

B A u KX

A new augmented state

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

IRL based on LQR solution

Control design

1 2 3 4 5 6 7 8

20-May-21 | 40Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

The performance index

( )T T T

m d d d h h h e et

J e Q e Q u Ru d

= + +

10 T TA P PA Q PB R BPminus= + + minus

The algebraic Riccati equation [83]

0

0

q d qd

e

h h hh

e

A e BeX u

B A

u KX

= = +

= minus

A new augmented state

IRL solutionRL solution

An optimal feeback control

To minimize

solve

The infinite-horizon quadratic cost

Control design

The IRL Bellman equation for time interval ΔT gt 0

Proof of convergence See in Appendix 62

1 2 3 4 5 6 7 8

( ) 1

( )

arg min ( ) ( )e

T

e eu t

u V t X t u R B PX

minus

= = minus

( )7 7( ( )) ( ) ( ) ( ( ))t T

T T T

tV X t X Q K RK X d V X t T

+

= + minus +

( )7 ( ( )) ( ) ( )T T T T

tV X t X Q K RK X d X PX

= + =

( ) ( ) ( )T TA BK P P A BK Q K RKminus + minus = minus +

where P is the real symmetric positive definite solution

( )( ) ( ) = ( ) ( )

( ) ( )

t tT T T T

t

T

X t PX t X Q K RK X d

X t t PX t t

+

+

+ + +

The IRL Bellman equation can be written as

20-May-21 | 41Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 61 Continuous-time linear policy iteration algorithm

1 Start with an admissible control input

0 0

1= minuseu K X

2 Policy evaluation given a control policy ui find the

Pi using off-policy Bellman equation

1 1

1

+ minus= minusi T i

eu R B P X

3 Policy Improvement update the control policy

Online implementation

Control design

1 2 3 4 5 6 7 8

( )( )( ) ( )= ( ) ( )

( ) ( )

t t TT i T i i T

t

T i

X t P X t X Q K RK X d

X t t P X t t

+

+

+ + +

Begin

1 2

1 2

1

( ) ( )

( ) ( ) ( )

( )

N i i i

Ti i N i

i T

t t t

K K K

p

minus

= = minus +

=

=

Solving for the cost using least squares

1|| ||i ip p minusminus

0

1 10 1 ( ) 0P i K A BK= = minus

Initialization

1 1i T i

eu R B P Xminus minus= minus

Policy update

End

No

Yes

1i irarr +

20-May-21 | 42Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 62 The tracking error of the trajectory of the robot system and

the prescribed impedance model

Fig 63 Reference trajectory versus the state of the robot systems

during and after learning

κe = 00025 S = 1 R = 10 T = 0001

A = minus4006 B = minus0002

υ = 09 γ = 09 δΔ = 065 ε = 01

ε-decay = 099 K = 10 β = 01

LQR solution

RL aproximation

Position controlKs = 20 F = 10 G = 1 μ1 = μ2 = 5 k = 01

= 100 κp = 20 κd = 10 and ke = 018 s

θd = 10sin(02πt) Md = 1 Kg Bd = 718

Nsmm-1 Kd =1576 Nmm-1 Kh = 0046

Parameters

Z

Simulation

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 43Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 64 The output trajectory and impedance model in the inner loop

Fig 65 The desired trajectory and output trajectory the outer loop

Fig 66 Interaction force

Perfectly performance of the human-

robot cooperation

A little deviation between the desired

and actual trajectory

The human interaction force is reduced

after learning and the optimal set of the

prescribed impedance model is found by

the outer-loop controller

Experiment

Validation results

Discussion

1 2 3 4 5 6 7 8

20-May-21 | 44Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Conclusion and Future works

Introduction

7

4

5

6

C

H

A

P

T

E

R

1

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Force Sensorless Reflecting Control for BHTS7

1 2 3 4 5 6 7 8

20-May-21 | 45Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

3 The effectiveness of the proposed control method is verified in

the different working conditions

2The stability and finite-time convergence characteristic of the closed-

loop control are theoretically analyzed by the Lyapunov approach

The first time the hybrid control based

on a fast finite-time NTSMC and AFOB1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 46Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

( )1 2id = minus +

1 1

2

= minus + = minus +

id

T

i i id i

T

i ia = minus

The external torque estimation

[ ]

1 2ˆ( ) = + + minus minus minus +iv

i i ir i i i i i i i i i i idM C K sign s s s

( )i i i is = +

[ ]

2

1 2

if 0 or 0 and ( )

( ) if 0 and

i

i i i i i i

i i

i i i i i i i i

s s

z z sign s

= =

+

An adaptive FSOB gain

where

where

The torque control design by style-FNTSMC

The modified FNTSM surface design

Master Slave

Environment

APAM Slave

system

Slave

Controller

Estimator

Oper

ator

Master

Controller

APAM Master

system

Estimator

Impendence

filter

+

++

-

+

-

+

+md

m s

s

sd

ˆsdˆmd

mf

m

d

Control design

1 2 3 4 5 6 7 8

20-May-21 | 47Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig 71 Torque estimation performances

λi = 8 pi = 5 qi = 7 ςi = 10-4 νi = 911

εi = 01 ρ1i = 10 ρ2i = 2

Proposed control

PID control KP = 38 KI = 25 KD = 015

Observersfor NDO Lh = Le = 10 for RTOB β = 350 rads

for AFOB κi(0) = 01 λi = 10 ηi = 0 m = 01

a = 10

RMSEs associated with NDO RTOB and proposed algorithm

are calculated of 0211 0192 and 0076 respectively

Parameters

Scenarios- The free space situation θd (t) = 5π(1 - 018t)sin(072πt)

- The contact and recovery situation soft environment Ke = 500 Nmm-1 Be = 4 Nsmm-1

stiff environment Ke = 2800 Nmm-1 Be = 20 Nsmm-1

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 48Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig 72 Position tracking performances

Fig 72 Position tracking error performances

Fig 72 Transparency performance in ldquosoftrdquo

Fig 72 Transparency error in soft contract

Fig 72 Transparency performance in ldquostiffrdquo

Fig 72 Transparency error in stiff contract

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 49Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Finite-time convergence of the

tracking errors

Eliminating the uncertainties the

noise effect in the obtained force

sensing via new AFOB

Robust performance against uncertainties

and disturbances in the system dynamics

Stability of the overall system and the

transparency performance can be

attained simultaneously

Fast transient response

Summary

The great transparency performance and global stability of the BHTS under the proposed

control scheme are achieved despite the uncertainties and in different working conditions

1 2 3 4 5 6 7 8

20-May-21 | 50Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL


Page 27: A PRESENTATION OF DOCTORAL DISSERTATION

20-May-21 | 27 | Presenter: Cong Phat Vo

4 ADAPTIVE FINITE-TIME FORCE SENSORLESS CONTROL SCHEME

Experimental validation

Fig 4.3 Performance response in Scenario 3
Fig 4.4 Control gain

Table 4.1 RMSE comparison of the tracking-error performances

  Input          PID      ISMC     FITSMC-TDE   Proposed
  Sine 0.1 Hz    2.1050   1.2152   0.9440       0.5158
  Sine 0.4 Hz    3.2054   1.8045   2.0632       1.9807
  Multistep      2.6508   2.3548   2.0618       1.7685

Discussion
- Improved control performance (fast response, high accuracy, and robustness)
- Guaranteed finite-time convergence
- Cancelled the lumped uncertainties via the TDE solution and the online switching gain of the FITSMC
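The RMSE figures in Table 4.1 summarize tracking accuracy over each test signal. As a minimal sketch of how such a comparison is computed (the signals below are hypothetical, not the experimental data):

```python
import numpy as np

def rmse(reference, actual):
    """Root-mean-square error between a reference and a measured trajectory."""
    reference = np.asarray(reference, dtype=float)
    actual = np.asarray(actual, dtype=float)
    return float(np.sqrt(np.mean((reference - actual) ** 2)))

# Hypothetical 0.1 Hz sine reference and a slightly lagged response.
t = np.linspace(0.0, 10.0, 1001)
ref = np.sin(2 * np.pi * 0.1 * t)
act = np.sin(2 * np.pi * 0.1 * (t - 0.05))
err = rmse(ref, act)   # a small positive number for a small lag
```

A lower RMSE over the same input, as in the "Proposed" column, indicates tighter tracking.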

20-May-21 | 28 | Presenter: Cong Phat Vo

CONTENTS

CHAPTER 1  Introduction
CHAPTER 2  Modeling and platform setup based on the APAM configuration
CHAPTER 3  Adaptive Gain ITSMC with TDE scheme
CHAPTER 4  Adaptive Finite-Time Force Sensorless Control scheme
CHAPTER 5  Model-free control for Optimal contact force in Unknown Environment
CHAPTER 6  Optimized Human-Robot Interaction force control using IRL
CHAPTER 7  Force Sensorless Reflecting Control for BHTS
CHAPTER 8  Conclusion and Future works

20-May-21 | 29 | Presenter: Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Proposed control idea
1. The RL method is used to learn the optimal desired force online
2. The stability of the closed-loop control is theoretically demonstrated by the Lyapunov approach
3. The effectiveness for robot interaction control in unknown environments is verified

20-May-21 | 30 | Presenter: Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Control design

The environment is modeled as a spring-damper system [81]:
  F_e(t) = -C_e ẋ(t) + K_e (x_e - x(t))

The optimal impedance model (tracking-error dynamics), with Z(e_r(t)) = F_d(t):
  ė_r(t) = A e_r(t) + B F_d(t)

The force control:
  M ë_r(t) + C ė_r(t) + K e_r(t) = -ρ_1 s - ρ_2 |s|^v sign(s) + F_e(t)

The position control in task space:
  x_f(t) = K_Pf e_f(t) + K_If ∫_0^t e_f(τ) dτ

[Block diagram: a robot-specific inner loop (force control, precise position control, APAM system) and a task-specific outer loop (prescribed impedance model, RL-based approximation, environment configuration)]
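The spring-damper environment model is simple to exercise numerically. A minimal sketch using the gains Ce = 0.5 and Ke = 2 from the parameter slide; gating the force to zero before contact is an added assumption for illustration:

```python
# Spring-damper environment reaction force, following the slide's model
# F_e(t) = -C_e * xdot(t) + K_e * (x_e - x(t)).
C_e, K_e = 0.5, 2.0          # N*s/mm and N/mm, from the parameter slide
x_e = 35.0                   # environment surface location in mm (parameter slide)

def contact_force(x, xdot):
    """Reaction force; assumed zero while the tool has not reached x_e."""
    if x < x_e:
        return 0.0
    return -C_e * xdot + K_e * (x_e - x)

f_free = contact_force(30.0, 1.0)   # before contact: no force
f_push = contact_force(36.0, 0.0)   # 1 mm static penetration: -K_e * 1
```

The sign convention matches the slide: deeper penetration (x > x_e) and faster approach both make F_e more negative, i.e. the environment pushes back.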

20-May-21 | 31 | Presenter: Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Implementation procedure - LQR solution (offline solution)

The main objective of this work is to determine the optimal desired force F_d, i.e. the minimum force that minimizes the position error e_r:
  F_d(t) = κ_e e_r(t)

The discounted cost function:
  J_s(e_r(t), F_d(t)) = ∫_t^∞ e^{-γ(τ-t)} (e_r^T S e_r + F_d^T R F_d) dτ

The algebraic Riccati equation:
  A^T P A + S - γP + A^T P B κ_e = 0

The feedback control gain:
  κ_e = -(B^T P B + R)^{-1} B^T P A
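The Riccati equation and gain above follow the discrete-time LQR pattern. A minimal numpy sketch of the undiscounted variant, with illustrative scalar A and B (not the identified plant) and the weights S = 1, R = 10; the fixed-point iteration stands in for a library ARE solver:

```python
import numpy as np

# Illustrative stable scalar error model e_r(k+1) = A e_r(k) + B F_d(k).
A = np.array([[0.95]])
B = np.array([[0.1]])
S = np.array([[1.0]])    # state weight, as on the slide
R = np.array([[10.0]])   # force weight, as on the slide

def solve_dare(A, B, S, R, iters=10000, tol=1e-12):
    """Fixed-point iteration on the discrete algebraic Riccati equation."""
    P = S.copy()
    for _ in range(iters):
        K = np.linalg.solve(B.T @ P @ B + R, B.T @ P @ A)
        P_next = A.T @ P @ A + S - A.T @ P @ B @ K
        if np.max(np.abs(P_next - P)) < tol:
            return P_next
        P = P_next
    return P

P = solve_dare(A, B, S, R)
kappa_e = -np.linalg.solve(B.T @ P @ B + R, B.T @ P @ A)  # feedback gain
# The closed loop A + B*kappa_e must be stable (spectral radius < 1).
```

This offline solution requires exact knowledge of A and B, which motivates the RL and IRL alternatives on the next slides.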

20-May-21 | 32 | Presenter: Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Implementation procedure - RL solution (an exact LQR solution is impossible for all time)

The objective is still the optimal desired force F_d(t) = κ_e e_r(t) that minimizes the position error e_r.

The discounted cost (value) function:
  V_s(e_r(t)) = ∫_t^∞ e^{-γ(τ-t)} r(e_r(τ), F_d(τ)) dτ,
  where r(e_r, F_d) = e_r^T S e_r + F_d^T R F_d

The Hamilton-Jacobi-Bellman equation:
  min_{F_d} [ r(t) + V̇_s(e_r(t)) - γ V_s(e_r(t)) ] = 0

The new algebraic Riccati equation:
  e_r^T(t) (A^T P + P A + S - γP) e_r(t) + 2 e_r^T(t) P B F_d(t) + F_d^T(t) R F_d(t) = 0

The Q-value function, with X = [e_r^T  F_d^T]^T:
  Q(e_r(t), F_d(t)) = V_s(e_r(t)) = X^T H X

The optimal desired force:
  F_d(t) = -H_22^{-1} H_21 e_r(t)

This is an online solution: the off-policy RL finds the solution of the given new ARE.

20-May-21 | 33 | Presenter: Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Implementation procedure - IRL approximation (minimize the temporal-difference error)

The approximator of the Q-value function:
  Q̂(e_r, F_d) = Σ_{i=1}^N θ_i φ_i(e_r, F_d)

with Gaussian basis functions, normalized over the N activations:
  φ_i(e_r, F_d) = exp( -1/2 [e_r - c_ri, F_d - c_di] Σ_i^{-1} [e_r - c_ri, F_d - c_di]^T )
  φ̄_i(e_r, F_d) = φ_i(e_r, F_d) / Σ_{i=1}^N φ_i(e_r, F_d)

The Q-function update law based on IRL:
  Q(e_r(t), F_d(t)) = ∫_t^{t+Δt} e^{-γ(τ-t)} r(e_r(τ), F_d(τ)) dτ + e^{-γΔt} Q(e_r(t+Δt), F_d(t+Δt))

The reward (utility) function:
  r(t) = e_r^T(t) S e_r(t) + F_d^T(t) R F_d(t)

The parameter update law (normalized gradient descent on the temporal-difference error δ(t)):
  θ(t+Δt) = θ(t) - α φ(t) δ(t) / (1 + φ^T(t) φ(t))^2

The optimal desired force: F_d(t) = κ_e e_r(t)
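The Gaussian-basis approximator and its normalized-gradient update can be sketched as follows. The centers, width, learning rate, and the fixed training target are all hypothetical; the point is only that the temporal-difference-style error shrinks under repeated updates:

```python
import numpy as np

# Hypothetical RBF centers (c_r, c_d) and width for the Q-value approximator
# Q_hat(e_r, F_d) = sum_i theta_i * phi_i(e_r, F_d).
centers = np.array([[-1.0, -1.0], [0.0, 0.0], [1.0, 1.0]])
sigma = 1.0
theta = np.zeros(len(centers))

def features(e_r, F_d):
    x = np.array([e_r, F_d])
    phi = np.exp(-0.5 * np.sum((x - centers) ** 2, axis=1) / sigma**2)
    return phi / phi.sum()           # normalized activations, as on the slide

def q_hat(e_r, F_d):
    return float(theta @ features(e_r, F_d))

def td_update(e_r, F_d, target, alpha=0.5):
    """Normalized-gradient step that shrinks the error Q_hat - target."""
    global theta
    phi = features(e_r, F_d)
    delta = q_hat(e_r, F_d) - target
    theta -= alpha * delta * phi / (1.0 + phi @ phi) ** 2

before = abs(q_hat(0.0, 0.0) - 2.0)
for _ in range(2000):
    td_update(0.0, 0.0, target=2.0)  # target stands in for the IRL Bellman RHS
after = abs(q_hat(0.0, 0.0) - 2.0)
```

In the actual scheme the target is not a constant but the integral-plus-discounted-successor value from the IRL Q-function update law.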

20-May-21 | 34 | Presenter: Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Validation results

Fig 5.1 RL in unknown environments during the learning process
Fig 5.2 Learning curves

Parameters
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = -4.006, B = -0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position/Force control: KPf = 0.001, KIf = 0.02, λ = 0.2, ηΔ = 0.1, ρ1 = 0.5, ρ2 = 20, v = 9/11, α = 5/7
- Environment: Ce = 0.5 N·s·mm⁻¹, Ke = 2 N·mm⁻¹, r0 = 165 mm, xd = 39.917 mm, xe = 35 mm

20-May-21 | 35 | Presenter: Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Validation results

Fig 5.3 Desired force/position tracking response

Three steps are required to complete the experiment:
1. identification of the environment;
2. obtaining the ARE solution (LQR solution) using the environmental estimate;
3. the complete experiment for the position/force controller (IRL approximation).

Near-optimal performance is achieved without knowledge of the environment dynamics.

20-May-21 | 36 | Presenter: Cong Phat Vo

CONTENTS

CHAPTER 1  Introduction
CHAPTER 2  Modeling and platform setup based on the APAM configuration
CHAPTER 3  Adaptive Gain ITSMC with TDE scheme
CHAPTER 4  Adaptive Finite-Time Force Sensorless Control scheme
CHAPTER 5  Model-free control for Optimal contact force in Unknown Environment
CHAPTER 6  Optimized Human-Robot Interaction force control using IRL
CHAPTER 7  Force Sensorless Reflecting Control for BHTS
CHAPTER 8  Conclusion and Future works

20-May-21 | 37 | Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Proposed control idea
1. Find the optimal parameters of the impedance model to assure motion tracking and assist the human in performing the task in real time with minimum effort
2. The stability of the closed-loop control is theoretically demonstrated
3. The handling of unknown human dynamics and the optimal overall human-robot system efficiency are verified in experimental trials

20-May-21 | 38 | Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design (robot-specific inner loop / task-specific outer loop)

The control torque in task space combines a feedback term
  K_s ( ė_h + μ_1 e_h + μ_2 ∫_0^t e_h(τ) dτ ),
the NN compensation q̂_B, and a robust term driven by ||Ẑ_1|| and ||Ẑ_2||.

An approximation of the continuous function by an NN, with adaptation laws of the form (F, G, k positive gains):
  q̂ = Ŵ^T σ(q)
  Ŵ̇_1 = F σ e_h^T - k F ||e_h|| Ŵ_1
  Ŵ̇_2 = G q e_h^T - k G ||e_h|| Ŵ_2

The prescribed impedance model:
  θ_m(s) / τ_h(s) = K_h / (M_d s² + C_d s + K_d)

[Block diagram: human operator → torque control law → haptic interface based-APAM, with the prescribed impedance model, the NN functional approximation, and the optimal law-based IRL (gain K_h) closing the loop on e_h and e_d]

20-May-21 | 39 | Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design - IRL based on the LQR solution

The human transfer function [99] (purpose: map τ_h → e_d):
  G(s) = (k_p + k_d s) / (k_e s + 1)

In state-space form, with e_d = [e_d^T  ė_d^T]^T:
  τ̇_h = A_h τ_h + B_h e_d,  A_h = -k_e^{-1},  B_h = k_e^{-1} [k_p  k_d]

A new augmented state X combines the impedance-error dynamics and the human dynamics:
  Ẋ = [ A_q  B_qd ; 0  A_h ] X + [ B_e ; 0 ] u_e = A X + B u_e,  u_e = -K X
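The first-order human model G(s) = (k_p + k_d s)/(k_e s + 1) can be simulated directly. A minimal Euler-integration sketch with illustrative gains (not the identified values from the experiments):

```python
import numpy as np

# Illustrative gains for G(s) = (k_p + k_d s)/(k_e s + 1).
k_p, k_d, k_e = 2.0, 0.5, 0.2
dt = 1e-4

def simulate(ed, t):
    """Euler integration of k_e * tau_h' + tau_h = k_p*e_d + k_d*e_d'."""
    tau = 0.0
    ed_prev = ed(0.0)
    out = []
    for tk in t:
        e = ed(tk)
        edot = (e - ed_prev) / dt           # finite-difference input derivative
        tau += dt * (-tau + k_p * e + k_d * edot) / k_e
        ed_prev = e
        out.append(tau)
    return np.array(out)

t = np.arange(0.0, 2.0, dt)
tau = simulate(lambda tk: 1.0, t)   # unit-step tracking error e_d
# The DC gain of G is k_p, so tau_h settles near k_p for a unit step.
```

The DC gain k_p and time constant k_e are exactly the quantities the IRL outer loop must accommodate without knowing them explicitly.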

20-May-21 | 40 | Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design - from the LQR solution to the IRL solution

To minimize the infinite-horizon quadratic cost (performance index)
  J_m = ∫_t^∞ ( e_d^T Q_d e_d + τ_h^T Q_h τ_h + u_e^T R u_e ) dτ
over the augmented dynamics Ẋ = A X + B u_e, solve the algebraic Riccati equation [83]:
  0 = A^T P + P A + Q - P B R^{-1} B^T P

The optimal feedback control:
  u_e = argmin_{u_e} V(t, X(t)) = -R^{-1} B^T P X

The value function
  V(X(t)) = ∫_t^∞ X^T (Q + K^T R K) X dτ = X^T P X
where P is the real symmetric positive-definite solution of
  (A - BK)^T P + P (A - BK) = -(Q + K^T R K)

The IRL Bellman equation for a time interval ΔT > 0:
  V(X(t)) = ∫_t^{t+ΔT} X^T (Q + K^T R K) X dτ + V(X(t + ΔT))
which can be written as
  X^T(t) P X(t) = ∫_t^{t+ΔT} X^T (Q + K^T R K) X dτ + X^T(t+ΔT) P X(t+ΔT)

Proof of convergence: see Appendix 6.2

20-May-21 | 41 | Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Online implementation

Fig 6.1 Continuous-time linear policy iteration algorithm

1. Start with an admissible control input u_e^0 = -K_1^0 X
2. Policy evaluation: given a control policy u^i, find P^i using the off-policy Bellman equation
   X^T(t) P^i X(t) = ∫_t^{t+ΔT} X^T (Q + (K^i)^T R K^i) X dτ + X^T(t+ΔT) P^i X(t+ΔT),
   solving for the cost parameters p^i by least squares
3. Policy improvement: update the control policy
   u_e^{i+1} = -R^{-1} B^T P^i X

The iteration is initialized with P^0 = 0, i = 1, and a stabilizing K^1 (A - BK^1 stable), and terminates when ||p^i - p^{i-1}|| falls below a threshold; otherwise i → i+1.
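The policy-iteration loop can be checked on a scalar example. This sketch replaces the least-squares off-policy evaluation with an exact Lyapunov solve (possible only when A and B are known), which is the model-based limit the IRL step approximates:

```python
# Scalar continuous-time policy iteration, illustrative values only.
A, B, Q, R = -1.0, 1.0, 2.0, 1.0

def policy_iteration(K0, iters=50):
    K = K0                                   # admissible: A - B*K stable
    for _ in range(iters):
        Ac = A - B * K
        assert Ac < 0, "policy must remain stabilizing"
        # Policy evaluation: scalar Lyapunov equation
        # 2*(A - B*K)*P = -(Q + K*R*K)  (the IRL step estimates this P
        # from trajectory data instead of from A and B).
        P = (Q + R * K * K) / (-2.0 * Ac)
        # Policy improvement: K = R^{-1} * B * P.
        K = (B * P) / R
    return P, K

P, K = policy_iteration(K0=0.0)
# At convergence P satisfies the ARE: 2*A*P + Q - (B*P)^2 / R = 0.
residual = 2 * A * P + Q - (B * P) ** 2 / R
```

For these values the ARE has the closed-form positive root P = √3 - 1, which the iteration reaches in a few steps; the iterates are known to converge monotonically for linear-quadratic problems.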

20-May-21 | 42 | Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Validation results - Simulation

Fig 6.2 The tracking error between the trajectory of the robot system and the prescribed impedance model
Fig 6.3 Reference trajectory versus the state of the robot system during and after learning

Parameters
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = -4.006, B = -0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: Ks = 20, F = 10, G = 1, μ1 = μ2 = 5, k = 0.1, Z̄ = 100, κp = 20, κd = 10, and ke = 0.18 s
- Impedance model: θd = 10sin(0.2πt), Md = 1 kg, Bd = 7.18 N·s·mm⁻¹, Kd = 15.76 N·mm⁻¹, Kh = 0.046

20-May-21 | 43 | Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Validation results - Experiment

Fig 6.4 The output trajectory and impedance model in the inner loop
Fig 6.5 The desired trajectory and output trajectory in the outer loop
Fig 6.6 Interaction force

Discussion
- Near-perfect human-robot cooperation performance
- Only a small deviation between the desired and actual trajectories
- The human interaction force is reduced after learning, and the optimal set of prescribed-impedance-model parameters is found by the outer-loop controller

20-May-21 | 44 | Presenter: Cong Phat Vo

CONTENTS

CHAPTER 1  Introduction
CHAPTER 2  Modeling and platform setup based on the APAM configuration
CHAPTER 3  Adaptive Gain ITSMC with TDE scheme
CHAPTER 4  Adaptive Finite-Time Force Sensorless Control scheme
CHAPTER 5  Model-free control for Optimal contact force in Unknown Environment
CHAPTER 6  Optimized Human-Robot Interaction force control using IRL
CHAPTER 7  Force Sensorless Reflecting Control for BHTS
CHAPTER 8  Conclusion and Future works

20-May-21 | 45 | Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Proposed control idea
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed
2. The stability and finite-time convergence of the closed-loop control are theoretically analyzed by the Lyapunov approach
3. The effectiveness of the proposed control method is verified in different working conditions

20-May-21 | 46 | Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Control design (i = m for the master, i = s for the slave)

The modified FNTSM surface:
  s_i = ė_i + λ_i |e_i|^{p_i/q_i} sign(e_i)

The torque control design by FNTSMC:
  τ_i = M_i θ̈_ir + C_i θ̇_i + K_i θ_i - ρ_{1i} s_i - ρ_{2i} |s_i|^{ν_i} sign(s_i) + τ̂_id

The external torque τ_id is estimated by a force-sensorless observer (FSOB) whose adaptive gain κ_i grows while the surface is outside the boundary layer (s_i ≠ 0 and |s_i| > ε_i) and follows a z_i-dynamics driven by sign(s_i) inside it.

[Block diagram: operator → master controller → APAM master system with its estimator and an impedance filter; APAM slave system → slave controller → environment with its own estimator; the estimated torques τ̂_md and τ̂_sd are exchanged between the two sides]
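On the sliding surface s_i = 0, the error obeys ė = -λ|e|^{p/q} sign(e), which converges in finite time because p/q < 1. A toy integration sketch with an illustrative λ and the exponents p = 5, q = 7 listed on the next slide's parameters:

```python
import numpy as np

lam, p, q = 2.0, 5, 7   # lam is illustrative; p, q from the parameter slide

def surface(e, edot):
    """Modified FNTSM surface s = edot + lam * |e|^(p/q) * sign(e)."""
    return edot + lam * np.sign(e) * abs(e) ** (p / q)

# Once s = 0, the reduced error dynamics are edot = -lam * |e|^(p/q) * sign(e).
# The closed-form reaching time from e0 is |e0|^(1-p/q) / (lam * (1 - p/q)),
# here 1 / (2 * 2/7) = 1.75 s; Euler integration illustrates it.
dt, e, t = 1e-4, 1.0, 0.0
while e > 1e-6 and t < 5.0:
    e += dt * (-lam * e ** (p / q))
    t += dt
```

Because the exponent is fractional (p/q < 1), the decay rate does not vanish linearly near e = 0, which is what distinguishes terminal sliding modes from ordinary linear surfaces with only asymptotic convergence.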

20-May-21 | 47 | Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results

Fig 7.1 Torque estimation performances

Parameters
- Proposed control: λi = 8, pi = 5, qi = 7, ςi = 10⁻⁴, νi = 9/11, εi = 0.1, ρ1i = 10, ρ2i = 2
- PID control: KP = 38, KI = 25, KD = 0.15
- Observers: for the NDO, Lh = Le = 10; for the RTOB, β = 350 rad/s; for the AFOB, κi(0) = 0.1, λi = 10, ηi = 0, m = 0.1, a = 10

Scenarios
- Free space: θd(t) = 5π(1 - 0.18t)sin(0.72πt)
- Contact and recovery: soft environment Ke = 500 N·mm⁻¹, Be = 4 N·s·mm⁻¹; stiff environment Ke = 2800 N·mm⁻¹, Be = 20 N·s·mm⁻¹

The RMSEs associated with the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.

20-May-21 | 48 | Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results

Fig 7.2 Position tracking performances
Fig 7.3 Position tracking error performances
Fig 7.4 Transparency performance in "soft" contact
Fig 7.5 Transparency error in "soft" contact
Fig 7.6 Transparency performance in "stiff" contact
Fig 7.7 Transparency error in "stiff" contact

20-May-21 | 49 | Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Summary
- Finite-time convergence of the tracking errors
- Fast transient response
- Robust performance against uncertainties and disturbances in the system dynamics
- The new AFOB eliminates the uncertainties and the noise effect in the obtained force sensing
- Stability of the overall system and transparency performance are attained simultaneously

Great transparency performance and global stability of the BHTS are achieved under the proposed control scheme despite the uncertainties and across different working conditions.

20-May-21 | 50 | Presenter: Cong Phat Vo

CONTENTS

CHAPTER 1  Introduction
CHAPTER 2  Modeling and platform setup based on the APAM configuration
CHAPTER 3  Adaptive Gain ITSMC with TDE scheme
CHAPTER 4  Adaptive Finite-Time Force Sensorless Control scheme
CHAPTER 5  Model-free control for Optimal contact force in Unknown Environment
CHAPTER 6  Optimized Human-Robot Interaction force control using IRL
CHAPTER 7  Force Sensorless Reflecting Control for BHTS
CHAPTER 8  Conclusion and Future works

20-May-21 | 51 | Presenter: Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions (uncertainties and disturbances are handled)
- Validated the significant effectiveness on finite-time force tracking problems of the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB)
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding highly effective position control
- Achieved good transparency performance with both force feedback and position tracking simultaneously by the new adaptive force estimation combined with separate FNTSMC schemes

20-May-21 | 52 | Presenter: Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Future works
- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains
- Apply optimized strategies on both the master and slave sides; the computational complexity should also be reduced with the data size
- Based on the prototype developed in this thesis, a high-quality commercial teleoperation device for the rehabilitation field can be developed as a safer solution

20-May-21 | 53

A PRESENTATION OF DOCTORAL DISSERTATION

Presenter: Cong Phat Vo
Department of Mechanical Engineering, University of Ulsan, Korea

Thank you for attending

Page 28: A PRESENTATION OF DOCTORAL DISSERTATION

20-May-21 | 28Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

4

6

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Model-free control for Optimal contact force in Unknown Environment5

1 2 3 4 5 6 7 8

20-May-21 | 29Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

3 Verified effective for robot interaction control in

unknown environments

2 The stability of the closed-loop control is theoretically demonstrated

by the Lyapunov approach

The RL method is used to learn on-line

the optimal desired force1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 30Presenter Cong Phat Vo

( ) ( ) ( ( ))= minus + minuse e e eF t C x t K x x t

( ( )) ( )r dZ e t F t= [ ]

1 2

( ) ( ) ( )

( ) ( )

= + +

minus minus minus +

r r r

v

e

Me t Ce t Ke t

sign F t

The force control

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The environment is modeled as

a spring-damper system [81]

The optimal impedance model The position control in task space

( ) ( ) ( )

= + f Pf f If f

t

x s K e t K e d

Task-specific Outer-loopRobot-specific Inner-loop

Precise position

control

Prescribed

Impedance Model

Environment

Configuration

APAM system

xx

d

Force controlx

rx

fe

r

Fe

Fd

ef

-

+ -+ -

+

RL-based

approximation

ϕ

Control design

1 2 3 4 5 6 7 8

( ) ( ) ( )r r de t Ae t BF t= +

20-May-21 | 31Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

LQR solution

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

( ) ( )=d e rF t e t

The discounted cost function

( ) ( )( ) ( ) ( ) ( ) ( ) ( )

= minus +T T

s r d r r d dt

J e t F t e Se F RF d

The Algebraic Riccati equation

0+ + + =T T

eA PA S P A PB

The feedback control gain

( )1

1 minus

minus= minus +T T

e B PB R B PA

IRL approximationRL solution

Implementation procedure

1 2 3 4 5 6 7 8

offline soliton

20-May-21 | 32Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

LQR solution RL solution

impossible for all time

The Hamilton-Jacobi-Bellman equation

( ) ( ) ( )min ( ) ( ) ( ) 0

minus minus + minus =

d

t

s rtF

V e t r t r e dt

The new Algebraic Riccati equation

( )( ) ( )

2 ( ) ( ) ( ) ( ) 0

T T

r r

T T

r d d d

e t A P PA S P e t

e t PBF t F t RF t

+ + minus

+ + =

The optimal desired force 1

22 21( ) ( )d rF t H H e tminus= minus

( ) ( ) ( ) ( ) ( )= = T

r d s rQ e t F t V e t X HX

IRL approximation

The discounted cost function

( ) ( )

( )

( )( ) ( ) ( )

where ( ) ( ) ( ) ( ) ( ) ( )

t

s r r dt

T T

r d r r d d

V e t r e F e d

r e F e Se F RF

minus minus=

= +

Implementation procedure

1 2 3 4 5 6 7 8

( ) ( )=d e rF t e t

online soliton using the off-policy RL to

find the solution of the given new ARE

20-May-21 | 33Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

LQR solution RL solution

minimize temporal difference error

The approximator of the Q-value function

( )( ) ( )1

ˆ N

r d i r d i

i

Q e F e F =

=

The parameter update law

11( ) exp

2

T

i r d r i d i i r i d ie F e c F c e c F c minus = minus minus minus minus minus

( )

( ) ( ) ( ) ( )

ˆ ( ) ( ) ( )1( ) ( )

( )

=

= minus +

r d

t t t t

Q e t F t tt t

t t

IRL approximation

Q-function update law based on IRL

( )

( )

( )( ) ( ) ( )

( ) ( )

t tt

r dt

t

r d

Q e t F t r e d

e Q e t t F t t

+

minus minus

minus

=

+ + +

( ) ( ) ( ) ( ) ( )T T

r r d dr t e t Se t F t RF t= +

The reward function of the utility function

Implementation procedure

1 2 3 4 5 6 7 8

1

( )( )

( )

i r di r d N

i r di

e Fe F

e F

=

=

( ) ( )=d e rF t e t

20-May-21 | 34Presenter Cong Phat Vo

Fig 51 RL in unknown environments during the

learning process

Fig 52 Learning curves

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

κe = 00025 S = 1 R = 10 T = 0001 A = minus4006

B = minus0002

υ = 09 γ = 09 δΔ = 065 ε = 01 ε-decay = 099

K = 10 β = 01

LQR solution

RL aproximation

PositionForce controlKPf = 0001 KIf = 002 λ = 02 ηΔ = 01 ρ1 = 05

ρ2 = 20 v = 911 α = 57

EnvironmentCe = 05 Nsmm-1 Ke = 2 N mm-1 r0 = 165 mm

xd = 39917 mm xe = 35 mm

Parameters

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 35Presenter Cong Phat Vo

Fig 53 Desired forceposition tracking response

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Three steps are required to complete

the experiment

1-the identification of the environment

2-obtain the ARE solution using the

environmental estimation

3-the complete experiment for the

positionforce controller

IRL approximation

Achieving near-optimal performance without

knowledge of the environment dynamics

LQR solution

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 36Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

4

5

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Optimized Human-Robot Interaction force control using IRL6

1 2 3 4 5 6 7 8

20-May-21 | 37Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

3 The unknown human dynamics as well as the optimal overall

human-robot system efficiency is verified in experimental trials

2

Find the optimal parameters of the impedance model to

assure motion tracking and also assist the human with

minimum effort to perform the task in real time

The stability of the closed-loop control is theoretically

demonstrated

1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 38Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

( )1 20

ˆ ( )

ˆ (|| || )

= + + +

+ + minus

t

s h h h

q B h h

K e e e d

K Z Z K

The control torque in task space

1 2

1 2 1

2 1 2

ˆ ˆˆ ( ) ( )

ˆ ˆˆ ˆ ˆ || ||

ˆˆ ˆ ˆ( ) || ||

=

= minus minus = minus

T T

T T T

h h h

T

h h

q W W q

W F F W q kF W

W Gq W kG W

2

h hm

d d d

K

M s C s K

=

+ +

An approximation of the continuous functionThe prescribed

impedance model

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

Control design

1 2 3 4 5 6 7 8

20-May-21 | 39Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

1

1

[ ]

minus

minus

= +

= minus

=

=

h h h h d

h e

h e p d

T T T

d d d

A B e

A k

B k

e e e

( )1

+=

+

p d

e

sG s

k s

The human transfer function [99] Purpose τh rarr ed

0

0

= + = +

= minus

q d qd e

e

h h hh e

A e Be X AX Buu

B A u KX

A new augmented state

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

IRL based on LQR solution

Control design

1 2 3 4 5 6 7 8

20-May-21 | 40Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

The performance index

( )T T T

m d d d h h h e et

J e Q e Q u Ru d

= + +

10 T TA P PA Q PB R BPminus= + + minus

The algebraic Riccati equation [83]

0

0

q d qd

e

h h hh

e

A e BeX u

B A

u KX

= = +

= minus

A new augmented state

IRL solutionRL solution

An optimal feeback control

To minimize

solve

The infinite-horizon quadratic cost

Control design

The IRL Bellman equation for time interval ΔT gt 0

Proof of convergence See in Appendix 62

1 2 3 4 5 6 7 8

( ) 1

( )

arg min ( ) ( )e

T

e eu t

u V t X t u R B PX

minus

= = minus

( )7 7( ( )) ( ) ( ) ( ( ))t T

T T T

tV X t X Q K RK X d V X t T

+

= + minus +

( )7 ( ( )) ( ) ( )T T T T

tV X t X Q K RK X d X PX

= + =

( ) ( ) ( )T TA BK P P A BK Q K RKminus + minus = minus +

where P is the real symmetric positive definite solution

( )( ) ( ) = ( ) ( )

( ) ( )

t tT T T T

t

T

X t PX t X Q K RK X d

X t t PX t t

+

+

+ + +

The IRL Bellman equation can be written as

20-May-21 | 41Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 61 Continuous-time linear policy iteration algorithm

1 Start with an admissible control input

0 0

1= minuseu K X

2 Policy evaluation given a control policy ui find the

Pi using off-policy Bellman equation

1 1

1

+ minus= minusi T i

eu R B P X

3 Policy Improvement update the control policy

Online implementation

Control design

1 2 3 4 5 6 7 8

( )( )( ) ( )= ( ) ( )

( ) ( )

t t TT i T i i T

t

T i

X t P X t X Q K RK X d

X t t P X t t

+

+

+ + +

Begin

1 2

1 2

1

( ) ( )

( ) ( ) ( )

( )

N i i i

Ti i N i

i T

t t t

K K K

p

minus

= = minus +

=

=

Solving for the cost using least squares

1|| ||i ip p minusminus

0

1 10 1 ( ) 0P i K A BK= = minus

Initialization

1 1i T i

eu R B P Xminus minus= minus

Policy update

End

No

Yes

1i irarr +

20-May-21 | 42Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 62 The tracking error of the trajectory of the robot system and

the prescribed impedance model

Fig 63 Reference trajectory versus the state of the robot systems

during and after learning

κe = 00025 S = 1 R = 10 T = 0001

A = minus4006 B = minus0002

υ = 09 γ = 09 δΔ = 065 ε = 01

ε-decay = 099 K = 10 β = 01

LQR solution

RL aproximation

Position controlKs = 20 F = 10 G = 1 μ1 = μ2 = 5 k = 01

= 100 κp = 20 κd = 10 and ke = 018 s

θd = 10sin(02πt) Md = 1 Kg Bd = 718

Nsmm-1 Kd =1576 Nmm-1 Kh = 0046

Parameters

Z

Simulation

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 43Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 64 The output trajectory and impedance model in the inner loop

Fig 65 The desired trajectory and output trajectory the outer loop

Fig 66 Interaction force

Perfectly performance of the human-

robot cooperation

A little deviation between the desired

and actual trajectory

The human interaction force is reduced

after learning and the optimal set of the

prescribed impedance model is found by

the outer-loop controller

Experiment

Validation results

Discussion

1 2 3 4 5 6 7 8

20-May-21 | 44Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Conclusion and Future works

Introduction

7

4

5

6

C

H

A

P

T

E

R

1

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Force Sensorless Reflecting Control for BHTS7

1 2 3 4 5 6 7 8

20-May-21 | 45Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

3 The effectiveness of the proposed control method is verified in

the different working conditions

2The stability and finite-time convergence characteristic of the closed-

loop control are theoretically analyzed by the Lyapunov approach

The first time the hybrid control based

on a fast finite-time NTSMC and AFOB1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 46Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

( )1 2id = minus +

1 1

2

= minus + = minus +

id

T

i i id i

T

i ia = minus

The external torque estimation

[ ]

1 2ˆ( ) = + + minus minus minus +iv

i i ir i i i i i i i i i i idM C K sign s s s

( )i i i is = +

[ ]

2

1 2

if 0 or 0 and ( )

( ) if 0 and

i

i i i i i i

i i

i i i i i i i i

s s

z z sign s

= =

+

An adaptive FSOB gain

where

where

The torque control design by style-FNTSMC

The modified FNTSM surface design

Master Slave

Environment

APAM Slave

system

Slave

Controller

Estimator

Oper

ator

Master

Controller

APAM Master

system

Estimator

Impendence

filter

+

++

-

+

-

+

+md

m s

s

sd

ˆsdˆmd

mf

m

d

Control design

1 2 3 4 5 6 7 8

20-May-21 | 47Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig 71 Torque estimation performances

λi = 8 pi = 5 qi = 7 ςi = 10-4 νi = 911

εi = 01 ρ1i = 10 ρ2i = 2

Proposed control

PID control KP = 38 KI = 25 KD = 015

Observersfor NDO Lh = Le = 10 for RTOB β = 350 rads

for AFOB κi(0) = 01 λi = 10 ηi = 0 m = 01

a = 10

RMSEs associated with NDO RTOB and proposed algorithm

are calculated of 0211 0192 and 0076 respectively

Parameters

Scenarios- The free space situation θd (t) = 5π(1 - 018t)sin(072πt)

- The contact and recovery situation soft environment Ke = 500 Nmm-1 Be = 4 Nsmm-1

stiff environment Ke = 2800 Nmm-1 Be = 20 Nsmm-1

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 48Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig 72 Position tracking performances

Fig 72 Position tracking error performances

Fig 72 Transparency performance in ldquosoftrdquo

Fig 72 Transparency error in soft contract

Fig 72 Transparency performance in ldquostiffrdquo

Fig 72 Transparency error in stiff contract

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 49Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Finite-time convergence of the tracking errors
Elimination of the uncertainties and the noise effect in the obtained force sensing via the new AFOB
Robust performance against uncertainties and disturbances in the system dynamics
Stability of the overall system and transparency performance attained simultaneously
Fast transient response
Summary
Great transparency performance and global stability of the BHTS are achieved under the proposed control scheme, despite uncertainties and across different working conditions.

20-May-21 | 50Presenter Cong Phat Vo

CONTENTS
1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works

20-May-21 | 51Presenter Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Conclusions
Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions (uncertainties and disturbances are handled).
Validated the significant effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance.
Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding highly effective position control.
Achieved good transparency performance with both force feedback and position tracking simultaneously by the new adaptive force estimation combined with separate FNTSMC schemes.

20-May-21 | 52Presenter Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Future works

To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
Apply optimized strategies on both the master and slave sides; note also that the computational complexity should be reduced by limiting the data size.
Based on the prototype developed in this thesis, it is possible to build a high-quality commercial teleoperation device for the rehabilitation field with a safer solution.

20-May-21 | 53

A PRESENTATION OF DOCTORAL DISSERTATION

Presenter: Cong Phat Vo
Department of Mechanical Engineering, University of Ulsan, Korea
Thank you for attending


20-May-21 | 29Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Proposed control idea
1. The RL method is used to learn the optimal desired force online.
2. The stability of the closed-loop control is theoretically demonstrated by the Lyapunov approach.
3. Verified effectiveness for robot interaction control in unknown environments.

20-May-21 | 30Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT
The environment is modeled as a spring-damper system [81]: F_e(t) = −C_e ẋ(t) + K_e (x_e − x(t))
The optimal impedance model: Z(e_r(t)) = F_d(t)
The force control combines the impedance terms M ë_r(t) + C ė_r(t) + K e_r(t) with the switching terms −ρ_1(·) − ρ_2 |·|^ν sgn(·) and the environment force F_e(t).
The position control in task space: x_f(s) = K_Pf e_f(t) + K_If ∫₀ᵗ e_f(τ) dτ

(Block diagram: a robot-specific inner loop provides precise position control of the APAM system, while a task-specific outer loop combines the prescribed impedance model, the force control, and the RL-based approximation with the environment; the signals x, x_r, x_f, e_f, e_r, F_e, and F_d circulate between the blocks.)

Control design


The impedance error dynamics: ė_r(t) = A e_r(t) + B F_d(t)
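To make the error dynamics concrete, here is a scalar Euler roll-out under a linear force law F_d = κ_e e_r; the numeric values of A, B, and κ_e below are illustrative assumptions, not the identified ones:

```python
def simulate_error_dynamics(A=-4.0, B=-0.002, kappa_e=-50.0,
                            e0=1.0, dt=1e-3, steps=5000):
    """Forward-Euler roll-out of de_r/dt = A*e_r + B*F_d with F_d = kappa_e*e_r.

    The closed-loop pole is A + B*kappa_e = -4.0 + 0.1 = -3.9 < 0,
    so the position error e_r decays to zero.
    """
    e = e0
    for _ in range(steps):
        f_d = kappa_e * e          # desired-force command from the error
        e += dt * (A * e + B * f_d)
    return e

final = simulate_error_dynamics()  # error after 5 s of simulated time
```

Any gain keeping A + B·κ_e negative stabilizes the loop; the RL scheme that follows learns such a gain without knowing A and B.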

20-May-21 | 31Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

LQR solution

The main objective of this work is to determine the optimal desired force F_d, i.e., the minimum force that minimizes the position error e_r.

F_d(t) = κ_e e_r(t)
The discounted cost function: J_s(e_r(t), F_d(t)) = ∫_t^∞ e^{−γ(τ−t)} (e_r^T S e_r + F_d^T R F_d) dτ
The Algebraic Riccati equation: 0 = A^T P A + S − P + A^T P B κ_e
The feedback control gain: κ_e = −(B^T P B + R)^{−1} B^T P A

Implementation procedure


offline solution
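The offline LQR computation pairs the Riccati equation with the gain formula above; a scalar fixed-point sketch, with illustrative numbers rather than the identified A and B:

```python
def solve_dare_scalar(A, B, S, R, iters=200):
    """Fixed-point iteration on the scalar algebraic Riccati equation
    0 = A*P*A + S - P + A*P*B*kappa, with kappa = -(B*P*B + R)^-1 * B*P*A."""
    P = S
    for _ in range(iters):
        kappa = -(B * P * A) / (B * P * B + R)   # feedback gain for current P
        P = A * P * A + S + A * P * B * kappa    # Riccati update
    kappa = -(B * P * A) / (B * P * B + R)
    return P, kappa

P, kappa = solve_dare_scalar(A=0.9, B=0.1, S=1.0, R=1.0)
# at the fixed point the Riccati residual vanishes
residual = 0.9 * P * 0.9 + 1.0 - P + 0.9 * P * 0.1 * kappa
```

This is the "offline solution": it needs A and B explicitly, which motivates the RL and IRL alternatives on the next slides.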

20-May-21 | 32Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT


LQR solution (an offline solution is impossible for all time) → RL solution
The Hamilton-Jacobi-Bellman equation: min_{F_d} [ V̇_s(e_r(t)) − γ V_s(e_r(t)) + r(t) ] = 0
The new Algebraic Riccati equation: e_r^T(t) (A^T P + P A + S − γP) e_r(t) + 2 e_r^T(t) P B F_d(t) + F_d^T(t) R F_d(t) = 0
The optimal desired force: F_d(t) = −H_22^{−1} H_21 e_r(t), where Q(e_r(t), F_d(t)) = V_s(e_r(t)) = X^T H X
IRL approximation
The discounted cost function: V_s(e_r(t)) = ∫_t^∞ e^{−γ(τ−t)} r(e_r(τ), F_d(τ)) dτ, where r(e_r, F_d) = e_r^T S e_r + F_d^T R F_d

Implementation procedure


F_d(t) = κ_e e_r(t): an online solution using off-policy RL to find the solution of the given new ARE.

20-May-21 | 33Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT


RL solution → IRL approximation: minimize the temporal-difference error
The approximator of the Q-value function: Q̂(e_r, F_d) = Σ_{i=1}^N ϑ_i φ_i(e_r, F_d)
with Gaussian kernels φ̄_i(e_r, F_d) = exp(−½ [e_r − c_{ri}, F_d − c_{di}] Σ_i^{−1} [e_r − c_{ri}, F_d − c_{di}]^T)
The parameter update law adapts ϑ by gradient descent on the normalized temporal-difference error of Q̂.
Q-function update law based on IRL: Q(e_r(t), F_d(t)) = ∫_t^{t+Δt} e^{−γ(τ−t)} r(e_r(τ), F_d(τ)) dτ + e^{−γΔt} Q(e_r(t+Δt), F_d(t+Δt))
The reward (utility) function: r(t) = e_r^T(t) S e_r(t) + F_d^T(t) R F_d(t)

Implementation procedure


The normalized features: φ_i(e_r, F_d) = φ̄_i(e_r, F_d) / Σ_{j=1}^N φ̄_j(e_r, F_d), and the learned force command is F_d(t) = κ_e e_r(t).
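A minimal sketch of the normalized-RBF Q-approximator and one temporal-difference step; the Gaussian kernels use a shared isotropic width, and the centers, width, and learning rate are illustrative assumptions:

```python
import math

def rbf_features(e_r, f_d, centers, width=1.0):
    """Normalized Gaussian features phi_i over the (e_r, F_d) pair."""
    raw = [math.exp(-((e_r - ce) ** 2 + (f_d - cf) ** 2) / (2.0 * width ** 2))
           for ce, cf in centers]
    total = sum(raw)
    return [r / total for r in raw]

def q_hat(theta, e_r, f_d, centers):
    """Q-value approximator: weighted sum of normalized features."""
    return sum(t * p for t, p in zip(theta, rbf_features(e_r, f_d, centers)))

def td_update(theta, sample, centers, gamma=0.9, alpha=0.1):
    """One gradient step shrinking the IRL temporal-difference error."""
    e_r, f_d, reward, e_r2, f_d2 = sample
    target = reward + gamma * q_hat(theta, e_r2, f_d2, centers)
    delta = q_hat(theta, e_r, f_d, centers) - target
    phi = rbf_features(e_r, f_d, centers)
    return [t - alpha * delta * p for t, p in zip(theta, phi)]
```

Repeating `td_update` over measured transitions drives Q̂ toward the IRL Bellman fixed point without a model of the environment.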

20-May-21 | 34Presenter Cong Phat Vo

Fig. 5.1 RL in unknown environments during the learning process
Fig. 5.2 Learning curves

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Parameters
- LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δ_Δ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position/Force control: K_Pf = 0.001, K_If = 0.02, λ = 0.2, η_Δ = 0.1, ρ_1 = 0.5, ρ_2 = 20, v = 9/11, α = 5/7
- Environment: C_e = 0.5 N s mm⁻¹, K_e = 2 N mm⁻¹, r_0 = 16.5 mm, x_d = 39.917 mm, x_e = 35 mm
Validation results

20-May-21 | 35Presenter Cong Phat Vo

Fig. 5.3 Desired force/position tracking response

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Three steps are required to complete the experiment:
1 - identification of the environment;
2 - obtaining the ARE solution using the environmental estimate;
3 - the complete experiment for the position/force controller.
From the LQR solution to the IRL approximation: near-optimal performance is achieved without knowledge of the environment dynamics.
Validation results


20-May-21 | 37Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Proposed control idea
1. Find the optimal parameters of the impedance model to assure motion tracking and assist the human with minimum effort to perform the task in real time.
2. The stability of the closed-loop control is theoretically demonstrated.
3. The unknown human dynamics as well as the optimal overall human-robot system efficiency are verified in experimental trials.

20-May-21 | 38Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Robot-specific inner loop and task-specific outer loop.
The control torque in task space combines a PID-like action on the tracking error (gain K_s), the NN estimate q̂(Ẑ) of the unknown robot dynamics, robust bounding terms, and compensation of the human torque τ_h.
An approximation of the continuous function: q̂(Z) = Ŵ^T Φ(Z), with the weight estimates Ŵ_1, Ŵ_2 updated by adaptation laws containing k‖·‖ damping terms (gains F, G).
The prescribed impedance model: θ_m/τ_h = K_h / (M_d s² + C_d s + K_d).

(Block diagram: the human operator's torque τ_h drives the prescribed impedance model; the torque control law, with the NN functional approximation, steers the APAM-based haptic interface so that θ tracks θ_d; an optimal law based on IRL, with gain K_h, tunes the impedance model from the tracking error e_d.)

Control design

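The prescribed impedance model above is a second-order filter from τ_h to θ_m. A quick step-response sketch using the parameter values listed on the simulation slide (decimal points reconstructed from the garbled extraction, so treat them as assumptions):

```python
def impedance_model_step(Kh=0.046, Md=1.0, Cd=7.18, Kd=15.76,
                         tau_h=1.0, dt=1e-3, steps=5000):
    """Step response of theta_m = Kh*tau_h / (Md*s^2 + Cd*s + Kd),
    integrated with semi-implicit Euler."""
    x = v = 0.0
    for _ in range(steps):
        a = (Kh * tau_h - Cd * v - Kd * x) / Md  # acceleration from the model
        v += dt * a
        x += dt * v
    return x

theta_m = impedance_model_step()
# after the transient, theta_m settles near the static gain Kh/Kd * tau_h
</n```

The static gain K_h/K_d sets how far the model trajectory deflects per unit of sustained human torque; the IRL outer loop tunes exactly these parameters.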

20-May-21 | 39Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Robot-specific inner loop and task-specific outer loop.
The human transfer function [99] maps τ_h → e_d; in state-space form, ξ̇_h = A_h ξ_h + B_h ė_d, where A_h and B_h are built from the gains κ_p and κ_d and e_d = [e_d, ė_d]^T.
A new augmented state X collects the impedance and human states: Ẋ = A X + B u_e, with the feedback u_e = −K X.


IRL based on LQR solution

Control design


20-May-21 | 40Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

The performance index (an infinite-horizon quadratic cost): J_m = ∫_t^∞ (e_d^T Q_d e_d + ξ_h^T Q_h ξ_h + u_e^T R u_e) dτ
To minimize J_m, solve the algebraic Riccati equation [83]: 0 = A^T P + P A + Q − P B R^{−1} B^T P
over the augmented state dynamics Ẋ = A X + B u_e, with u_e = −K X.
RL solution → IRL solution
An optimal feedback control: u_e = arg min_{u_e} V(t, X(t)) = −R^{−1} B^T P X
The value function satisfies V(X(t)) = ∫_t^{t+ΔT} X^T (Q + K^T R K) X dτ + V(X(t + ΔT)) and V(X(t)) = ∫_t^∞ X^T (Q + K^T R K) X dτ = X^T P X,
where P is the real symmetric positive definite solution of (A − BK)^T P + P(A − BK) = −(Q + K^T R K).
The IRL Bellman equation for a time interval ΔT > 0: X^T(t) P X(t) = ∫_t^{t+ΔT} X^T (Q + K^T R K) X dτ + X^T(t + ΔT) P X(t + ΔT)
Proof of convergence: see Appendix 6.2.
Control design

20-May-21 | 41Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig. 6.1 Continuous-time linear policy iteration algorithm
1. Start with an admissible control input u_e^0 = −K^1 X.
2. Policy evaluation: given a control policy u^i, find P^i using the off-policy Bellman equation.
3. Policy improvement: update the control policy u^{i+1} = −R^{−1} B^T P^i X.

Online implementation

Control design


X^T(t) P^i X(t) = ∫_t^{t+ΔT} X^T (Q + (K^i)^T R K^i) X dτ + X^T(t + ΔT) P^i X(t + ΔT)
(Flowchart: Begin → Initialization: P^1 = 0, i = 1, choose K^1 such that (A − BK^1) is stable → solve for the cost P^i using least squares over N samples → policy update u^{i+1} = −R^{−1} B^T P^i X → if ‖p^i − p^{i−1}‖ ≥ ε̄, set i → i + 1 and repeat; otherwise End.)
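The policy-iteration loop of Fig. 6.1 can be sketched in the scalar case. Here the data-driven least-squares evaluation is replaced by its model-based equivalent (a Lyapunov equation) for brevity, and the numbers are illustrative only:

```python
def policy_iteration_scalar(A, B, Q, R, K1, iters=20):
    """Kleinman-style policy iteration, scalar case.

    Policy evaluation solves the Lyapunov equation
        2*(A - B*K)*P = -(Q + K*R*K),
    and policy improvement sets K = B*P/R.  K1 must stabilize,
    i.e. A - B*K1 < 0.
    """
    K = K1
    for _ in range(iters):
        P = -(Q + K * R * K) / (2.0 * (A - B * K))  # policy evaluation
        K = B * P / R                               # policy improvement
    return P, K

P, K = policy_iteration_scalar(A=1.0, B=1.0, Q=1.0, R=1.0, K1=2.0)
# at convergence P solves the ARE 2*A*P + Q - P*B^2/R * P = 0
residual = 2.0 * 1.0 * P + 1.0 - P ** 2
```

Each iteration improves the policy monotonically; in the IRL version, P is identified from trajectory data over intervals ΔT instead of from A.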

20-May-21 | 42Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig. 6.2 The tracking error between the trajectory of the robot system and the prescribed impedance model
Fig. 6.3 Reference trajectory versus the state of the robot system during and after learning

Parameters
- LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δ_Δ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: K_s = 20, F = 10, G = 1, μ_1 = μ_2 = 5, k = 0.1, ‖Z̄‖ = 100, κ_p = 20, κ_d = 10, k_e = 0.18 s
- Prescribed impedance model: θ_d = 10 sin(0.2πt), M_d = 1 kg, B_d = 7.18 N s mm⁻¹, K_d = 15.76 N mm⁻¹, K_h = 0.046
Simulation
Validation results

20-May-21 | 43Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig. 6.4 The output trajectory and impedance model in the inner loop
Fig. 6.5 The desired trajectory and output trajectory in the outer loop
Fig. 6.6 Interaction force
Near-perfect performance of the human-robot cooperation.
Only a small deviation between the desired and actual trajectories.
The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller.
Experiment
Validation results
Discussion


20-May-21 | 45Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Proposed control idea
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed.
2. The stability and finite-time convergence characteristics of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified in different working conditions.

20-May-21 | 46Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

( )1 2id = minus +

1 1

2

= minus + = minus +

id

T

i i id i

T

i ia = minus

The external torque estimation

[ ]

1 2ˆ( ) = + + minus minus minus +iv

i i ir i i i i i i i i i i idM C K sign s s s

( )i i i is = +

[ ]

2

1 2

if 0 or 0 and ( )

( ) if 0 and

i

i i i i i i

i i

i i i i i i i i

s s

z z sign s

= =

+

An adaptive FSOB gain

where

where

The torque control design by style-FNTSMC

The modified FNTSM surface design

Master Slave

Environment

APAM Slave

system

Slave

Controller

Estimator

Oper

ator

Master

Controller

APAM Master

system

Estimator

Impendence

filter

+

++

-

+

-

+

+md

m s

s

sd

ˆsdˆmd

mf

m

d

Control design

1 2 3 4 5 6 7 8

20-May-21 | 47Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig 71 Torque estimation performances

λi = 8 pi = 5 qi = 7 ςi = 10-4 νi = 911

εi = 01 ρ1i = 10 ρ2i = 2

Proposed control

PID control KP = 38 KI = 25 KD = 015

Observersfor NDO Lh = Le = 10 for RTOB β = 350 rads

for AFOB κi(0) = 01 λi = 10 ηi = 0 m = 01

a = 10

RMSEs associated with NDO RTOB and proposed algorithm

are calculated of 0211 0192 and 0076 respectively

Parameters

Scenarios- The free space situation θd (t) = 5π(1 - 018t)sin(072πt)

- The contact and recovery situation soft environment Ke = 500 Nmm-1 Be = 4 Nsmm-1

stiff environment Ke = 2800 Nmm-1 Be = 20 Nsmm-1

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 48Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig 72 Position tracking performances

Fig 72 Position tracking error performances

Fig 72 Transparency performance in ldquosoftrdquo

Fig 72 Transparency error in soft contract

Fig 72 Transparency performance in ldquostiffrdquo

Fig 72 Transparency error in stiff contract

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 49Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Finite-time convergence of the

tracking errors

Eliminating the uncertainties the

noise effect in the obtained force

sensing via new AFOB

Robust performance against uncertainties

and disturbances in the system dynamics

Stability of the overall system and the

transparency performance can be

attained simultaneously

Fast transient response

Summary

The great transparency performance and global stability of the BHTS under the proposed

control scheme are achieved despite the uncertainties and in different working conditions

1 2 3 4 5 6 7 8

20-May-21 | 50Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Introduction

4

5

6

C

H

A

P

T

E

R

1

7

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Conclusion and Future works78

1 2 3 4 5 6 7 8

20-May-21 | 51Presenter Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Verified an effective position tracking controller via the AITSMC with TDE technique under

various working conditions (Its uncertainties and disturbance are handled)

Validated the significant effectiveness of finite-time force tracking problems based on the

FITSMC combined with a new adaptive gain and a friction-free disturbance (FSOB)

Based on the online learning ability of the RL approximation method the model-free controller

for optimal contact force ensures great position and force tracking performance

Optimized a prescribed impedance model parameters by the IRL approach to provide little

operator information in yielding highly effective position control purpose

Achieved good transparency performance with both force feedback and position tracking

simultaneously by the new adaptive force estimation combined the separately FNTSMC schemes

Conclutions

1 2 3 4 5 6 7 8

20-May-21 | 52Presenter Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Future works

To improve the control performance in joint-space the robot system model can be

improved by robust identification approaches

Effectiveness of the proposed algorithm can be further increased by developing an automatic

tune method for both the estimation and control gains

Apply optimized strategies on both master and slave sides Note also that the computational

complexity should be reduced by the data size

Based on the developed prototype in this thesis it is possible to develop a high-quality

commercial teleoperation device in the rehabilitation field with a safer solution

1 2 3 4 5 6 7 8

20-May-21 | 53

A PRESENTATION OF DOCTORAL DISSERTATION

Presenter Cong Phat Vo

Department of Mechanical Engineering University of Ulsan Korea

Thank you for attending

1 2 3 4 5 6 7 8

Page 30: A PRESENTATION OF DOCTORAL DISSERTATION

20-May-21 | 30Presenter Cong Phat Vo

( ) ( ) ( ( ))= minus + minuse e e eF t C x t K x x t

( ( )) ( )r dZ e t F t= [ ]

1 2

( ) ( ) ( )

( ) ( )

= + +

minus minus minus +

r r r

v

e

Me t Ce t Ke t

sign F t

The force control

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The environment is modeled as

a spring-damper system [81]

The optimal impedance model The position control in task space

( ) ( ) ( )

= + f Pf f If f

t

x s K e t K e d

Task-specific Outer-loopRobot-specific Inner-loop

Precise position

control

Prescribed

Impedance Model

Environment

Configuration

APAM system

xx

d

Force controlx

rx

fe

r

Fe

Fd

ef

-

+ -+ -

+

RL-based

approximation

ϕ

Control design

1 2 3 4 5 6 7 8

( ) ( ) ( )r r de t Ae t BF t= +

20-May-21 | 31Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

LQR solution

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

( ) ( )=d e rF t e t

The discounted cost function

( ) ( )( ) ( ) ( ) ( ) ( ) ( )

= minus +T T

s r d r r d dt

J e t F t e Se F RF d

The Algebraic Riccati equation

0+ + + =T T

eA PA S P A PB

The feedback control gain

( )1

1 minus

minus= minus +T T

e B PB R B PA

IRL approximationRL solution

Implementation procedure

1 2 3 4 5 6 7 8

offline soliton

20-May-21 | 32Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

LQR solution RL solution

impossible for all time

The Hamilton-Jacobi-Bellman equation

( ) ( ) ( )min ( ) ( ) ( ) 0

minus minus + minus =

d

t

s rtF

V e t r t r e dt

The new Algebraic Riccati equation

( )( ) ( )

2 ( ) ( ) ( ) ( ) 0

T T

r r

T T

r d d d

e t A P PA S P e t

e t PBF t F t RF t

+ + minus

+ + =

The optimal desired force 1

22 21( ) ( )d rF t H H e tminus= minus

( ) ( ) ( ) ( ) ( )= = T

r d s rQ e t F t V e t X HX

IRL approximation

The discounted cost function

( ) ( )

( )

( )( ) ( ) ( )

where ( ) ( ) ( ) ( ) ( ) ( )

t

s r r dt

T T

r d r r d d

V e t r e F e d

r e F e Se F RF

minus minus=

= +

Implementation procedure

1 2 3 4 5 6 7 8

( ) ( )=d e rF t e t

online soliton using the off-policy RL to

find the solution of the given new ARE

20-May-21 | 33Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

LQR solution RL solution

minimize temporal difference error

The approximator of the Q-value function

( )( ) ( )1

ˆ N

r d i r d i

i

Q e F e F =

=

The parameter update law

11( ) exp

2

T

i r d r i d i i r i d ie F e c F c e c F c minus = minus minus minus minus minus

( )

( ) ( ) ( ) ( )

ˆ ( ) ( ) ( )1( ) ( )

( )

=

= minus +

r d

t t t t

Q e t F t tt t

t t

IRL approximation

Q-function update law based on IRL

( )

( )

( )( ) ( ) ( )

( ) ( )

t tt

r dt

t

r d

Q e t F t r e d

e Q e t t F t t

+

minus minus

minus

=

+ + +

( ) ( ) ( ) ( ) ( )T T

r r d dr t e t Se t F t RF t= +

The reward function of the utility function

Implementation procedure

1 2 3 4 5 6 7 8

1

( )( )

( )

i r di r d N

i r di

e Fe F

e F

=

=

( ) ( )=d e rF t e t

20-May-21 | 34Presenter Cong Phat Vo

Fig 51 RL in unknown environments during the

learning process

Fig 52 Learning curves

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

κe = 00025 S = 1 R = 10 T = 0001 A = minus4006

B = minus0002

υ = 09 γ = 09 δΔ = 065 ε = 01 ε-decay = 099

K = 10 β = 01

LQR solution

RL aproximation

PositionForce controlKPf = 0001 KIf = 002 λ = 02 ηΔ = 01 ρ1 = 05

ρ2 = 20 v = 911 α = 57

EnvironmentCe = 05 Nsmm-1 Ke = 2 N mm-1 r0 = 165 mm

xd = 39917 mm xe = 35 mm

Parameters

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 35Presenter Cong Phat Vo

Fig 53 Desired forceposition tracking response

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Three steps are required to complete

the experiment

1-the identification of the environment

2-obtain the ARE solution using the

environmental estimation

3-the complete experiment for the

positionforce controller

IRL approximation

Achieving near-optimal performance without

knowledge of the environment dynamics

LQR solution

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 36Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

4

5

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Optimized Human-Robot Interaction force control using IRL6

1 2 3 4 5 6 7 8

20-May-21 | 37Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

3 The unknown human dynamics as well as the optimal overall

human-robot system efficiency is verified in experimental trials

2

Find the optimal parameters of the impedance model to

assure motion tracking and also assist the human with

minimum effort to perform the task in real time

The stability of the closed-loop control is theoretically

demonstrated

1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 38Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

( )1 20

ˆ ( )

ˆ (|| || )

= + + +

+ + minus

t

s h h h

q B h h

K e e e d

K Z Z K

The control torque in task space

1 2

1 2 1

2 1 2

ˆ ˆˆ ( ) ( )

ˆ ˆˆ ˆ ˆ || ||

ˆˆ ˆ ˆ( ) || ||

=

= minus minus = minus

T T

T T T

h h h

T

h h

q W W q

W F F W q kF W

W Gq W kG W

2

h hm

d d d

K

M s C s K

=

+ +

An approximation of the continuous functionThe prescribed

impedance model

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

Control design

1 2 3 4 5 6 7 8

20-May-21 | 39Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

1

1

[ ]

minus

minus

= +

= minus

=

=

h h h h d

h e

h e p d

T T T

d d d

A B e

A k

B k

e e e

( )1

+=

+

p d

e

sG s

k s

The human transfer function [99] Purpose τh rarr ed

0

0

= + = +

= minus

q d qd e

e

h h hh e

A e Be X AX Buu

B A u KX

A new augmented state

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

IRL based on LQR solution

Control design

1 2 3 4 5 6 7 8

20-May-21 | 40Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

The performance index

( )T T T

m d d d h h h e et

J e Q e Q u Ru d

= + +

10 T TA P PA Q PB R BPminus= + + minus

The algebraic Riccati equation [83]

0

0

q d qd

e

h h hh

e

A e BeX u

B A

u KX

= = +

= minus

A new augmented state

IRL solutionRL solution

An optimal feeback control

To minimize

solve

The infinite-horizon quadratic cost

Control design

The IRL Bellman equation for time interval ΔT gt 0

Proof of convergence See in Appendix 62

1 2 3 4 5 6 7 8

( ) 1

( )

arg min ( ) ( )e

T

e eu t

u V t X t u R B PX

minus

= = minus

( )7 7( ( )) ( ) ( ) ( ( ))t T

T T T

tV X t X Q K RK X d V X t T

+

= + minus +

( )7 ( ( )) ( ) ( )T T T T

tV X t X Q K RK X d X PX

= + =

( ) ( ) ( )T TA BK P P A BK Q K RKminus + minus = minus +

where P is the real symmetric positive definite solution

( )( ) ( ) = ( ) ( )

( ) ( )

t tT T T T

t

T

X t PX t X Q K RK X d

X t t PX t t

+

+

+ + +

The IRL Bellman equation can be written as

20-May-21 | 41Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 61 Continuous-time linear policy iteration algorithm

1 Start with an admissible control input

0 0

1= minuseu K X

2 Policy evaluation given a control policy ui find the

Pi using off-policy Bellman equation

1 1

1

+ minus= minusi T i

eu R B P X

3 Policy Improvement update the control policy

Online implementation

Control design

1 2 3 4 5 6 7 8

( )( )( ) ( )= ( ) ( )

( ) ( )

t t TT i T i i T

t

T i

X t P X t X Q K RK X d

X t t P X t t

+

+

+ + +

Begin

1 2

1 2

1

( ) ( )

( ) ( ) ( )

( )

N i i i

Ti i N i

i T

t t t

K K K

p

minus

= = minus +

=

=

Solving for the cost using least squares

1|| ||i ip p minusminus

0

1 10 1 ( ) 0P i K A BK= = minus

Initialization

1 1i T i

eu R B P Xminus minus= minus

Policy update

End

No

Yes

1i irarr +

20-May-21 | 42Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 62 The tracking error of the trajectory of the robot system and

the prescribed impedance model

Fig 63 Reference trajectory versus the state of the robot systems

during and after learning

κe = 00025 S = 1 R = 10 T = 0001

A = minus4006 B = minus0002

υ = 09 γ = 09 δΔ = 065 ε = 01

ε-decay = 099 K = 10 β = 01

LQR solution

RL aproximation

Position controlKs = 20 F = 10 G = 1 μ1 = μ2 = 5 k = 01

= 100 κp = 20 κd = 10 and ke = 018 s

θd = 10sin(02πt) Md = 1 Kg Bd = 718

Nsmm-1 Kd =1576 Nmm-1 Kh = 0046

Parameters

Z

Simulation

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 43Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 64 The output trajectory and impedance model in the inner loop

Fig 65 The desired trajectory and output trajectory the outer loop

Fig 66 Interaction force

Perfectly performance of the human-

robot cooperation

A little deviation between the desired

and actual trajectory

The human interaction force is reduced

after learning and the optimal set of the

prescribed impedance model is found by

the outer-loop controller

Experiment

Validation results

Discussion

1 2 3 4 5 6 7 8

20-May-21 | 44Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Conclusion and Future works

Introduction

7

4

5

6

C

H

A

P

T

E

R

1

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Force Sensorless Reflecting Control for BHTS7

1 2 3 4 5 6 7 8

20-May-21 | 45Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

3 The effectiveness of the proposed control method is verified in

the different working conditions

2The stability and finite-time convergence characteristic of the closed-

loop control are theoretically analyzed by the Lyapunov approach

The first time the hybrid control based

on a fast finite-time NTSMC and AFOB1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 46Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

( )1 2id = minus +

1 1

2

= minus + = minus +

id

T

i i id i

T

i ia = minus

The external torque estimation

[ ]

1 2ˆ( ) = + + minus minus minus +iv

i i ir i i i i i i i i i i idM C K sign s s s

( )i i i is = +

[ ]

2

1 2

if 0 or 0 and ( )

( ) if 0 and

i

i i i i i i

i i

i i i i i i i i

s s

z z sign s

= =

+

An adaptive FSOB gain

where

where

The torque control design by style-FNTSMC

The modified FNTSM surface design

Master Slave

Environment

APAM Slave

system

Slave

Controller

Estimator

Oper

ator

Master

Controller

APAM Master

system

Estimator

Impendence

filter

+

++

-

+

-

+

+md

m s

s

sd

ˆsdˆmd

mf

m

d

Control design

1 2 3 4 5 6 7 8

20-May-21 | 47Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig 71 Torque estimation performances

λi = 8 pi = 5 qi = 7 ςi = 10-4 νi = 911

εi = 01 ρ1i = 10 ρ2i = 2

Proposed control

PID control KP = 38 KI = 25 KD = 015

Observersfor NDO Lh = Le = 10 for RTOB β = 350 rads

for AFOB κi(0) = 01 λi = 10 ηi = 0 m = 01

a = 10

RMSEs associated with NDO RTOB and proposed algorithm

are calculated of 0211 0192 and 0076 respectively

Parameters

Scenarios- The free space situation θd (t) = 5π(1 - 018t)sin(072πt)

- The contact and recovery situation soft environment Ke = 500 Nmm-1 Be = 4 Nsmm-1

stiff environment Ke = 2800 Nmm-1 Be = 20 Nsmm-1

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 48Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results

Fig. 7.2: Position tracking performances and tracking errors; transparency performance and transparency error in "soft" contact; transparency performance and transparency error in "stiff" contact.


Summary
- Finite-time convergence of the tracking errors.
- Fast transient response.
- Elimination of the uncertainties and of the noise effect in the obtained force sensing via the new AFOB.
- Robust performance against uncertainties and disturbances in the system dynamics.
- Stability of the overall system and transparency performance are attained simultaneously.

Great transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and in different working conditions.


CONTENTS
1. Introduction
2. Modeling and platform setup based on the APAM configuration
3. Adaptive Gain ITSMC with TDE scheme
4. Adaptive Finite-Time Force Sensorless Control scheme
5. Model-free control for Optimal contact force in Unknown Environment
6. Optimized Human-Robot Interaction force control using IRL
7. Force Sensorless Reflecting Control for BHTS
8. Conclusion and Future works


8 CONCLUSION AND FUTURE WORKS

Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions; its uncertainties and disturbances are handled.
- Validated the significant effectiveness on finite-time force tracking problems of the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding a highly effective position control.
- Achieved good transparency performance with both force feedback and position tracking simultaneously by the new adaptive force estimation combined with separate FNTSMC schemes.


Future works
- The robot system model can be improved by robust identification approaches to raise the control performance in joint space.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both the master and slave sides; the computational complexity should also be reduced with the data size.
- Based on the prototype developed in this thesis, a high-quality commercial teleoperation device for the rehabilitation field can be developed as a safer solution.

A PRESENTATION OF DOCTORAL DISSERTATION
Presenter: Cong Phat Vo
Department of Mechanical Engineering, University of Ulsan, Korea

Thank you for attending.



5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Implementation procedure: LQR solution (offline)

The main objective of this work is to determine the optimal desired force F_d, i.e., the minimum force that minimizes the position error e_r:

F_d(t) = κ_e e_r(t)

The discounted cost function:

J_s(e_r(t), F_d(t)) = ∫_t^∞ e^(-γ(τ-t)) [e_r^T(τ) S e_r(τ) + F_d^T(τ) R F_d(τ)] dτ

The algebraic Riccati equation:

A^T P A + S - P + A^T P B κ_e = 0

The feedback control gain:

κ_e = -(B^T P B + R)^(-1) B^T P A

This offline solution requires the model matrices A and B; the RL solution and the IRL approximation remove that requirement.
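The offline LQR pipeline on this slide (iterate the Riccati equation to its fixed point, then form the gain κ_e) can be sketched numerically. The scalar system matrices below are illustrative assumptions, not the identified environment model.

```python
import numpy as np

def dlqr_gain(A, B, S, R, iters=500):
    """Iterate the discrete algebraic Riccati equation
       P = A^T P A + S - A^T P B (B^T P B + R)^{-1} B^T P A
    to a fixed point, then return the feedback gain
       K = -(B^T P B + R)^{-1} B^T P A,   so that F_d = K e_r."""
    P = S.copy()
    for _ in range(iters):
        BtPB = B.T @ P @ B + R
        P = A.T @ P @ A + S - A.T @ P @ B @ np.linalg.inv(BtPB) @ B.T @ P @ A
    K = -np.linalg.inv(B.T @ P @ B + R) @ B.T @ P @ A
    return K, P

# Illustrative (hypothetical) scalar error dynamics e_{r,k+1} = a e_{r,k} + b F_{d,k}
A = np.array([[0.95]]); B = np.array([[0.1]])
S = np.array([[1.0]]);  R = np.array([[10.0]])
K, P = dlqr_gain(A, B, S, R)
# K is the offline kappa_e; the closed loop A + B K must be stable
```

Once κ_e is computed offline, the optimal desired force is simply F_d(t) = κ_e e_r(t), which is what the online RL solution later recovers without knowing A and B.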


Implementation procedure: RL solution (online)

Knowing the model for all time is impossible, so the LQR solution is replaced by an online RL solution.

The discounted cost function:

V_s(e_r(t)) = ∫_t^∞ e^(-γ(τ-t)) r(e_r(τ), F_d(τ)) dτ,  where r(e_r, F_d) = e_r^T S e_r + F_d^T R F_d

The Hamilton-Jacobi-Bellman equation:

min_{F_d} [ V̇_s(e_r(t)) - γ V_s(e_r(t)) + r(t) ] = 0

The new algebraic Riccati equation:

e_r^T(t) (A^T P + P A + S - γP) e_r(t) + 2 e_r^T(t) P B F_d(t) + F_d^T(t) R F_d(t) = 0

With the quadratic form Q(e_r(t), F_d(t)) = V_s(e_r(t)) = X^T H X, the optimal desired force is

F_d(t) = -H_22^(-1) H_21 e_r(t)

The online solution uses off-policy RL to find the solution of the given new ARE.


Implementation procedure: IRL approximation

The approximator of the Q-value function:

Q̂(e_r, F_d) = Σ_{i=1}^N θ_i φ̄_i(e_r, F_d)

with Gaussian basis functions

φ_i(e_r, F_d) = exp( -(1/2) [e_r - c_ri, F_d - c_di] Σ_i^(-1) [e_r - c_ri, F_d - c_di]^T )

normalized as

φ̄_i(e_r, F_d) = φ_i(e_r, F_d) / Σ_{i=1}^N φ_i(e_r, F_d)

The reward (utility) function:

r(t) = e_r^T(t) S e_r(t) + F_d^T(t) R F_d(t)

The Q-function update law based on IRL:

Q(e_r(t), F_d(t)) = ∫_t^{t+Δt} e^(-γ(τ-t)) r(τ) dτ + e^(-γΔt) Q(e_r(t+Δt), F_d(t+Δt))

The parameter update law adjusts θ by gradient descent so as to minimize the temporal-difference error between Q̂(e_r(t), F_d(t)) and this IRL target.
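The normalized-Gaussian approximator and its temporal-difference update can be sketched as follows. The centers, widths, learning rate, and discount below are illustrative assumptions, not the thesis settings.

```python
import numpy as np

# Sketch of the normalized Gaussian Q-function approximator and one IRL
# TD step. All hyperparameters here are illustrative assumptions.
rng = np.random.default_rng(0)
centers = rng.uniform(-1, 1, size=(25, 2))   # c_i = (c_i^r, c_i^d)
inv_sigma = np.eye(2) / 0.1                  # shared Sigma_i^{-1}
theta = np.zeros(25)

def phi_bar(er, fd):
    """Normalized Gaussian activations phi_bar_i(e_r, F_d)."""
    d = np.array([er, fd]) - centers
    phi = np.exp(-0.5 * np.einsum('ij,jk,ik->i', d, inv_sigma, d))
    return phi / phi.sum()

def q_hat(er, fd):
    return theta @ phi_bar(er, fd)

def td_update(er, fd, reward_int, er_next, fd_next, gamma_dt=0.99, alpha=0.5):
    """One IRL TD step: target = integrated reward + discounted next Q;
    theta moves down the gradient of 0.5 * delta^2."""
    global theta
    target = reward_int + gamma_dt * q_hat(er_next, fd_next)
    delta = q_hat(er, fd) - target
    theta -= alpha * delta * phi_bar(er, fd)
    return delta

delta0 = td_update(0.5, 0.2, 1.0, -0.5, -0.1)  # first TD error on a sample transition
```

Repeating `td_update` on observed transitions drives the TD error toward zero, which is exactly the "minimize temporal-difference error" step of the IRL approximation.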


Validation results

Fig. 5.1: RL in unknown environments during the learning process. Fig. 5.2: Learning curves.

Parameters:
- LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = -4006, B = -0.002
- RL approximation: υ = 0.9, γ = 0.9, δ_Δ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position/force control: K_Pf = 0.001, K_If = 0.02, λ = 0.2, η_Δ = 0.1, ρ_1 = 0.5, ρ_2 = 20, ν = 9/11, α = 5/7
- Environment: C_e = 0.5 N·s/mm, K_e = 2 N/mm, r_0 = 165 mm, x_d = 39.917 mm, x_e = 35 mm


Validation results

Fig. 5.3: Desired force/position tracking response.

Three steps are required to complete the experiment:
1. Identification of the environment.
2. Obtaining the ARE solution using the environmental estimation (LQR solution).
3. The complete experiment for the position/force controller (IRL approximation).

Near-optimal performance is achieved without knowledge of the environment dynamics.


6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Proposed control idea
1. The stability of the closed-loop control is theoretically demonstrated.
2. Find the optimal parameters of the impedance model that assure motion tracking and also assist the human in performing the task in real time with minimum effort.
3. The unknown human dynamics, as well as the optimal overall human-robot system efficiency, are verified in experimental trials.


Control design: robot-specific inner loop and task-specific outer loop

The prescribed impedance model, from the human torque to the model trajectory:

K_h / (M_d s^2 + C_d s + K_d)

The control torque in task space combines a tracking-error feedback term K_s(ė_h + μ_1 e_h + μ_2 ∫_0^t e_h dτ), an NN functional approximation of the unknown robot dynamics with adaptive weight update laws (Ŵ_1, Ŵ_2, bounded through ||Z||), and the estimated human torque τ̂_h.

[Block diagram: Human Operator → Prescribed Impedance Model → torque control law → haptic interface based on APAM; an NN functional approximation serves the inner loop and an IRL-based optimal law (gain K_h) the outer loop; signals τ_h, θ_d, θ_m, e_h, e_d, ξ.]
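The prescribed impedance model K_h/(M_d s^2 + C_d s + K_d) can be simulated with a simple integrator. The parameter values assume the chapter-6 simulation figures read M_d = 1, C_d = 7.18, K_d = 15.76, K_h = 0.046; the step input is purely illustrative.

```python
import numpy as np

def impedance_response(tau_h, dt=1e-3, Md=1.0, Cd=7.18, Kd=15.76, Kh=0.046):
    """Integrate the prescribed impedance model
       Md x'' + Cd x' + Kd x = Kh * tau_h
    with semi-implicit Euler; returns the model trajectory."""
    x, v = 0.0, 0.0
    out = []
    for u in tau_h:
        a = (Kh * u - Cd * v - Kd * x) / Md  # acceleration from the model
        v += a * dt
        x += v * dt
        out.append(x)
    return np.array(out)

t = np.arange(0, 5, 1e-3)
x = impedance_response(np.ones_like(t) * 10.0)  # hypothetical constant human torque
# Steady state settles at Kh * tau / Kd, the compliant equilibrium
```

Softer K_d (or larger K_h) lets the same human torque move the model further, which is exactly the trade-off the outer-loop IRL tunes.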


Control design: augmented error dynamics

The human operator is modeled by a transfer function G_e(s) [99] that maps the interaction torque τ_h to the tracking error e_d, yielding linear human error dynamics ė_h = A_h e_h + B_h e_d with A_h and B_h built from the human gains k_e, κ_p, and κ_d.

Stacking the human error dynamics with the prescribed-model error e_d = [e_d, ė_d]^T gives a new augmented state X:

Ẋ = A X + B u_e,  u_e = -K X


Control design: IRL based on the LQR solution

The performance index (infinite-horizon quadratic cost) to minimize:

J_m = ∫_t^∞ (e_d^T Q_d e_d + e_h^T Q_h e_h + u_e^T R u_e) dτ

For the augmented state Ẋ = A X + B u_e with u_e = -K X, solve the algebraic Riccati equation [83]:

0 = A^T P + P A + Q - P B R^(-1) B^T P

An optimal feedback control:

u_e = arg min_{u_e} V(t, X(t)) = -R^(-1) B^T P X

RL solution: the value function

V(X(t)) = ∫_t^∞ X^T (Q + K^T R K) X dτ = X^T P X

where P is the real symmetric positive-definite solution of

(A - BK)^T P + P (A - BK) = -(Q + K^T R K)

IRL solution: the IRL Bellman equation for a time interval ΔT > 0,

V(X(t)) = ∫_t^{t+ΔT} X^T (Q + K^T R K) X dτ + V(X(t + ΔT))

can be written as

X^T(t) P X(t) = ∫_t^{t+ΔT} X^T (Q + K^T R K) X dτ + X^T(t+ΔT) P X(t+ΔT)

Proof of convergence: see Appendix 6.2.


Online implementation

Fig. 6.1: Continuous-time linear policy iteration algorithm.

1. Initialization: start with an admissible control input u_e^0 = -K^1 X (P^0 = 0, i = 1, with K^1 chosen so that A - BK^1 is stable).
2. Policy evaluation: given the control policy u^i, find P^i from the off-policy IRL Bellman equation

   X^T(t) P^i X(t) = ∫_t^{t+ΔT} X^T (Q + (K^i)^T R K^i) X dτ + X^T(t+ΔT) P^i X(t+ΔT)

   solving for the cost using least squares.
3. Policy improvement: update the control policy u_e^{i+1} = -R^(-1) B^T P^i X.
4. Set i → i + 1 and repeat from step 2 until ||p^i - p^{i-1}|| falls below a tolerance.
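The steps above form a continuous-time linear policy iteration (Kleinman-style). A minimal model-based sketch replaces the least-squares IRL Bellman fit of step 2 with an exact Lyapunov solve; the system matrices are illustrative assumptions, not the augmented human-robot model.

```python
import numpy as np

def lyap(Ac, Qc):
    """Solve Ac^T P + P Ac = -Qc via the Kronecker-product identity."""
    n = Ac.shape[0]
    M = np.kron(np.eye(n), Ac.T) + np.kron(Ac.T, np.eye(n))
    return np.linalg.solve(M, -Qc.reshape(-1)).reshape(n, n)

def policy_iteration(A, B, Q, R, K0, iters=20):
    """Repeat policy evaluation (Lyapunov solve for P^i) and policy
    improvement K^{i+1} = R^{-1} B^T P^i, converging to the ARE solution."""
    K = K0
    for _ in range(iters):
        P = lyap(A - B @ K, Q + K.T @ R @ K)   # policy evaluation
        K = np.linalg.inv(R) @ B.T @ P         # policy improvement
    return K, P

# Illustrative stable second-order system, so K^1 = 0 is admissible
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2); R = np.array([[1.0]])
K, P = policy_iteration(A, B, Q, R, np.zeros((1, 2)))
# P satisfies A^T P + P A + Q - P B R^{-1} B^T P = 0
```

The IRL variant evaluates the same Bellman identity from measured trajectories (least squares over data windows of length ΔT), so no model matrices are needed online; the fixed point is the same P.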


Validation results: simulation

Fig. 6.2: Tracking error between the trajectory of the robot system and the prescribed impedance model. Fig. 6.3: Reference trajectory versus the state of the robot system during and after learning.

Parameters:
- LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = -4006, B = -0.002
- RL approximation: υ = 0.9, γ = 0.9, δ_Δ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: K_s = 20, F = 10, G = 1, μ_1 = μ_2 = 5, k = 0.1, ||Z̄|| = 100, κ_p = 20, κ_d = 10, k_e = 0.18 s
- Impedance model and trajectory: θ_d = 10 sin(0.2πt), M_d = 1 kg, B_d = 7.18 N·s/mm, K_d = 15.76 N/mm, K_h = 0.046


Validation results: experiment

Fig. 6.4: The output trajectory and the impedance model in the inner loop. Fig. 6.5: The desired trajectory and the output trajectory in the outer loop. Fig. 6.6: Interaction force.

Discussion:
- Near-perfect performance of the human-robot cooperation.
- Only a small deviation between the desired and actual trajectories.
- The human interaction force is reduced after learning, and the optimal set of prescribed-impedance-model parameters is found by the outer-loop controller.


Page 32: A PRESENTATION OF DOCTORAL DISSERTATION

20-May-21 | 32Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

LQR solution RL solution

impossible for all time

The Hamilton-Jacobi-Bellman equation

( ) ( ) ( )min ( ) ( ) ( ) 0

minus minus + minus =

d

t

s rtF

V e t r t r e dt

The new Algebraic Riccati equation

( )( ) ( )

2 ( ) ( ) ( ) ( ) 0

T T

r r

T T

r d d d

e t A P PA S P e t

e t PBF t F t RF t

+ + minus

+ + =

The optimal desired force 1

22 21( ) ( )d rF t H H e tminus= minus

( ) ( ) ( ) ( ) ( )= = T

r d s rQ e t F t V e t X HX

IRL approximation

The discounted cost function

( ) ( )

( )

( )( ) ( ) ( )

where ( ) ( ) ( ) ( ) ( ) ( )

t

s r r dt

T T

r d r r d d

V e t r e F e d

r e F e Se F RF

minus minus=

= +

Implementation procedure

1 2 3 4 5 6 7 8

( ) ( )=d e rF t e t

online soliton using the off-policy RL to

find the solution of the given new ARE

20-May-21 | 33Presenter Cong Phat Vo

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The main objective of this work is to determine the optimal desired force

Fd which is the minimum force that minimizes the position error er

LQR solution RL solution

minimize temporal difference error

The approximator of the Q-value function

( )( ) ( )1

ˆ N

r d i r d i

i

Q e F e F =

=

The parameter update law

11( ) exp

2

T

i r d r i d i i r i d ie F e c F c e c F c minus = minus minus minus minus minus

( )

( ) ( ) ( ) ( )

ˆ ( ) ( ) ( )1( ) ( )

( )

=

= minus +

r d

t t t t

Q e t F t tt t

t t

IRL approximation

Q-function update law based on IRL

( )

( )

( )( ) ( ) ( )

( ) ( )

t tt

r dt

t

r d

Q e t F t r e d

e Q e t t F t t

+

minus minus

minus

=

+ + +

( ) ( ) ( ) ( ) ( )T T

r r d dr t e t Se t F t RF t= +

The reward function of the utility function

Implementation procedure

1 2 3 4 5 6 7 8

1

( )( )

( )

i r di r d N

i r di

e Fe F

e F

=

=

( ) ( )=d e rF t e t

20-May-21 | 34Presenter Cong Phat Vo

Fig 51 RL in unknown environments during the

learning process

Fig 52 Learning curves

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

κe = 00025 S = 1 R = 10 T = 0001 A = minus4006

B = minus0002

υ = 09 γ = 09 δΔ = 065 ε = 01 ε-decay = 099

K = 10 β = 01

LQR solution

RL aproximation

PositionForce controlKPf = 0001 KIf = 002 λ = 02 ηΔ = 01 ρ1 = 05

ρ2 = 20 v = 911 α = 57

EnvironmentCe = 05 Nsmm-1 Ke = 2 N mm-1 r0 = 165 mm

xd = 39917 mm xe = 35 mm

Parameters

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 35Presenter Cong Phat Vo

Fig 53 Desired forceposition tracking response

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Three steps are required to complete

the experiment

1-the identification of the environment

2-obtain the ARE solution using the

environmental estimation

3-the complete experiment for the

positionforce controller

IRL approximation

Achieving near-optimal performance without

knowledge of the environment dynamics

LQR solution

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 36Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

4

5

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Optimized Human-Robot Interaction force control using IRL6

1 2 3 4 5 6 7 8

20-May-21 | 37Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

3 The unknown human dynamics as well as the optimal overall

human-robot system efficiency is verified in experimental trials

2

Find the optimal parameters of the impedance model to

assure motion tracking and also assist the human with

minimum effort to perform the task in real time

The stability of the closed-loop control is theoretically

demonstrated

1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 38Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

( )1 20

ˆ ( )

ˆ (|| || )

= + + +

+ + minus

t

s h h h

q B h h

K e e e d

K Z Z K

The control torque in task space

1 2

1 2 1

2 1 2

ˆ ˆˆ ( ) ( )

ˆ ˆˆ ˆ ˆ || ||

ˆˆ ˆ ˆ( ) || ||

=

= minus minus = minus

T T

T T T

h h h

T

h h

q W W q

W F F W q kF W

W Gq W kG W

2

h hm

d d d

K

M s C s K

=

+ +

An approximation of the continuous functionThe prescribed

impedance model

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

Control design

1 2 3 4 5 6 7 8

20-May-21 | 39Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

1

1

[ ]

minus

minus

= +

= minus

=

=

h h h h d

h e

h e p d

T T T

d d d

A B e

A k

B k

e e e

( )1

+=

+

p d

e

sG s

k s

The human transfer function [99] Purpose τh rarr ed

0

0

= + = +

= minus

q d qd e

e

h h hh e

A e Be X AX Buu

B A u KX

A new augmented state

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

IRL based on LQR solution

Control design

1 2 3 4 5 6 7 8

20-May-21 | 40Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

The performance index

( )T T T

m d d d h h h e et

J e Q e Q u Ru d

= + +

10 T TA P PA Q PB R BPminus= + + minus

The algebraic Riccati equation [83]

0

0

q d qd

e

h h hh

e

A e BeX u

B A

u KX

= = +

= minus

A new augmented state

IRL solutionRL solution

An optimal feeback control

To minimize

solve

The infinite-horizon quadratic cost

Control design

The IRL Bellman equation for time interval ΔT gt 0

Proof of convergence See in Appendix 62

1 2 3 4 5 6 7 8

( ) 1

( )

arg min ( ) ( )e

T

e eu t

u V t X t u R B PX

minus

= = minus

( )7 7( ( )) ( ) ( ) ( ( ))t T

T T T

tV X t X Q K RK X d V X t T

+

= + minus +

( )7 ( ( )) ( ) ( )T T T T

tV X t X Q K RK X d X PX

= + =

( ) ( ) ( )T TA BK P P A BK Q K RKminus + minus = minus +

where P is the real symmetric positive definite solution

( )( ) ( ) = ( ) ( )

( ) ( )

t tT T T T

t

T

X t PX t X Q K RK X d

X t t PX t t

+

+

+ + +

The IRL Bellman equation can be written as

20-May-21 | 41Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 61 Continuous-time linear policy iteration algorithm

1 Start with an admissible control input

0 0

1= minuseu K X

2 Policy evaluation given a control policy ui find the

Pi using off-policy Bellman equation

1 1

1

+ minus= minusi T i

eu R B P X

3 Policy Improvement update the control policy

Online implementation

Control design

1 2 3 4 5 6 7 8

( )( )( ) ( )= ( ) ( )

( ) ( )

t t TT i T i i T

t

T i

X t P X t X Q K RK X d

X t t P X t t

+

+

+ + +

Begin

1 2

1 2

1

( ) ( )

( ) ( ) ( )

( )

N i i i

Ti i N i

i T

t t t

K K K

p

minus

= = minus +

=

=

Solving for the cost using least squares

1|| ||i ip p minusminus

0

1 10 1 ( ) 0P i K A BK= = minus

Initialization

1 1i T i

eu R B P Xminus minus= minus

Policy update

End

No

Yes

1i irarr +

20-May-21 | 42Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 62 The tracking error of the trajectory of the robot system and

the prescribed impedance model

Fig 63 Reference trajectory versus the state of the robot systems

during and after learning

κe = 00025 S = 1 R = 10 T = 0001

A = minus4006 B = minus0002

υ = 09 γ = 09 δΔ = 065 ε = 01

ε-decay = 099 K = 10 β = 01

LQR solution

RL aproximation

Position controlKs = 20 F = 10 G = 1 μ1 = μ2 = 5 k = 01

= 100 κp = 20 κd = 10 and ke = 018 s

θd = 10sin(02πt) Md = 1 Kg Bd = 718

Nsmm-1 Kd =1576 Nmm-1 Kh = 0046

Parameters

Z

Simulation

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 43Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 64 The output trajectory and impedance model in the inner loop

Fig 65 The desired trajectory and output trajectory the outer loop

Fig 66 Interaction force

Perfectly performance of the human-

robot cooperation

A little deviation between the desired

and actual trajectory

The human interaction force is reduced

after learning and the optimal set of the

prescribed impedance model is found by

the outer-loop controller

Experiment

Validation results

Discussion

1 2 3 4 5 6 7 8

20-May-21 | 44Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Conclusion and Future works

Introduction

7

4

5

6

C

H

A

P

T

E

R

1

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Force Sensorless Reflecting Control for BHTS7

1 2 3 4 5 6 7 8

20-May-21 | 45Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

3 The effectiveness of the proposed control method is verified in

the different working conditions

2The stability and finite-time convergence characteristic of the closed-

loop control are theoretically analyzed by the Lyapunov approach

The first time the hybrid control based

on a fast finite-time NTSMC and AFOB1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 46Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

( )1 2id = minus +

1 1

2

= minus + = minus +

id

T

i i id i

T

i ia = minus

The external torque estimation

[ ]

1 2ˆ( ) = + + minus minus minus +iv

i i ir i i i i i i i i i i idM C K sign s s s

( )i i i is = +

[ ]

2

1 2

if 0 or 0 and ( )

( ) if 0 and

i

i i i i i i

i i

i i i i i i i i

s s

z z sign s

= =

+

An adaptive FSOB gain

where

where

The torque control design by style-FNTSMC

The modified FNTSM surface design

Master Slave

Environment

APAM Slave

system

Slave

Controller

Estimator

Oper

ator

Master

Controller

APAM Master

system

Estimator

Impendence

filter

+

++

-

+

-

+

+md

m s

s

sd

ˆsdˆmd

mf

m

d

Control design

1 2 3 4 5 6 7 8

20-May-21 | 47Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig 71 Torque estimation performances

λi = 8 pi = 5 qi = 7 ςi = 10-4 νi = 911

εi = 01 ρ1i = 10 ρ2i = 2

Proposed control

PID control KP = 38 KI = 25 KD = 015

Observersfor NDO Lh = Le = 10 for RTOB β = 350 rads

for AFOB κi(0) = 01 λi = 10 ηi = 0 m = 01

a = 10

RMSEs associated with NDO RTOB and proposed algorithm

are calculated of 0211 0192 and 0076 respectively

Parameters

Scenarios- The free space situation θd (t) = 5π(1 - 018t)sin(072πt)

- The contact and recovery situation soft environment Ke = 500 Nmm-1 Be = 4 Nsmm-1

stiff environment Ke = 2800 Nmm-1 Be = 20 Nsmm-1

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 48Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig 72 Position tracking performances

Fig 72 Position tracking error performances

Fig 72 Transparency performance in ldquosoftrdquo

Fig 72 Transparency error in soft contract

Fig 72 Transparency performance in ldquostiffrdquo

Fig 72 Transparency error in stiff contract

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 49Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Finite-time convergence of the

tracking errors

Eliminating the uncertainties the

noise effect in the obtained force

sensing via new AFOB

Robust performance against uncertainties

and disturbances in the system dynamics

Stability of the overall system and the

transparency performance can be

attained simultaneously

Fast transient response

Summary

The great transparency performance and global stability of the BHTS under the proposed

control scheme are achieved despite the uncertainties and in different working conditions

1 2 3 4 5 6 7 8

20-May-21 | 50Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Introduction

4

5

6

C

H

A

P

T

E

R

1

7

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Conclusion and Future works78

1 2 3 4 5 6 7 8

20-May-21 | 51 | Presenter: Cong Phat Vo

8. CONCLUSION AND FUTURE WORKS

Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions (its uncertainties and disturbances are handled).
- Validated the effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures good position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding highly effective position control.
- Achieved good transparency performance with both force feedback and position tracking simultaneously by combining the new adaptive force estimation with separate FNTSMC schemes.

20-May-21 | 52 | Presenter: Cong Phat Vo

8. CONCLUSION AND FUTURE WORKS

Future works
- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both master and slave sides; note also that the computational complexity should be reduced by limiting the data size.
- Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device for the rehabilitation field as a safer solution.

20-May-21 | 53

A PRESENTATION OF DOCTORAL DISSERTATION
Presenter: Cong Phat Vo
Department of Mechanical Engineering, University of Ulsan, Korea

Thank you for attending.


20-May-21 | 33 | Presenter: Cong Phat Vo

5. MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

The main objective of this work is to determine the optimal desired force F_d, i.e., the minimum force that minimizes the position error e_r.

LQR solution / RL solution: minimize the temporal-difference error.

The approximator of the Q-value function:
\hat{Q}(e_r, F_d) = \sum_{i=1}^{N} \omega_i \, \psi_i(e_r, F_d)

with Gaussian basis functions
\phi_i(e_r, F_d) = \exp\!\Big( -\tfrac{1}{2} \big[ (e_r - c_{r,i})^2 + (F_d - c_{d,i})^2 \big] \Big)
normalized as
\psi_i(e_r, F_d) = \phi_i(e_r, F_d) \Big/ \sum_{j=1}^{N} \phi_j(e_r, F_d)

The parameter update law (gradient descent on the temporal-difference error \delta):
\dot{\omega}(t) = -\alpha\,\delta(t)\,\frac{\partial \hat{Q}}{\partial \omega} = -\alpha\,\delta(t)\,\psi(e_r, F_d)

IRL approximation. The Q-function update law based on IRL:
\hat{Q}\big(e_r(t), F_d(t)\big) = \int_{t}^{t+\Delta t} e^{-\gamma(\tau - t)}\, r\big(e_r(\tau), F_d(\tau)\big)\, d\tau + e^{-\gamma \Delta t}\, \hat{Q}\big(e_r(t+\Delta t), F_d(t+\Delta t)\big)

The reward (utility) function:
r(t) = e_r^T(t)\, S\, e_r(t) + F_d^T(t)\, R\, F_d(t)

Implementation procedure: the desired force is generated as F_d(t) = \kappa_e\, e_r(t).
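The normalized-RBF Q-function approximation and its temporal-difference update can be sketched as follows. This is a minimal illustration, not the dissertation's implementation: the centers, basis width, learning rate, and discount are illustrative assumptions, while the quadratic reward gains S = 1 and R = 10 follow the parameter slide.

```python
import numpy as np

class QApprox:
    """Normalized Gaussian-RBF approximator of Q(e_r, F_d) (illustrative sketch)."""

    def __init__(self, centers, sigma=1.0, lr=0.1):
        self.c = np.asarray(centers, float)   # N x 2 centers over (e_r, F_d)
        self.sigma = sigma                    # shared basis width (assumption)
        self.w = np.zeros(len(self.c))        # weights omega_i
        self.lr = lr                          # learning rate alpha (assumption)

    def psi(self, e_r, f_d):
        # Gaussian bases phi_i, normalized so they sum to one
        x = np.array([e_r, f_d])
        phi = np.exp(-0.5 * np.sum((self.c - x) ** 2, axis=1) / self.sigma ** 2)
        return phi / np.sum(phi)

    def value(self, e_r, f_d):
        return float(self.w @ self.psi(e_r, f_d))

    def td_update(self, e_r, f_d, r, e_r2, f_d2, gamma=0.9):
        # one-sample temporal-difference error and gradient step on omega
        delta = r + gamma * self.value(e_r2, f_d2) - self.value(e_r, f_d)
        self.w += self.lr * delta * self.psi(e_r, f_d)
        return delta

def reward(e_r, f_d, S=1.0, R=10.0):
    # quadratic utility r = e_r' S e_r + F_d' R F_d (scalar case)
    return S * e_r ** 2 + R * f_d ** 2
```

In use, samples of (e_r, F_d) collected along the trajectory with F_d = kappa_e * e_r would be fed through `td_update` until the TD error settles.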

20-May-21 | 34 | Presenter: Cong Phat Vo

5. MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Validation results
Fig. 5.1. RL in unknown environments during the learning process
Fig. 5.2. Learning curves

Parameters
- LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δ_Δ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position/force control: K_Pf = 0.001, K_If = 0.02, λ = 0.2, η_Δ = 0.1, ρ_1 = 0.5, ρ_2 = 2.0, v = 9/11, α = 5/7
- Environment: C_e = 0.5 N·s·mm⁻¹, K_e = 2 N·mm⁻¹, r_0 = 16.5 mm, x_d = 39.917 mm, x_e = 35 mm

20-May-21 | 35 | Presenter: Cong Phat Vo

5. MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Validation results
Fig. 5.3. Desired force/position tracking response

Three steps are required to complete the experiment:
1. Identify the environment.
2. Obtain the ARE solution using the environmental estimate (LQR solution).
3. Run the complete experiment with the position/force controller.

IRL approximation: achieves near-optimal performance without knowledge of the environment dynamics.

20-May-21 | 36 | Presenter: Cong Phat Vo

CONTENTS (Chapter 6: Optimized Human-Robot Interaction force control using IRL)

20-May-21 | 37 | Presenter: Cong Phat Vo

6. OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Proposed control idea
1. Find the optimal parameters of the impedance model to ensure motion tracking and to assist the human with minimum effort to perform the task in real time.
2. The stability of the closed-loop control is theoretically demonstrated.
3. The unknown human dynamics, as well as the optimal overall human-robot system efficiency, are verified in experimental trials.

20-May-21 | 38 | Presenter: Cong Phat Vo

6. OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design: robot-specific inner loop and task-specific outer loop.
Block diagram: the Human Operator applies τ_h to the Prescribed Impedance Model; its output θ_d (shaped through the Optimal Law-based IRL, the gain K_h, and an integrator 1/s) is tracked by the torque control law driving the APAM-based haptic interface with output θ_m; an NN functional approximation compensates the unknown robot dynamics.

The prescribed impedance model:
G_m(s) = \frac{K_h}{M_d s^2 + C_d s + K_d}

The control torque in task space:
\tau_s = K_s \Big( \dot e_h + \mu_1 e_h + \mu_2 \int_0^t e_h\, d\tau \Big) + \hat q(Z) + K_B \|\hat Z\| \hat Z - K_h e_h

An approximation of the continuous function, \hat q(Z) = \hat W_1^T \phi(\hat W_2^T Z), with weight update laws
\dot{\hat W}_1 = F \hat\phi\, s^T - k F \|s\| \hat W_1, \qquad \dot{\hat W}_2 = G Z \big( \hat\phi'^T \hat W_1 s \big)^T - k G \|s\| \hat W_2
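The prescribed impedance model above can be sketched numerically. This is a minimal sketch, assuming a semi-implicit Euler integration and a scalar channel; the gains M_d = 1, C_d = 7.18, K_d = 15.76, K_h = 0.046 follow the deck's simulation parameter list.

```python
import numpy as np

def impedance_response(tau_h, dt, M_d=1.0, C_d=7.18, K_d=15.76, K_h=0.046):
    """Integrate M_d*x'' + C_d*x' + K_d*x = K_h*tau_h; returns the shaped
    desired trajectory theta_d sample by sample (semi-implicit Euler)."""
    x, v = 0.0, 0.0
    out = np.empty(len(tau_h))
    for k, tau in enumerate(tau_h):
        a = (K_h * tau - C_d * v - K_d * x) / M_d   # acceleration
        v += a * dt
        x += v * dt
        out[k] = x
    return out
```

For a constant human torque the model settles to the static gain K_h / K_d, which is how the impedance parameters trade off assistance against stiffness.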

20-May-21 | 39 | Presenter: Cong Phat Vo

6. OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design: robot-specific inner loop and task-specific outer loop (same block diagram as the previous slide); IRL based on the LQR solution.

The human transfer function [99], with the purpose of shaping the response from τ_h to e_d:
G(s) = \frac{\kappa_d s + \kappa_p}{k_e s + 1}
realized in state-space form as
\dot{\xi}_h = A_h \xi_h + B_h e_d, \quad A_h = -k_e^{-1}, \quad B_h = k_e^{-1}[\kappa_p \;\; \kappa_d], \quad e_d = [e_d \;\; \dot e_d]^T

A new augmented state combines the tracking-error dynamics \dot e_d = A_q e_d + B_q u_e with the human model:
X = \begin{bmatrix} e_d \\ \xi_h \end{bmatrix}, \quad \dot X = \begin{bmatrix} A_q & 0 \\ B_h & A_h \end{bmatrix} X + \begin{bmatrix} B_q \\ 0 \end{bmatrix} u_e = A X + B u_e, \quad u_e = -K X

20-May-21 | 40 | Presenter: Cong Phat Vo

6. OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design. The performance index (infinite-horizon quadratic cost) to minimize:
J_m = \int_t^{\infty} \big( e_d^T Q_d e_d + \xi_h^T Q_h \xi_h + u_e^T R u_e \big)\, d\tau

subject to the augmented state dynamics \dot X = A X + B u_e with feedback u_e = -K X.

LQR solution: solve the algebraic Riccati equation [83]
0 = A^T P + P A + Q - P B R^{-1} B^T P

The optimal feedback control:
u_e^{\ast} = \arg\min_{u_e} V\big(t, X(t)\big) = -R^{-1} B^T P X

RL solution: the value function
V\big(X(t)\big) = \int_t^{\infty} X^T \big( Q + K^T R K \big) X\, d\tau = X^T P X
satisfies
V\big(X(t)\big) = \int_t^{t+T} X^T \big( Q + K^T R K \big) X\, d\tau + V\big(X(t+T)\big)
where P is the real symmetric positive-definite solution of the Lyapunov equation
(A - BK)^T P + P(A - BK) = -\big( Q + K^T R K \big)

IRL solution: for a time interval ΔT > 0, the IRL Bellman equation can be written as
X^T(t)\, P\, X(t) = \int_t^{t+\Delta T} X^T \big( Q + K^T R K \big) X\, d\tau + X^T(t+\Delta T)\, P\, X(t+\Delta T)

Proof of convergence: see Appendix 6.2.
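The LQR baseline above can be sketched directly with SciPy's Riccati solver, including the closed-loop Lyapunov identity that the IRL Bellman equation exploits. The matrices A, B, Q, R below are illustrative stand-ins, not the augmented human-robot model.

```python
import numpy as np
from scipy.linalg import solve_continuous_are

# Illustrative second-order augmented dynamics (assumption, not the thesis model)
A = np.array([[0.0, 1.0],
              [-4.006, -0.002]])
B = np.array([[0.0],
              [1.0]])
Q = np.eye(2)
R = np.array([[10.0]])

# Solve 0 = A'P + PA + Q - P B R^{-1} B' P
P = solve_continuous_are(A, B, Q, R)
K = np.linalg.solve(R, B.T @ P)        # optimal feedback gain, u_e = -K X

# Closed-loop Lyapunov identity: (A-BK)'P + P(A-BK) = -(Q + K'RK)
Acl = A - B @ K
lhs = Acl.T @ P + P @ Acl
rhs = -(Q + K.T @ R @ K)
```

The same identity, integrated over a window ΔT, is exactly the IRL Bellman equation, which is why P can be recovered from trajectory data without knowing A.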

20-May-21 | 41 | Presenter: Cong Phat Vo

6. OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig. 6.1. Continuous-time linear policy iteration algorithm

Online implementation
1. Start with an admissible control input u_e^0 = -K^1 X.
2. Policy evaluation: given a control policy u_e^i, find P^i from the off-policy Bellman equation
   X^T(t)\, P^i\, X(t) = \int_t^{t+\Delta T} X^T \big( Q + (K^i)^T R K^i \big) X\, d\tau + X^T(t+\Delta T)\, P^i\, X(t+\Delta T)
   solving for the cost parameters p^i by least squares over N sampled intervals.
3. Policy improvement: update the control policy u_e^{i+1} = -R^{-1} B^T P^i X.

Flow (Fig. 6.1): Begin → Initialization: P^0 = 0, i = 1, with K^1 chosen so that (A - B K^1) is stable → Policy evaluation (least squares) → Policy update u_e^i = -R^{-1} B^T P^{i-1} X → if \|p^i - p^{i-1}\| is below a tolerance, End; otherwise i → i + 1 and repeat.
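The iteration in Fig. 6.1 can be sketched in its model-based form (Kleinman's algorithm), where the least-squares evaluation step is replaced by a Lyapunov solve; the IRL version recovers the same P^i from measured trajectory data instead of A. The plant matrices below are illustrative assumptions with A already Hurwitz, so K^0 = 0 is an admissible starting policy.

```python
import numpy as np
from scipy.linalg import solve_continuous_lyapunov, solve_continuous_are

def policy_iteration(A, B, Q, R, K0, tol=1e-9, max_iter=50):
    """Continuous-time policy iteration: evaluate P for the current gain K,
    then improve K <- R^{-1} B' P, until P stops changing."""
    K = K0
    P_prev = np.zeros_like(Q)
    for _ in range(max_iter):
        Acl = A - B @ K
        # Policy evaluation: (A-BK)'P + P(A-BK) = -(Q + K'RK)
        P = solve_continuous_lyapunov(Acl.T, -(Q + K.T @ R @ K))
        # Policy improvement
        K = np.linalg.solve(R, B.T @ P)
        if np.linalg.norm(P - P_prev) < tol:
            break
        P_prev = P
    return P, K

# Illustrative stable plant (assumption): s^2 + 0.5 s + 4 is Hurwitz
A = np.array([[0.0, 1.0], [-4.0, -0.5]])
B = np.array([[0.0], [1.0]])
Q = np.eye(2)
R = np.array([[10.0]])
K0 = np.zeros((1, 2))

P, K = policy_iteration(A, B, Q, R, K0)
```

Under an admissible initial policy, the iterates converge to the unique ARE solution, which is the convergence property invoked in Appendix 6.2.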

20-May-21 | 42 | Presenter: Cong Phat Vo

6. OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Validation results (Simulation)
Fig. 6.2. Tracking error between the robot-system trajectory and the prescribed impedance model
Fig. 6.3. Reference trajectory versus the state of the robot system during and after learning

Parameters
- LQR solution: κ_e = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δ_Δ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: K_s = 20, F = 10, G = 1, μ_1 = μ_2 = 5, k = 0.1, Z̄ = 100, κ_p = 20, κ_d = 10, k_e = 0.18 s
- Impedance/human model: θ_d = 10 sin(0.2πt), M_d = 1 kg, B_d = 7.18 N·s·mm⁻¹, K_d = 15.76 N·mm⁻¹, K_h = 0.046

20-May-21 | 43 | Presenter: Cong Phat Vo

6. OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Validation results (Experiment)
Fig. 6.4. Output trajectory and impedance model in the inner loop
Fig. 6.5. Desired trajectory and output trajectory in the outer loop
Fig. 6.6. Interaction force

Discussion
- The human-robot cooperation performs well, with only a small deviation between the desired and actual trajectories.
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller.

20-May-21 | 44 | Presenter: Cong Phat Vo

CONTENTS (Chapter 7: Force Sensorless Reflecting Control for BHTS)

20-May-21 | 45 | Presenter: Cong Phat Vo

7. FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Proposed control idea
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed.
2. The stability and finite-time convergence characteristics of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified under different working conditions.

20-May-21 | 46 | Presenter: Cong Phat Vo

7. FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Control design
Block diagram: on the master side, the Operator drives the APAM master system through the Master Controller, with an Estimator recovering τ̂_md; on the slave side, the Slave Controller drives the APAM slave system against the Environment, with an Estimator recovering τ̂_sd; the two sides exchange θ_m, θ_s and the reflected torques τ_mf, τ_d through an impedance filter.

The modified FNTSM surface design:
s_i = \dot e_i + \lambda_i |e_i|^{p_i/q_i}\, \operatorname{sign}(e_i), \quad i \in \{m, s\}

The torque control design (FNTSMC style), using the external torque estimate τ̂_id:
\tau_i = M_i \ddot\theta_{ir} + C_i \dot\theta_i + K_i \theta_i - \hat\tau_{id} - \rho_{1i} |s_i|^{\nu_i} \operatorname{sign}(s_i) - \rho_{2i} s_i

The external torque τ̂_id is obtained from the adaptive force/torque observer (AFOB), whose observer gain κ_i adapts with the sliding variable:
\dot\kappa_i = \begin{cases} \rho_{1i} |s_i|\, \operatorname{sign}(|s_i| - \varepsilon_i), & \text{if } \kappa_i > 0, \text{ or } \kappa_i = 0 \text{ and } |s_i| \ge \varepsilon_i \\ \rho_{2i} z_i, & \text{if } \kappa_i = 0 \text{ and } |s_i| < \varepsilon_i \end{cases}
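The finite-time convergence induced by the FNTSM surface can be sketched in isolation: on the surface s_i = 0 the error obeys \dot e = -λ sig(e, ν), which reaches zero in the finite time T = |e_0|^{1-ν} / (λ(1-ν)). This is a minimal sketch assuming an ideal on-surface trajectory (no plant, no disturbance); λ = 8 and ν = 9/11 follow the deck's parameter list.

```python
import numpy as np

def sig(x, a):
    """Signed power sig(x, a) = |x|^a * sign(x) used in the sliding surface."""
    return np.abs(x) ** a * np.sign(x)

def settle_time(e0, lam, nu):
    # Closed-form finite settling time of e' = -lam * sig(e, nu), 0 < nu < 1
    return abs(e0) ** (1.0 - nu) / (lam * (1.0 - nu))

def simulate(e0=1.0, lam=8.0, nu=9 / 11, dt=1e-5, t_end=2.0, tol=1e-6):
    """Forward-Euler integration of the on-surface error dynamics; returns the
    final error and the first time |e| drops below tol."""
    e, t_hit = e0, None
    for k in range(int(t_end / dt)):
        e -= lam * sig(e, nu) * dt
        if t_hit is None and abs(e) < tol:
            t_hit = (k + 1) * dt
    return e, t_hit
```

The simulated hitting time lands strictly inside the analytic bound, whereas a linear surface (ν = 1) would only converge exponentially and never exactly reach zero.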

20-May-21 | 47 | Presenter: Cong Phat Vo

7. FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results
Fig. 7.1. Torque estimation performances

Parameters
- Proposed control: λ_i = 8, p_i = 5, q_i = 7, ς_i = 10⁻⁴, ν_i = 9/11, ε_i = 0.1, ρ_1i = 10, ρ_2i = 2
- PID control: K_P = 38, K_I = 25, K_D = 0.15
- Observers: for the NDO, L_h = L_e = 10; for the RTOB, β = 350 rad/s; for the AFOB, κ_i(0) = 0.1, λ_i = 10, η_i = 0, m = 0.1, a = 10

Scenarios
- Free space: θ_d(t) = 5π(1 − 0.18t) sin(0.72πt)
- Contact and recovery: soft environment K_e = 500 N·mm⁻¹, B_e = 4 N·s·mm⁻¹; stiff environment K_e = 2800 N·mm⁻¹, B_e = 20 N·s·mm⁻¹

The RMSEs associated with the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.
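The RMSE figure of merit used to rank the three observers can be sketched as follows. The signals here are synthetic stand-ins (a biased observer and a rippling observer against the free-space reference shape), not the experimental data behind the reported 0.211 / 0.192 / 0.076 values.

```python
import numpy as np

def rmse(estimate, truth):
    """Root-mean-square error between an estimated and a true torque signal."""
    e = np.asarray(estimate) - np.asarray(truth)
    return float(np.sqrt(np.mean(e ** 2)))

t = np.linspace(0.0, 2.0, 2001)
tau_true = 5 * np.pi * (1 - 0.18 * t) * np.sin(0.72 * np.pi * t)  # free-space reference shape
tau_biased = tau_true + 0.2                  # hypothetical observer with constant bias
tau_noisy = tau_true + 0.05 * np.sin(40 * t) # hypothetical observer with small ripple
```

A constant bias of 0.2 yields an RMSE of exactly 0.2, so the low-ripple estimate ranks better, mirroring how the AFOB's 0.076 beats the NDO and RTOB.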

20-May-21 | 48 | Presenter: Cong Phat Vo

7. FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results
Fig. 7.2. Position tracking performances
Fig. 7.3. Position tracking error performances
Fig. 7.4. Transparency performance in "soft" contact
Fig. 7.5. Transparency error in soft contact
Fig. 7.6. Transparency performance in "stiff" contact
Fig. 7.7. Transparency error in stiff contact

20-May-21 | 49Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Finite-time convergence of the

tracking errors

Eliminating the uncertainties the

noise effect in the obtained force

sensing via new AFOB

Robust performance against uncertainties

and disturbances in the system dynamics

Stability of the overall system and the

transparency performance can be

attained simultaneously

Fast transient response

Summary

The great transparency performance and global stability of the BHTS under the proposed

control scheme are achieved despite the uncertainties and in different working conditions

1 2 3 4 5 6 7 8

20-May-21 | 50Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Introduction

4

5

6

C

H

A

P

T

E

R

1

7

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Conclusion and Future works78

1 2 3 4 5 6 7 8

20-May-21 | 51Presenter Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Verified an effective position tracking controller via the AITSMC with TDE technique under

various working conditions (Its uncertainties and disturbance are handled)

Validated the significant effectiveness of finite-time force tracking problems based on the

FITSMC combined with a new adaptive gain and a friction-free disturbance (FSOB)

Based on the online learning ability of the RL approximation method the model-free controller

for optimal contact force ensures great position and force tracking performance

Optimized a prescribed impedance model parameters by the IRL approach to provide little

operator information in yielding highly effective position control purpose

Achieved good transparency performance with both force feedback and position tracking

simultaneously by the new adaptive force estimation combined the separately FNTSMC schemes

Conclutions

1 2 3 4 5 6 7 8

20-May-21 | 52Presenter Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Future works

To improve the control performance in joint-space the robot system model can be

improved by robust identification approaches

Effectiveness of the proposed algorithm can be further increased by developing an automatic

tune method for both the estimation and control gains

Apply optimized strategies on both master and slave sides Note also that the computational

complexity should be reduced by the data size

Based on the developed prototype in this thesis it is possible to develop a high-quality

commercial teleoperation device in the rehabilitation field with a safer solution

1 2 3 4 5 6 7 8

20-May-21 | 53

A PRESENTATION OF DOCTORAL DISSERTATION

Presenter Cong Phat Vo

Department of Mechanical Engineering University of Ulsan Korea

Thank you for attending

1 2 3 4 5 6 7 8

Page 34: A PRESENTATION OF DOCTORAL DISSERTATION

20-May-21 | 34Presenter Cong Phat Vo

Fig 51 RL in unknown environments during the

learning process

Fig 52 Learning curves

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

κe = 00025 S = 1 R = 10 T = 0001 A = minus4006

B = minus0002

υ = 09 γ = 09 δΔ = 065 ε = 01 ε-decay = 099

K = 10 β = 01

LQR solution

RL aproximation

PositionForce controlKPf = 0001 KIf = 002 λ = 02 ηΔ = 01 ρ1 = 05

ρ2 = 20 v = 911 α = 57

EnvironmentCe = 05 Nsmm-1 Ke = 2 N mm-1 r0 = 165 mm

xd = 39917 mm xe = 35 mm

Parameters

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 35Presenter Cong Phat Vo

Fig 53 Desired forceposition tracking response

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Three steps are required to complete

the experiment

1-the identification of the environment

2-obtain the ARE solution using the

environmental estimation

3-the complete experiment for the

positionforce controller

IRL approximation

Achieving near-optimal performance without

knowledge of the environment dynamics

LQR solution

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 36Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

4

5

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Optimized Human-Robot Interaction force control using IRL6

1 2 3 4 5 6 7 8

20-May-21 | 37Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

3 The unknown human dynamics as well as the optimal overall

human-robot system efficiency is verified in experimental trials

2

Find the optimal parameters of the impedance model to

assure motion tracking and also assist the human with

minimum effort to perform the task in real time

The stability of the closed-loop control is theoretically

demonstrated

1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 38Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

( )1 20

ˆ ( )

ˆ (|| || )

= + + +

+ + minus

t

s h h h

q B h h

K e e e d

K Z Z K

The control torque in task space

1 2

1 2 1

2 1 2

ˆ ˆˆ ( ) ( )

ˆ ˆˆ ˆ ˆ || ||

ˆˆ ˆ ˆ( ) || ||

=

= minus minus = minus

T T

T T T

h h h

T

h h

q W W q

W F F W q kF W

W Gq W kG W

2

h hm

d d d

K

M s C s K

=

+ +

An approximation of the continuous functionThe prescribed

impedance model

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

Control design

1 2 3 4 5 6 7 8

20-May-21 | 39Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

1

1

[ ]

minus

minus

= +

= minus

=

=

h h h h d

h e

h e p d

T T T

d d d

A B e

A k

B k

e e e

( )1

+=

+

p d

e

sG s

k s

The human transfer function [99] Purpose τh rarr ed

0

0

= + = +

= minus

q d qd e

e

h h hh e

A e Be X AX Buu

B A u KX

A new augmented state

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

IRL based on LQR solution

Control design

1 2 3 4 5 6 7 8

20-May-21 | 40Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

The performance index

( )T T T

m d d d h h h e et

J e Q e Q u Ru d

= + +

10 T TA P PA Q PB R BPminus= + + minus

The algebraic Riccati equation [83]

0

0

q d qd

e

h h hh

e

A e BeX u

B A

u KX

= = +

= minus

A new augmented state

IRL solutionRL solution

An optimal feeback control

To minimize

solve

The infinite-horizon quadratic cost

Control design

The IRL Bellman equation for time interval ΔT gt 0

Proof of convergence See in Appendix 62

1 2 3 4 5 6 7 8

( ) 1

( )

arg min ( ) ( )e

T

e eu t

u V t X t u R B PX

minus

= = minus

( )7 7( ( )) ( ) ( ) ( ( ))t T

T T T

tV X t X Q K RK X d V X t T

+

= + minus +

( )7 ( ( )) ( ) ( )T T T T

tV X t X Q K RK X d X PX

= + =

( ) ( ) ( )T TA BK P P A BK Q K RKminus + minus = minus +

where P is the real symmetric positive definite solution

( )( ) ( ) = ( ) ( )

( ) ( )

t tT T T T

t

T

X t PX t X Q K RK X d

X t t PX t t

+

+

+ + +

The IRL Bellman equation can be written as

20-May-21 | 41Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 61 Continuous-time linear policy iteration algorithm

1 Start with an admissible control input

0 0

1= minuseu K X

2 Policy evaluation given a control policy ui find the

Pi using off-policy Bellman equation

1 1

1

+ minus= minusi T i

eu R B P X

3 Policy Improvement update the control policy

Online implementation

Control design

1 2 3 4 5 6 7 8

( )( )( ) ( )= ( ) ( )

( ) ( )

t t TT i T i i T

t

T i

X t P X t X Q K RK X d

X t t P X t t

+

+

+ + +

Begin

1 2

1 2

1

( ) ( )

( ) ( ) ( )

( )

N i i i

Ti i N i

i T

t t t

K K K

p

minus

= = minus +

=

=

Solving for the cost using least squares

1|| ||i ip p minusminus

0

1 10 1 ( ) 0P i K A BK= = minus

Initialization

1 1i T i

eu R B P Xminus minus= minus

Policy update

End

No

Yes

1i irarr +

20-May-21 | 42Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 62 The tracking error of the trajectory of the robot system and

the prescribed impedance model

Fig 63 Reference trajectory versus the state of the robot systems

during and after learning

κe = 00025 S = 1 R = 10 T = 0001

A = minus4006 B = minus0002

υ = 09 γ = 09 δΔ = 065 ε = 01

ε-decay = 099 K = 10 β = 01

LQR solution

RL aproximation

Position controlKs = 20 F = 10 G = 1 μ1 = μ2 = 5 k = 01

= 100 κp = 20 κd = 10 and ke = 018 s

θd = 10sin(02πt) Md = 1 Kg Bd = 718

Nsmm-1 Kd =1576 Nmm-1 Kh = 0046

Parameters

Z

Simulation

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 43Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 64 The output trajectory and impedance model in the inner loop

Fig 65 The desired trajectory and output trajectory the outer loop

Fig 66 Interaction force

Perfectly performance of the human-

robot cooperation

A little deviation between the desired

and actual trajectory

The human interaction force is reduced

after learning and the optimal set of the

prescribed impedance model is found by

the outer-loop controller

Experiment

Validation results

Discussion

1 2 3 4 5 6 7 8

20-May-21 | 44Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Conclusion and Future works

Introduction

7

4

5

6

C

H

A

P

T

E

R

1

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Force Sensorless Reflecting Control for BHTS7

1 2 3 4 5 6 7 8

20-May-21 | 45Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

3 The effectiveness of the proposed control method is verified in

the different working conditions

2The stability and finite-time convergence characteristic of the closed-

loop control are theoretically analyzed by the Lyapunov approach

The first time the hybrid control based

on a fast finite-time NTSMC and AFOB1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 46Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

( )1 2id = minus +

1 1

2

= minus + = minus +

id

T

i i id i

T

i ia = minus

The external torque estimation

[ ]

1 2ˆ( ) = + + minus minus minus +iv

i i ir i i i i i i i i i i idM C K sign s s s

( )i i i is = +

[ ]

2

1 2

if 0 or 0 and ( )

( ) if 0 and

i

i i i i i i

i i

i i i i i i i i

s s

z z sign s

= =

+

An adaptive FSOB gain

where

where

The torque control design by style-FNTSMC

The modified FNTSM surface design

Master Slave

Environment

APAM Slave

system

Slave

Controller

Estimator

Oper

ator

Master

Controller

APAM Master

system

Estimator

Impendence

filter

+

++

-

+

-

+

+md

m s

s

sd

ˆsdˆmd

mf

m

d

Control design

1 2 3 4 5 6 7 8

20-May-21 | 47Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig 71 Torque estimation performances

λi = 8 pi = 5 qi = 7 ςi = 10-4 νi = 911

εi = 01 ρ1i = 10 ρ2i = 2

Proposed control

PID control KP = 38 KI = 25 KD = 015

Observersfor NDO Lh = Le = 10 for RTOB β = 350 rads

for AFOB κi(0) = 01 λi = 10 ηi = 0 m = 01

a = 10

RMSEs associated with NDO RTOB and proposed algorithm

are calculated of 0211 0192 and 0076 respectively

Parameters

Scenarios- The free space situation θd (t) = 5π(1 - 018t)sin(072πt)

- The contact and recovery situation soft environment Ke = 500 Nmm-1 Be = 4 Nsmm-1

stiff environment Ke = 2800 Nmm-1 Be = 20 Nsmm-1

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 48Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig 72 Position tracking performances

Fig 72 Position tracking error performances

Fig 72 Transparency performance in ldquosoftrdquo

Fig 72 Transparency error in soft contract

Fig 72 Transparency performance in ldquostiffrdquo

Fig 72 Transparency error in stiff contract

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 49Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Finite-time convergence of the

tracking errors

Eliminating the uncertainties the

noise effect in the obtained force

sensing via new AFOB

Robust performance against uncertainties

and disturbances in the system dynamics

Stability of the overall system and the

transparency performance can be

attained simultaneously

Fast transient response

Summary

The great transparency performance and global stability of the BHTS under the proposed

control scheme are achieved despite the uncertainties and in different working conditions

1 2 3 4 5 6 7 8

20-May-21 | 50Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Introduction

4

5

6

C

H

A

P

T

E

R

1

7

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Conclusion and Future works78

1 2 3 4 5 6 7 8

20-May-21 | 51Presenter Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Verified an effective position tracking controller via the AITSMC with TDE technique under

various working conditions (Its uncertainties and disturbance are handled)

Validated the significant effectiveness of finite-time force tracking problems based on the

FITSMC combined with a new adaptive gain and a friction-free disturbance (FSOB)

Based on the online learning ability of the RL approximation method the model-free controller

for optimal contact force ensures great position and force tracking performance

Optimized a prescribed impedance model parameters by the IRL approach to provide little

operator information in yielding highly effective position control purpose

Achieved good transparency performance with both force feedback and position tracking

simultaneously by the new adaptive force estimation combined the separately FNTSMC schemes

Conclutions

1 2 3 4 5 6 7 8

20-May-21 | 52Presenter Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Future works

To improve the control performance in joint-space the robot system model can be

improved by robust identification approaches

Effectiveness of the proposed algorithm can be further increased by developing an automatic

tune method for both the estimation and control gains

Apply optimized strategies on both master and slave sides Note also that the computational

complexity should be reduced by the data size

Based on the developed prototype in this thesis it is possible to develop a high-quality

commercial teleoperation device in the rehabilitation field with a safer solution

1 2 3 4 5 6 7 8

20-May-21 | 53

A PRESENTATION OF DOCTORAL DISSERTATION

Presenter Cong Phat Vo

Department of Mechanical Engineering University of Ulsan Korea

Thank you for attending

1 2 3 4 5 6 7 8

Page 35: A PRESENTATION OF DOCTORAL DISSERTATION

20-May-21 | 35Presenter Cong Phat Vo

Fig 53 Desired forceposition tracking response

5 MODEL-FREE CONTROL FOR OPTIMAL CONTACT FORCE IN UNKNOWN ENVIRONMENT

Three steps are required to complete

the experiment

1-the identification of the environment

2-obtain the ARE solution using the

environmental estimation

3-the complete experiment for the

positionforce controller

IRL approximation

Achieving near-optimal performance without

knowledge of the environment dynamics

LQR solution

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 36Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Force Sensorless Reflecting Control for BHTS

Conclusion and Future works

Introduction

7

4

5

C

H

A

P

T

E

R

1

7

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Optimized Human-Robot Interaction force control using IRL6

1 2 3 4 5 6 7 8

20-May-21 | 37Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

3 The unknown human dynamics as well as the optimal overall

human-robot system efficiency is verified in experimental trials

2

Find the optimal parameters of the impedance model to

assure motion tracking and also assist the human with

minimum effort to perform the task in real time

The stability of the closed-loop control is theoretically

demonstrated

1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 38Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

( )1 20

ˆ ( )

ˆ (|| || )

= + + +

+ + minus

t

s h h h

q B h h

K e e e d

K Z Z K

The control torque in task space

1 2

1 2 1

2 1 2

ˆ ˆˆ ( ) ( )

ˆ ˆˆ ˆ ˆ || ||

ˆˆ ˆ ˆ( ) || ||

=

= minus minus = minus

T T

T T T

h h h

T

h h

q W W q

W F F W q kF W

W Gq W kG W

2

h hm

d d d

K

M s C s K

=

+ +

An approximation of the continuous functionThe prescribed

impedance model

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

Control design

1 2 3 4 5 6 7 8

20-May-21 | 39Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

1

1

[ ]

minus

minus

= +

= minus

=

=

h h h h d

h e

h e p d

T T T

d d d

A B e

A k

B k

e e e

( )1

+=

+

p d

e

sG s

k s

The human transfer function [99] Purpose τh rarr ed

0

0

= + = +

= minus

q d qd e

e

h h hh e

A e Be X AX Buu

B A u KX

A new augmented state

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

IRL based on LQR solution

Control design

1 2 3 4 5 6 7 8

20-May-21 | 40Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

The performance index

( )T T T

m d d d h h h e et

J e Q e Q u Ru d

= + +

10 T TA P PA Q PB R BPminus= + + minus

The algebraic Riccati equation [83]

0

0

q d qd

e

h h hh

e

A e BeX u

B A

u KX

= = +

= minus

A new augmented state

IRL solutionRL solution

An optimal feeback control

To minimize

solve

The infinite-horizon quadratic cost

Control design

The IRL Bellman equation for time interval ΔT gt 0

Proof of convergence See in Appendix 62

1 2 3 4 5 6 7 8

( ) 1

( )

arg min ( ) ( )e

T

e eu t

u V t X t u R B PX

minus

= = minus

( )7 7( ( )) ( ) ( ) ( ( ))t T

T T T

tV X t X Q K RK X d V X t T

+

= + minus +

( )7 ( ( )) ( ) ( )T T T T

tV X t X Q K RK X d X PX

= + =

( ) ( ) ( )T TA BK P P A BK Q K RKminus + minus = minus +

where P is the real symmetric positive definite solution

( )( ) ( ) = ( ) ( )

( ) ( )

t tT T T T

t

T

X t PX t X Q K RK X d

X t t PX t t

+

+

+ + +

The IRL Bellman equation can be written as

20-May-21 | 41  Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design: online implementation

Fig 6.1 Continuous-time linear policy iteration algorithm
1. Start with an admissible control input $u_e^0 = -K^1 X$.
2. Policy evaluation: given a control policy $u^i$, find $P^i$ using the off-policy Bellman equation
   $X^T(t) P^i X(t) = \int_t^{t+\Delta T} X^T \left( Q + (K^i)^T R K^i \right) X \, d\tau + X^T(t+\Delta T) P^i X(t+\Delta T)$
3. Policy improvement: update the control policy $u_e^{i+1} = -R^{-1} B^T P^i X$.

Online implementation (flowchart):
- Initialization: $P^0 = 0$, $i = 1$, $K^1$ chosen so that $A - BK^1$ is stable.
- Solve for the cost $P^i$ (parameter vector $p^i$) using least squares over $N$ sampled intervals $(t_1, t_2, \ldots, t_N)$.
- Policy update: $u_e^i = -R^{-1} B^T P^{i-1} X$.
- If $\|p^i - p^{i-1}\|$ is below the tolerance: end; otherwise set $i \rightarrow i+1$ and repeat.
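The iteration in Fig 6.1 can be sketched in its model-based form (Kleinman's algorithm), where each policy-evaluation step is a Lyapunov solve; in the IRL version that solve is replaced by the least-squares fit over measured trajectory intervals, so $A$ is never needed. A minimal sketch with an assumed toy system:

```python
import numpy as np
from scipy.linalg import solve_continuous_are, solve_continuous_lyapunov

# Assumed toy system (stable, so K = 0 is an admissible initial policy).
A = np.array([[0.0, 1.0], [-2.0, -3.0]])
B = np.array([[0.0], [1.0]])
Q, R = np.eye(2), np.array([[1.0]])

K = np.zeros((1, 2))
P_prev = np.zeros((2, 2))
for i in range(50):
    Ac = A - B @ K
    # Policy evaluation: (A - BK)^T P + P (A - BK) = -(Q + K^T R K)
    P = solve_continuous_lyapunov(Ac.T, -(Q + K.T @ R @ K))
    # Policy improvement: K^{i+1} = R^{-1} B^T P^i
    K = np.linalg.solve(R, B.T @ P)
    if np.linalg.norm(P - P_prev) < 1e-12:
        break
    P_prev = P

# The iterates converge to the ARE solution.
P_are = solve_continuous_are(A, B, Q, R)
print(np.abs(P - P_are).max())
```

The convergence check mirrors the $\|p^i - p^{i-1}\|$ stopping rule in the flowchart.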

20-May-21 | 42  Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Validation results: simulation

Fig 6.2 The tracking error between the trajectory of the robot system and the prescribed impedance model
Fig 6.3 Reference trajectory versus the state of the robot system during and after learning

Parameters
- Position control: Ks = 20, F = 10, G = 1, μ1 = μ2 = 5, k = 0.1, … = 100, κp = 20, κd = 10, and ke = 0.18
- Reference and impedance model: θd = 10sin(0.2πt), Md = 1 kg, Bd = 7.18 N·s·mm⁻¹, Kd = 15.76 N·mm⁻¹, Kh = 0.046
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1

20-May-21 | 43  Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Validation results: experiment

Fig 6.4 The output trajectory and impedance model in the inner loop
Fig 6.5 The desired trajectory and output trajectory in the outer loop
Fig 6.6 Interaction force

Discussion
- Excellent performance of the human-robot cooperation.
- Only a small deviation between the desired and actual trajectories.
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller.

20-May-21 | 44  Presenter: Cong Phat Vo

CONTENTS
Chapter 1: Introduction
Chapter 2: Modeling and platform setup based on the APAM configuration
Chapter 3: Adaptive Gain ITSMC with TDE scheme
Chapter 4: Adaptive Finite-Time Force Sensorless Control scheme
Chapter 5: Model-free control for Optimal contact force in Unknown Environment
Chapter 6: Optimized Human-Robot Interaction force control using IRL
Chapter 7: Force Sensorless Reflecting Control for BHTS
Chapter 8: Conclusion and Future works

20-May-21 | 45  Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Proposed control idea
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed.
2. The stability and finite-time convergence characteristics of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified in different working conditions.

20-May-21 | 46  Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Control design
- The external torque estimation: a momentum-based observer produces $\hat{\tau}_{id}$ for each side $i \in \{m, s\}$ through a first-order estimation dynamic.
- An adaptive FSOB gain: the observer gain is updated by a $\mathrm{sign}(s_i)$-driven law while the sliding variable is active ($|s_i| > \varepsilon_i$), and held otherwise.
- The modified FNTSM surface design: $s_i$ combines the tracking error and its derivative through the fractional powers $p_i/q_i$ to obtain fast finite-time convergence.
- The torque control design in FNTSMC style: $\tau_{ir}$ consists of the model terms $M_i$, $C_i$, $K_i$, the estimated disturbance $\hat{\tau}_{id}$, and the robust reaching terms $-\rho_{1i}\,\mathrm{sign}(s_i)|s_i|^{\nu_i} - \rho_{2i}\,s_i$.

[Block diagram] Operator ↔ Master controller → APAM master system ⇄ communication channel ⇄ APAM slave system ← Slave controller ↔ Environment; each side carries a torque estimator, and an impedance filter shapes the reflected signal. Signals: master/slave positions θm, θs, desired values θmd, θsd, and estimated external torques τ̂md, τ̂sd.
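The finite-time property that the FNTSM surface provides can be illustrated numerically. A minimal sketch with assumed gains (λ = 2, p = 5, q = 3, not the thesis values): on the surface $s = \dot{e} + \lambda\,\mathrm{sign}(e)|e|^{q/p} = 0$, the error obeys $\dot{e} = -\lambda\,\mathrm{sign}(e)|e|^{q/p}$ and reaches zero in the finite time $t^{*} = p\,|e(0)|^{(p-q)/p} / (\lambda (p-q))$.

```python
import numpy as np

# Assumed gains for the sketch (not the thesis parameters).
lam, p, q = 2.0, 5.0, 3.0
e, dt, t = 1.0, 1e-5, 0.0

# Integrate de/dt = -lam * sign(e) * |e|^(q/p) until the error vanishes.
while abs(e) > 1e-6:
    e -= dt * lam * np.sign(e) * abs(e) ** (q / p)
    t += dt

# Closed-form reaching time: t* = p * |e0|^((p-q)/p) / (lam * (p - q))
t_star = p * 1.0 ** ((p - q) / p) / (lam * (p - q))
print(t, t_star)   # the simulated time matches the closed-form bound
```

A linear surface ($q = p$) would only give exponential, never finite-time, convergence; the fractional power is what creates the finite reaching time.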

20-May-21 | 47  Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results

Fig 7.1 Torque estimation performances

Parameters
- Proposed control: λi = 8, pi = 5, qi = 7, ςi = 10⁻⁴, νi = 9/11, εi = 0.1, ρ1i = 10, ρ2i = 2
- PID control: KP = 38, KI = 25, KD = 0.15
- Observers: for the NDO, Lh = Le = 10; for the RTOB, β = 350 rad/s; for the AFOB, κi(0) = 0.1, λi = 10, ηi = 0, m = 0.1, a = 10

Scenarios
- The free-space situation: θd(t) = 5π(1 − 0.18t)sin(0.72πt)
- The contact-and-recovery situation: soft environment Ke = 500 N·mm⁻¹, Be = 4 N·s·mm⁻¹; stiff environment Ke = 2800 N·mm⁻¹, Be = 20 N·s·mm⁻¹

The RMSEs associated with the NDO, the RTOB, and the proposed algorithm are calculated as 0.211, 0.192, and 0.076, respectively.
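The RMSE metric behind that comparison is a one-liner. A sketch with synthetic signals (the arrays below are placeholders, not the recorded torque data):

```python
import numpy as np

def rmse(estimate, truth):
    """Root-mean-square error between an estimated and a true torque signal."""
    estimate, truth = np.asarray(estimate, float), np.asarray(truth, float)
    return float(np.sqrt(np.mean((estimate - truth) ** 2)))

# Synthetic ground-truth torque and two estimates with different noise levels:
# a lower RMSE means a better estimator, as in the NDO/RTOB/AFOB comparison.
t = np.linspace(0.0, 2.0, 2001)
tau_true = np.sin(2 * np.pi * t)
rng = np.random.default_rng(0)
tau_noisy = tau_true + 0.2 * rng.standard_normal(t.size)
tau_clean = tau_true + 0.05 * rng.standard_normal(t.size)

print(rmse(tau_noisy, tau_true), rmse(tau_clean, tau_true))
```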

20-May-21 | 48  Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results

Fig 7.2 Position tracking performances
Fig 7.3 Position tracking error performances
Fig 7.4 Transparency performance in "soft" contact
Fig 7.5 Transparency error in "soft" contact
Fig 7.6 Transparency performance in "stiff" contact
Fig 7.7 Transparency error in "stiff" contact

20-May-21 | 49  Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Summary
- Finite-time convergence of the tracking errors.
- Elimination of the uncertainties and the noise effect in the obtained force sensing via the new AFOB.
- Robust performance against uncertainties and disturbances in the system dynamics.
- Stability of the overall system and transparency performance are attained simultaneously.
- Fast transient response.

Great transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and across different working conditions.

20-May-21 | 50  Presenter: Cong Phat Vo

CONTENTS
Chapter 1: Introduction
Chapter 2: Modeling and platform setup based on the APAM configuration
Chapter 3: Adaptive Gain ITSMC with TDE scheme
Chapter 4: Adaptive Finite-Time Force Sensorless Control scheme
Chapter 5: Model-free control for Optimal contact force in Unknown Environment
Chapter 6: Optimized Human-Robot Interaction force control using IRL
Chapter 7: Force Sensorless Reflecting Control for BHTS
Chapter 8: Conclusion and Future works

20-May-21 | 51  Presenter: Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions (its uncertainties and disturbances are handled).
- Validated the significant effectiveness of finite-time force tracking based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding highly effective position control.
- Achieved good transparency performance with both force feedback and position tracking simultaneously by the new adaptive force estimation combined with separate FNTSMC schemes.

20-May-21 | 52  Presenter: Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Future works
- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both the master and slave sides; note also that the computational complexity should be reduced by shrinking the data size.
- Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device for the rehabilitation field with a safer solution.

20-May-21 | 53

A PRESENTATION OF DOCTORAL DISSERTATION
Presenter: Cong Phat Vo
Department of Mechanical Engineering, University of Ulsan, Korea

Thank you for attending!

Page 36: A PRESENTATION OF DOCTORAL DISSERTATION

20-May-21 | 36  Presenter: Cong Phat Vo

CONTENTS
Chapter 1: Introduction
Chapter 2: Modeling and platform setup based on the APAM configuration
Chapter 3: Adaptive Gain ITSMC with TDE scheme
Chapter 4: Adaptive Finite-Time Force Sensorless Control scheme
Chapter 5: Model-free control for Optimal contact force in Unknown Environment
Chapter 6: Optimized Human-Robot Interaction force control using IRL
Chapter 7: Force Sensorless Reflecting Control for BHTS
Chapter 8: Conclusion and Future works

20-May-21 | 37  Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Proposed control idea
1. The stability of the closed-loop control is theoretically demonstrated.
2. Find the optimal parameters of the impedance model to assure motion tracking and also assist the human with minimum effort to perform the task in real time.
3. The unknown human dynamics as well as the optimal overall human-robot system efficiency are verified in experimental trials.

20-May-21 | 38  Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design: robot-specific inner loop and task-specific outer loop

The prescribed impedance model:
$\dfrac{\theta_m}{\tau_h} = \dfrac{K_h}{M_d s^2 + C_d s + K_d}$

An approximation of the continuous function by a two-layer neural network:
$\hat{h}(q) = \hat{W}_1^T \hat{\sigma}(\hat{W}_2^T q)$,
with the weight tuning laws
$\dot{\hat{W}}_1 = F(\hat{\sigma} - \hat{\sigma}' \hat{W}_2^T q) e_h^T - kF\|e_h\|\hat{W}_1$,
$\dot{\hat{W}}_2 = G q (\hat{\sigma}'^T \hat{W}_1 e_h)^T - kG\|e_h\|\hat{W}_2$.

The control torque in task space:
$\tau_s = K_s\left(\dot{e}_h + \mu_1 e_h + \mu_2 \int_0^t e_h \, d\tau\right) + \hat{h}_q$, plus a robust bounding term built from $K_B$ and $\|\hat{Z}\|$.

[Block diagram] Human operator → torque control law → haptic interface based on APAM; the prescribed impedance model and the optimal law-based IRL (with NN functional approximation, gain $K_h$, and integrator $1/s$) generate the reference. Signals: τ, θd, θm, eh, τh, ξ, ed.
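The prescribed impedance model is just a second-order filter from the interaction torque to the model trajectory, so its behavior can be sketched directly. Assuming the chapter's listed values with the decimal points restored (Md = 1, Cd = 7.18, Kd = 15.76, Kh = 0.046 — these restorations are assumptions, since the extraction dropped the decimals):

```python
import numpy as np
from scipy import signal

# Prescribed impedance model  theta_m / tau_h = Kh / (Md s^2 + Cd s + Kd).
Md, Cd, Kd, Kh = 1.0, 7.18, 15.76, 0.046   # assumed values (decimals restored)
G = signal.TransferFunction([Kh], [Md, Cd, Kd])

# Step response: a constant unit interaction torque settles at the DC gain Kh/Kd.
t, y = signal.step(G, T=np.linspace(0.0, 5.0, 1000))
print(y[-1], Kh / Kd)   # the response settles at the DC gain
```

A small $K_h/K_d$ means the model yields only a small trajectory adjustment per unit of human torque, which is exactly the compliance the outer IRL loop tunes.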

20-May-21 | 39  Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Control design: IRL based on the LQR solution

The human transfer function [99], with gains $k_e$, $\kappa_p$, $\kappa_d$, whose purpose is to map $\tau_h \rightarrow e_d$; it is realized in state-space form as
$\dot{h} = A_h h + B_h e_d$,
where $A_h$ and $B_h$ are built from $k_e$, $\kappa_p$, $\kappa_d$, and $e_d = [e_d\ \dot{e}_d\ \ddot{e}_d]^T$.

A new augmented state $X = [e_d^T\ h^T]^T$:
$\dot{X} = AX + Bu_e$, with $A = \begin{bmatrix} A_q & 0 \\ B_h & A_h \end{bmatrix}$, $B = \begin{bmatrix} B_e \\ 0 \end{bmatrix}$, and the feedback $u_e = -KX$.

[Block diagram] Human operator → torque control law → haptic interface based on APAM; the prescribed impedance model and the optimal law-based IRL (with NN functional approximation, gain $K_h$, and integrator $1/s$) generate the reference. Signals: τ, θd, θm, eh, τh, ξ, ed.
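Assembling the augmented block matrices is mechanical. A sketch with placeholder dimensions (A_q, B_e, A_h, B_h are assumed stand-ins, not identified values), including the controllability check that the LQR design implicitly requires:

```python
import numpy as np

# Placeholder subsystem matrices (assumed values for the sketch).
A_q = np.array([[0.0, 1.0], [-5.0, -2.0]])   # impedance-error dynamics
B_e = np.array([[0.0], [1.0]])               # input map of the error dynamics
A_h = np.array([[-0.5]])                     # human-model dynamics
B_h = np.array([[0.3, 0.0]])                 # coupling from e_d into the human model

# Augmented system  X' = A X + B u_e,  X = [e_d^T, h^T]^T.
A = np.block([[A_q, np.zeros((2, 1))],
              [B_h, A_h]])
B = np.vstack([B_e, np.zeros((1, 1))])

# Controllability: rank of [B, AB, A^2 B] must equal the state dimension.
ctrb = np.hstack([B, A @ B, A @ A @ B])
print(A.shape, B.shape, np.linalg.matrix_rank(ctrb))
```

If the rank test fails, no stabilizing LQR gain exists for the augmented pair, so this check is worth running before the Riccati solve on slide 40.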

20-May-21 | 40Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

The performance index

( )T T T

m d d d h h h e et

J e Q e Q u Ru d

= + +

10 T TA P PA Q PB R BPminus= + + minus

The algebraic Riccati equation [83]

0

0

q d qd

e

h h hh

e

A e BeX u

B A

u KX

= = +

= minus

A new augmented state

IRL solutionRL solution

An optimal feeback control

To minimize

solve

The infinite-horizon quadratic cost

Control design

The IRL Bellman equation for time interval ΔT gt 0

Proof of convergence See in Appendix 62

1 2 3 4 5 6 7 8

( ) 1

( )

arg min ( ) ( )e

T

e eu t

u V t X t u R B PX

minus

= = minus

( )7 7( ( )) ( ) ( ) ( ( ))t T

T T T

tV X t X Q K RK X d V X t T

+

= + minus +

( )7 ( ( )) ( ) ( )T T T T

tV X t X Q K RK X d X PX

= + =

( ) ( ) ( )T TA BK P P A BK Q K RKminus + minus = minus +

where P is the real symmetric positive definite solution

( )( ) ( ) = ( ) ( )

( ) ( )

t tT T T T

t

T

X t PX t X Q K RK X d

X t t PX t t

+

+

+ + +

The IRL Bellman equation can be written as

20-May-21 | 41Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 61 Continuous-time linear policy iteration algorithm

1 Start with an admissible control input

0 0

1= minuseu K X

2 Policy evaluation given a control policy ui find the

Pi using off-policy Bellman equation

1 1

1

+ minus= minusi T i

eu R B P X

3 Policy Improvement update the control policy

Online implementation

Control design

1 2 3 4 5 6 7 8

( )( )( ) ( )= ( ) ( )

( ) ( )

t t TT i T i i T

t

T i

X t P X t X Q K RK X d

X t t P X t t

+

+

+ + +

Begin

1 2

1 2

1

( ) ( )

( ) ( ) ( )

( )

N i i i

Ti i N i

i T

t t t

K K K

p

minus

= = minus +

=

=

Solving for the cost using least squares

1|| ||i ip p minusminus

0

1 10 1 ( ) 0P i K A BK= = minus

Initialization

1 1i T i

eu R B P Xminus minus= minus

Policy update

End

No

Yes

1i irarr +

20-May-21 | 42Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 62 The tracking error of the trajectory of the robot system and

the prescribed impedance model

Fig 63 Reference trajectory versus the state of the robot systems

during and after learning

κe = 00025 S = 1 R = 10 T = 0001

A = minus4006 B = minus0002

υ = 09 γ = 09 δΔ = 065 ε = 01

ε-decay = 099 K = 10 β = 01

LQR solution

RL aproximation

Position controlKs = 20 F = 10 G = 1 μ1 = μ2 = 5 k = 01

= 100 κp = 20 κd = 10 and ke = 018 s

θd = 10sin(02πt) Md = 1 Kg Bd = 718

Nsmm-1 Kd =1576 Nmm-1 Kh = 0046

Parameters

Z

Simulation

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 43Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 64 The output trajectory and impedance model in the inner loop

Fig 65 The desired trajectory and output trajectory the outer loop

Fig 66 Interaction force

Perfectly performance of the human-

robot cooperation

A little deviation between the desired

and actual trajectory

The human interaction force is reduced

after learning and the optimal set of the

prescribed impedance model is found by

the outer-loop controller

Experiment

Validation results

Discussion

1 2 3 4 5 6 7 8

20-May-21 | 44Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Conclusion and Future works

Introduction

7

4

5

6

C

H

A

P

T

E

R

1

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Force Sensorless Reflecting Control for BHTS7

1 2 3 4 5 6 7 8

20-May-21 | 45Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

3 The effectiveness of the proposed control method is verified in

the different working conditions

2The stability and finite-time convergence characteristic of the closed-

loop control are theoretically analyzed by the Lyapunov approach

The first time the hybrid control based

on a fast finite-time NTSMC and AFOB1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 46Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

( )1 2id = minus +

1 1

2

= minus + = minus +

id

T

i i id i

T

i ia = minus

The external torque estimation

[ ]

1 2ˆ( ) = + + minus minus minus +iv

i i ir i i i i i i i i i i idM C K sign s s s

( )i i i is = +

[ ]

2

1 2

if 0 or 0 and ( )

( ) if 0 and

i

i i i i i i

i i

i i i i i i i i

s s

z z sign s

= =

+

An adaptive FSOB gain

where

where

The torque control design by style-FNTSMC

The modified FNTSM surface design

Master Slave

Environment

APAM Slave

system

Slave

Controller

Estimator

Oper

ator

Master

Controller

APAM Master

system

Estimator

Impendence

filter

+

++

-

+

-

+

+md

m s

s

sd

ˆsdˆmd

mf

m

d

Control design

1 2 3 4 5 6 7 8

20-May-21 | 47Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig 71 Torque estimation performances

λi = 8 pi = 5 qi = 7 ςi = 10-4 νi = 911

εi = 01 ρ1i = 10 ρ2i = 2

Proposed control

PID control KP = 38 KI = 25 KD = 015

Observersfor NDO Lh = Le = 10 for RTOB β = 350 rads

for AFOB κi(0) = 01 λi = 10 ηi = 0 m = 01

a = 10

RMSEs associated with NDO RTOB and proposed algorithm

are calculated of 0211 0192 and 0076 respectively

Parameters

Scenarios- The free space situation θd (t) = 5π(1 - 018t)sin(072πt)

- The contact and recovery situation soft environment Ke = 500 Nmm-1 Be = 4 Nsmm-1

stiff environment Ke = 2800 Nmm-1 Be = 20 Nsmm-1

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 48Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig 72 Position tracking performances

Fig 72 Position tracking error performances

Fig 72 Transparency performance in ldquosoftrdquo

Fig 72 Transparency error in soft contract

Fig 72 Transparency performance in ldquostiffrdquo

Fig 72 Transparency error in stiff contract

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 49Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Finite-time convergence of the

tracking errors

Eliminating the uncertainties the

noise effect in the obtained force

sensing via new AFOB

Robust performance against uncertainties

and disturbances in the system dynamics

Stability of the overall system and the

transparency performance can be

attained simultaneously

Fast transient response

Summary

The great transparency performance and global stability of the BHTS under the proposed

control scheme are achieved despite the uncertainties and in different working conditions

1 2 3 4 5 6 7 8

20-May-21 | 50Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Introduction

4

5

6

C

H

A

P

T

E

R

1

7

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Conclusion and Future works78

1 2 3 4 5 6 7 8

20-May-21 | 51Presenter Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Verified an effective position tracking controller via the AITSMC with TDE technique under

various working conditions (Its uncertainties and disturbance are handled)

Validated the significant effectiveness of finite-time force tracking problems based on the

FITSMC combined with a new adaptive gain and a friction-free disturbance (FSOB)

Based on the online learning ability of the RL approximation method the model-free controller

for optimal contact force ensures great position and force tracking performance

Optimized a prescribed impedance model parameters by the IRL approach to provide little

operator information in yielding highly effective position control purpose

Achieved good transparency performance with both force feedback and position tracking

simultaneously by the new adaptive force estimation combined the separately FNTSMC schemes

Conclutions

1 2 3 4 5 6 7 8

20-May-21 | 52Presenter Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Future works

To improve the control performance in joint-space the robot system model can be

improved by robust identification approaches

Effectiveness of the proposed algorithm can be further increased by developing an automatic

tune method for both the estimation and control gains

Apply optimized strategies on both master and slave sides Note also that the computational

complexity should be reduced by the data size

Based on the developed prototype in this thesis it is possible to develop a high-quality

commercial teleoperation device in the rehabilitation field with a safer solution

1 2 3 4 5 6 7 8

20-May-21 | 53

A PRESENTATION OF DOCTORAL DISSERTATION

Presenter Cong Phat Vo

Department of Mechanical Engineering University of Ulsan Korea

Thank you for attending

1 2 3 4 5 6 7 8

Page 37: A PRESENTATION OF DOCTORAL DISSERTATION

20-May-21 | 37Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

3 The unknown human dynamics as well as the optimal overall

human-robot system efficiency is verified in experimental trials

2

Find the optimal parameters of the impedance model to

assure motion tracking and also assist the human with

minimum effort to perform the task in real time

The stability of the closed-loop control is theoretically

demonstrated

1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 38Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

( )1 20

ˆ ( )

ˆ (|| || )

= + + +

+ + minus

t

s h h h

q B h h

K e e e d

K Z Z K

The control torque in task space

1 2

1 2 1

2 1 2

ˆ ˆˆ ( ) ( )

ˆ ˆˆ ˆ ˆ || ||

ˆˆ ˆ ˆ( ) || ||

=

= minus minus = minus

T T

T T T

h h h

T

h h

q W W q

W F F W q kF W

W Gq W kG W

2

h hm

d d d

K

M s C s K

=

+ +

An approximation of the continuous functionThe prescribed

impedance model

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

Control design

1 2 3 4 5 6 7 8

20-May-21 | 39Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

1

1

[ ]

minus

minus

= +

= minus

=

=

h h h h d

h e

h e p d

T T T

d d d

A B e

A k

B k

e e e

( )1

+=

+

p d

e

sG s

k s

The human transfer function [99] Purpose τh rarr ed

0

0

= + = +

= minus

q d qd e

e

h h hh e

A e Be X AX Buu

B A u KX

A new augmented state

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

IRL based on LQR solution

Control design

1 2 3 4 5 6 7 8

20-May-21 | 40Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

The performance index

( )T T T

m d d d h h h e et

J e Q e Q u Ru d

= + +

10 T TA P PA Q PB R BPminus= + + minus

The algebraic Riccati equation [83]

0

0

q d qd

e

h h hh

e

A e BeX u

B A

u KX

= = +

= minus

A new augmented state

IRL solutionRL solution

An optimal feeback control

To minimize

solve

The infinite-horizon quadratic cost

Control design

The IRL Bellman equation for time interval ΔT gt 0

Proof of convergence See in Appendix 62

1 2 3 4 5 6 7 8

( ) 1

( )

arg min ( ) ( )e

T

e eu t

u V t X t u R B PX

minus

= = minus

( )7 7( ( )) ( ) ( ) ( ( ))t T

T T T

tV X t X Q K RK X d V X t T

+

= + minus +

( )7 ( ( )) ( ) ( )T T T T

tV X t X Q K RK X d X PX

= + =

( ) ( ) ( )T TA BK P P A BK Q K RKminus + minus = minus +

where P is the real symmetric positive definite solution

( )( ) ( ) = ( ) ( )

( ) ( )

t tT T T T

t

T

X t PX t X Q K RK X d

X t t PX t t

+

+

+ + +

The IRL Bellman equation can be written as

20-May-21 | 41Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 61 Continuous-time linear policy iteration algorithm

1 Start with an admissible control input

0 0

1= minuseu K X

2 Policy evaluation given a control policy ui find the

Pi using off-policy Bellman equation

1 1

1

+ minus= minusi T i

eu R B P X

3 Policy Improvement update the control policy

Online implementation

Control design

1 2 3 4 5 6 7 8

( )( )( ) ( )= ( ) ( )

( ) ( )

t t TT i T i i T

t

T i

X t P X t X Q K RK X d

X t t P X t t

+

+

+ + +

Begin

1 2

1 2

1

( ) ( )

( ) ( ) ( )

( )

N i i i

Ti i N i

i T

t t t

K K K

p

minus

= = minus +

=

=

Solving for the cost using least squares

1|| ||i ip p minusminus

0

1 10 1 ( ) 0P i K A BK= = minus

Initialization

1 1i T i

eu R B P Xminus minus= minus

Policy update

End

No

Yes

1i irarr +

20-May-21 | 42Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 62 The tracking error of the trajectory of the robot system and

the prescribed impedance model

Fig 63 Reference trajectory versus the state of the robot systems

during and after learning

κe = 00025 S = 1 R = 10 T = 0001

A = minus4006 B = minus0002

υ = 09 γ = 09 δΔ = 065 ε = 01

ε-decay = 099 K = 10 β = 01

LQR solution

RL aproximation

Position controlKs = 20 F = 10 G = 1 μ1 = μ2 = 5 k = 01

= 100 κp = 20 κd = 10 and ke = 018 s

θd = 10sin(02πt) Md = 1 Kg Bd = 718

Nsmm-1 Kd =1576 Nmm-1 Kh = 0046

Parameters

Z

Simulation

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 43Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 64 The output trajectory and impedance model in the inner loop

Fig 65 The desired trajectory and output trajectory the outer loop

Fig 66 Interaction force

Perfectly performance of the human-

robot cooperation

A little deviation between the desired

and actual trajectory

The human interaction force is reduced

after learning and the optimal set of the

prescribed impedance model is found by

the outer-loop controller

Experiment

Validation results

Discussion

1 2 3 4 5 6 7 8

20-May-21 | 44Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Conclusion and Future works

Introduction

7

4

5

6

C

H

A

P

T

E

R

1

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Force Sensorless Reflecting Control for BHTS7

1 2 3 4 5 6 7 8

20-May-21 | 45Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

3 The effectiveness of the proposed control method is verified in

the different working conditions

2The stability and finite-time convergence characteristic of the closed-

loop control are theoretically analyzed by the Lyapunov approach

The first time the hybrid control based

on a fast finite-time NTSMC and AFOB1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 46Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

( )1 2id = minus +

1 1

2

= minus + = minus +

id

T

i i id i

T

i ia = minus

The external torque estimation

[ ]

1 2ˆ( ) = + + minus minus minus +iv

i i ir i i i i i i i i i i idM C K sign s s s

( )i i i is = +

[ ]

2

1 2

if 0 or 0 and ( )

( ) if 0 and

i

i i i i i i

i i

i i i i i i i i

s s

z z sign s

= =

+

An adaptive FSOB gain

where

where

The torque control design by style-FNTSMC

The modified FNTSM surface design

Master Slave

Environment

APAM Slave

system

Slave

Controller

Estimator

Oper

ator

Master

Controller

APAM Master

system

Estimator

Impendence

filter

+

++

-

+

-

+

+md

m s

s

sd

ˆsdˆmd

mf

m

d

Control design

1 2 3 4 5 6 7 8

20-May-21 | 47Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig 71 Torque estimation performances

λi = 8 pi = 5 qi = 7 ςi = 10-4 νi = 911

εi = 01 ρ1i = 10 ρ2i = 2

Proposed control

PID control KP = 38 KI = 25 KD = 015

Observersfor NDO Lh = Le = 10 for RTOB β = 350 rads

for AFOB κi(0) = 01 λi = 10 ηi = 0 m = 01

a = 10

RMSEs associated with NDO RTOB and proposed algorithm

are calculated of 0211 0192 and 0076 respectively

Parameters

Scenarios- The free space situation θd (t) = 5π(1 - 018t)sin(072πt)

- The contact and recovery situation soft environment Ke = 500 Nmm-1 Be = 4 Nsmm-1

stiff environment Ke = 2800 Nmm-1 Be = 20 Nsmm-1

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 48Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig 72 Position tracking performances

Fig 72 Position tracking error performances

Fig 72 Transparency performance in ldquosoftrdquo

Fig 72 Transparency error in soft contract

Fig 72 Transparency performance in ldquostiffrdquo

Fig 72 Transparency error in stiff contract

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 49Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Finite-time convergence of the

tracking errors

Eliminating the uncertainties the

noise effect in the obtained force

sensing via new AFOB

Robust performance against uncertainties

and disturbances in the system dynamics

Stability of the overall system and the

transparency performance can be

attained simultaneously

Fast transient response

Summary

The great transparency performance and global stability of the BHTS under the proposed

control scheme are achieved despite the uncertainties and in different working conditions

1 2 3 4 5 6 7 8

20-May-21 | 50Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Introduction

4

5

6

C

H

A

P

T

E

R

1

7

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Conclusion and Future works78

1 2 3 4 5 6 7 8

20-May-21 | 51Presenter Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Verified an effective position tracking controller via the AITSMC with TDE technique under

various working conditions (Its uncertainties and disturbance are handled)

Validated the significant effectiveness of finite-time force tracking problems based on the

FITSMC combined with a new adaptive gain and a friction-free disturbance (FSOB)

Based on the online learning ability of the RL approximation method the model-free controller

for optimal contact force ensures great position and force tracking performance

Optimized a prescribed impedance model parameters by the IRL approach to provide little

operator information in yielding highly effective position control purpose

Achieved good transparency performance with both force feedback and position tracking

simultaneously by the new adaptive force estimation combined the separately FNTSMC schemes

Conclutions

1 2 3 4 5 6 7 8

20-May-21 | 52Presenter Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Future works

To improve the control performance in joint-space the robot system model can be

improved by robust identification approaches

Effectiveness of the proposed algorithm can be further increased by developing an automatic

tune method for both the estimation and control gains

Apply optimized strategies on both master and slave sides Note also that the computational

complexity should be reduced by the data size

Based on the developed prototype in this thesis it is possible to develop a high-quality

commercial teleoperation device in the rehabilitation field with a safer solution

1 2 3 4 5 6 7 8

20-May-21 | 53

A PRESENTATION OF DOCTORAL DISSERTATION

Presenter Cong Phat Vo

Department of Mechanical Engineering University of Ulsan Korea

Thank you for attending

1 2 3 4 5 6 7 8

Page 38: A PRESENTATION OF DOCTORAL DISSERTATION

20-May-21 | 38Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

( )1 20

ˆ ( )

ˆ (|| || )

= + + +

+ + minus

t

s h h h

q B h h

K e e e d

K Z Z K

The control torque in task space

1 2

1 2 1

2 1 2

ˆ ˆˆ ( ) ( )

ˆ ˆˆ ˆ ˆ || ||

ˆˆ ˆ ˆ( ) || ||

=

= minus minus = minus

T T

T T T

h h h

T

h h

q W W q

W F F W q kF W

W Gq W kG W

2

h hm

d d d

K

M s C s K

=

+ +

An approximation of the continuous functionThe prescribed

impedance model

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

Control design

1 2 3 4 5 6 7 8

20-May-21 | 39Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Task-specific Outer-loopRobot-specific Inner-loop

1

1

[ ]

minus

minus

= +

= minus

=

=

h h h h d

h e

h e p d

T T T

d d d

A B e

A k

B k

e e e

( )1

+=

+

p d

e

sG s

k s

The human transfer function [99] Purpose τh rarr ed

0

0

= + = +

= minus

q d qd e

e

h h hh e

A e Be X AX Buu

B A u KX

A new augmented state

Prescribed

Impedance Model

Haptic interface

based-APAM

Torque

control lawHuman

Operator

Optimal Law-based

IRL

Kh+-

NN functional

approximation

1

s

θ +

-

τ θ d θ m eh

τ h

ξ

ed

IRL based on LQR solution

Control design

1 2 3 4 5 6 7 8

20-May-21 | 40Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

The performance index

( )T T T

m d d d h h h e et

J e Q e Q u Ru d

= + +

10 T TA P PA Q PB R BPminus= + + minus

The algebraic Riccati equation [83]

0

0

q d qd

e

h h hh

e

A e BeX u

B A

u KX

= = +

= minus

A new augmented state

IRL solutionRL solution

An optimal feeback control

To minimize

solve

The infinite-horizon quadratic cost

Control design

The IRL Bellman equation for time interval ΔT gt 0

Proof of convergence See in Appendix 62

1 2 3 4 5 6 7 8

An optimal feedback control:
u_e = arg min_{u_e} V(t, X(t)) = −R^{-1} B^T P X

The value function of the policy u_e = −K X satisfies, for any T > 0,
V(X(t)) = ∫_t^{t+T} X^T (Q + K^T R K) X dτ + V(X(t + T))

V(X(t)) = ∫_t^∞ X^T (Q + K^T R K) X dτ = X^T P X

(A − B K)^T P + P (A − B K) = −(Q + K^T R K), where P is the real symmetric positive definite solution.

The IRL Bellman equation can be written as
X^T(t) P X(t) = ∫_t^{t+ΔT} X^T (Q + K^T R K) X dτ + X^T(t + ΔT) P X(t + ΔT)
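The interval form above is what makes the method "integral": the cost P can be fit from trajectory data alone, with no use of the drift matrix A. A minimal scalar sketch (hypothetical plant a = −1, b = 1, weights q = r = 1 — illustrative values, not the dissertation's system) recovers p by least squares over several ΔT-long windows:

```python
import math

# Hypothetical scalar system dx/dt = a*x + b*u with policy u = -k*x.
a, b, q, r = -1.0, 1.0, 1.0, 1.0
k = 0.0                  # admissible policy: a - b*k = -1 is stable
alpha = a - b * k        # closed-loop pole
dT = 0.1                 # Bellman interval ΔT

def rollout(x0):
    """Return (d, y): d = x(t)^2 - x(t+ΔT)^2 and the reward integral
    y = ∫_t^{t+ΔT} (q + r k^2) x^2 dτ, both in closed form for this plant."""
    x1 = x0 * math.exp(alpha * dT)
    d = x0 ** 2 - x1 ** 2
    y = (q + r * k ** 2) * x0 ** 2 * (math.exp(2 * alpha * dT) - 1) / (2 * alpha)
    return d, y

# Least-squares fit of p in  p * d_j = y_j  over several data windows.
data = [rollout(x0) for x0 in (1.0, 0.5, -2.0, 3.0)]
p_irl = sum(d * y for d, y in data) / sum(d * d for d, y in data)

# Model-based check: closed-loop Lyapunov equation 2*alpha*p = -(q + r*k^2).
p_lyap = (q + r * k ** 2) / (2.0 * (b * k - a))
```

For this stable plant with k = 0 the fit returns p = 0.5, matching the Lyapunov-equation value, even though the identification step never touched a.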

20-May-21 | 41 | Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig. 6.1 Continuous-time linear policy iteration algorithm
1 Start with an admissible control input u_e^0 = −K^0 X
2 Policy evaluation: given a control policy u^i, find P^i using the off-policy Bellman equation
3 Policy improvement: update the control policy u_e^{i+1} = −R^{-1} B^T P^i X

Online implementation

Control design

The off-policy Bellman equation for policy K^i:
X^T(t) P^i X(t) = ∫_t^{t+ΔT} X^T (Q + (K^i)^T R K^i) X dτ + X^T(t + ΔT) P^i X(t + ΔT)

[Flowchart: Begin → Initialization: P^0 = 0, i = 1, K^1 chosen so that (A − B K^1) is stable → solve for the cost p^i using least squares over N samples → policy update: u_e^i = −R^{-1} B^T P^{i−1} X → if ||p^i − p^{i−1}|| ≥ ε, set i → i + 1 and repeat; otherwise End.]
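The loop in Fig. 6.1 can be exercised end-to-end on the same kind of toy scalar plant (again hypothetical: a = −1, b = 1, q = r = 1, not the dissertation's system). Each evaluation returns the cost p^i of the current policy, the improvement step sets k^{i+1} = b p^i / r, and iteration stops when |p^i − p^{i−1}| drops below a tolerance; the fixed point should match the scalar ARE solution:

```python
import math

a, b, q, r = -1.0, 1.0, 1.0, 1.0

def evaluate(k):
    """Policy evaluation: cost p of policy u = -k*x, i.e. the solution of the
    closed-loop Lyapunov equation 2*(a - b*k)*p = -(q + r*k^2)."""
    return (q + r * k ** 2) / (2.0 * (b * k - a))

def policy_iteration(k0=0.0, tol=1e-10, max_iter=50):
    k, p_prev = k0, None
    for _ in range(max_iter):
        p = evaluate(k)          # policy evaluation
        k = b * p / r            # policy improvement: u = -R^{-1} B P x
        if p_prev is not None and abs(p - p_prev) < tol:
            break
        p_prev = p
    return p, k

p_star, k_star = policy_iteration()

# Scalar ARE: 0 = 2*a*p + q - p^2 * b^2 / r
p_are = r * (a + math.sqrt(a * a + q * b * b / r)) / (b * b)
```

Starting from the admissible k^0 = 0, the iterates converge in a handful of steps to p* = √2 − 1 ≈ 0.4142, the positive ARE root.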

20-May-21 | 42 | Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig. 6.2 The tracking error between the trajectory of the robot system and the prescribed impedance model
Fig. 6.3 Reference trajectory versus the state of the robot system during and after learning

Parameters
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −4006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: Ks = 20, F = 10, G = 1, μ1 = μ2 = 5, k = 0.1, = 100, κp = 20, κd = 10, and ke = 0.18
- Simulation: θd = 10 sin(0.2πt), Md = 1 kg, Bd = 718 N·s·mm−1, Kd = 1576 N·mm−1, Kh = 0.046

Validation results
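For intuition on the prescribed impedance model, here is a minimal Euler simulation of a generic second-order impedance relation Md·e'' + Bd·e' + Kd·e = τh (illustrative values Md = 1, Bd = 4, Kd = 25 and a unit constant torque — assumptions for the sketch, not the slide's parameters or units); the error settles at τh/Kd:

```python
# Euler integration of a generic prescribed impedance model
# Md*e'' + Bd*e' + Kd*e = tau_h (illustrative parameters only).
Md, Bd, Kd = 1.0, 4.0, 25.0
tau_h = 1.0            # constant interaction torque
e, edot = 0.0, 0.0     # impedance-model state
dt = 1e-3
for _ in range(int(10.0 / dt)):            # 10 s of simulated time
    eddot = (tau_h - Bd * edot - Kd * e) / Md
    edot += dt * eddot
    e += dt * edot
steady_state = tau_h / Kd                  # expected settling value
```

With these values the response is underdamped (ζ = 0.4, ωn = 5 rad/s) and is fully settled at e = 0.04 well before the 10 s horizon.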

20-May-21 | 43 | Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig. 6.4 The output trajectory and impedance model in the inner loop
Fig. 6.5 The desired trajectory and output trajectory in the outer loop
Fig. 6.6 Interaction force

Experiment | Validation results

Discussion
- Excellent performance of the human-robot cooperation
- A small deviation between the desired and actual trajectories
- The human interaction force is reduced after learning, and the optimal set of prescribed impedance model parameters is found by the outer-loop controller

20-May-21 | 44 | Presenter: Cong Phat Vo

CONTENTS

1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works

20-May-21 | 45 | Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Proposed control idea
1 For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed
2 The stability and finite-time convergence characteristics of the closed-loop control are theoretically analyzed by the Lyapunov approach
3 The effectiveness of the proposed control method is verified in different working conditions

20-May-21 | 46 | Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

The external torque estimation: an adaptive observer yielding τ̂_id, with an adaptation law driven by the estimation error.
An adaptive FSOB gain: a piecewise switching law in sign(s_i) and |s_i|.
The modified FNTSM surface design: s_i formed from the tracking error and its derivative with fractional power p_i/q_i.
The torque control design by FNTSMC: u_i combines the model terms (M_i, C_i, K_i), the estimate τ̂_id, and the reaching terms ρ_{1i} sign(s_i)|s_i|^{ν_i} and ρ_{2i} s_i.

[Block diagram of the BHTS: the Operator acts on the APAM Master system through the Master Controller and its Estimator; the Slave Controller and its Estimator drive the APAM Slave system against the Environment; an Impedance filter couples the two sides; torques τmd, τm, τs, τsd and estimates τ̂md, τ̂sd.]

Control design
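The FNTSM surface and reaching law above are specific to the dissertation; as a generic sketch of the underlying sliding-mode idea (a plain linear surface with a boundary-layer switching term, assumed for illustration rather than the modified FNTSM design), a double integrator can track a sinusoid despite a bounded disturbance:

```python
import math

# Double integrator x'' = u + d, tracking x_d(t) = sin(t).
lam, rho, phi = 2.0, 1.0, 0.01   # surface slope, switching gain, boundary layer
dt = 1e-3
x, xdot = 0.5, 0.0               # initial state

def sat(v):
    """Boundary-layer saturation used instead of sign() to limit chattering."""
    return max(-1.0, min(1.0, v))

t = 0.0
for _ in range(int(10.0 / dt)):
    xd, xd_dot, xd_ddot = math.sin(t), math.cos(t), -math.sin(t)
    e, edot = x - xd, xdot - xd_dot
    s = edot + lam * e                             # linear sliding surface
    u = xd_ddot - lam * edot - rho * sat(s / phi)  # equivalent + switching term
    d = 0.5 * math.sin(3.0 * t)                    # disturbance, |d| <= 0.5 < rho
    xdot += dt * (u + d)
    x += dt * xdot
    t += dt
final_error = abs(x - math.sin(t))
```

Because the switching gain ρ dominates the disturbance bound, s is driven into the boundary layer and the tracking error stays within a small residual band.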

20-May-21 | 47 | Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig. 7.1 Torque estimation performances

Parameters
- Proposed control: λi = 8, pi = 5, qi = 7, ςi = 10−4, νi = 911, εi = 0.1, ρ1i = 10, ρ2i = 2
- PID control: KP = 38, KI = 25, KD = 0.15
- Observers: for the NDO, Lh = Le = 10; for the RTOB, β = 350 rad/s; for the AFOB, κi(0) = 0.1, λi = 10, ηi = 0, m = 0.1, a = 10

Scenarios
- The free-space situation: θd(t) = 5π(1 − 0.18t) sin(0.72πt)
- The contact-and-recovery situation: soft environment Ke = 500 N·mm−1, Be = 4 N·s·mm−1; stiff environment Ke = 2800 N·mm−1, Be = 20 N·s·mm−1

Validation results
The RMSEs associated with the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.

20-May-21 | 48 | Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig. 7.2 Position tracking performances
Fig. 7.2 Position tracking error performances
Fig. 7.2 Transparency performance in "soft" contact
Fig. 7.2 Transparency error in soft contact
Fig. 7.2 Transparency performance in "stiff" contact
Fig. 7.2 Transparency error in stiff contact

Validation results

20-May-21 | 49 | Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Summary
- Finite-time convergence of the tracking errors
- Fast transient response
- Elimination of the uncertainties and of the noise effect in the obtained force sensing via the new AFOB
- Robust performance against uncertainties and disturbances in the system dynamics
- Stability of the overall system and transparency performance are attained simultaneously

Great transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and in different working conditions.

20-May-21 | 50 | Presenter: Cong Phat Vo

CONTENTS

1 Introduction
2 Modeling and platform setup based on the APAM configuration
3 Adaptive Gain ITSMC with TDE scheme
4 Adaptive Finite-Time Force Sensorless Control scheme
5 Model-free control for Optimal contact force in Unknown Environment
6 Optimized Human-Robot Interaction force control using IRL
7 Force Sensorless Reflecting Control for BHTS
8 Conclusion and Future works

20-May-21 | 51 | Presenter: Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions (uncertainties and disturbances are handled)
- Validated the significant effectiveness on finite-time force tracking problems of the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB)
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding highly effective position control
- Achieved good transparency performance with both force feedback and position tracking simultaneously by the new adaptive force estimation combined with separate FNTSMC schemes

20-May-21 | 52 | Presenter: Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Future works
- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches
- The effectiveness of the proposed algorithms can be further increased by developing an automatic tuning method for both the estimation and control gains
- Apply optimized strategies on both the master and slave sides; the computational complexity should also be reduced with the data size
- Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device in the rehabilitation field with a safer solution

20-May-21 | 53

A PRESENTATION OF DOCTORAL DISSERTATION

Presenter: Cong Phat Vo
Department of Mechanical Engineering, University of Ulsan, Korea

Thank you for attending


improved by robust identification approaches

Effectiveness of the proposed algorithm can be further increased by developing an automatic

tune method for both the estimation and control gains

Apply optimized strategies on both master and slave sides Note also that the computational

complexity should be reduced by the data size

Based on the developed prototype in this thesis it is possible to develop a high-quality

commercial teleoperation device in the rehabilitation field with a safer solution

1 2 3 4 5 6 7 8

20-May-21 | 53

A PRESENTATION OF DOCTORAL DISSERTATION

Presenter Cong Phat Vo

Department of Mechanical Engineering University of Ulsan Korea

Thank you for attending

1 2 3 4 5 6 7 8


20-May-21 | Presenter: Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig. 6.2 The tracking error between the robot system trajectory and the prescribed impedance model
Fig. 6.3 Reference trajectory versus the state of the robot system during and after learning

Parameters
- LQR solution: κe = 0.0025, S = 1, R = 10, T = 0.001, A = −4.006, B = −0.002
- RL approximation: υ = 0.9, γ = 0.9, δΔ = 0.65, ε = 0.1, ε-decay = 0.99, K = 10, β = 0.1
- Position control: Ks = 20, F = 10, G = 1, μ1 = μ2 = 5, k = 0.1, = 100, κp = 20, κd = 10, and ke = 0.18 s
- Desired trajectory and impedance model: θd = 10sin(0.2πt), Md = 1 kg, Bd = 7.18 N·s·mm⁻¹, Kd = 15.76 N·mm⁻¹, Kh = 0.046

Validation results (simulation)
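The LQR entries above (S, R, and the scalar model A, B) admit a closed-form check in the scalar case. This is a minimal sketch under the assumption that A and B are a scalar continuous-time model, with the decimal points the extraction dropped restored as −4.006 and −0.002; it is not the dissertation's solver.

```python
import math

def lqr_scalar(A, B, S, R):
    """Solve the scalar continuous-time Riccati equation
    2*A*P - (B**2/R)*P**2 + S = 0 for the positive root P,
    and return P with the state-feedback gain K = B*P/R (u = -K*x)."""
    a = B ** 2 / R                            # coefficient of the quadratic term
    P = (A + math.sqrt(A ** 2 + a * S)) / a   # positive root of a*P^2 - 2*A*P - S = 0
    K = B * P / R
    return P, K

# Slide parameters (assumed scalar model):
P, K = lqr_scalar(A=-4.006, B=-0.002, S=1.0, R=10.0)
# Residual of the Riccati equation should vanish up to round-off.
residual = 2 * (-4.006) * P - ((-0.002) ** 2 / 10.0) * P ** 2 + 1.0
```

With a stable A and a small input gain B, the resulting feedback gain K is tiny, which is consistent with the outer loop mostly shaping the impedance model rather than fighting the plant.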


Fig. 6.4 The output trajectory and impedance model in the inner loop
Fig. 6.5 The desired trajectory and output trajectory in the outer loop
Fig. 6.6 Interaction force

Validation results (experiment)
- Near-perfect performance of the human-robot cooperation
- Only a small deviation between the desired and actual trajectories
- The human interaction force is reduced after learning, and the optimal set of the prescribed impedance model parameters is found by the outer-loop controller
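The inner-loop trajectories in Figs. 6.4 and 6.5 track a prescribed impedance model. A generic second-order integration of such a model behaves as sketched below; the parameter values reuse the slide's Md, Bd, Kd with decimal points restored (1, 7.18, 15.76) purely for illustration, and the unit force step is an assumption, not the experiment's input.

```python
def impedance_response(f_ext=1.0, Md=1.0, Bd=7.18, Kd=15.76, dt=1e-3, t_max=5.0):
    """Integrate the prescribed impedance model Md*x'' + Bd*x' + Kd*x = f_ext
    with semi-implicit Euler; the displacement settles toward f_ext/Kd."""
    x, v, t = 0.0, 0.0, 0.0
    while t < t_max:
        acc = (f_ext - Bd * v - Kd * x) / Md   # impedance dynamics
        v += dt * acc
        x += dt * v                            # semi-implicit: use the updated velocity
        t += dt
    return x

x_final = impedance_response()   # settles near the static compliance f_ext/Kd
```

The damping ratio implied by these values is close to critical, which matches the smooth, overshoot-free inner-loop traces the slide describes.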



7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Proposed control idea
1. For the first time, a hybrid control based on a fast finite-time NTSMC and an AFOB is proposed.
2. The stability and finite-time convergence characteristics of the closed-loop control are theoretically analyzed by the Lyapunov approach.
3. The effectiveness of the proposed control method is verified under different working conditions.
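The "fast finite-time" property in item 1 can be illustrated on a standard fast terminal sliding-mode reaching law, s' = −ρ1·s − ρ2·|s|^(p/q)·sign(s). The gains below (ρ1 = 10, ρ2 = 2, exponent 5/7) only echo the style of the slide-47 parameters and are not the dissertation's exact law.

```python
import math

def reaching_time(s0, rho1=10.0, rho2=2.0, frac=5.0 / 7.0, dt=1e-4, t_max=2.0):
    """Integrate s' = -rho1*s - rho2*|s|**frac * sign(s) with Euler steps and
    return the time at which |s| first falls below 1e-9 (or t_max if never)."""
    s, t = s0, 0.0
    while t < t_max and abs(s) > 1e-9:
        ds = -rho1 * s - rho2 * abs(s) ** frac * math.copysign(1.0, s)
        s += dt * ds
        t += dt
    return t

t_terminal = reaching_time(1.0)           # terminal term present: reaches in finite time
t_linear = reaching_time(1.0, rho2=0.0)   # pure exponential decay: still above 1e-9 at t_max
```

The comparison shows why the fractional-power term matters: the linear law only decays exponentially, while the terminal term drives the sliding variable to zero in finite time.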


Control design
- The external torque estimation via the AFOB, with an adaptive FSOB gain updated by a piecewise (switching) law in the sliding variable si.
- The torque control design by the FNTSMC on the modified FNTSM sliding surface si.

Block diagram: operator, master controller and estimator, APAM master system, impedance filter, APAM slave system, slave controller and estimator, environment.
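The adaptive-gain law on this slide did not survive extraction; only its if/else switching structure is visible, which matches the common piecewise adaptation used in adaptive sliding-mode control (grow the switching gain while |s| is outside a dead-zone, shrink it inside). The sketch below implements that generic scheme, not the dissertation's exact FSOB law; the disturbance, dead-zone width, and adaptation rate are all assumed values.

```python
import math

def adaptive_smc(d_amp=0.8, eps=0.05, lam=10.0, dt=1e-3, t_max=5.0):
    """Drive s' = u + d(t) toward a band around s = 0 with u = -kappa*sign(s),
    adapting kappa piecewise: grow while |s| > eps, shrink inside the band."""
    s, kappa = 1.0, 0.1                  # cf. the slide's kappa_i(0) = 0.1
    t, tail = 0.0, []
    n_total = int(t_max / dt)
    for n in range(n_total):
        d = d_amp * math.sin(2.0 * math.pi * t)   # unknown bounded disturbance
        u = -kappa * math.copysign(1.0, s) if s != 0.0 else 0.0
        s += dt * (u + d)
        # Piecewise adaptation with a dead-zone of width eps around the surface:
        kappa += dt * (lam * abs(s) if abs(s) > eps else -lam * eps)
        kappa = max(kappa, 0.01)                  # keep the switching gain positive
        t += dt
        if n >= n_total - 1000:
            tail.append(abs(s))                   # record the final second
    return max(tail), kappa

band, kappa_final = adaptive_smc()
```

The point of the dead-zone is that the gain need not know the disturbance bound in advance: it rises only as large as the disturbance forces it to, which limits chattering.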


Fig. 7.1 Torque estimation performances

Parameters
- Proposed control: λi = 8, pi = 5, qi = 7, ςi = 10⁻⁴, νi = 9/11, εi = 0.1, ρ1i = 10, ρ2i = 2
- PID control: KP = 38, KI = 25, KD = 0.15
- Observers: for the NDO, Lh = Le = 10; for the RTOB, β = 350 rad/s; for the AFOB, κi(0) = 0.1, λi = 10, ηi = 0, m = 0.1, a = 10

Scenarios
- The free-space situation: θd(t) = 5π(1 − 0.18t)sin(0.72πt)
- The contact-and-recovery situation: soft environment, Ke = 500 N·mm⁻¹, Be = 4 N·s·mm⁻¹; stiff environment, Ke = 2800 N·mm⁻¹, Be = 20 N·s·mm⁻¹

Validation results: the RMSEs associated with the NDO, the RTOB, and the proposed algorithm are calculated as 0.211, 0.192, and 0.076, respectively.
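The RMSE figures above rank the three observers; the metric itself is computed in the usual way. A minimal sketch with made-up traces (none of these arrays are the dissertation's data):

```python
import math

def rmse(estimate, reference):
    """Root-mean-square error between an estimated and a reference torque trace."""
    n = len(reference)
    return math.sqrt(sum((e - r) ** 2 for e, r in zip(estimate, reference)) / n)

# Illustrative traces only: a sine reference and two "estimates"
# corrupted by ripple of different amplitudes.
ref = [math.sin(0.01 * k) for k in range(1000)]
est_coarse = [r + 0.20 * math.sin(0.5 * k) for k, r in enumerate(ref)]
est_fine = [r + 0.05 * math.sin(0.5 * k) for k, r in enumerate(ref)]
# The lower RMSE marks the better observer, as in the slide's
# 0.211 / 0.192 / 0.076 ranking of NDO, RTOB, and the proposed algorithm.
```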


Validation results
Fig. 7.2 Position tracking performances and errors; transparency performances and errors in "soft" and "stiff" contact


Summary
- Finite-time convergence of the tracking errors
- Elimination of the uncertainties and noise effects in the obtained force sensing via the new AFOB
- Robust performance against uncertainties and disturbances in the system dynamics
- Stability of the overall system and transparency performance attained simultaneously
- Fast transient response

The great transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and across different working conditions.



8 CONCLUSION AND FUTURE WORKS

Conclusions
- Verified an effective position tracking controller via the AITSMC with the TDE technique under various working conditions (uncertainties and disturbances are handled).
- Validated the significant effectiveness on finite-time force tracking problems based on the FITSMC combined with a new adaptive gain and a friction-free disturbance observer (FSOB).
- Based on the online learning ability of the RL approximation method, the model-free controller for optimal contact force ensures great position and force tracking performance.
- Optimized the prescribed impedance model parameters by the IRL approach, requiring little operator information while yielding highly effective position control.
- Achieved good transparency performance with both force feedback and position tracking simultaneously by the new adaptive force estimation combined with the separate FNTSMC schemes.


Future works

- To improve the control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply optimized strategies on both the master and slave sides; note also that the computational complexity should be reduced with respect to the data size.
- Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device in the rehabilitation field with a safer solution.


A PRESENTATION OF DOCTORAL DISSERTATION

Presenter: Cong Phat Vo
Department of Mechanical Engineering, University of Ulsan, Korea

Thank you for attending


Page 43: A PRESENTATION OF DOCTORAL DISSERTATION

20-May-21 | 43Presenter Cong Phat Vo

6 OPTIMIZED HUMAN-ROBOT INTERACTION FORCE CONTROL USING IRL

Fig 64 The output trajectory and impedance model in the inner loop

Fig 65 The desired trajectory and output trajectory the outer loop

Fig 66 Interaction force

Perfectly performance of the human-

robot cooperation

A little deviation between the desired

and actual trajectory

The human interaction force is reduced

after learning and the optimal set of the

prescribed impedance model is found by

the outer-loop controller

Experiment

Validation results

Discussion

1 2 3 4 5 6 7 8

20-May-21 | 44Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Conclusion and Future works

Introduction

7

4

5

6

C

H

A

P

T

E

R

1

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Force Sensorless Reflecting Control for BHTS7

1 2 3 4 5 6 7 8

20-May-21 | 45Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

3 The effectiveness of the proposed control method is verified in

the different working conditions

2The stability and finite-time convergence characteristic of the closed-

loop control are theoretically analyzed by the Lyapunov approach

The first time the hybrid control based

on a fast finite-time NTSMC and AFOB1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 46Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

( )1 2id = minus +

1 1

2

= minus + = minus +

id

T

i i id i

T

i ia = minus

The external torque estimation

[ ]

1 2ˆ( ) = + + minus minus minus +iv

i i ir i i i i i i i i i i idM C K sign s s s

( )i i i is = +

[ ]

2

1 2

if 0 or 0 and ( )

( ) if 0 and

i

i i i i i i

i i

i i i i i i i i

s s

z z sign s

= =

+

An adaptive FSOB gain

where

where

The torque control design by style-FNTSMC

The modified FNTSM surface design

Master Slave

Environment

APAM Slave

system

Slave

Controller

Estimator

Oper

ator

Master

Controller

APAM Master

system

Estimator

Impendence

filter

+

++

-

+

-

+

+md

m s

s

sd

ˆsdˆmd

mf

m

d

Control design

1 2 3 4 5 6 7 8

20-May-21 | 47Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig 71 Torque estimation performances

λi = 8 pi = 5 qi = 7 ςi = 10-4 νi = 911

εi = 01 ρ1i = 10 ρ2i = 2

Proposed control

PID control KP = 38 KI = 25 KD = 015

Observersfor NDO Lh = Le = 10 for RTOB β = 350 rads

for AFOB κi(0) = 01 λi = 10 ηi = 0 m = 01

a = 10

RMSEs associated with NDO RTOB and proposed algorithm

are calculated of 0211 0192 and 0076 respectively

Parameters

Scenarios- The free space situation θd (t) = 5π(1 - 018t)sin(072πt)

- The contact and recovery situation soft environment Ke = 500 Nmm-1 Be = 4 Nsmm-1

stiff environment Ke = 2800 Nmm-1 Be = 20 Nsmm-1

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 48Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig 72 Position tracking performances

Fig 72 Position tracking error performances

Fig 72 Transparency performance in ldquosoftrdquo

Fig 72 Transparency error in soft contract

Fig 72 Transparency performance in ldquostiffrdquo

Fig 72 Transparency error in stiff contract

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 49Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Finite-time convergence of the

tracking errors

Eliminating the uncertainties the

noise effect in the obtained force

sensing via new AFOB

Robust performance against uncertainties

and disturbances in the system dynamics

Stability of the overall system and the

transparency performance can be

attained simultaneously

Fast transient response

Summary

The great transparency performance and global stability of the BHTS under the proposed

control scheme are achieved despite the uncertainties and in different working conditions

1 2 3 4 5 6 7 8

20-May-21 | 50Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Introduction

4

5

6

C

H

A

P

T

E

R

1

7

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Conclusion and Future works78

1 2 3 4 5 6 7 8

20-May-21 | 51Presenter Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Verified an effective position tracking controller via the AITSMC with TDE technique under

various working conditions (Its uncertainties and disturbance are handled)

Validated the significant effectiveness of finite-time force tracking problems based on the

FITSMC combined with a new adaptive gain and a friction-free disturbance (FSOB)

Based on the online learning ability of the RL approximation method the model-free controller

for optimal contact force ensures great position and force tracking performance

Optimized a prescribed impedance model parameters by the IRL approach to provide little

operator information in yielding highly effective position control purpose

Achieved good transparency performance with both force feedback and position tracking

simultaneously by the new adaptive force estimation combined the separately FNTSMC schemes

Conclutions

1 2 3 4 5 6 7 8

20-May-21 | 52Presenter Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Future works

To improve the control performance in joint-space the robot system model can be

improved by robust identification approaches

Effectiveness of the proposed algorithm can be further increased by developing an automatic

tune method for both the estimation and control gains

Apply optimized strategies on both master and slave sides Note also that the computational

complexity should be reduced by the data size

Based on the developed prototype in this thesis it is possible to develop a high-quality

commercial teleoperation device in the rehabilitation field with a safer solution

1 2 3 4 5 6 7 8

20-May-21 | 53

A PRESENTATION OF DOCTORAL DISSERTATION

Presenter Cong Phat Vo

Department of Mechanical Engineering University of Ulsan Korea

Thank you for attending

1 2 3 4 5 6 7 8

Page 44: A PRESENTATION OF DOCTORAL DISSERTATION

20-May-21 | 44Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Conclusion and Future works

Introduction

7

4

5

6

C

H

A

P

T

E

R

1

8

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Force Sensorless Reflecting Control for BHTS7

1 2 3 4 5 6 7 8

20-May-21 | 45Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

3 The effectiveness of the proposed control method is verified in

the different working conditions

2The stability and finite-time convergence characteristic of the closed-

loop control are theoretically analyzed by the Lyapunov approach

The first time the hybrid control based

on a fast finite-time NTSMC and AFOB1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 46Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

( )1 2id = minus +

1 1

2

= minus + = minus +

id

T

i i id i

T

i ia = minus

The external torque estimation

[ ]

1 2ˆ( ) = + + minus minus minus +iv

i i ir i i i i i i i i i i idM C K sign s s s

( )i i i is = +

[ ]

2

1 2

if 0 or 0 and ( )

( ) if 0 and

i

i i i i i i

i i

i i i i i i i i

s s

z z sign s

= =

+

An adaptive FSOB gain

where

where

The torque control design by style-FNTSMC

The modified FNTSM surface design

Master Slave

Environment

APAM Slave

system

Slave

Controller

Estimator

Oper

ator

Master

Controller

APAM Master

system

Estimator

Impendence

filter

+

++

-

+

-

+

+md

m s

s

sd

ˆsdˆmd

mf

m

d

Control design

1 2 3 4 5 6 7 8

20-May-21 | 47Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig 71 Torque estimation performances

λi = 8 pi = 5 qi = 7 ςi = 10-4 νi = 911

εi = 01 ρ1i = 10 ρ2i = 2

Proposed control

PID control KP = 38 KI = 25 KD = 015

Observersfor NDO Lh = Le = 10 for RTOB β = 350 rads

for AFOB κi(0) = 01 λi = 10 ηi = 0 m = 01

a = 10

RMSEs associated with NDO RTOB and proposed algorithm

are calculated of 0211 0192 and 0076 respectively

Parameters

Scenarios- The free space situation θd (t) = 5π(1 - 018t)sin(072πt)

- The contact and recovery situation soft environment Ke = 500 Nmm-1 Be = 4 Nsmm-1

stiff environment Ke = 2800 Nmm-1 Be = 20 Nsmm-1

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 48Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig 72 Position tracking performances

Fig 72 Position tracking error performances

Fig 72 Transparency performance in ldquosoftrdquo

Fig 72 Transparency error in soft contract

Fig 72 Transparency performance in ldquostiffrdquo

Fig 72 Transparency error in stiff contract

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 49Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Finite-time convergence of the

tracking errors

Eliminating the uncertainties the

noise effect in the obtained force

sensing via new AFOB

Robust performance against uncertainties

and disturbances in the system dynamics

Stability of the overall system and the

transparency performance can be

attained simultaneously

Fast transient response

Summary

The great transparency performance and global stability of the BHTS under the proposed

control scheme are achieved despite the uncertainties and in different working conditions

1 2 3 4 5 6 7 8

20-May-21 | 50Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Introduction

4

5

6

C

H

A

P

T

E

R

1

7

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Conclusion and Future works78

1 2 3 4 5 6 7 8

20-May-21 | 51Presenter Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Verified an effective position tracking controller via the AITSMC with TDE technique under

various working conditions (Its uncertainties and disturbance are handled)

Validated the significant effectiveness of finite-time force tracking problems based on the

FITSMC combined with a new adaptive gain and a friction-free disturbance (FSOB)

Based on the online learning ability of the RL approximation method the model-free controller

for optimal contact force ensures great position and force tracking performance

Optimized a prescribed impedance model parameters by the IRL approach to provide little

operator information in yielding highly effective position control purpose

Achieved good transparency performance with both force feedback and position tracking

simultaneously by the new adaptive force estimation combined the separately FNTSMC schemes

Conclutions

1 2 3 4 5 6 7 8

20-May-21 | 52Presenter Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Future works

To improve the control performance in joint-space the robot system model can be

improved by robust identification approaches

Effectiveness of the proposed algorithm can be further increased by developing an automatic

tune method for both the estimation and control gains

Apply optimized strategies on both master and slave sides Note also that the computational

complexity should be reduced by the data size

Based on the developed prototype in this thesis it is possible to develop a high-quality

commercial teleoperation device in the rehabilitation field with a safer solution

1 2 3 4 5 6 7 8

20-May-21 | 53

A PRESENTATION OF DOCTORAL DISSERTATION

Presenter Cong Phat Vo

Department of Mechanical Engineering University of Ulsan Korea

Thank you for attending

1 2 3 4 5 6 7 8

Page 45: A PRESENTATION OF DOCTORAL DISSERTATION

20-May-21 | 45Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

3 The effectiveness of the proposed control method is verified in

the different working conditions

2The stability and finite-time convergence characteristic of the closed-

loop control are theoretically analyzed by the Lyapunov approach

The first time the hybrid control based

on a fast finite-time NTSMC and AFOB1

Proposed control idea

1 2 3 4 5 6 7 8

20-May-21 | 46Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

( )1 2id = minus +

1 1

2

= minus + = minus +

id

T

i i id i

T

i ia = minus

The external torque estimation

[ ]

1 2ˆ( ) = + + minus minus minus +iv

i i ir i i i i i i i i i i idM C K sign s s s

( )i i i is = +

[ ]

2

1 2

if 0 or 0 and ( )

( ) if 0 and

i

i i i i i i

i i

i i i i i i i i

s s

z z sign s

= =

+

An adaptive FSOB gain

where

where

The torque control design by style-FNTSMC

The modified FNTSM surface design

Master Slave

Environment

APAM Slave

system

Slave

Controller

Estimator

Oper

ator

Master

Controller

APAM Master

system

Estimator

Impendence

filter

+

++

-

+

-

+

+md

m s

s

sd

ˆsdˆmd

mf

m

d

Control design

1 2 3 4 5 6 7 8

20-May-21 | 47Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig 71 Torque estimation performances

λi = 8 pi = 5 qi = 7 ςi = 10-4 νi = 911

εi = 01 ρ1i = 10 ρ2i = 2

Proposed control

PID control KP = 38 KI = 25 KD = 015

Observersfor NDO Lh = Le = 10 for RTOB β = 350 rads

for AFOB κi(0) = 01 λi = 10 ηi = 0 m = 01

a = 10

RMSEs associated with NDO RTOB and proposed algorithm

are calculated of 0211 0192 and 0076 respectively

Parameters

Scenarios- The free space situation θd (t) = 5π(1 - 018t)sin(072πt)

- The contact and recovery situation soft environment Ke = 500 Nmm-1 Be = 4 Nsmm-1

stiff environment Ke = 2800 Nmm-1 Be = 20 Nsmm-1

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 48Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig 72 Position tracking performances

Fig 72 Position tracking error performances

Fig 72 Transparency performance in ldquosoftrdquo

Fig 72 Transparency error in soft contract

Fig 72 Transparency performance in ldquostiffrdquo

Fig 72 Transparency error in stiff contract

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 49Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Finite-time convergence of the

tracking errors

Eliminating the uncertainties the

noise effect in the obtained force

sensing via new AFOB

Robust performance against uncertainties

and disturbances in the system dynamics

Stability of the overall system and the

transparency performance can be

attained simultaneously

Fast transient response

Summary

The great transparency performance and global stability of the BHTS under the proposed

control scheme are achieved despite the uncertainties and in different working conditions

1 2 3 4 5 6 7 8

20-May-21 | 50Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Introduction

4

5

6

C

H

A

P

T

E

R

1

7

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Conclusion and Future works78

1 2 3 4 5 6 7 8

20-May-21 | 51Presenter Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Verified an effective position tracking controller via the AITSMC with TDE technique under

various working conditions (Its uncertainties and disturbance are handled)

Validated the significant effectiveness of finite-time force tracking problems based on the

FITSMC combined with a new adaptive gain and a friction-free disturbance (FSOB)

Based on the online learning ability of the RL approximation method the model-free controller

for optimal contact force ensures great position and force tracking performance

Optimized a prescribed impedance model parameters by the IRL approach to provide little

operator information in yielding highly effective position control purpose

Achieved good transparency performance with both force feedback and position tracking

simultaneously by the new adaptive force estimation combined the separately FNTSMC schemes

Conclutions

1 2 3 4 5 6 7 8

20-May-21 | 52Presenter Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Future works

To improve the control performance in joint-space the robot system model can be

improved by robust identification approaches

Effectiveness of the proposed algorithm can be further increased by developing an automatic

tune method for both the estimation and control gains

Apply optimized strategies on both master and slave sides Note also that the computational

complexity should be reduced by the data size

Based on the developed prototype in this thesis it is possible to develop a high-quality

commercial teleoperation device in the rehabilitation field with a safer solution

1 2 3 4 5 6 7 8

20-May-21 | 53

A PRESENTATION OF DOCTORAL DISSERTATION

Presenter Cong Phat Vo

Department of Mechanical Engineering University of Ulsan Korea

Thank you for attending

1 2 3 4 5 6 7 8

Page 46: A PRESENTATION OF DOCTORAL DISSERTATION

20-May-21 | 46Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

( )1 2id = minus +

1 1

2

= minus + = minus +

id

T

i i id i

T

i ia = minus

The external torque estimation

[ ]

1 2ˆ( ) = + + minus minus minus +iv

i i ir i i i i i i i i i i idM C K sign s s s

( )i i i is = +

[ ]

2

1 2

if 0 or 0 and ( )

( ) if 0 and

i

i i i i i i

i i

i i i i i i i i

s s

z z sign s

= =

+

An adaptive FSOB gain

where

where

The torque control design by style-FNTSMC

The modified FNTSM surface design

Master Slave

Environment

APAM Slave

system

Slave

Controller

Estimator

Oper

ator

Master

Controller

APAM Master

system

Estimator

Impendence

filter

+

++

-

+

-

+

+md

m s

s

sd

ˆsdˆmd

mf

m

d

Control design

1 2 3 4 5 6 7 8

20-May-21 | 47Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig 71 Torque estimation performances

λi = 8 pi = 5 qi = 7 ςi = 10-4 νi = 911

εi = 01 ρ1i = 10 ρ2i = 2

Proposed control

PID control KP = 38 KI = 25 KD = 015

Observersfor NDO Lh = Le = 10 for RTOB β = 350 rads

for AFOB κi(0) = 01 λi = 10 ηi = 0 m = 01

a = 10

RMSEs associated with NDO RTOB and proposed algorithm

are calculated of 0211 0192 and 0076 respectively

Parameters

Scenarios- The free space situation θd (t) = 5π(1 - 018t)sin(072πt)

- The contact and recovery situation soft environment Ke = 500 Nmm-1 Be = 4 Nsmm-1

stiff environment Ke = 2800 Nmm-1 Be = 20 Nsmm-1

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 48Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Fig 72 Position tracking performances

Fig 72 Position tracking error performances

Fig 72 Transparency performance in ldquosoftrdquo

Fig 72 Transparency error in soft contract

Fig 72 Transparency performance in ldquostiffrdquo

Fig 72 Transparency error in stiff contract

Validation results

1 2 3 4 5 6 7 8

20-May-21 | 49Presenter Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Finite-time convergence of the

tracking errors

Eliminating the uncertainties the

noise effect in the obtained force

sensing via new AFOB

Robust performance against uncertainties

and disturbances in the system dynamics

Stability of the overall system and the

transparency performance can be

attained simultaneously

Fast transient response

Summary

The great transparency performance and global stability of the BHTS under the proposed

control scheme are achieved despite the uncertainties and in different working conditions

1 2 3 4 5 6 7 8

20-May-21 | 50Presenter Cong Phat Vo

CONTENTS

Adaptive Finite-Time Force Sensorless Control scheme

Model-free control for Optimal contact force in Unknown Environment

Optimized Human-Robot Interaction force control using IRL

Force Sensorless Reflecting Control for BHTS

Introduction

4

5

6

C

H

A

P

T

E

R

1

7

Modeling and platform setup based on the APAM configuration2

Adaptive Gain ITSMC with TDE scheme3

Conclusion and Future works78

1 2 3 4 5 6 7 8

20-May-21 | 51Presenter Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Verified an effective position tracking controller via the AITSMC with TDE technique under

various working conditions (Its uncertainties and disturbance are handled)

Validated the significant effectiveness of finite-time force tracking problems based on the

FITSMC combined with a new adaptive gain and a friction-free disturbance (FSOB)

Based on the online learning ability of the RL approximation method the model-free controller

for optimal contact force ensures great position and force tracking performance

Optimized a prescribed impedance model parameters by the IRL approach to provide little

operator information in yielding highly effective position control purpose

Achieved good transparency performance with both force feedback and position tracking

simultaneously by the new adaptive force estimation combined the separately FNTSMC schemes

Conclutions

1 2 3 4 5 6 7 8

20-May-21 | 52 Presenter: Cong Phat Vo

8 CONCLUSION AND FUTURE WORKS

Future works
- To improve control performance in joint space, the robot system model can be refined by robust identification approaches.
- The effectiveness of the proposed algorithm can be further increased by developing an automatic tuning method for both the estimation and control gains.
- Apply the optimized strategies on both the master and slave sides; the computational complexity should also be reduced with respect to the data size.
- Based on the prototype developed in this thesis, it is possible to develop a high-quality commercial teleoperation device for the rehabilitation field with a safer solution.

20-May-21 | 53

A PRESENTATION OF DOCTORAL DISSERTATION
Presenter: Cong Phat Vo
Department of Mechanical Engineering, University of Ulsan, Korea

Thank you for attending.

Page 47: A PRESENTATION OF DOCTORAL DISSERTATION

20-May-21 | 47 Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results
Fig. 7.1 Torque estimation performances

Parameters
- Proposed control: λi = 8, pi = 5, qi = 7, ςi = 10⁻⁴, νi = 9/11, εi = 0.1, ρ1i = 10, ρ2i = 2
- PID control: KP = 38, KI = 25, KD = 0.15
- Observers: for the NDO, Lh = Le = 10; for the RTOB, β = 350 rad/s; for the AFOB, κi(0) = 0.1, λi = 10, ηi = 0, m = 0.1, a = 10

Scenarios
- Free-space situation: θd(t) = 5π(1 − 0.18t)sin(0.72πt)
- Contact and recovery situation: soft environment Ke = 500 N/mm, Be = 4 N·s/mm; stiff environment Ke = 2800 N/mm, Be = 20 N·s/mm

The RMSEs associated with the NDO, the RTOB, and the proposed algorithm are 0.211, 0.192, and 0.076, respectively.
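The observer comparison above is scored by RMSE between the reference and each estimate. The sketch below shows that metric and the slide's free-space reference trajectory; the torque signals are synthetic stand-ins for illustration, not the experimental data from the thesis.

```python
import numpy as np

def rmse(reference, estimate):
    """Root-mean-square error between a reference signal and an estimate."""
    r = np.asarray(reference, dtype=float)
    e = np.asarray(estimate, dtype=float)
    return float(np.sqrt(np.mean((r - e) ** 2)))

def theta_d(t):
    """Free-space reference from the slide: theta_d(t) = 5*pi*(1 - 0.18t)*sin(0.72*pi*t)."""
    return 5.0 * np.pi * (1.0 - 0.18 * t) * np.sin(0.72 * np.pi * t)

# Hypothetical torque signals (stand-ins for the NDO/RTOB/AFOB experiment)
t = np.linspace(0.0, 5.0, 1001)
true_torque = np.sin(t)
noisy_estimate = true_torque + 0.05 * np.sin(37.0 * t)  # small deterministic ripple

print(rmse(true_torque, noisy_estimate))  # lower is better, as in the slide's 0.076
```

A lower RMSE over the same trajectory is what ranks the proposed AFOB-based algorithm (0.076) ahead of the RTOB (0.192) and the NDO (0.211).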

20-May-21 | 48 Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Validation results
Fig. 7.2 Position tracking performances and errors; transparency performance and error in "soft" contact; transparency performance and error in "stiff" contact.

20-May-21 | 49 Presenter: Cong Phat Vo

7 FORCE SENSORLESS REFLECTING CONTROL FOR BHTS

Summary
- Finite-time convergence of the tracking errors
- Elimination of the uncertainties and the noise effect in the obtained force sensing via the new AFOB
- Robust performance against uncertainties and disturbances in the system dynamics
- Stability of the overall system and the transparency performance attained simultaneously
- Fast transient response

Good transparency performance and global stability of the BHTS under the proposed control scheme are achieved despite the uncertainties and under different working conditions.
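The "soft" and "stiff" contact scenarios are characterized by the stiffness and damping pairs given in the validation setup. A common way to render such environments is a spring-damper contact law F_e = K_e·x + B_e·ẋ active only during penetration; the sketch below uses the slide's parameter values, but the function itself is an illustrative assumption rather than the thesis's exact environment model.

```python
# Spring-damper contact model (illustrative; the thesis may differ in detail).
# Parameter values follow the slide's "soft" and "stiff" scenarios.
ENVS = {
    "soft":  {"Ke": 500.0,  "Be": 4.0},   # N/mm and N*s/mm
    "stiff": {"Ke": 2800.0, "Be": 20.0},
}

def contact_force(env, x, x_dot):
    """Reaction force for penetration depth x (mm) and velocity x_dot (mm/s).

    Returns zero when the tool is out of contact (x <= 0).
    """
    if x <= 0.0:
        return 0.0
    p = ENVS[env]
    return p["Ke"] * x + p["Be"] * x_dot

# The same penetration yields a much larger reaction on the stiff environment,
# which is why transparency is harder to maintain in the "stiff" scenario.
print(contact_force("soft", 0.5, 1.0), contact_force("stiff", 0.5, 1.0))
```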
