Virtual Object Manipulation with The Tango
Abstract
In this project we demonstrate the use of a whole-hand input device - The Tango. The device has two kinds of sensors, a grid of pressure sensors and an accelerometer. We aim at real-time tracking of hand grasp-configuration using the measurements of the pressure sensors to achieve virtual hand manipulation of objects. The system processes the streaming sensor data and maps it onto a prior human hand model to determine the configuration of the hand of the user. The hand posture along with the individual finger configurations are tracked in parallel using particle filters. The position of the Tango is determined with the help of the accelerometer readings by estimating its attitude using Kalman filters. We create a user-interface with virtual objects in a 3D World, where we navigate with the Tango and interact with the objects using all or some of the sensors of the device.
Acknowledgments
Contents
Abstract
Contents
List of Figures
1 Introduction
2 Background and Related Work
2.1 Whole-hand Input Devices
2.1.1 Glove-based interfaces
2.1.2 Optical Tracking based devices
2.1.3 Force-Feedback devices
2.2 Hand Tracking and Kalman Filters
2.2.1 Hand Tracking Techniques
2.2.2 Kalman Filters
2.2.3 Other Probabilistic Approaches for Tracking
3 The Tango- Sensor
3.1 Device Design
4 Hand Configuration Tracking
4.1 Tracking Procedure
4.2 Model of the Hand
4.3 Processing the Measurements from the Sensor
4.3.1 Pressure Signal Change Detection
4.3.2 Assignment of Fingers
4.3.3 Grasp Validation
4.4 Tracking Hand Configuration with Kalman Filters
4.4.1 Least Squares Estimation
4.4.2 Results
4.5 Tracking with Particle Filters
5 Tango Position and Attitude Estimation
5.1 Attitude Estimation under low acceleration
5.2 Position Estimation
6 User-Interface to demonstrate the Tango
6.1 3D Environment
6.2 Scene Navigation
6.3 Object Manipulation
6.3.1 Hand Grasp Tracking
6.3.2 Free-Form Buttons
6.3.3 Object Picking
6.3.4 Force Application
6.3.5 Implementation and Performance
7 Conclusion and Future Work
Bibliography
List of Figures
3.1 The Tango device
3.2 Sensing and communication electronics
4.1 Hand Configuration Tracking- Schematic Diagram
4.2 Geometric Model of the Hand
4.3 Measurements for the tracking procedure
4.4 Measurements for the tracking procedure
4.5 Steps for gathering measurements
4.6 Mean Signal value per Sample Window
4.7 Change detection Results - Same variance of both
4.8 Change detection Results - Different variance
4.9 Change detection Results - Same variance of both
4.10 Model fitting on the Tango
4.11 Tracking Results Comparison
6.1 Screen shot of the User Interface
6.2 Flow Diagram of the User Interface
6.3 Hand Grasp Tracking - Screen Shot
6.4 Free Form Buttons - Screen Shot
6.5 Object Manipulation - Screen Shot
6.6 Force Application - Screen shot
Chapter 1
Introduction
With the unprecedented increase in the processing power of computers, virtual
reality technology has improved rapidly. As virtual reality strives to imitate the
real world as closely as possible, the need arises for sophisticated object
manipulation techniques that allow users to interact realistically with objects.
Whole-hand input devices help to accomplish this by providing a mechanism for
controlling complex tasks that require dextrous manipulation of virtual objects.
Several other applications also require information from the whole hand, for
example sign language systems, robotic control and tele-operation, and computer
animation. The design of sophisticated whole-hand input devices has helped to
address such applications. The main advantage of such devices is that information
from the entire hand can be used to accomplish tasks that cannot be achieved with
traditional input devices like the keyboard or mouse.
However, the difficulty of manipulating objects in a virtual environment in
a natural way remains a significant problem. In this project we demonstrate
the use of a new whole-hand input device, the Tango. The design of the Tango
enables realistic manipulation of objects in the virtual world. It is shaped like a
round ball, which can be manipulated naturally. The device measures the pressure
distribution of the fingertips and the 3D orientation of the device. The Tango
can be used to pick up objects in the virtual world, move them around and
apply forces to them in a natural way. To use this device as an interface, and
hence to enable superior object manipulation, we aim at real-time hand grasp
recognition and tracking from the measurements of the device.
Tracking of the hand grasp configuration is achieved by probabilistic estimation
of a hand model with 11 degrees of freedom using Kalman filters. The
measurements are the pressure distribution of the fingertips in 3D device
coordinates, which leads to the sub-problem of assigning fingers to the correct
active Tango pressure taxels. The motion and orientation of the Tango are inferred
from the accelerometer readings using the techniques of [1].
We finally demonstrate its use as an interface in a virtual 3D environment
where the Tango can be used to manipulate objects in different ways using all or
some of the information that has been inferred from its measurements.
The remainder of the thesis is organized as follows. Chapter 2 gives the
background and previous work on whole-hand input devices, hand tracking techniques
and Kalman filters previously applied to this problem. Chapter 3 gives details about
the device. Chapter 4 describes the entire process of tracking the grasp
configuration, starting from the hand model and the assignment of fingers. Chapter 5
describes the method we used to infer the motion and orientation of the Tango from
the accelerometer data. Chapter 6 describes the user interface. Conclusions,
limitations and future work are described in the last chapter.
Chapter 2
Background and Related Work
2.1 Whole-hand Input Devices
The motivation for designing whole-hand input devices comes largely from the
constraints that common control devices like the mouse or joystick place on input to
computer applications. They do not allow the information and dexterity of the entire
hand to be applied to the application.
A variety of whole-hand input devices have been designed and are discussed in
this section. All of them gather whole-hand information such as position and
configuration using a variety of sensors and pass it on to the application. The
devices can be categorized by the kind of technology they use to track the hand
configuration or by the kind of information they pass on to the application.
2.1.1 Glove-based interfaces
Glove-based interfaces are currently the most common whole-hand user interfaces;
examples include the CyberGlove™ [2], the DataGlove™ [3] and the Dextrous
HandMaster [4]. These devices measure the fingertip location indirectly by measuring
joint angles. Gloves are used primarily for position tracking of the hand, which
includes the location of the hand, the orientation of the palm and the finger-joint
configuration. The technology used differs from glove to glove and is briefly
described below.
The CyberGlove™ [2] is highly accurate and is used largely in systems that depend
on precision. It provides 22 joint-angle measurements and uses resistive
bend-sensing technology to gather joint-angle information from hand and finger
motions.
The DataGlove (originally developed by VPL Research) uses fiberoptic sensors and
movement analysis software to determine the hand configuration and to collect wrist
and hand motion data. It collects data dynamically in 3D space through fiberoptic
sensors (two for each finger), which track the wearer’s movement. The main
limitation of this device is that it streams information at 30 Hz, which is
insufficient for some applications.
The Dextrous HandMaster was developed by Arthur D. Little and Sarcos for the
Utah/MIT Dextrous Hand. It is a lightweight aluminium exoskeleton that is attached
to the fingers. Unlike the DataGlove, it measures the flexure of 3 joints for each
finger via a magnet housed within each joint. In total it measures 20 DOF for the
whole hand. It is now sold by Exos [4].
Other examples of gloves include the Powerglove by Mattel for the Nintendo
Entertainment System, the Sayre glove by Thomas DeFanti and Daniel Sandin
at the University of Illinois, Chicago, and the Digital Data Entry Glove by Bell
Telephone Laboratories. A survey of the literature on data gloves can be found in [5].
2.1.2 Optical Tracking based devices
Optical trackers use the position of targets in 2D camera images to reconstruct
a point in 3D. The targets can be active LEDs, reflective markers or passive
high-contrast patterns. Optical trackers have good accuracy and update rates, but
their disadvantages include the need for a line of sight to the targets, disturbance
by noise and loss of accuracy with distance.
An example of an input device that uses optical tracking with retro-reflective
markers as targets is the Dragonfly, a pointing input device designed to give the
user a variety of grips. It uses six retro-reflective spheres in combination with
the narrow form of the device to allow precise identification of the device and of
its position and orientation in the 3D environment.
Computer vision has also been used for hand tracking, for example in the
model-based integration of visual cues for hand tracking [6] and in finger tracking
using multiple cameras [7]. The approach of [6] combines multiple sources of
information, such as edges, optical flow and shading, to track the motion of the
hand, which is modelled as a base link with five linked chains. That method uses a
single camera to track the hand, whereas [7] uses multiple cameras and combines
stereo and color features to track the 3D position and orientation of a finger.
2.1.3 Force-Feedback devices
The lack of force feedback has been an important limitation of the above devices.
It has been addressed by providing active force feedback, for instance with the
Rutgers Hand Master [8]. The Rutgers Master glove provides force feedback to the
fingertips in addition to serving as a position-measuring exoskeleton. The
CyberGrasp™ [2] is another device with active force feedback; it fits over a
CyberGlove and adds resistive force to each finger. In the CyberGrasp, grasping
forces are applied to the fingertips by a network of tendons routed via the
lightweight exoskeleton structure.
While active force feedback is valuable for many tasks, we found that passive
force feedback, combined with a tangible object like a ball which fits conveniently
and comfortably in the human hand and provides good affordances for 3D
manipulation, provides a significant improvement over devices with no force feedback
at all, with greatly reduced complexity. Passive haptics was also found by [9] to
significantly enhance immersive virtual environments. Other devices, such as the
Fingerball developed at the University of Toronto [10], provide a similar form factor
for grasping and passive force feedback, but do not measure the pressure distribution.
There has also been a significant amount of research on tactile sensing [11, 12].
Commercial sensors are available from, for example, Pressure Profile
Systems, Inc. [13] and Xsensor Technology Corp. [14]. In general, these tend to be
either small sensors designed for measuring contact at the fingertips (e.g., [13]) or
large sensors for biomedical applications [14], and they are not designed to conform
to curved objects.
2.2 Hand Tracking and Kalman Filters
2.2.1 Hand Tracking Techniques
At a high level, the kinds of sensors used for tracking are:
• Inertial trackers, using sensors such as accelerometers and gyroscopes
• Optical trackers, which can be categorized further based on the target tracked
– Video-based trackers, which track one or more of the following features
∗ Color
∗ Histogram
∗ Optic flow
∗ Stereo information (if using multiple cameras)
– LEDs, infrared or reflective markers
• Magnetic trackers
• Mechanical trackers
• Acoustic trackers
• Radio trackers
• Pressure sensors
Recently, hybrid tracking has become increasingly popular. It combines several of
the sensors mentioned above, especially those that complement each other. For
example, vision-based tracking combined with inertial tracking performs well,
because vision tracking works well when there is little movement and inertial
tracking performs better when the displacement is large.
Examples of hand tracking methods using some of the above sensors can be found
in [5, 15].
We saw some of these sensors used in the whole-hand input devices of the previous
section. To track the hand (or any other object) accurately, probabilistic methods
are incorporated. Their advantages are that they can cope with occlusion and can
work in complex environments where the input data from some sensors might not be
accurate. Some of these methods are:
• Kalman Filters
• Particle Filters
• Hidden Markov Models
• Neural Networks
• Bayesian Networks.
2.2.2 Kalman Filters
The Kalman filter is a recursive stochastic estimation method that estimates
the state of a process in a way that minimizes the mean of the squared error.
Kalman filtering is applied to various kinds of tracking problems, independent of
the underlying physical measurement. Here we review some of the literature on
applications where Kalman-based tracking was used to track human hands. An
introduction to Kalman filtering and extended Kalman filtering is given in [16].
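To make the predict/update cycle concrete, the following is a minimal sketch of a scalar (one-dimensional) Kalman filter in Python. It is an illustration only and not the filter used later in this thesis: the random-walk process model and the names `kalman_step`, `kalman_filter`, `q` and `r` are assumptions chosen for the example.

```python
def kalman_step(x, p, z, q, r):
    """One predict/update cycle of a scalar Kalman filter.

    x, p : previous state estimate and its variance
    z    : new measurement
    q, r : process and measurement noise variances (assumed known)
    """
    # Predict: random-walk model, so the state carries over and the
    # uncertainty grows by the process noise.
    x_pred = x
    p_pred = p + q
    # Update: blend prediction and measurement using the Kalman gain.
    k = p_pred / (p_pred + r)
    x_new = x_pred + k * (z - x_pred)
    p_new = (1.0 - k) * p_pred
    return x_new, p_new


def kalman_filter(measurements, x0=0.0, p0=1.0, q=1e-4, r=0.1):
    """Run the filter over a sequence and return the state estimates."""
    x, p = x0, p0
    estimates = []
    for z in measurements:
        x, p = kalman_step(x, p, z, q, r)
        estimates.append(x)
    return estimates
```

A small gain (large `r` relative to the predicted variance) makes the estimate trust the model; a large gain makes it follow the measurements more closely.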
In the vision field, much work has been done to estimate motion and structure
from image sequences alone. Various gesture recognition systems are based on hand
tracking using Kalman filtering techniques.
Much work has been done on probabilistic, Kalman-based tracking using optical
sensors. In the early work of [17], the hand pose configuration is tracked using
silhouette matching. The model is fitted to the image using extended Kalman
filtering. The distribution is modified based on inequality constraints, a method
they introduce as ’truncating’. [18] introduced Kalman filters for tracking in a
hierarchical manner. Here the human dynamics are broken into various levels for
better tracking. At the lowest level, pixels are grouped by coherent motion and
similar features and clustered into blobs using the EM algorithm. At the next level
these blobs are tracked using Kalman filters based on linear dynamic models, and at
the highest level HMMs are used to represent gestures.
More recent work by Stenger [19] describes a model-based hand tracking system in
which an unscented Kalman filter is used to update the pose. Measurements come
from a single video camera. The unscented Kalman filter is an alternative to the
extended Kalman filter for non-linear processes, in which selected sample points are
propagated instead of computing Jacobians.
Kalman filters are also widely used to fuse several sensors for better tracking.
Lin et al. [20] use extended Kalman filters for hybrid tracking, in which the head
is tracked using inertial and visual sensors. One EKF is used to estimate the head
motion and another is used to estimate the 3D locations of points in the scene. They
address the problem of estimating both the camera motion and the observed features,
and show how measurements from the two kinds of sensors are fused to achieve better
tracking results.
Real-time hand tracking for augmented desk interface systems, using infrared
sensors to deal with changing illumination, was proposed by [21]. The
trajectories of multiple fingers are predicted using Kalman filters and updated with
measurements from the image data.
2.2.3 Other Probabilistic Approaches for Tracking
Other hand tracking systems use different probabilistic algorithms. One example
is CONDENSATION (conditional density propagation), introduced by Isard et al.
[22]. Conditional density propagation, also known as particle filtering, follows a
statistical factored sampling algorithm and is used in that work to track hand
contours in a complex scene. A similar approach extended with annealing was
suggested by Blake et al. [23], where modified particle filters are used to track the
whole human body with a model-based approach. Other approaches include [24], which
proposes a Bayesian framework to combine knowledge of color and shape with the
observation, and [25], which suggests a neural-network-based tracking procedure with
motion estimates as features.
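The factored sampling idea behind particle filters can be sketched in a few lines. The following is a generic one-dimensional bootstrap particle filter, not the contour tracker of [22]; the random-walk motion model, the Gaussian measurement likelihood and the names `particle_filter_step`, `process_std` and `meas_std` are assumptions made for illustration.

```python
import math
import random


def particle_filter_step(particles, z, rng, process_std=0.1, meas_std=0.5):
    """One predict / weight / resample cycle of a bootstrap particle filter."""
    # Predict: diffuse each particle with random-walk process noise.
    predicted = [p + rng.gauss(0.0, process_std) for p in particles]
    # Weight: Gaussian likelihood of the measurement z given each particle.
    weights = [math.exp(-0.5 * ((z - p) / meas_std) ** 2) for p in predicted]
    if sum(weights) == 0.0:
        # All likelihoods underflowed; fall back to uniform weights.
        weights = [1.0] * len(predicted)
    # Resample: draw a new particle set in proportion to the weights.
    return rng.choices(predicted, weights=weights, k=len(predicted))


def estimate(particles):
    """Posterior mean approximated by the particle average."""
    return sum(particles) / len(particles)
```

A typical use is to initialise the particles from a broad prior, for example `rng = random.Random(0)` and `particles = [rng.uniform(-5, 5) for _ in range(500)]`, and then to call `particle_filter_step` once per incoming measurement.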
Chapter 3
The Tango- Sensor
3.1 Device Design
Tango (whose name is derived from the old word “Tangoreception”, which means
pertaining to the sensation of touch) is a hand-sized object (e.g., shaped like a
ball). There are 256 analog pressure sensors on the device’s surface and a 3-axis
accelerometer inside (constructed out of two ADXL202 chips from Analog Devices).
Tango produces an 8x32 tactual image with 8 bits per taxel (tactile sensor element)
at 100 Hz. Data is gathered at 10 ms intervals by the on-board microcontroller and
transmitted isochronously to the host computer over a high-speed USB cable. The
pressure and acceleration data can be interpreted by the host computer and used
for user interaction.
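As an illustration of the data volumes involved, the sketch below unpacks one pressure frame into an 8x32 grid and computes the raw pressure data rate implied by the figures above. The actual wire format of the device is not described here, so the 256-byte payload, the row-major byte order and the function names are hypothetical; accelerometer data is ignored.

```python
ROWS, COLS = 8, 32        # 8 x 32 tactile image, 256 taxels
FRAME_RATE_HZ = 100       # one frame every 10 ms


def unpack_frame(payload):
    """Split a 256-byte payload into an 8 x 32 grid of 8-bit taxel values.

    Row-major byte order is an assumption made for this sketch.
    """
    if len(payload) != ROWS * COLS:
        raise ValueError("expected %d bytes, got %d" % (ROWS * COLS, len(payload)))
    return [list(payload[r * COLS:(r + 1) * COLS]) for r in range(ROWS)]


def pressure_data_rate_bytes_per_s():
    """Raw pressure data rate: 256 one-byte taxels per frame at 100 Hz."""
    return ROWS * COLS * FRAME_RATE_HZ
```

With one byte per taxel this gives 25,600 bytes per second of pressure data before any protocol overhead.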
See Figure 3.1 for an external view of the device and Figure 3.2 for device
internals.
Figure 3.1: The Tango device
Some important design criteria for the device are:
• The device provides passive force feedback, which makes virtual objects more
tangible. One of the difficulties of manipulating 3D objects using devices such
as the CyberGlove™ is that, without some form of force feedback, the fingers
close on themselves and do not give a sense of touching a physical object.
• Capacitance sensing is performed using only digital drive signals. There is
no need for demodulation of sensed voltages, as is common in traditional
capacitive pressure measurements [12].
• Since the device is meant to be used as comfortably and easily as a mouse,
it was important to have a self-contained design, with all analog circuitry on
board and only digital signals leaving the device. A high-speed USB interface
to the host computer has been implemented. The device is hot-pluggable and
operates on USB power, so no separate power supply is needed.
Figure 3.2: Sensing and communication electronics.
The sensing method is as follows. A matrix of pressure sensors is formed
by an outer layer of electrically conductive strips, electrically insulated from one
another, which run perpendicular to an inner layer of electrically conductive strips,
also insulated from one another, with the two layers separated by a compressible
dielectric material (foam rubber). At each intersection point between an outer
and an inner conductive strip, an individual sensor is formed. Pressure applied at a
sensor will cause the dielectric material to compress, thus increasing the capacitance
between the inner and outer conductive strips at that point.
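To a first approximation, each intersection behaves like a parallel-plate capacitor. This idealization is our own assumption (the real strips are curved and fringing fields are ignored), but it shows why compression raises the capacitance:

$$
C = \frac{\varepsilon A}{d}, \qquad \frac{\partial C}{\partial d} = -\frac{\varepsilon A}{d^{2}} < 0
$$

where $A$ is the overlap area of the two strips, $d$ is the thickness of the dielectric and $\varepsilon$ is its permittivity. Pressure reduces $d$, so $C$ increases.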
A novel capacitance sensing method is employed that uses only digital drive
signals. The outer conductive strips, also called driver strips, are driven by
low-impedance digital signals, which serve to shield the inner conductive strips,
also called sensor strips, from external electric fields. The voltage on the sensor
strips is kept in a measurable range by bias resistors connected from a constant
bias voltage to each of the sensor strips. The differential voltages between the
bias voltage and each of the sensor strip voltages are the quantities measured by
the on-board electronics. Each sensor strip is also connected to two digital drive
signals by two separate capacitors of differing, but constant, capacitance. These
signals, called ‘Calibrate Low’ and ‘Calibrate High’, are used to isolate a
particular sensor from the effects of all other sensors on the same sensor strip.
There are three prototypes of the device. For communication, a high-speed
USB interface to the host computer has been implemented. No separate power supply
is needed when the device operates on USB power, which is sufficient for all
on-board electronics. The device is hot-pluggable. The latest version has a
Bluetooth communications interface and an on-board battery.
One complication with the design is that the outer driver strips transmit
strain from the point of contact to other sensors connected to the same strip.
However, this is corrected in software by measuring the response to a point load on
a single sensor (we call this the Meridian Green’s Function) and deconvolving the
raw readings with this response. Also, the foam rubber’s deformation response to
pressure is non-linear. Linearity is not important for our application, however, as
we are primarily interested in detecting finger contacts. The non-linearity makes
the sensors sensitive to low pressures and keeps them from saturating at relatively
high pressures, which is an advantage.
Chapter 4
Hand Configuration Tracking
4.1 Tracking Procedure
In the previous chapter, the details of the device, the Tango, were discussed. We
saw that the design of the device makes whole-hand force measurements available and
allows virtual objects to be manipulated in a natural way. This chapter discusses
the entire tracking module, in which we estimate and track the hand grasp
configuration from the Tango pressure sensor readings. By grasp configuration we
mean the position, orientation and joint-angle configuration of the hand with
respect to the Tango. As the user changes the grasp configuration on the Tango, the
system tracks it and depicts it in the scene.
Articulated motion tracking, like that of hands, is a challenging problem because
the motion exhibits many degrees of freedom. Human hand motion can be
characterized by 27 degrees of freedom: 21 for the joint angles and 6 for
orientation and location. Here we are trying to estimate this high-dimensional state
space given readings from the pressure sensors. Because the measurements are
limited, we simplify the problem and track an 11-DOF geometric hand model. The
presented approach first smooths out noise from the measurements, then assigns
finger contact points to the pressure readings, and finally uses these measurements
to track the hand configuration using non-linear filtering techniques. A flowchart
of the entire tracking system is given below. A detailed description of each step,
with results, follows in the subsequent sections.
Figure 4.1: Hand Configuration Tracking- Schematic Diagram
4.2 Model of the Hand
There has been a lot of work on model-based hand tracking in vision-based
systems. In [26], a 3D model of a hand built from quadrics is used, and the model is
tracked through the video sequence using projective geometry. Other model-based
approaches include [27], which uses a deformable 3D hand shape model constructed
from training samples using PCA. In our approach we use a geometric hand model to
track the hand configuration from the force input of the Tango. A simple hand model
is developed and fitted onto a virtual object depicting the Tango.
Due to the limitations of the measurements, the model of the hand is kept
fairly simple. The whole hand has 11 degrees of freedom, 6 of which are for
rigid-body position and orientation. These give the position of the hand relative
to the object it is holding. The orientation is represented with quaternions. The
grasp considered is a three-finger grasp, hence the hand model is simplified to
three fingers: the thumb, index and middle fingers. The joint angles are
• Thumb metacarpal angle
• Thumb abduct angle
• Index and Middle finger plane metacarpal angle
• Index abduct angle
• Middle abduct angle
In total there are 11 degrees of freedom for the hand model. The idea is to track
this hand configuration on a virtual object, using the measurements received from
the Tango. The geometric hand model to be tracked is shown in Figure 4.2.
Figure 4.2: Geometric Model of the Hand
In the figure,
• Θt is the Thumb metacarpal angle.
• Φt is the Thumb abduct angle.
• Θp is the Index and Middle Finger Plane metacarpal angle.
• Φi is the Index finger Abduct angle.
• Φm is the Middle finger Abduct angle.
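The state described above can be written down as a small data structure. The sketch below is one possible layout, assuming a `(w, x, y, z)` quaternion convention and field names of our own choosing; storing a unit quaternion uses four numbers for three rotational degrees of freedom, so the flattened vector has 12 entries for the 11 DOF.

```python
from dataclasses import dataclass
import math


@dataclass
class HandState:
    """11-DOF hand model state: 6 rigid-body DOF plus 5 joint angles."""
    position: tuple = (0.0, 0.0, 0.0)           # hand position in Tango coordinates
    orientation: tuple = (1.0, 0.0, 0.0, 0.0)   # unit quaternion (w, x, y, z)
    theta_t: float = 0.0   # thumb metacarpal angle
    phi_t: float = 0.0     # thumb abduct angle
    theta_p: float = 0.0   # index/middle finger plane metacarpal angle
    phi_i: float = 0.0     # index abduct angle
    phi_m: float = 0.0     # middle abduct angle

    def normalized(self):
        """Return a copy whose quaternion has unit length."""
        n = math.sqrt(sum(q * q for q in self.orientation))
        if n == 0.0:
            raise ValueError("zero quaternion has no direction")
        unit = tuple(q / n for q in self.orientation)
        return HandState(self.position, unit, self.theta_t, self.phi_t,
                         self.theta_p, self.phi_i, self.phi_m)

    def as_vector(self):
        """Flatten to 12 numbers; the unit-norm constraint leaves 11 DOF."""
        return [*self.position, *self.orientation, self.theta_t, self.phi_t,
                self.theta_p, self.phi_i, self.phi_m]
```

Re-normalizing the quaternion after every filter update is a common way to keep the orientation valid when the state is updated as a plain vector.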
4.3 Processing the Measurements from the Sensor
The Tango, as described earlier, has two kinds of sensors: a grid of pressure
sensors and an accelerometer. For tracking the hand grasp configuration we use only
the measurements from the pressure sensors. The pressure sensors on the device can be
thought of as arranged along meridians and parallels. The raw input to the system
is the readings of these pressure sensors, each of which is proportional to the
force applied to that taxel.
The final goal is to fit a simple hand model to an object using the Tango readings,
such that the fingertips correspond to the activated taxels. The configuration of the
hand is to be tracked over time as the pressure sensor readings change.
The taxels on the Tango are considered to lie on parallels and meridians of a
spherical body of fixed radius. Each taxel is therefore assigned a 3D position in
Tango coordinates. The position units are kept consistent between the Tango
coordinates and the hand model. When the user grasps the Tango, the grasp is assumed
to be a three-finger grasp, and the three most distinct clusters in the input
pressure data are detected. These correspond to the fingertip positions of the hand
model and form the basis of the tracking measurements.
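The mapping from grid indices to 3D Tango coordinates can be sketched as follows. The exact taxel layout is not specified in the text, so the even spacing in latitude and longitude, the axis conventions and the names `taxel_position` and `contact_centroid` are assumptions for illustration only.

```python
import math

N_PARALLELS, N_MERIDIANS = 8, 32   # matches the 8 x 32 tactile image


def taxel_position(row, col, radius=1.0):
    """3D position of taxel (row, col) on a sphere centred at the origin.

    Assumed layout: rows are parallels evenly spaced in latitude (poles
    excluded), columns are meridians evenly spaced in longitude.
    """
    lat = math.pi * (row + 0.5) / N_PARALLELS - math.pi / 2.0
    lon = 2.0 * math.pi * col / N_MERIDIANS
    return (radius * math.cos(lat) * math.cos(lon),
            radius * math.cos(lat) * math.sin(lon),
            radius * math.sin(lat))


def contact_centroid(frame, taxels, radius=1.0):
    """Pressure-weighted centroid of a cluster, projected back onto the sphere."""
    total = sum(frame[r][c] for r, c in taxels)
    if total <= 0:
        raise ValueError("cluster carries no pressure")
    mean = [sum(frame[r][c] * taxel_position(r, c, radius)[k] for r, c in taxels) / total
            for k in range(3)]
    norm = math.sqrt(sum(m * m for m in mean))
    if norm == 0.0:
        raise ValueError("centroid lies at the sphere centre")
    return tuple(radius * m / norm for m in mean)
```

Here `frame` is an 8x32 grid of pressure readings and `taxels` is the list of `(row, col)` indices belonging to one detected cluster; the returned point is one fingertip measurement in Tango coordinates.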
The pressure signal is filtered before the clusters are detected in order to remove
false alarms; this is discussed in the next section. The three fingertip clusters
are labelled as thumb, index and middle fingertips using heuristics that are also
described later. The diagram below shows a three-finger hand grasp on the Tango and
the final measurements after processing, which are the 3D fingertip positions of the
hand in Tango coordinates.
Figure 4.3: Measurements for the tracking procedure
The steps for extracting the desired measurements for tracking are given in the
following diagram. Each step is described in the subsequent sections, together with
its results and its impact.
Figure 4.4: Measurements for the tracking procedure
4.3.1 Pressure Signal Change Detection
Our signal change detection technique is kept relatively simple, with the
conservative aim of eliminating all false alarms of change. Many techniques are
available for the signal estimation problem, i.e. estimating the signal parameters
after eliminating the noise. This basic problem is tackled in adaptive filtering,
where all that is available are measurements of the signal corrupted by noise, and
the characteristics of the noise or the signal may change with time. A smoothing
filter is adopted and the signal is extracted as a weighted average of a number
of past measurements. Appropriate weights need to be applied, and a large body
of work exists suggesting different methods for setting these weights adaptively. If
the signal or noise characteristics change gradually, the weights adjust
smoothly to cope with the new situation and the signal continues to be well
estimated. However, if the characteristics change suddenly, the filter takes
some time to adapt to the new conditions and in the meantime the quality of the
signal estimate is poor. This problem arises because of a trade-off implied
by the averaging process embedded in the filter: the more measurements used in
the average, the better the quality of the steady-state estimate, but the slower the
response to changes.
We adopt the method of online detection of a change in the mean, as in [28, 29].
The output of the pressure sensors of the Tango is a series of float values, one per
pressure sensor of the grid, each directly proportional to the force applied on that
sensor. Each sensor has its own zero level and a small variance around it when no
pressure is applied. The problem arises when, due to noise, a sudden increase in the
readings causes a false alarm and a grasp is detected. This sometimes occurs when
force applied to some sensors triggers other sensors in the same band. A robust
change detection technique is therefore required to eliminate false alarms. If false
alarms are not rejected, the system suffers greatly from measurement noise.
Here we consider the pressure signal of each sensor as a sequence of
independent random variables $y_k$ with probability density $p_\Theta(y)$. In our
case the signal of each sensor is Gaussian with mean $\mu$ and variance $\sigma^2$.
The parameter $\Theta$ whose change we want to detect is the mean $\mu$. Before the
change time $\Theta = \Theta_0$, and after the change $\Theta = \Theta_1$. The
problem is to detect and estimate this change in the mean. For this, samples of
fixed size $N$ are taken, and at the end of each sample a decision rule is computed
to test between two hypotheses: is the pressure sensor inactive or active, or in
other words, is the mean value $\mu_0$ or $\mu_1$? The two hypotheses are given
below.
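$$H_0 : \Theta = \Theta_0 \tag{4.1}$$
$$H_1 : \Theta = \Theta_1 \tag{4.2}$$
The logarithm of the likelihood ratio is defined as
$$s(y) = \ln \frac{p_{\Theta_1}(y)}{p_{\Theta_0}(y)} \tag{4.3}$$
Let
$$S_j^k = \sum_{i=j}^{k} s_i \tag{4.4}$$
$$s_i = \ln \frac{p_{\Theta_1}(y_i)}{p_{\Theta_0}(y_i)} \tag{4.5}$$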
be the sum of the log-likelihood ratios for the observations from $y_j$ to $y_k$. We will
refer to $s_i$ as the sufficient statistic. From the Neyman-Pearson lemma [29], for a
fixed sample size $N$, the optimal decision rule $d$ is given by
$$d = \begin{cases} 0 & \text{if } S_1^N < h \text{, i.e. } H_0 \text{ is chosen} \\ 1 & \text{if } S_1^N > h \text{, i.e. } H_1 \text{ is chosen} \end{cases} \tag{4.6}$$
where $h$ is a conveniently chosen threshold. The sum $S_1^N$ is said to be the decision
function. The decision is taken with the aid of a stopping rule, which is defined by
$$t_a = N \cdot \min\{K : d_K = 1\} \tag{4.7}$$
where $d_K$ is the decision rule for sample number $K$ and $t_a$ is the alarm time.
Observation is thus stopped after the first sample of size $N$ for which the decision is
in favor of $H_1$. Suppose the mean reading of a particular pressure sensor is $\mu_0$ when
it is not pressed and $\mu_1$ when it is pressed. When more force is applied to the sensor
the value is much higher, but since we are concerned only with whether the state is
on or off, two values of the parameter are sufficient to model it. The variance $\sigma^2$ is
constant. The probability density is
$$p_\Theta(y) = \frac{1}{\sigma\sqrt{2\pi}} e^{-\frac{(y-\mu)^2}{2\sigma^2}} \tag{4.8}$$
and the sufficient statistic becomes
$$s_i = \frac{\mu_1 - \mu_0}{\sigma^2}\left(y_i - \frac{\mu_0 + \mu_1}{2}\right) \tag{4.9}$$
We can write this in terms of the change magnitude and the signal-to-noise ratio as
$$s_i = \frac{b}{\sigma}\left(y_i - \mu_0 - \frac{v}{2}\right) \tag{4.10}$$
where $v = \mu_1 - \mu_0$ is the change magnitude and $b = \frac{\mu_1 - \mu_0}{\sigma}$ is the signal-to-noise
ratio. Thus the decision function is
$$S_1^N = \frac{b}{\sigma} \sum_{i=1}^{N} \left(y_i - \mu_0 - \frac{v}{2}\right) \tag{4.11}$$
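The fixed-sample test of equations (4.6), (4.7) and (4.11) can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used with the Tango: the function names `window_decision` and `alarm_time` and the numeric values in the usage note are assumptions made for the example.

```python
def window_decision(samples, mu0, mu1, sigma, h):
    """Return (S, d) for one window of N samples.

    S is the decision function of eq. (4.11); d is 1 when S > h
    (H1 chosen) and 0 otherwise (H0 chosen), as in eq. (4.6).
    """
    v = mu1 - mu0      # change magnitude
    b = v / sigma      # signal-to-noise ratio
    S = (b / sigma) * sum(y - mu0 - v / 2.0 for y in samples)
    return S, (1 if S > h else 0)


def alarm_time(signal, N, mu0, mu1, sigma, h):
    """Alarm time t_a = N * min{K : d_K = 1} (eq. 4.7), or None if no alarm."""
    for K in range(1, len(signal) // N + 1):
        window = signal[(K - 1) * N:K * N]
        _, d = window_decision(window, mu0, mu1, sigma, h)
        if d == 1:
            return N * K
    return None
```

For a noise-free signal that steps from 0 to 1 at sample 20, with hypothetical values N = 5, mu0 = 0, mu1 = 1, sigma = 0.5 and h = 1, the first window that lies entirely after the step is the fifth one, so the alarm is raised at sample 25.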
By repeatedly testing these two hypotheses we can find the change in the signal value.
We can also use the idea of exponential weighting of observations to obtain more
conservative results. For this, let $g_k$ be the decision function at step $k$.
Then at step $k$ the decision function is
$$g_k = g_{k-1} + (1 - \alpha) S_1^N \tag{4.12}$$
By modifying the decision function in this way, we weight present observations
against past observations. We also found that using different variance values for the
two means, with $\sigma_1^2 > \sigma_0^2$, gives better performance, as the range of values in which
the pressure sensor is active is larger than when it is not active.
The sufficient statistic $s_i$, which is the log-likelihood ratio, now becomes
$$s_i = \ln\frac{\sigma_0}{\sigma_1} + \frac{1}{2\sigma_0\sigma_1}\left((y_i - \mu_0)^2 - (y_i - \mu_1)^2\right) \tag{4.13}$$
The decision function $g_k$ remains the same, $g_k = g_{k-1} + (1-\alpha) S_1^N$ with
$S_1^N = \sum_{i=1}^{N} s_i$. The graph below shows the value of a selected pressure sensor over
time. The values are taken from a sensor over time, with some simulated noise added
to show how the algorithm works under such conditions.
Figure 4.5: Steps for gathering measurements
This particular sensor was tipped at time step 500. The results of the above
algorithm, with the same variance used for both states when determining the
change, are shown below. The green line indicates the state of the alarm.
Figure 4.6: Mean Signal value per Sample Window
Figure 4.7: Change detection Results - Same variance of both
The results using different variances for the two states of the sensor are shown
below. Note that the alarm is set off only once the sensor is tipped, from time
step 500 onward, with minimal false alarms.
Figure 4.8: Change detection Results - Different variance
4.3.2 Assignment of Fingers
Once we have the pressure data, a sequence of readings from every pressure sensor,
we find the three most dominant clusters of active sensors. These clusters give the
three finger tips, and we find their positions in Tango coordinates. For the first
few iterations we assign finger tips using the simple heuristics that the thumb is
farthest from the index and middle fingers, and that in a left-handed grasp on a
sphere the thumb, index and middle fingers are in counter-clockwise order. For
later iterations we weight the decision between these heuristics and the
configuration closest in proximity to the previous assignment.
4.3.3 Grasp Validation
Once we have the three finger-tip points, the grasp is validated by a detection
algorithm that uses likelihood ratio testing, implemented through recursive dynamic
filtering. This step is important for eliminating false grasp detections caused by
the mechanical coupling of the sensors. We have observed that when the Tango is
grasped with a lot of force, sensors near the activated sensors get tipped, leading
to a false determination of the grasp configuration. For grasp change detection and
validation, approximate filtering algorithms based on Bayes' law can be employed
in the current framework. This is like the model validation problem, where one
wishes to determine whether a dynamic system is accurately described by a given
model.
A model in this case is defined by three pressure sensors of the Tango, each
identified by its parallel and meridian. It may be thought of as a grasp mode. These
three taxels correspond to the finger tips of the three fingers which hold the body.
We reduce the space of pressure sensors by grouping four taxels close to each other
into one. Each state is given a prior which is inversely proportional to the distance
of its centroid from the Tango center. (Intuitively, each state is a configuration of
the hand grasp, i.e. the finger-tip positions of three fingers, and at any time the
user holds the Tango so that the centroid is at the center of mass of the Tango, to
balance it.) States whose centroids are far away from the center of mass of the
Tango are unlikely and are hence allotted a smaller prior.
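As a rough illustration of the prior described above, the following Python sketch assigns each grasp mode a weight inversely proportional to the distance of its finger-tip centroid from the Tango center and normalizes the weights. The function name `grasp_mode_priors`, the coordinate representation and the small constant `eps` (added only to avoid division by zero) are assumptions of this sketch, not details taken from the system.

```python
import math


def grasp_mode_priors(modes, center=(0.0, 0.0, 0.0), eps=1e-6):
    """Priors for grasp modes, each given as three (x, y, z) finger-tip points.

    The weight of a mode is 1 / (distance of its centroid from `center`);
    eps is a hypothetical guard against division by zero. The returned
    priors are normalized so that they sum to one.
    """
    weights = []
    for tips in modes:
        centroid = [sum(p[i] for p in tips) / len(tips) for i in range(3)]
        weights.append(1.0 / (math.dist(centroid, center) + eps))
    total = sum(weights)
    return [w / total for w in weights]
```

A mode whose three finger tips surround the center receives a much larger prior than one whose finger tips all lie on one side of the device.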
Now let $y_k$ be the output of the clustering stage, i.e. the three finger-tip
contact taxels. Let $\Theta_0 \ldots \Theta_n$ be the possible grasp modes described earlier.
Change detection is the problem of determining a change in the control parameter
$\Theta$ from the previous state $\Theta_{prev}$ to the current state $\Theta_{cur}$, with the hypotheses
$H_{prev}$ and $H_{cur}$ respectively. Let $Y^N$ be the set of all inputs from $y_1$ to $y_N$ and let
$g$ be the decision function with domain $Y^N$ and range $\{H_{prev}, H_{cur}\}$. In this case
the decision is given by
$$g(Y^N) = \begin{cases} H_{prev} & \text{where } \frac{p_{cur}(Y^N)}{p_{prev}(Y^N)} < \lambda \\ H_{cur} & \text{where } \frac{p_{cur}(Y^N)}{p_{prev}(Y^N)} \geq \lambda \end{cases} \tag{4.14}$$
where $p_i$ is the probability distribution function. Let $S_N$ be the log-likelihood
ratio, then
$$S_N = \ln\frac{p_{cur}(Y^N)}{p_{prev}(Y^N)} \tag{4.15}$$
which can be given by $S_N = S_{N-1} + s_N$, where $s_k = \ln\frac{p_{cur}(y_k \mid Y^{k-1})}{p_{prev}(y_k \mid Y^{k-1})}$. Therefore, to
recursively calculate $S_N$, a probability distribution for $y_k$ conditioned on the past
values $Y^{k-1}$ must be available for each $\Theta$. Depending on the value of $S$ calculated,
the system either accepts the change or rejects it. A finite number of previous
records is kept, and they are weighted with time to increase the importance of more
recently observed states: $S_N = \alpha S_{N-1} + (1-\alpha) s_N$.
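The time-weighted recursion $S_N = \alpha S_{N-1} + (1-\alpha) s_N$ can be sketched as follows. The sketch assumes that the per-step log-likelihoods under the current and previous grasp modes are already available from some model; the names `update_statistic` and `validate_grasp`, and the choice to compare the weighted statistic against $\ln \lambda$, are assumptions made for illustration.

```python
def update_statistic(S_prev, log_p_cur, log_p_prev, alpha):
    """One step of S_N = alpha * S_{N-1} + (1 - alpha) * s_N,
    where s_N = ln p_cur(y_N | ...) - ln p_prev(y_N | ...)."""
    s = log_p_cur - log_p_prev
    return alpha * S_prev + (1.0 - alpha) * s


def validate_grasp(log_likelihood_pairs, alpha, log_lambda):
    """Accept the new grasp mode ("H_cur") when the weighted log-likelihood
    ratio reaches log_lambda (a hypothetical threshold), else keep "H_prev".

    log_likelihood_pairs: iterable of (log p_cur, log p_prev), one per step.
    """
    S = 0.0
    for log_p_cur, log_p_prev in log_likelihood_pairs:
        S = update_statistic(S, log_p_cur, log_p_prev, alpha)
    decision = "H_cur" if S >= log_lambda else "H_prev"
    return decision, S
```

With alpha close to 1 the statistic changes slowly and a single noisy clustering result is unlikely to flip the decision; with alpha close to 0 the most recent observation dominates.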
The following figure depicts the results of this step. A sequence of grasp
configurations is taken from the Tango and random noise is added to the
sequence. The results show the detection of a false grasp.
Figure 4.9: Change detection Results - Same variance of both (temporary placeholder figure)
4.4 Tracking Hand Configuration with Kalman Filters
The tracking of the 3D geometric model with measurements from the pressure
sensors can be thought of as a non-linear estimation problem. This allows us to use
any of the non-linear estimation techniques in which the dynamics of the system
can be modelled by differential equations and the state can be related algebraically
to the measurements. We have tried Extended Kalman Filtering [30, 31], Least
Squares Approximation and Sequential Monte Carlo methods [32]. The results
follow in the subsequent sections; we found better results using the Extended
Kalman Filter. In this section we describe our tracking procedure with Extended
Kalman filters.
Kalman filters have been used extensively for tracking hand configuration,
especially where the tracking problem is formulated as a non-linear system. The
Extended Kalman Filter is a modification of the linear Kalman filter that can
handle nonlinear dynamics and nonlinear measurement equations. It has been used
in a variety of tracking problems with various sensors. Some related work is
presented in the Related Work section.
The diagram below shows the hand model to be fit over the Tango Grasp.
Figure 4.10: Model fitting on the Tango
The Kalman filter [16] estimates the process by using a form of feedback
control: the filter estimates the process state at some time and then obtains feedback
in the form of measurements. It can be described as a set of mathematical equations
which use the underlying process model to predict the current state of the system
and then correct it based on current measurements.
We define the state of the hand at time $k$ to be $x_k = (Position, Orientation, JointAngles)$,
where
Position = $(p_x, p_y, p_z)$ is the position of the hand in the Tango frame,
Orientation = $(w, x, y, z)$ is a quaternion describing the orientation of the hand in the
Tango frame, and
JointAngles = (Thumb-MetacarpalAngle, Thumb-AbductAngle,
IndexAndMiddleFingerPlane-MetacarpalAngle, Index-AbductAngle, Middle-AbductAngle).
Thus $x_k$ is a 12-dimensional vector describing the 11 degree of freedom hand model,
and is the process to be estimated.
The filter represents rotations using quaternions rather than Euler angles to
eliminate the problem of singularities associated with the latter. The output of the
model is the three finger-tip locations in 3D world coordinates.
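The layout of the state vector can be made concrete with a small Python sketch. The class name `HandState`, its field names and the component ordering are assumptions made for illustration. One plausible reading of "12 dimensions, 11 degrees of freedom" is that the unit-norm constraint on the quaternion removes one degree of freedom, which is why the sketch includes a normalization helper.

```python
import math
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class HandState:
    """Hypothetical container for the 12-component hand state x_k."""
    position: Tuple[float, float, float]                     # (px, py, pz), Tango frame
    orientation: Tuple[float, float, float, float]           # quaternion (w, x, y, z)
    joint_angles: Tuple[float, float, float, float, float]   # five joint angles

    def to_vector(self) -> List[float]:
        """Flatten to the 12-dimensional vector used by the filter."""
        return [*self.position, *self.orientation, *self.joint_angles]

    def with_unit_quaternion(self) -> "HandState":
        """Return a copy whose quaternion is rescaled to unit length."""
        norm = math.sqrt(sum(c * c for c in self.orientation))
        unit = tuple(c / norm for c in self.orientation)
        return HandState(self.position, unit, self.joint_angles)
```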
The extended Kalman filter can be described by the two equations below. The filter
linearizes the state and observation equations about the current best estimate of the
state to produce minimum mean-square estimates. Here the process is a discrete-time
control process governed by
$$x_k = f(x_{k-1}, w_{k-1}) \tag{4.16}$$
The measurements are the three finger tips, which are given by forward kinematics:
$$z_k = h(x_k, v_k) \tag{4.17}$$
The real measurements form a 9-dimensional vector corresponding to the three
finger-tip positions:
$$z_k = (thumb_x, thumb_y, thumb_z, index_x, index_y, index_z, middle_x, middle_y, middle_z) \tag{4.18}$$
where
the thumb finger-tip position is $(thumb_x, thumb_y, thumb_z)$,
the index finger-tip position is $(index_x, index_y, index_z)$,
the middle finger-tip position is $(middle_x, middle_y, middle_z)$,
$h$ is the forward hand kinematic function that computes the finger-tip positions,
$w_k$ is the process error, and
$v_k$ is the measurement error.
The function $h$ has a well-defined matrix of partial derivatives with respect to the
state. Here $x_k$ and $z_k$ are the actual state and measurement vectors. Let $\hat{x}_k^-$ be the
a priori (predicted) estimate of the state at time $k$ and $\hat{x}_k$ be the a posteriori
estimate of the state at time $k$.
One iteration of the Extended Kalman Filtering algorithm is as follows:
• Consider the latest state estimate from the filter, $\hat{x}_k$.
• Linearize the system dynamics $x_{k+1} = f(x_k) + w_k$ about $\hat{x}_k$.
• Apply the prediction step to the system dynamics to get the predicted state and
error covariance, $\hat{x}_{k+1}^-$ and $P_{k+1}^-$.
• Linearize the observation dynamics $z_{k+1} = h(x_{k+1}) + v_{k+1}$ for time step $k+1$
about $\hat{x}_{k+1}^-$.
• Apply the measurement update step to the observation to get the final estimated
state and error covariance, $\hat{x}_{k+1}$ and $P_{k+1}$.
Details of the time and measurement update steps are given below.
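The filter equations are as follows.
Time Update
$$\hat{x}_k^- = f(\hat{x}_{k-1}, 0) \tag{4.19}$$
$$P_k^- = A_k P_{k-1} A_k^T + W_k Q_{k-1} W_k^T \tag{4.20}$$
Measurement Update
$$K_k = P_k^- H_k^T \left(H_k P_k^- H_k^T + V_k R_k V_k^T\right)^{-1} \tag{4.21}$$
$$\hat{x}_k = \hat{x}_k^- + K_k\left(z_k - h(\hat{x}_k^-, 0)\right) \tag{4.22}$$
$$P_k = (I - K_k H_k) P_k^- \tag{4.23}$$
where $P_k$ is the error covariance at time $k$,
$P_k^-$ is the a priori estimate of the error covariance at time $k$,
$K_k$ is the Kalman gain, and $Q$ and $R$ are the process and measurement noise covariances,
$A$ is the Jacobian of partial derivatives of $f$ with respect to $x$, that is
$A_{[i,j]} = \frac{\partial f_{[i]}}{\partial x_{[j]}}(\hat{x}_{k-1})$,
$W$ is the Jacobian of partial derivatives of $f$ with respect to $w$, that is
$W_{[i,j]} = \frac{\partial f_{[i]}}{\partial w_{[j]}}(\hat{x}_{k-1})$,
$H$ is the Jacobian of partial derivatives of $h$ with respect to $x$, that is
$H_{[i,j]} = \frac{\partial h_{[i]}}{\partial x_{[j]}}(\hat{x}_k^-)$,
$V$ is the Jacobian of partial derivatives of $h$ with respect to $v$, that is
$V_{[i,j]} = \frac{\partial h_{[i]}}{\partial v_{[j]}}(\hat{x}_k^-)$.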
For better tracking, two trackers are run in parallel: one for the whole hand,
with the 12-dimensional state vector for the 11 degrees of freedom, and one for the
5 joint angles only. Depending on the initial estimate of the kind of object
manipulation made at the input processing stage (only movement of fingers on the
object, or a change of the whole hand grasp), the appropriate filter is executed. For
both trackers the process noise is kept relatively high.
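To make the time and measurement updates concrete, the following Python sketch runs one Extended Kalman Filter iteration for a scalar state with additive noise, so that the Jacobians $W$ and $V$ reduce to 1. It is a toy illustration of equations (4.19) to (4.23), not the 12-dimensional hand tracker; the function name `ekf_step` and the example models in the usage note are assumptions.

```python
def ekf_step(x_prev, P_prev, z, f, h, A, H, Q, R):
    """One scalar EKF iteration.

    f, h : process and measurement functions (noise-free part)
    A, H : their derivatives with respect to the state
    Q, R : process and measurement noise variances
    Returns the a posteriori state estimate and error variance.
    """
    # Time update (eqs. 4.19 and 4.20)
    x_pred = f(x_prev)
    a = A(x_prev)
    P_pred = a * P_prev * a + Q

    # Measurement update (eqs. 4.21 to 4.23)
    hj = H(x_pred)
    K = P_pred * hj / (hj * P_pred * hj + R)
    x_new = x_pred + K * (z - h(x_pred))
    P_new = (1.0 - K * hj) * P_pred
    return x_new, P_new
```

As a usage example, take a static state (f(x) = x, A = 1) observed through the non-linear measurement h(x) = x*x with derivative H(x) = 2x. Feeding the measurement z = 4 repeatedly, starting from the estimate 1.5, drives the estimate towards 2.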
4.4.1 Least Squares Estimation
We have also tried to estimate the state of the hand configuration using the least
squares method, and have compared the results with those of tracking using Kalman
filters. Consider a least squares problem of the form
minimize
$$f(x) = \|g(x)\|^2 \tag{4.24}$$
subject to some constraints.
We can write our observation dynamics $f$ in terms of the state $x$ as
$$f = h(x) \tag{4.25}$$
where $h$ is the forward hand kinematics. Linearizing about $x_0$, this can be written as
$$f = h(x) \approx h(x_0) + H(x_0)(x - x_0) \tag{4.26}$$
where $x_0$ is the initial configuration of the state and
$H$ is the Jacobian of partial derivatives of $h$ with respect to $x$, that is
$H = \frac{\partial h}{\partial x}$. We try to minimize
$$\frac{(x - x_0)^T (x - x_0)}{2} \tag{4.27}$$
such that
$$(f - h(x_0)) - H(x - x_0) = 0 \tag{4.28}$$
Using Lagrange multipliers, the solution is obtained as
$$(x - x_0) = H^T (H H^T)^{-1} (f - h(x_0)) \tag{4.29}$$
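The constrained minimization above asks for the smallest state change that explains the measurement residual under the linearization $f \approx h(x_0) + H(x - x_0)$. A minimal pure-Python sketch of that minimum-norm step is given below; the helper names `solve` and `min_norm_step` are assumptions, the residual is taken to be $f - h(x_0)$, and a real implementation would more likely use a linear algebra library.

```python
def solve(M, b):
    """Solve M y = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(M)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(n):
            if r != c:
                factor = aug[r][c] / aug[c][c]
                aug[r] = [a - factor * pc for a, pc in zip(aug[r], aug[c])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def min_norm_step(H, residual):
    """Smallest d (in the 2-norm) with H d = residual: d = H^T (H H^T)^{-1} residual.

    H is an m x n list of rows with full row rank; residual has length m.
    """
    m, n = len(H), len(H[0])
    HHt = [[sum(H[i][k] * H[j][k] for k in range(n)) for j in range(m)]
           for i in range(m)]
    y = solve(HHt, residual)
    return [sum(H[i][k] * y[i] for i in range(m)) for k in range(n)]
```

For example, with a single constraint row H = [[1, 1]] and residual [2], the step is split evenly between the two state components, giving d = [1, 1].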
4.4.2 Results
To determine how well tracking with Kalman filtering performed, we tracked the
hand grasp on the Tango over time and compared it with data obtained from a
CyberGlove(TM) [2], which provides 22 joint angle measurements. Below are some
of the results obtained, showing the tracking of some of the state parameters over
time using Kalman filtering and least squares estimation.
Figure 4.11: Tracking Results Comparison (legend: Glove Data, Kalman Filtering, Least Squares)
The plot above shows the results for the Index Finger Abduct Angle over time.
The red plot is the data from the glove, the blue plot shows the results of Kalman
filtering and the green plot shows the least squares estimates. The plot shows the
magnitude of the angle in radians. The Kalman filter is more accurate than the
LSQ in this case, though it responds slowly to changes when compared to the glove.
Below are the root mean square (RMS) errors for the above results for the
parameter Index abduct angle $\phi_i$. The RMS error is given by
$$RMS = \sqrt{\frac{1}{N}\sum_{i=0}^{N-1} e_i^2} \tag{4.30}$$
where $e_i$ is the error at sample $i$ and $N$ is the number of samples.
EKF          LSQ
0.07467323   0.135929
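The RMS error of equation (4.30) is straightforward to compute. The short Python sketch below assumes that the tracker output and the glove reference are available as equal-length sequences of angles in radians; the function name `rms_error` is an assumption for illustration.

```python
import math


def rms_error(estimates, reference):
    """Root mean square of the per-sample errors e_i = estimate_i - reference_i."""
    errors = [e - r for e, r in zip(estimates, reference)]
    return math.sqrt(sum(err * err for err in errors) / len(errors))
```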
4.5 Tracking with Particle Filters
One of the initial methods tried for tracking was particle filters, which provide a
Bayesian framework for tracking [22, 23]. The method uses the Markov assumption
that past and future data are conditionally independent given the current state.
The hand model is the same as described earlier. The posterior density is represented
by a set of weighted particles, and the state is then estimated as the mode of these
weighted particles. The state to be estimated here is the 11 degree of freedom hand
model, a 12-dimensional vector $X$. Let $Z_k = \{z_1, \ldots, z_k\}$ denote the history of
observations, and let the posterior density $p(X \mid Z_k)$ be represented by a set of $N$
weighted particles $S_k = \{(s_k^{(1)}, \pi_k^{(1)}), \ldots, (s_k^{(N)}, \pi_k^{(N)})\}$, where the weights
$\pi_k^{(n)} \propto p(Z_k \mid X = s_k^{(n)})$. These weights, called importance weights, are non-negative
and sum to one. The basic form of the particle filter updates the belief according to
the following procedure, called sequential importance sampling with re-sampling:
Re-sampling: Draw a random sample $s_{k-1}^{(i)}$ from the sample set $S_{k-1}$ according to
the distribution of the weights $\pi_{k-1}^{(i)}$.
Sampling: Use $s_{k-1}^{(i)}$ to sample $s_k^{(j)}$ from the distribution $p(s_k \mid s_{k-1})$, which denotes the
dynamics of the system. $s_k^{(j)}$ now represents the density given by $p(s_k \mid s_{k-1})\,p(X \mid Z_k)$.
Importance Sampling: Weight the sample $s_k^{(j)}$ by the importance weight $p(z_k \mid x_k^{(j)})$,
the likelihood of the sample $x_k^{(j)}$ given the measurement $z_k$. After $N$ iterations, the
weights are normalized so that they sum to 1.
We estimate the state $X_k$ at time step $t_k$ by finding the mode of these
weighted particles as follows. For each particle $s_k^{(j)}$ with weight $\pi_k^{(j)}$ at time $t_k$,
evaluate the function
$$f^{(j)} = \pi_k^{(j)} \sum_{i=1, i \neq j}^{N} Dist(i, j)$$
where $Dist(i, j)$ is inversely proportional to the distance between the respective
dimensions of $s_k^{(j)}$ and $s_k^{(i)}$. The particle with the maximum value of $f$ is chosen as
the mode.
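The re-sampling, sampling and weighting cycle, together with the mode estimate, can be sketched for a one-dimensional state as follows. This is a simplified illustration rather than the hand tracker: the function names, the scalar state, the caller-supplied `propagate` and `likelihood` callables, and the use of 1 / (|a - b| + eps) as a stand-in for Dist(i, j) are all assumptions of the sketch.

```python
import random


def sir_step(particles, weights, z, propagate, likelihood, rng):
    """One re-sample / sample / weight cycle.

    propagate(s, rng) draws from the dynamics p(s_k | s_{k-1});
    likelihood(z, s) evaluates p(z_k | s_k).
    """
    n = len(particles)
    # Re-sampling: draw particles according to their current weights.
    resampled = rng.choices(particles, weights=weights, k=n)
    # Sampling: push every drawn particle through the system dynamics.
    predicted = [propagate(s, rng) for s in resampled]
    # Importance weighting, followed by normalization to sum to one.
    new_weights = [likelihood(z, s) for s in predicted]
    total = sum(new_weights)
    return predicted, [w / total for w in new_weights]


def mode_estimate(particles, weights, eps=1e-6):
    """Particle maximizing weight * sum of closeness to all other particles."""
    def closeness(a, b):
        return 1.0 / (abs(a - b) + eps)  # 1-D stand-in for Dist(i, j)

    scores = [
        w * sum(closeness(p, q) for i, q in enumerate(particles) if i != j)
        for j, (p, w) in enumerate(zip(particles, weights))
    ]
    return particles[scores.index(max(scores))]
```

The mode estimate favors particles that are both heavily weighted and surrounded by other particles, which makes it less sensitive to a single outlying particle with a large weight.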
Chapter 5
Tango Position and Attitude Estimation
To estimate the location of the Tango, we have to solve two problems which can be
decomposed into two stages - orientation estimation followed by position estimation.
Details of each of these are given below.
5.1 Attitude Estimation under low acceleration
The idea follows from [1], where under low acceleration the orientation of the
accelerometer can be estimated using Kalman filters.
The Tango has an accelerometer which measures both the acceleration and
the projection of the gravitational acceleration onto the accelerometer's local frame.
Since it is fixed to the Tango, the measurements take place in the Tango frame.
For convenience, let us assume that the accelerometer in the Tango gives the
acceleration of the Tango itself and that they have the same reference frame. The
relation between the gravitational acceleration and the Tango acceleration is given by
$$y = \frac{\vec{f}(t)}{m} = \vec{acc}(t) - R\,\vec{g}_N \tag{5.1}$$
where $y$ is the Tango's non-gravitational acceleration,
$\vec{acc}(t)$ is the measurement from the Tango's accelerometer,
$R$ is the orientation matrix of the world frame $N$ with respect to the Tango, and
$\vec{g}_N$ is the gravitational acceleration in the world frame $N$.
Precise knowledge of $R$ is mandatory to extract $y$ accurately. Our aim is to estimate
this attitude, treating the Tango as a rigid body undergoing translations and
rotations in inertial space. The kinematics of a rigid body are
$$\ddot{p} = u \tag{5.2}$$
$$\dot{R} = S(\omega) R \tag{5.3}$$
where $u$ is the acceleration expressed in the world frame $N$, i.e.
$\vec{acc}(t) = Ru$ from equation 3, and
$$S(\omega) = \begin{pmatrix} 0 & \omega_3 & -\omega_2 \\ -\omega_3 & 0 & \omega_1 \\ \omega_2 & -\omega_1 & 0 \end{pmatrix} \tag{5.4}$$
$\omega = (\omega_1, \omega_2, \omega_3)$ is the angular velocity in the object's (Tango's) reference frame.
Thus we have
$$y = R(u - g_N) \tag{5.5}$$
The idea is to consider $u$ as measurements from the Tango which we can measure
partially from the accelerometer. We take the yaw ($\psi$), pitch ($\theta$), roll ($\phi$)
parametrization of $R$, i.e.
$$R = \begin{pmatrix} \cos\psi\cos\theta & \sin\psi\cos\theta & -\sin\theta \\ -\sin\psi\cos\phi + \cos\psi\sin\theta\sin\phi & \cos\psi\cos\phi + \sin\psi\sin\theta\sin\phi & \cos\theta\sin\phi \\ \sin\psi\sin\phi + \cos\psi\sin\theta\cos\phi & \sin\psi\sin\theta\cos\phi - \cos\psi\sin\phi & \cos\theta\cos\phi \end{pmatrix} \tag{5.6}$$
If the third column of $R$ can be estimated, we can obtain the pitch and roll angles
from it, as it is independent of the yaw. Let $x = r_3$, the third column of $R$, be
the state to be estimated. We know that $g_N = -g e_3$, where $g = 9.81\,m/s^2$ is the
gravitational acceleration and $e_3$ is the third unit vector. For simplicity let us redefine
$y = y/g$, $u = u/g$. Thus we have
$$\dot{x} = S(\omega)x, \quad \|x_0\| = 1 \tag{5.7}$$
$$y = x + Ru \tag{5.8}$$
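Since the third column of $R$ in the yaw-pitch-roll parametrization is $(-\sin\theta, \cos\theta\sin\phi, \cos\theta\cos\phi)$, pitch and roll can be read off directly from an estimate of $x$. The Python sketch below shows this under the assumption that the pitch lies strictly between $-\pi/2$ and $\pi/2$; the function names are hypothetical.

```python
import math


def third_column(pitch, roll):
    """Third column of R for the yaw-pitch-roll parametrization (no yaw term)."""
    return (-math.sin(pitch),
            math.cos(pitch) * math.sin(roll),
            math.cos(pitch) * math.cos(roll))


def pitch_roll_from_third_column(x):
    """Recover (pitch, roll) from an estimate of r3; x is normalized first."""
    norm = math.sqrt(sum(c * c for c in x))
    x1, x2, x3 = (c / norm for c in x)
    pitch = -math.asin(max(-1.0, min(1.0, x1)))  # clamp guards against rounding
    roll = math.atan2(x2, x3)
    return pitch, roll
```

The yaw angle cannot be recovered this way, which matches the observation that the third column is independent of the yaw.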
Since the above structure is linear, we can use a linear Kalman filter to estimate the
state $x$ under low accelerations. We define the discrete-time model
$$x_{k+1} = A_k x_k + v_k \tag{5.9}$$
$$y_k = x_k + w_k \tag{5.10}$$
where $A_k = e^{S(\omega_k)h}$ has a closed-form solution given by Rodrigues' formula [?],
$$A_k = I - S(\omega_k)\frac{\sin(\omega_k h)}{\|\omega_k h\|} + S^2(\omega_k)\frac{1 - \cos(\omega_k h)}{\|\omega_k h\|^2}$$
$v_k$ is the process noise with covariance $Q$ and $w_k$ is the measurement noise with
covariance $R$. To enforce the condition $\|x\| = 1$, let us assume that the Kalman filter
estimates a vector $z_k$ that is not of unit length, from which we obtain our state
estimate as $x_k = z_k / \|z_k\|$. The filter update equations are thus
$$z_{k+1} = A_k z_k + K_k (y_k - z_k) \tag{5.11}$$
$$K_k = A_k P_k (P_k + R)^{-1} \tag{5.12}$$
$$P_{k+1} = A_k P_k A_k' + Q - A_k P_k (P_k + R)^{-1} P_k A_k' \tag{5.13}$$
$$x_k = \begin{cases} z_k / \|z_k\| & \text{if } z_k \neq 0 \\ z_{k-1} & \text{if } z_k = 0 \end{cases} \tag{5.14}$$
5.2 Position Estimation
The second problem is to estimate the position of the Tango in the world frame $N$.
Under fairly high translational acceleration, the position of the object can be
estimated using a simple double integration:
$$\vec{x}(t) = \iint \ddot{\vec{x}}(t)\,dt\,dt \tag{5.15}$$
Note that in the above equation $\vec{x}(t)$ is the position of the Tango in the world frame
and $\ddot{\vec{x}}(t)$ is the non-gravitational acceleration of the Tango in the world frame. We
make use of the estimated attitude from the previous section to compute $\ddot{\vec{x}}(t)$ by
$$\ddot{\vec{x}}(t) = R_N\,\vec{acc}(t) - \vec{g}_N \tag{5.16}$$
where $\vec{acc}(t)$ is the measurement from the Tango accelerometer, $R_N$ gives the
orientation of the Tango with respect to the world, computed from the estimated $R$
of the previous section, and $\vec{g}_N$ is the gravity vector in the world frame.
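A discrete version of equations (5.15) and (5.16) is sketched below in Python. It follows the sign convention of (5.16) as written and integrates with simple Euler steps at a fixed sample period `dt`; the function names, the zero initial velocity and position, and the choice of integration scheme are assumptions of this sketch. In practice, double integration of accelerometer data drifts quickly, so the result is only meaningful over short intervals.

```python
def world_acceleration(R_N, acc, g_N):
    """Non-gravitational acceleration in the world frame: R_N * acc - g_N (eq. 5.16)."""
    rotated = [sum(R_N[i][j] * acc[j] for j in range(3)) for i in range(3)]
    return [rotated[i] - g_N[i] for i in range(3)]


def integrate_position(world_acc, dt, v0=(0.0, 0.0, 0.0), p0=(0.0, 0.0, 0.0)):
    """Integrate a sequence of world-frame accelerations twice (eq. 5.15).

    Each step updates the velocity first and then the position.
    Returns the final (position, velocity).
    """
    v, p = list(v0), list(p0)
    for a in world_acc:
        for i in range(3):
            v[i] += a[i] * dt
            p[i] += v[i] * dt
    return p, v
```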
Chapter 6
User-Interface to demonstrate the Tango
6.1 3D Environment
We have built an experimental user interface to demonstrate the Tango. The goal
is to show some ways in which virtual objects can be manipulated using the whole
hand input from the device. The interface allows users to navigate a 3D scene using
the Tango and pick objects of interest, each of which can be manipulated in different
ways using all or some of the sensors of the device. The interface is built on the Java
platform using OpenGL for rendering. No other form of input is required. The
following screen shot is taken from the user interface and depicts the 3D virtual
world.
Figure 6.1: Screen shot of the User Interface
The interface presents the user with a front perspective view of the 3D scene
in the main window, and three orthographic views are given to the left of the
interface for better visualization of the scene. The scene is navigated using the
Tango, and objects of interest can be grasped with the Tango. A clipper is rendered
in the scene and follows the movements of the Tango. Shadows of the moving
clipper and of the objects are rendered for a better look and feel of the virtual
environment. In order to choose an object, the user has to grasp the Tango lightly
and the nearest object will be selected. Depending on the type of object, various
actions can take place. The following flow diagram depicts the actions taken by the
interface and the underlying implementation to achieve the goal.
Figure 6.2: Flow Diagram of the User Interface
6.2 Scene Navigation
The scene can be navigated in three dimensions by moving the Tango using the
position estimation described in the previous chapter. The orientation of the Tango
is determined and consequently the position is determined in 3D world Coordinates.
Our goal was to make use of the sensor readings, so as to extend the behavior of
the Tango as a mouse moving in 3 dimensions. In order to mimic the behavior of
the mouse, a halting gesture is defined, so that the position of the Tango can be
fixed and it can be moved to a desired location without its effect being shown on
the scene. Once the tango is pointed to a particular object, it can be lightly grasped
to choose the nearest object.
6.3 Object Manipulation
The interface is essentially built to demonstrate the work done in the previous
sections, using the sensor readings to achieve virtual object manipulation. For
this, various kinds of objects are placed randomly in the scene, each having its own
distinct way in which it can be interacted with.
6.3.1 Hand Grasp Tracking
To demonstrate our method for tracking the hand grasp configuration, we define an
object which, when grasped, shows a virtual hand grasping it, with its state driven
by the results of the tracking procedure. The hand model is, as defined above, an
11-DOF geometric model. We use this information to drive a Poser hand model. The
screen shot below shows the demonstration of hand grasp tracking.
Figure 6.3: Hand Grasp Tracking - Screen Shot
6.3.2 Free-Form Buttons
Our goal here was to show the effectiveness of assigning the fingers in a grasp to the
active pressure sensors. Here we try to inherit the properties of a multi-button mouse
using the Tango. Once a grasp is determined, we assign fingers to the corresponding
grasp as described in Section 4.3.2. We can then press one finger slightly harder and
define it as an action. Thus, regardless of the position of the finger on the device, we
can attribute it to a specific action. In our interface, we show an object changing
colors depending on which finger is applying more force. The following screen shot
shows the behavior we have defined:
• Red - Thumb
• Green - Index
• Blue - Middle
Apart from these, other actions can be defined using various combinations of the
grasp, e.g. 2 fingers or 3 fingers.
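The free-form button behavior can be sketched as a small lookup in Python: given the force attributed to each finger, the finger that presses hardest selects the color. The color mapping follows the list above; the function name `button_color`, the dictionary input and the `margin` parameter (how much harder one finger must press than the next before an action fires) are assumptions made for illustration.

```python
FINGER_COLORS = {"thumb": "red", "index": "green", "middle": "blue"}


def button_color(forces, margin=0.0):
    """Return the color for the finger applying the most force.

    forces maps finger name to force. Returns None when no finger exceeds
    the next strongest one by more than `margin` (a hypothetical dead band).
    """
    ranked = sorted(forces.items(), key=lambda kv: kv[1], reverse=True)
    (top_finger, top_force), (_, second_force) = ranked[0], ranked[1]
    if top_force - second_force > margin:
        return FINGER_COLORS[top_finger]
    return None
```

Because the lookup is keyed by finger rather than by sensor location, the same action fires wherever the finger happens to rest on the device.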
Figure 6.4: Free Form Buttons - Screen Shot
6.3.3 Object Picking
Here we aim to show how effectively an object in a virtual environment can be
picked up realistically by grasping it with the whole hand, moving it to any
position and releasing the tight grasp to drop it.
Figure 6.5: Object Manipulation - Screen Shot
6.3.4 Force Application
The goal here is to directly map the pressure sensors on the Tango to points on an
object surface and apply forces to the object using the whole hand input. This
provides a basis for modelling deformable objects in future work.
Figure 6.6: Force Application - Screen shot
6.3.5 Implementation and Performance
Chapter 7
Conclusion and Future Work
The Tango is a novel, graspable, whole-hand input device. The device can measure
contact pressure distribution during grasping and manipulation at 100Hz, and device
acceleration, and transmit these signals to a host computer using USB. A pressure
sensing method using only digital drive signals is used. We expect the device to
form the basis of new computer interfaces for manipulating 3D objects.
The Tango is a multi-purpose user input device. We have tried to show
various uses of the Tango in the interface and also given an approach to track the
Hand Grasp Configuration using the force input. However there are many other
possible applications in which the design of the Tango could prove very effective.
The following are some possible applications.
• Whole hand sensing of grasping and manipulation of 3D objects (e.g.,
deformable objects).
• 3D shape sculpting (e.g., to build geometric models for computer animation
and free-form CAD surface design). Users can treat the object represented by
the Tango as 3D "clay."
• 3D navigation (e.g., to show which direction in 3D to view a virtual
environment by touching appropriate points on the Tango).
• Free form "buttons." Buttons can be assigned to fingers, and not to locations on
the device, so users can use the most comfortable hand position.
• Full 3D mouse (with six degrees of freedom), by combining acceleration
readings with locations of fingers on the device.
Future work can also involve improving the tracking of the hand grasp
configuration by combining the Tango input with other sensors.
Bibliography
[1] Henrik Rehbinder and Xiaoming Hu. Drift free attitude estimation for accelerated rigid bodies. In The Proceedings of the 2001 IEEE International Conference on Robotics and Automation, 2001.
[2] Immersion Corporation. CyberGlove, http://www.immersion.com.
[3] VPL Research.
[4] Exos. Dextrous HandMaster, http://www.exos.com.
[5] Sturman and Zeltzer. A survey of glove-based input. In IEEE Computer Graphics Application, 1994.
[6] Shan Lu, Gang Huang, Dimitris Samaras, and Dimitris Metaxas. Model-based integration of visual cues for hand tracking. In WMVC, 2002.
[7] Cullen Jennings. Robust finger tracking with multiple cameras. In The International Workshop on Recognition, Analysis, and Tracking of Faces and Gestures in Real-Time Systems, pages 152–160, 1999.
[8] Mourad Bouzit, Grigore Burdea, George Popescu, and Rares Boian. The Rutgers Master II new design force feedback glove. In IEEE/ASME Transactions on Mechatronics, 2002.
[9] B. Insko, M. Meehan, M. Whitton, and F. P. Brooks Jr. Passive haptics significantly enhances virtual environments. Technical report, Computer Science Technical Report 01-010, University of North Carolina, Chapel Hill, NC, 2001.
[10] S. Zhai, P. Milgram, and W. Buxton. The influence of muscle groups on performance of multiple degree-of-freedom input. In Proceedings of CHI '96, 308-315, 1996.
[11] R. D. Howe. Tactile sensing and control of robotic manipulation. In The Journal of Advanced Robotics, 1994.
[12] R. S. Fearing. Tactile sensing mechanisms.
[13] Pressure Profile Systems Inc. ConTacts, http://www.pressure-profile.com.
[14] XSensor Technology. Xsensor, http://www.xsensor.com.
[15] Strecker and Benedikt. Seminar: Methods and tools in medical imaging.
[16] Greg Welch and Gary Bishop. An introduction to the Kalman filter. Course, SIGGRAPH, 2001.
[17] Nobutaka Shimada, Yoshiaki Shirai, Yoshinori Kuno, and Jun Miura. Hand gesture estimation and model refinement of monocular camera ambiguity limitation by inequality constraints. In Intl. Conference on Automatic Face and Gesture Recognition, 1998.
[18] Christoph Bregler. Learning and recognizing human dynamics in video sequences. In IEEE Conf. Comp. Vision and Pattern Recognition, 1997.
[19] B. Stenger, P. R. S. Mendonca, and R. Cipolla. Model based hand tracking using an unscented Kalman filter. In British Machine Vision Conference, 2001.
[20] Lin Chai, William Hoff, and Tyrone Vincett. Three-dimensional motion and structure estimation using inertial sensors and computer vision for augmented reality. In Presence: Teleoperators and Virtual Environments, 2002.
[21] Kenji Oka, Yoichi Sato, and Hideki Koike. Real-time tracking of multiple fingertips and gesture recognition for augmented desk interface systems. In IEEE International Conference on Automatic Face and Gesture Recognition, 2002.
[22] Michael Isard and Andrew Blake. Condensation - conditional density propagation for visual tracking. In Intl. Journal of Computer Vision, 1998.
[23] Deutscher, Blake, and Reid. Articulated body motion capture by annealed particle filtering. In CVPR, 2000.
[24] Hanning Zhou and Thomas S. Huang. A Bayesian framework for real-time 3D hand tracking in high clutter background. In HCI International, 2003.
[25] Ming-Hsuan Yang and Narendra Ahuja. Recognizing hand gesture using motion trajectories. In IEEE CS Conference on Computer Vision and Pattern Recognition, 1999.
[26] B. Stenger, P. R. S. Mendonca, and R. Cipolla. Model based hand tracking using an unscented Kalman filter. In BMVC, 2001.
[27] A. J. Heap and D. C. Hogg. Towards 3-D hand tracking using a deformable model. In International Face and Gesture Recognition Conference, 1996.
[28] Michele Basseville and Igor Nikiforov. Detection of Abrupt Changes: Theory and Applications. Prentice-Hall.
[29] M. Tyler and M. Morari. Change detection using non linear filtering and likelihood ratio testing. In Technical Report vol. AUT96-17, 1996.
[30] A. H. Jazwinski. Stochastic processes and filtering theory. Academic Press, New York, 1970.
[31] Y. Bar Shalom and T. E. Fortmann. Tracking and data association. Number 179 in Mathematics in science and engineering. Academic Press, Boston, 1988.
[32] A. Doucet, N. G. de Freitas, and N. J. Gordon. Sequential Monte Carlo methods in practice. In Statistics for Engineering and Information Sciences Series, Springer-Verlag, 2001.