Pose from Action: Unsupervised Learning of Pose Features based on Motion
Senthil Purushwalkam, Abhinav Gupta
Motivation
Human actions are an ordered sequence of poses.
Can we leverage human action videos to learn a pose-encoding representation?
Surrogate Task
Overview
Data and Implementation
[Architecture figure: an Appearance ConvNet encodes Frame 1 and Frame 13 (224x224x3 each) into appearance representations A and A'; a Motion ConvNet encodes the stacked optical flow maps (224x224x24) into a transformation representation T. All ConvNets use the VGG-M-2048 architecture.]
Robotics Institute, Carnegie Mellon University
A representation in which poses cluster together should be useful for Pose Estimation and Action Recognition.
Given two poses, it should be possible to predict the motion between them.
Given:
- Two appearance representations A and A' (pose)
- One motion representation T (transformation of pose)
Predict:
- Whether the transformation T could cause the change A → A'
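This surrogate task reduces to binary classification over (A, A', T) triplets. A minimal sketch of how such training examples could be constructed; the function name, clip representation, and negative-sampling scheme here are illustrative assumptions, not the authors' exact pipeline:

```python
import random

def make_pairs(clips, num_negatives=1):
    """Build (frame_a, frame_b, flow, label) tuples for the surrogate task.

    clips: list of dicts with keys 'frame_a', 'frame_b', 'flow', where
           'flow' is the optical flow between the two frames of that clip.
    Positives pair the two frames with their true flow (label 1);
    negatives pair them with flow drawn from a different clip (label 0),
    i.e. a transformation T that could not have caused the change A -> A'.
    """
    pairs = []
    for i, clip in enumerate(clips):
        pairs.append((clip['frame_a'], clip['frame_b'], clip['flow'], 1))
        for _ in range(num_negatives):
            j = random.choice([k for k in range(len(clips)) if k != i])
            pairs.append((clip['frame_a'], clip['frame_b'], clips[j]['flow'], 0))
    return pairs
```

In training, A and A' would come from the Appearance ConvNet applied to `frame_a` and `frame_b`, and T from the Motion ConvNet applied to `flow`.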
We use videos from the UCF101, HMDB51 and ACT video datasets.
We sample two frames separated by ∆n (= 12) frames and extract optical flow for the ∆n intervening frames.
We use the VGG-M-2048 architecture for all ConvNets.
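The frame-sampling step could look like the following sketch; the function name and the exhaustive stride-1 enumeration are assumptions for illustration:

```python
def sample_frame_pairs(num_frames, delta_n=12):
    """Enumerate (start, end) frame-index pairs separated by delta_n frames.

    For each pair, the horizontal and vertical optical flow of the delta_n
    intervening frames would be stacked channel-wise, which is consistent
    with the 224x224x24 Motion ConvNet input (12 flow maps x 2 channels).
    """
    return [(i, i + delta_n) for i in range(num_frames - delta_n)]
```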
Experiments
Nearest-neighbor retrieval in the FC6 feature space of the Appearance ConvNet.
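This retrieval reduces to nearest-neighbor search over FC6 vectors. A minimal NumPy sketch; Euclidean distance is an assumption, as the poster does not state the metric:

```python
import numpy as np

def nearest_neighbors(query, features, k=5):
    """Indices of the k features closest to `query` under Euclidean distance.

    query:    (d,) FC6 vector of the query image
    features: (n, d) FC6 vectors of the retrieval set
    """
    dists = np.linalg.norm(features - query, axis=1)
    return np.argsort(dists)[:k]
```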
Results
Action Recognition:
[Bar chart: classification accuracy (0%–80%) on UCF101 and HMDB51, comparing Random initialisation, Wang et al., Ours (full network), Ours (last 2 layers), and ImageNet pretraining.]
Pose Estimation:
[Bar chart: Strict PCP evaluation (0%–80%) on Upper Arms and Lower Arms, comparing Random initialisation, UCF101 pretraining, Wang et al., Ours, and ImageNet pretraining.]
Static Image Action Recognition:
[Bar chart: classification accuracies (0%–25%), comparing Random Initialisation, SSFA, and Ours.]
Paper available on arXiv at:
https://arxiv.org/abs/1609.05420
Conv1 Filter Visualisation: [figure of learned Conv1 filters]