Basics of Representations (and traditional low-level representations)

CS331B: Representation Learning in Computer Vision

Amir R. Zamir, Silvio Savarese

(class logistics)

● Student paper presentations for 10/12

○ Discriminative learning of deep convolutional feature point descriptors, Simo-Serra, E., Trulls, E., Ferraz, L., Kokkinos, I., Fua, P., & Moreno-Noguer, F., ICCV 2015.

○ Data-driven 3D voxel patterns for object category recognition, Xiang, Y., Choi, W., Lin, Y., & Savarese, S., CVPR 2015.

○ Convolutional-recursive deep learning for 3D object classification, Socher, R., Huval, B., Bath, B., Manning, C. D., & Ng, A. Y., NIPS 2012.

(class logistics)

● A few conceptual and ML-oriented papers towards the end of the quarter:

○ Representation learning: A review and new perspectives, Bengio, Y., Courville, A., & Vincent, P., PAMI 2013.

○ Intelligence without representation, Brooks, R. A., Artificial Intelligence, 1991.

● Additional ideas for student presentations (extensive papers, talks, etc.) -- prior approval needed.

What we talked about so far...

Things... Our Knowledge...

“Transcript”

Cat

Macbeth was guilty.

[ 81 20 84 64 58 39 17 54 72 15]

Representation Mathematical Model (e.g., classifier)

[Figure: two types of cats, ~12 lbs and ~8 lbs, marked along a weight axis. Representation: weight (w). Mathematical model (classifier): the threshold w > 11 separates Type A from Type B.]
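
The weight example can be sketched in a few lines of Python. This is only an illustration: the function name is hypothetical, and the assignment of labels to the two sides of the threshold is an assumption, since the slide gives the threshold (w > 11) and the two labels but not which side is which.

```python
def classify_by_weight(weight_lbs, threshold_lbs=11.0):
    """Read out a one-number representation (weight) with a threshold classifier."""
    # Assumption: the heavier side of the threshold is "Type A".
    # The slide only names the two labels and the threshold w > 11.
    return "Type A" if weight_lbs > threshold_lbs else "Type B"


# The two approximate weights mentioned on the slide.
labels = [classify_by_weight(w) for w in (12.0, 8.0)]
```

The point of the sketch is the split of roles: the representation is the single number w, and the mathematical model is the comparison against the threshold.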

Represent these cats for a cat detector!

Represent these cats for a cat detector! (II)

Represent these cats for a cat detector! (III)

Represent these cats for a cat detector! (IV)

Not always as easy (Happy vs Sad)

Not always as easy (Sad)

Color Histograms

Deformable Part-based Models (DPM)

Histogram of Oriented Gradients (HOG)

Shape-based Models

Felzenszwalb et al., 2010. Dalal and Triggs, 2005. Beis and Lowe, 1997.

This lecture...

Some basic concepts related to representations

Concepts
● Ill-posedness
● Readout linearity
● Dimensionality
● Computational complexity
● Encoding power (i.e., performance)
● Narrowness of application domain (vertical vs. horizontal representations)

Ill-posedness

C. F. Bohren, D. R. Huffman, 1983.

Ill-posedness

http://www.youtube.com/watch?v=A4QcyW-qTUg

Ill-posedness

Ill-posedness
● 3D pose estimation from 2D gradients is an ill-posed problem.
○ A 2D gradient representation is ill-posed w.r.t. 3D pose.
○ A 2D gradient representation + full semantics is NOT ill-posed w.r.t. 3D pose.
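
A minimal sketch of why going from 2D back to 3D is ill-posed. It uses a plain pinhole projection as a simpler stand-in for the 2D gradient representation on the slide; the function name, the focal length, and the sample points are assumptions chosen for illustration.

```python
import numpy as np


def project_to_image(point_3d, focal_length=1.0):
    """Pinhole projection of a 3D point (X, Y, Z), with Z > 0, onto the image plane."""
    x, y, z = point_3d
    return np.array([focal_length * x / z, focal_length * y / z])


near_point = np.array([1.0, 2.0, 4.0])
far_point = 2.5 * near_point  # a different 3D point on the same viewing ray

# Both 3D points produce the same 2D observation, so the 2D observation
# alone cannot tell them apart: the inverse problem has no unique solution.
same_observation = np.allclose(
    project_to_image(near_point), project_to_image(far_point)
)
```

Extra information (the "full semantics" in the bullet above, e.g. knowing the object and its size) is what removes this ambiguity.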

Linearity

Linearity

● Readout linearity → concerns modeling parameters → linear classifier, fully connected (FC) layer
● Representation non-linearity → concerns independent variables → ReLU, neurons, etc.

[Two example plots, each asking: linear or non-linear?]

With respect to: {modeling parameters (decision), independent variables (representation)}

Linear or Non-linear?

Independent var. (x,y)

Modeling Param. (a,b,c,r)

Linear non-Linear

Linear Linear

Decision boundary

Not discussing kernels, reparametrization, etc.
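
The two notions of linearity can be checked numerically. The sketch below uses a hypothetical decision function f(x, y) = a·x² + b·y² + c (not taken from the slide) and tests additivity, a necessary condition for linearity, once with respect to the modeling parameters (a, b, c) and once with respect to the independent variables (x, y). All names and sample values are assumptions.

```python
import numpy as np


def decision_value(point, params):
    """Hypothetical decision function f(x, y) = a*x^2 + b*y^2 + c."""
    x, y = point
    a, b, c = params
    return a * x**2 + b * y**2 + c


def is_additive(func, u, v):
    """Additivity check f(u + v) == f(u) + f(v), a necessary condition for linearity."""
    return bool(np.isclose(func(u + v), func(u) + func(v)))


point, other_point = np.array([1.5, -2.0]), np.array([0.5, 3.0])
params, other_params = np.array([2.0, 1.0, -4.0]), np.array([-1.0, 0.5, 3.0])

# Vary the modeling parameters with the input held fixed.
additive_in_params = is_additive(
    lambda p: decision_value(point, p), params, other_params
)
# Vary the independent variables with the parameters held fixed.
additive_in_inputs = is_additive(
    lambda q: decision_value(q, params), point, other_point
)
```

For this function the check passes for the parameters and fails for the inputs: the readout is linear in (a, b, c) even though the decision boundary is curved in (x, y).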

Concepts
● Ill-posedness
● Readout non-linearity
● Dimensionality
● Computational complexity
● Encoding power (i.e., single-task performance)
● Narrowness of application domain (i.e., multi-task performance)

More discussions in Lectures 3 & 8

More discussions in Lecture 12

Classical low-level 2D Representations

Pixel Gradient based Features

Histogram of Gradients (and its descendants)

Dalal and Triggs, 2005.
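
A minimal sketch of the core step of a gradient histogram: for one cell, bin the gradient orientations and weight each vote by gradient magnitude. This is a simplification under stated assumptions (9 unsigned-orientation bins over 0 to 180 degrees, hard binning, no block normalization or interpolation), not the full descriptor of Dalal and Triggs. The function name is hypothetical.

```python
import numpy as np


def cell_orientation_histogram(cell, num_bins=9):
    """Magnitude-weighted histogram of unsigned gradient orientations for one cell."""
    cell = np.asarray(cell, dtype=float)
    # np.gradient returns derivatives along axis 0 (rows, y) then axis 1 (columns, x).
    grad_y, grad_x = np.gradient(cell)
    magnitude = np.hypot(grad_x, grad_y)
    # Unsigned orientation: fold angles into [0, 180) degrees.
    orientation = np.degrees(np.arctan2(grad_y, grad_x)) % 180.0
    histogram, _ = np.histogram(
        orientation, bins=num_bins, range=(0.0, 180.0), weights=magnitude
    )
    return histogram


# A cell whose intensity increases from left to right: all gradients point along x.
horizontal_ramp = np.tile(np.arange(8.0), (8, 1))
ramp_histogram = cell_orientation_histogram(horizontal_ramp)
```

For the horizontal ramp, all the gradient energy falls into the first orientation bin; transposing the cell moves it to the bin that contains 90 degrees.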

HOGgles! Representation ⇄ Data

Vondrick et al., 2013.


http://www.youtube.com/watch?v=y7l_TApARGc

HOGgles -- How: sparse coding
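
A rough sketch of feature inversion by sparse coding, assuming a pair of dictionaries that share one sparse code: one dictionary for features and one for image patches. The sparse code is estimated on the feature side and then reused on the image side. The solver (plain ISTA), the function names, and all parameter values are assumptions for illustration; this is not the authors' implementation, and dictionary learning is not shown.

```python
import numpy as np


def soft_threshold(values, amount):
    """Shrink each entry toward zero by `amount` (proximal step for the L1 norm)."""
    return np.sign(values) * np.maximum(np.abs(values) - amount, 0.0)


def sparse_code(feature, feature_dictionary, sparsity_weight=0.005, num_iterations=3000):
    """ISTA for min_a 0.5 * ||D a - feature||^2 + sparsity_weight * ||a||_1."""
    # Step size 1/L, where L is the squared spectral norm of the dictionary.
    step = 1.0 / np.linalg.norm(feature_dictionary, ord=2) ** 2
    code = np.zeros(feature_dictionary.shape[1])
    for _ in range(num_iterations):
        gradient = feature_dictionary.T @ (feature_dictionary @ code - feature)
        code = soft_threshold(code - step * gradient, step * sparsity_weight)
    return code


def invert_feature(feature, feature_dictionary, image_dictionary, **solver_options):
    """Estimate the sparse code on the feature side, then decode on the image side."""
    code = sparse_code(feature, feature_dictionary, **solver_options)
    return image_dictionary @ code
```

Usage would be `invert_feature(hog_vector, hog_dictionary, patch_dictionary)`, where the two dictionaries have the same number of columns (atoms) and have been trained jointly beforehand.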

HOGgles!


HOGgles & ill-posedness


Hadamard well-posedness terms:
1. A solution exists
2. The solution is unique
3. The solution's behavior is smooth (it changes continuously with the data)
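
A small sketch of how the uniqueness condition can fail when inverting a histogram-style representation. It uses a plain intensity histogram as a simpler stand-in for HOG; the function name and the two toy patches are assumptions.

```python
import numpy as np


def intensity_histogram(patch, num_bins=4):
    """Histogram of pixel intensities in [0, 256); spatial layout is discarded."""
    counts, _ = np.histogram(patch, bins=num_bins, range=(0, 256))
    return counts


patch_a = np.array([[0, 255], [0, 255]])  # dark left column, bright right column
patch_b = patch_a.T                       # dark top row, bright bottom row

# Different data, identical representation: inverting the representation
# cannot have a unique solution.
same_representation = np.array_equal(
    intensity_histogram(patch_a), intensity_histogram(patch_b)
)
different_data = not np.array_equal(patch_a, patch_b)
```

Any inversion method therefore has to pick one of many consistent images, typically by adding a prior.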

Affine-SIFT
● Original SIFT: invariant to 4 of the 6 affine DOF (translation, scale, rotation)
● ASIFT -- basic idea: exhaustively transform images (w/ sampling and efficiency mechanisms) → then use original SIFT.

Morel & Yu, 2009.

http://www.youtube.com/watch?v=iY6d5pBdRC8
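
A sketch of the "exhaustively transform images" step: enumerate a sampled set of tilt and in-plane rotation pairs and build the corresponding 2×2 matrices. The sampling values (tilts in powers of √2, rotation step of 72°/tilt, a single view at tilt 1) are assumptions recalled from the ASIFT paper and should be checked against it. The matrix convention and function name are also assumptions. Image warping and the SIFT step itself are omitted.

```python
import numpy as np


def simulated_affine_views(num_tilts=6, tilt_exponent_step=0.5, base_angle_deg=72.0):
    """Return (tilt, rotation_deg, 2x2 matrix) for each simulated view."""
    views = []
    for k in range(num_tilts):
        tilt = 2.0 ** (k * tilt_exponent_step)  # 1, sqrt(2), 2, 2*sqrt(2), ...
        if k == 0:
            angles = np.array([0.0])  # no tilt: a single, untransformed view
        else:
            # Finer rotation sampling for stronger tilts.
            angles = np.arange(0.0, 180.0, base_angle_deg / tilt)
        for angle in angles:
            phi = np.radians(angle)
            rotation = np.array(
                [[np.cos(phi), -np.sin(phi)], [np.sin(phi), np.cos(phi)]]
            )
            # Assumed convention: rotate first, then compress the x axis by the tilt.
            tilt_matrix = np.diag([1.0 / tilt, 1.0])
            views.append((tilt, float(angle), tilt_matrix @ rotation))
    return views


views = simulated_affine_views()
```

Each matrix would be used to warp the input image, and standard SIFT would then be run on every warped copy. The two remaining affine DOF are covered by this sampling, while SIFT handles the other four.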

Self-Similarity (see the board!)

Junejo et al., 2008.
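
Since the details were given on the board, the following is only a generic sketch of a temporal self-similarity matrix: entry (i, j) is the distance between the features of frame i and frame j. The per-frame feature and the Euclidean distance are assumptions, not details taken from the slide.

```python
import numpy as np


def self_similarity_matrix(frame_features):
    """Pairwise Euclidean distances between per-frame feature vectors."""
    features = np.asarray(frame_features, dtype=float)
    differences = features[:, None, :] - features[None, :, :]
    return np.linalg.norm(differences, axis=-1)


# Three toy frames; the first and the last have identical features.
toy_features = [[0.0, 0.0], [3.0, 4.0], [0.0, 0.0]]
toy_matrix = self_similarity_matrix(toy_features)
```

The matrix is symmetric with a zero diagonal, and repeated motion shows up as off-diagonal structure.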

(spatial) Self-Similarity

Shechtman & Irani, 2007.
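
A simplified sketch of the first step of a local (spatial) self-similarity descriptor: compare the patch around a pixel with every patch in a surrounding region and turn the sum of squared differences into a correlation surface. The patch and region sizes, the single `noise_variance` normalizer, and the function name are assumptions; the later log-polar binning of the surface is omitted.

```python
import numpy as np


def self_similarity_surface(image, center, patch_radius=1, region_radius=4,
                            noise_variance=1.0):
    """Correlation surface exp(-SSD / noise_variance) around `center` = (row, col).

    The center must lie at least patch_radius + region_radius pixels from the border.
    """
    image = np.asarray(image, dtype=float)
    center_row, center_col = center

    def patch_at(row, col):
        return image[row - patch_radius:row + patch_radius + 1,
                     col - patch_radius:col + patch_radius + 1]

    reference = patch_at(center_row, center_col)
    size = 2 * region_radius + 1
    ssd = np.empty((size, size))
    for d_row in range(-region_radius, region_radius + 1):
        for d_col in range(-region_radius, region_radius + 1):
            candidate = patch_at(center_row + d_row, center_col + d_col)
            ssd[d_row + region_radius, d_col + region_radius] = np.sum(
                (candidate - reference) ** 2
            )
    return np.exp(-ssd / noise_variance)
```

Because the surface depends only on how the patch relates to its own neighborhood, two regions with different colors or textures but the same layout can yield similar surfaces.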

Classical Video features

3D-SIFT

A descriptor for volumetric data (temporal or 3D)

Scovanner et al., 2007.

[Figure panels: 2D SIFT, multiple 2D SIFT, 3D SIFT]
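
A sketch of what changes when moving from 2D to 3D: each voxel of a (t, y, x) volume gets a gradient magnitude and two orientation angles instead of one. The axis order, function name, and angle conventions are assumptions; histogram binning and the rest of the descriptor are omitted.

```python
import numpy as np


def spatiotemporal_gradient_orientations(volume):
    """Gradient magnitude and two angles for a volume indexed as (t, y, x)."""
    volume = np.asarray(volume, dtype=float)
    grad_t, grad_y, grad_x = np.gradient(volume)
    magnitude = np.sqrt(grad_x**2 + grad_y**2 + grad_t**2)
    theta = np.arctan2(grad_y, grad_x)                  # in-plane orientation
    phi = np.arctan2(grad_t, np.hypot(grad_x, grad_y))  # elevation toward the t axis
    return magnitude, theta, phi


# A toy volume whose intensity changes only over time.
brightening_volume = np.arange(4.0)[:, None, None] * np.ones((4, 5, 5))
toy_magnitude, toy_theta, toy_phi = spatiotemporal_gradient_orientations(
    brightening_volume
)
```

For the toy volume the gradient points purely along t, so the elevation angle is π/2 everywhere.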

3D-SIFT

Scovanner et al., 2007.

Spatio-temporal cubes; bag-of-words (~cubes) -- based on 3D SIFT similarity
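
A minimal bag-of-words sketch: each cube descriptor is assigned to its most similar codeword, and the video is summarized by the normalized count of codewords. The Euclidean nearest-neighbor assignment, the function name, and the toy data are assumptions; building the codebook (e.g., by clustering) is not shown.

```python
import numpy as np


def bag_of_words_histogram(descriptors, codebook):
    """L1-normalized histogram of nearest-codeword assignments."""
    descriptors = np.asarray(descriptors, dtype=float)
    codebook = np.asarray(codebook, dtype=float)
    # Distance from every descriptor to every codeword.
    distances = np.linalg.norm(
        descriptors[:, None, :] - codebook[None, :, :], axis=-1
    )
    nearest_word = np.argmin(distances, axis=1)
    counts = np.bincount(nearest_word, minlength=len(codebook)).astype(float)
    return counts / counts.sum()


toy_codebook = [[0.0, 0.0], [10.0, 10.0]]
toy_descriptors = [[0.0, 1.0], [9.0, 9.0], [10.0, 11.0]]
toy_histogram = bag_of_words_histogram(toy_descriptors, toy_codebook)
```

The resulting fixed-length histogram can be fed to a standard classifier regardless of how many cubes the video produced.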

Dense Trajectory Features

Wang et al., 2011.

Lucas & Kanade, 1981.
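
The slide cites Lucas & Kanade next to dense trajectories; how the two are connected on the slide is not recoverable here. The following is therefore only a sketch of the Lucas-Kanade idea itself: within one window, assume a single displacement (u, v) and solve the brightness-constancy equations Ix·u + Iy·v = -It in the least-squares sense. The function name and the use of the whole frame as one window are assumptions.

```python
import numpy as np


def lucas_kanade_flow(frame_a, frame_b):
    """Least-squares estimate of one displacement (u, v) from frame_a to frame_b."""
    frame_a = np.asarray(frame_a, dtype=float)
    frame_b = np.asarray(frame_b, dtype=float)
    # Spatial derivatives (axis 0 is y, axis 1 is x) and temporal difference.
    grad_y, grad_x = np.gradient(frame_a)
    grad_t = frame_b - frame_a
    # One equation per pixel: Ix * u + Iy * v = -It.
    design = np.stack([grad_x.ravel(), grad_y.ravel()], axis=1)
    flow, *_ = np.linalg.lstsq(design, -grad_t.ravel(), rcond=None)
    return flow  # array([u, v]), in pixels
```

The estimate is only reliable for small displacements and for windows with gradients in more than one direction; practical trackers add image pyramids and iterate.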


Course webpage: http://web.stanford.edu/class/cs331b/

http://www.cs.stanford.edu/~amirz/

http://cvgl.stanford.edu/silvio/
