Linear Subspaces and Projection in Computer Vision, Slides of Computer Vision

These slides introduce the concept of projecting high-dimensional images to lower-dimensional feature spaces in computer vision. They cover linear subspaces, linear projection, distance to a linear subspace, and distance to an affine subspace, and include an application to face detection using "distance to face space".

Typology: Slides

2012/2013

Uploaded on 04/24/2013

baishali
baishali 🇮🇳



CSE152, Spr 12 Intro Computer Vision

Recognition III

Introduction to Computer Vision

CSE 152

Lecture 19


Linear Subspaces & Linear Projection

  • An n-pixel image $x \in \mathbb{R}^d$ can be projected to a low-dimensional feature space $y \in \mathbb{R}^k$ by

$y = Wx$

where $W$ is a $k \times d$ matrix.

  • Recognition is performed in $\mathbb{R}^k$ using, for example, nearest neighbor.
  • How do we choose a good W?

Example: Projecting from R^3 to R^2
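
As a minimal sketch of projection followed by nearest-neighbor recognition, assuming NumPy: the matrix `W`, the helper names, and the toy data below are hypothetical, chosen only to mirror the R^3 to R^2 example. How to choose a good `W` is left open here.

```python
import numpy as np

def project(W, x):
    """Project x in R^d to y = W x in R^k (W is k x d)."""
    return W @ x

def nearest_neighbor_label(W, train_X, train_labels, x):
    """Label x with the label of its nearest training point in feature space.

    train_X holds one training image per row (shape: num_images x d).
    """
    Y = train_X @ W.T                      # row i is the projection of image i
    y = project(W, x)
    dists = np.linalg.norm(Y - y, axis=1)  # Euclidean distances in R^k
    return train_labels[int(np.argmin(dists))]

# Toy example: drop the third coordinate, projecting from R^3 to R^2.
W = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
train_X = np.array([[0.0, 0.0, 5.0],
                    [4.0, 4.0, -3.0]])
train_labels = ["a", "b"]
query = np.array([3.5, 4.2, 9.0])

y = project(W, query)
label = nearest_neighbor_label(W, train_X, train_labels, query)
print(y, label)
```

Note that the third coordinate plays no role in the decision, which is why the choice of `W` matters.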


Distance to Linear Subspace

  • An n-pixel image $x \in \mathbb{R}^d$ can be projected to a low-dimensional feature space $y \in \mathbb{R}^k$ by

$y = Wx$

  • From $y \in \mathbb{R}^k$, the reconstruction of the point in $\mathbb{R}^d$ is $W^T y = W^T W x$ (assuming the rows of $W$ are orthonormal)
  • The error of the reconstruction, or the distance from $x$ to the subspace spanned by the rows of $W$, is: $\|x - W^T W x\|$
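
The reconstruction error can be sketched in a few lines, assuming NumPy and a `W` whose rows are orthonormal; the function name and the example subspace are illustrative only.

```python
import numpy as np

def distance_to_linear_subspace(W, x):
    """Return ||x - W^T W x||, assuming the rows of W are orthonormal."""
    y = W @ x          # coordinates in the k-dimensional feature space
    x_hat = W.T @ y    # reconstruction back in R^d
    return np.linalg.norm(x - x_hat)

# Subspace: the x1-x2 plane in R^3.
W = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
x = np.array([2.0, -1.0, 3.0])

# Only the x3 component is lost by the projection.
d = distance_to_linear_subspace(W, x)
print(d)
```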


Distance to Affine Subspace

(i.e., Distance to Face Space)

  • Represented by mean vector $\mu$ and basis images $W$
  • An n-pixel image $x \in \mathbb{R}^d$ can be projected to a low-dimensional feature space $y \in \mathbb{R}^k$ by

$y = W(x - \mu)$

  • From $y \in \mathbb{R}^k$, the reconstruction of the point in $\mathbb{R}^d$ is $W^T y + \mu = W^T W(x - \mu) + \mu$
  • The error of the reconstruction, or the distance from $x$ to the affine subspace, is: $\|x - W^T W(x - \mu) - \mu\| = \|(I - W^T W)(x - \mu)\|$

[Figure: a point $x$ in $\mathbb{R}^3$ (axes $x_1$, $x_2$, $x_3$), its projection $y$, and the mean $\mu$.]
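
The affine case differs from the linear one only by subtracting and re-adding the mean. A minimal sketch, assuming NumPy and orthonormal rows in `W`; the names and numbers are illustrative, not taken from the slides.

```python
import numpy as np

def distance_to_affine_subspace(W, mu, x):
    """Return ||x - (W^T W (x - mu) + mu)||, assuming orthonormal rows in W."""
    y = W @ (x - mu)        # feature-space coordinates relative to the mean
    x_hat = W.T @ y + mu    # reconstruction in R^d
    return np.linalg.norm(x - x_hat)

W = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
mu = np.array([0.0, 0.0, 1.0])
x = np.array([2.0, -1.0, 3.0])

d = distance_to_affine_subspace(W, mu, x)

# Equivalent form: ||(I - W^T W)(x - mu)||.
d_alt = np.linalg.norm((np.eye(3) - W.T @ W) @ (x - mu))
print(d, d_alt)
```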


Application 1:

Face detection using “distance to face space”

  • Scan a window ω across the image, and classify the window as face/not face as follows:
  • Project window to subspace, and reconstruct as described earlier.
  • Compute distance between ω and reconstruction.
  • Local minima of distance over all image locations less than some threshold are taken as locations of faces.
  • Repeat at different scales.
  • Possibly normalize the window intensity so that $\|\omega\| = 1$.
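
The scanning procedure might be sketched as follows, assuming NumPy, a single scale, and no intensity normalization. The function names, the 3x3 neighborhood used to define a local minimum, and the toy "face space" are all hypothetical choices for illustration.

```python
import numpy as np

def face_space_distance_map(image, W, mu, win):
    """Distance to face space for every window position (top-left corner)."""
    h, w = win
    rows = image.shape[0] - h + 1
    cols = image.shape[1] - w + 1
    dist = np.empty((rows, cols))
    for r in range(rows):
        for c in range(cols):
            omega = image[r:r + h, c:c + w].astype(float).ravel()
            resid = omega - mu
            # ||(I - W^T W)(omega - mu)||, assuming orthonormal rows in W
            dist[r, c] = np.linalg.norm(resid - W.T @ (W @ resid))
    return dist

def detect(dist, threshold):
    """Return (row, col) positions that are local minima below threshold."""
    padded = np.pad(dist, 1, constant_values=np.inf)
    hits = []
    for r in range(dist.shape[0]):
        for c in range(dist.shape[1]):
            neighborhood = padded[r:r + 3, c:c + 3]  # 3x3 block centered on (r, c)
            if dist[r, c] < threshold and dist[r, c] <= neighborhood.min():
                hits.append((r, c))
    return hits

# Toy data: a 2x2 "face" pattern placed in an otherwise empty 5x5 image.
pattern = np.array([[1.0, 2.0], [3.0, 4.0]])
mu = pattern.ravel()
W = np.full((1, 4), 0.5)            # a single unit-norm basis image
image = np.zeros((5, 5))
image[1:3, 2:4] = pattern

dist = face_space_distance_map(image, W, mu, win=(2, 2))
hits = detect(dist, threshold=0.5)
print(hits)
```

To handle multiple scales, the same two functions could be applied to resized copies of the image.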


An important footnote:

We don’t really implement PCA by constructing a covariance matrix!

Why?

1. How big is Σ?

• n by n, where n is the number of pixels in an image!

2. You only need the first k eigenvectors.
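
To make the size argument concrete, here is a small arithmetic sketch; the 100 x 100 image size and the 8-byte (float64) entries are assumptions picked for illustration.

```python
def covariance_bytes(num_pixels, bytes_per_entry=8):
    """Memory needed to store a dense num_pixels x num_pixels covariance matrix."""
    return num_pixels * num_pixels * bytes_per_entry

n = 100 * 100                      # pixels in a 100 x 100 image
size = covariance_bytes(n)         # 10^4 * 10^4 entries, 8 bytes each
print(size / 1e6, "MB")
```

Even for this modest image size the dense matrix needs 800 MB, while only k of its eigenvectors are actually used.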


Singular Value Decomposition

  • Any m by n matrix $A$ may be factored such that $A = U \Sigma V^T$, with dimensions [m x n] = [m x m][m x n][n x n]
  • $U$: m by m, orthogonal matrix
    • Columns of $U$ are the eigenvectors of $AA^T$
  • $V$: n by n, orthogonal matrix
    • Columns of $V$ are the eigenvectors of $A^T A$
  • $\Sigma$: m by n, diagonal with non-negative entries $\sigma_1, \sigma_2, \ldots, \sigma_s$, with $s = \min(m, n)$, called the singular values.
  • The SVD algorithm produces sorted singular values: $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_s$

Important property

    • The nonzero singular values are the square roots of the nonzero eigenvalues of both $AA^T$ and $A^T A$, and the columns of $U$ are the corresponding eigenvectors of $AA^T$.
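
These properties can be checked numerically. The sketch below assumes NumPy and uses a random 4 x 6 matrix purely as an example; `np.linalg.svd` returns the singular values as a vector, so the m by n matrix Σ is rebuilt explicitly.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 6))                    # m = 4, n = 6

U, s, Vt = np.linalg.svd(A, full_matrices=True)    # s is sorted in descending order
Sigma = np.zeros(A.shape)
np.fill_diagonal(Sigma, s)                         # m x n "diagonal" matrix

# Eigenvalues of the symmetric matrix A A^T, sorted in descending order.
eigvals = np.sort(np.linalg.eigvalsh(A @ A.T))[::-1]

reconstruction_ok = np.allclose(A, U @ Sigma @ Vt)
sqrt_property_ok = np.allclose(s, np.sqrt(eigvals))
# Each column u_i of U satisfies (A A^T) u_i = sigma_i^2 u_i.
eigvec_property_ok = np.allclose(A @ A.T @ U, U * s**2)
print(reconstruction_ok, sqrt_property_ok, eigvec_property_ok)
```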


Performing PCA with SVD

• Singular values of $A$ are the square roots of eigenvalues of both $AA^T$ and $A^T A$, and columns of $U$ are the corresponding eigenvectors of $AA^T$

• And $\sum_{i=1}^{n} a_i a_i^T = [a_1\ a_2\ \cdots\ a_n][a_1\ a_2\ \cdots\ a_n]^T = AA^T$

• Covariance matrix is: $\Sigma = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)(x_i - \mu)^T$

• So, ignoring 1/n, subtract the mean image $\mu$ from each input image, create a d x n data matrix, and perform thin SVD on the data matrix: $D = [x_1 - \mu \mid x_2 - \mu \mid \cdots \mid x_n - \mu]$
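
A minimal sketch of this recipe, assuming NumPy: the function name `pca_basis`, the random stand-in data, and the choice to return `W` with the principal directions as rows (so that y = W(x - μ)) are illustrative assumptions.

```python
import numpy as np

def pca_basis(X, k):
    """PCA via thin SVD.

    X is a d x n matrix with one image per column.
    Returns (mu, W, s): the mean image, a k x d basis, and all singular values.
    """
    mu = X.mean(axis=1, keepdims=True)
    D = X - mu                                        # d x n centered data matrix
    U, s, _ = np.linalg.svd(D, full_matrices=False)   # thin SVD: U is d x min(d, n)
    W = U[:, :k].T                                    # rows: top-k eigenvectors of D D^T
    return mu.ravel(), W, s

rng = np.random.default_rng(1)
X = rng.standard_normal((50, 8))    # d = 50 "pixels", n = 8 images
mu, W, s = pca_basis(X, k=3)

y = W @ (X[:, 0] - mu)              # feature vector of the first image
print(W.shape, y.shape)
```

The d x d covariance matrix is never formed; the thin SVD works directly on the d x n data matrix.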


PCA & Fisher’s Linear Discriminant

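• Between-class scatter: $S_B = \sum_{i=1}^{c} |\chi_i| (\mu_i - \mu)(\mu_i - \mu)^T$

• Within-class scatter: $S_W = \sum_{i=1}^{c} \sum_{x_k \in \chi_i} (x_k - \mu_i)(x_k - \mu_i)^T$

• Total scatter: $S_T = S_B + S_W = \sum_{k} (x_k - \mu)(x_k - \mu)^T$

• Where

  • c is the number of classes
  • $\mu_i$ is the mean of class $\chi_i$
  • $|\chi_i|$ is the number of samples of $\chi_i$
  • $\mu$ is the mean of all samples

[Figure: two classes of points, $\chi_1$ and $\chi_2$.]

If the data points $x_i$ are projected by $y_i = W x_i$ and the scatter of the $x_i$ is $S$, then the scatter of the projected points $y_i$ is $W S W^T$.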
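
The scatter matrices can be computed directly from labeled samples. This sketch assumes NumPy and the standard definitions (between-class scatter weighted by class size, within-class scatter summed over classes, total scatter about the global mean); the function name and random data are hypothetical. With y = Wx and W a k x d matrix, it also checks that the projected total scatter equals W S_T W^T.

```python
import numpy as np

def scatter_matrices(X, labels):
    """Return (S_B, S_W, S_T) for samples in the rows of X (num_samples x d)."""
    labels = np.asarray(labels)
    mu = X.mean(axis=0)                       # mean of all samples
    d = X.shape[1]
    S_B = np.zeros((d, d))
    S_W = np.zeros((d, d))
    for c in np.unique(labels):
        Xc = X[labels == c]                   # samples of class c
        mu_c = Xc.mean(axis=0)
        diff = (mu_c - mu)[:, None]
        S_B += len(Xc) * (diff @ diff.T)      # weighted by class size
        S_W += (Xc - mu_c).T @ (Xc - mu_c)
    S_T = (X - mu).T @ (X - mu)
    return S_B, S_W, S_T

rng = np.random.default_rng(0)
X = np.vstack([rng.standard_normal((10, 4)) + 2.0,
               rng.standard_normal((15, 4))])
labels = [0] * 10 + [1] * 15
S_B, S_W, S_T = scatter_matrices(X, labels)

# Project with an arbitrary 2 x 4 matrix and recompute the total scatter.
W = rng.standard_normal((2, 4))
Y = X @ W.T
_, _, S_T_proj = scatter_matrices(Y, labels)
print(np.allclose(S_T, S_B + S_W), np.allclose(S_T_proj, W @ S_T @ W.T))
```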


PCA & Fisher’s Linear Discriminant

• PCA (Eigenfaces)

Maximizes projected total scatter

• Fisher’s Linear Discriminant

Maximizes ratio of projected between-class to projected within-class scatter

[Figure: two classes, $\chi_1$ and $\chi_2$, with the PCA and FLD projection directions.]


Computing the Fisher Projection Matrix

• The $w_i$ are orthonormal

• There are at most c-1 non-zero generalized eigenvalues, so m ≤ c-1

• Can be computed with eig in Matlab
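
As a rough NumPy counterpart to the Matlab `eig` call, the sketch below solves the generalized eigenproblem S_B w = λ S_W w by forming S_W^{-1} S_B. This assumes S_W is invertible, which often fails for raw images where the number of samples is below the dimension; the function name and toy data are hypothetical.

```python
import numpy as np

def fisher_projection(S_B, S_W, m):
    """Top-m generalized eigenpairs of S_B w = lambda S_W w.

    Returns (eigenvalues, W) with W of shape m x d, one eigenvector per row.
    Assumes S_W is invertible.
    """
    eigvals, eigvecs = np.linalg.eig(np.linalg.solve(S_W, S_B))
    order = np.argsort(eigvals.real)[::-1]        # largest eigenvalue first
    return eigvals.real[order][:m], eigvecs.real[:, order[:m]].T

# Two classes (c = 2) in R^3, separated along the first axis.
rng = np.random.default_rng(2)
X1 = rng.standard_normal((20, 3)) + np.array([3.0, 0.0, 0.0])
X2 = rng.standard_normal((20, 3))
mu1, mu2 = X1.mean(axis=0), X2.mean(axis=0)
mu = np.vstack([X1, X2]).mean(axis=0)
S_W = (X1 - mu1).T @ (X1 - mu1) + (X2 - mu2).T @ (X2 - mu2)
S_B = 20 * np.outer(mu1 - mu, mu1 - mu) + 20 * np.outer(mu2 - mu, mu2 - mu)

all_eigvals, _ = fisher_projection(S_B, S_W, m=3)
eigvals, W = fisher_projection(S_B, S_W, m=1)
num_nonzero = int(np.sum(all_eigvals > 1e-8 * all_eigvals[0]))
print(num_nonzero, W)
```

With two classes, only one eigenvalue is numerically non-zero, in line with the c-1 bound.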


Recognition


Recognition in Cluttered Scenes

Interest Points + Feature Descriptors + Relations


Example

Training examples

Test image


Matching using Local Image features

Simple approach

• Detect corners in image (e.g. Harris corner detector).

• Represent neighborhood of corner by a feature vector produced by Gabor Filters, K-jets, SIFT features, etc.

• Modeling: Given a training image of an object w/o clutter, detect corners, compute feature descriptors, store these.

• Recognition time: Given test image with possible clutter, detect corners and compute features. Find models with same feature descriptors (hashing) and vote.
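
The hashing-and-voting step might look like the sketch below. It assumes descriptors have already been computed by some detector and descriptor pipeline; the grid quantization used as a hash key, the function names, and the two-dimensional toy descriptors are all hypothetical simplifications.

```python
from collections import Counter, defaultdict

def quantize(descriptor, step=0.25):
    """Hash key: the descriptor rounded to a coarse grid (hypothetical scheme)."""
    return tuple(round(v / step) for v in descriptor)

def build_index(models):
    """Modeling: map each hash key to the set of models that contain it.

    models is a dict {model_name: list of descriptors}.
    """
    index = defaultdict(set)
    for name, descriptors in models.items():
        for desc in descriptors:
            index[quantize(desc)].add(name)
    return index

def recognize(index, test_descriptors):
    """Recognition: each test descriptor votes for the models sharing its key."""
    votes = Counter()
    for desc in test_descriptors:
        for name in index.get(quantize(desc), ()):
            votes[name] += 1
    return votes

models = {
    "cup": [(0.1, 0.9), (0.5, 0.5)],
    "book": [(0.9, 0.1), (0.3, 0.7)],
}
index = build_index(models)

# Two descriptors close to the "cup" model plus one clutter descriptor.
test_descriptors = [(0.12, 0.88), (0.52, 0.49), (2.0, 2.0)]
votes = recognize(index, test_descriptors)
print(votes.most_common(1))
```

Clutter descriptors simply fail to hit any key, so they add no votes.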


Figure from “Local grayvalue invariants for image retrieval,” by C. Schmid and R. Mohr, IEEE Trans. Pattern Analysis and Machine Intelligence, 1997. Copyright 1997, IEEE.

Employ spatial relations

Figure from “Local grayvalue invariants for image retrieval,” by C. Schmid and R. Mohr, IEEE Trans. Pattern Analysis and Machine Intelligence, 1997 copyright 1997, IEEE


Even without shading, shape reveals a lot


Motion

Introduction to Computer Vision

CSE 152

Lecture 19-b


Motion

Some problems of motion

  1. Correspondence: Where have elements of the image moved between image frames?
  2. Reconstruction: Given correspondence, what is the 3-D geometry of the scene?
  3. Ego Motion: How has the camera moved?
  4. Segmentation: What are the regions of the image corresponding to different moving objects?
  5. Tracking: Where have objects moved in the image? Related to correspondence and segmentation.

Variations:

  • Small motion (video),
  • Wide-baseline (multi-view)


Structure-from-Motion (SFM)

Goal: Take as input two or more images or video w/o any information on camera position/motion, and estimate camera position and 3-D structure of scene.

Two Approaches

  1. Discrete motion (wide baseline)
    1. Orthographic (affine) vs. Perspective
    2. Two view vs. Multi-view
    3. Calibrated vs. Uncalibrated
  2. Continuous (Infinitesimal) motion


Discrete Motion: Some Counting

Consider M images of N points. How many unknowns?

  1. Camera locations: Affix the coordinate system to the location of the first camera: (M-1)*6 unknowns
  2. 3-D Structure: 3*N unknowns
  3. Can only recover structure and motion up to scale. Why?

Total number of unknowns: (M-1)*6 + 3N - 1

Total number of measurements: 2MN

Solution is possible when (M-1)*6 + 3N - 1 ≤ 2MN

M = 2 ⇒ N ≥ 5

M = 3 ⇒ N ≥ 4
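
The counting argument can be checked with a few lines of Python. The function name is hypothetical; it simply searches for the smallest N that satisfies (M-1)*6 + 3N - 1 ≤ 2MN for a given number of views M ≥ 2.

```python
def min_points(M):
    """Smallest N with (M-1)*6 + 3*N - 1 <= 2*M*N, for M >= 2 views."""
    N = 1
    while 6 * (M - 1) + 3 * N - 1 > 2 * M * N:
        N += 1
    return N

for M in (2, 3):
    N = min_points(M)
    unknowns = 6 * (M - 1) + 3 * N - 1
    measurements = 2 * M * N
    print(M, N, unknowns, measurements)
```

For M = 2 the bound is tight (20 unknowns, 20 measurements at N = 5), while for M = 3 there is one spare measurement at N = 4.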