Linear Subspaces and Projection in Computer Vision, Slides of Computer Vision

Alliance University Computer Vision

These slides cover projecting high-dimensional images to lower-dimensional feature spaces in computer vision: linear subspaces, linear projection, distance to a linear subspace, and distance to an affine subspace. They also include an application to face detection using "distance to face space".

Typology: Slides

2012/2013

Uploaded on 04/24/2013

baishali 🇮🇳

CSE152, Spr 12 Intro Computer Vision

Recognition III

Introduction to Computer Vision

CSE 152

Lecture 19


Linear Subspaces & Linear Projection

• A $d$-pixel image $x \in \mathbb{R}^d$ can be projected to a low-dimensional feature space $y \in \mathbb{R}^k$ by

$$y = Wx$$

where $W$ is a $k$ by $d$ matrix.

• Recognition is performed in $\mathbb{R}^k$ using, for example, nearest neighbor.

• How do we choose a good $W$?

Example: Projecting from $\mathbb{R}^3$ to $\mathbb{R}^2$
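A minimal sketch of projection followed by nearest-neighbor recognition, assuming NumPy. The function names, the toy gallery, and the choice of W (dropping the third coordinate) are illustrative assumptions, not part of the lecture.

```python
import numpy as np

def project(W, x):
    """Project a d-vector x to k dimensions with a k-by-d matrix W: y = W x."""
    return W @ x

def nearest_neighbor(W, gallery, labels, x):
    """Return the label of the gallery image closest to x in feature space.

    gallery is an (n, d) array of training images, one image per row.
    """
    Y = gallery @ W.T                       # (n, k): row i is W @ gallery[i]
    y = project(W, x)                       # (k,)
    dists = np.linalg.norm(Y - y, axis=1)   # Euclidean distances in R^k
    return labels[int(np.argmin(dists))]

# Toy example: project from R^3 to R^2 by keeping the first two coordinates.
W = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
gallery = np.array([[0.0, 0.0, 5.0],
                    [4.0, 4.0, 0.0]])
labels = ["a", "b"]
query = np.array([3.0, 3.5, 9.0])
print(nearest_neighbor(W, gallery, labels, query))
```

The third coordinate of the query is ignored after projection, so the match is decided entirely in the 2-D feature space.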


Distance to Linear Subspace

• A $d$-pixel image $x \in \mathbb{R}^d$ can be projected to a low-dimensional feature space $y \in \mathbb{R}^k$ by

$$y = Wx$$

• From $y \in \mathbb{R}^k$, the reconstruction of the point in $\mathbb{R}^d$ is

$$W^T y = W^T W x$$

• The error of the reconstruction, or the distance from $x$ to the subspace spanned by the rows of $W$ (assumed orthonormal), is:

$$\|x - W^T W x\|$$
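The reconstruction error can be sketched in a few lines, assuming NumPy and a W whose rows are orthonormal. The function name and the toy values are illustrative.

```python
import numpy as np

def distance_to_subspace(W, x):
    """Distance from x to the subspace spanned by the orthonormal rows of W."""
    y = W @ x          # project to R^k
    x_hat = W.T @ y    # reconstruct in R^d
    return np.linalg.norm(x - x_hat)

# The subspace is the x1-x2 plane in R^3, so only the third coordinate is lost.
W = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
x = np.array([3.0, 4.0, 2.0])
print(distance_to_subspace(W, x))  # 2.0
```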


Distance to Affine Subspace

(i.e., Distance to Face Space)

• Represented by mean vector $\mu$ and basis images $W$

• A $d$-pixel image $x \in \mathbb{R}^d$ can be projected to a low-dimensional feature space $y \in \mathbb{R}^k$ by

$$y = W(x - \mu)$$

• From $y \in \mathbb{R}^k$, the reconstruction of the point in $\mathbb{R}^d$ is

$$W^T y + \mu = W^T W (x - \mu) + \mu$$

• The error of the reconstruction, or the distance from $x$ to the affine subspace, is:

$$\|x - W^T W (x - \mu) - \mu\| = \|(I - W^T W)(x - \mu)\|$$

[Figure: axes $x_1$, $x_2$, $x_3$ with the points $x$, $y$, and $\mu$ labeled]
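The affine case differs only by centering on the mean. A minimal sketch, assuming NumPy and orthonormal rows in W; the function name and the toy values are illustrative.

```python
import numpy as np

def distance_to_affine_subspace(W, mu, x):
    """Distance from x to the affine subspace {mu + W^T y}, rows of W orthonormal."""
    centered = x - mu
    y = W @ centered           # y = W (x - mu)
    x_hat = W.T @ y + mu       # reconstruction W^T y + mu
    return np.linalg.norm(x - x_hat)

W = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
mu = np.array([0.0, 0.0, 1.0])
x = np.array([3.0, 4.0, 3.0])
print(distance_to_affine_subspace(W, mu, x))  # 2.0
```

The same value is obtained from the right-hand form of the error, `np.linalg.norm((np.eye(3) - W.T @ W) @ (x - mu))`.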


Application 1:

Face detection using “distance to face space”

• Scan a window $\omega$ across the image, and classify the window as face/not face as follows:
  • Project the window to the subspace, and reconstruct as described earlier.
  • Compute the distance between $\omega$ and the reconstruction.
• Local minima of the distance over all image locations that are less than some threshold are taken as locations of faces.
• Repeat at different scales.
• Possibly normalize the window's intensity so that $|\omega| = 1$.
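The steps above can be sketched for a single scale as follows, assuming NumPy, a grayscale image stored as a 2-D array, and a face space (mu, W) learned elsewhere with orthonormal rows in W. The function names, the 3x3 neighborhood used for the local-minimum test, and the brute-force double loop are illustrative choices. Repeating at other scales would mean resizing the image and calling the same functions again.

```python
import numpy as np

def distance_map(image, W, mu, h, w, normalize=True):
    """Distance to face space for every h-by-w window of a 2-D image."""
    H, Wd = image.shape
    dist = np.full((H - h + 1, Wd - w + 1), np.inf)
    for r in range(H - h + 1):
        for c in range(Wd - w + 1):
            win = image[r:r + h, c:c + w].astype(float).ravel()
            if normalize:
                norm = np.linalg.norm(win)
                if norm == 0:
                    continue            # leave all-zero windows at infinity
                win = win / norm        # so that |omega| = 1
            centered = win - mu
            residual = centered - W.T @ (W @ centered)
            dist[r, c] = np.linalg.norm(residual)
    return dist

def detect(dist, threshold):
    """Window positions that are local minima of dist and below threshold."""
    hits = []
    rows, cols = dist.shape
    for r in range(rows):
        for c in range(cols):
            d = dist[r, c]
            if d >= threshold:
                continue
            neighborhood = dist[max(r - 1, 0):r + 2, max(c - 1, 0):c + 2]
            if d <= neighborhood.min():
                hits.append((r, c))
    return hits
```

A call such as `detect(distance_map(image, W, mu, 24, 24), threshold)` returns the top-left corners of candidate face windows; the window size and threshold here are placeholders that would need tuning.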


An important footnote:

We don't really implement PCA by constructing a covariance matrix!

Why?

1. How big is $\Sigma$?

• $d$ by $d$, where $d$ is the number of pixels in an image!!

2. You only need the first $k$ eigenvectors.


Singular Value Decomposition

Any $m$ by $n$ matrix $A$ may be factored such that

$$A = U \Sigma V^T, \qquad [m \times n] = [m \times m]\,[m \times n]\,[n \times n]$$

• $U$: $m$ by $m$, orthogonal matrix
  - columns of $U$ are the eigenvectors of $AA^T$
• $V$: $n$ by $n$, orthogonal matrix
  - columns of $V$ are the eigenvectors of $A^T A$
• $\Sigma$: $m$ by $n$, diagonal with non-negative entries $\sigma_1, \sigma_2, \ldots, \sigma_s$, with $s = \min(m, n)$, called the singular values
• The SVD algorithm produces sorted singular values: $\sigma_1 \ge \sigma_2 \ge \cdots \ge \sigma_s$

Important property
  - Singular values are the square roots of the eigenvalues of both $AA^T$ and $A^T A$, and the columns of $U$ are the corresponding eigenvectors of $AA^T$!!
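These properties are easy to check numerically. The sketch below assumes NumPy and uses a small random matrix; the sizes and the seed are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 5, 3
A = rng.standard_normal((m, n))

# Full SVD: U is m-by-m, Vt is n-by-n, s holds min(m, n) sorted singular values.
U, s, Vt = np.linalg.svd(A, full_matrices=True)

# Rebuild the m-by-n diagonal Sigma and confirm A = U Sigma V^T.
Sigma = np.zeros((m, n))
Sigma[:n, :n] = np.diag(s)
assert np.allclose(A, U @ Sigma @ Vt)

# Squared singular values equal the eigenvalues of A^T A (sorted descending).
eig_AtA = np.sort(np.linalg.eigvalsh(A.T @ A))[::-1]
assert np.allclose(s ** 2, eig_AtA)

# Each of the first n columns of U is an eigenvector of A A^T.
AAt = A @ A.T
for i in range(n):
    assert np.allclose(AAt @ U[:, i], (s[i] ** 2) * U[:, i])
```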


Performing PCA with SVD

• Singular values of $A$ are the square roots of the eigenvalues of both $AA^T$ and $A^T A$, and the columns of $U$ are the corresponding eigenvectors of $AA^T$.

• And

$$\sum_{i=1}^{n} a_i a_i^T = [\,a_1\ a_2\ \cdots\ a_n\,]\,[\,a_1\ a_2\ \cdots\ a_n\,]^T = AA^T$$

• Covariance matrix is:

$$\Sigma = \frac{1}{n} \sum_{i=1}^{n} (x_i - \mu)(x_i - \mu)^T$$

• So, ignoring $1/n$, subtract the mean image $\mu$ from each input image, create a $d \times n$ data matrix

$$D = [\,x_1 - \mu \mid x_2 - \mu \mid \cdots \mid x_n - \mu\,]$$

and perform thin SVD on the data matrix.
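A minimal sketch of this recipe, assuming NumPy and training images stacked as rows of an array. The function name and return values are illustrative. The d-by-d covariance matrix is never formed inside the function.

```python
import numpy as np

def pca_thin_svd(X, k):
    """PCA via thin SVD.

    X is an (n, d) array with one image per row. Returns the mean image mu (d,),
    a projection matrix W (k, d) with orthonormal rows, and the top k singular
    values of the centered data matrix.
    """
    mu = X.mean(axis=0)
    D = (X - mu).T                                     # d-by-n data matrix
    U, s, _ = np.linalg.svd(D, full_matrices=False)    # thin SVD: U is d-by-n
    W = U[:, :k].T                                     # top k eigenvectors of D D^T
    return mu, W, s[:k]

# Toy data: 6 "images" of 50 pixels each.
rng = np.random.default_rng(1)
X = rng.standard_normal((6, 50))
mu, W, s = pca_thin_svd(X, 3)
y = W @ (X[0] - mu)   # 3-dimensional feature vector for the first image
```

With n images of d pixels and n much smaller than d, the thin SVD works with a d-by-n matrix rather than a d-by-d one, which is the point of the footnote above.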


PCA & Fisher’s Linear Discriminant

• Between-class scatter

• Within-class scatter

• Total scatter

• Where

$c$ is the number of classes
$\mu_i$ is the mean of class $\chi_i$
$|\chi_i|$ is the number of samples of $\chi_i$

If the data points $x_i$ are projected by $y_i = W x_i$ and the scatter of the $x_i$ is $S$, then the scatter of the projected points $y_i$ is $W S W^T$.
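For reference, the standard definitions of the three scatter matrices, written with the symbols listed above, are sketched below. The overall mean $\mu$ of all samples and the sample index $k$ are notation assumed here rather than taken from the slide.

$$S_B = \sum_{i=1}^{c} |\chi_i|\,(\mu_i - \mu)(\mu_i - \mu)^T$$

$$S_W = \sum_{i=1}^{c} \sum_{x_k \in \chi_i} (x_k - \mu_i)(x_k - \mu_i)^T$$

$$S_T = \sum_{k} (x_k - \mu)(x_k - \mu)^T = S_B + S_W$$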


PCA & Fisher’s Linear Discriminant

• PCA (Eigenfaces)

Maximizes projected total scatter

• Fisher’s Linear Discriminant

Maximizes ratio of projected between-class to projected within-class scatter

[Figure: two classes $\chi_1$ and $\chi_2$, with the PCA and FLD projection directions labeled]
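The two criteria can be written compactly as follows. This is a sketch under the convention $y = Wx$ with $W$ a $k$ by $d$ matrix, so that a scatter matrix $S$ becomes $W S W^T$ after projection; $|\cdot|$ denotes the determinant, $S_T$, $S_B$, and $S_W$ are the total, between-class, and within-class scatter matrices, and the PCA maximization is taken over matrices $W$ with orthonormal rows.

$$W_{\mathrm{PCA}} = \arg\max_{W} \left| W S_T W^T \right|$$

$$W_{\mathrm{FLD}} = \arg\max_{W} \frac{\left| W S_B W^T \right|}{\left| W S_W W^T \right|}$$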


Computing the Fisher Projection Matrix

• The $w_i$ are orthonormal

• There are at most $c - 1$ non-zero generalized eigenvalues, so $m \le c - 1$

• Can be computed with eig in Matlab
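A NumPy counterpart to the Matlab `eig` route might look like the sketch below. It assumes the standard scatter definitions, assumes the within-class scatter S_W is invertible, and solves the generalized problem S_B w = λ S_W w by taking ordinary eigenvectors of S_W⁻¹ S_B. The function names are illustrative, and the returned rows are not re-orthonormalized.

```python
import numpy as np

def scatter_matrices(X, labels):
    """Between-class and within-class scatter for rows of X with class labels."""
    labels = np.asarray(labels)
    mu = X.mean(axis=0)
    d = X.shape[1]
    S_B = np.zeros((d, d))
    S_W = np.zeros((d, d))
    for cls in np.unique(labels):
        Xc = X[labels == cls]
        mu_c = Xc.mean(axis=0)
        diff = (mu_c - mu)[:, None]
        S_B += len(Xc) * (diff @ diff.T)
        centered = Xc - mu_c
        S_W += centered.T @ centered
    return S_B, S_W

def fisher_projection(X, labels, m):
    """Rows of the returned W are the top m generalized eigenvectors."""
    S_B, S_W = scatter_matrices(X, labels)
    # Assumes S_W is invertible; with images this usually needs PCA first.
    evals, evecs = np.linalg.eig(np.linalg.solve(S_W, S_B))
    order = np.argsort(evals.real)[::-1][:m]
    return evecs[:, order].real.T, evals.real[order]

# Two classes separated along the first axis, so c - 1 = 1 useful direction.
X = np.array([[0.0, 0.0], [0.0, 2.0], [1.0, 1.0], [-1.0, 1.0],
              [5.0, 0.0], [5.0, 2.0], [6.0, 1.0], [4.0, 1.0]])
labels = [0, 0, 0, 0, 1, 1, 1, 1]
W_fld, lam = fisher_projection(X, labels, 1)
print(W_fld, lam)
```

For image data the number of pixels typically exceeds the number of training samples, which makes S_W singular; a common workaround is to reduce dimension with PCA before this step.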


Recognition


Recognition in Cluttered Scenes

Interest Points + Feature Descriptors + Relations


Example

[Figure: training examples and a test image]


Matching using Local Image features

Simple approach

• Detect corners in the image (e.g. Harris corner detector).

• Represent the neighborhood of each corner by a feature vector produced by Gabor filters, K-jets, SIFT features, etc.

• Modeling: Given a training image of an object without clutter, detect corners, compute feature descriptors, and store these.

• Recognition time: Given a test image with possible clutter, detect corners and compute features. Find models with the same feature descriptors (hashing) and vote.
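The store-then-vote part of this approach can be sketched as below, assuming corner detection and descriptor computation have already produced a list of descriptor vectors per image. The quantization step used as a hash key, the function names, and the toy descriptors are all hypothetical choices for illustration.

```python
from collections import Counter, defaultdict

import numpy as np

def descriptor_key(desc, step=0.25):
    """Quantize a descriptor so that nearby descriptors share a hash key."""
    return tuple(int(v) for v in np.round(np.asarray(desc) / step))

def build_index(models):
    """Modeling: map each quantized descriptor to the models that contain it.

    models is a dict from model name to a list of descriptor vectors.
    """
    index = defaultdict(set)
    for name, descs in models.items():
        for desc in descs:
            index[descriptor_key(desc)].add(name)
    return index

def recognize(index, test_descs):
    """Recognition: each test descriptor votes for every model it hashes to."""
    votes = Counter()
    for desc in test_descs:
        for name in index.get(descriptor_key(desc), ()):
            votes[name] += 1
    return votes.most_common()

models = {"cup": [[0.0, 1.0], [1.0, 1.0]], "car": [[2.0, 2.0]]}
index = build_index(models)
print(recognize(index, [[0.02, 0.98], [1.01, 1.0], [5.0, 5.0]]))
```

Descriptors that fall just either side of a quantization boundary get different keys, so a practical system would probe neighboring bins or use a nearest-neighbor search instead. Clutter features, such as the last test descriptor here, simply cast no vote.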


Figure from “Local grayvalue invariants for image retrieval,” by C. Schmid and R. Mohr, IEEE Trans. Pattern Analysis and Machine Intelligence, 1997. Copyright 1997, IEEE.

Employ spatial relations


Even without shading, shape reveals a lot


Motion

Introduction to Computer Vision

CSE 152

Lecture 19-b


Motion

Some problems of motion

Correspondence: Where have elements of the image moved between image frames?
Reconstruction: Given correspondence, what is the 3-D geometry of the scene?
Ego motion: How has the camera moved?
Segmentation: Which regions of the image correspond to different moving objects?
Tracking: Where have objects moved in the image? Related to correspondence and segmentation.

Variations:

Small motion (video),
Wide-baseline (multi-view)


Structure-from-Motion (SFM)

Goal: Take as input two or more images or video without any information on camera position/motion, and estimate camera position and 3-D structure of the scene.

Two Approaches

1. Discrete motion (wide baseline)
   1. Orthographic (affine) vs. Perspective
   2. Two view vs. Multi-view
   3. Calibrated vs. Uncalibrated
2. Continuous (Infinitesimal) motion


Discrete Motion: Some Counting

Consider M images of N points, how many unknowns?

Camera locations: Affix the coordinate system to the location of the first camera: (M-1)*6 unknowns
3-D Structure: 3*N Unknowns
Can only recover structure and motion up to scale. Why?
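The counting can be made concrete with a small helper. The unknown counts follow the list above; removing one unknown for the overall scale, and counting two measurements (image x and y) per point per image, are assumptions that hold for calibrated perspective cameras with every point visible in every image. The function name is illustrative.

```python
def sfm_counts(M, N):
    """Unknowns and measurements for M images of N points."""
    camera_unknowns = 6 * (M - 1)      # rotation + translation of cameras 2..M
    structure_unknowns = 3 * N         # one 3-D position per point
    unknowns = camera_unknowns + structure_unknowns - 1   # minus global scale
    measurements = 2 * M * N           # (x, y) of each point in each image
    return unknowns, measurements

# Two views of five points: the counts balance exactly.
print(sfm_counts(2, 5))  # (20, 20)
```

The scale is lost because multiplying every 3-D point and every camera translation by the same factor leaves all image projections unchanged.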
