Lecture Slides on Optical Flow – Computer Vision | EECS 841, Study notes of Electrical and Electronics Engineering

Material Type: Notes; Professor: Potetz; Class: Computer Vision; Subject: Elect Engr & Computer Science; University: University of Kansas; Term: Fall 2008;


Uploaded on 03/19/2009

EECS 841: Computer Vision
Brian Potetz
Fall 2008
Lecture #2?: Optical Flow
Suggested Reading
Stereo:
  • Forsyth & Ponce, Chapter 11.
  • Scharstein & Szeliski, "A Taxonomy and Evaluation of Dense Two-Frame Stereo Correspondence Algorithms," 2002.
  • http://vision.middlebury.edu/stereo/
Motion:
  • Shapiro & Stockman, Chapter 9.
Vergence
[Figure: stereo geometry, labelled "field of view of stereo" and "one pixel uncertainty of scene point"]
  • Field of view decreases with increase in baseline and vergence
  • Accuracy increases with baseline and vergence
[Figure: Rectified Stereo Pair / Unrectified Stereo Pair]
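The baseline half of the accuracy claim can be illustrated with the standard pinhole relation for a rectified pair, Z = f·B/d. This relation is not stated on the slide and is used here as an assumption; the function names and the numeric values are hypothetical. Vergence is not modelled in this sketch.

```python
def depth_from_disparity(focal_px, baseline_m, disparity_px):
    """Depth Z = f * B / d for a rectified pair (pinhole model, assumed)."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px


def depth_uncertainty(focal_px, baseline_m, depth_m, disparity_err_px=1.0):
    """First-order depth error for a given disparity error:
    dZ ~ Z^2 / (f * B) * dd, obtained by differentiating Z = f * B / d."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_err_px


if __name__ == "__main__":
    f = 700.0  # focal length in pixels (illustrative value only)
    for baseline in (0.1, 0.2, 0.4):
        # One pixel of disparity uncertainty at a depth of 5 m.
        print(baseline, depth_uncertainty(f, baseline, depth_m=5.0))
```

For a fixed depth, doubling the baseline halves the depth error caused by one pixel of disparity uncertainty.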
Strength Of Stereo Cues
[Figure from Cutting & Vishton, 1995]
Basic Stereo Algorithm
For each epipolar line
  For each pixel in the left image
    • compare with every pixel on same epipolar line in right image
    • pick pixel with minimum match cost
Improvement: match windows
  • This should look familiar...
  • Correlation, Sum of Squared Differences (SSD), etc.
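A minimal sketch of the window-based version of this algorithm, assuming a rectified pair stored as lists of rows (so each row is an epipolar line), grayscale values, and a bounded search range `max_disp`. The slide searches the whole line; the bound, the function names, and the convention that left pixel x matches right pixel x + d are assumptions of this sketch.

```python
def ssd(left, right, xl, xr, y, half):
    """Sum of squared differences between two (2*half+1)^2 windows."""
    total = 0.0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            diff = left[y + dy][xl + dx] - right[y + dy][xr + dx]
            total += diff * diff
    return total


def block_match(left, right, max_disp, half=1):
    """Winner-take-all disparity map: for every left pixel, keep the offset d
    in [-max_disp, 0] whose window has the minimum SSD match cost."""
    h, w = len(left), len(left[0])
    disp = [[0] * w for _ in range(h)]
    for y in range(half, h - half):              # each epipolar line
        for x in range(half, w - half):          # each pixel in the left image
            best_cost, best_d = float("inf"), 0
            for d in range(0, -max_disp - 1, -1):  # candidates on the same line
                xr = x + d
                if xr - half < 0:                # window would leave the image
                    break
                cost = ssd(left, right, x, xr, y, half)
                if cost < best_cost:
                    best_cost, best_d = cost, d
            disp[y][x] = best_d
    return disp
```

Border pixels, where the window does not fit, are left at disparity 0 in this sketch.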
Size of Matching Window
[Figure: results for window sizes W = 3 and W = 20]
  • Smaller windows are sensitive to noise
  • Larger windows do not localize well
  • Remind you of edge detection?
  • Better results with adaptive window
D. Scharstein and R. Szeliski. Stereo matching with nonlinear diffusion. IJCV, 28(2):155-174, July 1998.
T. Kanade and M. Okutomi. A Stereo Matching Algorithm with an Adaptive Window: Theory and Experiment, 1991.

Dynamic Programming

N points have N! possible correspondences.
BUT, we might assume the ordering of points is the same in the left & right eyes.
Now, solve for the best matching of p4 given p3, etc.

[Figure 2 (diagram): intensities a–k along the left scanline against the right scanline, with the selected path marked M, L and R]

Figure 2: Stereo matching using dynamic programming. For each pair of corresponding scanlines, a minimizing path through the matrix of all pairwise matching costs is selected. Lowercase letters (a–k) symbolize the intensities along each scanline. Uppercase letters represent the selected path through the matrix. Matches are indicated by M, while partially occluded points (which have a fixed cost) are indicated by L and R, corresponding to points only visible in the left and right image, respectively. Usually, only a limited disparity range is considered, which is 0–4 in the figure (indicated by the non-shaded squares). Note that this diagram shows an "unskewed" x-d slice through the DSI.

[Excerpt from the page of Scharstein & Szeliski (2002) shown on the slide:]

[...inte]ger disparities (exceptions include continuous optimization techniques such as optic flow [11] or splines [112]). For applications such as robot navigation or people tracking, these may be perfectly adequate. However for image-based rendering, such quantized maps lead to very unappealing view synthesis results (the scene appears to be made up of many thin shearing layers). To remedy this situation, many algorithms apply a sub-pixel refinement stage after the initial discrete correspondence stage. (An alternative is to simply start with more discrete disparity levels.)

Sub-pixel disparity estimates can be computed in a variety of ways, including iterative gradient descent and fitting a curve to the matching costs at discrete disparity levels [93, 71, 122, 77, 60]. This provides an easy way to increase the resolution of a stereo algorithm with little additional computation. However, to work well, the intensities being matched must vary smoothly, and the regions over which these estimates are computed must be on the same (correct) surface.

Recently, some questions have been raised about the advisability of fitting correlation curves to integer-sampled matching costs [105]. This situation may even be worse when sampling-insensitive dissimilarity measures are used [12]. We investigate this issue in Section 6.4 below.

Besides sub-pixel computations, there are of course other ways of post-processing the computed disparities. Occluded areas can be detected using cross-checking (comparing left-to-right and right-to-left disparity maps) [29, 42]. A median filter can be applied to "clean up" spurious mismatches, and holes due to occlusion can be filled by surface fitting or by distributing neighboring disparity estimates [13, 96]. In our implementation we are not performing such clean-up steps since we want to measure the performance of the raw algorithm components.

3.5. Other methods

Not all dense two-frame stereo correspondence algorithms can be described in terms of our basic taxonomy and representations. Here we briefly mention some additional algorithms and representations that are not covered by our framework.

The algorithms described in this paper first enumerate all possible matches at all possible disparities, then select the best set of matches in some way. This is a useful approach when a large amount of ambiguity may exist in the computed disparities. An alternative approach is to use methods inspired by classic (infinitesimal) optic flow computation. Here, images are successively warped and motion estimates incrementally updated until a satisfactory registration is achieved. These techniques are most often implemented within a coarse-to-fine hierarchical refinement framework [90, 11, 8, 112].

A univalued representation of the disparity map is also not essential. Multi-valued representations, which can represent several depth values along each line of sight, have been extensively studied recently, especially for large multi-view data sets. Many of these techniques use a voxel-based representation to encode the reconstructed colors and spatial occupancies or opacities [113, 101, 67, 34, 33, 24]. Another way to represent a scene with more complexity is to use multiple layers, each of which can be represented by a plane plus residual parallax [5, 14, 117]. Finally, deformable surfaces of various kinds have also been used to perform 3D shape reconstruction from multiple images [120, 121, 43, 38].

3.6. Summary of methods

Table 1 gives a summary of some representative stereo matching algorithms and their corresponding taxonomy, i.e., the matching cost, aggregation, and optimization techniques used by each. The methods are grouped to contrast different matching costs (top), aggregation methods (middle), and optimization techniques (third section), while the last section lists some papers outside the framework. As can be seen from this table, quite a large subset of the possible algorithm design space has been explored over the years, albeit not very systematically.
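Two of the post-processing steps mentioned in the excerpt, fitting a curve to the matching costs and cross-checking, can be sketched as follows. This is an illustrative sketch, not the paper's implementation: the parabola fit is one common choice of curve, and the function names, the tolerance `tol`, and the disparity sign convention (left pixel x matches right pixel x + disp_lr) are assumptions.

```python
def subpixel_offset(c_minus, c_zero, c_plus):
    """Offset of the vertex of the parabola through the matching costs at
    disparities d-1, d, d+1, relative to the integer disparity d."""
    denom = c_minus - 2.0 * c_zero + c_plus
    if denom <= 0:  # flat or not a minimum: keep the integer estimate
        return 0.0
    return 0.5 * (c_minus - c_plus) / denom


def cross_check(disp_lr, disp_rl, tol=1):
    """Return a mask that is True where the left-to-right and right-to-left
    disparity maps agree. Assumed convention: left pixel x matches right
    pixel x + disp_lr[y][x]; right pixel x matches left pixel x + disp_rl[y][x]."""
    h, w = len(disp_lr), len(disp_lr[0])
    valid = [[False] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            xr = x + disp_lr[y][x]
            if 0 <= xr < w:
                # Consistent matches map back to the starting pixel.
                valid[y][x] = abs(disp_lr[y][x] + disp_rl[y][xr]) <= tol
    return valid
```

Pixels that fail the check are candidates for occlusion; filling them is a separate step.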

4. Implementation

We have developed a stand-alone, portable C++ implementation of several stereo algorithms. The implementation is closely tied to the taxonomy presented in Section 3 and currently includes window-based algorithms, diffusion algo[...]
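The scanline matching described under Dynamic Programming (Figure 2) can be sketched as a shortest-path problem over the matrix of pairwise costs. This is a minimal sketch under assumptions: squared intensity difference as the match cost, a fixed `occlusion_cost`, and no disparity range limit. The function name and parameters are hypothetical.

```python
def dp_scanline(left, right, occlusion_cost=1.0):
    """Align two scanlines under the ordering constraint.

    Returns (total cost, path), where the path uses the labels of Figure 2:
    M = match, L = visible only in the left image, R = only in the right."""
    n, m = len(left), len(right)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    move = [[None] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:  # match left[i-1] with right[j-1]
                c = cost[i - 1][j - 1] + (left[i - 1] - right[j - 1]) ** 2
                if c < cost[i][j]:
                    cost[i][j], move[i][j] = c, "M"
            if i > 0:            # left[i-1] is occluded in the right image
                c = cost[i - 1][j] + occlusion_cost
                if c < cost[i][j]:
                    cost[i][j], move[i][j] = c, "L"
            if j > 0:            # right[j-1] is occluded in the left image
                c = cost[i][j - 1] + occlusion_cost
                if c < cost[i][j]:
                    cost[i][j], move[i][j] = c, "R"
    # Backtrack from the end of both scanlines to recover the path.
    path = []
    i, j = n, m
    while i > 0 or j > 0:
        step = move[i][j]
        path.append(step)
        if step == "M":
            i, j = i - 1, j - 1
        elif step == "L":
            i -= 1
        else:
            j -= 1
    return cost[n][m], "".join(reversed(path))
```

For example, `dp_scanline([10, 20, 30, 40], [10, 30, 40])` treats the intensity 20 as visible only in the left image and matches the rest.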

Modern Stereo Algorithm

  • Often use color-based adaptive window

Adaptive metric of error between image patch L and R:

\[
\frac{\sum_{\Delta\vec{x}} w\big(L(\vec{x}), L(\vec{x}+\Delta\vec{x})\big)\, w\big(R(\vec{x}), R(\vec{x}+\Delta\vec{x})\big)\, d\big(L(\vec{x}+\Delta\vec{x}), R(\vec{x}+\Delta\vec{x})\big)}
     {\sum_{\Delta\vec{x}} w\big(L(\vec{x}), L(\vec{x}+\Delta\vec{x})\big)\, w\big(R(\vec{x}), R(\vec{x}+\Delta\vec{x})\big)},
\qquad \vec{x} = (x, y)
\]

  • w(L(x), L(x + Δx)): similarity in color of central pixel x and pixel (x + Δx)
  • d(L(x + Δx), R(x + Δx)): similarity in color of pixel in left & right image
  • Denominator: sum of all weights
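A minimal sketch of the adaptive window metric, for grayscale images stored as lists of rows. The slide does not define w or d, so the exponential weight `color_weight`, its parameter `gamma`, and the absolute difference used for d are assumptions chosen for illustration.

```python
import math


def color_weight(p, q, gamma=10.0):
    """Hypothetical similarity weight: 1 for equal intensities, decaying
    towards 0 as the intensities differ."""
    return math.exp(-abs(p - q) / gamma)


def adaptive_cost(L, R, xl, xr, y, half=1, gamma=10.0):
    """Weighted average of per-pixel differences over a window centred at
    (xl, y) in L and (xr, y) in R. Each neighbour is weighted by how similar
    it is to the central pixel in both images."""
    num = 0.0
    den = 0.0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            wl = color_weight(L[y][xl], L[y + dy][xl + dx], gamma)
            wr = color_weight(R[y][xr], R[y + dy][xr + dx], gamma)
            diff = abs(L[y + dy][xl + dx] - R[y + dy][xr + dx])
            num += wl * wr * diff
            den += wl * wr      # sum of all weights
    return num / den            # den >= 1: the centre pixel has weight 1
```

Neighbours across a strong color edge get a weight near zero, so they barely affect the cost. This is how the window adapts to object boundaries.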

Modern Stereo Algorithm

  • Often use color-based adaptive window
  • Clean up results using a Global Energy Function

Disparity, as a function of (x, y):

\[ d : \mathbb{R}^2 \to \mathbb{R} \]

i.e. Minimize:

\[ E(d) = E_{data}(d) + \lambda E_{smooth}(d) \]

\[ E_{data}(d) = \sum_{x,y} C\big(I_{left}(x, y),\; I_{right}(x + d(x, y),\, y)\big) \]

\[ E_{smooth}(d) = \sum_{x,y} \phi\big(d(x+1, y) - d(x, y)\big) + \sum_{x,y} \phi\big(d(x, y+1) - d(x, y)\big) \]
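The energy above can be evaluated directly for a candidate disparity map. This is a sketch under assumptions: integer disparities, squared difference for C, a quadratic φ as the default, and out-of-image matches skipped. These choices and the function name are for illustration only. The sketch only scores a disparity map and does not minimize the energy.

```python
def stereo_energy(left, right, disp, lam=1.0,
                  phi=lambda t: t * t,
                  cost=lambda a, b: (a - b) ** 2):
    """E(d) = E_data(d) + lam * E_smooth(d) for integer disparities."""
    h, w = len(left), len(left[0])
    e_data = 0.0
    for y in range(h):
        for x in range(w):
            xr = x + disp[y][x]
            if 0 <= xr < w:  # assumption: matches outside the image are skipped
                e_data += cost(left[y][x], right[y][xr])
    e_smooth = 0.0
    for y in range(h):
        for x in range(w):
            if x + 1 < w:    # horizontal neighbour term
                e_smooth += phi(disp[y][x + 1] - disp[y][x])
            if y + 1 < h:    # vertical neighbour term
                e_smooth += phi(disp[y + 1][x] - disp[y][x])
    return e_data + lam * e_smooth
```

Increasing `lam` favors smoother disparity maps over maps that fit the image data more closely.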

Smoothness Prior

\[ E_{smooth}(d) = \sum_{x,y} \phi\big(d(x+1, y) - d(x, y)\big) + \sum_{x,y} \phi\big(d(x, y+1) - d(x, y)\big) \]

φ(Δd):

[Figure 4: Quadratic ρ-function and ψ-function, plotted as Error and Influence against u-d]
[Figure 5: Huber's min-max estimator. ρ-function, ψ-function]
[Figure 6: Lorentzian. ρ-function, ψ-function]

Gaussian Approach / Lorentzian Approach

  • Some sharp discontinuities are OK (occlusion contours)
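The three penalties named in the figure captions can be written out with their derivatives (the influence functions). These are the textbook forms and may be parameterized differently from the plotted figures; the function names and the parameters `k` and `sigma` are assumptions.

```python
import math


def quadratic_rho(x):
    """Quadratic penalty: the error grows without bound."""
    return x * x


def quadratic_psi(x):
    """Influence (derivative) of the quadratic penalty."""
    return 2.0 * x


def huber_rho(x, k=1.0):
    """Huber penalty: quadratic near zero, linear beyond |x| = k."""
    a = abs(x)
    return 0.5 * x * x if a <= k else k * (a - 0.5 * k)


def huber_psi(x, k=1.0):
    """Influence of the Huber penalty: x clipped to [-k, k]."""
    return max(-k, min(k, x))


def lorentzian_rho(x, sigma=1.0):
    """Lorentzian penalty: grows only logarithmically for large |x|."""
    return math.log(1.0 + 0.5 * (x / sigma) ** 2)


def lorentzian_psi(x, sigma=1.0):
    """Influence of the Lorentzian penalty: falls back towards 0 for large |x|."""
    return 2.0 * x / (2.0 * sigma ** 2 + x * x)
```

A large disparity jump has unbounded influence under the quadratic penalty and almost none under the Lorentzian. That is why the Lorentzian tolerates sharp discontinuities such as occlusion contours.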

Motion Field

  • Image velocity of a point moving in the scene

Perspective projection:

\[ \mathbf{r}_i = f' \, \frac{\mathbf{r}_0}{\mathbf{r}_0 \cdot \mathbf{z}} \]

Scene point velocity: \( \mathbf{v}_0 \)

Image velocity:

\[ \mathbf{v}_i = \frac{\partial \mathbf{r}_i}{\partial t} = f' \, \frac{(\mathbf{r}_0 \cdot \mathbf{z})\,\mathbf{v}_0 - (\mathbf{v}_0 \cdot \mathbf{z})\,\mathbf{r}_0}{(\mathbf{r}_0 \cdot \mathbf{z})^2} \]

Optical Flow

  • Motion of brightness pattern in the image
  • Ideally Optical flow = Motion field

Optical Flow vs. Motion Field

  • Motion field exists but no optical flow
  • No motion field but shading changes

Problem Definition: Optical Flow

  • How to estimate pixel motion from image H to image I?
    • Find pixel correspondences
    • Given a pixel in H, look for nearby pixels of the same color in I
  • Key assumptions
    • color constancy: a point in H looks "the same" in image I
      • For grayscale images, this is brightness constancy
    • small motion: points do not move very far
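The motion field equations given earlier (perspective projection and image velocity) can be checked numerically. This sketch assumes 3-vectors as tuples, an optical axis z = (0, 0, 1), and f' = 1 by default; the function names are hypothetical.

```python
def dot(a, b):
    """Dot product of two vectors given as tuples."""
    return sum(x * y for x, y in zip(a, b))


def project(r0, f=1.0, z=(0.0, 0.0, 1.0)):
    """Perspective projection r_i = f' * r_0 / (r_0 . z)."""
    s = dot(r0, z)
    return tuple(f * c / s for c in r0)


def image_velocity(r0, v0, f=1.0, z=(0.0, 0.0, 1.0)):
    """v_i = f' * ((r_0 . z) v_0 - (v_0 . z) r_0) / (r_0 . z)^2."""
    rz = dot(r0, z)
    vz = dot(v0, z)
    return tuple(f * (rz * v - vz * r) / (rz * rz) for r, v in zip(r0, v0))


if __name__ == "__main__":
    r0 = (1.0, 2.0, 4.0)    # scene point (illustrative values)
    v0 = (0.5, -1.0, 2.0)   # scene point velocity
    print(project(r0), image_velocity(r0, v0))
```

The component of the image velocity along z is always zero, because the projected point stays on the image plane.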

Optical Flow Constraint Equation

Optical Flow: Velocities
Displacement:

  • Assume brightness of patch remains same in both images:
  • Assume small motion: (First order Taylor expansion of E)

[Equation images not preserved; surviving fragments: "≈" and "δt → 0"]
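The two assumptions lead to the standard optical flow constraint equation. The sketch below assumes E(x, y, t) denotes image brightness, (δx, δy) the displacement over a time step δt, and (u, v) the velocities. These symbol names are assumptions, since the slide equations did not survive.

```latex
% Brightness of the patch remains the same in both images:
\[ E(x + \delta x,\; y + \delta y,\; t + \delta t) = E(x, y, t) \]

% Small motion: first-order Taylor expansion of E
\[ E(x + \delta x,\; y + \delta y,\; t + \delta t) \approx
   E(x, y, t) + \frac{\partial E}{\partial x}\,\delta x
              + \frac{\partial E}{\partial y}\,\delta y
              + \frac{\partial E}{\partial t}\,\delta t \]

% Subtract the first equation, divide by \delta t, and let \delta t \to 0:
\[ \frac{\partial E}{\partial x}\,u + \frac{\partial E}{\partial y}\,v
   + \frac{\partial E}{\partial t} = 0,
   \qquad u = \frac{dx}{dt}, \quad v = \frac{dy}{dt} \]
```

This gives one equation in the two unknowns (u, v) at each pixel, so the constraint alone does not determine the flow.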