



Material Type: Notes; Professor: Potetz; Class: Computer Vision; Subject: Elect Engr & Computer Science; University: University of Kansas; Term: Fall 2008;
Typology: Study notes




[Figure: field of view of stereo, showing the one-pixel uncertainty of a scene point.]
For each epipolar line
    For each pixel in the left image

[Figure panels: W = 3; W = 20]
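The loop above can be sketched as window-based matching on a rectified pair, where each epipolar line is an image row. This is a minimal sketch under stated assumptions: the sum-of-squared-differences cost, the winner-take-all selection, and the names `window_cost` and `block_match` are illustrative choices, not taken from the notes. The window size is W = 2 * half + 1, and the right-image position is x + d, following the convention of the data term used later in these notes.

```python
def window_cost(left, right, x, y, d, half):
    """Sum of squared differences between the window centred at (x, y) in the
    left image and the window centred at (x + d, y) in the right image."""
    total = 0.0
    for dy in range(-half, half + 1):
        for dx in range(-half, half + 1):
            diff = left[y + dy][x + dx] - right[y + dy][x + d + dx]
            total += diff * diff
    return total


def block_match(left, right, disparities, half):
    """Winner-take-all disparity for every pixel whose window fits in both
    images. Border pixels (and pixels with no valid candidate) stay None."""
    height, width = len(left), len(left[0])
    result = [[None] * width for _ in range(height)]
    for y in range(half, height - half):        # for each epipolar line (row)
        for x in range(half, width - half):     # for each pixel in the left image
            best_d, best_cost = None, float("inf")
            for d in disparities:
                if not (half <= x + d < width - half):
                    continue                    # window would leave the right image
                cost = window_cost(left, right, x, y, d, half)
                if cost < best_cost:
                    best_d, best_cost = d, cost
            result[y][x] = best_d
    return result
```

A small window (such as W = 3) keeps detail but is noisy; a large one (such as W = 20) is smoother but blurs depth boundaries, which is presumably the trade-off the two panels illustrate.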
N points have N! possible correspondences. But we might assume that the ordering of points is the same in the left and right eyes. Now solve for the best matching of p4 given p3, and so on.

Figure 2: Stereo matching using dynamic programming. For each pair of corresponding scanlines, a minimizing path through the matrix of all pairwise matching costs is selected. Lowercase letters (a–k) symbolize the intensities along each scanline. Uppercase letters represent the selected path through the matrix. Matches are indicated by M, while partially occluded points (which have a fixed cost) are indicated by L and R, corresponding to points only visible in the left and right image, respectively. Usually, only a limited disparity range is considered, which is 0–4 in the figure (indicated by the non-shaded squares). Note that this diagram shows an “unskewed” x-d slice through the DSI.

… integer disparities (exceptions include continuous optimization techniques such as optic flow [11] or splines [112]). For applications such as robot navigation or people tracking, these may be perfectly adequate. However, for image-based rendering, such quantized maps lead to very unappealing view synthesis results (the scene appears to be made up of many thin shearing layers). To remedy this situation, many algorithms apply a sub-pixel refinement stage after the initial discrete correspondence stage. (An alternative is to simply start with more discrete disparity levels.)

Sub-pixel disparity estimates can be computed in a variety of ways, including iterative gradient descent and fitting a curve to the matching costs at discrete disparity levels [93, 71, 122, 77, 60]. This provides an easy way to increase the resolution of a stereo algorithm with little additional computation.
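As one possible instance of fitting a curve to the matching costs at discrete disparity levels, the sketch below fits a parabola through the cost at the best integer disparity and its two neighbours and returns the location of the parabola's minimum. The function name `subpixel_disparity` and the choice of a three-point parabola are assumptions made for illustration; the excerpt does not say which curve is fitted.

```python
def subpixel_disparity(costs, d_best):
    """Refine an integer disparity by a three-point parabola fit.

    costs  : matching costs indexed by integer disparity level (hypothetical layout)
    d_best : index of the minimum cost
    """
    if d_best <= 0 or d_best >= len(costs) - 1:
        return float(d_best)            # no neighbours on both sides
    c_minus, c_zero, c_plus = costs[d_best - 1], costs[d_best], costs[d_best + 1]
    curvature = c_minus - 2.0 * c_zero + c_plus
    if curvature <= 0.0:
        return float(d_best)            # flat or non-convex: keep the integer value
    # Minimum of the parabola through (-1, c_minus), (0, c_zero), (1, c_plus).
    return d_best + (c_minus - c_plus) / (2.0 * curvature)
```

When the costs really are quadratic in the disparity, the fit recovers the true minimum exactly; otherwise it is only an approximation.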
However, to work well, the intensities being matched must vary smoothly, and the regions over which these estimates are computed must be on the same (correct) surface.

Recently, some questions have been raised about the advisability of fitting correlation curves to integer-sampled matching costs [105]. This situation may even be worse when sampling-insensitive dissimilarity measures are used [12]. We investigate this issue in Section 6.4 below.

Besides sub-pixel computations, there are of course other ways of post-processing the computed disparities. Occluded areas can be detected using cross-checking (comparing left-to-right and right-to-left disparity maps) [29, 42]. A median filter can be applied to “clean up” spurious mismatches, and holes due to occlusion can be filled by surface fitting or by distributing neighboring disparity estimates [13, 96]. In our implementation we are not performing such clean-up steps since we want to measure the performance of the raw algorithm components.

3.5. Other methods

Not all dense two-frame stereo correspondence algorithms can be described in terms of our basic taxonomy and representations. Here we briefly mention some additional algorithms and representations that are not covered by our framework.

The algorithms described in this paper first enumerate all possible matches at all possible disparities, then select the best set of matches in some way. This is a useful approach when a large amount of ambiguity may exist in the computed disparities. An alternative approach is to use methods inspired by classic (infinitesimal) optic flow computation. Here, images are successively warped and motion estimates incrementally updated until a satisfactory registration is achieved. These techniques are most often implemented within a coarse-to-fine hierarchical refinement framework [90, 11, 8, 112].

A univalued representation of the disparity map is also not essential. Multi-valued representations, which can represent several depth values along each line of sight, have been extensively studied recently, especially for large multi-view data sets. Many of these techniques use a voxel-based representation to encode the reconstructed colors and spatial occupancies or opacities [113, 101, 67, 34, 33, 24]. Another way to represent a scene with more complexity is to use multiple layers, each of which can be represented by a plane plus residual parallax [5, 14, 117]. Finally, deformable surfaces of various kinds have also been used to perform 3D shape reconstruction from multiple images [120, 121, 43, 38].

3.6. Summary of methods

Table 1 gives a summary of some representative stereo matching algorithms and their corresponding taxonomy, i.e., the matching cost, aggregation, and optimization techniques used by each. The methods are grouped to contrast different matching costs (top), aggregation methods (middle), and optimization techniques (third section), while the last section lists some papers outside the framework. As can be seen from this table, quite a large subset of the possible algorithm design space has been explored over the years, albeit not very systematically.
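Returning to the scanline matching of Figure 2, the ordering assumption is what makes dynamic programming applicable: a path through the matrix of pairwise costs can only move forward in both scanlines. The sketch below is a minimal version under stated assumptions: the squared intensity difference as the match cost, a single fixed `occlusion_cost`, no disparity-range limit, and the name `dp_scanline` are all illustrative and not taken from the figure's source. The labels M, L and R follow the caption (match, visible only in the left image, visible only in the right image).

```python
def dp_scanline(left, right, occlusion_cost):
    """Return (total cost, path string) of the cheapest ordered matching
    between two scanlines of intensities."""
    n, m = len(left), len(right)
    cost = [[0.0] * (m + 1) for _ in range(n + 1)]
    move = [[None] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):                   # only left pixels consumed so far
        cost[i][0], move[i][0] = i * occlusion_cost, "L"
    for j in range(1, m + 1):                   # only right pixels consumed so far
        cost[0][j], move[0][j] = j * occlusion_cost, "R"
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = cost[i - 1][j - 1] + (left[i - 1] - right[j - 1]) ** 2
            skip_left = cost[i - 1][j] + occlusion_cost
            skip_right = cost[i][j - 1] + occlusion_cost
            best = min(match, skip_left, skip_right)
            cost[i][j] = best
            move[i][j] = "M" if best == match else ("L" if best == skip_left else "R")
    # Backtrack from the end of both scanlines to recover the path.
    path, i, j = [], n, m
    while i > 0 or j > 0:
        step = move[i][j]
        path.append(step)
        if step == "M":
            i, j = i - 1, j - 1
        elif step == "L":
            i -= 1
        else:
            j -= 1
    return cost[n][m], "".join(reversed(path))
```

For example, `dp_scanline([10, 20, 30, 40], [20, 30, 40, 50], 5)` marks the first left pixel and the last right pixel as occluded and matches the three pixels in between.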
We have developed a stand-alone, portable C++ implementation of several stereo algorithms. The implementation is closely tied to the taxonomy presented in Section 3 and currently includes window-based algorithms, diffusion algorithms, …
$$ d : \mathbb{R}^2 \to \mathbb{R} $$

$$ E(d) = E_{\text{data}}(d) + \lambda E_{\text{smooth}}(d) $$

$$ E_{\text{data}}(d) = \sum_{x,y} C\big(I_{\text{left}}(x, y),\; I_{\text{right}}(x + d(x, y),\, y)\big) $$

$$ E_{\text{smooth}} = \sum_{x,y} \phi\big(d(x + 1, y) - d(x, y)\big) + \sum_{x,y} \phi\big(d(x, y + 1) - d(x, y)\big) $$

where $\phi(\Delta d)$ is the penalty on the disparity difference $\Delta d$ between neighboring pixels.
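To make the energy E(d) = E_data(d) + λ E_smooth(d) concrete, the sketch below evaluates it for a given integer disparity map. Several details are assumptions for illustration only: the per-pixel cost C defaults to a squared intensity difference, positions x + d that fall outside the right image are clamped to the border, and the name `stereo_energy` is hypothetical. The penalty φ is passed in as a function so that different choices can be compared.

```python
def stereo_energy(left, right, disp, lam, phi, cost=lambda a, b: (a - b) ** 2):
    """Evaluate E(d) = E_data(d) + lam * E_smooth(d) for an integer disparity
    map `disp` (same size as the images, given as lists of rows)."""
    height, width = len(left), len(left[0])
    e_data = 0.0
    e_smooth = 0.0
    for y in range(height):
        for x in range(width):
            # Data term: compare I_left(x, y) with I_right(x + d(x, y), y).
            xr = min(max(x + disp[y][x], 0), width - 1)   # clamp (assumption)
            e_data += cost(left[y][x], right[y][xr])
            # Smoothness term: horizontal and vertical neighbour differences.
            if x + 1 < width:
                e_smooth += phi(disp[y][x + 1] - disp[y][x])
            if y + 1 < height:
                e_smooth += phi(disp[y + 1][x] - disp[y][x])
    return e_data + lam * e_smooth
```

Calling it with `phi=abs` gives a penalty that grows linearly with the disparity jump, while `phi=lambda t: t * t` gives the quadratic penalty.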
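$$ \frac{\sum_{\Delta\vec{x}} w\big(L(\vec{x}), L(\vec{x} + \Delta\vec{x})\big)\, w\big(R(\vec{x}), R(\vec{x} + \Delta\vec{x})\big)\, d\big(L(\vec{x} + \Delta\vec{x}), R(\vec{x} + \Delta\vec{x})\big)}{\sum_{\Delta\vec{x}} w\big(L(\vec{x}), L(\vec{x} + \Delta\vec{x})\big)\, w\big(R(\vec{x}), R(\vec{x} + \Delta\vec{x})\big)}, \qquad \vec{x} = (x, y) $$

w(·, ·): similarity in color of central pixel x and pixel (x + Δx)
d(·, ·): similarity in color of pixels in left & right image
Denominator: sum of all weights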
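The weighted average over window offsets Δx (a sum of w·w·d terms divided by the sum of the w·w weights) can be sketched for a single scanline as follows. The notes do not define w or d, so this sketch assumes an exponential fall-off `exp(-|a - b| / gamma)` for the weight and an absolute intensity difference for d; it also assumes that the right-image position `xr` has already been shifted by the candidate disparity. The names `weight`, `weighted_cost` and `gamma` are hypothetical.

```python
import math


def weight(a, b, gamma=10.0):
    """Support weight: close to 1 for similar intensities, near 0 otherwise
    (assumed form, not given in the notes)."""
    return math.exp(-abs(a - b) / gamma)


def weighted_cost(left_row, right_row, xl, xr, half, gamma=10.0):
    """Weighted average of per-pixel differences over offsets -half..half,
    comparing the window around xl (left) with the window around xr (right)."""
    numerator = 0.0
    denominator = 0.0
    for dx in range(-half, half + 1):
        il, ir = xl + dx, xr + dx
        if not (0 <= il < len(left_row) and 0 <= ir < len(right_row)):
            continue                               # offset leaves one of the rows
        w = (weight(left_row[xl], left_row[il], gamma)
             * weight(right_row[xr], right_row[ir], gamma))
        numerator += w * abs(left_row[il] - right_row[ir])
        denominator += w                           # sum of all weights
    return numerator / denominator
```

Neighbours whose color differs strongly from the central pixel in either image receive a small weight, so they contribute little to the aggregated cost.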
Disparity, as a function of (x, y): $d : \mathbb{R}^2 \to \mathbb{R}$
Smoothness penalty: $\phi(\Delta d)$
[Figure 4: Quadratic. (a) ρ-function (error) and (b) ψ-function (influence), plotted against u − d.]
[Figure 5: Huber’s min-max estimator. (a) ρ-function, (b) ψ-function.]
[Figure 6: Lorentzian. (a) ρ-function, (b) ψ-function.]
[Figure panels: Gaussian Approach; Lorentzian Approach]
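The three penalty families named in the figure captions can be compared numerically. The sketch below gives one common parameterization of each error function together with its derivative, the influence function; the exact forms and scale parameters used for the original plots are not given in the notes, so the formulas, the default values of `k` and `sigma`, and the function names are assumptions.

```python
import math


def quadratic(x):
    """Quadratic error: grows without bound."""
    return x * x


def quadratic_influence(x):
    return 2.0 * x


def huber(x, k=1.0):
    """Huber error: quadratic near zero, linear beyond |x| = k."""
    a = abs(x)
    return 0.5 * x * x if a <= k else k * a - 0.5 * k * k


def huber_influence(x, k=1.0):
    return x if abs(x) <= k else math.copysign(k, x)


def lorentzian(x, sigma=1.0):
    """Lorentzian error: grows only logarithmically for large |x|."""
    return math.log(1.0 + 0.5 * (x / sigma) ** 2)


def lorentzian_influence(x, sigma=1.0):
    return 2.0 * x / (2.0 * sigma ** 2 + x * x)
```

The influence of a large residual keeps growing for the quadratic, saturates for Huber, and falls back toward zero for the Lorentzian, which is why the Lorentzian tolerates occasional large disparity jumps.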
Some sharp discontinuities are OK (occlusion contours).
Motion Field

$$ \mathbf{r}_i = f' \, \frac{\mathbf{r}_0}{\mathbf{r}_0 \cdot \mathbf{z}} $$

$$ \mathbf{v}_i = \frac{\partial \mathbf{r}_i}{\partial t} = f' \, \frac{(\mathbf{r}_0 \cdot \mathbf{z})\, \mathbf{v}_0 - (\mathbf{v}_0 \cdot \mathbf{z})\, \mathbf{r}_0}{(\mathbf{r}_0 \cdot \mathbf{z})^2} $$
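The motion field equations can be checked numerically with a short sketch. It assumes that r_0 and v_0 are the 3D position and velocity of a scene point in camera coordinates, that z is the unit vector along the optical axis, and that f′ is the distance from the projection centre to the image plane; the function names `dot` and `motion_field` are hypothetical.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(p * q for p, q in zip(a, b))


def motion_field(r0, v0, f_prime, z=(0.0, 0.0, 1.0)):
    """Return (r_i, v_i): the image position and image velocity of a scene
    point at r0 moving with velocity v0."""
    rz = dot(r0, z)
    vz = dot(v0, z)
    r_i = tuple(f_prime * r / rz for r in r0)
    v_i = tuple(f_prime * (rz * v - vz * r) / (rz * rz) for r, v in zip(r0, v0))
    return r_i, v_i
```

For a point moving straight along the optical axis, the image velocity points toward or away from the image centre, and its component along z is zero because r_i stays in the image plane.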
Optical Flow Constraint Equation
The equation follows from a first-order approximation (≈) in the limit δt → 0.
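The equation itself did not survive in these notes, so the following is the standard derivation rather than a transcription of the slide. The symbols are assumptions: I(x, y, t) is the image brightness, (u, v) is the optical flow, and subscripts denote partial derivatives. It starts from the assumption that a moving point keeps its brightness over a short time δt.

```latex
% Brightness constancy over a short time step \delta t
I(x + u\,\delta t,\; y + v\,\delta t,\; t + \delta t) = I(x, y, t)

% First-order Taylor expansion of the left-hand side
I(x + u\,\delta t,\; y + v\,\delta t,\; t + \delta t)
  \approx I(x, y, t) + I_x\, u\, \delta t + I_y\, v\, \delta t + I_t\, \delta t

% Subtract I(x, y, t), divide by \delta t, and let \delta t \to 0
I_x\, u + I_y\, v + I_t = 0
```

This is one equation in the two unknowns u and v, so it constrains only the flow component along the brightness gradient.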