Stereo Matching and Optical Flow Estimation Techniques - Prof. Brian Potetz | Study notes Electrical and Electronics Engineering

EECS [course number illegible]: Computer Vision

Brian Potetz

Fall [year illegible]

Lecture [number illegible]: Optical Flow

Suggested Reading

Motion:

• Shapiro & Stockman, Chapter [number illegible].

Dynamic Programming

N points have N! possible correspondences.

But we might assume that the ordering of points is the same in the left & right eyes (the ordering constraint).

Now, solve for the best matching of p4 given p3, etc.

[Figure 2 diagram: matrix of matching costs between the left scanline and the right scanline, with path cells marked L, R, and M]

Figure 2: Stereo matching using dynamic programming. For each pair of corresponding scanlines, a minimizing path through the matrix of all pairwise matching costs is selected. Lowercase letters (a–k) symbolize the intensities along each scanline. Uppercase letters represent the selected path through the matrix. Matches are indicated by M, while partially occluded points (which have a fixed cost) are indicated by L and R, corresponding to points only visible in the left and right image, respectively. Usually, only a limited disparity range is considered, which is 0–4 in the figure (indicated by the non-shaded squares). Note that this diagram shows an “unskewed” x-d slice through the DSI.
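The path selection described in the caption can be sketched as a small dynamic program over one pair of scanlines. This is a minimal illustration, not the implementation from the paper or the lecture: the function name `match_scanlines`, the absolute-difference matching cost, and the `occlusion_cost` parameter are assumptions, and the limited disparity range mentioned in the caption is not enforced.

```python
def match_scanlines(left, right, occlusion_cost=1.0):
    """Align two scanlines under the ordering constraint.

    Returns (total_cost, path), where path is a string over
    'M' (match), 'L' (visible only in left), 'R' (visible only in right).
    The absolute intensity difference used for matches is an assumed cost.
    """
    n, m = len(left), len(right)
    inf = float("inf")
    # cost[i][j] = cheapest alignment of left[:i] with right[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    move = [[""] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if i > 0 and j > 0:  # match left[i-1] with right[j-1]
                c = cost[i - 1][j - 1] + abs(left[i - 1] - right[j - 1])
                if c < cost[i][j]:
                    cost[i][j], move[i][j] = c, "M"
            if i > 0:  # left[i-1] is occluded in the right image
                c = cost[i - 1][j] + occlusion_cost
                if c < cost[i][j]:
                    cost[i][j], move[i][j] = c, "L"
            if j > 0:  # right[j-1] is occluded in the left image
                c = cost[i][j - 1] + occlusion_cost
                if c < cost[i][j]:
                    cost[i][j], move[i][j] = c, "R"
    # Backtrack from the bottom-right corner to recover the path.
    path = []
    i, j = n, m
    while i > 0 or j > 0:
        step = move[i][j]
        path.append(step)
        if step == "M":
            i, j = i - 1, j - 1
        elif step == "L":
            i -= 1
        else:
            j -= 1
    return cost[n][m], "".join(reversed(path))


total, path = match_scanlines([10, 20, 30, 40], [10, 30, 40], occlusion_cost=5.0)
```

Because each cell depends only on its three predecessors, the search costs O(NM) per scanline pair instead of enumerating all orderings.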

ger disparities (exceptions include continuous optimization techniques such as optic flow [11] or splines [112]). For applications such as robot navigation or people tracking, these may be perfectly adequate. However, for image-based rendering, such quantized maps lead to very unappealing view synthesis results (the scene appears to be made up of many thin shearing layers). To remedy this situation, many algorithms apply a sub-pixel refinement stage after the initial discrete correspondence stage. (An alternative is to simply start with more discrete disparity levels.)

Sub-pixel disparity estimates can be computed in a variety of ways, including iterative gradient descent and fitting a curve to the matching costs at discrete disparity levels [93, 71, 122, 77, 60]. This provides an easy way to increase the resolution of a stereo algorithm with little additional computation. However, to work well, the intensities being matched must vary smoothly, and the regions over which these estimates are computed must be on the same (correct) surface.
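One common form of the curve fitting mentioned above is a parabola through the matching costs at the best integer disparity and its two neighbors. The sketch below assumes that variant; the function name `subpixel_disparity` and the fallback behavior at the borders are illustrative choices, not taken from the cited papers.

```python
def subpixel_disparity(costs, d):
    """Refine integer disparity d using a parabola through costs[d-1], costs[d], costs[d+1].

    costs is a list of matching costs indexed by integer disparity.
    Returns d unchanged (as a float) when no refinement is possible.
    """
    if d <= 0 or d >= len(costs) - 1:
        return float(d)  # no neighbor on one side
    c0, c1, c2 = costs[d - 1], costs[d], costs[d + 1]
    denom = c0 - 2.0 * c1 + c2
    if denom <= 0:
        return float(d)  # not a strict minimum, parabola has no valley
    # Vertex of the parabola through (-1, c0), (0, c1), (1, c2).
    return d + 0.5 * (c0 - c2) / denom


# Costs sampled from (x - 2.3)^2 at x = 0..4; the true minimum is at 2.3.
costs = [(x - 2.3) ** 2 for x in range(5)]
refined = subpixel_disparity(costs, 2)
```

The refinement costs only a few arithmetic operations per pixel, which matches the "little additional computation" remark, but it inherits the smoothness caveat stated in the paragraph.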

Recently, some questions have been raised about the advisability of fitting correlation curves to integer-sampled matching costs [105]. This situation may even be worse when sampling-insensitive dissimilarity measures are used [12]. We investigate this issue in Section 6.4 below.

Besides sub-pixel computations, there are of course other ways of post-processing the computed disparities. Occluded areas can be detected using cross-checking (comparing left-to-right and right-to-left disparity maps) [29, 42]. A median filter can be applied to “clean up” spurious mismatches, and holes due to occlusion can be filled by surface fitting or by distributing neighboring disparity estimates [13, 96]. In our implementation we are not performing such clean-up steps since we want to measure the performance of the raw algorithm components.
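The cross-checking and median-filter steps described above can be illustrated on a single scanline. This is a hedged sketch: the names `cross_check` and `median3` are hypothetical, and the sign convention is an assumption (a left pixel x maps to right pixel x + d_left[x], and a right pixel xr maps back to left pixel xr + d_right[xr], so consistent pairs sum to zero).

```python
def cross_check(d_left, d_right, tol=0):
    """Keep left disparities that agree with the right-to-left map; others become None."""
    checked = []
    for x, d in enumerate(d_left):
        xr = x + d  # matching pixel in the right scanline (assumed convention)
        if 0 <= xr < len(d_right) and abs(d + d_right[xr]) <= tol:
            checked.append(d)
        else:
            checked.append(None)  # likely occluded or mismatched
    return checked


def median3(values):
    """3-tap median filter; borders and windows containing None are left unchanged."""
    out = list(values)
    for i in range(1, len(values) - 1):
        window = values[i - 1:i + 2]
        if None not in window:
            out[i] = sorted(window)[1]
    return out


checked = cross_check([2, 2, 2, 0], [0, 0, -2, -2])
smoothed = median3([1, 1, 9, 1, 1])
```

Pixels marked `None` are the holes that the paragraph says can later be filled by surface fitting or by distributing neighboring estimates.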

3.5. Other methods

Not all dense two-frame stereo correspondence algorithms can be described in terms of our basic taxonomy and representations. Here we briefly mention some additional algorithms and representations that are not covered by our framework.

The algorithms described in this paper first enumerate all possible matches at all possible disparities, then select the best set of matches in some way. This is a useful approach when a large amount of ambiguity may exist in the computed disparities. An alternative approach is to use methods inspired by classic (infinitesimal) optic flow computation. Here, images are successively warped and motion estimates incrementally updated until a satisfactory registration is achieved. These techniques are most often implemented within a coarse-to-fine hierarchical refinement framework [90, 11, 8, 112].

A univalued representation of the disparity map is also not essential. Multi-valued representations, which can represent several depth values along each line of sight, have been extensively studied recently, especially for large multi-view data sets. Many of these techniques use a voxel-based representation to encode the reconstructed colors and spatial occupancies or opacities [113, 101, 67, 34, 33, 24]. Another way to represent a scene with more complexity is to use multiple layers, each of which can be represented by a plane plus residual parallax [5, 14, 117]. Finally, deformable surfaces of various kinds have also been used to perform 3D shape reconstruction from multiple images [120, 121, 43, 38].

3.6. Summary of methods

Table 1 gives a summary of some representative stereo matching algorithms and their corresponding taxonomy, i.e., the matching cost, aggregation, and optimization techniques used by each. The methods are grouped to contrast different matching costs (top), aggregation methods (middle), and optimization techniques (third section), while the last section lists some papers outside the framework. As can be seen from this table, quite a large subset of the possible algorithm design space has been explored over the years, albeit not very systematically.

4. Implementation

We have developed a stand-alone, portable C++ implementation of several stereo algorithms. The implementation is closely tied to the taxonomy presented in Section 3 and currently includes window-based algorithms, diffusion algo-

Modern Stereo Algorithms

• Often use color-based adaptive windows

Adaptive metric of error between image patch L and R:

$$\frac{\sum_{\Delta\vec{x}} w\big(L(\vec{x}), L(\vec{x}+\Delta\vec{x})\big)\, w\big(R(\vec{x}), R(\vec{x}+\Delta\vec{x})\big)\, d\big(L(\vec{x}+\Delta\vec{x}), R(\vec{x}+\Delta\vec{x})\big)}{\sum_{\Delta\vec{x}} w\big(L(\vec{x}), L(\vec{x}+\Delta\vec{x})\big)\, w\big(R(\vec{x}), R(\vec{x}+\Delta\vec{x})\big)}, \qquad \vec{x} = (x, y)$$

• w(·, ·): similarity in color of the central pixel x and pixel (x + Δx)

• d(L(x + Δx), R(x + Δx)): similarity in color of the pixel in the left & right images

• Denominator: sum of all weights
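The weighted patch error on this slide can be sketched for a single pair of pixels on one scanline. The slide does not define the weight function, so the exponential falloff in `weight`, the `gamma` parameter, the use of grayscale intensities in place of color, and the absolute-difference pixel error are all assumptions made for illustration.

```python
import math


def weight(center, neighbor, gamma=10.0):
    """Assumed similarity weight: 1 for identical intensities, decaying toward 0."""
    return math.exp(-abs(center - neighbor) / gamma)


def adaptive_cost(left, right, xl, xr, radius=2, gamma=10.0):
    """Weighted average pixel error between the patch around left[xl] and right[xr]."""
    num = 0.0
    den = 0.0
    for dx in range(-radius, radius + 1):
        il, ir = xl + dx, xr + dx
        if not (0 <= il < len(left) and 0 <= ir < len(right)):
            continue  # skip offsets that fall outside either scanline
        # Product of the left-image weight and the right-image weight.
        w = weight(left[xl], left[il], gamma) * weight(right[xr], right[ir], gamma)
        num += w * abs(left[il] - right[ir])  # assumed per-pixel error
        den += w  # sum of all weights
    return num / den


# The bright pixel at index 2 of the left patch differs from the center,
# so it gets almost no weight and barely affects the cost.
cost = adaptive_cost([10, 10, 200], [10, 10, 10], xl=1, xr=1, radius=1)
```

In this example an unweighted window would report a mean error of about 63, while the adaptive weights reduce the influence of the dissimilar neighbor to nearly zero, which is the intended effect near depth edges.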

Modern Stereo Algorithms

• Often use color-based adaptive windows

• Clean up results using a Global Energy Function

Disparity, as a function of (x, y):

$$d : \mathbb{R}^2 \to \mathbb{R}$$

i.e. Minimize:

$$E(d) = E_{\text{data}}(d) + \lambda E_{\text{smooth}}(d)$$

$$E_{\text{data}}(d) = \sum_{x,y} C\big(I_{\text{left}}(x, y),\, I_{\text{right}}(x + d(x, y),\, y)\big)$$

$$E_{\text{smooth}}(d) = \sum_{x,y} \phi\big(d(x+1, y) - d(x, y)\big) + \sum_{x,y} \phi\big(d(x, y+1) - d(x, y)\big)$$

Here φ(∆d) is the penalty applied to the disparity difference ∆d between neighboring pixels.
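To make the global energy concrete, the sketch below evaluates E(d) for a given integer disparity map. It only scores a candidate map and does not minimize the energy. The choices C = absolute difference and φ = absolute value, the skipping of pixels whose match falls outside the image, and the function name `energy` are assumptions; the slide leaves C and φ unspecified.

```python
def energy(left, right, d, lam=1.0, phi=abs):
    """Evaluate E(d) = E_data(d) + lam * E_smooth(d).

    left, right, and d are lists of rows with equal dimensions; d holds integers.
    """
    h, w = len(left), len(left[0])
    e_data = 0.0
    e_smooth = 0.0
    for y in range(h):
        for x in range(w):
            xr = x + d[y][x]  # I_right is sampled at x + d(x, y)
            if 0 <= xr < w:  # assumed: out-of-image matches add no data cost
                e_data += abs(left[y][x] - right[y][xr])
            if x + 1 < w:  # horizontal neighbor term
                e_smooth += phi(d[y][x + 1] - d[y][x])
            if y + 1 < h:  # vertical neighbor term
                e_smooth += phi(d[y + 1][x] - d[y][x])
    return e_data + lam * e_smooth


left = [[1, 2, 3], [4, 5, 6]]
right = [[9, 1, 2], [9, 4, 5]]  # left content shifted right by one pixel
e_true = energy(left, right, [[1, 1, 1], [1, 1, 1]])
e_zero = energy(left, right, [[0, 0, 0], [0, 0, 0]])
e_noisy = energy(left, right, [[1, 1, 1], [1, 0, 1]], lam=2.0)
```

The correct constant map scores lowest, while the map with one outlier pays both a data cost and a smoothness cost, which is how the energy "cleans up" isolated mismatches when minimized.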
