Understanding Depth Perception: Top-Down Analysis & High-Level Vision | Slides Linguistics

Top-Down Processing in Vision

Perception represents the immediate present, what is happening around us as conveyed by the pattern of light falling on our RETINA. And yet the current pattern of light alone cannot explain the stable, rich experience we have of our surroundings. The problem is that each retinal image could have arisen from any of a vast number of possible 3-D scenes. That we rapidly perceive only one interpretation tells us that we see far more than the immediate information falling on our retina. The highly accurate guesses and inferences that we make rapidly and unconsciously are based on a wealth of knowledge of the world and our expectations for the particular scene we are seeing. The influences of these sources beyond the images on the retina are collectively known as top-down influences.

Both top-down analyses and the complementary bottom-up processes use local cues to assign depth to the regions of an image. They differ in the manner in which they resolve the ambiguity of the local cues. A bottom-up analysis, part of MID-LEVEL VISION and SURFACE PERCEPTION, makes direct links between local geometrical features and depth. For example, whenever one object partially covers another, the visible contours of the more distant object terminate at the outer boundary of the nearer one, forming what are called T-junctions. When a T-junction is encountered in an image, this logic can be reversed: the stem of the T is designated a contour of a more distant, partially hidden object and the top of the T is assigned to the outer boundary of a nearer object.
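The T-junction rule can be written as a very small sketch. The `TJunction` record, its field names, and the region labels are hypothetical, introduced only for illustration; detecting junctions in a real image is assumed to have happened already.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TJunction:
    """A T-junction already found in an image (hypothetical representation)."""
    stem_region: str  # region whose contour forms the stem of the T
    top_region: str   # region whose outer boundary forms the top of the T


def depth_order_from_t_junctions(junctions):
    """Return (nearer, farther) region pairs, one per junction.

    Bottom-up rule from the text: the top of the T belongs to the nearer
    object, the stem to the more distant, partially hidden object.
    """
    return [(j.top_region, j.stem_region) for j in junctions]


# Illustrative input: a cup partially covering a book.
junctions = [TJunction(stem_region="book", top_region="cup")]
order = depth_order_from_t_junctions(junctions)
```

The rule uses only local geometry: nothing in it depends on knowing that the regions are a cup and a book.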

A top-down process, on the other hand, depends on the content of the image and its analysis by processes of HIGH-LEVEL VISION. Cues operate by suggesting objects (a nose contour might suggest a face, for example), and then stored information about that object’s structure can be applied to the assignment of depth in the image. Other features in the image are then examined to verify or reject the postulated object. The cues used for the initial selection of potential objects are not limited to the current images but include preceding images as well as nonvisual sources that affect our expectations for the scene. The sources of object knowledge that are called upon may be built up over both evolutionary and individual time scales.
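The suggest-then-verify cycle described above might be sketched as follows. The object models, the cue table, and the `min_support` threshold are all assumptions made for illustration; they are not taken from the entry or from any particular vision model.

```python
# Hypothetical stored knowledge: the features each object is expected to show.
OBJECT_MODELS = {
    "face": {"nose", "mouth", "eyes"},
    "car": {"wheels", "windshield", "headlights"},
}

# Hypothetical cue table: a single feature that suggests candidate objects.
CUE_TO_OBJECTS = {
    "nose": ["face"],
    "wheels": ["car"],
}


def interpret(image_features, min_support=0.6):
    """Postulate objects from cues, then verify them against other features.

    A candidate is accepted only if at least `min_support` of its expected
    features are present in the image; otherwise it is rejected.
    """
    accepted = []
    for feature in sorted(image_features):
        for candidate in CUE_TO_OBJECTS.get(feature, []):
            expected = OBJECT_MODELS[candidate]
            support = len(expected & image_features) / len(expected)
            if support >= min_support and candidate not in accepted:
                accepted.append(candidate)
    return accepted


verified = interpret({"nose", "mouth", "eyes"})  # cue confirmed by other features
rejected = interpret({"nose", "wheels"})         # cues present, verification fails
```

In this sketch a nose contour alone postulates a face, but the face is kept only when enough of its other expected features are found.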

Our guesses for appropriate internal models are best when we know what to expect in a scene. Upon opening a door to a classroom, for example, we expect to see desks and a black or white board. If these elements are present in the scene, they are rapidly interpreted. Incongruent elements are seen less reliably: Biederman (1981) reported more errors in identifying fire hydrants presented in kitchens, or sofas floating over city streets, than when the same objects were presented in their usual contexts. As Biederman’s example demonstrates, top-down analyses work because there is a great deal of semantic redundancy in the content of a scene: noses are expected to be seen along with mouths, cars with roads, classrooms with desks, and sofas with coffee tables. Moreover, noses, cars, and sofas have typical shapes, so that once a few distinctive features have implied the presence of, say, a car, the other expected features of a car can be verified or even just assumed to be present.
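One way to illustrate the effect of scene context is to treat expectations as prior probabilities and combine them with ambiguous image evidence using Bayes' rule. This Bayesian framing is an assumption of the sketch, not something the entry states, and every number below is invented for illustration.

```python
# Hypothetical expectations P(object | scene); the values are made up.
CONTEXT_PRIOR = {
    "kitchen": {"toaster": 0.9, "fire hydrant": 0.1},
    "street": {"toaster": 0.05, "fire hydrant": 0.95},
}


def posterior(likelihood, scene):
    """Combine image evidence with scene expectations (Bayes' rule).

    `likelihood` maps each object to P(image features | object).
    Returns the normalized P(object | image features, scene).
    """
    prior = CONTEXT_PRIOR[scene]
    unnormalized = {obj: likelihood.get(obj, 0.0) * p for obj, p in prior.items()}
    total = sum(unnormalized.values())
    return {obj: value / total for obj, value in unnormalized.items()}


# Ambiguous evidence: the image features fit both objects equally well.
evidence = {"toaster": 0.5, "fire hydrant": 0.5}

in_kitchen = posterior(evidence, "kitchen")
on_street = posterior(evidence, "street")
best_in_kitchen = max(in_kitchen, key=in_kitchen.get)
best_on_street = max(on_street, key=on_street.get)
```

With identical image evidence, the preferred interpretation follows the scene: an object that is incongruent with its context needs stronger evidence before it is identified.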

Textbook examples of top-down processing typically make use of images with two or more equally likely interpretations, which are sometimes referred to as ILLUSIONS. A hint as to which interpretation to see may then trigger one or the other, as in the examples shown here: (a) two faces, or one vase, or one face behind a vase (Costall 1980); (b) a man playing a saxophone seen in silhouette, or a woman’s face in sharp shadow (Shepard 1990); and (c) a sphere in a four-point setting or a white angel (Tse 1998). In these instances,

Figure 1.
