Understanding Depth and Object Size: Cues and Techniques, Lecture notes of Psychology

An overview of the cues and techniques the brain uses to perceive depth and object size from 2D retinal images. It covers oculomotor, monocular, and binocular cues, including convergence, accommodation, occlusion, relative height, and familiar size. The document also discusses the importance of atmospheric perspective, linear perspective, texture gradient, and motion parallax in depth perception.

Typology: Lecture notes

2021/2022

Uploaded on 09/12/2022

newfound
Chapter 7

Perceiving Depth and Size

In this chapter we will describe how we perceive depth and size. As we mentioned before, the problem of finding depth from 2D retinal images is under-determined. However, the human visual system achieves this with great ease in almost no time. The first question to ask is what information contained in the 2D image makes us perceive depth. This is called the cue approach. It focuses on identifying the elements of information in the 2D scene that are related to the depth of the scene. According to cue theory, we connect these cues to the actual depth through knowledge accumulated from prior experience. These connections therefore include several heuristics, which may mislead us in pathological cases, as we will see later in this chapter.

The depth cues can be divided into three categories.

  1. Oculomotor: Cues based on the ability to sense the position of our eyes and the tension in the eye muscles.
  2. Monocular: Cues that work with one eye.
  3. Binocular: Cues that depend on two frontal eyes.

Figure 7.1: From left: Convergence of the eyes when looking at nearby objects (1), the non-converged state when the eyes look far away (2), and relative height cues (3).


7.1 Oculomotor Cues

Oculomotor cues are due to convergence and accommodation. When looking at nearby objects, our eyes move inward. This is called convergence. When looking at faraway objects, our eyes move outward, as shown in Figure 7.1. Convergence is associated with accommodation. When the eyes move inward, the muscles that hold the lens tighten so that the eye can focus close by. When the eyes move outward, these muscles relax. We can sense both of these states, which gives us an estimate of the depth of the object. You can experience this by moving an object held at arm's length closer and closer to your eyes to feel the muscles tighten, and then moving it back again to feel them relax.
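The geometry behind convergence can be sketched numerically. The following is a minimal illustration, assuming a point straight ahead of the viewer and the eye separation of about 6 cm quoted later in these notes; the function name `vergence_angle_deg` and the sample distances are hypothetical and chosen only for this example.

```python
import math

def vergence_angle_deg(distance_m, eye_separation_m=0.06):
    """Angle (degrees) between the two lines of sight when both eyes
    fixate a point straight ahead at distance_m.

    Assumes a symmetric triangle: the eyes sit eye_separation_m apart
    and the fixated point lies on the midline between them.
    """
    half_base = eye_separation_m / 2.0
    return math.degrees(2.0 * math.atan(half_base / distance_m))

# Illustrative distances only: from very close to far away.
for d in (0.1, 0.25, 0.6, 2.0, 10.0):
    print(f"{d:5.2f} m -> {vergence_angle_deg(d):6.2f} deg")
```

Under these assumptions the angle shrinks quickly with distance, which is consistent with convergence being most informative for objects within roughly arm's reach.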

Figure 7.2: Left: Occlusion Cues, Middle: Contradicting Occlusion and Relative Height Cues, Right: Shadows resolving the contradiction.

7.2 Monocular Cues

Figure 7.3: Left: Relative size cues. Right: Familiar size cues.

Monocular cues are the ones that are obtained from the 2D image of only one eye. These include the following.

  1. Occlusion: When one object is fully or partially hidden, the hidden (occluded) object is perceived as farther away, behind the object that covers it (the occluding object). Figure 7.2 illustrates occlusion.
  2. Relative height: The second cue is that of relative height. Objects that are farther away have higher bases in the image, as shown in Figure 7.1. The three men in this figure are perceived at different depths and the

frequently. These movements cause motion parallax, in which nearby objects cross the retinal image plane faster than distant objects. This serves as an important depth cue. It is the same phenomenon that makes nearby objects seen from a train pass by in a blur while distant objects move only slightly. This is illustrated in Figure 7.5. When the eye moves to the right, the nearby point A moves from A1 to A2. The distance moved by the faraway point B, from B1 to B2, is much smaller. Motion parallax also leads to deletion and accretion of objects, as shown in Figure 7.5. As we move, parts of objects are revealed or occluded. The rate of this deletion and accretion provides information about the depth of the objects.
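The difference between the shifts of A and B can be illustrated with a small calculation. This is a rough sketch, assuming a point that starts straight ahead of the eye and an eye that moves sideways, perpendicular to the line of sight; the function name `angular_shift_deg` and all numbers are hypothetical values chosen for illustration.

```python
import math

def angular_shift_deg(eye_shift_m, distance_m):
    """Change in viewing direction (degrees) of a point that was straight
    ahead at distance_m after the eye moves sideways by eye_shift_m."""
    return math.degrees(math.atan(eye_shift_m / distance_m))

eye_shift = 0.10  # the eye moves 10 cm to the right (assumed value)
near_shift = angular_shift_deg(eye_shift, 0.5)   # nearby point, like A
far_shift = angular_shift_deg(eye_shift, 50.0)   # faraway point, like B

print(f"near point shifts by {near_shift:.2f} deg")
print(f"far point shifts by  {far_shift:.2f} deg")
```

For the same eye movement, the nearby point sweeps through a much larger angle than the faraway one, which is the relationship the figure depicts.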

One important point to note here is that monocular cues do not provide absolute information about depth, but relative depth with respect to other objects in the environment.

Figure 7.5: Left: Motion parallax. Right: Motion parallax causing deletion and accretion of objects.

7.3 Binocular Cues

Figure 7.6: Images from the two eyes. Though they look similar, a closer look reveals the differences. For example, notice how the distance between the helmet of the man in the foreground and the man over his shoulder differs between the two images.

Binocular cues depend on the images from both eyes. Our eyes are placed about 6 cm apart, so each gets a slightly different view of the objects that are visible to both. Since we have frontal eyes, the views from the two eyes overlap significantly, but the viewpoints differ. This difference is called binocular disparity and is converted to depth information. The impression of depth provided by disparity is called stereopsis. Images of the same scene from the different viewpoints of the two eyes are shown in Figure 7.6.

This insight led to the design of several systems that can generate depth perception. Such displays are called stereo displays. One popular approach was taken in cinema projection, where differently polarized light was used to project the images from two different viewpoints onto the same screen. The viewers were provided with glasses. The left lens allowed one direction of polarization and the right lens allowed the other. Thus, though the viewers were watching the same screen, the two eyes received two different images. The right image was blocked by the left lens and vice versa. This created a compelling sense of depth. Similar technologies were used in head-mounted displays, where headgear with two micro-screens, one in front of each eye, is mounted on the head of the user. The images of the scene are generated from two different viewpoints and shown on the two screens to create a compelling sense of depth.
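How the difference in viewpoint depends on depth can be sketched with simple geometry. The example below is an assumption-laden illustration rather than the definition used in these notes: it treats the disparity of a point as the difference between the vergence angle needed for that point and the vergence angle for the fixated point, with both points on the midline and the eyes 6 cm apart. The function names and distances are hypothetical.

```python
import math

EYE_SEPARATION_M = 0.06  # "about 6 cm", as stated in the text

def vergence_angle_deg(distance_m):
    """Angle (degrees) between the two lines of sight for a midline
    point at distance_m."""
    return math.degrees(2.0 * math.atan((EYE_SEPARATION_M / 2.0) / distance_m))

def disparity_deg(point_distance_m, fixation_distance_m):
    """Angular disparity of a midline point relative to the fixated point.

    Positive for points nearer than fixation, negative for points
    farther away, and zero at the fixation distance itself.
    """
    return vergence_angle_deg(point_distance_m) - vergence_angle_deg(fixation_distance_m)

fixation = 1.0  # the viewer fixates a point 1 m away (assumed value)
for d in (0.5, 0.9, 1.0, 1.1, 2.0):
    print(f"point at {d:.1f} m -> disparity {disparity_deg(d, fixation):+.3f} deg")
```

In this sketch the sign of the disparity tells whether a point is in front of or behind the fixated point, and its magnitude grows as the point moves away from the fixation distance.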

7.3.1 Binocular Disparity

Figure 7.7: Left: Corresponding points. Right: Different cues working together to create a vivid sensation of depth.

Figure 7.8: Left: The horopter. Middle: Points on the horopter are imaged at corresponding points of the two eyes. Right: The depth of the points not on the horopter can be estimated by their disparity.

Now we will see what exactly binocular disparity is. Corresponding retinal points are the locations on each retina that connect to the same place in the visual cortex. We can determine these points by locating the matching

Figure 7.9: Left: The experimental setup. Right: User response with changing depth and size of the test circle.

size was changed with depth to keep the retinal image the same. The right image of Figure 7.9 plots the perceived size of the test circle against its distance from the viewer. The slanted dashed line shows the actual change in size of the test circle, and the horizontal dashed line shows the size of the retinal image, which is constant. Line 1 shows the subject's perception when all the depth cues were present. It shows that although the retinal size stays the same, the subject can use depth cues to perceive a size that is close to reality. Line 2 shows the results when one of the subject's eyes was closed, i.e. when binocular cues were absent. Note that the accuracy decreases. Line 3 shows the perceived size when the subject was looking through a peephole so that all monocular cues were absent. Line 4 shows the response when the walls were covered with drapes so that there were no shading cues. This illustrates that as the number of depth cues decreases, our perception of size starts to fail and moves closer to the retinal size.

Another important aid to size perception is relative size. For example, if we see a photo of a canyon, it is difficult to tell how big it is. As soon as we put a human in the photo, we can easily perceive its size. This is because we know the size of a human, and this knowledge helps us perceive the size of the canyon.
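The relationship between the two dashed lines can be made concrete with visual-angle geometry. This is a minimal sketch using the standard relation between physical size, distance, and visual angle; it is not a reproduction of Equation 7.1, which does not appear in this excerpt. The 1-degree angle, the distances, and the function names are hypothetical values chosen for illustration.

```python
import math

def size_for_visual_angle(distance_m, visual_angle_deg):
    """Physical size (m) an object needs at distance_m in order to
    subtend visual_angle_deg at the eye."""
    return 2.0 * distance_m * math.tan(math.radians(visual_angle_deg) / 2.0)

def visual_angle_deg(size_m, distance_m):
    """Visual angle (degrees) subtended by an object of size_m at distance_m."""
    return math.degrees(2.0 * math.atan(size_m / (2.0 * distance_m)))

target_angle = 1.0  # keep the retinal image at 1 degree (assumed value)
for d in (3.0, 6.0, 12.0, 24.0):
    size = size_for_visual_angle(d, target_angle)
    check = visual_angle_deg(size, d)
    print(f"distance {d:5.1f} m -> size {size:.3f} m, visual angle {check:.2f} deg")
```

Under these assumptions the physical size must grow in proportion to distance (the slanted line) while the visual angle stays fixed (the horizontal line), so a viewer without depth cues has only the constant angle to go on.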

7.5 Visual Illusions

Size perception affects our veridical experience and works well in well-lighted natural environments. However, if conditions are adverse, some pathological cases can be designed where our size perception can falter. In this section, we introduce some such visual illusions.

Figure 7.10: Left: The 2D Muller-Lyer Illusion. Right: Inside and outside corner explanation of Muller-Lyer Illusion

The left image of Figure 7.10 shows one example. Note that the vertical line of the left figure looks shorter than that of the right figure, though if you measure them with a ruler they are the same. The right image of Figure 7.10 shows one possible explanation. The two lines appear to us as an outside corner and an inside corner respectively. Since the inside corner appears to be at a greater depth, we tend to perceive its line as longer, following Equation 7.1.

Figure 7.11: Left: The 3D Muller-Lyer Illusion. Right: Another Muller-Lyer Illusion.

Figure 7.12: Left: The Ponzo illusion. Right: The Ames room.

However, note that this explanation cannot account for the 3D Muller-Lyer illusion (Figure 7.11), which can easily be constructed with a couple of books. Here, even though the shapes are not perceived as corners, we perceive the space in the left one to be smaller than in the right one, though when measured they are the same. A second explanation that seems more plausible is that of conflicting cues. It says that our perception of the length of the lines depends not only on the actual length of the lines but also on the overall length of the figure. If the overall length of the figure is greater, we tend to perceive the line as longer. This is further supported by another example of this illusion in Figure 7.11. The distance between the two points in the right image is the same for both (a) and (b), but we perceive one as longer than the other.

Figure 7.12 shows the Ponzo illusion. Though the two white lines are the same size, the one at the top appears longer. This is explained by the fact that perspective projection provides a strong depth cue, which tells us that the upper line is farther away than the bottom one. Hence, we perceive it as a longer line, as per Equation 7.1.

The second illusion in Figure 7.12 is that of the Ames room. Note how different in size the two people look. In reality they are similar in height. The Ames room was designed as shown in the right image of Figure 7. so that one corner has a high wall and another corner has lower walls. From a particular camera angle, the image of this room can make two people seem to differ immensely in height. Our relative size cue aggravates the situation. We expect the walls to be equally high everywhere. When one person's head is close to the ceiling, she appears taller than the other, whose head is far below the ceiling.