Abstract  This material is based on course notes compiled by students, which have been corrected and adapted by the lecturer. The students' help is gratefully acknowledged. Some of the images are taken from websites without prior authorization. As the document is used only for non-commercial purposes (teaching students), we hope that nobody will be offended!
There are many books on the subject; here are some examples...
The most important journals for publications on the subject are...
Three of the most important conferences on image processing are...
1.2.1 Digital Image Processing
A scalar function may be sufficient to describe a monochromatic image, while vector functions are needed to represent, for example, color images consisting of three component colors.
Figure 1: source image
Instead of a signal function, a two-dimensional image matrix is used. Each element of the image matrix represents one pixel (picture element), with its position uniquely identified by the row and column index. The value of the matrix element represents the brightness value of the corresponding pixel within a discrete range.
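As an illustrative sketch (plain Python, with hypothetical brightness values), an image matrix and its row/column pixel addressing can look like this:

```python
# A tiny 3x4 grayscale image as a matrix of brightness values (0..255).
# Each element is one pixel; image[row][col] addresses it uniquely.
image = [
    [  0,  50, 100, 150],
    [ 20,  70, 120, 170],
    [ 40,  90, 140, 190],
]

def pixel(img, row, col):
    """Return the brightness value of the pixel at (row, col)."""
    return img[row][col]

rows = len(image)      # number of rows
cols = len(image[0])   # number of columns
```

Here `pixel(image, 2, 3)` returns the brightness 190 of the bottom-right pixel.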
Figure 2: image as 2D-set of brightness values
Figure 3: source image
1.3.1 Color Representation
In the RGB color model (see figure 4) the luminance value is encoded in each color channel, while in the YUV color model (see figure 5) the luminance is encoded only in the Y channel.
Y ≈ 0.5 · Green + 0.3 · Red + 0.2 · Blue
Figure 4: RGB decomposition of figure 3 (red, green, and blue channels)
Figure 5: YUV decomposition of figure 3 (luminance channel Y, red-green balance U = B − Y, yellow-blue balance V = R − Y)
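A minimal sketch of this decomposition, using the approximate luminance weights and the difference channels given above:

```python
def rgb_to_yuv(r, g, b):
    """Approximate YUV decomposition as in the text:
    Y ~ 0.5*G + 0.3*R + 0.2*B, U = B - Y, V = R - Y."""
    y = 0.5 * g + 0.3 * r + 0.2 * b
    return y, b - y, r - y
```

For a gray pixel (equal R, G, B) both difference channels are zero, which is exactly why gray images carry no information in U and V.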
Palette images (cf. "paint by numbers", German "Malen nach Zahlen") include a lookup table in which an index identifies a certain color in a unique way. Pixels do not carry luminance or color values themselves, but only an index into the table.
chessboard distance: D_8((i, j), (h, k)) = max{|i − h|, |j − k|}
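A direct transcription of this definition as a small sketch:

```python
def chessboard_distance(p, q):
    """D_8 (chessboard) distance: the maximum of the
    absolute coordinate differences between two pixels."""
    (i, j), (h, k) = p, q
    return max(abs(i - h), abs(j - k))
```

For example, the pixels (0, 0) and (3, 5) have chessboard distance 5, since a king on a chessboard needs five moves between them.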
Pixel adjacency is another important concept in digital images. You can either have a 4-neighborhood or an 8-neighborhood, as depicted in figure 8.
Figure 8: pixel neighborhoods
It will become necessary to consider important sets consisting of several adjacent pixels. We therefore define a region as a contiguous set of adjacent pixels. There exist various contiguity paradoxes of the square grid, as shown in figure 9.
Figure 9: contiguity paradoxes (digital line, closed curve paradox)
One possible solution to the contiguity paradoxes is to treat objects using the 4-neighborhood and the background using the 8-neighborhood (or vice versa). A hexagonal grid (as depicted in figure 10) solves many problems of the square grid: any point in the hexagonal raster has the same distance to all its six neighbors.
Figure 10: grid types (square grid, hexagonal grid)
The border of a region R is the set of pixels within the region that have one or more neighbors outside R. We distinguish between inner and outer borders. An edge is a local property of a pixel and its immediate neighborhood: it is a vector given by a magnitude and a direction. The edge direction is perpendicular to the gradient direction, which points in the direction of image function growth.
Four crack edges are attached to each pixel, which are defined by its relation to its 4-neighbors as depicted in figure 11. The direction of the crack edge is that of increasing brightness, and is a multiple of 90 degrees, while its magnitude is the absolute difference between the brightness of the relevant pair of pixels.
Figure 11: crack edges (pixel, crack edge)
The border is a global concept related to a region, while edge expresses local properties of an image function.
1.3.4 Histograms
Figure 12: image histogram for figure 3
The brightness histogram provides the frequency of each brightness value z in an image. Figure 12 shows the brightness histogram of the image in figure 3. Histograms lack any positional aspect, as depicted in figure 13.
Figure 13: images with same histogram
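Counting brightness frequencies is straightforward; a minimal sketch (plain Python, the number of gray levels is a parameter):

```python
def brightness_histogram(img, levels=256):
    """Count how often each brightness value z occurs in the image."""
    hist = [0] * levels
    for row in img:
        for z in row:
            hist[z] += 1
    return hist
```

Note that any permutation of the pixels yields the same histogram, which is precisely the missing positional aspect mentioned above.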
Iconic images consist of images containing original data: integer matrices with data about pixel brightness, e.g. outputs of pre-processing operations (such as filtering or edge sharpening) used for highlighting some aspects of the image important for further treatment.
Figure 14: chain data structure (the eight chain-code directions 0–7)
If local information is needed from the chain code, it is necessary to search through the whole chain systematically.
If global information is needed, the situation is much more difficult. For example, questions about the shape of the border represented by chain codes are not trivial. Chains can be represented using static data structures (e.g., one-dimensional arrays); their size is then the longest chain length expected. Dynamic data structures are more advantageous to save memory.
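A small sketch of chain-code decoding, assuming the common Freeman convention (direction 0 = east, numbered counter-clockwise; the exact numbering in figure 14 may differ):

```python
# (row, col) offsets for codes 0..7, counter-clockwise from east.
DIRECTIONS = [(0, 1), (-1, 1), (-1, 0), (-1, -1),
              (0, -1), (1, -1), (1, 0), (1, 1)]

def decode_chain(start, codes):
    """Recover the sequence of border pixels from a start point
    and a chain of direction codes."""
    points = [start]
    r, c = start
    for code in codes:
        dr, dc = DIRECTIONS[code]
        r, c = r + dr, c + dc
        points.append((r, c))
    return points
```

Decoding the chain [0, 0, 6, 4, 4, 2] from (0, 0) returns to the start point, i.e. it encodes a closed border; answering a shape question like this requires walking the whole chain, which is the global-information difficulty noted above.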
1.5.3 Run length coding
not covered here.
1.5.4 Topological Data Structures
Topological data structures describe images as a set of elements and their relations. This can be expressed in graphs (evaluated graphs, region adjacency graphs). For a region adjacency graph as an example of a topological data structure see figure 15.
Figure 15: region adjacency graph (regions 0–5)
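A region adjacency graph can be held as a plain adjacency-set dictionary. The node labels below follow figure 15, but the edge set is an illustrative assumption (the figure itself is not reproduced here):

```python
# Region adjacency graph: region label -> set of adjacent regions.
# Edges are illustrative; a real graph comes from a segmentation.
rag = {
    0: {1, 3},
    1: {0, 2, 3},
    2: {1, 3, 5},
    3: {0, 1, 2, 4, 5},
    4: {3, 5},
    5: {2, 3, 4},
}

def are_adjacent(graph, a, b):
    """Two regions are adjacent iff the graph has an edge between them."""
    return b in graph[a]
```

Queries such as "which regions border region 3" then reduce to a dictionary lookup.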
1.5.5 Relational Structures
Information is concentrated in relations between semantically important parts of the image: objects that are the result of segmentation. For an example see figure 16. This type of data structure is especially appropriate for higher-level image understanding.
no.  object name  colour   min. row  min. col.  inside
1    sun          yellow     30        30       2
2    sky          blue        0         0       –
3    ball         orange    210        80       4
4    hill         green     140         0       –
5    lake         blue      225       275       4

Figure 16: relational data structure
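The relational table of figure 16, transcribed as a list of records, makes queries over the "inside" relation trivial (a minimal sketch):

```python
# (no., name, colour, min_row, min_col, inside); None = not inside anything.
objects = [
    (1, "sun",  "yellow",  30,  30, 2),
    (2, "sky",  "blue",     0,   0, None),
    (3, "ball", "orange", 210,  80, 4),
    (4, "hill", "green",  140,   0, None),
    (5, "lake", "blue",   225, 275, 4),
]

def contained_in(records, container_no):
    """Names of all objects whose 'inside' relation points to container_no."""
    return [name for (_, name, _, _, _, inside) in records
            if inside == container_no]
```

For instance, the ball and the lake are both inside object 4 (the hill), and the sun is inside object 2 (the sky).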
Computer vision is by its nature computationally very expensive, if for no other reason than the amount of data to be processed. One of the solutions is using parallel computers (brute force), but many computer vision problems are difficult to divide among processors, or to decompose in any way.

Hierarchical data structures make it possible to use algorithms which decide a strategy for processing on the basis of relatively small quantities of data. They work at the finest resolution only with those parts of the image for which it is necessary, using knowledge instead of brute force to ease and speed up the processing. Two typical structures are pyramids and quadtrees. Problems associated with hierarchical image representation are:
Figure 18: T-pyramid data structure (levels 0–2)
Figure 19: quadtree (node labels 0, 2, 3, 10, 11, 13, 120–123)
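As a minimal sketch of the quadtree idea (assuming a square binary image whose side is a power of two): homogeneous blocks become leaves, inhomogeneous blocks are split into four quadrants.

```python
def build_quadtree(img):
    """Recursively split a square image into a quadtree:
    a leaf holds a uniform value, an inner node its four quadrants."""
    if all(v == img[0][0] for row in img for v in row):
        return img[0][0]                      # homogeneous block -> leaf
    n = len(img) // 2
    quads = [[row[:n] for row in img[:n]],    # upper-left
             [row[n:] for row in img[:n]],    # upper-right
             [row[:n] for row in img[n:]],    # lower-left
             [row[n:] for row in img[n:]]]    # lower-right
    return [build_quadtree(q) for q in quads]
```

A 4×4 image whose upper-right quadrant differs from the rest collapses into four leaves after one split, which illustrates how the tree adapts its resolution to the image content.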
Human perception corresponds to the Human Visual System (HVS).
Definition 1 (Contrast): the local change in brightness, defined as the ratio between the average object brightness and the background brightness.
Contrast is a logarithmic property (i.e. numeric values and perception do not coincide; at higher brightness, a higher contrast is needed for the same impression). Visual acuity is best for medium rates of change in brightness; it is worse for changes that are too fast or too slow, and varies between individuals → "multiresolution perception".
The goal of image enhancement is to pre-process images so that they are better suited for specific applications. One distinguishes between:
Let f(x, y) be the image function of the source image and g(x, y) = T(f(x, y)) the enhanced image. T denotes an operation on f(x, y) in a neighborhood² of (x, y). Simplest case: a 1 × 1 neighborhood, i.e. g depends only on the value of f at (x, y). T is then called a gray-level transformation (see figure 20), with the form s = T(v), where v = f(x, y) and s = g(x, y).
Figure 20: gray-level transformations (left: transfer function with an emphasized range for more contrast; right: binary transformation; both axes run from dark to light)
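A minimal sketch of the 1 × 1 case: the transformation s = T(v) is applied to every pixel independently. The binary transformation below corresponds to the right half of figure 20; the threshold value 128 is an illustrative assumption.

```python
def apply_graylevel_transform(img, T):
    """Apply a point transformation s = T(v) to every pixel
    independently (the 1x1-neighborhood case)."""
    return [[T(v) for v in row] for row in img]

def binary(v):
    """Binary transformation: map everything below the (assumed)
    threshold 128 to 0, everything else to 255."""
    return 255 if v >= 128 else 0
```

Any other transfer function, e.g. one stretching a mid-range of gray values for more contrast, plugs into `apply_graylevel_transform` the same way.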
In the graph, the x-axis denotes the original values and the y-axis the values after the transformation.

Larger neighborhoods: the various processing functions are often called masks (templates, windows, filters). A small array (e.g. 3 × 3 pixels) is moved across the image. The coefficients of the array are chosen such that certain image properties are emphasized. Example: an image of constant intensity with isolated points of a different intensity ("pop noise"); mask: w_i = −1 for i = 1, ..., 9 except w_5 = 8. Each field is multiplied with the pixel beneath it and the results are summed. If all pixels are equal, the result is 0. The mask is shifted over the image pixel by pixel (see figure 21).
Result:
  = 0   all pixels identical
  > 0   center pixel has a higher value
  < 0   center pixel has a lower value
T[f(x, y)] = w_1 f(x−1, y−1) + w_2 f(x−1, y) + w_3 f(x−1, y+1) + ... + w_9 f(x+1, y+1)

² In general a square image region centered at (x, y); the center is moved from pixel to pixel.
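The pop-noise mask described above can be sketched directly (plain Python; the function evaluates the mask at one interior pixel):

```python
# The pop-noise detection mask from the text: w5 = 8, all other wi = -1.
MASK = [[-1, -1, -1],
        [-1,  8, -1],
        [-1, -1, -1]]

def apply_mask(img, r, c, mask=MASK):
    """Multiply each mask coefficient with the pixel beneath it and
    sum the results; (r, c) must be an interior pixel of img."""
    total = 0
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            total += mask[dr + 1][dc + 1] * img[r + dr][c + dc]
    return total
```

On a constant region the response is 0, while an isolated brighter center pixel yields a positive response, exactly as the result table above states.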