intTypePromotion=1
zunia.vn Tuyển sinh 2024 dành cho Gen-Z zunia.vn zunia.vn
ADSENSE

Cơ sở dữ liệu hình ảnh - Chương 6

Chia sẻ: Hoangtuan Hoangtuan | Ngày: | Loại File: DOC | Số trang:18

90
lượt xem
7
download
 
  Download Vui lòng tải xuống để xem tài liệu đầy đủ

6. FINDING BASIC SHAPES 6.1 Combining Edges Bits of edges, even when they have been joined up in some way by using, for example, crack edge relaxation, are not very useful in themself unless they are used to enhance a previous image. From

Chủ đề:
Lưu

Nội dung Text: Cơ sở dữ liệu hình ảnh - Chương 6

  1. 6. FINDING BASIC SHAPES 6.1 Combining Edges Bits of edges, even when they have been joined up in some way by using, for example, crack edge relaxation, are not very useful in themself unless they are used to enhance a previous image. From identification point of view it is more useful to determine structure of lines, equations, lengths, thickness... There are a variety of edge-combining methods in literature. These include edge following and Hough transforms. 6.2 Hough Transform This technique allows to discover shapes from image edges. It assumes that a primitive edge detection has already been performed on an image. It attempts to combine edges into lines, where a sequence of edge pixels in a line indicates that a real edge exists. As well as detecting straight lines, versions of the Hough transform can be used to detect regular or non-regular shapes, though, as will be seen, the most generalized Hough transform, which will detect a two dimensional specific shape of any size or orientation, requires a lot of processing power in order to be able to do its work in a reasonably finite time. 6.2.1 Basic principle of the straight-line Hough transform After primitive edge detection and then thresholding to keep only pixels with a strong edge gradient, the scree n may look like Figure 6.1. Figure 6.1 Screen after primitive edge detection and thresholding (only significant edge pixel shown). A straight line connecting a sequence of pixels can be expressed in the form: y = mx + c If we can evaluate values for m and c such that the line passes through a number of the pixels that are set, then we have a usable representation of a straight line. The Hough transform takes the above image and converts into a new image (what is termed) in a new space. In fact, it transforms each significant edge pixel in (x,y) space into a straight line in this new space.
  2. Original data Line to be found 1 2 3 4 Figure 6.2 Original data. Clearly, many lines go through a single point ( x, y), e.g. a horizontal line can be draw through the point, a vertical line, and all the lines at different angles between these. However, each line will have a slope (m) and intercept (c) such that the above equation holds true. A little manipulation of the above equation gives: c = (−x)m + y y x Gives Transposed c = −1m + 3 3 1 3=m.1+c c = −2m + 3 2 2 2=m.2+c c = −3m + 3 3 4 3=m.4+c c = −4m + 3 0 4 0=m.4+c c Three line coincide here 3 3 m 0 c = − m+3 1 c = − m+2 2 c = − m c = − m+3 4 4 Figure 6.3. Accumulator array in (m,c) space. Maximum in the accumulator array is 3 at (−1,4), suggesting that a line y = −1x + 4 goes through three of the original data points. We know the value of x and y (the position where the pixel may be on an edge), but in this form. the equation now represents a straight line in ( m,c) space, i.e. with a horizontal m-axis
  3. and a vertical c-axis, each (x,y) edge pixel corresponds to a straight line on this new ( m,c) graph. We need space to be available to hold this set of lines in an array (called the accumulator array). Then for every (x,y) point, each element that lies on the corresponding line in the ( m,c) accumulator array can be incremented. So that after the first point in the ( x, y) space has been processed, there will be a line of 1st in the (m,c) array. This plotting in the (m, c) array is done using an enhanced form of Bresenham’s algorithm, which will plot a wide, straight line (so that at the ends crossing lines are not missed). At the end of processing all the (x,y) pixels, the highest value in the (m,c) accumulator array indicates that a large number of lines cross in that array at some points ( m’,c’). The value in this element corresponds to the same number of pixels being in the straight line in the ( x,y) space and the position of this element gives the equation of the line in the ( x,y) space, and the position of this element gives the equation of the line in (x,y) space: y = m’x + c’ 6.2.2 Problems There are serious problems in using ( m,c) space. For each pixel, m may properly vary from minus infinity to infinity (i.e. straight line upwards). Clearly this is unsatisfactory: no accumulator array can be set up with enough elements. There are alternatives, such as using two accumulator array, with m ranging from −1≤ m ≤ +1 in one and −1≤ 1/m ≤ +1 in the second. It is safer, though requiring more calculation, to use angles, transforming to polar coordinates (r,θ), where xcosθ + ysinθ = r. Point(x,y) y=a1x+b1 y=a2x+b2 y=a5x+b5 y=a3x+b3 y=a4x+b4 Figure 6.4 Family of lines (Cartesian coordinates) through the point (x,y). y (x,y) r θ x One of many possible Shotest distance from lines through (x,y), origin to line defines the e.g. y=ax+b line in term of r and θ
  4. y (x,y) y-x tanθ x/cosθ (y-x tanθ θ )sin xtanθ x x + ( y − x tan θ ) sin θ r= cosθ sin 2 θ x + y sin θ − x = cosθ cosθ  1 − sin θ  2 = x  cosθ  + y sin θ = x cosθ + y sin θ    Figure 6.5 Relationship between Cartesian straight line and polar defined line. Technique 6.1. Real straight-edge discovery using the Hough transform. USE. This technique is used to find out and connect substantial straight edges already found using and edge detector. OPERATION. For each edge pixel value I( x,y), vary θ from 0o to 360o and calculate r = xcosθ + ysinθ . Given an accumulator array size (N+M,360), increment those elements in the array that lie in box (b x b) with center ( r, θ ). Clearly if the box is (1x1), only one element of the array is incremented; if the box is 3 x 3, nine elements are incremented. This gives a "thick" line in the new space so that intersections are not missed. Finally, look for the highest values in the accumulator arrays ( r,θ) and thus identify the pair (r, θ ) that are most likely to indicate a line in (x,y) space. This method can be enhanced in a number of ways: 1. Instead of just incrementing the cells in the accumulator array, the gradient of the edges, prior to thresholding, could be added to the cell, thus plotting a measure of the likelihood of this being an edge. 2. Gradient direction can be taken into account. If this suggest s that the direction of the real edge lies between two angles θ1 and θ2 , then only the elements in the (r, θ ) array that lies in θ1 < θ < θ 2 that are plotted. 3. The incrementing box does not need to be uniform. It is known that the best estimate of (r, θ ) is at the center of the box, so this element is incremented by a large figure than the elements around that center element. Note that the line length is not given, so that the lines go to infinity as it stands. Three approaches may be considered:
  5. 1. Pass 3 x 3 median filter over the image original and subtracting the value of the center pixel in the window from the result. This tends to find some corners of images, thus enabling line endings to be estimated. 2. Set up four further accumulator array. This first pair can hold the most north-east position on the line and the second pair the most south-west position, these positions being updated as and when a pixel contributes to the corresponding accumulating element in the main array. 3. Again with four further accumulator array, let the main accumulator array be increased by w for some pixel (x,y). Increase this first pair by wx and wy and the second by (wx)2 and (wy)2. At the end of the operation a good estimate of the line is: mean of lines ± 2σ where σ is the standard deviation, i.e. 2 (  wx ) wx 2 estimate = ∑ ∑− ∑ wx  ± End of line  ∑  ∑ ∑ w w w for the x range and the similar expression for the y range. This makes some big assumption regarding the distribution of edge pixels, e.g. it assumes that the distribution is not skewed to one end of the line, and also many not always be appropriate. The Hough technique is good for finding straight lines. It is even better for finding circles. Again the algorithm requires significant edge pixels to be identified so some edge detector must be passed over the original image before it is transformed using the Hough technique. Technique 6.2. Real circle discovery using the Hough transform. USE. Finding circles from an edge-detected image. OPERATION. If the object is to search for circles of a known radius R, say, then the following identity can be used: ( x − a ) 2 + ( y − b) 2 = R 2 where (a,b) is the centre of the circle. Again in ( x,y) space all pixels or, an edge are identified (by thresholding) or every pixel with I(x,y) > 0 is processed. A circle of elements is incremented in the ( a,b) accumulator array centre (0
  6. Circle to be found Figure 6.6. Original data in (x,y) domain. Again it is possible to reduce the amount of work by using the gradient direction to indicate the likely arc within which the circle centre is expected to lie. Figure 6.7 illustrates this technique. It is possible to look for the following types of circles: different radii plot in (a,b,R) space different radii, same vertical centres plot in (b,R) space different radii, same horizontal centres plot in (a,R) space Four cicles coincide here only Figure 6.7 Illustration of Hough circle transform (looking for circles radius 1/√2). Corresponding accumulator circles in (a,b) domain. If the circle radius is known to be one of three values, say, then ( a,b,R) space can be three planes of (a,b) arrays. The following points are important: 1. As the number of unknown parameters increases, the amount of processing increases exponentially. 2. The Hough technique above can be used to discover any edge that can be expressed as a simple identity. 3. The generalized Hough transform can also be used to discover shapes that can not be represented by simple mathematical identities. This is described below. Technique 6.3. The generalized Hough transform.
  7. USE. Find a known shape in its most general form-of any size or orientation in an image. In practice it is best to go for a known size and orientation. OPERATION. Some preparation is needed prior to the analysis of the image. Given the object boundary, and assuming that the object in the image is of the same size and orientation (otherwise a number of accumulator arrays have to beset up for different sizes and orientations), a ‘centre’ ( x,y) is chosen somewhere within the boundary of the object. The boundary is then traversed and after every step d alone the boundary the angle of the boundary tangent with respect to horizontal is noted, and the x difference and y difference of the boundary position from the centre point are also noted. For every pixel I(x, y) in the edge-detected image, the gradient direction is found. The accumulator array (same size as the image) is then incremented by 1 for each such element. Finally, the highest-valued elements in the accumulator array point to the possible centres of the object in the image. 6.3 Bresenham’s Algorithm Bresenham’s line algorithm is an efficient method for scan-converting straight lines in that it uses only integer addition, subtraction, and multiplication by 2. As a very well known fact, the computer can perform the operations of integer addition and subtraction very rapidly. The computer is also time-efficient when performing integer multiplication and division by powers of 2. The algorithm described in the following is a modified version of the Bresenham algorithm. It is commonly referred to as the midpoint line algorithm. U yk+1 d2 y d1 D yk xk xk + 1 Figure 6.8 Midpoint algorithm The equation of a straight line in 2-dimensional space can be written in an implicit form  as F(x, y) = ax + by + c = 0
  8. From the slope-intercept form dy y= x+B dx we can bring it to the implicit form as dy ⋅ x − dx ⋅ y + Bdx = 0 b = −dx, So a = dy, c = Bdx Suppose that point (xi, yi) has been plotted. We move xi to xi + 1. The problem is to  select between two pixels, U(xi + 1, yi + 1) and D(xi + 1, yi). For this purpose, we consider the middle pixel M(xi + 1, yi + 1 ). We have 2 d = F(M) = a(xi + 1) + b( yi + 1 ) + c 2 If d>0 , choose U d
  9. F(M) = a(x1 + 1) + b (y1 + 1 ) + c 2 = F(x1, y1 ) + a + b/2 Since F (x1 , y1) = 0, we have d = d1 = dy − dx/2 In order to avoid a division by 2, we use 2 d1 instead. Afterward, 2d is used. So, with d used in place of 2d, we have First set d1 = 2dy − dx If di ≥ 0 then xi+1 = xi + 1, yi+1 = yi + 1 and di+1 = di + 2 (dy − dx) If di < 0 then xi+1 = xi + 1, yi+1 = yi di+1 = di + 2dy The algorithm can be summarized as follows: Midpoint Line Algorithm [Scan-convert the line between (x1, y1) and (x2, y2)] dx = x2 − x1; dy = y2 − y1; d = 2*dy − dx; /* initial value of d */ dD = 2*dy; /* increment used to move D */ dU = 2*(dy − dx); /* increment used to move U */ x = x1; y = y1 ; Plot Point (x, y); /* the first pixel */ While (x < x1) if d
  10. Example. Scan-convert the line between (5, 8) and (9, 11). Since for the points, x < y, consequently the algorithm can apply. Here dy = 11 − 8 = 3, dx = 9 −5 = 4 First d1 = 2dy − dx = 6 − 4 = 2 > 0 So the new point is (6, 9) and d2 = d1 + 2 (dy − dx) = 2 + 2(−1) = 0 ⇒ the chosen pixel is (7, 10) and d3 = d2 + 2 (dy − dx) = 0 +2(−1) = −2 < 0 the chosen pixel is (8, 10), then d4 = d3 + 2dy = −1 +6 = 5 > 0 The chosen pixel is (9, 11). 6.3.2 Circle incrementation A circle is a symmetrical figure. Any circle-generating algorithm can take advantage of the circle’s symmetry to plot eight points for each value that the algorithm calculates. Eight-way symmetry is used by reflecting each calculated point around each 45 ° axis. For example, if point 1 in Figure 6.9 were calculated with a circle algorithm, seven more points could be found by reflection. The reflection is accomplished by reversing the x, y coordinates as in point 2, reversing the x, y coordinates and reflecting about the y axis as in point 3, reflecting about the y
  11. y (-2, 8) (2, 8) (-y, x) 9 y (y, x) z (8, 2) (-8, 2) x (x, y) (-x, y) { x 9 | (x, -y)  (-x, -y) (8, -2) (-8, -2) } ~ (y, -x) (-y, -x) (2, -8) (-2, -8) Figure 6.9 Eight-way symmetry of a circle. axis as in point 4, switching the signs of x and y as in point 5, reversing the x, y coordinates, reflecting about the y axis and reflecting about the x axis as in point 6, reversing the x, y coordinates and reflecting about the y axis as in point 7, and reflecting about the x axis as in point 8. To summarize: P5 = (−y, −x) P1 = (x, y) P1 = (−y, −x) P2 = (y, x) P3 = (−y, x) P7 = (y, −x) P4 = (−x, y) P8 = (x, −y) (i) Defining a Circle There are two standard methods of mathematically defining a circle centered at the origin. The first method defines a circle with the second-order polynomial equation (see Figure 6.10). y2 = r2 − x2 where x = the x coordinate y= the y coordinate r= the circle radius With this method, each x coordinate in the sector, from 90 to 45 °, is found by stepping x from 0 to r / 2 , and each y coordinate is found by evaluating r 2 − x 2 for each step of x. This is a very inefficient method, however, because for each point both x and r must be squared and subtracted from each other; then the square root of the result must be found.
  12. The second method of defining a circle makes use of trigonometric functions (see Figure 6.11): y y P = ( x, r 2 − x 2 ) P=(r cos θ, r sin θ) r sin θ y r θ x x r cos θ x Fig. 6.10 Circle defined with a second- Fig. 6.11 Circle defined with trigonometric degree polynomial equation. functions. x = r cosθ y = r sinθ where θ = current angle r = circle radius x = x coordinate y = y coordinate By this method, θ is stepped from θ to π / 4, and each value of x and y is calculated. However, computation of the values of sin θ and cosθ is even more time-consuming than the calculations required by the first method. (ii) Bresenham’s Circle Algorithm If a circle is to be plotted efficiently, the use of trigonometric and power functions must be avoided. And as with the generation of a straight line, it is also desirable to perform the calculations necessary to find the scan-converted points with only integer addition, subtraction, and multiplication by powers of 2. Bresenham’s circle algorithm allows these goals to be met. Scan-converting a circle using Bresenham’s algorithm works are follows. If the eight- way symmetry of a circle is used to generate a circle, points will only have to be generated through a 45° angle. And, if points are generated from 90 to 45 °, moves will be made only in the +x and -y directions (see Figure 6.12).
  13. -y 45° +x Figure 6.12 Circle scan-converted with Bresenham’s algorithm. The best approximation of the true circle will be described by those pixels in the raster that fall the least distance from the true circle. Examine Figures 6.13( a) and 6.13(b). Notice that if points are generated from 90 and 45 °, each new point closest to the true circle can be found by taking either of two actions: (1) move in the x direction one unit or (2) move in the x direction one unit and move in the negative y direction one unit. Therefore, a method of selecting between these two choices is all that is necessary to find the points closest to the true circle. Due to the 8-way symmetry, we need to concentrate only on the are from (0, r) to (r / 2 , r / 2 ) . Here we assume r to be an integer. Suppose that P(xi, yi) has been selected as closest to the circle. The choice of the next pixel is between U and D (Fig.2.8). x2 + y2 - r2. We know that  Let F(x, y) = F(x, y) = 0 then (x, y) lies on the circle >0 then (x, y) is outside the circle
  14. dU = dnew − dold = 2xi + 3 If dold ≥ 0, M is outside the circle and D is chosen. The new midpoint will be one * increment over x and one increment down in y: 3 dnew = F (xi + 2, yi − 2 ) = dold + 2xi − 2yi + 5 The increment in d is therefore dD = dnew − dold = 2(xi − yi ) + 5 Since the increments dU and dD are functions of (xi , yi), we call point P(xi, yi) the point of evaluation. 1  Initial point : (0, r). The next midpoint lies at (1, r- ) and so 2 F(1, r − 1 ) = 1 + (r − 1 )2 − r2 = 5 4 − r 2 2 To avoid the fractional initialization of d, we take h = d − 1 4 . So the initials value of h is 1 − r and the comparison d < 0 becomes h < − 1 4 . However, since h starts out with an integer value and is incremented with integer values ( dU and dD), we can change the comparison to h < 0. Thus we have an integer algorithm in terms of h. It is summarized as follows: (0, r) U(xi + 1, yi ) P(xi, yi) (r / 2, r / 2) M× O D(xi +1, yi - 1) (b) (a) Figure 6.13 Bresenham’s Circle Algorithm (Midpoint algorithm) Bresenham Midpoint Circle Algorithm h = 1 − r ; /*initialization */ x = 0; y = r; Plot Point (x, y); While y > x if h < 0 then /* Select U */ dU = 2*x + 3;
  15. h = h + dU; x = x + 1; else /* Select D */ dD = 2*(x − y) + 5; h = h − dD; x = x + 1; y = y − 1; endif End While (iii) Second-order differences If U is chosen in the current iteration, the point of evaluation moves from ( xi, yi ) to  (xi+1, yi ). The first-order difference has been calculated as dU = 2xi + 3 At point (xi + 1, yi ), this will be d′U = 2( xi + 1) + 3 . Thus the second-order difference is ′ ∆U = d U − d U = 2 Similarly, dD at (xi, yi ) is 2(xi − yi )+5 and at (xi +1, yi ) is d ′ = 2(xi +1− yi ) + 5. Thus the D second-order difference is ∆D = d ′ − d D = 2 D If D is chosen in the current iteration, the point of evaluation moves from ( xi, yi ) to (xi  +1, yi -1). The first-order differences are d D = 2(xi − yi ) + 5 ′ d D = 2 [ xi + 1 − ( yi − 1)] + 5 = 2( xi − yi ) + 4 + 5 d U = 2 xi + 3 d U ′ = 2( xi + 1) + 3 Thus the second-order differences are ∆U = 2, ∆D = 4 So the revised algorithm using the second-order differences is as follows: h = 1 − r, x = 0 , y = r , ∆ U = 3, ∆ D = 5 − 2r, plot point (x, y) (1) (initial point) (2) Test if the condition y = x is reached. It not then (3) If h < 0 : select U
  16. x = x+1 h + ∆U h = ∆U = ∆U + 2 ∆D = ∆D + 2 else : select D x = x+1 y −1 y = h + ∆D h = ∆U = ∆U + 2 ∆D = ∆D + 4 end if plot point (x, y) 6.4 Using interest point The previous chapter described how interest points might be discovered from an image. From these, it is possible to determine whether the object being viewed is a “known” object. Here the two-dimensional problem, without occlusion (objects being covered up by other objects), is considered. Assume that the interest points from the known two dimensional shape are held on file in some way and that the two-dimensional shape to be identified has been processed by the same interest points that now have to be compared with a known shape. We further assume that the shape may be have been related, scaled, and/or translated from the original known shape. Hence it is necessary to determine a matrix that satisfies: discovered interest point = known shape interest point × M or D = KM where M is two-dimensional transformation matrix of the form a 0 b   c d 0 e 1 f   and the interest point sets are of the form  x1 1 y1    x2 y2 1  ... ...  ...   x 1 yn n  The matrix M described above does not allow for sheering transformations because this is essentially a three-dimensional transformation of an original shape. There is usually some error in the calculations of interest point positions so that D=KM+ε
  17. and the purpose is to find M with the largest error and then determine whether that error is small enough to indicate that the match is correct or not. A good approach is to use a least- squares approximation to determine M and the errors, i.e. minimize F(D-KM) where F(Z) = x12 + y12 This gives the following normal equations: ∑ x2 ∑ xy ∑ x   a   ∑ xX       ∑ xy ∑ y ∑ y  ×  c  =  ∑ yX  or La = s1 2  ∑ y n  e  ∑ X   ∑x    and ∑ x2 ∑ xy ∑ x   b   ∑ xY      ∑ xy ∑ y ∑ y  ×  d  =  ∑ yY  or Lb = s 2 2  ∑ y n   f   ∑Y   ∑x   If the inverse of the square L matrix is calculated, then the values for a to f can be evaluated and the error determinated. This is calculate as L-1L a = L-1 s1 and L-1L b = L-1 s2 Resulting in a = L-1s1 and b = L-1s2. 6.5 Problems There are some problems with interest point. First, coordinates must be paired beforehand. That is, there are known library coordinates, each of which must correspond to correct unknown coordinate for a match to occur. This can be done by extensive searching, i.e. by matching each known coordinate with each captured coordinate, all possible permutations have to be considered. For example, consider an interest point algorithm that delivers five interest points for a known objects. Also let there be N images, each containing an unknown object, the purpose of the exercise being to identify if any or all of the images contain the known object. A reduction on the search can be done by eliminating all those images that do not have five interest points. If this leaves n images there will be b x 5! = 120n possible permutations to search. One search reduction method is to order the interest points. The interest operator itself may give a value which can place that interest point at a particular position in the list. Alternatively, a simple sum of the brightness of the surrounding pixels can be used to give a position. Either way, if the order is known, the searches are reduced from 0(n x i!) to 0(n), where i is the number of interest points in the image. The second problem is that the system cannot deal with occlusion or part views of objects, nor can it deal with three-dimensional objects in different orientations. 6.6 Exercises
  18. 6.6.1 Using standard graph paper, perform a straight line Hough transform on the binary pixels array shown in the following figure transforming into (m,c) space. Figure 6.8 Binary array 6.6.2 A library object has the following ordered interest point classification {(0,0), (3,0), (1,0), (2,4)} Identify, using the above technique, which of the following two sets of interest points represent a transition, rotation, and/or scaling of the above object: {(1,1), (6,12), (2,5), (12,23)} {(1,3), (1,12), (-1,8), (3,6)} Check your answer by showing that a final point maps near to its corresponding known point.
ADSENSE

CÓ THỂ BẠN MUỐN DOWNLOAD

 

Đồng bộ tài khoản
2=>2