Efficiency of Chain Codes To Represent Binary Objects
Monterrey, NL, C.P. 64849, México
Received 26 January 2006; received in revised form 2 October 2006; accepted 6 October 2006
Abstract

We present a study of the compression efficiency of different chain-code schemes for binary objects (bi-level images). Chain-code techniques are used to compress bi-level images because they preserve information and allow considerable data reduction. Furthermore, chain codes are the standard input format for numerous shape-analysis algorithms. In this work we apply chain codes to represent objects with holes and compare the compression efficiency of seven chain codes. We also compare all these chain codes with the JBIG standard for bi-level images.
© 2006 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved.
Keywords: Chain coding; Shapes; Bi-level images; Huffman algorithm; Entropy
Corresponding author. Tel.: +52 449 9 10 84 22; fax: +52 449 9 10 84 01.
E-mail addresses: [email protected] (H. Sánchez-Cruz), [email protected] (E. Bribiesca), [email protected] (R.M. Rodríguez-Dagnino).
0031-3203/$30.00 © 2006 Pattern Recognition Society. Published by Elsevier Ltd. All rights reserved. doi:10.1016/j.patcog.2006.10.013

1. Introduction

Efficient encoding of object shapes is very important for image storage and transmission, and also for shape analysis and shape recognition in pattern recognition. Chain-code techniques are widely used because they preserve information and allow considerable data reduction. The first approach for representing digital curves using chain codes was introduced by Freeman in 1961 [1]. Many authors have reported interesting applications of chain codes; for example, Mckee and Aggarwal [2] used chain coding in the process of recognizing objects. Various shape features can be computed directly from this representation, and contour smoothing and correlation for shape comparison are also relatively simple [3]. Classical methods for processing chains are described in Ref. [4]. Other interesting coding schemes related to chain codes are available in Refs. [5–9]. The vertex chain code (VCC) was presented in 1999 [10].

The shape of an object can be seen as a two-tone image, black and white (B/W), also called a bi-level image, where black typically represents the foreground of the image and white the background. The importance of this topic has been recognized by an international committee, and this joint effort has produced a standard compressor for this kind of image, named after the Joint Bi-level Image Experts Group (JBIG). JBIG was developed mostly to compress facsimile documents for efficient transmission, efficient storage of B/W documents in libraries, and document distribution. These documents may contain text, line drawings and halftone images. The first JBIG standard was primarily designed for the compression of B/W images with no loss of information (a lossless compression system). A second version of this standard, called JBIG2, additionally allows a certain degree of information loss in exchange for higher compression rates. A recent description of the JBIG standards can be found in Ref. [11]. We use the JBIG standard as a point of reference to assess the efficiency of some recent chain codes proposed in the literature to represent and encode object shapes.

We have selected a representative sample of the main chain codes proposed in the literature. We compare the following seven chain codes: Papert's turtle, or PRT [12]; F8 (Freeman chain code of eight directions) [1]; F4 (Freeman chain code of four directions) [4]; VCC2 (vertex chain code of two symbols) [10]; VCC3 (vertex chain code of three symbols) [10]; 3OT (orthogonal chain directions of three symbols) [13]; and AF8 (angle Freeman chain code of eight directions) [14]. Although we have included most of the popular chain codes in this paper, other chain codes in the literature have been designed to represent different types of bi-level images. In particular, the primitives chain code (PCC) was designed to represent two-dimensional tree geometries [15]. This chain code considers 255 features, which makes a PCC codeword an 8-bit word. Such fixed-length coding cannot be compared fairly with the variable-length Huffman codes considered in this paper. Moreover, because PCC has 255 symbols, it requires much larger data sets to estimate the probability of each of its symbols, so it is not considered in this work. However, according to the comments in Ref. [16], the performance of PCC is expected to lie somewhere between F8 and VCC3 in our analysis.

The seven chain codes and JBIG are applied to a sample of 35 irregular object shapes. These objects include letters of the alphabet, shapes of animals and humans, a molecule, a fish, manufactured objects, an airplane, and so forth, all of them with one or more holes. The images range in size from 33 × 41 to 1081 × 1039 pixels.

2. The chain codes

Freeman [4] states that, in general, a coding scheme for line structures must satisfy three objectives: (1) it must faithfully preserve the information of interest; (2) it must permit compact storage and be convenient for display; and (3) it must facilitate any required processing. The three objectives are somewhat in conflict with each other, and any code necessarily involves a compromise among them. Most of the above-mentioned chain codes comply with these three objectives. In this section we present the most representative chain codes that have been published in the literature. They are characterized by being composed of unit-length elements suitable for covering contour shapes represented by resolution cells.

2.1. Papert's turtle

One of the simplest contour representations, using only a two-symbol code {0, 1}, was proposed by Papert [12]. In this chain, symbol 0 represents a right turn whereas symbol 1 represents a left turn when following the contour shape. The algorithm is as follows. Scan the picture until a figure cell is encountered. Then:

1. If you are in a figure cell, turn left and take a step.
2. If you are in a ground cell, turn right and take a step.
3. Terminate when you are within the starting cell.

Fig. 1 shows an object shape, which is considered to be composed of resolution cells. Fig. 2 shows part of the shape presented in Fig. 1 coded by PRT.

2.2. The 3OT chain code

In 2000, Bribiesca [13] proposed five different symbols to represent the relative changes of orthogonal directions when encoding three-dimensional curves. A similar approach was followed in Ref. [17] for the representation of 2D binary shapes without holes, where only three of the five original symbols were used. In this paper, we also use only three relative changes of orthogonal directions, given by code symbols or chain elements. We call this code 3OT; it is given by the set 3OT = {0, 1, 2}. With this code we are able to represent any binary region enclosed by a discrete curve. The element 0 represents no direction change, that is, going straight through the contiguous straight-line segments following the direction of the last segment; 1 indicates a direction change forward with regard to the basis segment; and 2 means going back with regard to the direction of the basis segment. See Fig. 3(a) for a graphical representation of these symbols. Some properties of this chain code are invariance under translation, rotation and mirror transformations, and normalization of the starting point [13].
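The 3OT symbols described above can be illustrated by deriving them from an F4 chain. The following is only a sketch under stated assumptions, not the implementation used in this work: the function name `f4_to_3ot` is hypothetical, the input is assumed to be a closed F4 chain with directions 0 to 3 numbered counter-clockwise, and the "basis segment" is read as the segment walked before the previous turn.

```python
def f4_to_3ot(f4_chain):
    """Convert a closed F4 chain (absolute directions 0..3) into 3OT symbols.

    Assumed reading of the definition:
      0 = keep going straight,
      1 = turn into the direction of the basis segment
          (the segment walked before the previous turn),
      2 = turn against the direction of the basis segment.
    """
    # The chain is closed, so the last move precedes the first one.
    support = f4_chain[-1]
    # Basis: the last direction walked before the current straight run.
    basis = next((d for d in reversed(f4_chain) if d != support), None)
    if basis is None:
        raise ValueError("a closed contour needs at least one turn")

    symbols = []
    for d in f4_chain:
        if d == support:
            symbols.append(0)
        elif (d - support) % 4 == 2:
            raise ValueError("contour reverses onto itself")
        else:
            symbols.append(1 if d == basis else 2)
            basis, support = support, d
    return symbols


# Boundary of an L-shaped object made of three cells, walked counter-clockwise.
l_shape = [0, 0, 1, 2, 1, 2, 3, 3]
print(f4_to_3ot(l_shape))
```

Because only equalities between directions are tested, rotating the input chain by 90° leaves the output unchanged, which is consistent with the rotation invariance stated for 3OT.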
Fig. 3. Six of the seven chain codes used in this work to compress bi-level images: (a) 3OT; (b) F8; (c) F4; (d) AF8; (e) VCC3; and (f) VCC2.
In Fig. 4 the pi-symbol shape coded by 3OT is presented.

2.3. Freeman chain codes

Chain-code methods to represent contour shapes were introduced by Freeman [1] in 1961; see also Ref. [4]. In Freeman's approach, an arbitrary curve is represented by a sequence of unit-length vectors taken from a set of eight possible directions. On the digital grid, encoding is based on the fact that successive contour points are adjacent to each other. Depending on whether the 4-connected or the 8-connected grid is used, the chain code is defined as the digits from 0 to 3 or from 0 to 7, assigned to the 4 or 8 neighboring grid points in a counter-clockwise sense. A 6-connected grid can also be considered, by assigning the digits from 0 to 5. Figs. 3(b) and (c) show the 8- and 4-connected Freeman chain codes (F8 and F4), respectively.

A variant of these codes was proposed by Liu and Zalik [14], who considered eight relative direction changes; the eight symbols are assigned according to the direction change found when following the discrete curve. For instance, a possible symbol assignment is as follows: symbol 0 for a direction change of 0°, that is, no change in the direction being followed; 1 for +45°; 2 for −45°; 3 for +90°; 4 for −90°; 5 for +135°; 6 for −135°; and 7 for 180°. See Fig. 3(d). These direction changes are listed in the order of their frequency of appearance in the 1000 curves analyzed by Liu and Zalik [14], who found that 0 was the most probable symbol in every curve of their experiments. Fig. 5 shows the pi-symbol shape encoded by (a) AF8, (b) F8 and (c) F4.

2.4. VCC codes

In 1999, Bribiesca [10] proposed a chain code based on the number of cell vertices that are in touch with the bounding contour of the shape. In order to compress the chain information of the VCC, each element (cell vertex) of the chain was decreased by one. This code comes in two variants, with two or three symbols, depending on whether the resolution cell is a square or a hexagon. If the resolution cell is a square, the set of symbols is VCC3 = {0, 1, 2}, where each symbol represents the number of resolution-cell vertices that are in touch along the contour shape; see Fig. 3(e). If the resolution cell is a hexagon, the set of symbols is VCC2 = {0, 1}; see Fig. 3(f). Some important characteristics of the VCC are: (1) the VCC is invariant under translation and rotation, and optionally may be invariant under starting point and mirroring transformation; (2) using the VCC, it is possible to represent shapes composed of triangular, rectangular, and hexagonal cells; (3) unlike in other chain codes, the chain elements are real values that are part of the shape, not arbitrary symbols; they indicate the number of cell vertices at the contour nodes and can be operated on to extract interesting shape properties; and (4) using the VCC, it is possible to obtain relations between the bounding contour and the interior of the shape. Fig. 6 shows examples of the VCC3 and VCC2 codes.

AF8, PRT, VCC3, VCC2, and 3OT are chain codes invariant under rotation, whereas F8 and F4 are not. Expressed as sets, the seven chain codes are: AF8 = {0, 1, 2, 3, 4, 5, 6, 7}, F8 = {0, 1, 2, 3, 4, 5, 6, 7}, F4 = {0, 1, 2, 3}, 3OT = {0, 1, 2}, PRT = {0, 1}, VCC2 = {0, 1} and VCC3 = {0, 1, 2}. Of course, to complete their specification it is necessary to add the assignment rule in each case.

2.5. Some further observations about published works

In a recent work by two of the authors of this paper [17], the three-bit chain code (3OT in this work) was shown to perform better than the JBIG standard when applied to four different examples with no holes. In another paper [18], Huo and Chen derived a multiscale encoder for curves and boundaries called JBEAM. In that work a compression scheme is applied to nine different contour shapes, each given at 256 × 256 pixel image resolution. According to their results, the JBEAM method also outperforms the JBIG standard on the examples considered. In fact, if we analyze their results (Table 1), we can see that the compression rate gain decreases as the number of pixels in the contour shapes increases; see Fig. 7. However, these studies are limited in scope by the set of objects studied. For instance, Fig. 7 and Table 1 suggest that a threshold could be defined beyond which JBIG would be better than JBEAM. Reavy and Boncelet [19] found that the compression rates of the block arithmetic coding for image compression (BACIC) algorithm for bi-level images are comparable to those of the JBIG compressor. In the work presented recently by Liu and Zalik [14], eight angle changes are considered, based on the eight directions of a Freeman chain code. We call this code AF8 in this paper. The Liu and Zalik method appears to be a more efficient coding than the FCCED [1], FCCFD [1], DFC [10] and VCC [10] codes, as stated in Ref. [14]. We should mention that Liu and Zalik [14] applied their method to binary objects without holes and used the Huffman algorithm to compress the AF8 chain. They applied this method to various shapes chosen randomly from the web. However, they did not consider objects with holes in their study.
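The AF8 code of Liu and Zalik discussed above can be obtained from an F8 chain by replacing each absolute direction with the change of direction relative to the previous element. The sketch below uses the symbol assignment listed in Section 2.3; it assumes a closed chain and that positive angles are counter-clockwise, and the names `DELTA_TO_AF8` and `f8_to_af8` are hypothetical.

```python
# Direction change, in steps of 45 degrees counter-clockwise (mod 8),
# mapped to the AF8 symbol of the assignment given in Section 2.3:
#   0 -> 0, +45 -> 1, -45 -> 2, +90 -> 3, -90 -> 4,
#   +135 -> 5, -135 -> 6, 180 -> 7
DELTA_TO_AF8 = {0: 0, 1: 1, 7: 2, 2: 3, 6: 4, 3: 5, 5: 6, 4: 7}


def f8_to_af8(f8_chain):
    """Convert a closed F8 chain (absolute directions 0..7) into AF8 symbols."""
    previous = f8_chain[-1]  # closed contour: the last move precedes the first
    symbols = []
    for direction in f8_chain:
        symbols.append(DELTA_TO_AF8[(direction - previous) % 8])
        previous = direction
    return symbols


# A small square walked counter-clockwise, two steps per side.
square = [0, 0, 2, 2, 4, 4, 6, 6]
print(f8_to_af8(square))
```

Rotating the square by 90° changes every F8 symbol but leaves the AF8 chain unchanged, which illustrates why AF8 is rotation invariant while F8 is not.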
Table 1
Number of pixels in the contour image, and the comparison of JBIG and JBEAM methods (in bytes) applied to nine different contour shapes (countries' borders), after [18]

Pixels    JBIG (bytes)    JBEAM (bytes)
658       4360            3303
813       5792            4609
822       5768            4391
845       6088            4171
871       5792            4916
920       6448            5300
1233      7888            6114
1350      7992            6780
2911      15224           14555

Fig. 7. Compression rate of JBEAM with regard to JBIG. (Horizontal axis: pixels; vertical axis: (JBIG − JBEAM)/JBIG.)
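The quantity plotted in Fig. 7 can be recomputed directly from Table 1. The short sketch below does so; the function name `relative_gain` is hypothetical and the data are the nine rows of Table 1.

```python
# Rows of Table 1 (after [18]).
PIXELS = [658, 813, 822, 845, 871, 920, 1233, 1350, 2911]
JBIG_BYTES = [4360, 5792, 5768, 6088, 5792, 6448, 7888, 7992, 15224]
JBEAM_BYTES = [3303, 4609, 4391, 4171, 4916, 5300, 6114, 6780, 14555]


def relative_gain(jbig, jbeam):
    """Relative gain of JBEAM over JBIG: (JBIG - JBEAM) / JBIG."""
    return (jbig - jbeam) / jbig


gains = [relative_gain(a, b) for a, b in zip(JBIG_BYTES, JBEAM_BYTES)]
for n, g in zip(PIXELS, gains):
    print(f"{n:5d} pixels: gain = {g:.3f}")
```

The smallest gain is obtained for the largest contour (2911 pixels), which is the behavior that motivates the threshold discussion in the text.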
Some questions arise when we consider a larger set of objects, including objects with more complex shapes, one or more holes, and a larger number of pixels: (a) Is the efficiency in encoding contour shapes maintained for all these shapes with one or more holes? (b) Is the efficiency in encoding contour shapes maintained for all these shapes with a larger number of pixels? (c) How far are these seven chain-code methods from the JBIG standard? We address some of these questions experimentally in this work.

We state the following three main objectives of this paper:

1. We consider only lossless compression methods.
2. We compare the seven chain codes F8, F4, AF8, PRT, VCC3, VCC2, and 3OT, taking JBIG as a reference.
3. We apply the method to shapes of different sizes and to objects with holes.

We require the following concepts and terminology:

Perimeter-8: Depending on the geometry of the resolution cells and the neighborhood defined, there are different ways to cover the contour of an object shape. In this work we define the perimeter-8 of a shape as that given by squares as resolution cells and 8-connected neighborhoods.
Perimeter-4: This is the perimeter obtained with squares as resolution cells and 4-connected neighborhoods. Equivalently, perimeter-4 is the number of sides of the square grid that face the background of the image.
External contour: This is the trajectory followed along the border of a shape object without considering its holes.
Length chain: This is the number of symbols in a string obtained by a chain code.
Hole contour: This is the trajectory followed along the border of each hole of an object.

Let us denote by $L_{F8}$, $L_{F4}$, $L_{PRT}$, $L_{AF8}$, $L_{VCC2}$, $L_{VCC3}$, and $L_{3OT}$ the average number of bits/symbol of a string given by the F8, F4, PRT, AF8, VCC2, VCC3, and 3OT chain codes, respectively. The average number of symbols of each code is denoted by $S_{CODE}$, and the average length of a code in bytes by $L^{B}_{CODE}$.

3. Encoding the objects

The method used in this work to encode each object consists of visiting the external contour of each object in a clockwise sense, and the contours of holes in a counterclockwise sense. See Fig. 8.
These directions are very important for recovering the shape when interpreting the resulting chain code. All hole contours of an object are visited in a counterclockwise sense. A 3 × 3 window was used to follow the contour of each object shape, inspecting the direction changes of the discrete contour and assigning the corresponding code. To implement each of the chain codes, every shape object is kept in a monochromatic bitmap file. This file can be processed to put the information of every object in a matrix of zeros and ones (see Fig. 9); then we follow this procedure:

1. Scan the file, line by line, until a 1 is encountered.
2. Register the Cartesian coordinates of this first 1.
3. Visit the contour of the shape in a clockwise sense and inspect the vicinity with a 3 × 3 matrix, assigning the corresponding chain code to the contour using any of the seven codes.
4. Apply the Huffman algorithm to compress the chain code, obtaining the probability of appearance of each symbol.
6. For the holes, repeat steps 3 and 4, but now covering the contours in a counterclockwise sense and registering the Cartesian coordinates of the first 1 encountered.
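Step 4 of the procedure above can be sketched as follows. This is a generic Huffman construction written with Python's standard `heapq` module, not the authors' implementation; the function name `huffman_code_lengths` is hypothetical, and the symbol frequencies are those reported for the F8 chain of the Cyclist object in Table 5.

```python
import heapq
from itertools import count


def huffman_code_lengths(freqs):
    """Return {symbol: code length in bits} for a {symbol: frequency} mapping."""
    if len(freqs) == 1:
        return {symbol: 1 for symbol in freqs}
    tie = count()  # unique tie-breaker so that symbol lists are never compared
    heap = [(f, next(tie), [s]) for s, f in freqs.items()]
    heapq.heapify(heap)
    lengths = dict.fromkeys(freqs, 0)
    while len(heap) > 1:
        f1, _, group1 = heapq.heappop(heap)
        f2, _, group2 = heapq.heappop(heap)
        # Every symbol under the merged node gets one more bit.
        for symbol in group1 + group2:
            lengths[symbol] += 1
        heapq.heappush(heap, (f1 + f2, next(tie), group1 + group2))
    return lengths


# F8 symbol frequencies of the Cyclist object (Table 5).
cyclist_f8 = {0: 815, 1: 379, 2: 703, 3: 415, 4: 794, 5: 408, 6: 666, 7: 423}
lengths = huffman_code_lengths(cyclist_f8)
total_bits = sum(cyclist_f8[s] * lengths[s] for s in cyclist_f8)
print(lengths)
print(total_bits, "bits =", total_bits / 8, "bytes")
```

Only the code lengths are computed here, since they are all that is needed to obtain the storage size; the actual bit patterns depend on how ties and left/right branches are labeled.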
Fig. 8. A sample object encoded by a chain code: (a) letter a; and (b) the external contour is coded in a clockwise sense, while the hole contour is coded in a counterclockwise sense.
4. Application of chain codes to object shapes

We encode and represent contour shapes for a variety of irregular sample objects, shown in Fig. 10. The objects are composed of different numbers of pixels. However, to save space in this presentation, objects are shown at a scale that may not correspond to their original size. The correct size information and other shape parameters of each object are given in Table 2.

Fig. 10. Thirty-five different shape objects to be represented by chain codes. Shape objects are not at their original scales.

In Table 3 we include the length of each chain code applied to each object. F4, 3OT and VCC3 produce chains of the same length, since they visit the contour shape in the 4-neighborhood of each border pixel, while F8 and AF8 produce the shortest chains, since they visit the contour shape in the 8-neighborhood (see Fig. 5(a) and (b)). PRT produces the longest chain and introduces some distortion in the contour representation, owing to the way this algorithm covers the contour shape. $S_{AF8}$ and $S_{F8}$ are about 25% shorter than $S_{F4}$, $S_{3OT}$ and $S_{VCC3}$, and about 64% shorter than $S_{PRT}$.

We apply the encoding method explained in the last section, using each chain code presented in Section 2, to the objects given in Fig. 10. The first column of Table 3 shows the perimeter-4 of each contour shape, the second shows the perimeter-8, and the other two show the perimeters corresponding to the VCC2 and PRT codes, respectively. Fig. 11 was obtained after applying the Huffman algorithm to the chain code of each object. We can observe the linear behavior of the storage required by each chain code with respect to the contour length. According to these experiments, AF8 has the highest compression rate, although it is very close to JBIG and 3OT. The F8 encoder has the worst compression rate. The number of bytes required to encode each of the objects with the eight methods is shown in Table 4.

5. Thresholds in length chain codes

We obtain the graph given in Fig. 12 by fitting a linear function to each group of points representing the different chain codes. The lines are given by the following equations:

$$
\begin{aligned}
F_{JBIG} &= 0.2075x + 158.8, \\
F_{AF8} &= 0.2189x - 2.967, \\
F_{3OT} &= 0.2472x - 7.244, \\
F_{VCC2} &= 0.3009x - 7.577, \\
F_{VCC3} &= 0.2585x - 0.3829, \\
F_{F8} &= 0.3668x - 9.359, \\
F_{F4} &= 0.3285x + 6.158, \\
F_{PRT} &= 0.3401x - 1.983,
\end{aligned}
$$

where x represents the perimeter-8.
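The perimeter-8 value at which each fitted line crosses the JBIG line follows from equating the two linear functions. The sketch below is illustrative only: the name `crossover` is hypothetical, and the intercept signs are an assumption (minus signs appear to have been lost in the source, so every intercept except those of JBIG and F4 is taken as negative). Under that assumption the F8, PRT, F4, VCC3 and 3OT crossovers fall within one unit of the thresholds 1055, 1212, 1261, 3121 and 4182 quoted in this section, while the VCC2 and AF8 values come out close to, but not exactly at, the quoted 1796 and 13 670.

```python
# (slope, intercept) of each fitted line: bytes as a function of perimeter-8.
# Intercept signs are an assumption, see the accompanying text.
JBIG = (0.2075, 158.8)
FITS = {
    "AF8": (0.2189, -2.967),
    "3OT": (0.2472, -7.244),
    "VCC2": (0.3009, -7.577),
    "VCC3": (0.2585, -0.3829),
    "F8": (0.3668, -9.359),
    "F4": (0.3285, 6.158),
    "PRT": (0.3401, -1.983),
}


def crossover(code):
    """Perimeter-8 at which the fitted line of `code` meets the JBIG line."""
    slope, intercept = FITS[code]
    return (JBIG[1] - intercept) / (slope - JBIG[0])


for code in sorted(FITS, key=crossover):
    print(f"{code:5s} crosses JBIG at perimeter-8 = {crossover(code):8.1f}")
```

Below its crossover a chain code needs fewer bytes than JBIG according to the fitted lines; above it, JBIG is the more compact representation.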
H. Sánchez-Cruz et al. / Pattern Recognition 40 (2007) 1660–1674

Table 2
Number of pixels, number of holes and sizes of bi-level images

Object      Pixels     Holes   Size
Elephant    150155     2       518 × 505
Cobra       31019      1       246 × 371
Jaguar      298943     1       1036 × 523
Horse       42930      2       393 × 304
Shark       36347      1       302 × 258
Crab        4872       1       152 × 123
Karate      207139     4       602 × 602
Racket      5886       2       95 × 233
Scissors    1506       2       67 × 120
Crank       18594      8       193 × 252
Molecule    68524      3       595 × 540
Möbius      34863      2       444 × 284
Worm        1694       2       98 × 128
Tree        1930       2       113 × 138
e           874        1       41 × 44
p           674        1       50 × 56
Elvis       21976      2       189 × 302
Moto        431929     24      960 × 738
Amfut       110049     1       285 × 772
Cyclist     99487      13      527 × 491
Skull       1221425    2       1391 × 1333
Dyno        33307      2       301 × 168
Walker      4689       2       56 × 172
a           723        1       42 × 49
d           514        1       46 × 55
Surfing     60417      4       706 × 468
Car         185471     20      928 × 320
Plane       551776     10      1081 × 1039
b           614        1       33 × 41
Snake2      12473      2       289 × 104
Snake1      10361      1       253 × 118
Boa         52761      2       348 × 319
Tinamun     15472      1       187 × 225
Flamingo    20703      2       217 × 379
Duck        32224      2       231 × 302

The objects are listed in their order of appearance in Fig. 10, from left to right and from top to bottom.

Table 3
Length chains, in symbols, corresponding to different chain codes

Object      P4       P8      VCC2     PRT
Elephant    4244     3379    7478     8561
Cobra       2322     1666    4010     4629
Jaguar      5502     4041    9616     11045
Horse       3538     2610    6282     7063
Shark       1962     1407    3410     3888
Crab        1708     1286    3050     3396
Karate      5224     4362    10398    11948
Racket      1054     821     1862     2196
Scissors    684      551     1214     1356
Crank       2084     1508    3584     4152
Molecule    6674     4743    11392    13424
Möbius      2782     1964    4834     5636
Worm        1542     1107    2712     3136
Tree        1156     1082    2568     3048
e           292      212     534      552
p           328      221     568      680
Elvis       1524     1148    2662     2992
Moto        8080     5954    14472    16653
Amfut       3380     2647    5968     6720
Cyclist     6176     4603    10986    12657
Skull       9244     6861    16278    18248
Dyno        1240     940     2210     2444
Walker      682      506     1172     1364
a           324      235     556      631
d           256      182     438      512
Surfing     4192     3001    7386     8509
Car         5186     4160    9830     10732
Plane       12684    9591    23280    26149
b           196      155     354      396
Snake2      1258     919     2236     2520
Snake1      1176     801     2006     2364
Boa         2498     1905    4564     5340
Tinamun     1148     840     1996     2288
Flamingo    1722     1323    3016     3540
Duck        2004     1424    3474     4012

The F4, VCC3, and 3OT length chains are represented by P4, whereas the AF8 and F8 length chains are represented by P8.
The storage space required for the encoded contour shapes depends on the number of holes in each object: it increases for the chain codes as the number of holes increases. It is interesting to notice that the storage requirements also increase for the JBIG encoder, because the complexity of the bi-level image also increases. So, compression rate is directly dependent on the perimeter of the contour shape. Looking at the intersections of these lines, we found that all objects with a perimeter-8 length below 1055 are well compressed by all chain codes (see Fig. 13). These objects are: b, d, e, p, a, Walker, Scissors, Snake1, Racket, Tinamun, Snake2 and Dyno. Beyond this perimeter-8 length, F8 is less efficient than any other chain encoder. All chain codes are efficient for representing small figures, for example those smaller than 60 × 60 pixels. The rest of the chain codes are good enough in the perimeter-8 range 1055–1212. After 1212, PRT starts to be less efficient. The objects included in this interval are the same as before, plus Worm and Elvis. F4 and the rest of the chain codes are more efficient than JBIG up to a perimeter-8 length of 1261, very close to the limit of PRT. VCC2 intersects the JBIG curve at a perimeter-8 length of 1796, and VCC3 intersects JBIG at 3121. This interval (from 1261 to 3121) includes: Tree, Crab, Flamingo, Shark, Duck, Crank, Cobra, Boa, Möbius, Horse, and Amfut. 3OT intersects the JBIG curve at a perimeter-8 length of 4182, an interval that includes all the previous samples as well as the Surfing and Elephant objects. AF8 intersects JBIG at a perimeter-8 length of 13 670; all our sample objects lie within this interval. In Fig. 14 the relative efficiency of the 3OT code with respect to JBIG is presented. It can be seen that almost all the samples (30) are compressed better by 3OT than by JBIG. Only five objects were compressed more efficiently by JBIG than by 3OT.

Fig. 11. Number of bytes required to store every shape object as a function of its chain code length for the seven chain codes and the JBIG standard. (Horizontal axis: perimeter-8; vertical axis: bytes; curves: JBIG, AF8, 3OT, VCC3, VCC2, F8, F4, PRT.)

Table 4
Required storage $L^{B}$, in bytes, of different methods to encode binary objects after applying Huffman codes to all chains

Object      JBIG    AF8     3OT     VCC3    VCC2    F8      F4      PRT
b           116     29      33      34      43      52      49      49
d           144     45      50      50      55      67      64      64
a           150     54      62      62      70      86      81      79
p           170     53      58      67      71      78      82      85
e           180     41      56      56      67      74      73      69
Scissors    241     105     118     118     152     179     171     170
Walker      284     116     127     127     147     174     171     171
Racket      313     174     195     200     233     283     265     275
Snake1      347     185     205     240     251     295     294     296
Snake2      345     202     235     241     280     321     315     315
Tinamun     356     190     190     223     250     314     288     286
Dyno        366     191     226     229     277     325     310     306
Tree        369     243     274     306     321     374     379     381
Worm        382     247     283     306     339     406     387     392
Elvis       404     248     286     286     333     399     381     374
Flamingo    407     264     313     313     377     464     431     442
Crank       440     286     324     321     382     467     427     425
Shark       457     313     355     386     427     518     491     486
Crab        486     332     377     403     448     558     505     519
Duck        497     324     365     398     435     534     501     502
Cobra       501     367     421     457     502     618     580     579
Boa         556     415     483     519     571     715     665     668
Möbius      613     454     491     557     605     737     696     705
Amfut       691     550     608     610     746     915     846     840
Elephant    751     681     750     752     935     1144    1062    1071
Horse       775     595     667     654     786     947     886     883
Surfing     887     716     745     840     924     1081    1048    1063
Car         982     865     942     972     1229    1426    1297    1342
Karate      993     954     1086    1159    1300    1634    1307    1494
Jaguar      1001    871     1014    1055    1202    1513    1377    1381
Cyclist     1151    1018    1101    1199    1374    1723    1502    1583
Molecule    1195    1048    1205    1318    1424    1779    1669    1678
Moto        1391    1316    1477    1610    1809    2187    2021    2082
Skull       1453    1359    1733    1756    2035    2549    2311    2281
Plane       2211    2190    2408    2455    2910    3468    3293    3269
Fig. 14 also presents the relative efficiency of the AF8 code with respect to JBIG. It can be seen that all the objects are better compressed by AF8. From this set of 35 objects we conclude that more than half of the samples (22) have an efficiency between 20% and 80%; 28 samples have an efficiency of more than 10%, and only five objects were compressed with an efficiency of less than 10%. These results tell us that there might be a compression rate threshold of our images for the different chain codes. Fig. 14 shows that the compression efficiency of AF8 can be approximated by an interpolated exponential function:

$$\frac{\mathrm{JBIG} - \mathrm{AF8}}{\mathrm{JBIG}} \approx f e^{g x},$$

where $f = 0.7661$ and $g = -0.0005622$. It can be observed that this exponential function does not cross the x axis. On the other hand, the compression efficiency of 3OT can be approximated by a polynomial as follows:

$$\frac{\mathrm{JBIG} - \mathrm{3OT}}{\mathrm{JBIG}} \approx a_0 + a_1 x + a_2 x^2 + a_3 x^3 + a_4 x^4, \qquad (1)$$

where x is the perimeter-8, $a_0 = 0.7736$, $a_1 = -0.0005926$, $a_2 = 1.88 \times 10^{-7}$, $a_3 = -2.694 \times 10^{-11}$ and $a_4 = 1.335 \times 10^{-15}$. This curve crosses the x axis at the threshold 4623.
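The threshold of the polynomial fit can be checked numerically. The sketch below evaluates both fitted curves and locates the zero of Eq. (1) by bisection; the function names are hypothetical, and the signs of $g$, $a_1$ and $a_3$ and of the exponents are assumptions (they are not legible in the source and were chosen so that the exponential decays and the polynomial crosses zero near the stated threshold).

```python
import math

# Fitted parameters; the signs are assumptions, see the accompanying text.
F, G = 0.7661, -0.0005622
A = (0.7736, -0.0005926, 1.88e-7, -2.694e-11, 1.335e-15)


def efficiency_af8(x):
    """Exponential fit of (JBIG - AF8)/JBIG as a function of perimeter-8."""
    return F * math.exp(G * x)


def efficiency_3ot(x):
    """Polynomial fit of (JBIG - 3OT)/JBIG, Eq. (1)."""
    return sum(a * x**k for k, a in enumerate(A))


def bisect_root(func, lo, hi, tol=1e-6):
    """Find a zero of func in [lo, hi]; func must change sign on the interval."""
    if func(lo) * func(hi) > 0:
        raise ValueError("no sign change on the interval")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if func(lo) * func(mid) <= 0:
            hi = mid
        else:
            lo = mid
    return (lo + hi) / 2


root = bisect_root(efficiency_3ot, 4000.0, 5000.0)
print(f"3OT fit crosses zero near perimeter-8 = {root:.0f}")
print(f"AF8 fit at perimeter-8 = 13000: {efficiency_af8(13000):.5f}")
```

The exponential fit stays positive for every perimeter-8, so no zero crossing exists for AF8 within this model, whereas the polynomial fit for 3OT changes sign.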
Fig. 12. Chain codes as functions of perimeter-8 (or $L^{B}$). Number of bytes required to store every shape object as a function of perimeter-8 chain code F8 length. (Horizontal axis: perimeter-8; vertical axis: bytes.)

Fig. 13. All chain codes are more efficient than JBIG until 1006 units in perimeter-8 length. (Horizontal axis: perimeter-8; vertical axis: bytes.)
Now, from the frequency of the symbols in each AF8 chain code and the number of bits of each symbol, we obtain the average code length in bits/symbol:

$$L_{AF8} = \sum_{i=1}^{8} L_i P_i = 1.73 \ \text{bits/symbol}, \qquad (2)$$

where $L_i$ is the number of bits of each code symbol of AF8 and $P_i$ is the probability of that symbol appearing in each chain, calculated for the 35 shapes considered in this work. In Table 5 we show the details of the variable-length coding calculations for the Cyclist and Surfing objects after applying the Huffman algorithm to the chains F4, F8, VCC3, 3OT and AF8. We should mention that the comparison shown in Table 5 of Ref. [14] is not fair, since the authors compared chains with fixed-length coding against their proposed chain with variable-length coding. However, the Huffman algorithm can be applied to all chain codes. In Table 6 we show the entropy, as calculated by Eq. (3), and the average length of each of the codes:

$$H_{CODE} = -\sum_{i=1}^{N} P_i \log_2 P_i. \qquad (3)$$
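Eqs. (2) and (3) can be evaluated for a single object as a worked example. The sketch below uses the AF8 frequencies and Huffman code lengths reported for the Cyclist object in Table 5; it is an illustration for one shape only, so the resulting values differ from the 35-object averages of Table 6.

```python
import math

# Cyclist, AF8 chain (Table 5): symbol frequencies and Huffman code lengths,
# listed for the direction changes 0, +45, -45, +90, -90, +135, 180 degrees.
FREQ = [2413, 1009, 1037, 101, 40, 2, 1]
CODE_LEN = [1, 3, 2, 4, 5, 6, 7]

total = sum(FREQ)
probs = [f / total for f in FREQ]

# Eq. (2): average code length in bits/symbol.
avg_len = sum(length * p for length, p in zip(CODE_LEN, probs))

# Eq. (3): entropy in bits/symbol (symbols with zero probability contribute 0).
entropy = -sum(p * math.log2(p) for p in probs if p > 0)

print(f"symbols      = {total}")
print(f"average bits = {avg_len:.4f} per symbol")
print(f"entropy      = {entropy:.4f} bits per symbol")
print(f"storage      = {avg_len * total / 8:.2f} bytes")
```

The entropy is a lower bound on the average length of any uniquely decodable code, so the Huffman average can approach it but never fall below it.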
Table 5
Examples of variable-length coding of two objects, Cyclist and Surfing, after applying Huffman to the chains F4, F8, VCC3, 3OT and AF8

Cyclist

F4 symbol    Frequency    Huffman    Bits
0            1604         11         1604 × 2
1            1484         00         1484 × 2
2            1604         10         1604 × 2
3            1484         01         1484 × 2
Total        6176                    12 352 bits = 1544 B

F8 symbol    Frequency    Huffman    Bits
0            815          00         815 × 2
1            379          1100       379 × 4
2            703          101        703 × 3
3            415          010        415 × 3
4            794          111        794 × 3
5            408          1101       408 × 4
6            666          100        666 × 3
7            423          011        423 × 3
Total        4603                    13 781 bits = 1722.63 B

VCC3 symbol  Frequency    Huffman    Bits
0            1629         10         1629 × 2
1            2978         0          2978 × 1
2            1677         11         1677 × 2
Total        6284                    9590 bits = 1198.75 B

3OT symbol   Frequency    Huffman    Bits
0            2784         01         2784 × 2
1            3259         1          3259 × 1
2            133          00         133 × 2
Total        6176                    9093 bits = 1136.63 B

AF8 symbol   Frequency    Huffman    Bits
0°           2413         1          2413 × 1
+45°         1009         011        1009 × 3
−45°         1037         00         1037 × 2
+90°         101          0101       101 × 4
−90°         40           01001      40 × 5
+135°        2            010001     2 × 6
180°         1            0100001    1 × 7
Total        4603                    8137 bits = 1017.13 B

Surfing

F4 symbol    Frequency    Bits
0            1269         1269 × 2
1            827          827 × 2
2            1269         1269 × 2
3            827          827 × 2
Total        4192         8384 bits = 1048 B

F8 symbol    Frequency    Bits
0            716          716 × 2
1            335          335 × 3
2            194          194 × 4
3            305          305 × 4
4            618          618 × 2
5            356          356 × 3
6            248          248 × 4
7            229          229 × 4
Total        3001         8645 bits = 1080.63 B

VCC3 symbol  Frequency    Bits
0            1230         1230 × 2
1            1776         1776 × 1
2            1242         1242 × 2
Total        4248         6720 bits = 840 B

3OT symbol   Frequency    Bits
0            1720         1720 × 2
1            2412         2412 × 1
2            54           54 × 2
Total        4186         5960 bits = 745 B

AF8 symbol   Frequency    Bits
0°           1261         1261 × 1
+45°         861          861 × 2
−45°         807          807 × 3
+90°         25           25 × 5
−90°         43           43 × 4
+135°        3            3 × 6
−135°        1            1 × 7
180°         0            0
Total        3001         5726 bits = 715.75 B
Fig. 14. Compression rate efficiency of AF8 and 3OT with regard to JBIG as a function of chain code length. (Horizontal axis: perimeter-8; vertical axis: (JBIG − CODE)/JBIG.)
It should be noted that $H_{CODE} \le L_{CODE}$, as expected. Even though 3OT has a smaller entropy and average length, in bits/symbol, than AF8, the resulting average length in bytes is smaller for AF8. This is because the average length in symbols produced by the chain code AF8 is smaller than that produced by 3OT, that is, $S_{AF8} < S_{3OT}$. A similar situation occurs for the rest of the chain codes.
The presented chain codes are more efcient with regard to JBIG until certain length code limit. All objects are wellrepresented by any chain code when they are represented with contour under 1055 perimeter-8. For instance, after this length F8 is not recommended to represent contour objects. Despite PRTs poor represents the contour shape, it is not the worst compressor. In fact PRT remains between F4 and
H. Snchez-Cruz et al. / Pattern Recognition 40 (2007) 1660 1674 Table 6 Entropy and average code lengths of the different chain codes Code AF8 3OT VCC3 F4 F8 VCC2 PRT Entropy, HCODE (bits/symbol) 1.5481 1.1732 1.5270 2 2.9568 1 1 Average length (bits/symbol) LCODE 1.73 1.51 1.53 2 3 1 1 Average length (symbols) SCODE 2237.60 3019.26 3019.26 3019.26 2237.60 5318.06 6079.46
1673
Average length (Bytes) LB CODE 63.40 69.74 71.11 100.49 100.57 84.37 94.80
Table 7 The different chain codes presented from the best efcient to the worst, from up to bottom, respectively Efciency (From best to worst) 1 2 3 4 5 6 7 Chain code AF8 3OT VCC3 VCC2 F4 PRT F8 Number of symbols 8 3 3 2 4 2 8 Length of the chain Shortest Medium Medium Long Medium Longest Shortest Rotation invariant Yes Yes Yes Yes No Yes No
F8 in compression rate, being worse than F4 and better than F8, as a result of its number of code symbols (PRT has only two). Defining $S = \{S_{P4}, S_{P8}, S_{\mathrm{VCC2}}, S_{\mathrm{PRT}}\}$ as the set of distinct chain lengths, in symbols, we take PRT, the longest chain code, as the reference. The relative sizes of the chain codes with regard to PRT are given by

$$\mathrm{REL\_SIZE}_{\mathrm{CODE}} = \frac{S_{\mathrm{CODE}}}{S_{\mathrm{PRT}}}, \qquad (4)$$

where $S_{\mathrm{CODE}}$ is the average chain length of the objects for any of the chain codes of the set $S$. These averages are taken over the whole set of 35 objects. By evaluating Eq. (4) on the 35 objects and using Table 3, we have: $\mathrm{REL\_SIZE}_{P8} = 0.3691$, $\mathrm{REL\_SIZE}_{P4} = 0.5083$, and $\mathrm{REL\_SIZE}_{\mathrm{VCC2}} = 0.8804$. In the same order, and with PRT last, we can define the sizes as shortest, medium, long and longest. Table 7 summarizes some important properties of the chain codes compared in this work.

PRT is the worst way to cover the contour because it uses a very long chain: F4 chains are almost 50% shorter and AF8 chains about 64% shorter than PRT chains (see Table 3). However, its small number of code symbols partly compensates for that shortcoming. The AF8 code has three advantages that make it an efficient chain code: although it has eight symbols, three of them have a high probability of appearance in all the objects; it has the shortest chain length; and it is rotation invariant.

We can say that chain code efficiency is related to the following aspects:
(a) Properties of the chain code:
    Number of symbols that have a high probability of appearance.
    Length of the perimeter used to cover the contour shape.
    Invariance under rotations.
(b) Properties of the image:
    Number of pixels.

6. Conclusions

In this paper we have implemented and compared different modern methods to represent binary objects through the so-called chain codes, by encoding their contours. We have applied these chain codes to shapes with holes. All the considered objects have a discrete representation composed of resolution cells of the same size, so chain code symbols represent vectors of unit length. In order to study the efficiency of representing binary objects, we have compared their compression rates in a fair setting for all of them, i.e., we have applied the variable-length Huffman algorithm to all the produced chains.

Every chain code was found to be more efficient than JBIG up to a certain complexity of the shape of each object. For instance, 3OT was more efficient than JBIG for more than 85% of the sampled objects. Similarly, AF8 has the best efficiency of all considered codes for 100% of the objects. In particular, most chain codes are more efficient than JBIG for shapes with a perimeter-8 smaller than 1737 units. This fact was also observed for AF8 for objects having a perimeter-8 of about 13 000 units. This threshold represents a limit on the use of chain codes. However, there is a huge variety of shapes that can be encoded more efficiently than with JBIG by using any of these chain codes. We have found that, independently of the number of holes, a linear relationship exists between storage memory requirements and the perimeter of contour shapes.
References

[1] H. Freeman, On the encoding of arbitrary geometric configurations, IRE Trans. Electron. Comput. EC-10 (1961) 260–268.
[2] J.W. Mckee, J.K. Aggarwal, Computer recognition of partial views of curved objects, IEEE Trans. Comput. C-26 (1977) 790–800.
[3] M.D. Levine, Vision in Man and Machine, McGraw-Hill Publishing Company, USA, 1985.
[4] H. Freeman, Computer processing of line drawing images, Comput. Surv. 6 (1974) 57–97.
[5] F. Kuhl, Classification and recognition of hand-printed characters, IEEE International Convention Record, Pt. 4, 1963, pp. 75–93.
[6] R.D. Merrill, Representation of contours and regions for efficient computer search, Commun. ACM 16 (1969) 534–549.
[7] G.S. Sidhu, R.T. Boute, Property encoding: applications in binary picture encoding and boundary following, IEEE Trans. Comput. C-21 (1972) 1206–1216.
[8] A. Blumenkrans, Two-dimensional object recognition using a two-dimensional polar transform, Pattern Recognition 24 (1991) 879–890.
[9] E. Bribiesca, A geometric structure for two-dimensional shapes and three-dimensional surfaces, Pattern Recognition 25 (1992) 483–496.
[10] E. Bribiesca, A new chain code, Pattern Recognition 32 (1999) 235–251.
[11] M. Hoffman, Lossless bilevel image compression, in: Khalid Sayood (Ed.), Lossless Compression Handbook, Proceedings, Academic Press, New York, 2003.
[12] S. Papert, Uses of technology to enhance education, Technical Report 298, AI lab, MIT, 1973.
[13] E. Bribiesca, A chain code for representing 3D curves, Pattern Recognition 33 (2000) 755–765.
[14] Y.K. Liu, B. Žalik, An efficient chain code with Huffman coding, Pattern Recognition 38 (2005) 553–557.
[15] L. O'Gorman, Primitives chain code, in: A. Rosenfeld, L.G. Shapiro (Eds.), Progress in Computer Vision and Image Processing, Academic Press, San Diego, CA, USA, 1992, pp. 669–676.
[16] M. Seul, L. O'Gorman, M.J. Sammon, Practical Algorithms for Image Analysis, Cambridge University Press, Cambridge, 2006 (6th printing).
[17] H. Sánchez-Cruz, R.M. Rodríguez-Dagnino, Compressing bi-level images by means of a 3-bit chain code, SPIE Opt. Eng. 44 (2005) 1–8, 097004.
[18] X. Huo, J. Chen, JBEAM: multiscale curve via beamlets, IEEE Trans. Image Process. 14 (2005) 1665–1677.
[19] M.D. Reavy, C.G. Boncelet, An algorithm for compression of bilevel images, IEEE Trans. Image Process. 10 (2001) 669–676.
About the Author: HERMILO SÁNCHEZ-CRUZ received the B.Sc. degree in Physics from the National Autonomous University of México in 1995. He received the Ph.D. degree in Sciences from the National Autonomous University of México (UNAM) in 2002. He was an Assistant Researcher at the Institute of Applied Mathematics and Systems (IIMAS) at the National Autonomous University of México (UNAM), where he collaborated in projects related to biomedical images and to the recognition of Mesoamerican images. He is a full-time Professor at the Autonomous University of Aguascalientes (UAA), in México. His areas of interest are image compression, bi-dimensional and three-dimensional image recognition, and computer vision.

About the Author: ERNESTO BRIBIESCA received the B.Sc. degree in Electronics Engineering from the Instituto Politécnico Nacional in 1976. He received the Ph.D. degree in Mathematics from the Universidad Autónoma Metropolitana (UAM) in 1996. He was a researcher at the IBM Latin American Scientific Center and at the Dirección General de Estudios del Territorio Nacional (DETENAL). He is associate editor of the Pattern Recognition journal. He has twice been chosen Honorable Mention winner of the Annual Pattern Recognition Society Award. Currently, he is Professor at the Instituto de Investigaciones en Matemáticas Aplicadas y en Sistemas (IIMAS) at the Universidad Nacional Autónoma de México (UNAM), where he teaches graduate courses in Pattern Recognition.

About the Author: RAMON M. RODRÍGUEZ-DAGNINO has been a full-time Professor at Tecnológico de Monterrey (ITESM) from 1993 to the present. He obtained his Ph.D. degree from the University of Toronto, Canada, in 1993. His research interests are: video compression, stereoscopic video processing, multimedia services transport in high speed networks, performance analysis, and electromagnetic theory. He has been part of the technical committee of the SPIE conference Information Technology and Communications (ITCom) five times. He is an IEEE Senior Member, Mexican Academy of Sciences member, and National Researchers System member (Nivel II). He has published more than 20 journal papers.