1 Introduction

figure US_grayscale.png
(a) Original image (grayscale)
figure us_test_20.png
(b) Output image (20 regions)

figure us_test_06.png
(c) Output image (six regions)
figure us_test_04.png
(d) Output image (four regions)
Figure 1 Regions with similar pixel intensities on a grayscale image are sequentially merged to produce a cartoon-like image that resembles the original one.
In this project, we design an algorithm to sequentially merge adjacent regions with similar intensities to generate a cartoon-like image that resembles the input grayscale one. Initially, each pixel on a grayscale image is treated as a region; at each iteration, only the region pair with the lowest discrepancy measurement (defined later) are selected to be merged. When two regions are merged together, the colors of the pixels in the original regions are replaced with the mean pixel intensities of the newly combined one. Therefore, as shown in Fig. 1↑, after more and more iterations of merging proceed, the output image contains fewer and fewer growing homogeneous regions.
The discrepancy measurement between regions i and j is defined as follows:
D = SiUi + SjUj − SijUij.
In this definition, Si is the sum of image pixel intensities, Ai the number of pixels, and Ui (Ui = (Si)/(Ai)) the mean pixel intensity, all in region i. In addition, subscript ij indicates the newly generated region through the of regions i and j. According to this definition, any two neighboring regions share a D value, so efficiently locating the smallest member among all D’s is an important issue. Moreover, when two regions are merged, the information stored in the neighbors, such as discrepancy measurements, should be accordingly updated. Furthermore, the container storing the information of regions should be designed to exchange data quickly. Based on the aforementioned concerns, we adopt a special data structure in this project: a generic doubly linked list.

2 Doubly linked list

A doubly linked list can be visualized in Fig. 3↓.
figure general_linkedlist.png
Figure 3 Doubly linked list. The “current” pointer is designed for sequentially accessing of all nodes.
Compared to using arrays to store data, users do not have to predict the memory usage required by a linked list in advance. In addition, inserting a node into or deleting a node from a linked list is efficient. Most importantly, the concatenation of linked lists can be simply implemented using a few lines of code. Based on these merits (especially the last one), a linked list is used to store all regions of a grayscale image, as shown in Fig. 4↓.
figure PC_web/pics/regions_stored_in_linkedlist.png
Figure 4 A linked list storing all regions on an image

3 Region merging

A region merging case is illustrated in Figs. 7↓ and 8↓. Fig. 7↓ the situation before two of the regions with the smallest D are merged. Fig. 8↓ shows the result of the region merging. To facilitate region merging and information updating, each region is equipped with the following information:
1) One doubly linked list for storing the indices of pixels categorized into this region (placed at the top in each ellipse in Figs. 7↓ and 8↓)
2) One doubly linked list for storing the pointers referencing to the neighbors of this region (placed at the bottom in each ellipse)
3) One pointer referencing to the neighbor that has the smallest discrepancy with this region (arrow in red)
4) The smallest discrepancy measurement (shown in each ellipse)
5) Si, Ai, Ui, and other useful auxiliary information (not shown)
figure region_all_nodes.png

Figure 7 The demonstration of regions on an image. Each region, stored as a node of doubly linked list of Fig. 4↑, is equipped with some data structures to facilitate the merges of regions. In this example, because regions 1 and 2 the lowest discrepancy measurement among all pairs, they are chosen to merge each other, yielding Fig. 8↓.
figure region_all_nodes(reduced).png
Figure 8 The results after regions 1 and 2 are merged to produce region 1. Note that not only the data originally stored in regions 1 and 2 but also the data stored in their neighbors need updating based on the merging.
In our example shown in Figs. 7↑ and 8↑, regions are marked with numbers to elucidate their roles in the diagrams. At each iteration of region merging, the region pair with the lowest discrepancy measurement (regions 1 and 2 in Fig. 7↑) is selected to merge together to produce a new one (region 1). As such, the total number of regions on a grayscale image will reduce by one after each iteration.
When regions 1 and 2 are merged together, the following operations are executed:

4 Min-heap and back pointers

As mentioned in the first paragraph, the minimum is selected out of all discrepancy measurements at each region-merging iteration. Although we can simply loop through all nodes to find the minimum (easy but inefficient), we can adopt a min-heap with back pointers to boost the efficiency of locating the lowest discrepancy measurement among all region pairs.
Fig. 9↓ illustrates how a min-heap (the binary tree in the middle) and back pointers (pointers stored in array “bckptr” on the left) are connected. The min-heap (implemented by array “min-heap”) traces each entry in array “data,” in which each entry is bound with one element in array “bckptr.” As such, users can locate the tracer of a certain data element in the min-heap by simply tracing the corresponding back pointer of the data element.
The advantage of combining a min-heap with back pointer is that the back pointers improve the efficiency of maintaining the min-heap. Imagine that element 0 in array “data” were changed to 2.7. Without the existence of the back pointers, users must exhaustively search the min-heap to find the tracer of element 0 and subsequently rearrange a number of nodes to reflect the change of element 0. However, with the help of back pointers, users can directly access the tracer of element 0 in the min-heap without wasting CPU cycles to search the entire min-heap.
figure PC_web/pics/minheap_bckptr.png
Figure 9 Min-heap with back pointers. Users can locate the tracer of the first data element (1.7) in the min-heap (5) by using the back pointer bound to the first data element.

5 Associating min-heap and the into this project

The min-heap and back pointers in Fig. 9↑ require modifications when integrated to trace the discrepancy measurements of in all nodes. As shown in Fig. 11↓, elements in array “min-heap” trace the discrepancy measurements using pointers (symbolized by figure pointer.png ). Back pointers (symbolized by figure bckptr.png ) and discrepancy measurements are always bound together, so they are declared as the data members of regions.
The significance of the back pointers in this application is illustrated in Figs. 11↓ and 12↓ (note that these two figures correspond to Figs. 7↑ and 8↑). When regions 1 and 2 are merged to form region 1, the discrepancy measurements of the neighbors of 1 (regions 0 and 3) are changed correspondingly (D values painted with cyan), which triggers the rearrangement of some elements in the min-heap. With the back pointers, the elements whose positions in the min-heap need changing can be directly located (through cyan arrows).
figure linkedlist_minheap_backpointer_integration.png
Figure 11 Min-heap, back pointers, and regions. Back pointers are symbolized by figure bckptr.png . Note that this figure reflects the situation of Fig. 7↑.
figure linkedlist_minheap_backpointer_integration(update).png
Figure 12 The merge of region 1 and 2 change the D values in regions 0, 1, and 3, whose tracers in the min-heap may accordingly require relocating. With the help of back pointer, the complexity of this operation is O(1). Note that this figure reflects the situation of Figure 8↑.