1 Introduction

figure mandelbrot_set.png
Figure 1 The Mandelbrot set
In this project, we create and display the Mandelbrot set (the black points in the above figure), a math set that satisfies
(1) M  = {c ∈ C|limn → ∞|Zn| ≠ ∞}        Zo  = c       Zn + 1  = Z2n + c , 
in which M is the Mandelbrot set and C is the set of all complex numbers.
Creating the Mandelbrot set based on the equations above is impossible, for the ranges of both n and |Zn| in the first equation are infinity instead of specific numbers. However, we can narrow the ranges down to a reasonable scope for the numerical implementation of the Mandelbrot set. Heuristic experiments and deductions have shown that a number c belongs to the Mandelbrot set if the following equations are true:
(2) M  = {c ∈ C||Zn| ≤ 2,  n = 0, 1, ⋯1999}        Zo  = c       Zn + 1  = Z2n + c .
That is, we can categorize a complex number c as a member of the Mandelbrot set if the first equation holds within the first 1999 iterations of the third equation; otherwise, this number does not belong to the Mandelbrot set. Even though the aforementioned specifications change the implementation of the Mandelbrot set into a feasible task, this task is still computationally expensive.
In this project, various libraries are used for the computation and display of the Mandelbrot set: MPI for distributedly computing the Mandelbrot set, OpenGL for displaying the set, POSIX Threads for handling events input by users, such as mouse clicking and region drawing, and STL container for storing intermediate computation results.

2 Demonstration

figure MBS_demo1.png
(a)  → 
figure MBS_demo2.png
(b)  → 
figure MBS_demo3.png
(c)  → 
figure MBS_demo4.png
(d)  → 

figure MBS_demo5.png
(e)  → 
figure MBS_demo6.png
(f)  ← 
figure MBS_demo5_5.png
(g)  → 
figure MBS_demo6_5.png
(h)  → 
Figure 2 Visual details of the Mandelbrot set. The symbol “ → ” indicates that the drawn square (red) is zoomed in, yielding the pattern in the following figure. By contrast, the symbol “ ← ” indicates the opposite operation of “ ← .”
To see the entire set, we initially limit the display area (the contents in the window in (a↑)) to be
 − 2.0 ≤ Re(c) ≤ 1.0
and
 − 1.2 ≤ Im(c) ≤ 1.8.
This area is equally discretized by a finite number of samples (512 × 512). A user can zoom in on any local region on the display area to see the details by drawing a square (shown in red) to define the region of interest. While the user clicks on the viewport and drags the mouse cursor over a range, OpenGL Utility Toolkit (GLUT) is responsible for detecting the positions of the mouse cursor in terms of pixels; then, the boundaries and positions of the drawn square will be converted into the corresponding complex coordinates for subsequent calculation and visualization.
figure perspectiveMBset.png
Figure 4 Double color buffers. The front buffer is shown on the left. The other is the back buffer. In this project, the back buffer is configured as the backup storage for the contents that will be superimposed by the drawn square in the front buffer.
When an application is required to yield smooth animations, the OpenGL frame buffer is deliberately configured to have double color buffers (front and back). While the rendering in the front buffer is shown on the screen, the contents that will be subsequently displayed are being rendered in the back buffer. When the contents on the screen need refreshing, the roles of the two buffers are swapped. As such, users can see a smoother animation than is the case when only one color buffer is used.
In this program, we still set double color buffers; however, they are not configured as mentioned above. Instead, they are utilized to improve the rendering efficiency with the trade-off of slight loss of visualization quality while the red square is being drawn. Note that the red square is superimposed on the existing contents in the window, so only the pixels on the boundaries of the square are updated to red. Hence, we 1) disable the swapping between the front and back buffers, 2) use the back buffer to store the existing contents (the rendering without the superimposed red square, as shown in the back buffer of 4↑), 3) copy the contents from the back buffer to the front one, 4) superimpose the red boundaries onto the copied contents, and 5) constantly refresh and display the contents in the front buffer on the screen.
Once the drawing is finished (the user releases the pressed button on the mouse), the program immediately rediscretizes the drawn region into 512 × 512 samples, checks which samples belong to the Mandelbrot set, paints samples with corresponding colors (described later), and refreshes the contents in the window for display. As mentioned earlier, if the corresponding complex coordinates of a sample iteratively satisfy the equations (2↑) for a full 2000 iterations, this sample belongs to the Mandelbrot set and we paint it black. If any of the equations does not hold at the Nth iteration (N < 2000) when the corresponding complex coordinates of a sample are substituted into the equations (2↑), then this sample does not belong to the Mandelbrot set. We paint all samples like this with colors according to various N values. To sum up, in the figures above, all points with the same color require the same number of iterations to verify that they do or do not belong to the Mandelbrot set.
The computation cost of validating 512 × 512 samples is high, so once a result is obtained, it is stored in an STL vector (used as a stack) while the user proceeds to zoom in on the complex plane. As such, a user can undo the rendering of the previous zooming level (as changes from Figs. e↑ to g↑) without resorting to recomputation: A region is selected on Fig. e↑; after computation, the result is shown in Fig. f↑. If a user is not satisfied with the region of interest in Fig. e↑, the user can return back to the previous result and select a different region (as shown in Fig. g↑), yielding the result in Fig. h↑.

3 Algorithm design

figure MB_set_illustration3.png
Figure 5 Distributedly validate which samples belong to the Mandelbrot set
This figure illustrates how the required computation for validating 512 × 512 complex numbers is distributed to a number of processors. When the program launches, a processor (processor 0 in Fig. 5↑) is responsible for listening to the events created by the user. Hence, processor 0 runs the OpenGL Utility Toolkit (GLUT) using one thread (thread 0 in the figure) to handle the input events. Meanwhile, other processors (processors 1, 2, and 3) simply wait for the data passed from processor 0 using block receives (explained later).
Once a user finishes drawing a red square on the window, processor 0 creates one more thread (thread 1 in the figure) to 1) convert confined image pixels in the drawn square to the corresponding complex numbers, 2) segment the total complex numbers into parts and send each part to a processor for computation through nonblocking sends, 3) attempt to receive the computation results passed from other processors using nonblocking receives, 4) processes one part of complex numbers, and 5) “MPI_Testall” to ensure that results from other processors are successfully received. Meanwhile, processors 1, 2, and 3 process the data sent from processor 0 and send the computation results back to processor 0 using blocking sends.
Note that the blocking send and receive in the implementation of the MPICH2 MPI library (developed by Argonne National Laboratory) use busy-waiting. That is, computation slaves (processors 1--3 in Fig. 5↑ or CPUs 1--7 in Fig. a↓) will be staying in full load while they execute the blocking send and receive of the MPICH2 MPI library to wait for the exchanges of data. The remedy for this unfavorable waste is to use the built-in nonblocking send and receive, a while loop, and the usleep function to implement a customized blocking send and receive. Fig. b↓ exhibits the usage of CPUs after this strategy is applied to the slave processors. As shown, processors are in full load only when they perform computation. Otherwise, they remain idle.
figure cpu100always.png
(a) All CPUs except the one responsible for listening to the input events are in full load when the MPICH2 blocking send and receive are executed.
figure cpu100.png
(b) With our customized blocking send and receive, CPUs are in full load when they perform computations.
Figure 6 CPU uses

4 Miscellaneous

figure MBset_inheritance.png
Figure 7 Inheritance of data structure
Using inheritance when coding the program of this project can systematically simplify the structure and increase the reusability of the code. We can set the drawn red square (referred to as “complex region boundaries,” the top block of 7↑) as the base class of the inheritance mechanism and create a derived class accounting for the data passed between processors (referred to as “partial complex region,” the middle block). We can even design a class representing the collection of all distributed data (shown at the bottom) using the inheritance from the partial complex region.