Monday, September 12, 2016

Activity 5: Fourier Transform Model of Image Formation

The Fourier Transform (FT) decomposes a signal into its component frequencies. This process "transfers" us from the physical domain of a signal, i.e. the time or space domain, into the frequency domain. The FT gives a complex-valued function of frequency whose absolute value indicates the magnitude of a given frequency component and whose complex argument indicates its phase offset.

 The Fourier Transform of a signal is given by the integral

\[\begin{equation} \label{eqn:FT} F(f_x, f_y) = \int\!\!\int f(x,y) \exp[-i 2\pi (f_{x}x + f_{y}y)]\, dx\, dy \end{equation}\]

where $f_x$ and $f_y$ are the spatial frequencies along the orthogonal $x$ and $y$ directions.

In this activity, we explore the FT, its operations, and their uses in image processing.

For the first part, we familiarize ourselves with the Fast Fourier Transform (FFT), an algorithm developed by Cooley and Tukey. The FFT computes the Discrete Fourier Transform (DFT), which converts a finite sequence of equally spaced samples into an equally long sequence of complex values representing the signal's frequency components. The Cooley-Tukey algorithm recursively breaks down a DFT of composite size \(N = N_1N_2\) into smaller DFTs. Sounds complicated. Luckily, Scilab has the function fft2(), which computes the 2D FFT, and fftshift(), which interchanges the quadrants of the result so that the zero-frequency term sits at the center.
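As a quick sanity check on what fftshift() does, here is a minimal sketch (the 2x2 matrix is just an illustrative example):

    // fftshift swaps the quadrants of a matrix, moving the
    // zero-frequency (corner) term to the center
    A = [1 2; 3 4];
    disp(fftshift(A));   // prints [4 3; 2 1]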

Using the knowledge I acquired from Activity 3, I created a 128x128 image of a circular aperture with radius \(r = 0.1\). (Note: This size is used for all the images in this entry since the FFT algorithm works best on images whose dimensions are powers of 2.)
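For reference, here is a minimal Scilab sketch of how such an aperture can be generated (the coordinate range of -1 to 1 is my assumption; Activity 3 may have used different limits):

    // Generate a 128x128 binary circular aperture of radius 0.1
    nx = 128; ny = 128;
    x = linspace(-1, 1, nx);
    y = linspace(-1, 1, ny);
    [X, Y] = ndgrid(x, y);
    r = sqrt(X.^2 + Y.^2);        // radial distance from the center
    circle = zeros(nx, ny);
    circle(find(r <= 0.1)) = 1;   // white disk on a black background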

Figure 1. Circular aperture with r = 0.1

Fig.1, which shows a circle centered at the origin with \( r = 0.1\), was used as our first test object. We apply the FFT on the image and later invert it to recover the original image using the code below.

Figure 2. FFT Code
After reading the image using imread(), we FFT-ed it using the function fft2(). Since imshow() can only display 8-bit grayscale images, we included an extra step (Line 16 of the code in Fig. 2) to normalize the resulting images for display.
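In sketch form, the steps look like this (assuming the SIVP toolbox for imread()/imshow(); the filename and variable names are mine, not necessarily those in Fig. 2):

    // Forward FFT, shifted FFT, and "inverse" via a second forward FFT
    I = double(imread('circle.bmp'));  // hypothetical filename, assumed grayscale
    FI = fft2(I);                      // 2D fast Fourier transform (complex-valued)
    Fshift = fftshift(abs(FI));        // magnitude, zero frequency moved to center
    Iback = abs(fft2(fft2(I)));        // two forward FFTs give a flipped copy (up to scale)
    // normalize to 8-bit so imshow() can display it (cf. Line 16 in Fig. 2)
    imshow(uint8(255 * Fshift / max(Fshift)));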
Figure 3. a) Input image b) FT of image c) FT shifted d) Inverse FT
Fig. 3 shows the (a) input image and (b)-(d) the resulting patterns after it has undergone the Fourier transform process. As seen in Fig. 3b, the FT interchanges the quadrants along the diagonals, and the fftshift() function interchanges them back, as seen in Fig. 3c. Also in Fig. 3c, we note that the FT of the input circle results in concentric circles of varying intensity that seem to radiate from the center (which has the greatest intensity). In optics, this is the diffraction pattern of a circular aperture, known as the Airy pattern, and the concentric rings arise from interference. Applying the inverse FT returned the original image, as seen in Fig. 3d. The inverse FT is expected to give an inverted image of the original, but since the circle is symmetric in both directions, we seem to recover the input image exactly. Since we know that the FT returns a complex-valued function, we also display its real and imaginary components.

Fig.4 shows the real and imaginary components of the FT of our input circle. Notice that the real part resembles the FFT of the input. The imaginary part, on the other hand, seems to be a zoomed-in version of it, except that there is a discontinuity across it.
Figure 4. Real part (L) and Imaginary part (R) of the circle.
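These components come straight out of the complex FFT output; a short sketch, reusing FI from the code above:

    // Extract and display the real and imaginary parts of the FT
    Freal = real(FI);
    Fimag = imag(FI);
    scf(0); imshow(uint8(255 * abs(Freal) / max(abs(Freal))));
    scf(1); imshow(uint8(255 * abs(Fimag) / max(abs(Fimag))));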

Since the circle is a symmetric object, we did not see the expected inverting effect of the inverse FT. Now, we apply the same process (as in the code in Fig.2) to the letter "A" as the aperture.

Figure 5. Letter A.
Below are the results.

Figure 6. a) Input image b) FT of image c) FT shifted d) Inverse FT
Notice that the FT of the image in Fig.6c displays symmetry along the x-direction. I suppose the symmetry observed for any input image depends on the shape of the object itself in the spatial domain. Also, we now see that the inverse FT recovers the original input image, except that it is inverted. As with the circle, we also took the real and imaginary components of the letter A.
Figure 7. Real part (L) and Imaginary part(R) of "A"
As noted earlier, the FFT returns a complex-valued function, which means it is composed of real and imaginary parts. The image we see is the composite of the magnitude and phase of the different frequencies. For the letter "A", the real part also resembled the FFT of the original image and is symmetric along the x and y directions. Moreover, the imaginary part, as seen in Fig.7(R), looks like a close-up version of the real part.

We also applied the FT to other patterns, namely a) a corrugated roof, b) a simulated double slit, c) a square function, and d) a 2D Gaussian bell. For all these patterns, we only applied the fft2() and fftshift() functions, since the double fft2(fft2(image)) would only give us back the inverted input. These patterns serve as optical apertures (Figs. 8a-11a), and we find the resulting patterns (Figs. 8c-11c) as light passes through them. We will not give much focus to the (b) panels (Figs. 8b-11b) since the patterns there are unclear and sit almost at the edges, as indicated by the red arrows.
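Here is a sketch of how these test apertures can be synthesized (the sinusoid frequency, slit positions and widths, square size, and Gaussian variance are all assumed values):

    // Synthesize the four test apertures on a 128x128 grid
    [X, Y] = ndgrid(linspace(-1, 1, 128), linspace(-1, 1, 128));
    roof = sin(2*%pi*4*X);                        // corrugated roof: a sinusoid
    slits = zeros(128, 128);
    slits(find(abs(X-0.3) < 0.02 | abs(X+0.3) < 0.02)) = 1;  // double slit
    sq = zeros(128, 128);
    sq(find(abs(X) < 0.2 & abs(Y) < 0.2)) = 1;    // centered square function
    gauss = exp(-(X.^2 + Y.^2) / (2*0.2^2));      // 2D Gaussian bell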

Figure 8. Corrugated Roof
Fig.8c shows white dots whose intensity diminishes as they go farther from the middle. This is what we expect: the FT of a sinusoid concentrates its energy at the sinusoid's frequency, producing bright dots on either side of the central (DC) peak. We could think of it this way: a coherent monochromatic light source positioned orthogonal to the corrugated aperture gets diffracted into a few sharp orders, while the undiffracted beam remains focused in the middle.

Figure 9. Simulated Double Slit

We recover the interference pattern observed in the double slit experiment. Though it is not very clear, Fig.9b shows a repetitive pattern along the horizontal; what is not seen are the fringes along the bottom part of the image. The simulated double slit aperture was successful in producing the expected interference patterns.
Figure 10. Square Function
Just like in the first part of this activity, we observe a concentric structure for the square function aperture (Fig.10), with lines radiating from a point along the x and y directions. If we zoom in on Fig.10c, we realize that the white spot is square in shape, except that its edges appear smeared.

Figure 11. 2D Gaussian Bell
Lastly, we consider a 2D Gaussian bell aperture. One obvious characteristic of the Gaussian bell is that it is most intense in the middle and slowly decays toward zero ($\rightarrow 0$) away from the center. If we look closely at Fig.11c (encircled with red broken lines), we see that after applying the FT, we get a single white dot in the middle. The interference patterns are no longer observable! This is no surprise: the FT of a Gaussian is itself a Gaussian, so there are no sharp edges to produce ringing, and after applying the abs() function, which gives us the magnitude, only the bright central region stands out.

Linear operations in one domain, e.g. the time domain, have corresponding operations in the other domain, i.e. the frequency domain, and vice versa. One such operation is convolution, which is basically the smearing of one function against another such that the resulting function looks a little like both.

\[\begin{equation} \label{eqn:convolution} h(x,y) = \int\!\!\int f(x', y')\, g(x-x', y-y')\, dx'\, dy' \end{equation}\]

where $h(x,y)$ is the resulting convolution of the two 2D functions $f$ and $g$.


\[\begin{equation} \label{eqn:convolution2} h = f*g \end{equation}\].

Eqn. 3 is the shorthand expression for Eqn. 2. Consider Figure 12, a monochrome logo of the acronym "VIP" in bitmap format, as one function, say $f(x,y)$; this is our input object. What we want to do in this section is to see what happens if we make "VIP" pass through a circular aperture, which we consider to be an imaging device or a black box. Let the aperture be $g(x,y)$. Figure 14 shows the results of the convolution between the two for varying aperture radii $r = [0.025, 0.05, 0.1, 0.3, 0.5, 0.7, 0.9]$. Note that the convolution is not computed by multiplying the two functions directly: by the convolution theorem, it is their Fourier transforms that get multiplied.
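Explicitly, taking the FT of Eqn. 2 turns the convolution integral into a product of the individual transforms,

\[ H(f_x, f_y) = F(f_x, f_y)\, G(f_x, f_y) \]

so two images can be convolved by multiplying their FTs element-wise and transforming back.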

Figure 12."VIP" in monochrome bmp. format
By the way, here is the code I used for convolution.

Figure 13. Code used to find convolution
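In sketch form (assuming SIVP, and that the aperture image is centered, so it must be fftshift-ed to match the corner-centered layout of fft2's output; filenames are hypothetical):

    // Convolution via multiplication in the frequency domain
    vip  = double(imread('VIP.bmp'));
    aper = double(imread('circle_r01.bmp'));   // aperture of a given radius
    Fvip = fft2(vip);                // FT of the object
    H = fftshift(aper);              // aperture acts as the transfer function
    conv = abs(fft2(Fvip .* H));     // second fft2 transforms back (flipped)
    imshow(uint8(255 * conv / max(conv)));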

Figure 14. Circular apertures of varying radius and resulting convolved images.


The effect of the aperture size is apparent in Figure 14. As the radius is increased, the resulting image becomes more defined, as we would expect in an actual setting: the bigger the aperture of a camera, the more light (and the more spatial frequencies of the scene) it lets through, resulting in a sharper photo.
Figure 15. Spatial functions $f$ and $g$ we want to correlate
Another operation that we can perform on two spatial functions $f(x,y)$ and $g(x,y)$ is to find their correlation, or the degree of similarity between them. Mathematically, correlation is defined as

\[\begin{equation} \label{eqn:correlation} p(x,y) = \int\!\!\int f(x',y')\, g(x+x', y+y')\, dx'\, dy' \end{equation}\]

In a more practical perspective, this operation can be used to recognize similar patterns in text or images, and also in template matching. Consider the two images in Fig.15 to be the functions $f$ and $g$ respectively. The goal is to find $g$, given as the letter A, inside $f$, which is the statement.

To execute the process, here is the code:

Figure 16. Code to find correlation
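A minimal sketch of the correlation step (assuming SIVP; both images must have the same dimensions, and the filenames are hypothetical). Note that it differs from convolution only in the conjugation of one transform:

    // Template matching via correlation in the frequency domain
    text = double(imread('text.bmp'));   // the statement image
    tmpl = double(imread('A.bmp'));      // the letter A, padded to the same size
    P = fft2(tmpl) .* conj(fft2(text));  // conjugate one of the transforms
    corr = fftshift(abs(fft2(P)));       // bright peaks mark the matches
    imshow(uint8(255 * corr / max(corr)));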

Figure 17. Main text (L), Correlation result (M)  and mapped location of similar patterns in the text (R).

Fig.17(M) shows the correlation between the images in Fig.15. It is a distorted, tonally inverted image of the original text in Fig.15(L), with five white spots whose intensity is greater than their surroundings; these peaks mark where the template matches. By eye inspection, we can confirm their locations by comparing Fig.17(L) with the rightmost image.

Now that we are familiar with the Fourier transform, convolution, and correlation (with correlation looking similar to convolution except that we take the conjugate of one function), we explore another useful application that combines the three: edge detection. Using the same spatial function $f(x,y)$ (the VIP logo), we want to detect its edges using small pattern matrices whose elements sum to zero: a) horizontal, b) vertical, c) diagonal, d) spot, and e) random. These matrices, when converted into images themselves, act as another spatial function $g(x,y)$; a sketch of the procedure follows the code figure below.

Figure 18. Edge detection code using different pattern matrices
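A minimal sketch of this step; the 3x3 kernels follow the description in the text (each sums to zero), and conv2() performs the spatial convolution directly:

    // Edge detection by convolving VIP with zero-sum 3x3 kernels
    vip = double(imread('VIP.bmp'));          // hypothetical filename
    kh = [-1 -1 -1;  2  2  2; -1 -1 -1];      // horizontal
    kv = kh';                                 // vertical (transpose)
    kd = [ 2 -1 -1; -1  2 -1; -1 -1  2];      // diagonal
    ks = [-1 -1 -1; -1  8 -1; -1 -1 -1];      // spot
    edges = abs(conv2(vip, kh, "same"));      // repeat for the other kernels
    imshow(uint8(255 * edges / max(edges)));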
Figure 19. Pattern matrices and resulting edge images
Recognize that the detected pattern is decided by the arrangement of the elements. For instance, the horizontal pattern matrix (Fig.19a) has 2's lined up in the middle row, with all surrounding elements equal to -1. The same goes for the vertical (Fig.19b) and diagonal (Fig.19c) patterns. Note that the edges that are best recovered have orientations corresponding to the matrix: the horizontal pattern matrix recovered the horizontal lines (or pixels) in "VIP" very well, and the same goes for the vertical and diagonal (most especially in the letter V, which is basically two diagonal lines meeting at a point). The spot pattern matrix (Fig.19d) recovered all the edges nicely, compared to the random matrix in Fig.19e, which returned thicker yet ragged edges.
Figure 20. Matrix whose elements do not sum to zero
I was curious as to why the elements of edge-detection matrices should sum to zero, so I tried a random matrix with a non-zero sum and got the result in Fig.20. In hindsight it makes sense: a kernel with a non-zero sum responds to the image's overall (DC) brightness and not just its changes, giving considerable magnitude to every pixel, so it recovers the entire VIP logo instead of just the edges. Haha, silly me!

I would like to acknowledge my classmate, Carlo Solibet, for teaching me the function imwrite() in Scilab, which automatically saves my images to my computer. It took a large portion of the work out! I used to take snapshots of my screen, transfer them to Paint, and save the image. :)) Now, I know better. He also greatly helped me debug my code whenever I got weird-looking plots.


References:

[1] M. Soriano, "Fourier Transform Model of Image Formation", Activity 5 Manual. (2016)
