Wednesday, December 7, 2016

Activity 11: Basic Video Processing


This activity is the last of my Applied Physics 186 activities. Here, we apply the image processing techniques we've learned during the semester to a video, which is simply a sequence of images played continuously, to extract information such as the position of a ball in time.

Consider the simple mechanics problem shown below: a ball is released from one end of an inclined plane. Find (a) its position through time and (b) the acceleration due to gravity, $g$.

We need to isolate the ball using color segmentation. Vid. 1 is the 5-second video we want to process. We take each frame as an image and isolate the ball by taking a patch on the surface of the tennis ball as our region of interest (ROI) and then applying parametric segmentation. To learn more about this technique, you can revisit my Activity 7 here.
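As a rough illustration of parametric segmentation, here is a minimal plain-Python sketch. It assumes the usual approach of modeling the ROI's normalized chromaticities (r, g) as independent Gaussians and keeping pixels whose probability is high. The function names, the `threshold` value, and the small floor on the standard deviation are my own assumptions for illustration, not the code I actually ran.

```python
import math

def chromaticity(pixel):
    """Return the normalized chromaticity (r, g) of an (R, G, B) pixel."""
    R, G, B = pixel
    total = R + G + B
    if total == 0:
        return 0.0, 0.0
    return R / total, G / total

def gaussian(x, mu, sigma):
    """1-D Gaussian probability density."""
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

def mean_std(values):
    mu = sum(values) / len(values)
    var = sum((v - mu) ** 2 for v in values) / len(values)
    # Assumed floor so a perfectly uniform ROI does not give sigma = 0.
    return mu, max(math.sqrt(var), 1e-3)

def segment(image, roi_pixels, threshold=0.1):
    """Return a binary mask: 1 where a pixel's chromaticity resembles the ROI."""
    rs, gs = zip(*(chromaticity(p) for p in roi_pixels))
    mu_r, sig_r = mean_std(rs)
    mu_g, sig_g = mean_std(gs)
    # Joint probability at the ROI mean, used to normalize to the range [0, 1].
    peak = gaussian(mu_r, mu_r, sig_r) * gaussian(mu_g, mu_g, sig_g)
    mask = []
    for row in image:
        mask_row = []
        for px in row:
            r, g = chromaticity(px)
            p = gaussian(r, mu_r, sig_r) * gaussian(g, mu_g, sig_g)
            mask_row.append(1 if p / peak >= threshold else 0)
        mask.append(mask_row)
    return mask

# Hypothetical data: a yellow-green ROI patch and a tiny 2x2 "frame".
roi = [(200, 220, 40), (190, 210, 50), (210, 230, 45)]
frame = [[(100, 100, 100), (205, 225, 42)],
         [(195, 215, 48), (30, 30, 200)]]
mask = segment(frame, roi, threshold=0.1)
```

Working in chromaticity space rather than raw RGB makes the mask less sensitive to shading on the curved surface of the ball.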





After the parametric segmentation, there are still some other blobs that we want to disregard completely. To remove them, I used morphological operations: first, label each blob using the SearchBlobs() function, and then filter the blobs by size using FilterBySize(). The following video shows the isolated ball.
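To make the label-then-filter idea concrete, here is a small plain-Python stand-in. It is not the implementation behind SearchBlobs() or FilterBySize(); the names `label_blobs` and `filter_by_size`, the choice of 4-connectivity, and the `min_size` parameter are assumptions made for this sketch.

```python
from collections import deque

def label_blobs(mask):
    """Label 4-connected blobs in a binary mask. Returns (labels, n_blobs)."""
    rows, cols = len(mask), len(mask[0])
    labels = [[0] * cols for _ in range(rows)]
    current = 0
    for i in range(rows):
        for j in range(cols):
            if mask[i][j] and labels[i][j] == 0:
                # New blob found: flood-fill it with the next label.
                current += 1
                labels[i][j] = current
                queue = deque([(i, j)])
                while queue:
                    y, x = queue.popleft()
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if (0 <= ny < rows and 0 <= nx < cols
                                and mask[ny][nx] and labels[ny][nx] == 0):
                            labels[ny][nx] = current
                            queue.append((ny, nx))
    return labels, current

def filter_by_size(labels, n_blobs, min_size):
    """Keep only blobs with at least min_size pixels; return a binary mask."""
    sizes = [0] * (n_blobs + 1)
    for row in labels:
        for lab in row:
            sizes[lab] += 1
    return [[1 if lab and sizes[lab] >= min_size else 0 for lab in row]
            for row in labels]

# Hypothetical mask: one 4-pixel blob (the "ball") and one stray pixel.
mask = [[1, 1, 0, 0],
        [1, 1, 0, 1],
        [0, 0, 0, 0]]
labels, n_blobs = label_blobs(mask)
clean = filter_by_size(labels, n_blobs, min_size=3)
```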


Below is the code I used to isolate the ball and compute its centroid.




Now that we have a single blob to track, we find its position by calculating its centroid. The centroid is the (x, y) coordinate of the blob's center of mass in the image. The problem is easier to solve in polar coordinates, but since the centroid for every frame is an ordered pair (x, y), we first work in Cartesian coordinates and then convert to polar. The following images contain the schematic diagram of the problem and the derivation of the important equations.
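The centroid and the Cartesian-to-polar conversion can be sketched in a few lines of plain Python. The helper names `centroid` and `to_polar` are hypothetical, and the example origin is made up; in image coordinates y increases downward, so the angle returned here is measured below the horizontal.

```python
import math

def centroid(mask):
    """Return the (x, y) centroid of the nonzero pixels, or None if empty."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return sum_x / count, sum_y / count

def to_polar(x, y, origin):
    """Convert (x, y) to (s, theta) relative to origin, theta in radians."""
    dx, dy = x - origin[0], y - origin[1]
    return math.hypot(dx, dy), math.atan2(dy, dx)

# Hypothetical 2x3 mask whose blob occupies the two right-hand columns.
blob = [[0, 1, 1],
        [0, 1, 1]]
cx, cy = centroid(blob)
s, theta = to_polar(4, 6, origin=(1, 2))
```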
Figure 1. Schematic diagram with the origin at the top of the ramp. Polar and cartesian relations are also derived.

We take the origin to be at the top of the ramp, which is inclined at an angle $\alpha$ and has a length $s_{max}$. The relationships between the Cartesian and polar quantities are also shown.

Figure 2. Energy relations for a hollow ball, from which we derived the velocity $\dot s$ at the bottom of the incline

We consider the tennis ball to be a hollow sphere with a moment of inertia $I = \frac{2}{3}mr^2$ that rolls without slipping along the incline. From conservation of energy, where the potential energy lost equals the kinetic energy gained (translational plus rotational), we can derive its final speed $\dot s$.

Figure 3. We take the derivative  of the expression for the velocity $\dot s$ in Figure 2 to get the acceleration of the ball along the ramp and then use trigonometry to get the acceleration along the horizontal.
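
For readers who cannot see the figures, the steps described in Figures 2 and 3 can be reconstructed as follows. This is a sketch from the stated assumptions (hollow sphere, rolling without slipping, released from rest at the origin, $s$ measured along the ramp), not a transcription of the figures.

\[ \begin{aligned} m g\, s \sin\alpha &= \tfrac{1}{2} m \dot s^2 + \tfrac{1}{2} I \omega^2, \qquad I = \tfrac{2}{3} m r^2, \quad \omega = \frac{\dot s}{r} \\ m g\, s \sin\alpha &= \tfrac{1}{2} m \dot s^2 + \tfrac{1}{3} m \dot s^2 = \tfrac{5}{6} m \dot s^2 \\ \dot s^2 &= \tfrac{6}{5}\, g\, s \sin\alpha \\ 2 \dot s \ddot s &= \tfrac{6}{5}\, g \sin\alpha \, \dot s \;\Rightarrow\; \ddot s = \tfrac{3}{5}\, g \sin\alpha \\ \ddot x &= \ddot s \cos\alpha = \tfrac{3}{5}\, g \sin\alpha \cos\alpha = \tfrac{3}{10}\, g \sin 2\alpha \end{aligned} \]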

Figure 4. The position of the ball per frame is tracked and then plotted against time

Images were extracted from the video at a rate of 25 frames/second, and the position of the ball was extracted for each frame. A pixel location plotted against time is not physically meaningful, so we need to convert it into real physical units. To do this, we compute the calibration factor $C$, which gives the physical length covered by one pixel. The computation is shown below.

$L_{actual} = 7 ft$

$end_1 = (828,35)$
$end_2 = (5,104)$

Since the length is oriented along the diagonal, we cannot simply subtract the coordinates like we did in Activity 2. To find the length of the incline, we employ the distance formula.

$L_{image} = \text{end-to-end distance} = \sqrt{(828-5)^2 + (35-104)^2} = 825.887 px$

7 ft. = 825.887 px

$(7ft.)\left(\frac{12 in.}{1 ft}\right)\left(\frac{2.54 cm.}{1 in.}\right)\left(\frac{1 m}{100 cm}\right) = 2.1336 \, m$

$C = \frac{2.1336 m}{825.887 px} = 2.58 \times 10^{-3} \,  \frac{m}{px}$

And the incline is tilted at an angle of $\alpha = \arctan\left({\frac{|35- 104|}{|828-5|}}\right)= 4.79224 ^{\circ}$
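
The calibration arithmetic above can be checked with a few lines of Python. The endpoint coordinates and the 7 ft length are the values given above; the variable names are mine.

```python
import math

# Endpoints of the incline in pixel coordinates, as listed above.
end_1 = (828, 35)
end_2 = (5, 104)

# Actual ramp length: 7 ft converted to meters.
L_actual_m = 7 * 12 * 2.54 / 100

dx = end_1[0] - end_2[0]
dy = end_1[1] - end_2[1]

# End-to-end distance in pixels from the distance formula.
L_image_px = math.hypot(dx, dy)

# Calibration factor in meters per pixel.
C = L_actual_m / L_image_px

# Tilt angle of the incline in degrees.
alpha_deg = math.degrees(math.atan(abs(dy) / abs(dx)))

print(f"L_image = {L_image_px:.3f} px, C = {C:.3e} m/px, alpha = {alpha_deg:.3f} deg")
```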

Now we know that one pixel represents a distance of $2.58 \times 10^{-3}$ m. The next step is to multiply all the x-pixel coordinates by this factor to get the position in space, and then plot this against time, which is the frame number multiplied by $\frac{1}{25}$ s, the time per frame.
Figure 5 shows the plot of the ball's position in time as extracted from the video frames. Now we can compare the coefficients of the fit with the theoretical values as shown in Figure 4.



Figure 5. Position of ball in time with quadratic fit


We know that the concavity of the curve represents the acceleration $\ddot x$ of the ball. For a quadratic fit $x = at^2 + bt + c$, this gives $\ddot x = 2a = \frac{3}{10}\,g\,\sin 2\alpha$. Isolating $g$, we get


$g = \frac{20\,a}{3\,\sin 2\alpha}$ where $a$ is the coefficient of the quadratic term in the fit to the experimental plot of position vs. time shown in Figure 5. The calculated value for the acceleration due to gravity is $g = 9.961863 \frac{m}{s^2}$, which deviates from the theoretical value by $1.5 \%$.
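
Here is a minimal sketch of the fit-then-solve step in plain Python. It runs on synthetic data generated with $g = 9.81$ m/s², not on my measured positions, and it assumes the hollow-sphere rolling model, for which $\ddot x = \frac{3}{10} g \sin 2\alpha$. The helper names `fit_quadratic` and `g_from_fit` are hypothetical.

```python
import math

def fit_quadratic(t, x):
    """Least-squares fit of x = a*t^2 + b*t + c. Returns (a, b, c)."""
    # Normal equations for the basis [t^2, t, 1]; S[k] is the sum of t^k.
    S = [sum(ti ** k for ti in t) for k in range(5)]
    M = [[S[4], S[3], S[2]],
         [S[3], S[2], S[1]],
         [S[2], S[1], S[0]]]
    v = [sum(xi * ti ** 2 for ti, xi in zip(t, x)),
         sum(xi * ti for ti, xi in zip(t, x)),
         sum(x)]
    n = 3
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        v[col], v[pivot] = v[pivot], v[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n):
                M[r][c] -= f * M[col][c]
            v[r] -= f * v[col]
    # Back substitution.
    p = [0.0] * n
    for r in reversed(range(n)):
        p[r] = (v[r] - sum(M[r][c] * p[c] for c in range(r + 1, n))) / M[r][r]
    return tuple(p)

def g_from_fit(a, alpha_deg):
    """Solve 2a = (3/10) g sin(2 alpha) for g."""
    return 20 * a / (3 * math.sin(2 * math.radians(alpha_deg)))

# Synthetic check: build positions from a known g, then recover it.
ALPHA_DEG = 4.79224
FPS = 25
G_TRUE = 9.81
a_true = 3 * G_TRUE * math.sin(2 * math.radians(ALPHA_DEG)) / 20
t = [k / FPS for k in range(50)]       # time per frame is 1/25 s
x = [a_true * ti ** 2 for ti in t]     # horizontal position in meters
a_fit, b_fit, c_fit = fit_quadratic(t, x)
g_est = g_from_fit(a_fit, ALPHA_DEG)
```

With real data, `x` would be the x-pixel coordinates of the centroid multiplied by the calibration factor $C$.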

This has been quite a culmination of the image processing skills I acquired over the entire semester and a refresher of my mechanics. For this activity, I will rate myself 11/10 since I was able to apply image processing techniques effectively and accomplish my objectives completely. Also, I believe the images (Figures 1-4) used to illustrate the problem took a lot of effort to create, and they're neat! :D

References:

[1] M. Soriano, "Basic Video Processing", AP186 class manual. (2016)







Tuesday, December 6, 2016

Activity 10. Enhancement via Histogram Manipulation


An image histogram represents the frequency distribution of pixel values in an image. It tells us about the spread of the colors in an image; for grayscale images, for example, the values run from black (value = 0) to white (value = 1). When normalized, the image histogram is the (graylevel) probability density function (PDF) of the image. In some cases, images are dark and low in contrast due to low exposure and poor lighting conditions, and the histogram shows this as peaks clustered at the lower graylevel values.

In order to enhance low-quality images, their histograms are manipulated. This is done by remapping the cumulative distribution function (CDF) of the poor image to new grayscale values taken from a desired CDF. Suppose the graylevels r of an image have a probability density function (PDF) given by $p_1(r)$, and the cumulative distribution function is given by

\[ \begin{equation} \label{cdf} T(r) = \int_{0}^{r} p_1(g) \, dg \end{equation} \]

where g is a dummy variable.

We want to map the r's to a different set of graylevel z's such that the new image will have a CDF given by \[\begin{equation} \label{cdf2} G(z) = \int_{0}^{z} p_2(t)\,dt \end{equation}\] where $p_2(z)$ is the PDF of the transformed image and t is a dummy variable. 

Figure 1 best explains the steps to be taken.
Figure 1. Steps in altering the grayscale distribution. (1) From the pixel grayscale, find its CDF value. (2) Trace this value in the desired CDF. (3) Replace the pixel value with the grayscale value having this CDF value in the desired CDF (4).
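
The steps in Figure 1 can be sketched in plain Python as follows. This is a minimal illustration rather than the code used for the results in this post; the function names `cdf_of_image` and `remap`, the flat list of pixels, and the 8-bit (0 to 255) graylevels are my assumptions.

```python
def cdf_of_image(pixels, levels=256):
    """Return the CDF T(r) of integer graylevels as a list of length levels."""
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    total = len(pixels)
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running / total)
    return cdf

def remap(pixels, desired_cdf, levels=256):
    """Map each graylevel r to the smallest z with G(z) >= T(r)."""
    source_cdf = cdf_of_image(pixels, levels)
    lookup = []
    z = 0
    for r in range(levels):
        # Both CDFs are nondecreasing, so z only ever moves forward.
        while z < levels - 1 and desired_cdf[z] < source_cdf[r]:
            z += 1
        lookup.append(z)
    return [lookup[p] for p in pixels]

# Hypothetical dark image (flattened) remapped to a linear desired CDF.
linear_cdf = [z / 255 for z in range(256)]
dark = [10, 10, 20, 30]
enhanced = remap(dark, linear_cdf)
```

Any other nondecreasing desired CDF with values in [0, 1] can be passed in place of `linear_cdf`.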




Now that we know the steps, consider a picture of me with my research ate, Anjali Tarun, at the Dome of Light in Taiwan during the Physics Society of the Republic of China (PSROC) Annual Meeting. This was taken using a phone camera. Because the background is really bright, we non-bioluminescent creatures looked super dark!



Figure 2. Ate Banana and I at the Dome of Light

For this picture and the others, I considered three desired CDFs, namely linear, logarithmic and parabolic, as shown in Figure 3. I considered the last two functions because the human eye, as we know, has a nonlinear response (more specifically, it responds logarithmically).

Figure 3. Desired CDFs (from L-R): Linear, Parabolic and Logarithmic


Below is the result for my first image after remapping the CDFs with the desired ones previously shown in Figure 3. 

Figure 4a is the grayscale image of Figure 2. Figures 4b-4d are the enhanced images using the linear, logarithmic and parabolic CDFs. Figures 4e-4h are their respective PDFs, which let us see how the images have improved, while Figures 4i-4l are the CDFs. The same layout is used for the other images considered.

Figure 4. Results for Figure 2. (a) Grayscale image of the picture we want to enhance, while (b)-(d) are the resulting enhanced images using the linear, logarithmic and parabolic CDFs. The PDF of each image is shown in (e)-(h) to see how the spread of the pixel values changed, while (i)-(l) are the corresponding CDFs.


Visually, the linearly and parabolically enhanced images look comparable to each other, but the enhanced PDFs show that the pixel values are more spread out and more evenly distributed when the linear function is used.

Now consider a quick selfie with my close friend Ace, who battled traffic to come see me because I said I was sad. (I felt blessed beyond measure for the people in my life like Ace <3) We didn't care about the lighting as long as we got a selfie, and so again we looked dark, so it's not Instagram-able. :((

Figure 5. Quick selfie with Ace
The same procedure is applied to Figure 5. And the results are shown in Figure 6.

Figure 6. The same procedure applied on Fig. 5. It looks as though the parabolic function has the same effect as the linear.
Similar to the previous result, the linear result is comparable to the parabolic one, but upon closer inspection, the logarithmic result seems to have overexposed patches on the surface.

Enhancement via histogram manipulation is not limited to grayscale images. It can also be done on colored images like the one in Figure 2. For the RGB image, we manipulate the intensity $I$ of the image; my result is shown in Figure 7.
Figure 7. Enhanced colored image via RGB manipulation

For comparison, I also used GIMP 2.0, an image processing software that serves many functions, to enhance Figure 2. I set the curve to be logarithmic and the result is Figure 8. 

Instant whitening happened to us! *O*

Figure 8. Enhanced via GIMP

For this activity, I give myself an 8.5/10.

Acknowledgements:

Thank you Angelo Rillera  for the helpful discussions!

References:

[1] M. Soriano, "Enhancement by Histogram Manipulation", AP186 class manual. (2016)