Computer Vision Fundamentals

 


Knowledge and thinking

Formats of images

An image can be represented in two forms: continuous and discrete. The continuous image is what the camera originally forms: when the camera captures a 3D object, the optics of the lens produce a continuous image. A computer, however, can only process discrete data, so the continuous image must be sampled before it is stored. To ensure that the sampled image keeps the most important information, the sampling theorem \( \omega_s \geq 2\omega_m \) should be satisfied, where \( \omega_s \) is the sampling frequency and \( \omega_m \) is the highest frequency in the signal. For a one-dimensional signal, a sampling rate that is too low leads to aliasing in the frequency domain, and the signal can then no longer be reconstructed by the interpolator. For images, which are two-dimensional signals, aliasing can be seen directly in the image as a Moiré pattern (figure 1). There are two ways to avoid aliasing. The intuitive way is to increase the sampling rate until the sampling theorem is satisfied. The other way is to apply a lowpass filter before sampling, which is the common remedy for signal aliasing. Its disadvantage is that the high-frequency information is lost. If the goal is image compression, this is a convenient way to avoid aliasing. If we want to extract edge information, we should not cut off the high frequencies, because they often carry the edge information.

Figure 1 Moiré pattern
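The one-dimensional case can be illustrated with a minimal sketch. It assumes a pure cosine as the signal; the frequencies (7 Hz and 3 Hz), the sampling rate (10 Hz), and the helper name `sample_cosine` are illustrative choices, not values from the text. A 7 Hz cosine needs a sampling rate above 14 Hz, so at 10 Hz its samples coincide with those of a 3 Hz cosine.

```python
import math

def sample_cosine(freq_hz, sample_rate_hz, n_samples):
    """Sample cos(2*pi*freq*t) at the instants t = n / sample_rate."""
    return [math.cos(2 * math.pi * freq_hz * n / sample_rate_hz)
            for n in range(n_samples)]

# 10 Hz is below the required rate for a 7 Hz cosine: aliasing occurs.
undersampled = sample_cosine(7, 10, 20)
# A 3 Hz cosine sampled at the same rate.
alias = sample_cosine(3, 10, 20)

# The two sample sequences agree up to rounding error, so after sampling
# the 7 Hz signal cannot be distinguished from a 3 Hz one.
max_diff = max(abs(a - b) for a, b in zip(undersampled, alias))
print(max_diff)
```

Raising the sampling rate above 14 Hz, or lowpass filtering the signal before sampling, removes the ambiguity.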

Extraction of edges

Extracting edge information essentially means computing the image gradients. An image has gradients in two directions, along the x-axis and along the y-axis, denoted \( I_x(x,y) \) and \(I_y(x,y)\). A widely used operator for computing image gradients is the Sobel filter. As I understand it, the Sobel filter can be interpreted in two different ways.

A naive way of calculating the image gradients is \( \frac{\mathrm{d} I(x,y)}{\mathrm{d} x}\approx I(x+1,y)-I(x,y) \) and \( \frac{\mathrm{d} I(x,y)}{\mathrm{d} y}\approx I(x,y+1)-I(x,y) \). For example, let \( z_1,\dots,z_9 \) be the pixels of a 3×3 block numbered row by row, with x increasing downward and y increasing to the right, as in figure 2. For the center pixel \( z_5 \):

 

Figure 2 Calculating the gradient at z5

\( g_x(z_5) = z_8-z_5 \) 

\( g_y(z_5) = z_6-z_5 \) 
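The naive gradient can be sketched in a few lines. This assumes the image is a list of rows, with x as the row index and y as the column index so that it matches \( g_x(z_5) = z_8-z_5 \); the function name `forward_differences` and the sample block are hypothetical.

```python
def forward_differences(image, x, y):
    """Return (g_x, g_y) at pixel (x, y) using forward differences.

    x is the row index and y is the column index (an assumption made
    here to match g_x(z5) = z8 - z5 and g_y(z5) = z6 - z5).
    """
    g_x = image[x + 1][y] - image[x][y]
    g_y = image[x][y + 1] - image[x][y]
    return g_x, g_y

# 3x3 block z1..z9 numbered row by row; z5 is the center pixel.
block = [[1, 2, 3],
         [4, 5, 6],
         [7, 8, 9]]

g_x, g_y = forward_differences(block, 1, 1)
print(g_x, g_y)  # z8 - z5 and z6 - z5
```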

However, not only \(I(x+1,y)\) and \(I(x,y+1)\) matter: differences involving pixels at other positions in the block, such as \(z_7-z_5\), also carry significant information about the image gradient. The Sobel filter can be interpreted as a weighted sum of differences between pixels. According to the Sobel filter, \( g_x = (z_7+2z_8+z_9)-(z_1+2z_2+z_3) \) and \( g_y = (z_3+2z_6+z_9)-(z_1+2z_4+z_7) \).

The other interpretation of the Sobel filter comes from sampling and interpolation.
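The two Sobel formulas can be checked on small blocks. The sketch below assumes the block is given as three rows \( z_1 \dots z_9 \) numbered row by row; `sobel_at_center` and the two test blocks are hypothetical names and data.

```python
def sobel_at_center(block):
    """Sobel gradients (g_x, g_y) at the center of a 3x3 block z1..z9."""
    (z1, z2, z3), (z4, z5, z6), (z7, z8, z9) = block
    g_x = (z7 + 2 * z8 + z9) - (z1 + 2 * z2 + z3)  # bottom row minus top row
    g_y = (z3 + 2 * z6 + z9) - (z1 + 2 * z4 + z7)  # right column minus left column
    return g_x, g_y

# Intensity jumps between columns: a vertical edge.
vertical_edge = [[0, 0, 10],
                 [0, 0, 10],
                 [0, 0, 10]]

# Intensity jumps between rows: a horizontal edge.
horizontal_edge = [[0, 0, 0],
                   [0, 0, 0],
                   [10, 10, 10]]

print(sobel_at_center(vertical_edge))    # only g_y is nonzero
print(sobel_at_center(horizontal_edge))  # only g_x is nonzero
```

Each of the two gradients responds to only one of the two edges, which is why both are normally computed.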

\( \begin{aligned} f^{\prime}(x) & \approx \frac{d}{d x}(f[x] * h(x)) \\ &=f[x] * h^{\prime}(x) \end{aligned} \)  

The derivative of a signal can be interpreted as the sampled signal convolved with the derivative of the interpolator \( h \). Correspondingly, for a discrete signal, \( \begin{aligned} f^{\prime}[x] &=f[x] * h^{\prime}[x] \\ &=\sum_{k=-\infty}^{\infty} f[x-k] h^{\prime}[k] \end{aligned} \). The ideal interpolator is the sinc function, which reconstructs the original signal from the discrete signal perfectly. However, it cannot be realized in practice because it has infinite extent. Instead, the Gaussian filter is commonly used as an alternative to the sinc function.

For an image, with a separable Gaussian interpolator \( h(x,y)=g(x)g(y) \), the derivative in the x direction is:

 \( \begin{aligned} \frac{d}{d x} I(x, y) & \approx I[x, y] *\left(\frac{d}{d x} h(x, y)\right) \\ &=\sum_{k=-\infty}^{\infty} \sum_{l=-\infty}^{\infty} I[k, l] g^{\prime}(x-k) g(y-l) \end{aligned} \) 

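The Gaussian filter can therefore be applied to an image separately in the two directions. When \( \sigma=\sqrt{\frac{1}{2\log 2}} \) (natural logarithm), we have \( g(\pm 1)/g(0)=\frac{1}{2} \). Sampling the filters at \( -1, 0, 1 \), then scaling and rounding the weights, gives the Sobel filter shown in figure 3. This is the same filter as in the first interpretation.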

Figure 3 Sobel filter in the X direction
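The step from the Gaussian to the Sobel weights can be reproduced numerically. This is a sketch under stated assumptions: the Gaussian is unnormalized, it is sampled only at \( -1, 0, 1 \), the scaling factors are chosen by hand so that the weights become integers, and all names are hypothetical.

```python
import math

sigma_sq = 1.0 / (2.0 * math.log(2.0))  # sigma^2 = 1 / (2 ln 2)

def gauss(t):
    """Unnormalized Gaussian g(t) = exp(-t^2 / (2 sigma^2))."""
    return math.exp(-t * t / (2.0 * sigma_sq))

def gauss_derivative(t):
    """Derivative g'(t) = -t / sigma^2 * g(t)."""
    return -t / sigma_sq * gauss(t)

taps = (-1, 0, 1)
smooth = [gauss(t) for t in taps]            # approximately [0.5, 1.0, 0.5]
deriv = [gauss_derivative(t) for t in taps]  # positive, zero, negative

# Scale the smoothing weights by 2 and round them to integers.
smooth_int = [round(2 * s) for s in smooth]

# The derivative samples form a convolution kernel. Reversing them gives the
# equivalent mask that is laid directly over the pixels; dividing by the
# magnitude of the outer sample normalizes it to integers.
deriv_mask = [round(d / abs(deriv[0])) for d in reversed(deriv)]

# Differentiate along x (rows) and smooth along y (columns).
sobel_x = [[d * s for s in smooth_int] for d in deriv_mask]
for row in sobel_x:
    print(row)
```

The resulting mask subtracts the weighted top row from the weighted bottom row, matching \( g_x = (z_7+2z_8+z_9)-(z_1+2z_2+z_3) \).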

Feature detection

An image block can contain three kinds of structure: edges, corners, and flat regions. The best features are corners. Even for a human, it is hard to match a block that contains only a flat region of an image. In theory, the intensity around a corner changes in all directions, whereas in a flat region it does not change in any direction. Feature extraction therefore amounts to detecting corners and edges. The Harris detector uses a moving block to find corners and edges: \( E(u,v) = \sum_{x,y} w(x,y) (I(x+u,y+v)-I(x,y))^2 \), where \(u\) and \(v\) denote the displacements of the block in the x and y directions and \(w(x,y)\) is the window weight. By accumulating the squared intensity differences over the block, the computer can recognize corners. After a first-order Taylor expansion, the formula simplifies to \( E(u, v) \approx[u, v] M\left[\begin{array}{l} u \\ v \end{array}\right] \) with \( M=\sum_{(x, y)} w(x, y)\left[\begin{array}{cc} I_{x}^{2} & I_{x} I_{y} \\ I_{x} I_{y} & I_{y}^{2} \end{array}\right] \).

Because \(M\) is symmetric, it can be decomposed into \( UVU^\mathrm{T} \), where \( U \) contains the eigenvectors and \(V\) is a diagonal matrix with the eigenvalues on its diagonal (these \(U\) and \(V\) are matrices, not the displacements \(u\) and \(v\)). The eigenvectors and eigenvalues can be interpreted as an ellipse, as in figure 4. Each eigenvalue indicates how sharply the intensity changes along the direction of the corresponding eigenvector.

 

Figure 4 Ellipse of the eigenvectors
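For a 2×2 symmetric matrix the eigenvalues have a closed form, which makes the three cases easy to inspect. The sketch below is illustrative only: the function name and the three example matrices are made up, not taken from a real image.

```python
import math

def eigenvalues_2x2_symmetric(a, b, c):
    """Eigenvalues (largest first) of the symmetric matrix [[a, b], [b, c]]."""
    trace = a + c
    # (a - c)^2 + 4 b^2 equals trace^2 - 4 det and is never negative.
    root = math.sqrt((a - c) ** 2 + 4.0 * b * b)
    return (trace + root) / 2.0, (trace - root) / 2.0

corner_like = eigenvalues_2x2_symmetric(50.0, 25.0, 50.0)  # both eigenvalues large
edge_like = eigenvalues_2x2_symmetric(0.0, 0.0, 150.0)     # one large, one zero
flat_like = eigenvalues_2x2_symmetric(0.0, 0.0, 0.0)       # both zero

print(corner_like, edge_like, flat_like)
```

Two large eigenvalues mean the intensity changes sharply in every direction, one large eigenvalue means it changes in a single direction only, and two small eigenvalues mean it barely changes at all.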

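The score function is

\( H = \det(M) - k\,\mathrm{tr}(M)^2 = (1-2k)\lambda_1 \lambda_2-k(\lambda_1^2+\lambda_2^2) \)

where \( \lambda_1 \) and \( \lambda_2 \) are the eigenvalues of \( M \).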

With this score we can determine whether the block contains a corner, an edge, or a flat region. \( \text{Block}= \begin{cases} \text{corner} & H > \text{Threshold} > 0 \\ \text{edge} & H < 0 \text{ with large } |H| \\ \text{flat} & |H| \text{ small (both eigenvalues small)} \end{cases} \)
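The score can be tried on small synthetic blocks. This is a minimal sketch, not a full detector: it assumes uniform weights \( w(x,y)=1 \), central differences for \( I_x \) and \( I_y \), and \( k = 0.04 \) (a commonly used value that does not come from the text). The name `harris_score` and the three 5×5 patches are hypothetical. The score is computed as \( \det(M) - k\,\mathrm{tr}(M)^2 \), which equals the eigenvalue form because \( \det(M)=\lambda_1\lambda_2 \) and \( \mathrm{tr}(M)=\lambda_1+\lambda_2 \).

```python
def harris_score(patch, k=0.04):
    """Harris score H = det(M) - k * trace(M)^2 for one block, with w(x, y) = 1."""
    rows, cols = len(patch), len(patch[0])
    sxx = sxy = syy = 0.0
    for x in range(1, rows - 1):          # x is the row index
        for y in range(1, cols - 1):      # y is the column index
            ix = (patch[x + 1][y] - patch[x - 1][y]) / 2.0  # central difference in x
            iy = (patch[x][y + 1] - patch[x][y - 1]) / 2.0  # central difference in y
            sxx += ix * ix
            sxy += ix * iy
            syy += iy * iy
    det = sxx * syy - sxy * sxy
    trace = sxx + syy
    return det - k * trace * trace

size = 5
flat = [[5] * size for _ in range(size)]
edge = [[10 if y >= 3 else 0 for y in range(size)] for _ in range(size)]
corner = [[10 if (x >= 3 and y >= 3) else 0 for y in range(size)]
          for x in range(size)]

for name, patch in (("flat", flat), ("edge", edge), ("corner", corner)):
    print(name, harris_score(patch))
```

On these patches the corner gives a positive score, the edge a negative score, and the flat block a score of zero.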


Q&A
1. Why should we apply the Sobel filter in both the x and y directions?
   One intuitive answer is that a Sobel filter applied in a single direction responds only to intensity changes along that direction. Differentiating only along the horizontal direction finds vertical edges but misses horizontal ones, and differentiating only along the vertical direction finds horizontal edges but misses vertical ones. One such scenario is shown in figure 5.

Figure 5 Image with vertical and horizontal edges

If we apply the Sobel filter in the two directions, as in figure 6, we can easily see the differences.

Figure 6 Lena edge information

2. Why does the center point have the largest weight in the Gaussian filter?

   The Gaussian filter is an averaging filter with different weights. The filtered intensity of the center point is the weighted average over the Gaussian block. Points closer to the center are more strongly related to the center point, so they get larger weights, while points farther away are less related to it and get smaller weights.
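The weights can be printed for a small one-dimensional kernel. The radius of 2 and \( \sigma = 1 \) are arbitrary illustrative choices, and `gaussian_kernel_1d` is a hypothetical helper.

```python
import math

def gaussian_kernel_1d(radius, sigma):
    """Normalized Gaussian weights at the offsets -radius..radius."""
    weights = [math.exp(-(t * t) / (2.0 * sigma * sigma))
               for t in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]  # weights sum to 1: a weighted average

kernel = gaussian_kernel_1d(2, 1.0)
print(kernel)
```

The weights fall off symmetrically on both sides, and the middle entry, which belongs to the center point, is the largest.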

posted @ 2020-05-04 04:30  brass