OpenCV学习笔记（27）KAZE 算法原理与源码分析（一）非线性扩散滤波

KAZE算法资源：

1. 论文： http://www.robesafe.com/personal/pablo.alcantarilla/papers/Alcantarilla12eccv.pdf

2. 项目主页：http://www.robesafe.com/personal/pablo.alcantarilla/kaze.html

3. 作者代码：http://www.robesafe.com/personal/pablo.alcantarilla/code/kaze_features_1_4.tar
（需要boost库，另外其计时函数的使用比较复杂，可以用OpenCV的cv::getTickCount代替）

4. Computer Vision Talks的评测：http://computer-vision-talks.com/2013/03/porting-kaze-features-to-opencv/

5. Computer Vision Talks 博主Ievgen Khvedchenia将KAZE集成到OpenCV的cv::Feature2D类，但需要重新编译OpenCV，并且没有实现算法参数调整和按Mask过滤特征点的功能：https://github.com/BloodAxe/opencv/tree/kaze-features

6. 我在Ievgen的项目库中提取出KAZE，封装成继承cv::Feature2D的类，无需重新编译OpenCV，实现了参数调整和Mask过滤的功能: https://github.com/yuhuazou/kaze_opencv

7. Matlab 版的接口程序，封装了1.0版的KAZE代码：https://github.com/vlfeat/vlbenchmarks/blob/unstable/%2BlocalFeatures/Kaze.m

简介

ECCV2012中出现了一种比SIFT更稳定的特征检测算法KAZE （[1]）。KAZE的取名是为了纪念尺度空间分析的开创者—日本学者Iijima。KAZE是日语‘风’的谐音，寓意是就像风的形成是空气在空间中非线性的流动过程一样，KAZE特征检测是在图像域中进行非线性扩散处理的过程。

传统的SIFT、SURF等特征检测算法都是基于线性的高斯金字塔进行多尺度分解来消除噪声和提取显著特征点。但高斯分解是牺牲了局部精度为代价的，容易造成边界模糊和细节丢失。非线性的尺度分解有望解决这种问题，但传统方法基于正向欧拉法（forward Euler scheme）求解非线性扩散（Non-linear diffusion）方程时迭代收敛的步长太短，耗时长、计算复杂度高。由此，KAZE算法的作者提出采用加性算子分裂算法(Additive Operator Splitting, AOS)来进行非线性扩散滤波，可以采用任意步长来构造稳定的非线性尺度空间。

注：KAZE算法的原理与SIFT和SURF有很多相似之处，在深入了解KAZE之前，可以参考以下的博客文章对SIFT和SURF的介绍：

1. 【OpenCV】SIFT原理与源码分析 - 小魏的修行路（很不错的博客，用心、认真、图文并茂）

2. Surf算法学习心得(一)——算法原理 - yang_yong’ blog

3. Opencv学习笔记（六）SURF学习笔记 - crazy_sparrow的专栏

1.1 非线性扩散滤波

1.1.1 Perona-Malik扩散方程

具体地，非线性扩散滤波方法是将图像亮度（L）在不同尺度上的变化视为某种形式的流动函数（flow function）的散度（divergence），可以通过非线性偏微分方程来描述：

通过设置合适的传导函数 c(x,y,t) ，可以使得扩散自适应于图像的局部结构。传导函数可以是标量、也可以是张量。时间t作为尺度参数，其值越大、则图像的表示形式越简单。Perona和Malik提出了传导函数的构造方式：

其中的▽L_σ是高斯平滑后的图像L_σ的梯度（gradient）。函数 g() 的形式有如下几种：

其中函数g₁优先保留高对比度的边缘，g₂优先保留宽度较大的区域，g₃能够有效平滑区域内部而保留边界信息（KAZE代码中默认采用函数g₂）。

函数g₁、g₂、g₃的实现代码如下（在文件 kaze_nldiffusion_functions.cpp 中）：

//*************************************************************************************
//*************************************************************************************

/**
 * @brief This function computes the Perona and Malik conductivity coefficient g1
 * g1 = exp(-|dL|^2/k^2)
 * @param src Input image
 * @param dst Output image
 * @param Lx First order image derivative in X-direction (horizontal)
 * @param Ly First order image derivative in Y-direction (vertical)
 * @param k Contrast factor parameter
 */
void PM_G1(const cv::Mat &src, cv::Mat &dst, cv::Mat &Lx, cv::Mat &Ly, float k )
{
    cv::exp(-(Lx.mul(Lx) + Ly.mul(Ly))/(k*k),dst);
}

//*************************************************************************************
//*************************************************************************************

/**
 * @brief This function computes the Perona and Malik conductivity coefficient g2
 * g2 = 1 / (1 + dL^2 / k^2)
 * @param src Input image
 * @param dst Output image
 * @param Lx First order image derivative in X-direction (horizontal)
 * @param Ly First order image derivative in Y-direction (vertical)
 * @param k Contrast factor parameter
 */
void PM_G2(const cv::Mat &src, cv::Mat &dst, cv::Mat &Lx, cv::Mat &Ly, float k )
{
    dst = 1./(1. + (Lx.mul(Lx) + Ly.mul(Ly))/(k*k));
}

//*************************************************************************************
//*************************************************************************************

/**
 * @brief This function computes Weickert conductivity coefficient g3
 * @param src Input image
 * @param dst Output image
 * @param Lx First order image derivative in X-direction (horizontal)
 * @param Ly First order image derivative in Y-direction (vertical)
 * @param k Contrast factor parameter
 * @note For more information check the following paper: J. Weickert
 * Applications of nonlinear diffusion in image processing and computer vision,
 * Proceedings of Algorithmy 2000
 */
void Weickert_Diffusivity(const cv::Mat &src, cv::Mat &dst, cv::Mat &Lx, cv::Mat &Ly, float k )
{
    cv::Mat modg;
    cv::pow((Lx.mul(Lx) + Ly.mul(Ly))/(k*k),4,modg);
    cv::exp(-3.315/modg, dst);
    dst = 1.0 - dst;
}

参数k是控制扩散级别的对比度因子（contrast factor），能够决定保留多少边缘信息，其值越大，保留的边缘信息越少。在KAZE算法中，参数k的取值是梯度图像▽L_σ的直方图70% 百分位上的值：

计算参数k的实现源码如下（在文件 kaze_nldiffusion_functions.cpp 中）：

//*************************************************************************************
//*************************************************************************************

/**
 * @brief This function computes a good empirical value for the k contrast factor
 * given an input image, the percentile (0-1), the gradient scale and the number of
 * bins in the histogram
 * @param img Input image
 * @param perc Percentile of the image gradient histogram (0-1)
 * @param gscale Scale for computing the image gradient histogram
 * @param nbins Number of histogram bins
 * @param ksize_x Kernel size in X-direction (horizontal) for the Gaussian smoothing kernel
 * @param ksize_y Kernel size in Y-direction (vertical) for the Gaussian smoothing kernel
 * @return k contrast factor
 */
float Compute_K_Percentile(const cv::Mat &img, float perc, float gscale, unsigned int nbins, unsigned int ksize_x, unsigned int ksize_y)
{
	float kperc = 0.0, modg = 0.0, lx = 0.0, ly = 0.0;
	unsigned int nbin = 0, nelements = 0, nthreshold = 0, k = 0;
    float npoints = 0.0;	// number of points of which gradient greater than zero
	float hmax = 0.0;		// maximum gradient

    // Create the array for the histogram
    float *hist = new float[nbins];

	// Create the matrices
	cv::Mat gaussian = cv::Mat::zeros(img.rows,img.cols,CV_32F);
	cv::Mat Lx = cv::Mat::zeros(img.rows,img.cols,CV_32F);
	cv::Mat Ly = cv::Mat::zeros(img.rows,img.cols,CV_32F);
	
	// Set the histogram to zero, just in case
	for( unsigned int i = 0; i < nbins; i++ )
	{
		hist[i] = 0.0;
	}

	// Perform the Gaussian convolution
	Gaussian_2D_Convolution(img,gaussian,ksize_x,ksize_y,gscale);
			
	// Compute the Gaussian derivatives Lx and Ly
    Image_Derivatives_Scharr(gaussian,Lx,1,0);
    Image_Derivatives_Scharr(gaussian,Ly,0,1);
	
	// Skip the borders for computing the histogram
	for( int i = 1; i < gaussian.rows-1; i++ )
	{
		for( int j = 1; j < gaussian.cols-1; j++ )
		{
            lx = *(Lx.ptr<float>(i)+j);
            ly = *(Ly.ptr<float>(i)+j);
			modg = sqrt(lx*lx + ly*ly);
	
			// Get the maximum
			if( modg > hmax )
			{
				hmax = modg;
			}
		}
	}

	// Skip the borders for computing the histogram
	for( int i = 1; i < gaussian.rows-1; i++ )
	{
		for( int j = 1; j < gaussian.cols-1; j++ )
		{
            lx = *(Lx.ptr<float>(i)+j);
            ly = *(Ly.ptr<float>(i)+j);
			modg = sqrt(lx*lx + ly*ly);		// modg can be stored in a matrix in last step in oder to avoid re-computation

			// Find the correspondent bin
			if( modg != 0.0 )
			{
				nbin = floor(nbins*(modg/hmax));

                if( nbin == nbins )
                {
                    nbin--;
                }

				hist[nbin]++;
				npoints++;
			}
		}
	}
	
	// Now find the perc of the histogram percentile
	nthreshold = (unsigned int)(npoints*perc);
	
	// find the bin (k) in which accumulated points are greater than 70% (perc) of total valid points (npoints)
	for( k = 0; nelements < nthreshold && k < nbins; k++)
	{
		nelements = nelements + hist[k];
	}
	
	if( nelements < nthreshold )
	{
		kperc = 0.03;
	}
	else
	{
		kperc = hmax*((float)(k)/(float)nbins);	
	}
	
    delete hist;
	return kperc;
}

注：有关非线性扩散滤波的应用，参见[2]。

1.1.2 AOS算法

由于非线性偏微分方程并没有解析解，一般通过数值分析的方法进行迭代求解。传统上采用显式差分格式的求解方法只能采用小步长，收敛缓慢。为此，将方程离散化为以下的隐式差分格式：

其中A_l是表示图像在各维度（l）上传导性的矩阵。该方程的解如下：

这种求解方法对任意时间步长（τ）都有效。上式中矩阵A_l是三对角矩阵并且对角占优（tridiagonal and diagonally dominant matrix），这样的线性系统可以通过Thomas算法快速求解。（有关AOS的应用，参见[3]）

该算法的实现源码如下（在文件 kaze.cpp 中）：

//*************************************************************************************
//*************************************************************************************

/**
 * @brief This method performs a scalar non-linear diffusion step using AOS schemes
 * @param Ld Image at a given evolution step
 * @param Ldprev Image at a previous evolution step
 * @param c Conductivity image
 * @param stepsize Stepsize for the nonlinear diffusion evolution
 * @note If c is constant, the diffusion will be linear
 * If c is a matrix of the same size as Ld, the diffusion will be nonlinear
 * The stepsize can be arbitrarilly large
*/
void KAZE::AOS_Step_Scalar(cv::Mat &Ld, const cv::Mat &Ldprev, const cv::Mat &c, const float stepsize)
{
   AOS_Rows(Ldprev,c,stepsize);
   AOS_Columns(Ldprev,c,stepsize);

   Ld = 0.5*(Lty + Ltx.t());
}

//*************************************************************************************
//*************************************************************************************

/**
 * @brief This method performs a scalar non-linear diffusion step using AOS schemes
 * Diffusion in each dimension is computed independently in a different thread
 * @param Ld Image at a given evolution step
 * @param Ldprev Image at a previous evolution step
 * @param c Conductivity image
 * @param stepsize Stepsize for the nonlinear diffusion evolution
 * @note If c is constant, the diffusion will be linear
 * If c is a matrix of the same size as Ld, the diffusion will be nonlinear
 * The stepsize can be arbitrarilly large
*/
#if HAVE_THREADING_SUPPORT
void KAZE::AOS_Step_Scalar_Parallel(cv::Mat &Ld, const cv::Mat &Ldprev, const cv::Mat &c, const float stepsize)
{
   boost::thread *AOSth1 = new boost::thread(&KAZE::AOS_Rows,this,Ldprev,c,stepsize);
   boost::thread *AOSth2 = new boost::thread(&KAZE::AOS_Columns,this,Ldprev,c,stepsize);
   
   AOSth1->join();
   AOSth2->join();
   
   Ld = 0.5*(Lty + Ltx.t());

   delete AOSth1;
   delete AOSth2;
}
#endif

//*************************************************************************************
//*************************************************************************************

/**
 * @brief This method performs performs 1D-AOS for the image rows
 * @param Ldprev Image at a previous evolution step
 * @param c Conductivity image
 * @param stepsize Stepsize for the nonlinear diffusion evolution
*/
void KAZE::AOS_Rows(const cv::Mat &Ldprev, const cv::Mat &c, const float stepsize)
{
   // Operate on rows
   for( int i = 0; i < qr.rows; i++ )
   {
	   for( int j = 0; j < qr.cols; j++ )
	   {
           *(qr.ptr<float>(i)+j) = *(c.ptr<float>(i)+j) + *(c.ptr<float>(i+1)+j);
	   }
   }
   
   for( int j = 0; j < py.cols; j++ )
   {
       *(py.ptr<float>(0)+j) = *(qr.ptr<float>(0)+j);
   }

   for( int j = 0; j < py.cols; j++ )
   {
       *(py.ptr<float>(py.rows-1)+j) = *(qr.ptr<float>(qr.rows-1)+j);
   }
   
   for( int i = 1; i < py.rows-1; i++ )
   {
	   for( int j = 0; j < py.cols; j++ )
	   {
           *(py.ptr<float>(i)+j) = *(qr.ptr<float>(i-1)+j) + *(qr.ptr<float>(i)+j);
	   }
   }

   // a = 1 + t.*p; (p is -1*p)
   // b = -t.*q;
   ay = 1.0 + stepsize*py; // p is -1*p
   by = -stepsize*qr;

   // Call to Thomas algorithm now
   Thomas(ay,by,Ldprev,Lty);
   
}

//*************************************************************************************
//*************************************************************************************

/**
 * @brief This method performs performs 1D-AOS for the image columns
 * @param Ldprev Image at a previous evolution step
 * @param c Conductivity image
 * @param stepsize Stepsize for the nonlinear diffusion evolution
*/
void KAZE::AOS_Columns(const cv::Mat &Ldprev, const cv::Mat &c, const float stepsize)
{
   // Operate on columns
   for( int j = 0; j < qc.cols; j++ )
   {
	   for( int i = 0; i < qc.rows; i++ )
	   {
           *(qc.ptr<float>(i)+j) = *(c.ptr<float>(i)+j) + *(c.ptr<float>(i)+j+1);
	   }
   }

   for( int i = 0; i < px.rows; i++ )
   {
       *(px.ptr<float>(i)) = *(qc.ptr<float>(i));
   }

   for( int i = 0; i < px.rows; i++ )
   {
       *(px.ptr<float>(i)+px.cols-1) = *(qc.ptr<float>(i)+qc.cols-1);
   }
   
   for( int j = 1; j < px.cols-1; j++ )
   {
	   for( int i = 0; i < px.rows; i++ )
	   {
           *(px.ptr<float>(i)+j) = *(qc.ptr<float>(i)+j-1) + *(qc.ptr<float>(i)+j);
	   }
   }

   // a = 1 + t.*p';
   ax = 1.0 + stepsize*px.t();
   
   // b = -t.*q';
   bx = -stepsize*qc.t();
   
   // Call Thomas algorithm again
   // But take care since we need to transpose the solution!!
   Thomas(ax,bx,Ldprev.t(),Ltx);
}

//*************************************************************************************
//*************************************************************************************

/**
 * @brief This method does the Thomas algorithm for solving a tridiagonal linear system
 * @note The matrix A must be strictly diagonally dominant for a stable solution
*/
void KAZE::Thomas(cv::Mat a, cv::Mat b, cv::Mat Ld, cv::Mat x)
{
   // Auxiliary variables
   int n = a.rows;
   cv::Mat m = cv::Mat::zeros(a.rows,a.cols,CV_32F);
   cv::Mat l = cv::Mat::zeros(b.rows,b.cols,CV_32F);
   cv::Mat y = cv::Mat::zeros(Ld.rows,Ld.cols,CV_32F);

   /** A*x = d;																		   	   */
   /**	/ a1 b1  0  0 0  ...    0 \  / x1 \ = / d1 \										   */
   /**	| c1 a2 b2  0 0  ...    0 |  | x2 | = | d2 |										   */
   /**	|  0 c2 a3 b3 0  ...    0 |  | x3 | = | d3 |										   */
   /**	|  :  :  :  : 0  ...    0 |  |  : | = |  : |										   */
   /**	|  :  :  :  : 0  cn-1  an |  | xn | = | dn |										   */

   /** 1. LU decomposition
   / L = / 1				 \		U = / m1 r1			   \
   /     | l1 1 			 |	        |    m2 r2		   |
   /     |    l2 1          |			|		m3 r3	   |
   /	  |     : : :        |			|       :  :  :	   |
   /	  \           ln-1 1 /			\				mn /	*/

   for( int j = 0; j < m.cols; j++ )
   {
       *(m.ptr<float>(0)+j) = *(a.ptr<float>(0)+j);
   }
   
   for( int j = 0; j < y.cols; j++ )
   {
       *(y.ptr<float>(0)+j) = *(Ld.ptr<float>(0)+j);
   }
   
   // 2. Forward substitution L*y = d for y
   for( int k = 1; k < n; k++ )
   {
	   for( int j=0; j < l.cols; j++ )
	   {
           *(l.ptr<float>(k-1)+j) = *(b.ptr<float>(k-1)+j) / *(m.ptr<float>(k-1)+j);
	   }
	   
	   for( int j=0; j < m.cols; j++ )
	   {
           *(m.ptr<float>(k)+j) = *(a.ptr<float>(k)+j) - *(l.ptr<float>(k-1)+j)*(*(b.ptr<float>(k-1)+j));
	   }
	   
	   for( int j=0; j < y.cols; j++ )
	   {
           *(y.ptr<float>(k)+j) = *(Ld.ptr<float>(k)+j) - *(l.ptr<float>(k-1)+j)*(*(y.ptr<float>(k-1)+j));
	   }
   }
   
   // 3. Backward substitution U*x = y
   for( int j=0; j < y.cols; j++ )
   {
       *(x.ptr<float>(n-1)+j) = (*(y.ptr<float>(n-1)+j))/(*(m.ptr<float>(n-1)+j));
   }
   
   for( int i = n-2; i >= 0; i-- )
   {
	   for( int j = 0; j < x.cols; j++ )
	   {
           *(x.ptr<float>(i)+j) = (*(y.ptr<float>(i)+j) - (*(b.ptr<float>(i)+j))*(*(x.ptr<float>(i+1)+j)))/(*(m.ptr<float>(i)+j));
	   }
   }
}

上面介绍了非线性扩散滤波和AOS求解隐性差分方程的原理，是KAZE算法求解非线性尺度空间的基础，下一节我们将介绍KAZE算法的非线性尺度空间构建、特征检测与描述等内容。

待续...

Ref：

[1] http://www.robesafe.com/personal/pablo.alcantarilla/papers/Alcantarilla12eccv.pdf

[2] http://manu16.magtech.com.cn/geoprog/CN/article/downloadArticleFile.do?attachType=PDF&id=3146

[3] http://file.lw23.com/8/8e/8ec/8ecd21e4-b030-4e05-9333-40cc2d97bde4.pdf