Background

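Image stitching is a widely used image-processing technique. By matching feature points between images, several narrow-field images can be stitched into one wide-field image. Applications include wide-angle photo composition, satellite imagery, and medical imaging. Early stitching methods relied mainly on direct pixel-value matching. Later approaches detect stable features such as corners and edges in each of the two images and stitch the images by matching those features. This experiment follows the method described by Matthew Brown (2005) to stitch several everyday photos.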

 

Interest Point Detection

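First, take two photos of overlapping scenes. To ensure enough shared feature points, the photos should overlap by at least 30%. Convert both photos to grayscale and apply a Gaussian blur with σ = 1. In his paper, Brown builds an image pyramid and detects Harris keypoints at multiple scales. Because the photos to be stitched here have similar fields of view, this step is simplified and feature points are extracted from the original image only.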

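Next, compute the brightness gradients in the x and y directions with the Sobel operator, and smooth the gradients with a Gaussian of σ = 1.5 to reduce the influence of noise. Consider the sum of brightness over a small window. In a flat region, shifting the window in any direction barely changes the sum. Along an edge, shifting the window in the direction of the edge barely changes it either. Near a keypoint, however, even a slight shift in any direction changes the sum strongly, as shown in Figure 1.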

 

Figure 1. Source: http://www.cse.psu.edu/~rcollins/CSE486/lecture06.pdf

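The change in brightness for a window shift (u, v) can be computed with the following formula:

E(u, v) = \sum_{x, y} w(x, y) \, [I(x + u, y + v) - I(x, y)]^2        (1)

where w(x, y) is the Gaussian weight and I(x, y) is the brightness at that point.

For small shifts, the formula above can be approximated in terms of the gradients I_x and I_y:

E(u, v) \approx [u \; v] \, H \, [u \; v]^T, \quad H = \sum_{x, y} w(x, y) \begin{bmatrix} I_x^2 & I_x I_y \\ I_x I_y & I_y^2 \end{bmatrix}        (2)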

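By comparing the eigenvalues λ1 and λ2 of the matrix H, we can classify each point. If λ1 >> λ2 or λ2 >> λ1, the point lies on a vertical or horizontal edge. If λ1 and λ2 are similar and both small, the point lies in a flat region. If λ1 and λ2 are similar and both large, the point is a keypoint. Following Harris and Stephens (1988), the eigenvalues need not be computed directly. The Matlab code at the end of this article uses R = Det(H)/Tr(H) = λ1λ2/(λ1 + λ2), which reflects both eigenvalues and is cheaper to compute. We keep points with R > 2. In addition, each point's R is compared with the R values of its 8 neighbors, and only local maxima are kept. Finally, keypoints near the image border are removed.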
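
The corner measure described above can be tried on a synthetic image. The following is a minimal sketch, not the code used for the results in this article: the test image, the window size, and all variable names are assumptions, and it uses only base Matlab (conv2) instead of the Image Processing Toolbox.

```matlab
% Synthetic test image (assumption): a bright square on a dark background,
% so the only true corners are the four corners of the square.
img = zeros(64, 64);
img(21:44, 21:44) = 1;

% Sobel gradients in x and y, computed with base Matlab conv2.
sobel_x = [-1 0 1; -2 0 2; -1 0 1];
Ix = conv2(img, sobel_x, 'same');
Iy = conv2(img, sobel_x', 'same');

% Gaussian window w(x, y) with sigma = 1.5, truncated at +/- 3 sigma.
sigma = 1.5;
half = ceil(3 * sigma);
[gx, gy] = meshgrid(-half:half, -half:half);
w = exp(-(gx.^2 + gy.^2) / (2 * sigma^2));
w = w / sum(w(:));

% Entries of the 2x2 matrix H at every pixel.
Ix2 = conv2(Ix.^2, w, 'same');
Iy2 = conv2(Iy.^2, w, 'same');
Ixy = conv2(Ix.*Iy, w, 'same');

% Corner response R = det(H) / trace(H); eps avoids division by zero.
R = (Ix2.*Iy2 - Ixy.^2) ./ (Ix2 + Iy2 + eps);

% Compare the response at a corner, on an edge, and in a flat region.
R_corner = R(21, 21);
R_edge = R(21, 32);
R_flat = R(5, 5);
fprintf('corner %.4f, edge %.4f, flat %.4f\n', R_corner, R_edge, R_flat);
```

Only the corner gives a clearly positive response. On the straight edge one gradient component vanishes, so det(H) is zero even though the trace is large.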

At this point we have a set of keypoints in each of the two images, as shown in Figure 2.

 

Figure 2. Harris corners

 

Adaptive Non-Maximal Suppression

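The previous step yields a large number of keypoints. Using them all directly would be computationally expensive and would also increase error. The next step removes the vast majority of them, keeping only the most distinctive points while spreading the keypoints evenly over the whole image. Brown proposed the adaptive non-maximal suppression (ANMS) method to select a fixed number of the best keypoints.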

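The idea of ANMS is to use a radius r that starts at infinity. As r shrinks, a keypoint is added to the queue once every other keypoint within radius r of it has a smaller R value than it does. The search stops when the queue reaches the preset number of keypoints.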

 

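r_i = \min_{j} \lVert x_i - x_j \rVert \quad \text{s.t.} \quad R(x_i) < c \, R(x_j), \; x_j \in G

Here x_i is the 2D coordinate of a keypoint from the previous step, R(x_i) is its corner response, G is the set of all keypoints, and c = 0.9.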

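In practice, we run this process in reverse. Here each image keeps 500 keypoints. First find the keypoint with the largest R value in the whole image, Rmax, add it to the queue, and compute 0.9·Rmax. Then loop over all keypoints. If a keypoint x_i has R_i > 0.9·Rmax, its radius is set to infinity. If R_i < 0.9·Rmax, find the nearest keypoint x_j with 0.9·R_j > R_i and record the distance r_i between the two points. Finally, sort all radii and keep the 500 points with the largest r, as shown in Figure 3.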
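
The radius computation described above can be sketched on a handful of points. This is an illustrative sketch only: the six corner positions, their strengths, and the variable names are made up for the example, and only 3 points are kept instead of 500.

```matlab
% Hypothetical toy data: 6 corners with (x, y) positions and strengths R.
pts = [10 10; 12 11; 50 50; 52 49; 90 10; 30 80];
strength = [9.5; 8.0; 10.0; 3.0; 5.0; 4.0];
n_keep = 3;      % number of corners to keep (500 in the text)
c = 0.9;         % robustness factor from the text

n = size(pts, 1);
radius = inf(n, 1);   % corners that no other corner dominates keep Inf
for i = 1:n
    % Corners j that are sufficiently stronger: strength(i) < c*strength(j).
    stronger = strength(i) < c * strength;
    if any(stronger)
        d2 = (pts(stronger, 1) - pts(i, 1)).^2 + ...
             (pts(stronger, 2) - pts(i, 2)).^2;
        radius(i) = sqrt(min(d2));
    end
end

% Keep the n_keep corners with the largest suppression radius.
[~, order] = sort(radius, 'descend');
keep = sort(order(1:n_keep));
disp(pts(keep, :));
```

Corners 2 and 4 are weak points sitting right next to stronger ones, so their radii are small and they are dropped. The three survivors are spread over the image.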

 

Figure 3. Harris corners after ANMS

 

Feature Descriptor

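There are many ways to describe a keypoint, including local gradient descriptors and scale-invariant descriptors (SIFT, SURF). Because everyday photos are usually rotated by no more than 15°, rotation invariance is not considered here.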

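Apply a moderate Gaussian blur to the image and take a 40x40-pixel region centered on each keypoint. Downsample the region to 8x8 and flatten it into a 64-dimensional vector, then normalize the vector. Each keypoint is thus represented by a 64-dimensional vector, so each image yields a 500x64 feature matrix.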
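
As a rough illustration of the descriptor, the sketch below turns one 40x40 patch into a normalized 64-dimensional vector. The patch contents are synthetic, and the downsampling is done by averaging 5x5 blocks, which is a stand-in (an assumption of this sketch) for the imresize call used in the full code.

```matlab
% Hypothetical 40x40 patch around a keypoint; a smooth synthetic pattern
% stands in for blurred image intensities.
[cx, cy] = meshgrid(1:40);
patch40 = sin(cx / 5) + cos(cy / 7);

% Downsample 40x40 -> 8x8 by averaging each 5x5 block.
% Dimensions after reshape: [row in block, block row, col in block, block col].
blocks = reshape(patch40, 5, 8, 5, 8);
patch8 = squeeze(mean(mean(blocks, 1), 3));

% Flatten, then normalize to zero mean and unit standard deviation.
v = patch8(:)';
descriptor = (v - mean(v)) / std(v);
fprintf('descriptor length: %d\n', numel(descriptor));
```

Subtracting the mean and dividing by the standard deviation makes the descriptor insensitive to changes in brightness offset and contrast between the two photos.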

 

Keypoint Matching

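First, select candidate pairs from the 500 feature points of each image. Compute the Euclidean distance between every feature vector in one image and every feature vector in the other, and sort the distances in ascending order. The simplest choice is to pair each feature with its nearest neighbor. Lowe (2004), however, argues that the smallest distance alone does not separate correct from incorrect matches well, whereas the ratio of the smallest to the second-smallest distance does. As Figure 4 shows, the distance ratio achieves a higher true-positive rate while keeping the false-positive rate low. The threshold used here is r1/r2 < 0.5. The pairs that survive this filter are shown in Figure 5.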
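
The ratio test can be illustrated with a few made-up descriptors. In this sketch the descriptors are 3-dimensional for brevity and all values are hypothetical. It uses plain Euclidean distances, as in the description above.

```matlab
% Hypothetical descriptors: each row is one feature vector.
des_A = [0 0 0; 1 1 1; 5 5 5];
des_B = [0.1 0 0; 1 1 1.1; 9 9 9; 5.2 5 5; 5 5.2 5];

% Pairwise Euclidean distances, one row per descriptor in A.
nA = size(des_A, 1);
nB = size(des_B, 1);
D = zeros(nA, nB);
for i = 1:nA
    for j = 1:nB
        D(i, j) = norm(des_A(i, :) - des_B(j, :));
    end
end

% Sort each row: columns 1 and 2 hold the best and second-best candidates.
[sorted_D, order] = sort(D, 2);
ratio = sorted_D(:, 1) ./ sorted_D(:, 2);
is_match = ratio < 0.5;

% Each row of matches is [index in A, index in B].
matches = [find(is_match), order(is_match, 1)];
disp(matches);
```

The third descriptor in A has two equally close candidates in B, so its ratio is about 1 and it is rejected as ambiguous, even though its nearest distance is small.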

 

Figure 4. Matching accuracy as a function of matching method and threshold

 

Figure 5. Matched feature points after filtering

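The final matching uses the Random Sample Consensus (RANSAC) algorithm. Taking one image as the reference, randomly select 8 points from it and find their 8 paired points in the other image. Compute a homography from these 8 pairs, project the remaining feature points of the reference image into the other image with this homography, and count how many land on their paired points.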

Repeat these steps 2000 times and keep the homography with the most correct pairs. This gives the projective transformation between the two images.
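
The hypothesize-and-verify loop can be sketched on synthetic matches. Everything in this sketch is an assumption made for illustration: the ground-truth homography, the 20 point pairs, the 4 corrupted pairs, the tolerance, and the use of minimal 4-point samples with 200 iterations (the text above uses 8 points and 2000 iterations).

```matlab
% Hypothetical ground truth: a homography and 20 matched point pairs,
% 4 of which are corrupted to act as false matches.
H_true = [1.0 0.02 30; -0.01 1.0 5; 1e-4 0 1];
k = 1:20;
P = [mod(37*k, 300) + 10; mod(91*k, 200) + 10; ones(1, 20)];  % image 1
Q = H_true * P;
Q = Q ./ repmat(Q(3, :), 3, 1);                               % image 2
bad = [3 8 13 18];
Q(1:2, bad) = Q(1:2, bad) + 80;

n_sample = 4;     % minimal sample size for a homography
n_iter = 200;
tol2 = 1;         % squared reprojection tolerance in pixels^2
best_count = 0;
best_H = eye(3);

for it = 1:n_iter
    s = randperm(20, n_sample);            % distinct random matches
    x = P(1, s)'; y = P(2, s)';
    u = Q(1, s)'; v = Q(2, s)';

    % Two linear equations per match in the unknowns h1..h8 (h9 = 1).
    A = [x, y, ones(n_sample, 1), zeros(n_sample, 3), -x.*u, -y.*u;
         zeros(n_sample, 3), x, y, ones(n_sample, 1), -x.*v, -y.*v];
    b = [u; v];
    if rank(A) < 8
        continue;                          % degenerate sample, skip it
    end
    h = A \ b;
    H = [h(1) h(2) h(3); h(4) h(5) h(6); h(7) h(8) 1];

    % Project all points and count those that land near their match.
    Qhat = H * P;
    err2 = (Qhat(1, :) ./ Qhat(3, :) - Q(1, :)).^2 + ...
           (Qhat(2, :) ./ Qhat(3, :) - Q(2, :)).^2;
    count = nnz(err2 < tol2);
    if count > best_count
        best_count = count;
        best_H = H;
    end
end
fprintf('best model explains %d of 20 matches\n', best_count);
```

Because a sample made only of correct matches reproduces the true homography, the best model should explain the 16 uncorrupted matches and none of the corrupted ones.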

 

Compositing the New Image

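Before projecting the image, create a blank canvas. Compare the top, bottom, left, and right bounds of the two images' 2D coordinates after projection, and take the outermost bound in each direction to determine the size of the new image. At the same time, compute the region where the two images overlap.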
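
The canvas size can be worked out from the projected image corners. The sketch below is an illustration under assumptions: H is a hypothetical homography (a pure translation, so the result is easy to check by hand) that maps the warped image into the reference image's coordinate frame, and the image sizes are made up.

```matlab
% Hypothetical homography mapping image A into image B's frame.
H = [1 0 150; 0 1 -20; 0 0 1];
hA = 300; wA = 400;    % size of the image to be warped
hB = 300; wB = 400;    % size of the reference image

% Project the four corners of A (columns are homogeneous points).
corners = [1 wA wA 1; 1 1 hA hA; 1 1 1 1];
proj = H * corners;
px = proj(1, :) ./ proj(3, :);
py = proj(2, :) ./ proj(3, :);

% Bounding box of both images in B's coordinate frame.
left = floor(min([1, px]));
right = ceil(max([wB, px]));
top = floor(min([1, py]));
bottom = ceil(max([hB, py]));

newW = right - left + 1;
newH = bottom - top + 1;
fprintf('canvas %d x %d, B placed at column %d, row %d\n', ...
        newW, newH, 2 - left, 2 - top);
```

Here image A lands 150 pixels to the right of and 20 pixels above image B, so the canvas grows to 550 x 320 and B is placed starting at row 21.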

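In the overlap region, build two masks by cross dissolve, as shown in Figure 6. Within this interval the pixel weights of all 3 channels decrease (or increase) linearly.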
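
A minimal sketch of cross dissolve on a single channel follows. The canvas size, the constant pixel values, and the overlap columns are assumptions chosen so the result can be checked by hand.

```matlab
% Hypothetical images already placed on a common 4x100 canvas:
% A covers columns 1..60, B covers columns 41..100, overlap is 41..60.
canvasA = zeros(4, 100); canvasA(:, 1:60) = 100;
canvasB = zeros(4, 100); canvasB(:, 41:100) = 200;

left = 41;
right = 60;
n = right - left + 1;

% Weight for A: 1 on its own side, ramping linearly to 0 over the overlap.
wA = zeros(1, 100);
wA(1:left-1) = 1;
wA(left:right) = linspace(1, 0, n);
wB = 1 - wA;           % the two weights sum to 1 in every column

blended = canvasA .* repmat(wA, 4, 1) + canvasB .* repmat(wB, 4, 1);
disp(blended(1, [1 41 60 100]));
```

Across the overlap the result moves gradually from A's value to B's value instead of jumping at a seam. A color image applies the same two masks to each of the 3 channels.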

 

Results

Below are a few stitched results.

 

Figure 7. The stitched image

 

Figure 8. Stitching with the left photo as the reference

 

 

 

Appendix: Matlab code

function [output_image] = image_stitching(input_A, input_B)
% -------------------------------------------------------------------------
% 1. Load both images, convert to double and to grayscale.
% 2. Detect feature points in both images.
% 3. Extract fixed-size patches around every keypoint in both images, and
% form descriptors simply by "flattening" the pixel values in each patch to
% one-dimensional vectors.
% 4. Compute distances between every descriptor in one image and every descriptor in the other image.
% 5. Select putative matches based on the matrix of pairwise descriptor
% distances obtained above.
% 6. Run RANSAC to estimate a homography mapping one image onto the other.
% 7. Warp one image onto the other using the estimated transformation.
% 8. Create a new image big enough to hold the panorama and composite the
% two images into it.
%
% Input:
% input_A - filename of warped image
% input_B - filename of unwarped image
% Output:
% output_image - combined new image
%
% Reference:
% [1] C.G. Harris and M.J. Stephens, A combined corner and edge detector, 1988.
% [2] Matthew Brown, Multi-Image Matching using Multi-Scale Oriented Patches.
%
% zhyh8341@gmail.com

% -------------------------------------------------------------------------

% READ IMAGE, GET SIZE INFORMATION
image_A = imread(input_A);
image_B = imread(input_B);
[height_wrap, width_wrap,~] = size(image_A);
[height_unwrap, width_unwrap,~] = size(image_B);

% CONVERT TO GRAY SCALE
gray_A = im2double(rgb2gray(image_A));
gray_B = im2double(rgb2gray(image_B));


% FIND HARRIS CORNERS IN BOTH IMAGE
[x_A, y_A, v_A] = harris(gray_A, 2, 0.0, 2);
[x_B, y_B, v_B] = harris(gray_B, 2, 0.0, 2);

% ADAPTIVE NON-MAXIMAL SUPPRESSION (ANMS)
ncorners = 500;
[x_A, y_A, ~] = ada_nonmax_suppression(x_A, y_A, v_A, ncorners);
[x_B, y_B, ~] = ada_nonmax_suppression(x_B, y_B, v_B, ncorners);

% EXTRACT FEATURE DESCRIPTORS
sigma = 7;
[des_A] = getFeatureDescriptor(gray_A, x_A, y_A, sigma);
[des_B] = getFeatureDescriptor(gray_B, x_B, y_B, sigma);

% IMPLEMENT FEATURE MATCHING
dist = dist2(des_A,des_B);
[ord_dist, index] = sort(dist, 2);
% THE RATIO OF THE FIRST TO THE SECOND DISTANCE IS A BETTER CRITERION THAN
% THE DISTANCE ITSELF. NOTE THAT dist2 RETURNS SQUARED DISTANCES, SO THIS IS
% A RATIO OF SQUARED DISTANCES. LESS THAN .5 GIVES AN ACCEPTABLE ERROR RATE.
ratio = ord_dist(:,1)./ord_dist(:,2);
threshold = 0.5;
idx = ratio<threshold;

x_A = x_A(idx);
y_A = y_A(idx);
x_B = x_B(index(idx,1));
y_B = y_B(index(idx,1));
npoints = length(x_A);


% USE RANSAC (8-POINT SAMPLES, SEE ransacfithomography) TO COMPUTE A ROBUST
% HOMOGRAPHY ESTIMATE. KEEP IMAGE B UNWARPED, WARP IMAGE A ONTO IT
matcher_A = [y_A, x_A, ones(npoints,1)]'; %!!! previous x is y and y is x,
matcher_B = [y_B, x_B, ones(npoints,1)]'; %!!! so switch x and y here.
[hh, ~] = ransacfithomography(matcher_B, matcher_A, npoints, 10);

% s = load('matcher.mat');
% matcher_A = s.matcher(1:3,:);
% matcher_B = s.matcher(4:6,:);
% npoints = 60;
% [hh, inliers] = ransacfithomography(matcher_B, matcher_A, npoints, 10);


% USE INVERSE WARP METHOD
% DETERMINE THE SIZE OF THE WHOLE IMAGE
[newH, newW, newX, newY, xB, yB] = getNewSize(hh, height_wrap, width_wrap, height_unwrap, width_unwrap);

[X,Y] = meshgrid(1:width_wrap,1:height_wrap);
[XX,YY] = meshgrid(newX:newX+newW-1, newY:newY+newH-1);
AA = ones(3,newH*newW);
AA(1,:) = reshape(XX,1,newH*newW);
AA(2,:) = reshape(YY,1,newH*newW);

AA = hh*AA;
XX = reshape(AA(1,:)./AA(3,:), newH, newW);
YY = reshape(AA(2,:)./AA(3,:), newH, newW);

% INTERPOLATION, WARP IMAGE A INTO NEW IMAGE
newImage(:,:,1) = interp2(X, Y, double(image_A(:,:,1)), XX, YY);
newImage(:,:,2) = interp2(X, Y, double(image_A(:,:,2)), XX, YY);
newImage(:,:,3) = interp2(X, Y, double(image_A(:,:,3)), XX, YY);

% BLEND IMAGE BY CROSS DISSOLVE
[newImage] = blend(newImage, image_B, xB, yB);

% DISPLAY IMAGE MOSAIC
imshow(uint8(newImage));

 

% -------------------------------------------------------------------------
% ------------------------------- other functions -------------------------
% -------------------------------------------------------------------------
function [xp, yp, value] = harris(input_image, sigma,thd, r)
% Detect Harris corners
% Input:
% input_image - gray-scale image (double)
% sigma - standard deviation of smoothing Gaussian
% thd - threshold on the corner response R
% r - radius of region considered in non-maximal suppression
% Output:
% xp - row indices of Harris corner points (as returned by find)
% yp - column indices of Harris corner points
% value - values of R at Harris corner points

% BLUR THE (ALREADY GRAY-SCALE) INPUT WITH A 7x7 GAUSSIAN KERNEL, SIGMA = 1
g1 = fspecial('gaussian', 7, 1);
gray_image = imfilter(input_image, g1);

% FILTER INPUT IMAGE WITH SOBEL KERNEL TO GET GRADIENT ON X AND Y
% ORIENTATION RESPECTIVELY
h = fspecial('sobel');
Ix = imfilter(gray_image,h,'replicate','same');
Iy = imfilter(gray_image,h','replicate','same');

% GENERATE GAUSSIAN FILTER OF SIZE 6*SIGMA (± 3SIGMA) AND OF MINIMUM SIZE 1x1
g = fspecial('gaussian', max(1, fix(6*sigma)), sigma);

Ix2 = imfilter(Ix.^2, g, 'same').*(sigma^2);
Iy2 = imfilter(Iy.^2, g, 'same').*(sigma^2);
Ixy = imfilter(Ix.*Iy, g, 'same').*(sigma^2);

% HARRIS CORNER MEASURE
R = (Ix2.*Iy2 - Ixy.^2)./(Ix2 + Iy2 + eps);
% ANOTHER MEASUREMENT, USUALLY k IS BETWEEN 0.04 ~ 0.06
% response = (Ix2.*Iy2 - Ixy.^2) - k*(Ix2 + Iy2).^2;

% GET RID OF CORNERS WHICH ARE CLOSE TO THE BORDER
R([1:20, end-20:end], :) = 0;
R(:,[1:20,end-20:end]) = 0;

% SUPPRESS NON-MAX
d = 2*r+1;
localmax = ordfilt2(R,d^2,true(d));
R = R.*(and(R==localmax, R>thd));

% RETURN X AND Y COORDINATES
[xp,yp,value] = find(R);

function [newx, newy, newvalue] = ada_nonmax_suppression(xp, yp, value, n)
% Adaptive non-maximum suppression
% For each Harris corner point, the suppression radius is the minimum
% distance from that point to a different point with a sufficiently higher
% corner strength (value(i) < c*value(j)).
% Input:
% xp,yp - coordinates of Harris corner points
% value - corner strength (R) at each point
% n - number of interest points to keep
% Output:
% newx, newy - coordinates of the points kept after suppression
% newvalue - corner strength of the points kept

if(length(xp) < n)
newx = xp;
newy = yp;
newvalue = value;
return;
end

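radius = zeros(length(xp),1); % one suppression radius per input corner
c = .9;
maxvalue = max(value)*c;
for i=1:length(xp)
    if(value(i)>maxvalue)
        % NO OTHER CORNER IS SUFFICIENTLY STRONGER: NEVER SUPPRESSED
        radius(i) = 99999999;
        continue;
    else
        % SQUARED DISTANCES TO ALL CORNERS, THEN DROP THOSE THAT ARE NOT
        % SUFFICIENTLY STRONGER THAN CORNER i
        dist = (xp-xp(i)).^2 + (yp-yp(i)).^2;
        dist((value*c) < value(i)) = [];
        radius(i) = sqrt(min(dist));
    end
end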

[~, index] = sort(radius,'descend');
index = index(1:n);

newx = xp(index);
newy = yp(index);
newvalue = value(index);

function n2 = dist2(x, c)
% DIST2 Calculates squared distance between two sets of points.
% Adapted from Netlab neural network software:
% http://www.ncrg.aston.ac.uk/netlab/index.php
%
% Description
% D = DIST2(X, C) takes two matrices of vectors and calculates the
% squared Euclidean distance between them. Both matrices must be of
% the same column dimension. If X has M rows and N columns, and C has
% L rows and N columns, then the result has M rows and L columns. The
% I, Jth entry is the squared distance from the Ith row of X to the
% Jth row of C.
%
%
% Copyright (c) Ian T Nabney (1996-2001)

[ndata, dimx] = size(x);
[ncentres, dimc] = size(c);
if dimx ~= dimc
error('Data dimension does not match dimension of centres')
end

n2 = (ones(ncentres, 1) * sum((x.^2)', 1))' + ...
ones(ndata, 1) * sum((c.^2)',1) - ...
2.*(x*(c'));

% Rounding errors occasionally cause negative entries in n2
if any(any(n2<0))
n2(n2<0) = 0;
end

function [descriptors] = getFeatureDescriptor(input_image, xp, yp, sigma)
% Extract feature descriptors (not rotation invariant)
% Input:
% input_image - input gray-scale image
% xp - row indices of feature points
% yp - column indices of feature points
% sigma - standard deviation of the Gaussian blur
% Output:
% descriptors - npoints x 64 array of normalized descriptors

% FIRST BLUR WITH GAUSSIAN KERNEL
g = fspecial('gaussian', 5, sigma);
blurred_image = imfilter(input_image, g, 'replicate','same');

% THEN TAKE A 40x40 PIXEL WINDOW AND DOWNSAMPLE TO 8x8 PATCH
npoints = length(xp);
descriptors = zeros(npoints,64);

for i = 1:npoints
%pA = imresize( blurred_image(xp(i)-20:xp(i)+19, yp(i)-20:yp(i)+19), .2);
patch = blurred_image(xp(i)-20:xp(i)+19, yp(i)-20:yp(i)+19);
patch = imresize(patch, .2);
descriptors(i,:) = reshape((patch - mean2(patch))./std2(patch), 1, 64);
end

function [hh] = getHomographyMatrix(point_ref, point_src, npoints)
% Use corresponding points in both images to recover the parameters of the transformation
% Input:
% point_ref - 3xN points to be mapped (rows: x, y, 1)
% point_src - 3xN target points (rows: x, y, 1)
% npoints - number of point correspondences N (at least 4)
% Output:
% hh - 3x3 homography with hh(3,3) = 1, mapping point_ref onto point_src

% COORDINATES OF THE POINT CORRESPONDENCES
x_ref = point_ref(1,:)';
y_ref = point_ref(2,:)';
x_src = point_src(1,:)';
y_src = point_src(2,:)';

% COEFFICIENT MATRIX ON THE LEFT SIDE OF THE LINEAR EQUATIONS A*h = B
A = zeros(npoints*2,8);
A(1:2:end,1:3) = [x_ref, y_ref, ones(npoints,1)];
A(2:2:end,4:6) = [x_ref, y_ref, ones(npoints,1)];
A(1:2:end,7:8) = [-x_ref.*x_src, -y_ref.*x_src];
A(2:2:end,7:8) = [-x_ref.*y_src, -y_ref.*y_src];

% RIGHT SIDE OF THE LINEAR EQUATIONS, INTERLEAVED AS x1, y1, x2, y2, ...
B = [x_src, y_src];
B = reshape(B',npoints*2,1);

% SOLVE THE LINEAR EQUATIONS (LEAST SQUARES WHEN npoints > 4)
h = A\B;

hh = [h(1),h(2),h(3);h(4),h(5),h(6);h(7),h(8),1];

function [hh, inliers] = ransacfithomography(ref_P, dst_P, npoints, threshold)
% RANSAC homography fitting with 8-point samples
% Input:
% ref_P - match points to be mapped, a 3xN matrix whose third row is 1
% dst_P - corresponding target points, a 3xN matrix whose third row is 1
% npoints - number of point correspondences N (must be at least 8)
% threshold - threshold on the squared reprojection distance (pixels^2)
%
% 1. Randomly select a subset of points
% 2. Hypothesize a model
% 3. Compute error function
% 4. Select points consistent with model
% 5. Repeat hypothesize-and-verify loop
%
% Yihua Zhao 02-01-2014
% zhyh8341@gmail.com

ninlier = 0;
hh = eye(3); % fallback if no sample yields any inlier
inliers = [];
fpoints = 8; %number of fitting points
for i=1:2000
    % DISTINCT RANDOM INDICES (randi COULD PICK THE SAME POINT TWICE)
    rd = randperm(npoints, fpoints);
    pR = ref_P(:,rd);
    pD = dst_P(:,rd);
    h = getHomographyMatrix(pR,pD,fpoints);
    rref_P = h*ref_P;
    rref_P(1,:) = rref_P(1,:)./rref_P(3,:);
    rref_P(2,:) = rref_P(2,:)./rref_P(3,:);
    % SQUARED REPROJECTION ERROR (NAMED err TO AVOID SHADOWING error())
    err = (rref_P(1,:) - dst_P(1,:)).^2 + (rref_P(2,:) - dst_P(2,:)).^2;
    n = nnz(err<threshold);
    if(n >= npoints*.95)
        % NEARLY ALL MATCHES AGREE: STOP EARLY
        hh=h;
        inliers = find(err<threshold);
        break;
    elseif(n>ninlier)
        ninlier = n;
        hh=h;
        inliers = find(err<threshold);
    end
end

function [newH, newW, x1, y1, x2, y2] = getNewSize(transform, h2, w2, h1, w1)
% Calculate the size of the new mosaic
% Input:
% transform - homography matrix
% h1 - height of the unwarped image
% w1 - width of the unwarped image
% h2 - height of the warped image
% w2 - width of the warped image
% Output:
% newH - height of the new image
% newW - width of the new image
% x1 - x coordinate of top-left corner of new image
% y1 - y coordinate of top-left corner of new image
% x2 - x coordinate of top-left corner of unwarped image
% y2 - y coordinate of top-left corner of unwarped image
%
% Yihua Zhao 02-02-2014
% zhyh8341@gmail.com
%

% CREATE MESH-GRID FOR THE WARPED IMAGE
[X,Y] = meshgrid(1:w2,1:h2);
AA = ones(3,h2*w2);
AA(1,:) = reshape(X,1,h2*w2);
AA(2,:) = reshape(Y,1,h2*w2);

% DETERMINE THE FOUR CORNER OF NEW IMAGE
newAA = transform\AA;
new_left = fix(min([1,min(newAA(1,:)./newAA(3,:))]));
new_right = fix(max([w1,max(newAA(1,:)./newAA(3,:))]));
new_top = fix(min([1,min(newAA(2,:)./newAA(3,:))]));
new_bottom = fix(max([h1,max(newAA(2,:)./newAA(3,:))]));

newH = new_bottom - new_top + 1;
newW = new_right - new_left + 1;
x1 = new_left;
y1 = new_top;
x2 = 2 - new_left;
y2 = 2 - new_top;

function [newImage] = blend(warped_image, unwarped_image, x, y)
% Blend two images using cross dissolve
% Input:
% warped_image - warped image, already resampled onto the new canvas
% unwarped_image - the other image, in its original coordinates
% x - x coordinate of the top-left corner of unwarped image
% y - y coordinate of the top-left corner of unwarped image
% Output:
% newImage - blended mosaic
%
% Yihua Zhao 02-02-2014
% zhyh8341@gmail.com
%


% MAKE MASKS FOR BOTH IMAGES
warped_image(isnan(warped_image))=0;
maskA = (warped_image(:,:,1)>0 |warped_image(:,:,2)>0 | warped_image(:,:,3)>0);
newImage = zeros(size(warped_image));
newImage(y:y+size(unwarped_image,1)-1, x: x+size(unwarped_image,2)-1,:) = unwarped_image;
mask = (newImage(:,:,1)>0 | newImage(:,:,2)>0 | newImage(:,:,3)>0);
mask = and(maskA, mask);

% GET THE OVERLAID REGION
[~,col] = find(mask);
left = min(col);
right = max(col);
mask = ones(size(mask));
if( x<2)
mask(:,left:right) = repmat(linspace(0,1,right-left+1),size(mask,1),1);
else
mask(:,left:right) = repmat(linspace(1,0,right-left+1),size(mask,1),1);
end

% BLEND EACH CHANNEL
warped_image(:,:,1) = warped_image(:,:,1).*mask;
warped_image(:,:,2) = warped_image(:,:,2).*mask;
warped_image(:,:,3) = warped_image(:,:,3).*mask;

% REVERSE THE ALPHA VALUE
if( x<2)
mask(:,left:right) = repmat(linspace(1,0,right-left+1),size(mask,1),1);
else
mask(:,left:right) = repmat(linspace(0,1,right-left+1),size(mask,1),1);
end
newImage(:,:,1) = newImage(:,:,1).*mask;
newImage(:,:,2) = newImage(:,:,2).*mask;
newImage(:,:,3) = newImage(:,:,3).*mask;

newImage(:,:,1) = warped_image(:,:,1) + newImage(:,:,1);
newImage(:,:,2) = warped_image(:,:,2) + newImage(:,:,2);
newImage(:,:,3) = warped_image(:,:,3) + newImage(:,:,3);

 
