OpenCV每日函数对象追踪模块使用增强相关系数 (ECC) 最大化的图像配准

发布时间：2022-12-10 21:00

一、OpenCV 中的运动模型

在典型的图像对齐问题中，我们有两个场景图像，它们通过运动模型相关联。不同的图像对齐算法旨在使用不同的技巧和假设来估计这些运动模型的参数。一旦知道了这些参数，就可以直接扭曲一张图像以使其与另一张图像对齐。

让我们快速看看这些运动模型是什么样的。

OpenCV每日函数对象追踪模块使用增强相关系数 (ECC) 最大化的图像配准_第1张图片

这显示了正方形的图像如何通过不同的运动模型进行转换

代表这些模型的 OpenCV 常量具有前缀 MOTION_ 并显示在括号内。

平移（MOTION_TRANSLATION）：可以将第一张图像平移（平移）（x，y）以获得第二张图像。我们只需要估计两个参数x和y 。

欧几里得 ( MOTION_EUCLIDEAN ) : 第一个图像是第二个图像的旋转和移动版本。所以有三个参数 - x，y和angle。您会在图 4 中注意到，当正方形经过欧几里得变换时，大小不变，平行线保持平行，变换后直角保持不变。

仿射（MOTION_AFFINE）：仿射变换是旋转、平移（移位）、缩放和剪切的组合。这个变换有六个参数。当正方形进行仿射变换时，平行线保持平行，但以直角相交的线不再保持正交。

Homography ( MOTION_HOMOGRAPHY )：上面描述的所有变换都是 2D 变换。它们不考虑 3D 效果。另一方面，单应变换可以解释一些 3D 效果（但不是全部）。这个变换有 8 个参数。使用 Homography 变换的正方形可以变为任何四边形。

在 OpenCV 中，仿射变换存储在2 x 3大小的矩阵中。平移和欧几里得变换是仿射变换的特例。在平移中，旋转、比例和剪切参数为零，而在欧几里德变换中，比例和剪切参数为零。所以平移和欧几里得变换也存储在一个2 x 3矩阵中。一旦估计了这个矩阵（我们将在下一节中看到），就可以使用函数warpAffine 将图像对齐。

另一方面，单应性存储在3 x 3矩阵中。一旦估计了 Homography，就可以使用warpPerspective将图像对齐。

二、findTransformECC函数概述

findTransformECC函数根据 ECC 标准查找两个图像之间的几何变换（warpMatrix）。

$\texttt{warpMatrix} = \arg\max_{W} \texttt{ECC}(\texttt{templateImage}(x,y),\texttt{inputImage}(x',y'))$ ，其中 $\begin{bmatrix} x' \\ y' \end{bmatrix} = W \cdot \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}$ 。（该方程适用于单应性的齐次坐标）。它返回最终增强的相关系数，即模板图像和最终扭曲输入图像之间的相关系数。当一个3 × 3矩阵以motionType = 0、1或2给出，第三行被忽略。

与 findHomography 和 estimateRigidTransform 不同，函数 findTransformECC 实现了基于强度相似性的基于区域的对齐。本质上，该函数会更新大致对齐图像的初始转换。如果缺少此信息，则将身份扭曲（统一矩阵）用作初始化。请注意，如果图像经历强烈的位移/旋转，则需要进行大致对齐图像的初始变换（例如，允许图像大致显示相同图像内容的简单欧几里德/相似性变换）。在第二个图像中使用反向扭曲来获取接近第一个图像的图像，即使用带有 warpAffine 或 warpPerspective 的标志 WARP_INVERSE_MAP。

OpenCV 3 中引入的 ECC 图像对齐算法基于Georgios D. Evangelidis 和 Emmanouil Z. Psarakis于 2008 年发表的题为使用增强相关系数最大化的参数图像对齐的论文。他们建议使用一种称为增强相关系数 (ECC)的新相似性度量来估计运动模型的参数。使用他们的方法有两个优点。

1、与像素强度差异的传统相似性度量不同，ECC 对对比度和亮度的光度失真是不变的。

2、虽然目标函数是参数的非线性函数，但他们为解决优化问题而开发的迭代方案是线性的。换句话说，他们解决了一个表面上看起来计算量很大的问题，并找到了一种更简单的方法来迭代地解决它。

1、函数原型

double 	cv::findTransformECC (InputArray templateImage, InputArray inputImage, InputOutputArray warpMatrix, int motionType, TermCriteria criteria, InputArray inputMask, int gaussFiltSize)

2、参数详解

templateImage	单通道模板图像； CV_8U 或 CV_32F 。
inputImage	单通道输入图像，应使用最终的 warpMatrix 进行变形，以提供类似于 templateImage 的图像，与 templateImage 相同类型。
warpMatrix	浮点 2×3 或 3×3 映射矩阵（扭曲）。
motionType	参数，指定运动的类型： MOTION_TRANSLATION 设置平移运动模型； warpMatrix 是 2×3，前 2×2 部分是单位矩阵，其余两个参数是估计的。 MOTION_EUCLIDEAN 将欧几里得（刚性）变换设置为运动模型；估计三个参数； warpMatrix 是 2×3。 MOTION_AFFINE 设置仿射运动模型（默认）；估计六个参数； warpMatrix 是 2×3。 MOTION_HOMOGRAPHY 将单应性设置为运动模型；估计了八个参数；`warpMatrix` 为 3×3。
criteria	参数，指定ECC算法的终止标准；标准.epsilon 定义了两次迭代之间相关系数增量的阈值（负标准.epsilon 使标准.maxcount 成为唯一的终止标准）。默认值显示在上面的声明中。
inputMask	一个可选掩码，用于指示 inputImage 的有效值。
gaussFiltSize	表示高斯模糊滤波器大小的可选值；（默认：5）

三、OpenCV源码

1、源码路径

opencv\modules\video\src\ecc.cpp

2、源码代码

double cv::findTransformECC(InputArray templateImage,
                            InputArray inputImage,
                            InputOutputArray warpMatrix,
                            int motionType,
                            TermCriteria criteria,
                            InputArray inputMask,
                            int gaussFiltSize)
{


    Mat src = templateImage.getMat();//template image
    Mat dst = inputImage.getMat(); //input image (to be warped)
    Mat map = warpMatrix.getMat(); //warp (transformation)

    CV_Assert(!src.empty());
    CV_Assert(!dst.empty());

    // If the user passed an un-initialized warpMatrix, initialize to identity
    if(map.empty()) {
        int rowCount = 2;
        if(motionType == MOTION_HOMOGRAPHY)
            rowCount = 3;

        warpMatrix.create(rowCount, 3, CV_32FC1);
        map = warpMatrix.getMat();
        map = Mat::eye(rowCount, 3, CV_32F);
    }

    if( ! (src.type()==dst.type()))
        CV_Error( Error::StsUnmatchedFormats, "Both input images must have the same data type" );

    //accept only 1-channel images
    if( src.type() != CV_8UC1 && src.type()!= CV_32FC1)
        CV_Error( Error::StsUnsupportedFormat, "Images must have 8uC1 or 32fC1 type");

    if( map.type() != CV_32FC1)
        CV_Error( Error::StsUnsupportedFormat, "warpMatrix must be single-channel floating-point matrix");

    CV_Assert (map.cols == 3);
    CV_Assert (map.rows == 2 || map.rows ==3);

    CV_Assert (motionType == MOTION_AFFINE || motionType == MOTION_HOMOGRAPHY ||
        motionType == MOTION_EUCLIDEAN || motionType == MOTION_TRANSLATION);

    if (motionType == MOTION_HOMOGRAPHY){
        CV_Assert (map.rows ==3);
    }

    CV_Assert (criteria.type & TermCriteria::COUNT || criteria.type & TermCriteria::EPS);
    const int    numberOfIterations = (criteria.type & TermCriteria::COUNT) ? criteria.maxCount : 200;
    const double termination_eps    = (criteria.type & TermCriteria::EPS)   ? criteria.epsilon  :  -1;

    int paramTemp = 6;//default: affine
    switch (motionType){
      case MOTION_TRANSLATION:
          paramTemp = 2;
          break;
      case MOTION_EUCLIDEAN:
          paramTemp = 3;
          break;
      case MOTION_HOMOGRAPHY:
          paramTemp = 8;
          break;
    }


    const int numberOfParameters = paramTemp;

    const int ws = src.cols;
    const int hs = src.rows;
    const int wd = dst.cols;
    const int hd = dst.rows;

    Mat Xcoord = Mat(1, ws, CV_32F);
    Mat Ycoord = Mat(hs, 1, CV_32F);
    Mat Xgrid = Mat(hs, ws, CV_32F);
    Mat Ygrid = Mat(hs, ws, CV_32F);

    float* XcoPtr = Xcoord.ptr(0);
    float* YcoPtr = Ycoord.ptr(0);
    int j;
    for (j=0; j= termination_eps); i++)
    {

        // warp-back portion of the inputImage and gradients to the coordinate space of the templateImage
        if (motionType != MOTION_HOMOGRAPHY)
        {
            warpAffine(imageFloat, imageWarped,     map, imageWarped.size(),     imageFlags);
            warpAffine(gradientX,  gradientXWarped, map, gradientXWarped.size(), imageFlags);
            warpAffine(gradientY,  gradientYWarped, map, gradientYWarped.size(), imageFlags);
            warpAffine(preMask,    imageMask,       map, imageMask.size(),       maskFlags);
        }
        else
        {
            warpPerspective(imageFloat, imageWarped,     map, imageWarped.size(),     imageFlags);
            warpPerspective(gradientX,  gradientXWarped, map, gradientXWarped.size(), imageFlags);
            warpPerspective(gradientY,  gradientYWarped, map, gradientYWarped.size(), imageFlags);
            warpPerspective(preMask,    imageMask,       map, imageMask.size(),       maskFlags);
        }

        Scalar imgMean, imgStd, tmpMean, tmpStd;
        meanStdDev(imageWarped,   imgMean, imgStd, imageMask);
        meanStdDev(templateFloat, tmpMean, tmpStd, imageMask);

        subtract(imageWarped,   imgMean, imageWarped, imageMask);//zero-mean input
        templateZM = Mat::zeros(templateZM.rows, templateZM.cols, templateZM.type());
        subtract(templateFloat, tmpMean, templateZM,  imageMask);//zero-mean template

        const double tmpNorm = std::sqrt(countNonZero(imageMask)*(tmpStd.val[0])*(tmpStd.val[0]));
        const double imgNorm = std::sqrt(countNonZero(imageMask)*(imgStd.val[0])*(imgStd.val[0]));

        // calculate jacobian of image wrt parameters
        switch (motionType){
            case MOTION_AFFINE:
                image_jacobian_affine_ECC(gradientXWarped, gradientYWarped, Xgrid, Ygrid, jacobian);
                break;
            case MOTION_HOMOGRAPHY:
                image_jacobian_homo_ECC(gradientXWarped, gradientYWarped, Xgrid, Ygrid, map, jacobian);
                break;
            case MOTION_TRANSLATION:
                image_jacobian_translation_ECC(gradientXWarped, gradientYWarped, jacobian);
                break;
            case MOTION_EUCLIDEAN:
                image_jacobian_euclidean_ECC(gradientXWarped, gradientYWarped, Xgrid, Ygrid, map, jacobian);
                break;
        }

        // calculate Hessian and its inverse
        project_onto_jacobian_ECC(jacobian, jacobian, hessian);

        hessianInv = hessian.inv();

        const double correlation = templateZM.dot(imageWarped);

        // calculate enhanced correlation coefficient (ECC)->rho
        last_rho = rho;
        rho = correlation/(imgNorm*tmpNorm);
        if (cvIsNaN(rho)) {
          CV_Error(Error::StsNoConv, "NaN encountered.");
        }

        // project images into jacobian
        project_onto_jacobian_ECC( jacobian, imageWarped, imageProjection);
        project_onto_jacobian_ECC(jacobian, templateZM, templateProjection);


        // calculate the parameter lambda to account for illumination variation
        imageProjectionHessian = hessianInv*imageProjection;
        const double lambda_n = (imgNorm*imgNorm) - imageProjection.dot(imageProjectionHessian);
        const double lambda_d = correlation - templateProjection.dot(imageProjectionHessian);
        if (lambda_d <= 0.0)
        {
            rho = -1;
            CV_Error(Error::StsNoConv, "The algorithm stopped before its convergence. The correlation is going to be minimized. Images may be uncorrelated or non-overlapped");

        }
        const double lambda = (lambda_n/lambda_d);

        // estimate the update step delta_p
        error = lambda*templateZM - imageWarped;
        project_onto_jacobian_ECC(jacobian, error, errorProjection);
        deltaP = hessianInv * errorProjection;

        // update warping matrix
        update_warping_matrix_ECC( map, deltaP, motionType);


    }

    // return final correlation coefficient
    return rho;
}

double cv::findTransformECC(InputArray templateImage, InputArray inputImage,
    InputOutputArray warpMatrix, int motionType,
    TermCriteria criteria,
    InputArray inputMask)
{
    // Use default value of 5 for gaussFiltSize to maintain backward compatibility.
    return findTransformECC(templateImage, inputImage, warpMatrix, motionType, criteria, inputMask, 5);
}

四、效果图像示例

在OpenCV中，ECC 图像对齐的运动模型使用函数findTransformECC进行估计。以下是使用 findTransformECC 的步骤

读取图像并转换为灰度。

选择您要估计的运动模型。

分配空间（warp_matrix）来存储运动模型。

定义一个终止标准，告诉算法何时停止。

使用 findTransformECC 估计变换矩阵。

将变换矩阵应用于其中一个图像以将其与另一张图像对齐。

下面实例中用到的原始图片，可以在下面网盘获得

链接：https://pan.baidu.com/s/1KDY8VBzgaCn276tgLH_VmA 
提取码：yuks

1、简单对齐示例

下面示例代码可以用来解决及其简单的对齐问题，现实视觉问题往往非常复杂。

// Read the images to be aligned
	Mat im1 = imread("images\\image1.jpg");
	Mat im2 = imread("images\\image2.jpg");

	// Convert images to gray scale;
	Mat im1_gray, im2_gray;
	cvtColor(im1, im1_gray, cv::COLOR_BGR2GRAY);
	cvtColor(im2, im2_gray, cv::COLOR_BGR2GRAY);

	// Define the motion model
	const int warp_mode = MOTION_EUCLIDEAN;

	// Set a 2x3 or 3x3 warp matrix depending on the motion model.
	Mat warp_matrix;

	// Initialize the matrix to identity
	if (warp_mode == MOTION_HOMOGRAPHY)
		warp_matrix = Mat::eye(3, 3, CV_32F);
	else
		warp_matrix = Mat::eye(2, 3, CV_32F);

	// Specify the number of iterations.
	int number_of_iterations = 5000;

	// Specify the threshold of the increment
	// in the correlation coefficient between two iterations
	double termination_eps = 1e-10;

	// Define termination criteria
	TermCriteria criteria(TermCriteria::COUNT + TermCriteria::EPS, number_of_iterations, termination_eps);

	// Run the ECC algorithm. The results are stored in warp_matrix.
	findTransformECC(
		im1_gray,
		im2_gray,
		warp_matrix,
		warp_mode,
		criteria
	);

	// Storage for warped image.
	Mat im2_aligned;

	if (warp_mode != MOTION_HOMOGRAPHY)
		// Use warpAffine for Translation, Euclidean and Affine
		warpAffine(im2, im2_aligned, warp_matrix, im1.size(), INTER_LINEAR + WARP_INVERSE_MAP);
	else
		// Use warpPerspective for Homography
		warpPerspective(im2, im2_aligned, warp_matrix, im1.size(), INTER_LINEAR + WARP_INVERSE_MAP);

	cv::imwrite("Aligned.jpg", im2_aligned);

OpenCV每日函数对象追踪模块使用增强相关系数 (ECC) 最大化的图像配准_第4张图片

对齐结果

2、通道对齐示例

//读取 8 位彩色图像。这是三个通道垂直连接的图像。
	Mat im = imread("images\\emir.jpg", IMREAD_GRAYSCALE);

	// Find the width and height of the color image
	Size sz = im.size();
	int height = sz.height / 3;
	int width = sz.width;

	// Extract the three channels from the gray scale image
	vectorchannels;
	channels.push_back(im(Rect(0, 0, width, height)));
	channels.push_back(im(Rect(0, height, width, height)));
	channels.push_back(im(Rect(0, 2 * height, width, height)));

	// Merge the three channels into one color image
	Mat im_color;
	merge(channels, im_color);

	// Set space for aligned image.
	vector aligned_channels;
	aligned_channels.push_back(Mat(height, width, CV_8UC1));
	aligned_channels.push_back(Mat(height, width, CV_8UC1));

	// The blue and green channels will be aligned to the red channel.
	// So copy the red channel
	aligned_channels.push_back(channels[2].clone());


	// Define motion model
	const int warp_mode = MOTION_AFFINE;

	// Set space for warp matrix.
	Mat warp_matrix;

	// Set the warp matrix to identity.
	if (warp_mode == MOTION_HOMOGRAPHY)
		warp_matrix = Mat::eye(3, 3, CV_32F);
	else
		warp_matrix = Mat::eye(2, 3, CV_32F);

	// Set the stopping criteria for the algorithm.
	int number_of_iterations = 5000;
	double termination_eps = 1e-10;

	TermCriteria criteria(TermCriteria::COUNT + TermCriteria::EPS,
		number_of_iterations, termination_eps);


	// Warp the blue and green channels to the red channel
	for (int i = 0; i < 2; i++)
	{

		double cc = findTransformECC(
			GetGradient(channels[2]),
			GetGradient(channels[i]),
			warp_matrix,
			warp_mode,
			criteria
		);

		cout << "warp_matrix : " << warp_matrix << endl;
		cout << "CC " << cc << endl;
		if (cc == -1)
		{
			cerr << "The execution was interrupted. The correlation value is going to be minimized." << endl;
			cerr << "Check the warp initialization and/or the size of images." << endl << flush;
		}


		if (warp_mode == MOTION_HOMOGRAPHY)
			// Use Perspective warp when the transformation is a Homography
			warpPerspective(channels[i], aligned_channels[i], warp_matrix, aligned_channels[0].size(), INTER_LINEAR + WARP_INVERSE_MAP);
		else
			// Use Affine warp when the transformation is not a Homography
			warpAffine(channels[i], aligned_channels[i], warp_matrix, aligned_channels[0].size(), INTER_LINEAR + WARP_INVERSE_MAP);

	}

	// Merge the three channels
	Mat im_aligned;
	merge(aligned_channels, im_aligned);

	// Show final output
	imshow("Color Image", im_color);
	imshow("Aligned Image", im_aligned);
	waitKey(0);

上图是RGB通道未对齐的图像，下图是对齐后的图像。

OpenCV每日函数 对象追踪模块 使用增强相关系数 (ECC) 最大化的图像配准