随笔 - 54  文章 - 0  评论 - 45  阅读 - 45万 

    透视 n 点问题,简称 PnP,源自相机标定,是计算机视觉的经典问题,广泛应用在机器人定位、SLAM、AR/VR、摄影测量等领域

1  PnP 问题

1.1  定义

    已知:相机的内参和畸变系数;世界坐标系中,n 个空间点坐标,以及投影在像平面上的像素坐标

    求解:相机在世界坐标系下的位姿 R 和 t,即 {W} 到 {C} 的变换矩阵 cwT,如下图:

                

    世界坐标系中的 3d 空间点,与投影到像平面的 2d 像素点,两者之间的关系为:

    s[uv1]=[fx0cx0fycy001][r11r12r13t1r21r22r23t2r31r32r33t3][XwYwZw1]

1.2  分类

    根据给定空间点的数量,可将 PnP 问题分为两类:

    第一类  3≤n≤5,选取的空间点较少,可通过联立方程组的方式求解,精度易受图像噪声影响,鲁棒性较差

    第二类  n≥6,选取的空间点较多,可转化为求解超定方程的问题,一般侧重于鲁棒性和实时性的平衡

 

2  求解方法

2.1  DLT 法

2.1.1  转化为 Ax=0

    令 P=K[Rt]K 为相机内参矩阵,则 PnP 问题可简化为:已知 n 组 3d-2d 对应点,求解 P3×4  

    DLT (Direct Linear Transformation,直接线性变换),便是直接利用这 n 组对应点,构建线性方程组来求解

    s[uv1]=[p11p12p13p14p21p22p23p23p31p32p33p33][XwYwZw1]

    简化符号 Xw,Yw,ZwX,Y,Z,展开得: 

    {su=p11X+p12Y+p13Z+p14sv=p21X+p22Y+p23Z+p24s=p31X+p32Y+p33Z+p34=>{Xp11+Yp12+Zp13+p14uXp31uYp32uZp33up34=0Xp21+Yp22+Zp23+p24vXp31vYp32vZp33vp34=0   

    未知数有 11 个 (p34可约掉),则至少需要 6 组对应点,写成矩阵形式如下:

    [X1Y1Z110000u1X1u1Y1u1Z1u10000X1Y1Z11v1X1v1Y1v1Z1v1XnYnZn10000unXnunY1unZnun0000XnYnZn1vnXnvnYnvnZnvn][p11p12p13p14p32p33p34]=[00]

    因此,求解 P3×4 便转化成了 Ax=0 的问题  

2.1.2  SVD 求 R t

    给定相机内参矩阵,则有  K[Rt]=λ[p1p2p3p4]

    考虑 λ 符号无关,得  λR=K1[p1p2p3]

    SVD 分解  K1[p1p2p3]=U[d11d22d33]VT   

    =>λd11  和  {R=UVTt=K1p4d11

2.2  P3P 法

    当 n=3 时,PnP 即为 P3P,它有 4 个可能的解,求解方法是 余弦定理 + 向量点积

2.2.1  余弦定理

    根据投影几何的消隐点和消隐线,构建 3d-2d 之间的几何关系,如下:

      

    根据余弦定理,则有

    {d12+d222d1d2cosθ12=p122d22+d322d2d3cosθ23=p232d32+d12+2d3d2cosθ23=p312

    只有 d1,d2,d3 是未知数,求解方程组即可

    其中,有个关键的隐含点:给定相机内参,以及 3d-2d 的投影关系,则消隐线之间的夹角 θ12θ23θ31 是可计算得出的 

2.2.2  向量点积

    相机坐标系中,原点即为消隐点,原点到 3d-2d 的连线即为消隐线,如图所示:

                      

    如果知道 3d点 投影到像平面的 2d点,在相机坐标系中的坐标 U1,U2,U3,则 cosθ23=OU2OU3||OU2||||OU3||        

    具体到运算,可视为 世界坐标系 {W} 和 相机坐标系 {C} 重合,且 Z=f,则有

    [Rt]=[100001000010]=> s[uv1]=[fx0cx0fycy001][XcYcZc]

    K1 可用增广矩阵求得,且 Zc=f,则有

    [XcYcf]=sK1[uv1]

    记 u=[XcYcZc],则 cosθ12=(K1u1)T(K1u2)||K1u1||||K1u2||,以此类推 cosθ23cosθ31

  

3  OpenCV 函数

    OpenCV 中解 PnP 的方法有 9 种,目前实现了 7 种,还有 2 种未实现,对应论文如下:

     -  SOLVEPNP_P3P               Complete Solution Classification for the Perspective-Three-Point Problem

     -  SOLVEPNP_AP3P             An Efficient Algebraic Solution to the Perspective-Three-Point Problem

     -  SOLVEPNP_ITERATIVE             基于 L-M 最优化方法,求解重投影误差最小的位姿

     -  SOLVEPNP_EPNP                           EPnP: An Accurate O(n) Solution to the PnP Problem

     -  SOLVEPNP_SQPNP                         A Consistently Fast and Globally Optimal Solution to the Perspective-n-Point Problem

     -  SOLVEPNP_IPPE                                        Infinitesimal Plane-based Pose Estimation    输入的 3D 点需要共面且 n ≥ 4

     -  SOLVEPNP_IPPE_SQUARE                         SOLVEPNP_IPPE 的一种特殊情况,要求输入 4 个共面点的坐标,并且按照特定的顺序排列

     -  SOLVEPNP_DLS (未实现)                   A Direct Least-Squares (DLS) Method for PnP     实际调用 SOLVEPNP_EPNP

     -  SOLVEPNP_UPLP (未实现)                 Exhaustive Linearization for Robust Camera Pose and Focal Length Estimation    实际调用 SOLVEPNP_EPNP

3.1  solveP3P()

    solveP3P() 的输入是 3 组 3d-2d 对应点,定义如下:

1
2
3
4
5
6
7
8
9
10
// P3P has up to 4 solutions, and the solutions are sorted by reprojection errors(lowest to highest).
int solveP3P (
    InputArray     objectPoints,     // object points, 3x3 1-channel or 1x3/3x1 3-channel. vector<Point3f> can be also passed
    InputArray      imagePoints,     // corresponding image points, 3x2 1-channel or 1x3/3x1 2-channel. vector<Point2f> can be also passed
    InputArray     cameraMatrix,     // camera intrinsic matrix
    InputArray       distCoeffs,     // distortion coefficients.If NULL/empty, the zero distortion coefficients are assumed.
    OutputArrayOfArrays   rvecs,     // rotation vectors
    OutputArrayOfArrays   tvecs,     // translation vectors
    int                   flags      // solving method
);   

3.2  solvePnP() 和 solvePnPGeneric() 

    solvePnP() 实际上调用的是 solvePnPGeneric(),内部实现如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
bool solvePnP(InputArray opoints, InputArray ipoints, InputArray cameraMatrix, InputArray distCoeffs, OutputArray rvec, OutputArray tvec, bool useExtrinsicGuess, int flags)
{
    CV_INSTRUMENT_REGION();
 
    vector<Mat> rvecs, tvecs;
    int solutions = solvePnPGeneric(opoints, ipoints, cameraMatrix, distCoeffs, rvecs, tvecs, useExtrinsicGuess, (SolvePnPMethod)flags, rvec, tvec);
 
    if (solutions > 0)
    {
        int rdepth = rvec.empty() ? CV_64F : rvec.depth();
        int tdepth = tvec.empty() ? CV_64F : tvec.depth();
        rvecs[0].convertTo(rvec, rdepth);
        tvecs[0].convertTo(tvec, tdepth);
    }
 
    return solutions > 0;
}  

    solvePnPGeneric() 除了求解相机位姿外,还可得到重投影误差,其定义如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
bool solvePnPGeneric (
    InputArray         objectPoints,    // object points, Nx3 1-channel or 1xN/Nx1 3-channel, N is the number of points. vector<Point3d> can be also passed
    InputArray          imagePoints,    // corresponding image points, Nx2 1-channel or 1xN/Nx1 2-channel, N is the number of points. vector<Point2d> can be also passed
    InputArray         cameraMatrix,    // camera intrinsic matrix
    InputArray           distCoeffs,    // distortion coefficients
    OutputArrayOfArrays        rvec,    // rotation vector
    OutputArrayOfArrays        tvec,    // translation vector
    bool  useExtrinsicGuess = false,    // used for SOLVEPNP_ITERATIVE. If true, use the provided rvec and tvec as initial approximations, and further optimize them.
    SolvePnPMethod  flags = SOLVEPNP_ITERATIVE, // solving method
    InputArray               rvec = noArray(),  // initial rotation vector when using SOLVEPNP_ITERATIVE and useExtrinsicGuess is set to true
    InputArray               tvec = noArray(),  // initial translation vector when using SOLVEPNP_ITERATIVE and useExtrinsicGuess is set to true
    OutputArray reprojectionError = noArray()   // optional vector of reprojection error, that is the RMS error
);   

3.3  solvePnPRansac()

    solvePnP() 的一个缺点是鲁棒性不强,对异常点敏感,这在相机标定中问题不大,因为标定板的图案已知,并且特征提取较为稳定

    然而,当相机拍摄实际物体时,因为特征难以稳定提取,会出现一些异常点,导致位姿估计的不准,因此,需要一种处理异常点的方法

    RANSAC 便是一种高效剔除异常点的方法,对应 solvePnPRansac(),它是一个重载函数,共有 2 种参数形式,第 1 种形式如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
bool solvePnPRansac (
    InputArray         objectPoints,    // object points, Nx3 1-channel or 1xN/Nx1 3-channel, N is the number of points. vector<Point3d> can be also passed
    InputArray          imagePoints,    // corresponding image points, Nx2 1-channel or 1xN/Nx1 2-channel, N is the number of points. vector<Point2d> can be also passed
    InputArray         cameraMatrix,    // camera intrinsic matrix
    InputArray           distCoeffs,    // distortion coefficients
    OutputArray                rvec,    // rotation vector
    OutputArray                tvec,    // translation vector
    bool  useExtrinsicGuess = false,    // used for SOLVEPNP_ITERATIVE. If true, use the provided rvec and tvec as initial approximations, and further optimize them.
    int       iterationsCount = 100,    // number of iterations
    float   reprojectionError = 8.0,    // inlier threshold value. It is the maximum allowed distance between the observed and computed point projections to consider it an inlier
    double        confidence = 0.99,    // the probability that the algorithm produces a useful result
    OutputArray inliers = noArray(),    // output vector that contains indices of inliers in objectPoints and imagePoints
    int  flags = SOLVEPNP_ITERATIVE     // solving method
);  

 3.4  solvePnPRefineLM() 和 solvePnPRefineVVS()

    OpenCV 中还有 2 个位姿细化函数:通过迭代不断减小重投影误差,从而求得最佳位姿,solvePnPRefineLM() 使用 L-M 算法,solvePnPRefineVVS() 则用虚拟视觉伺服 (Virtual Visual Servoing)

    solvePnPRefineLM() 的定义如下:

1
2
3
4
5
6
7
8
9
  void solvePnPRefineLM (
      InputArray      objectPoints,  // object points, Nx3 1-channel or 1xN/Nx1 3-channel, N is the number of points
      InputArray       imagePoints,  // corresponding image points, Nx2 1-channel or 1xN/Nx1 2-channel
      InputArray      cameraMatrix,  // camera intrinsic matrix
      InputArray        distCoeffs,  // distortion coefficients
      InputOutputArray      rvec,      // input/output rotation vector
      InputOutputArray      tvec,      // input/output translation vector     
      TermCriteria  criteria = TermCriteria(TermCriteria::EPS+TermCriteria::COUNT, 20, FLT_EPSILON) // Criteria when to stop the LM iterative algorithm
);   

 

4  应用实例

4.1  位姿估计 (静态+标定板) 

    当手持标定板旋转不同角度时,利用相机内参 + solvePnP(),便可求出相机相对标定板的位姿

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
#include "opencv2/imgproc.hpp"
#include "opencv2/highgui.hpp"
#include "opencv2/calib3d.hpp"
 
using namespace std;
using namespace cv;
 
Size kPatternSize = Size(9, 6);
float kSquareSize = 0.025;
// camera intrinsic parameters and distortion coefficient
const Mat cameraMatrix = (Mat_<double>(3, 3) << 5.3591573396163199e+02, 0.0, 3.4228315473308373e+02,
                                                0.0, 5.3591573396163199e+02, 2.3557082909788173e+02,
                                                0.0, 0.0, 1.0);
const Mat distCoeffs = (Mat_<double>(5, 1) << -2.6637260909660682e-01, -3.8588898922304653e-02, 1.7831947042852964e-03,
                                              -2.8122100441115472e-04, 2.3839153080878486e-01);
 
int main()
{
    // 1) read image
    Mat src = imread("left07.jpg");
    if (src.empty())
        return -1;
    // prepare for subpixel corner
    Mat src_gray;
    cvtColor(src, src_gray, COLOR_BGR2GRAY);
 
    // 2) find chessboard corners and subpixel refining
    vector<Point2f> corners;
    bool patternfound = findChessboardCorners(src, kPatternSize, corners);
    if (patternfound) {
        cornerSubPix(src_gray, corners, Size(11, 11), Size(-1, -1), TermCriteria(TermCriteria::EPS + TermCriteria::MAX_ITER, 30, 0.1));
    }
    else {
        return -1;
    }
 
    // 3) object coordinates
    vector<Point3f> objectPoints;
    for (int i = 0; i < kPatternSize.height; i++)
    {
        for (int j = 0; j < kPatternSize.width; j++)
        {
            objectPoints.push_back(Point3f(float(j * kSquareSize), float(i * kSquareSize), 0));
        }
    }
 
    // 4) Rotation and Translation vectors
    Mat rvec, tvec;
    solvePnP(objectPoints, corners, cameraMatrix, distCoeffs, rvec, tvec);
 
    // 5) project estimated pose on the image
    drawFrameAxes(src, cameraMatrix, distCoeffs, rvec, tvec, 2*kSquareSize);
    imshow("Pose estimation", src);
    waitKey();
}  

    当标定板旋转不同角度时,相机所对应的位姿如下:

                   

4.2  位姿估计 (实时+任意物)

    OpenCV 中有一个实时目标跟踪例程,位于 "opencv\samples\cpp\tutorial_code\calib3d\real_time_pose_estimation" 中,实现步骤如下:

    1)  读取目标的三维模型和网格 -> 2)  获取视频流 -> 3)  ORB 特征检测 -> 4) 3d-2d 特征匹配 -> 5)  相机位姿估计 -> 6)  卡尔曼滤波

    例程中设计了一个 PnPProblem 类来实现位姿估计,其中 2 个重要的函数 estimatePoseRANSAC() 和 backproject3DPoint() 定义如下:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
class PnPProblem
{
public:
    explicit PnPProblem(const double param[]);  // custom constructor
    virtual ~PnPProblem();
 
    cv::Point2f backproject3DPoint(const cv::Point3f& point3d);
 
    void estimatePoseRANSAC(const std::vector<cv::Point3f>& list_points3d, const std::vector<cv::Point2f>& list_points2d,
                            int flags, cv::Mat& inliers, int iterationsCount, float reprojectionError, double confidence);
    // ...
}
 
// Custom constructor given the intrinsic camera parameters
PnPProblem::PnPProblem(const double params[])
{
    // intrinsic camera parameters
    _A_matrix = cv::Mat::zeros(3, 3, CV_64FC1);  
    _A_matrix.at<double>(0, 0) = params[0];       //  [ fx   0  cx ]
    _A_matrix.at<double>(1, 1) = params[1];       //  [  0  fy  cy ]
    _A_matrix.at<double>(0, 2) = params[2];       //  [  0   0   1 ]
    _A_matrix.at<double>(1, 2) = params[3];
    _A_matrix.at<double>(2, 2) = 1;
    // rotation matrix, translation matrix, rotation-translation matrix
    _R_matrix = cv::Mat::zeros(3, 3, CV_64FC1);
    _t_matrix = cv::Mat::zeros(3, 1, CV_64FC1);
    _P_matrix = cv::Mat::zeros(3, 4, CV_64FC1);
}
 
// Estimate the pose given a list of 2D/3D correspondences with RANSAC and the method to use
void PnPProblem::estimatePoseRANSAC (
    const std::vector<Point3f>& list_points3d,        // list with model 3D coordinates
    const std::vector<Point2f>& list_points2d,        // list with scene 2D coordinates
    int flags, Mat& inliers, int iterationsCount,     // PnP method; inliers container
    float reprojectionError, float confidence)        // RANSAC parameters
{
    // distortion coefficients, rotation vector and translation vector
    Mat distCoeffs = Mat::zeros(4, 1, CV_64FC1);
    Mat rvec = Mat::zeros(3, 1, CV_64FC1);         
    Mat tvec = Mat::zeros(3, 1, CV_64FC1);  
    // no initial approximations
    bool useExtrinsicGuess = false;          
 
    // PnP + RANSAC
    solvePnPRansac(list_points3d, list_points2d, _A_matrix, distCoeffs, rvec, tvec, useExtrinsicGuess, iterationsCount, reprojectionError, confidence, inliers, flags);
 
    // converts Rotation Vector to Matrix
    Rodrigues(rvec, _R_matrix);                  
    _t_matrix = tvec;                            // set translation matrix
    this->set_P_matrix(_R_matrix, _t_matrix);    // set rotation-translation matrix
}
 
// Backproject a 3D point to 2D using the estimated pose parameters
cv::Point2f PnPProblem::backproject3DPoint(const cv::Point3f& point3d)
{
    // 3D point vector [x y z 1]'
    cv::Mat point3d_vec = cv::Mat(4, 1, CV_64FC1);
    point3d_vec.at<double>(0) = point3d.x;
    point3d_vec.at<double>(1) = point3d.y;
    point3d_vec.at<double>(2) = point3d.z;
    point3d_vec.at<double>(3) = 1;
 
    // 2D point vector [u v 1]'
    cv::Mat point2d_vec = cv::Mat(4, 1, CV_64FC1);
    point2d_vec = _A_matrix * _P_matrix * point3d_vec;
 
    // Normalization of [u v]'
    cv::Point2f point2d;
    point2d.x = (float)(point2d_vec.at<double>(0) / point2d_vec.at<double>(2));
    point2d.y = (float)(point2d_vec.at<double>(1) / point2d_vec.at<double>(2));
 
    return point2d;
}

    PnPProblem 类的调用如下:实例化 -> estimatePoseRansac() 估计位姿 -> backproject3DPoint() 画出位姿

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
// Intrinsic camera parameters: UVC WEBCAM
double f = 55;                     // focal length in mm
double sx = 22.3, sy = 14.9;       // sensor size
double width = 640, height = 480;  // image size
double params_WEBCAM[] = { width * f / sx,   // fx
                           height * f / sy,  // fy
                           width / 2,      // cx
                           height / 2 };   // cy
// instantiate PnPProblem class
PnPProblem pnp_detection(params_WEBCAM);
 
// RANSAC parameters
int iterCount = 500;        // number of Ransac iterations.
float reprojectionError = 2.0;    // maximum allowed distance to consider it an inlier.
float confidence = 0.95;          // RANSAC successful confidence.
 
// OpenCV requires solvePnPRANSAC to minimally have 4 set of points
if (good_matches.size() >= 4)
{
    // -- Step 3: Estimate the pose using RANSAC approach
    pnp_detection.estimatePoseRANSAC(list_points3d_model_match, list_points2d_scene_match,
        pnpMethod, inliers_idx, iterCount, reprojectionError, confidence);
 
    // ... ..
}
 
// ... ... 
 
float fp = 5;
vector<Point2f> pose2d;
pose2d.push_back(pnp_detect_est.backproject3DPoint(Point3f(0, 0, 0))); // axis center
pose2d.push_back(pnp_detect_est.backproject3DPoint(Point3f(fp, 0, 0))); // axis x
pose2d.push_back(pnp_detect_est.backproject3DPoint(Point3f(0, fp, 0))); // axis y
pose2d.push_back(pnp_detect_est.backproject3DPoint(Point3f(0, 0, fp))); // axis z
 
draw3DCoordinateAxes(frame_vis, pose2d); // draw axes
 
// ... ...

  实时目标跟踪的效果如下:

            

    

 

参考资料

  OpenCV-Python Tutorials / Camera Calibration and 3D Reconstruction / Pose Estimation

  OpenCV Tutorials / Camera calibration and 3D reconstruction (calib3d module) / Real time pose estimation of a textured object

  一种改进的 PnP 问题求解算法研究[J]

  Perspective-n-Point, Hyun Soo Park

 

posted on   飞鸢逐浪  阅读(4888)  评论(0编辑  收藏  举报
编辑推荐:
· AI与.NET技术实操系列:向量存储与相似性搜索在 .NET 中的实现
· 基于Microsoft.Extensions.AI核心库实现RAG应用
· Linux系列:如何用heaptrack跟踪.NET程序的非托管内存泄露
· 开发者必知的日志记录最佳实践
· SQL Server 2025 AI相关能力初探
阅读排行:
· winform 绘制太阳,地球,月球 运作规律
· 震惊!C++程序真的从main开始吗?99%的程序员都答错了
· AI与.NET技术实操系列(五):向量存储与相似性搜索在 .NET 中的实现
· 【硬核科普】Trae如何「偷看」你的代码?零基础破解AI编程运行原理
· 超详细:普通电脑也行Windows部署deepseek R1训练数据并当服务器共享给他人
点击右上角即可分享
微信分享提示