人脸姿态的随机采样一致性(RANSAC)
Random Sample Consensus: A Paradigm for Model Fitting with Apphcatlons to Image Analysis and Automated Cartography
(早就听说过随机采样一致性,但论文还是第一次读,居然是一辆1981年的老爷车。)
在实现人脸姿态估计时,通过opencv自带的DLT(solvepnp)解算出来的值误差很大,而且不稳定。
经过一系列思考和验证,发现实现上并无差错。那么是什么造成了呢?
这是二阶段或多阶段模型的共性错误:通常为分类误差引起的gross error。按照论文所述,分类误差是不服从于正态分布的,所以难以平滑。但可以使用随机采样一致性解决。
随机采样一致性的大致步骤,可以用伪代码表示:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 | Given: data – a set of observed data points model – a model that can be fitted to data points n – the minimum number of data values required to fit the model k – the maximum number of iterations allowed in the algorithm t – a threshold value for determining when a data point fits a model d – the number of close data values required to assert that a model fits well to data Return: bestfit – model parameters which best fit the data (or nul if no good model is found) iterations = 0 bestfit = nul besterr = something really large while iterations < k { maybeinliers = n randomly selected values from data maybemodel = model parameters fitted to maybeinliers alsoinliers = empty set for every point in data not in maybeinliers { if point fits maybemodel with an error smaller than t add point to alsoinliers } if the number of elements in alsoinliers is > d { % this implies that we may have found a good model % now test how good it is bettermodel = model parameters fitted to all points in maybeinliers and alsoinliers thiserr = a measure of how well model fits these points if thiserr < besterr { bestfit = bettermodel besterr = thiserr } } increment iterations } return bestfit |
论文简单讨论了随机采样一致性的k、t值,t值应该取测量平均误差的两到三倍。
而k值,则给出了公式:
z表示的是可信度,b则表示随机选取一次,得到合适模型的概率。
当真实值淹没在噪音中,也可以使用该方法来找出群内点,但是如果b值过低,该方法是无效的。
蓝线为预测的鼻头指向,左图为单纯的dlt方法,右图加入了ransac。显然右图的预测效果要好一些。
另外给出人脸姿态的预测代码:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81 82 83 84 85 86 87 88 89 90 91 | #ifndef CAL_ROT #define CAL_ROT #include <dlfcn.h> #include <stdio.h> #include <stdlib.h> #include <string.h> #include <sys/time.h> #include <iostream> #include <vector> #include "opencv2/core/core.hpp" #include "opencv2/opencv.hpp" #include "opencv2/highgui/highgui.hpp" using namespace std; using namespace cv; #define FOCAL 112 #define PI 3.1415<br> static vector<cv::Point3d> facePoints6p={ cv::Point3d(0,0,0), //nose tip 30 cv::Point3d(0.0,-330.0,65.0), //chin 8 cv::Point3d(-225.0,170.0,135.0), //36 cv::Point3d(225.0,170.0,135.0), //45 cv::Point3d(-150.0,-150.0,125.0), //48 cv::Point3d(150.0,-150.0,125.0), //54 }; cv::Scalar color[3]={cv::Scalar(255,0,0),cv::Scalar(0,255,0),cv::Scalar(0,0,255)}; void calRotate6P(cv::Mat &img,cv::Point2f land_mark[]) { double focal_length = FOCAL; // Approximate focal length. cv::Point2d center = cv::Point2d(FOCAL*0.5,FOCAL*0.5); cv::Mat camera_matrix = (cv::Mat_< double >(3, 3) << focal_length, 0, center.x, 0, focal_length, center.y, 0, 0, 1); cv::Mat dist_coeffs = cv::Mat::zeros(4, 1, cv::DataType< double >::type); // Assuming no lens distortion cv::Mat rotation_vector; // Rotation in axis-angle form cv::Mat translation_vector; vector<Point2d> landmark; landmark.push_back(land_mark[30]); landmark.push_back(land_mark[8]); landmark.push_back(land_mark[36]); landmark.push_back(land_mark[45]); landmark.push_back(land_mark[48]); landmark.push_back(land_mark[54]); // Solve for pose //cv::solvePnP(facePoints, landmark, camera_matrix, dist_coeffs, rotation_vector, translation_vector); cv::solvePnPRansac(facePoints6p, landmark, camera_matrix, dist_coeffs, rotation_vector, translation_vector); //rotation trans vector<Point3d> nose_end_point3d= {cv::Point3d(0,0,-100.0),cv::Point3d(0,100,0),cv::Point3d(100,0,0)}; vector<Point2d> nose_end_point2d; //draw rotate cv::projectPoints(nose_end_point3d,rotation_vector,translation_vector,camera_matrix,dist_coeffs,nose_end_point2d); //cv::circle(img,cv::Point(10,10),1,cv::Scalar(0,0,0),-1); for ( int i=0;i<nose_end_point2d.size();i++) { std::cout<< "point:" <<nose_end_point2d[i].x<< " " <<nose_end_point2d[i].y<<std::endl; cv::line(img,landmark[0],nose_end_point2d[i],color[i]); } //计算欧拉角 double r = norm(rotation_vector, NORM_L2); double w = cos (r / 2); double x = sin (r / 2)*(*rotation_vector.ptr< double >(0, 0))/r; double y = sin (r / 2)*(*rotation_vector.ptr< double >(1, 0))/r; double z = sin (r / 2)*(*rotation_vector.ptr< double >(2, 0))/r; //x double t0 = 2 * (w*x + y * z); double t1 = 1 - 2 * (x*x + y * y); double pitch = atan2 (t0, t1) / PI * 180; //y double t2 = 2 * (w*y - z * x); if (t2 > 1) { t2 = 1; } else if (t2 < -1) { t2 = -1; } double yaw = asin (t2) / PI * 180; //z double t3 = 2 * (w*z + x * y); double t4 = 1 - 2 * (y*y + z * z); double roll = atan2 (t3, t4)/PI*180; cout<<r<< "\txangle:" <<pitch<< "\tyangle:" <<yaw<< "\tzangle" <<roll<<endl; } #endif |
【推荐】国内首个AI IDE,深度理解中文开发场景,立即下载体验Trae
【推荐】编程新体验,更懂你的AI,立即体验豆包MarsCode编程助手
【推荐】抖音旗下AI助手豆包,你的智能百科全书,全免费不限次数
【推荐】轻量又高性能的 SSH 工具 IShell:AI 加持,快人一步
· 全程不用写代码,我用AI程序员写了一个飞机大战
· MongoDB 8.0这个新功能碉堡了,比商业数据库还牛
· 记一次.NET内存居高不下排查解决与启示
· DeepSeek 开源周回顾「GitHub 热点速览」
· 白话解读 Dapr 1.15:你的「微服务管家」又秀新绝活了