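Machine Learning Datasets
The dataset I use most often is the Iris dataset; worked examples of the classic machine learning algorithms typically use it.
Each record holds four numeric measurements followed by the class name. The contents are as follows: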
5.1,3.5,1.4,0.2,Iris-setosa
4.9,3.0,1.4,0.2,Iris-setosa
4.7,3.2,1.3,0.2,Iris-setosa
4.6,3.1,1.5,0.2,Iris-setosa
5.0,3.6,1.4,0.2,Iris-setosa
5.4,3.9,1.7,0.4,Iris-setosa
4.6,3.4,1.4,0.3,Iris-setosa
5.0,3.4,1.5,0.2,Iris-setosa
4.4,2.9,1.4,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
5.4,3.7,1.5,0.2,Iris-setosa
4.8,3.4,1.6,0.2,Iris-setosa
4.8,3.0,1.4,0.1,Iris-setosa
4.3,3.0,1.1,0.1,Iris-setosa
5.8,4.0,1.2,0.2,Iris-setosa
5.7,4.4,1.5,0.4,Iris-setosa
5.4,3.9,1.3,0.4,Iris-setosa
5.1,3.5,1.4,0.3,Iris-setosa
5.7,3.8,1.7,0.3,Iris-setosa
5.1,3.8,1.5,0.3,Iris-setosa
5.4,3.4,1.7,0.2,Iris-setosa
5.1,3.7,1.5,0.4,Iris-setosa
4.6,3.6,1.0,0.2,Iris-setosa
5.1,3.3,1.7,0.5,Iris-setosa
4.8,3.4,1.9,0.2,Iris-setosa
5.0,3.0,1.6,0.2,Iris-setosa
5.0,3.4,1.6,0.4,Iris-setosa
5.2,3.5,1.5,0.2,Iris-setosa
5.2,3.4,1.4,0.2,Iris-setosa
4.7,3.2,1.6,0.2,Iris-setosa
4.8,3.1,1.6,0.2,Iris-setosa
5.4,3.4,1.5,0.4,Iris-setosa
5.2,4.1,1.5,0.1,Iris-setosa
5.5,4.2,1.4,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
5.0,3.2,1.2,0.2,Iris-setosa
5.5,3.5,1.3,0.2,Iris-setosa
4.9,3.1,1.5,0.1,Iris-setosa
4.4,3.0,1.3,0.2,Iris-setosa
5.1,3.4,1.5,0.2,Iris-setosa
5.0,3.5,1.3,0.3,Iris-setosa
4.5,2.3,1.3,0.3,Iris-setosa
4.4,3.2,1.3,0.2,Iris-setosa
5.0,3.5,1.6,0.6,Iris-setosa
5.1,3.8,1.9,0.4,Iris-setosa
4.8,3.0,1.4,0.3,Iris-setosa
5.1,3.8,1.6,0.2,Iris-setosa
4.6,3.2,1.4,0.2,Iris-setosa
5.3,3.7,1.5,0.2,Iris-setosa
5.0,3.3,1.4,0.2,Iris-setosa
7.0,3.2,4.7,1.4,Iris-versicolor
6.4,3.2,4.5,1.5,Iris-versicolor
6.9,3.1,4.9,1.5,Iris-versicolor
5.5,2.3,4.0,1.3,Iris-versicolor
6.5,2.8,4.6,1.5,Iris-versicolor
5.7,2.8,4.5,1.3,Iris-versicolor
6.3,3.3,4.7,1.6,Iris-versicolor
4.9,2.4,3.3,1.0,Iris-versicolor
6.6,2.9,4.6,1.3,Iris-versicolor
5.2,2.7,3.9,1.4,Iris-versicolor
5.0,2.0,3.5,1.0,Iris-versicolor
5.9,3.0,4.2,1.5,Iris-versicolor
6.0,2.2,4.0,1.0,Iris-versicolor
6.1,2.9,4.7,1.4,Iris-versicolor
5.6,2.9,3.6,1.3,Iris-versicolor
6.7,3.1,4.4,1.4,Iris-versicolor
5.6,3.0,4.5,1.5,Iris-versicolor
5.8,2.7,4.1,1.0,Iris-versicolor
6.2,2.2,4.5,1.5,Iris-versicolor
5.6,2.5,3.9,1.1,Iris-versicolor
5.9,3.2,4.8,1.8,Iris-versicolor
6.1,2.8,4.0,1.3,Iris-versicolor
6.3,2.5,4.9,1.5,Iris-versicolor
6.1,2.8,4.7,1.2,Iris-versicolor
6.4,2.9,4.3,1.3,Iris-versicolor
6.6,3.0,4.4,1.4,Iris-versicolor
6.8,2.8,4.8,1.4,Iris-versicolor
6.7,3.0,5.0,1.7,Iris-versicolor
6.0,2.9,4.5,1.5,Iris-versicolor
5.7,2.6,3.5,1.0,Iris-versicolor
5.5,2.4,3.8,1.1,Iris-versicolor
5.5,2.4,3.7,1.0,Iris-versicolor
5.8,2.7,3.9,1.2,Iris-versicolor
6.0,2.7,5.1,1.6,Iris-versicolor
5.4,3.0,4.5,1.5,Iris-versicolor
6.0,3.4,4.5,1.6,Iris-versicolor
6.7,3.1,4.7,1.5,Iris-versicolor
6.3,2.3,4.4,1.3,Iris-versicolor
5.6,3.0,4.1,1.3,Iris-versicolor
5.5,2.5,4.0,1.3,Iris-versicolor
5.5,2.6,4.4,1.2,Iris-versicolor
6.1,3.0,4.6,1.4,Iris-versicolor
5.8,2.6,4.0,1.2,Iris-versicolor
5.0,2.3,3.3,1.0,Iris-versicolor
5.6,2.7,4.2,1.3,Iris-versicolor
5.7,3.0,4.2,1.2,Iris-versicolor
5.7,2.9,4.2,1.3,Iris-versicolor
6.2,2.9,4.3,1.3,Iris-versicolor
5.1,2.5,3.0,1.1,Iris-versicolor
5.7,2.8,4.1,1.3,Iris-versicolor
6.3,3.3,6.0,2.5,Iris-virginica
5.8,2.7,5.1,1.9,Iris-virginica
7.1,3.0,5.9,2.1,Iris-virginica
6.3,2.9,5.6,1.8,Iris-virginica
6.5,3.0,5.8,2.2,Iris-virginica
7.6,3.0,6.6,2.1,Iris-virginica
4.9,2.5,4.5,1.7,Iris-virginica
7.3,2.9,6.3,1.8,Iris-virginica
6.7,2.5,5.8,1.8,Iris-virginica
7.2,3.6,6.1,2.5,Iris-virginica
6.5,3.2,5.1,2.0,Iris-virginica
6.4,2.7,5.3,1.9,Iris-virginica
6.8,3.0,5.5,2.1,Iris-virginica
5.7,2.5,5.0,2.0,Iris-virginica
5.8,2.8,5.1,2.4,Iris-virginica
6.4,3.2,5.3,2.3,Iris-virginica
6.5,3.0,5.5,1.8,Iris-virginica
7.7,3.8,6.7,2.2,Iris-virginica
7.7,2.6,6.9,2.3,Iris-virginica
6.0,2.2,5.0,1.5,Iris-virginica
6.9,3.2,5.7,2.3,Iris-virginica
5.6,2.8,4.9,2.0,Iris-virginica
7.7,2.8,6.7,2.0,Iris-virginica
6.3,2.7,4.9,1.8,Iris-virginica
6.7,3.3,5.7,2.1,Iris-virginica
7.2,3.2,6.0,1.8,Iris-virginica
6.2,2.8,4.8,1.8,Iris-virginica
6.1,3.0,4.9,1.8,Iris-virginica
6.4,2.8,5.6,2.1,Iris-virginica
7.2,3.0,5.8,1.6,Iris-virginica
7.4,2.8,6.1,1.9,Iris-virginica
7.9,3.8,6.4,2.0,Iris-virginica
6.4,2.8,5.6,2.2,Iris-virginica
6.3,2.8,5.1,1.5,Iris-virginica
6.1,2.6,5.6,1.4,Iris-virginica
7.7,3.0,6.1,2.3,Iris-virginica
6.3,3.4,5.6,2.4,Iris-virginica
6.4,3.1,5.5,1.8,Iris-virginica
6.0,3.0,4.8,1.8,Iris-virginica
6.9,3.1,5.4,2.1,Iris-virginica
6.7,3.1,5.6,2.4,Iris-virginica
6.9,3.1,5.1,2.3,Iris-virginica
5.8,2.7,5.1,1.9,Iris-virginica
6.8,3.2,5.9,2.3,Iris-virginica
6.7,3.3,5.7,2.5,Iris-virginica
6.7,3.0,5.2,2.3,Iris-virginica
6.3,2.5,5.0,1.9,Iris-virginica
6.5,3.0,5.2,2.0,Iris-virginica
6.2,3.4,5.4,2.3,Iris-virginica
5.9,3.0,5.1,1.8,Iris-virginica
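Replace the commas with spaces, and replace the class names with 0, 1 and 2 (Iris-setosa becomes 0, Iris-versicolor becomes 1, Iris-virginica becomes 2):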
5.1 3.5 1.4 0.2 0
4.9 3.0 1.4 0.2 0
4.7 3.2 1.3 0.2 0
4.6 3.1 1.5 0.2 0
5.0 3.6 1.4 0.2 0
5.4 3.9 1.7 0.4 0
4.6 3.4 1.4 0.3 0
5.0 3.4 1.5 0.2 0
4.4 2.9 1.4 0.2 0
4.9 3.1 1.5 0.1 0
5.4 3.7 1.5 0.2 0
4.8 3.4 1.6 0.2 0
4.8 3.0 1.4 0.1 0
4.3 3.0 1.1 0.1 0
5.8 4.0 1.2 0.2 0
5.7 4.4 1.5 0.4 0
5.4 3.9 1.3 0.4 0
5.1 3.5 1.4 0.3 0
5.7 3.8 1.7 0.3 0
5.1 3.8 1.5 0.3 0
5.4 3.4 1.7 0.2 0
5.1 3.7 1.5 0.4 0
4.6 3.6 1.0 0.2 0
5.1 3.3 1.7 0.5 0
4.8 3.4 1.9 0.2 0
5.0 3.0 1.6 0.2 0
5.0 3.4 1.6 0.4 0
5.2 3.5 1.5 0.2 0
5.2 3.4 1.4 0.2 0
4.7 3.2 1.6 0.2 0
4.8 3.1 1.6 0.2 0
5.4 3.4 1.5 0.4 0
5.2 4.1 1.5 0.1 0
5.5 4.2 1.4 0.2 0
4.9 3.1 1.5 0.1 0
5.0 3.2 1.2 0.2 0
5.5 3.5 1.3 0.2 0
4.9 3.1 1.5 0.1 0
4.4 3.0 1.3 0.2 0
5.1 3.4 1.5 0.2 0
5.0 3.5 1.3 0.3 0
4.5 2.3 1.3 0.3 0
4.4 3.2 1.3 0.2 0
5.0 3.5 1.6 0.6 0
5.1 3.8 1.9 0.4 0
4.8 3.0 1.4 0.3 0
5.1 3.8 1.6 0.2 0
4.6 3.2 1.4 0.2 0
5.3 3.7 1.5 0.2 0
5.0 3.3 1.4 0.2 0
7.0 3.2 4.7 1.4 1
6.4 3.2 4.5 1.5 1
6.9 3.1 4.9 1.5 1
5.5 2.3 4.0 1.3 1
6.5 2.8 4.6 1.5 1
5.7 2.8 4.5 1.3 1
6.3 3.3 4.7 1.6 1
4.9 2.4 3.3 1.0 1
6.6 2.9 4.6 1.3 1
5.2 2.7 3.9 1.4 1
5.0 2.0 3.5 1.0 1
5.9 3.0 4.2 1.5 1
6.0 2.2 4.0 1.0 1
6.1 2.9 4.7 1.4 1
5.6 2.9 3.6 1.3 1
6.7 3.1 4.4 1.4 1
5.6 3.0 4.5 1.5 1
5.8 2.7 4.1 1.0 1
6.2 2.2 4.5 1.5 1
5.6 2.5 3.9 1.1 1
5.9 3.2 4.8 1.8 1
6.1 2.8 4.0 1.3 1
6.3 2.5 4.9 1.5 1
6.1 2.8 4.7 1.2 1
6.4 2.9 4.3 1.3 1
6.6 3.0 4.4 1.4 1
6.8 2.8 4.8 1.4 1
6.7 3.0 5.0 1.7 1
6.0 2.9 4.5 1.5 1
5.7 2.6 3.5 1.0 1
5.5 2.4 3.8 1.1 1
5.5 2.4 3.7 1.0 1
5.8 2.7 3.9 1.2 1
6.0 2.7 5.1 1.6 1
5.4 3.0 4.5 1.5 1
6.0 3.4 4.5 1.6 1
6.7 3.1 4.7 1.5 1
6.3 2.3 4.4 1.3 1
5.6 3.0 4.1 1.3 1
5.5 2.5 4.0 1.3 1
5.5 2.6 4.4 1.2 1
6.1 3.0 4.6 1.4 1
5.8 2.6 4.0 1.2 1
5.0 2.3 3.3 1.0 1
5.6 2.7 4.2 1.3 1
5.7 3.0 4.2 1.2 1
5.7 2.9 4.2 1.3 1
6.2 2.9 4.3 1.3 1
5.1 2.5 3.0 1.1 1
5.7 2.8 4.1 1.3 1
6.3 3.3 6.0 2.5 2
5.8 2.7 5.1 1.9 2
7.1 3.0 5.9 2.1 2
6.3 2.9 5.6 1.8 2
6.5 3.0 5.8 2.2 2
7.6 3.0 6.6 2.1 2
4.9 2.5 4.5 1.7 2
7.3 2.9 6.3 1.8 2
6.7 2.5 5.8 1.8 2
7.2 3.6 6.1 2.5 2
6.5 3.2 5.1 2.0 2
6.4 2.7 5.3 1.9 2
6.8 3.0 5.5 2.1 2
5.7 2.5 5.0 2.0 2
5.8 2.8 5.1 2.4 2
6.4 3.2 5.3 2.3 2
6.5 3.0 5.5 1.8 2
7.7 3.8 6.7 2.2 2
7.7 2.6 6.9 2.3 2
6.0 2.2 5.0 1.5 2
6.9 3.2 5.7 2.3 2
5.6 2.8 4.9 2.0 2
7.7 2.8 6.7 2.0 2
6.3 2.7 4.9 1.8 2
6.7 3.3 5.7 2.1 2
7.2 3.2 6.0 1.8 2
6.2 2.8 4.8 1.8 2
6.1 3.0 4.9 1.8 2
6.4 2.8 5.6 2.1 2
7.2 3.0 5.8 1.6 2
7.4 2.8 6.1 1.9 2
7.9 3.8 6.4 2.0 2
6.4 2.8 5.6 2.2 2
6.3 2.8 5.1 1.5 2
6.1 2.6 5.6 1.4 2
7.7 3.0 6.1 2.3 2
6.3 3.4 5.6 2.4 2
6.4 3.1 5.5 1.8 2
6.0 3.0 4.8 1.8 2
6.9 3.1 5.4 2.1 2
6.7 3.1 5.6 2.4 2
6.9 3.1 5.1 2.3 2
5.8 2.7 5.1 1.9 2
6.8 3.2 5.9 2.3 2
6.7 3.3 5.7 2.5 2
6.7 3.0 5.2 2.3 2
6.3 2.5 5.0 1.9 2
6.5 3.0 5.2 2.0 2
6.2 3.4 5.4 2.3 2
5.9 3.0 5.1 1.8 2
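The replacement step described above can also be scripted instead of done by hand. The following is only a minimal sketch, assuming the raw file uses the comma-separated layout shown earlier with one record per line. The names `LABELS`, `convert_line` and `convert_text` are illustrative and not part of the original post.

```python
# Hypothetical mapping, inferred from the order in which the classes appear.
LABELS = {"Iris-setosa": 0, "Iris-versicolor": 1, "Iris-virginica": 2}


def convert_line(line):
    """Turn '5.1,3.5,1.4,0.2,Iris-setosa' into '5.1 3.5 1.4 0.2 0'."""
    fields = line.strip().split(",")
    features, name = fields[:-1], fields[-1]
    # Keep the measurements as text so the values are not reformatted.
    return " ".join(features + [str(LABELS[name])])


def convert_text(text):
    """Convert every non-empty line of the raw data and return the new text."""
    rows = [convert_line(line) for line in text.splitlines() if line.strip()]
    return "\n".join(rows) + "\n"


# Small usage example on two records taken from the data above.
sample = "5.1,3.5,1.4,0.2,Iris-setosa\n7.0,3.2,4.7,1.4,Iris-versicolor\n"
print(convert_text(sample), end="")
```

To convert a whole file, one could read the raw text, pass it through `convert_text`, and write the result to `iris.txt`, the file name used by the programs in this post. An unknown class name raises a `KeyError`, which is probably preferable to silently writing a wrong label.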
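The C++ program that reads the file and splits it is:

    // Older Visual Studio projects with precompiled headers may also need
    // #include "StdAfx.h" as the first include.
    #include <cstdio>
    #include <fstream>
    #include <iostream>
    #include <vector>
    using namespace std;

    // Print a vector
    void vec_print(const vector<float>& a) {
        for (size_t i = 0; i < a.size(); i++) {
            printf("%f ", a.at(i));
        }
    }

    // Print a 2-D array with 5 columns, one row per output line
    void arr2_print(float (*a)[5], int m) {
        for (int i = 0; i < m; i++) {
            for (int j = 0; j < 5; j++) {
                cout << a[i][j] << " ";
            }
            cout << endl;
        }
    }

    int main() {
        ifstream fin("iris.txt");
        if (!fin) {
            cerr << "cannot open iris.txt" << endl;
            return 1;
        }

        // Test the extraction itself; looping on !fin.eof() would push
        // one extra value after the last successful read.
        vector<float> vec;
        float idata;
        while (fin >> idata) {
            vec.push_back(idata);
        }
        vec_print(vec);

        // Split the data into trainVec and valVec, two flat vectors.
        // Each class holds 50 rows * 5 columns = 250 values; the first
        // 40 rows * 5 columns = 200 values of each class go to training.
        vector<float> trainVec;
        vector<float> valVec;
        for (size_t i = 0; i < vec.size(); i++) {
            if (i < 200 || (i >= 250 && i < 450) || (i >= 500 && i < 700)) {
                trainVec.push_back(vec.at(i));
            } else {
                valVec.push_back(vec.at(i));
            }
        }

        // Reshape the flat vectors into the 2-D arrays trainArr and valArr
        float trainArr[120][5];
        float valArr[30][5];
        for (size_t i = 0; i < trainVec.size() / 5; i++) {
            for (int j = 0; j < 5; j++) {
                trainArr[i][j] = trainVec.at(5 * i + j);
            }
        }
        for (size_t i = 0; i < valVec.size() / 5; i++) {
            for (int j = 0; j < 5; j++) {
                valArr[i][j] = valVec.at(5 * i + j);
            }
        }

        arr2_print(trainArr, 120);
        getchar();  // keep the console window open
        return 0;
    }

The Python version is much simpler. To borrow a phrase from Zhihu, this is how to write code elegantly: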
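    import numpy as np

    # Load the space-separated file (one record per line) into a 2-D array
    a = np.loadtxt('iris.txt')
    # First 40 rows of each class for training, the remaining 10 for testing
    train_data = np.concatenate((a[0:40], a[50:90], a[100:140]), axis=0)
    test_data = np.concatenate((a[40:50], a[90:100], a[140:150]), axis=0)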
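The slice boundaries above are hard-coded for three classes of 50 rows each. As a rough sketch, the same per-class split can be written as a loop over class blocks. The function name `split_by_class` and its parameters `per_class` and `n_train` are illustrative assumptions, and the example uses plain Python lists so that it runs without NumPy.

```python
def split_by_class(rows, per_class=50, n_train=40):
    """Split rows that are grouped by class into (train, test) lists.

    Assumes the rows are ordered class by class with `per_class` rows each;
    the first `n_train` rows of every class go to the training set.
    """
    train, test = [], []
    for start in range(0, len(rows), per_class):
        block = rows[start:start + per_class]
        train.extend(block[:n_train])
        test.extend(block[n_train:])
    return train, test


# Usage with dummy rows of the form [row index, class label].
rows = [[i, i // 50] for i in range(150)]
train, test = split_by_class(rows)
print(len(train), len(test))
```

With the default arguments this selects the same rows as the `np.concatenate` calls: rows 0-39, 50-89 and 100-139 for training and the rest for testing. Passing the array loaded by `np.loadtxt` should also work, in which case the returned lists can be wrapped with `np.array` to get 2-D arrays again.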
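[2] Download link
[3] Reading a file into an array (the errors in it have been corrected in the code above)
[4] Introduction to NumPy array concatenation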