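I've still been following denny402's blog to learn caffe. I have a dataset that needs to be converted to lmdb; it is spread across 20 folders and I can't write the conversion program myself, so I decided to try DIGITS, which is supposed to be really easy to use.
I installed it following the blogger's tutorial, but it wouldn't open... In the end a senior labmate got it working for me... I still don't know what the problem was... maybe it was because I had installed it in a shared directory on the server rather than under my own directory...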
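Then I happily went to run mnist, but! At the "create models" step it threw an error and I wanted to cry...
As a total newbie, all I understood was that it meant my driver and CUDA versions didn't match... I had no idea how to fix it...
I searched Baidu for a whole day, and nearly every answer said to check the driver version and upgrade it if it was too old. But all I knew was how to look at it with nvidia-smi... I had no idea how to upgrade, so I solved nothing all of yesterday. (sob)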
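For anyone stuck at the same point, here is a minimal sketch of comparing the installed driver against the minimum a CUDA release needs. The helper name version_ge and the REQUIRED_DRIVER variable are hypothetical names made up for this example, and the required version is a placeholder that has to be looked up in NVIDIA's release notes for the CUDA version in question. The nvidia-smi query only runs if the tool is present.

```shell
# version_ge A B: succeeds if dotted version A >= B (hypothetical helper name).
# Relies on `sort -V` (version sort) from GNU coreutils.
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Placeholder: look up the real minimum for your CUDA release in NVIDIA's release notes.
required_driver="${REQUIRED_DRIVER:-0}"

if command -v nvidia-smi >/dev/null 2>&1; then
    # Take the driver version reported for the first GPU.
    driver="$(nvidia-smi --query-gpu=driver_version --format=csv,noheader | head -n 1)"
    if version_ge "$driver" "$required_driver"; then
        echo "driver $driver is new enough (needs >= $required_driver)"
    else
        echo "driver $driver is too old (needs >= $required_driver)"
    fi
else
    echo "nvidia-smi not found; cannot read the driver version here"
fi
```

The toolkit side can be read with nvcc --version, provided the nvcc you care about is the one on PATH.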
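Today I asked a senior labmate what to do. She said lots of people have installed different CUDA versions on our server, some 7.5 and some 8.0, and that I just needed to set the CUDA version in my bashrc.
So off I went to search for how to open bashrc. The command is:
    gedit ~/.bashrc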
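But my bashrc turned out to be completely empty... I asked her again and she said I had to write it myself... so I wrote the CUDA version and paths into it. (I only found out today that the blog has an insert-code feature. Sob, my IQ is embarrassing.)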
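The exact lines are not shown here. As a rough sketch of what such entries typically look like, assuming the wanted toolkit lives at /usr/local/cuda-7.5 (an assumed path; check where CUDA is actually installed on your server), something like this could go into ~/.bashrc:

```shell
# Assumed install location; adjust to the CUDA version you want to use.
CUDA_HOME=/usr/local/cuda-7.5
export CUDA_HOME

# Put this toolkit's binaries (nvcc etc.) and libraries ahead of any others.
export PATH="$CUDA_HOME/bin${PATH:+:$PATH}"
export LD_LIBRARY_PATH="$CUDA_HOME/lib64${LD_LIBRARY_PATH:+:$LD_LIBRARY_PATH}"
```

The change applies to new terminals, or to the current one after running source ~/.bashrc; which nvcc should then point into the chosen directory.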
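After saving and exiting, I refreshed the configuration (note that ldconfig only refreshes the shared-library cache; the bashrc changes themselves take effect in a new terminal or after running source ~/.bashrc):
    sudo ldconfig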
Then I ran digits again:
    ./digits-devserver --config
Note that this has to be run from the digits root directory~
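Still the same error. My guess was that digits still wasn't using the CUDA version I had specified.
So I went into caffe's Makefile.config and appended the version number -7.5 to the cuda path, then saved and exited.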
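In a stock caffe Makefile.config the toolkit location is set on the CUDA_DIR line. Here is a sketch of the same edit done with sed, run on a throwaway stand-in file so nothing real is touched. The path /usr/local/cuda-7.5 is an assumption. Also keep in mind that a change to Makefile.config normally only takes effect once caffe is rebuilt, and this post does not record whether a rebuild happened.

```shell
# Work on a temporary stand-in for Makefile.config (illustration only).
cfg="$(mktemp)"
printf 'CUDA_DIR := /usr/local/cuda\n' > "$cfg"

# Append the version suffix to the CUDA_DIR line, only if it is still the bare path.
sed 's|^CUDA_DIR := /usr/local/cuda$|CUDA_DIR := /usr/local/cuda-7.5|' "$cfg" > "$cfg.new"
mv "$cfg.new" "$cfg"

# Show the resulting line.
grep '^CUDA_DIR' "$cfg"
```

On a real tree the target would be the Makefile.config inside your own caffe checkout, followed by a rebuild (for example make clean, then make all and make pycaffe).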
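I kept searching for digits tutorials, and one of them mentioned that digits can be started with a custom configuration.
I closed the digits terminal and tried the custom startup, but it complained:
    socket.error: [Errno 98] Address already in use: ('0.0.0.0', 5000)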
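So then I went searching for how to kill a process... The method I ended up with:
    sudo su          # switch to root
    lsof -i:5000     # 5000 is my port number
    kill -9 <pid>    # <pid> is the process ID reported by lsof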
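kill -9 gives the process no chance to clean up, so a gentler variant is to send the default SIGTERM first and fall back to -9 only if the process is still around. A small sketch; stop_pid is a hypothetical helper name, not part of any tool:

```shell
# stop_pid PID: ask the process to exit, escalate to SIGKILL only if needed.
stop_pid() {
    pid=$1
    # SIGTERM first; nothing to do if the process is gone (or is not ours to signal).
    kill "$pid" 2>/dev/null || return 0
    for _ in 1 2 3 4 5; do
        kill -0 "$pid" 2>/dev/null || return 0   # process has exited
        sleep 1
    done
    kill -9 "$pid" 2>/dev/null                   # last resort
}

# Example (not run here): stop whatever is listening on port 5000.
# lsof's -t flag prints bare PIDs, one per line.
#   for p in $(lsof -t -i:5000); do stop_pid "$p"; done
```

If the leftover digits process belongs to your own account, switching to root should not be necessary.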
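The custom startup worked. It asks you to choose at every step; I mostly took the defaults, but for the caffe path I entered the caffe directory under my own account. Then I opened digits again, and a miracle happened!!!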
    bx@HPC:~/digits$ ./digits-devserver --config
    ================================ Jobs Directory ================================
    Where would you like to store job data?

    Suggested values:
    (*) [Previous] /home/bx/digits/digits/jobs
    (D) [default]  /home/bx/digits/digits/jobs
    >> D
    Using "/home/bx/digits/digits/jobs"

    ===================================== GPUs =====================================
    Attached devices:
    Device #0:
        Name                 Tesla K40c
        Compute capability   3.5
        Memory               11.25 GB
        Multiprocessors      15

    Device #1:
        Name                 GeForce GT 730
        Compute capability   3.5
        Memory               1022.52 MB
        Multiprocessors      2


    Input the IDs of the devices you would like to use, separated by commas, in order of preference.

    Suggested values:
    (*) [Previous] 0,1
    (D) [default]  0,1
    (N) [none]     <NONE>
    >> D
    Using "0,1"

    =================================== Log File ===================================
    Where do you want the log files to be stored?

    Suggested values:
    (*) [Previous] /home/bx/digits/D*
    (D) [default]  /home/bx/digits/digits/digits.log
    (N) [none]     <NONE>
    >> D
    Using "/home/bx/digits/digits/digits.log"

    =============================== Data extensions ================================
    Available extensions:
    ID='image-gradients' Title='Gradients'
    ID='image-object-detection' Title='Object Detection'

    Input the IDs of the extensions you would like to use, separated by commas.

    Suggested values:
    (*) [Previous] image-gradients,image-object-detection
    (D) [default]  image-object-detection
    (A) [all]      image-gradients,image-object-detection
    (N) [none]     <NONE>
    >> A
    Using "image-gradients,image-object-detection"

    =============================== View extensions ================================
    Available extensions:
    ID='image-bounding-boxes' Title='Bounding boxes'
    ID='image-gradients' Title='Gradients'
    ID='all-raw-data' Title='Raw Data'

    Input the IDs of the extensions you would like to use, separated by commas.

    Suggested values:
    (*) [Previous] image-bounding-boxes,image-gradients,all-raw-data
    (D) [default]  image-bounding-boxes,all-raw-data
    (A) [all]      image-bounding-boxes,image-gradients,all-raw-data
    (N) [none]     <NONE>
    >> A
    Using "image-bounding-boxes,image-gradients,all-raw-data"

    ==================================== Caffe =====================================
    Where is caffe installed?

    Suggested values:
    (*) [Previous]        /home/bx/caffe
    (P) [PATH/PYTHONPATH] <PATHS>
    >> /home/bx/caffe
    Using "/home/bx/caffe"

    ==================================== Torch =====================================
    Where is torch installed?

    Suggested values:
    (*) [Previous]       <PATHS>
    (P) [PATH/TORCHPATH] <PATHS>
    (N) [none]           <NONE>
    >> P
    Using "<PATHS>"

    Saved config to /home/bx/digits/digits/digits.cfg
    Couldn't import dot_parser, loading of dot files will not be possible.
    2017-03-30 11:32:56 [INFO ] Loaded 2 jobs.
      ___ ___ ___ ___ _____ ___
     |   \_ _/ __|_ _|_   _/ __|
     | |) | | (_ || |  | | \__ \
     |___/___\___|___| |_| |___/ 4.0.0-rc.1.dev

     * Running on http://0.0.0.0:5000/
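mnist runs now! Bug solved! My guess is that it worked because I changed caffe's Makefile.config and then told digits to use that caffe path; before, it may have been linking to a caffe under some other account on the server~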
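That guess can be checked directly: ldd lists the shared libraries a binary resolves to, so it shows which CUDA runtime a given caffe build actually picks up. A sketch, where cuda_libs_of and CAFFE_BIN are hypothetical names made up for this example and the default binary path is assumed from the usual caffe build layout:

```shell
# cuda_libs_of FILE: print the CUDA-related libraries FILE links against.
cuda_libs_of() {
    [ -f "$1" ] || { echo "no such file: $1" >&2; return 1; }
    ldd "$1" | grep -i -E 'libcu(dart|blas|rand|dnn)'
}

# Assumed location of the caffe tool binary under the path given to digits.
caffe_bin="${CAFFE_BIN:-$HOME/caffe/build/tools/caffe}"

cuda_libs_of "$caffe_bin" || echo "no CUDA libraries found for $caffe_bin"
```

If the paths printed point into a cuda-8.0 directory while you expected 7.5 (or the other way round), the build is picking up a different toolkit than intended. The same check can be run on python/caffe/_caffe.so, which is what the Python bindings load.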
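I can finally use digits, so happy~
I'll keep writing down whatever lessons I learn from here on, so that anyone in the same situation can avoid these pitfalls~~~
Here is a screenshot of the mnist training process: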