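Installing the MXNet deep learning framework for Julia on Ubuntu (built with CUDA and OpenCV support)
Environment and caveats
After repeated testing, I found that the Julia binding for MXNet cannot be installed simply by running Pkg.add("MXNet").
Doing so fails with this error: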
error building `mxnet`: │
[ info: found nvcc: /usr/local/cuda-10.1/bin/nvcc │
ERROR: MethodError: no method matching replace(::String, ::Pair{String,String}, ::Pair{String,String})
...
Looking at the source, the problem occurs during the download-and-build step, in the replace call in ~/.julia/packages/MXNet/XoVcx/deps/build.jl (the default package install location):
if Sys.isunix()
    nvcc_path = Sys.which("nvcc")
    if nvcc_path ≢ nothing
        @info "Found nvcc: $nvcc_path"
        push!(CUDAPATHS, replace(nvcc_path, "bin/nvcc", "lib64"))
    end
end
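The replace call here uses a syntax that is no longer valid. (The same problem appears in some versions of the official installation tutorial. Those tutorials precompile against the already built shared library and then install with Pkg, but they hit the same error.) In a version that installs correctly, this code should read:
if Sys.isunix()
    nvcc_path = Sys.which("nvcc")
    if nvcc_path ≢ nothing
        @info "Found nvcc: $nvcc_path"
        push!(CUDAPATHS, replace(nvcc_path, "bin/nvcc" => "lib64"))
    end
end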
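As a result, installing with Pkg fails at the build step, and the only option is to build manually.
My Julia installation directory is ~/julia/julia-1.5.3.
Downloading the source
Download the MXNet 1.6 source package from the official website.
The installation mainly follows the official documentation.
Although that documentation targets MXNet 1.7, building MXNet 1.7 itself did not succeed for me, possibly because of the CUDA version.
Installing dependencies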
$ sudo apt-get update
$ sudo apt-get install -y build-essential git ninja-build ccache libopenblas-dev libopencv-dev cmake
libopencv-dev is optional; it is only needed for OpenCV support.
Building
- Extract and rename
$ tar -xzvf apache-mxnet-src-1.6.0-incubating.tar.gz
$ mv apache-mxnet-src-1.6.0-incubating incubatormxnet
- Move the extracted folder and enter it
$ mv incubatormxnet ~/julia/incubatormxnet
$ cd ~/julia/incubatormxnet
- Create the build folder
$ rm -rf build
$ mkdir build && cd build
If you do not need GPU support or OpenCV support, set USE_CUDA and USE_OPENCV to OFF (both default to ON) in the build configuration file (~/julia/incubatormxnet/CMakeLists.txt).
# default is ON
option(USE_CUDA "Build with CUDA support" ON)
...
option(USE_OPENCV "Build with OpenCV support" ON)
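As an alternative to editing CMakeLists.txt, CMake allows option() defaults to be overridden on the command line with -D. Here is a minimal sketch, assuming the two option names shown above; the helper name mxnet_cmake_args is made up for illustration and is not part of MXNet.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the -D flags for a build with or without
# CUDA / OpenCV. Arguments are ON or OFF; both default to ON, matching
# the option() defaults in CMakeLists.txt.
mxnet_cmake_args() {
    local use_cuda="${1:-ON}"
    local use_opencv="${2:-ON}"
    printf -- '-DUSE_CUDA=%s -DUSE_OPENCV=%s\n' "$use_cuda" "$use_opencv"
}

# Show the configure command for a CPU-only build without OpenCV.
# The printed command is meant to be run from inside the build folder.
echo "cmake $(mxnet_cmake_args OFF OFF) .."
```

This keeps the source tree untouched, which is convenient if the build folder gets deleted and recreated several times.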
- Configure with cmake
$ cmake ..
NOTE: cmake may fail with "CMake 3.13 or higher is required. You are running version 3.10.2". In that case, upgrade CMake:
$ pip3 install --user --upgrade "cmake>=3.13.2"
If pip is not installed, install it with:
$ sudo apt-get install -y python3-pip
After the upgrade, run cmake again:
$ cmake ..
# or run the pip-installed copy directly
$ ~/.local/bin/cmake ..
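Since the system cmake and the pip-installed one can coexist, it can help to check which one is new enough before configuring. The sketch below assumes GNU sort (for its -V version sort) and the usual "cmake version X.Y.Z" first line of cmake --version; the helper name version_ge is my own.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if version $1 is at least version $2.
# Relies on GNU sort's -V (version sort), which Ubuntu ships.
version_ge() {
    [ "$(printf '%s\n%s\n' "$2" "$1" | sort -V | head -n 1)" = "$2" ]
}

# Report the first cmake (on PATH, then in ~/.local/bin) that is new enough.
for candidate in cmake "$HOME/.local/bin/cmake"; do
    if command -v "$candidate" >/dev/null 2>&1; then
        # The first line of "cmake --version" is "cmake version X.Y.Z".
        found="$("$candidate" --version | head -n 1 | awk '{print $3}')"
        if version_ge "$found" 3.13; then
            echo "using $candidate ($found)"
            break
        fi
    fi
done
```

If nothing is printed, neither location has a cmake of version 3.13 or later.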
- Run make
# raise the job count to match your number of cores to speed up the build
$ make -j8
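Rather than hard-coding -j8, the job count can be taken from the machine. A small sketch, assuming nproc from GNU coreutils is available (it falls back to a single job otherwise):

```shell
#!/usr/bin/env bash
# nproc prints the number of processing units available to this process;
# fall back to 1 if the command is missing.
jobs_count="$(nproc 2>/dev/null || echo 1)"

# Print the make invocation; run it from inside the build folder.
echo "make -j${jobs_count}"
```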
NOTE: the following error may occur during the build:
FAILED: CMakeFiles/cuda_compile_1.dir/src/operator/contrib/cuda_compile_1_generated_bounding_box.cu.o
cd ~/julia/incubatormxnet/build/CMakeFiles/cuda_compile_1.dir/src/operator/contrib && /usr/local/bin/cmake -E make_directory ~/julia/incubatormxnet/build/CMakeFiles/cuda_compile_1.dir/src/operator/contrib/. && /usr/local/bin/cmake -D verbose:BOOL=OFF -D build_configuration:STRING=Debug -D generated_file:STRING=~/julia/incubatormxnet/build/CMakeFiles/cuda_compile_1.dir/src/operator/contrib/./cuda_compile_1_generated_bounding_box.cu.o -D generated_cubin_file:STRING=~/julia/incubatormxnet/build/CMakeFiles/cuda_compile_1.dir/src/operator/contrib/./cuda_compile_1_generated_bounding_box.cu.o.cubin.txt -P ~/julia/incubatormxnet/build/CMakeFiles/cuda_compile_1.dir/src/operator/contrib/cuda_compile_1_generated_bounding_box.cu.o.Debug.cmake
~/julia/incubatormxnet/include/dmlc/./thread_local.h: In instantiation of ‘static T* dmlc::ThreadLocalStore<T>::Get() [with T = std::unordered_set<std::__cxx11::basic_string<char> >]’:
~/julia/incubatormxnet/src/operator/contrib/./../../common/utils.h:461:28: required from here
~/julia/incubatormxnet/include/dmlc/./thread_local.h:46:15: error: cannot call member function ‘void dmlc::ThreadLocalStore<T>::RegisterDelete(T*) [with T = std::unordered_set<std::__cxx11::basic_string<char> >]’ without object
Singleton()->RegisterDelete(ptr);
~~~~~~~~^~~~~
CMake Error at cuda_compile_1_generated_bounding_box.cu.o.Debug.cmake:279 (message):
Error generating file
~/julia/incubatormxnet/build/CMakeFiles/cuda_compile_1.dir/src/operator/contrib/./cuda_compile_1_generated_bounding_box.cu.o
In that case a source file needs to be modified as well, namely ~/julia/incubatormxnet/include/dmlc/thread_local.h.
Change
Singleton()->RegisterDelete(ptr);
to
(*Singleton()).RegisterDelete(ptr);
After making the change, recreate the build folder and build again.
$ cd ..
$ rm -rf build
$ mkdir build && cd build
...
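The one-line change to thread_local.h described above can also be applied with sed instead of an editor. This is only a sketch: the helper name patch_thread_local is made up, it assumes GNU sed (for -i with a backup suffix), and it is demonstrated on a scratch file rather than the real header.

```shell
#!/usr/bin/env bash
# Hypothetical helper: rewrite the failing call in the given file in place.
# The untouched original is kept next to it with a .bak suffix.
patch_thread_local() {
    sed -i.bak \
        's/Singleton()->RegisterDelete(ptr);/(*Singleton()).RegisterDelete(ptr);/' \
        "$1"
}

# Demonstrate on a scratch copy of the offending line.
scratch="$(mktemp)"
echo '    Singleton()->RegisterDelete(ptr);' > "$scratch"
patch_thread_local "$scratch"
cat "$scratch"
rm -f "$scratch" "$scratch.bak"
```

For the real build, the function would be pointed at ~/julia/incubatormxnet/include/dmlc/thread_local.h before recreating the build folder.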
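- Copy the build output to a folder where MXNet can locate it
The libmxnet installation path should be the root directory of libmxnet. In other words, the compiled libmxnet.so should be found under $MXNET_HOME/lib. If the root directory of libmxnet is ~/julia/incubatormxnet, run the following commands: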
$ cd ~/julia/incubatormxnet
$ cp -r build lib
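Before moving on, it is worth confirming that the library really ended up where the Julia package will look for it. A minimal sketch, assuming the directory layout used in this post; has_libmxnet is a made-up helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: succeed if libmxnet.so sits directly under <root>/lib.
has_libmxnet() {
    [ -f "$1/lib/libmxnet.so" ]
}

# Use MXNET_HOME if it is already set, otherwise the path from this post.
root="${MXNET_HOME:-$HOME/julia/incubatormxnet}"
if has_libmxnet "$root"; then
    echo "found $root/lib/libmxnet.so"
else
    echo "libmxnet.so not found under $root/lib" >&2
fi
```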
Environment configuration
Once the build is complete, edit the shell configuration file.
# create the depot directory
$ mkdir ~/julia/julia-1.5.3/julia-depot
$ vim ~/.bashrc
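NOTE: a separate depot directory is created here because using the default location ~/.julia may lead to a Permission denied error.
Append the following lines to the end of ~/.bashrc: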
export MXNET_HOME="$HOME/julia/incubatormxnet"
export LD_LIBRARY_PATH="$HOME/julia/incubatormxnet/lib:$LD_LIBRARY_PATH"
export JULIA_DEPOT_PATH="$HOME/julia/julia-1.5.3/julia-depot"
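If this setup is repeated, the same export lines can pile up in ~/.bashrc. One way around that is to append a line only when it is not already present. The sketch below uses a made-up helper, append_once, and runs against a scratch file so that the real ~/.bashrc is not touched.

```shell
#!/usr/bin/env bash
# Hypothetical helper: append a line to a file only if that exact line
# is not already in it.
append_once() {
    local line="$1" file="$2"
    touch "$file"
    grep -qxF -- "$line" "$file" || printf '%s\n' "$line" >> "$file"
}

# Demonstrate on a scratch file: the second call adds nothing.
rc="$(mktemp)"
append_once 'export JULIA_DEPOT_PATH="$HOME/julia/julia-1.5.3/julia-depot"' "$rc"
append_once 'export JULIA_DEPOT_PATH="$HOME/julia/julia-1.5.3/julia-depot"' "$rc"
cat "$rc"
rm -f "$rc"
```

Remember to run source ~/.bashrc (or open a new terminal) afterwards so the variables take effect.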
Use a symbolic link to add julia to the system executable path:
$ sudo ln -s ~/julia/julia-1.5.3/julia/bin/julia /usr/bin/julia
Installing MXNet
$ julia --color=yes --project=./ -e \
'using Pkg; \
Pkg.develop(PackageSpec(name="MXNet", path = joinpath(ENV["MXNET_HOME"], "julia")))'
So that the package is also registered in the depot's default environment, run the same command again from ~/julia/julia-1.5.3/julia-depot/environments/v1.5:
$ cd ~/julia/julia-1.5.3/julia-depot/environments/v1.5
$ julia --color=yes --project=./ -e \
'using Pkg; \
Pkg.develop(PackageSpec(name="MXNet", path = joinpath(ENV["MXNET_HOME"], "julia")))'
Now ~/julia/julia-1.5.3/julia-depot/environments/v1.5/Manifest.toml should show that MXNet is located correctly:
[[MXNet]]
deps = ["BinDeps", "Formatting", "JSON", "Libdl", "LinearAlgebra", "MacroTools", "Markdown", "Printf", "Random", "Reexport", "Statistics"]
path = "/home/brainiac/julia/incubatormxnet/julia"
uuid = "a7949054-b901-59c6-b8e3-7238c29bf7f0"
version = "1.6.0"
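Instead of reading Manifest.toml by eye, the recorded path can be pulled out with awk. This is a sketch that assumes the [[Name]] section layout shown above; manifest_mxnet_path is a made-up helper, and the [[Other]] entry in the demo is invented purely to show that other sections are skipped.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the path recorded for MXNet in a Manifest.toml
# that uses the [[Name]] section layout.
manifest_mxnet_path() {
    awk -F'"' '
        /^\[\[/                { in_mxnet = ($0 == "[[MXNet]]") }
        in_mxnet && /^path = / { print $2 }
    ' "$1"
}

# Demonstrate on a scratch manifest with one unrelated, invented section.
scratch="$(mktemp)"
printf '%s\n' \
    '[[MXNet]]' 'path = "/home/brainiac/julia/incubatormxnet/julia"' \
    '[[Other]]' 'path = "/tmp/other"' > "$scratch"
manifest_mxnet_path "$scratch"
rm -f "$scratch"
```

The printed path should be the julia subfolder of $MXNET_HOME.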
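If everything above went smoothly, the installation is complete. MXNet may still fail to be found, though. In that case, first check that all the folders line up, then check in ~/julia/incubatormxnet/julia/src/base.jl whether the shared library is located correctly:
const MXNET_LIB = Libdl.find_library(["libmxnet.$(Libdl.dlext)", "libmxnet.so"],  # see build.jl
                                     [joinpath(get(ENV, "MXNET_HOME", ""), "lib"),
                                      get(ENV, "MXNET_HOME", ""),
                                      joinpath(@__DIR__, "..",
                                               "deps", "usr", "lib")])
# add the following line, then run again (expanduser is needed because Julia does not expand "~" on its own)
print(Libdl.find_library(expanduser("~/julia/incubatormxnet/lib/libmxnet.so")))
or
$ julia
julia> using Libdl
julia> Libdl.find_library(expanduser("~/julia/incubatormxnet/lib/libmxnet.so"))
If the result is an empty string, the library could not be loaded. The build probably went wrong, and rebuilding usually fixes it.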
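When the library exists but still cannot be loaded, a missing dependency (for example a CUDA or OpenCV shared library) is one possible cause. The sketch below filters the output of ldd for entries the dynamic linker could not resolve; missing_libs is a made-up helper, and the library path is the one assumed throughout this post.

```shell
#!/usr/bin/env bash
# Hypothetical helper: read ldd output on stdin and print the names of
# libraries reported as "not found".
missing_libs() {
    awk '/=> not found/ { print $1 }'
}

# Use MXNET_HOME if it is set, otherwise the path from this post.
lib="${MXNET_HOME:-$HOME/julia/incubatormxnet}/lib/libmxnet.so"
if [ -f "$lib" ]; then
    ldd "$lib" | missing_libs
fi
```

An empty result means ldd resolved every dependency, so the cause is more likely the build itself or the environment variables.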
Testing
$ julia
julia> using MXNet
julia> mx.gpu()
GPU0
Afterword
This installation really was a struggle. Enjoy coding.