HPC: CUDA install
Part 1: Install the CUDA driver
Ref link: http://superuser.com/questions/484991/nvidia-graphics-driver-in-ubuntu-12-04
1. Blacklist
Blacklisting nouveau in particular worked for me: it got rid of the annoying i2c warnings.
2. Stop lightdm
Ubuntu has switched from gdm to lightdm, so you have to stop lightdm (and with it the running X session) before installing the driver.
There are two ways, both run as root: service lightdm stop, or stop lightdm.
3. Normal setup
I chose all the default setup directories.
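Steps 1 and 2 above can be sketched as a small script. This is only a sketch under assumptions: the helper names (run, write_nouveau_blacklist, stop_lightdm) and the file name blacklist-nouveau.conf are mine, not from the referenced post, and by default the script only prints the privileged command instead of running it.

```shell
#!/bin/sh
# Sketch of steps 1 and 2. DRY_RUN defaults to 1, so privileged commands are
# only printed; set DRY_RUN=0 and run as root to actually execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Step 1: write the blacklist entry for nouveau. The target path is an
# argument so the function can be tried on a scratch file first.
write_nouveau_blacklist() {
  printf 'blacklist nouveau\n' > "$1"
}

# Step 2: stop the display manager before starting the driver installer.
stop_lightdm() {
  run service lightdm stop
}

# Assumed (hypothetical) file name under /etc/modprobe.d:
# write_nouveau_blacklist /etc/modprobe.d/blacklist-nouveau.conf
stop_lightdm
```

On Ubuntu the blacklist normally only takes effect after the initramfs is regenerated (update-initramfs -u) and the machine is rebooted, so that would come before stopping lightdm and running the installer.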
Part 2: Make all for samples
Before using 5.0.35, the latest release, I first installed 5.0.24, the release candidate. All the samples compiled fine, but ./deviceQuery failed to show any output. Also, with 5.0.25 you do not need to run sudo ldconfig /usr/local/cuda/lib; we will talk about that later.
Compilation under 5.0.35
1. MPI install
When compiling simpleMPI, the *** mpi not found error popped up, so you have to install an MPI package.
Solution, ref link: http://cs.ucsb.edu/~hnielsen/cs140/openmpi-install.html
Debian and Ubuntu
These instructions will almost definitely work on Debian lenny, squeeze, and sid, as well as Ubuntu hardy, intrepid, jaunty, karmic, or lucid.
Make sure your package repository is up to date. apt-get update will do this. You must run this command as root - you may have to su, or more likely run it with sudo (it'll look like sudo apt-get update).
Be sure you've installed GCC! apt-get install gcc g++ will install the compilers if you don't have them already.
Then, run apt-get install openmpi-bin openmpi-doc libopenmpi-dev, wrapping the command in sudo if necessary. This will install OpenMPI, all necessary libraries, and the documentation for the MPI calls.
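The quoted instructions boil down to three apt-get calls. A minimal sketch, assuming a Debian or Ubuntu system; the run wrapper and the install_openmpi function name are my own additions, and with the default DRY_RUN=1 the commands are only printed, not executed.

```shell
#!/bin/sh
# Sketch of the OpenMPI install steps quoted above. DRY_RUN defaults to 1, so
# the commands are only printed; set DRY_RUN=0 and run as root to install.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

install_openmpi() {
  run apt-get update                      # refresh the package lists
  run apt-get install gcc g++             # compilers, if not installed yet
  run apt-get install openmpi-bin openmpi-doc libopenmpi-dev
}

install_openmpi
```

After a real run, re-running make for the simpleMPI sample is the quickest check that the MPI compiler wrapper is now found.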
2. libcublas.so: error: undefined reference to 'dlsym'
ref link: http://forum.luahub.com/index.php?topic=2390.0
This error shows up when compiling /samples/6_Advanced/cdpLUDecomposition. I traced it through the makefile and added the -ldl link option, as described in the ref post. That got around the compilation error.
3. ./deviceQuery: error while loading shared libraries: libcudart.so.5.0: cannot open shared object file: No such file or directory
The error popped up when executing the deviceQuery binary.
Solution, ref link: http://stackoverflow.com/questions/10808958/libcudart-so-4-cannot-find-ubuntu-10-04
The post is a pretty good reference. It says that setting LD_LIBRARY_PATH can interfere with other programs. The following is the full explanation:
LD_LIBRARY_PATH is strongly deprecated. It may mess up other programs, and others may reset it. It should only be used to temporarily override the permanent paths for testing purposes (don't take my word, google it).
Instead, add a line with your cuda lib directory on it to /etc/ld.so.conf, after any existing lines.
For example, if you installed on /usr/local/cuda, you will need to add
32-bit : /usr/local/cuda/lib
64-bit : /usr/local/cuda/lib64
Save, and run ldconfig. This should permanently fix the problem.
The symbolic links are probably already set up by the installation. If not, then add them as Alex advised.
Note - I received errors referencing /lib, but I needed to add lib64 to fix them.
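The fix described in the quoted answer can be sketched as follows. The add_lib_dir helper is a hypothetical name of mine; it takes the configuration file as an argument so it can be tried on a scratch copy first, and it skips the append if the exact line is already there. The 64-bit path assumes the default /usr/local/cuda install location mentioned above.

```shell
#!/bin/sh
# Append a library directory to an ld.so configuration file unless the exact
# line is already present (so running it twice does not duplicate the entry).
add_lib_dir() {
  conf="$1"
  dir="$2"
  if ! grep -qxF -- "$dir" "$conf" 2>/dev/null; then
    printf '%s\n' "$dir" >> "$conf"
  fi
}

# As root, for a 64-bit install under /usr/local/cuda:
# add_lib_dir /etc/ld.so.conf /usr/local/cuda/lib64
# ldconfig                        # rebuild the shared library cache
# ldconfig -p | grep libcudart    # check whether libcudart is now in the cache
```

Passing the file as a parameter keeps the risky part (editing /etc/ld.so.conf) a deliberate, separate step.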
Final result:
root@rui:/usr/local/cuda-5.0/samples/bin/linux/release# ./deviceQuery