HPC: CUDA install

This walkthrough uses the latest production release from the NVIDIA CUDA Toolkit website.

 

Part 1: Install the CUDA driver

 Ref link: http://superuser.com/questions/484991/nvidia-graphics-driver-in-ubuntu-12-04

1. Blacklist

In particular, blacklisting nouveau got rid of the annoying i2c warnings for me.
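
As a rough sketch of the blacklist step, the snippet below writes a modprobe configuration file into the current directory so it can be checked before it is copied into place. The file name blacklist-nouveau.conf and the modeset=0 line are my assumptions, not taken from the ref link, and the ref post may blacklist more modules than nouveau.

```shell
# Target file. Written locally by default so nothing under /etc is touched.
# The file name is an assumption; modprobe reads any .conf file in /etc/modprobe.d/.
CONF="${CONF:-./blacklist-nouveau.conf}"

# Keep the open-source nouveau driver from loading at boot.
cat > "$CONF" <<'EOF'
blacklist nouveau
options nouveau modeset=0
EOF

# Show what was written.
cat "$CONF"

# To apply it (requires root), something like:
#   sudo cp blacklist-nouveau.conf /etc/modprobe.d/
#   sudo update-initramfs -u
# and then reboot.
```

Writing the file locally first is only a convenience for reviewing it; editing a file under /etc/modprobe.d/ directly as root should work just as well.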

2. Stop lightdm

Ubuntu has switched from gdm to lightdm, so you have to stop lightdm to shut down the X session before installing the driver.

Two ways, both run as root: service lightdm stop, or stop lightdm.
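
Before launching the driver installer it may help to confirm that lightdm is really gone. Here is a minimal sketch; lightdm_status is a hypothetical helper name of mine, and it assumes pgrep (from the procps package) is available.

```shell
# Print whether a lightdm process is currently running.
# lightdm_status is a hypothetical helper, not part of any package.
lightdm_status() {
    if pgrep -x lightdm >/dev/null 2>&1; then
        echo "lightdm is running: stop it before starting the installer"
    else
        echo "lightdm is not running"
    fi
}

lightdm_status
```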

3. Normal setup

I kept all the default install directories.

 

Part 2: Make all for the samples

 Before using 5.0.35, which is the latest release, I first installed 5.0.24, the release candidate. All the samples compiled fine, but ./deviceQuery failed to show any output. Also, with 5.0.25 you do not need to run sudo ldconfig /usr/local/cuda/lib; we will talk about that later.

 

Compilation under 5.0.35

1. MPI install

 When compiling simpleMPI, the *** mpi not found error popped up, so you have to install an MPI package.

 Solution, ref link: http://cs.ucsb.edu/~hnielsen/cs140/openmpi-install.html
 

Debian and Ubuntu

These instructions will almost definitely work on Debian lenny, squeeze, and sid, as well as Ubuntu hardy, intrepid, jaunty, karmic, or lucid.
Make sure your package repository is up to date. apt-get update will do this. You must run this command as root - you may have to su, or more likely run it with sudo (it'll look like sudo apt-get update).
Be sure you've installed GCC! apt-get install gcc g++ will install the compilers if you don't have them already.
Then, run apt-get install openmpi-bin openmpi-doc libopenmpi-dev, wrapping the command in sudo if necessary. This will install OpenMPI, all necessary libraries, and the documentation for the MPI calls.
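
The quoted steps can be sketched as a pair of shell functions. install_openmpi and check_mpi are hypothetical names of mine, the -y flag is my addition, and I am assuming mpicxx is the wrapper the sample's Makefile looks for. Nothing is installed until install_openmpi is called explicitly.

```shell
# Run the three quoted steps in order, stopping at the first failure.
# Defined only; call install_openmpi yourself to actually install.
install_openmpi() {
    sudo apt-get update &&
    sudo apt-get install -y gcc g++ &&
    sudo apt-get install -y openmpi-bin openmpi-doc libopenmpi-dev
}

# Report whether the MPI C++ compiler wrapper is visible on PATH.
check_mpi() {
    if command -v mpicxx >/dev/null 2>&1; then
        echo "mpicxx found: $(command -v mpicxx)"
    else
        echo "mpicxx not found; run install_openmpi"
    fi
}

check_mpi
```

If check_mpi still reports the wrapper as missing after the install, opening a new shell so PATH is re-read is worth a try before digging further.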

 2. libcublas.so: error: undefined reference to 'dlsym'

   ref link: http://forum.luahub.com/index.php?topic=2390.0

   This error shows up when compiling /samples/6_Advanced/cdpLUDecomposition. I traced it in the Makefile and added the -ldl link option, as described in the ref post. That got past the error.

 

3. ./deviceQuery: error while loading shared libraries: libcudart.so.5.0: cannot open shared object file: No such file or directory

  This error popped up when running the deviceQuery executable.

  Solution, ref link: http://stackoverflow.com/questions/10808958/libcudart-so-4-cannot-find-ubuntu-10-04
  The post is a pretty good reference. It says LD_LIBRARY_PATH can cause conflicts between different programs. The full explanation follows:

LD_LIBRARY_PATH is strongly deprecated. It may mess up other programs, and others may reset it. It should only be used to temporarily override the permanent paths for testing purposes (don't take my word, google it).

Instead, add a line with your cuda lib directory on it to /etc/ld.so.conf, after any existing lines.

For example, if you installed on /usr/local/cuda, you will need to add

32-bit : /usr/local/cuda/lib

64-bit : /usr/local/cuda/lib64

Save, and run ldconfig. This should permanently fix the problem.

The symbolic links are probably already set up by the installation. If not, then add them as Alex advised.

Note - I received errors referencing /lib, but I needed to add lib64 to fix them. 
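
The fix quoted above can be sketched as a small helper that appends the CUDA library directory to an ld.so configuration file only if it is not already listed. add_lib_dir is a hypothetical name of mine, and the paths in the usage comments assume a default 64-bit install under /usr/local/cuda.

```shell
# Append a library directory to an ld.so config file unless it is already there.
# Usage: add_lib_dir DIR CONF_FILE
# add_lib_dir is a hypothetical helper, not part of any package.
add_lib_dir() {
    dir="$1"
    conf="$2"
    # -x matches the whole line, -F treats the path as a fixed string.
    grep -qxF "$dir" "$conf" 2>/dev/null || echo "$dir" >> "$conf"
}

# To apply on a real system, run as root (path assumes a 64-bit default install):
#   add_lib_dir /usr/local/cuda/lib64 /etc/ld.so.conf
#   ldconfig
# Then check that the runtime library is in the linker cache:
#   ldconfig -p | grep libcudart
```

Checking before appending keeps the file from picking up a duplicate line each time the step is repeated.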

 

 

Final result: 

root@rui:/usr/local/cuda-5.0/samples/bin/linux/release# ./deviceQuery

./deviceQuery Starting...
 CUDA Device Query (Runtime API) version (CUDART static linking)
Detected 1 CUDA Capable device(s)
Device 0: "NVS 5400M"
  CUDA Driver Version / Runtime Version          5.0 / 5.0
  CUDA Capability Major/Minor version number:    2.1
  Total amount of global memory:                 1024 MBytes (1073414144 bytes)
  ( 2) Multiprocessors x ( 48) CUDA Cores/MP:    96 CUDA Cores
  GPU Clock rate:                                950 MHz (0.95 GHz)
  Memory Clock rate:                             900 Mhz
  Memory Bus Width:                              128-bit
  L2 Cache Size:                                 131072 bytes
  Max Texture Dimension Size (x,y,z)             1D=(65536), 2D=(65536,65535), 3D=(2048,2048,2048)
  Max Layered Texture Size (dim) x layers        1D=(16384) x 2048, 2D=(16384,16384) x 2048
  Total amount of constant memory:               65536 bytes
  Total amount of shared memory per block:       49152 bytes
  Total number of registers available per block: 32768
  Warp size:                                     32
  Maximum number of threads per multiprocessor:  1536
  Maximum number of threads per block:           1024
  Maximum sizes of each dimension of a block:    1024 x 1024 x 64
  Maximum sizes of each dimension of a grid:     65535 x 65535 x 65535
  Maximum memory pitch:                          2147483647 bytes
  Texture alignment:                             512 bytes
  Concurrent copy and kernel execution:          Yes with 1 copy engine(s)
  Run time limit on kernels:                     No
  Integrated GPU sharing Host Memory:            No
  Support host page-locked memory mapping:       Yes
  Alignment requirement for Surfaces:            Yes
  Device has ECC support:                        Disabled
  Device supports Unified Addressing (UVA):      No
  Device PCI Bus ID / PCI location ID:           1 / 0
  Compute Mode:
     < Default (multiple host threads can use ::cudaSetDevice() with device simultaneously) >
deviceQuery, CUDA Driver = CUDART, CUDA Driver Version = 5.0, CUDA Runtime Version = 5.0, NumDevs = 1, Device0 = NVS 5400M

Posted on 2012-10-16 01:43 by 单向度的人
