Erik Peña

Install Nvidia CUDA Drivers on Ubuntu Server

I needed NVIDIA CUDA on my Ubuntu Server for two reasons: the CUDA toolkit has to be available when building the miner applications, and the CUDA drivers have to be installed when running said miners.  This document covers installing NVIDIA CUDA on Ubuntu Server 16.04.

I will cover building miners in other documents in an effort to keep this article on a “targeted focus” ;-).

Dependencies

In order to use NVIDIA’s CUDA installer, you’ll need to install the build-essential package.  Since I am doing this on Ubuntu Server, I installed it with apt-get:

sudo apt-get install build-essential
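
To confirm the toolchain actually landed on the PATH afterwards, a small check along these lines can help.  This is only a sketch: missing_tools is a helper name I made up, and the list of tools (gcc, g++, make) is my assumption about what the installer and later miner builds will look for.

```shell
# missing_tools: print each named command that is NOT found on PATH.
# (Hypothetical helper, not part of any package.)
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Report anything build-essential should have provided.
missing=$(missing_tools gcc g++ make)
if [ -n "$missing" ]; then
  echo "Still missing:" $missing
else
  echo "Build tools found."
fi
```

If anything is reported as missing, re-running the apt-get command above is the first thing I would try.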

Install CUDA

The obvious first step to installing NVIDIA CUDA is to download the installer (duh).  To do this, head over to NVIDIA’s website and select the version you need.  For the sake of this document, I chose CUDA 9.1.

https://developer.nvidia.com/cuda-toolkit-archive

Using the form, select the options that match your system so that a valid installer can be downloaded.  Since I am on Ubuntu Server 16.04, I selected the options for Ubuntu 16.04.

For the installer type, I opted for “runfile (local)”.  There’s no particular reason I chose this over the others; in theory, any of them should work.  To download the file, I changed to my home directory (“cd ~”) and executed the following command:

$ wget https://developer.nvidia.com/compute/cuda/9.1/Prod/local_installers/cuda_9.1.85_387.26_linux

The above uses wget to download the installer.  As a best practice, you should always compute the checksum of a downloaded file and compare it with the value the publisher provides, whenever one is available, to ensure that you did not download a corrupted or compromised file.  In this case, the download page I referenced earlier lists the checksum NVIDIA publishes for this file, so the file I downloaded should have a checksum value of “67a5c3933109507df6b68f80650b4b4a”.

To compute the checksum of the downloaded file, run the following command:

$ md5sum cuda_9.1.85_387.26_linux
67a5c3933109507df6b68f80650b4b4a  cuda_9.1.85_387.26_linux
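
If you would rather not compare 32 hex characters by eye, the comparison can be scripted.  The following is a minimal sketch, assuming the installer sits in the current directory; verify_md5 is a helper name of my own, not something NVIDIA ships.

```shell
# verify_md5 FILE EXPECTED: succeed only if FILE's MD5 sum equals EXPECTED.
# (Hypothetical helper built on the standard md5sum tool.)
verify_md5() {
  actual=$(md5sum "$1" 2>/dev/null | cut -d ' ' -f 1)
  [ -n "$actual" ] && [ "$actual" = "$2" ]
}

installer=cuda_9.1.85_387.26_linux
expected=67a5c3933109507df6b68f80650b4b4a   # the value quoted in this article

# Only check when the file is actually present.
if [ -f "$installer" ]; then
  if verify_md5 "$installer" "$expected"; then
    echo "Checksum OK"
  else
    echo "Checksum MISMATCH: do not run this installer" >&2
  fi
fi
```

A missing or unreadable file is treated the same as a mismatch, which seemed like the safer default to me.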

It’s a match, so we are good to install this package.  However, before we install (and this only applies to readers who already have another version of the CUDA drivers installed), you’ll likely want to uninstall the older version first.  To uninstall, execute the following command (skip this step if this is a brand-new install):

$ sudo apt-get purge 'nvidia-*'
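
Before purging, it can be reassuring to see which NVIDIA packages are actually installed.  Here is a rough sketch using dpkg-query; installed_nvidia_packages is a name I invented, and it only knows about packages installed through apt/dpkg, not about a driver installed from a runfile.

```shell
# installed_nvidia_packages: read "<want> <flag> <status> <package>" lines on
# stdin (the layout of dpkg-query's ${Status} ${Package} format) and print the
# fully installed packages whose name starts with "nvidia-".
# (Hypothetical helper.)
installed_nvidia_packages() {
  awk '$1 == "install" && $3 == "installed" && $4 ~ /^nvidia-/ { print $4 }'
}

# Guarded so the sketch is harmless on systems without dpkg.
if command -v dpkg-query >/dev/null 2>&1; then
  dpkg-query -W -f='${Status} ${Package}\n' | installed_nvidia_packages
fi
```

An empty list suggests there is nothing for the purge to remove.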

Now we’re actually ready to install NVIDIA CUDA.  To kick off this process, execute the following command:

$ sudo sh cuda_9.1.85_387.26_linux --override
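
As an aside, the runfile can reportedly also be driven without prompts.  The sketch below only prints a command line that mirrors the interactive answers I give in the rest of this section (driver: yes, toolkit: yes, samples: no).  The flags are the ones I understand NVIDIA documents for runfile installers of this era, where --silent also implies accepting the EULA; treat them as an assumption and check “sh cuda_9.1.85_387.26_linux --help” before relying on them.  I have not verified how the silent mode handles the /usr/local/cuda symbolic link.  silent_install_args is a made-up helper name.

```shell
# silent_install_args [TOOLKIT_PATH]: print runfile flags that match the
# interactive answers used in this article. The flags are an assumption
# based on NVIDIA's documented runfile options; verify with --help.
# (Hypothetical helper.)
silent_install_args() {
  toolkit_path=${1:-/usr/local/cuda-9.1}
  printf '%s' "--silent --driver --toolkit --toolkitpath=$toolkit_path --override"
}

installer=cuda_9.1.85_387.26_linux

# Only print the command; run it yourself once you have checked the flags.
echo "sudo sh $installer $(silent_install_args)"
```

The rest of this section walks through the interactive route, which is what I actually used.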

There will be a pause while the installer loads.  The next screen you will be presented with is the End User License Agreement (EULA).  Use the spacebar to scroll through the legal text or (pro tip) press the “q” key to skip to the end.  Once you reach the end, you will be prompted with a series of questions.  The first asks whether you have read and accept the EULA.  I can’t tell you what to do, but I accepted it by typing “accept” and then hitting the enter key:

Do you accept the previously read EULA?
accept/decline/quit: accept

Next, the installer will ask if you want to install the GPU drivers.  These will be needed when you actually run your miner, so you will need to say yes by typing “y” and hitting the enter key:

Install NVIDIA Accelerated Graphics Driver for Linux-x86_64 387.26?
(y)es/(n)o/(q)uit: y

The installer will then ask if you want to install the OpenGL libraries.  Admittedly, I wasn’t too sure on this one, but installing something that isn’t going to be used shouldn’t hurt, and the default answer is “yes”, so I didn’t think this would be a problem.  I answered “yes”:

Do you want to install the OpenGL libraries?
(y)es/(n)o/(q)uit [ default is yes ]: 

The next question asks if you want to run nvidia-xconfig (I don’t cover this in this document).  I do plan on allowing a mix of NVIDIA and AMD based GPUs which, based on the installer’s instructions, means that I should not run nvidia-xconfig.  I skipped this step by hitting the enter key since the default answer is “no”:

Do you want to run nvidia-xconfig?
This will update the system X configuration file so that the NVIDIA X driver
is used. The pre-existing X configuration file will be backed up.
This option should not be used on systems that require a custom
X configuration, such as systems with multiple GPU vendors.
(y)es/(n)o/(q)uit [ default is no ]: 

In order to compile your miner (not covered in this document), the CUDA toolkit must be installed.  Install it by typing “y” and hitting the enter key:

Install the CUDA 9.1 Toolkit?
(y)es/(n)o/(q)uit: y

You can use the default install location for the toolkit by just hitting the enter key for the next question–

Enter Toolkit Location
 [ default is /usr/local/cuda-9.1 ]: 

The installer will ask if you want to create a symbolic link for cuda, which you should do.  I told the installer yes by typing “y” and hitting the enter key–

Do you want to install a symbolic link at /usr/local/cuda?
(y)es/(n)o/(q)uit: y

Lastly, the installer will ask if you want to install the CUDA Samples.  They are not needed for this setup.  You can skip them by typing “n” and hitting the enter key:

Install the CUDA 9.1 Samples?
(y)es/(n)o/(q)uit: n

After you hit enter, the installer will proceed with the installation.  This will take a couple of minutes.  You will know it finished successfully when the following summary is printed on the screen:

===========
= Summary =
===========
Driver:   Installed
Toolkit:  Installed in /usr/local/cuda-9.1
Samples:  Not Selected
...
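
After a successful install, the toolkit’s bin and lib64 directories still need to be visible to your shell before a miner build can find nvcc.  The following is a sketch of what could go in a shell profile, assuming the /usr/local/cuda symbolic link was created as described above; CUDA_HOME and prepend_once are names I picked for illustration.

```shell
# prepend_once DIR LIST: print the colon-separated LIST with DIR in front,
# unless DIR is already somewhere in LIST. (Hypothetical helper.)
prepend_once() {
  case ":$2:" in
    *":$1:"*) printf '%s\n' "$2" ;;
    *)        printf '%s\n' "$1${2:+:$2}" ;;
  esac
}

CUDA_HOME=/usr/local/cuda   # assumption: the symbolic link from the installer
PATH=$(prepend_once "$CUDA_HOME/bin" "$PATH")
LD_LIBRARY_PATH=$(prepend_once "$CUDA_HOME/lib64" "${LD_LIBRARY_PATH:-}")
export PATH LD_LIBRARY_PATH

# nvcc ships with the toolkit; if it is found, the toolkit is reachable.
if command -v nvcc >/dev/null 2>&1; then
  nvcc --version
else
  echo "nvcc not found on PATH"
fi
```

Checking for an existing entry first keeps PATH from growing each time the profile is sourced.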

Additional Info

For the geek factor, you can also run the NVIDIA System Management Interface command to see the status of your NVIDIA GPU(s).  Sorry, mobile users: it’s going to look horrible.

$ nvidia-smi
Sun Jan 14 14:03:16 2018       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 387.26                 Driver Version: 387.26                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  GeForce GTX 1080    Off  | 00000000:02:00.0 Off |                  N/A |
|  0%   43C    P0    36W / 215W |      0MiB /  8114MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
|   1  GeForce GTX 1080    Off  | 00000000:04:00.0 Off |                  N/A |
|  0%   46C    P0    35W / 215W |      0MiB /  8114MiB |      2%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+
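
The table is nice for humans, but nvidia-smi can also emit CSV, which is easier to script against.  Below is a small sketch that flags GPUs at or above a temperature limit; hot_gpus is a name I made up, and the 80C limit is an arbitrary number for illustration, not a recommendation.

```shell
# hot_gpus LIMIT: read "index, name, temperature" CSV lines on stdin and
# print the GPUs whose temperature (in C) is at or above LIMIT.
# (Hypothetical helper.)
hot_gpus() {
  awk -F', ' -v limit="$1" \
    '$3 + 0 >= limit + 0 { print "GPU " $1 " (" $2 ") at " $3 "C" }'
}

# Guarded so the sketch does nothing on machines without the driver.
if command -v nvidia-smi >/dev/null 2>&1; then
  nvidia-smi --query-gpu=index,name,temperature.gpu \
             --format=csv,noheader,nounits | hot_gpus 80
fi
```

With the two cards shown above (43C and 46C), a limit of 80 would print nothing.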

Common Pitfalls

It appears that an X server is running

If, after trying to install the CUDA drivers, you get a summary that looks something like the following:

It appears that an X server is running. Please exit X before installation. If you're sure that X is not running, but are getting this error, please delete any X lock files in /tmp.

===========
= Summary =
===========

Driver: Installation Failed
Toolkit: Installation skipped
Samples: Not Selected

You will have to stop your X Server process.  Unfortunately, none of the resources I found gave a precise procedure for this.  You’ll have to run the following command repeatedly until you receive an “Xorg: no process found” message.  Once you see that message, X Server should be stopped.  Here is the command to stop X Server:

$ sudo killall Xorg

After you have stopped the X Server process, you can re-attempt to install the CUDA drivers.
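
The “run it until it stops” routine can be wrapped in a loop.  This is a sketch only: stop_until_gone is a name I invented, and it assumes pgrep and killall are available.  My guess is that X keeps coming back because a display manager restarts it; if you have one (LightDM on a desktop install, for example), stopping that service first is probably the cleaner fix, but I have not tested that here.

```shell
# stop_until_gone NAME [MAX]: keep running "sudo killall NAME" until no
# process called NAME is left, giving up after MAX attempts (default 10).
# Returns 0 once the process is gone, 1 if it is still running.
# (Hypothetical helper.)
stop_until_gone() {
  name=$1
  max=${2:-10}
  tries=0
  while pgrep -x "$name" >/dev/null 2>&1; do
    if [ "$tries" -ge "$max" ]; then
      return 1
    fi
    sudo killall "$name"
    tries=$((tries + 1))
    sleep 1
  done
  return 0
}

# Usage (not run automatically):
#   stop_until_gone Xorg 10 && echo "X Server is stopped"
```

If the function gives up, something is respawning X faster than it can be killed, which points back at the display manager.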

 
