Preparing your environment
Follow the installation requirements for running a Daemon on your machine
1.1/ BIOS settings
First, make sure your machine can run virtualized containers such as Docker. Hardware virtualization is enabled in the BIOS, and you only have to do this once.
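If you are already on Linux, one quick way to check is to look for the vmx (Intel) or svm (AMD) CPU flag. The sketch below assumes a standard /proc/cpuinfo; the helper name virt_status is only illustrative. On older kernels the flag may only show that the CPU supports the feature, so treat a positive result as a hint rather than proof. On Windows, Task Manager > Performance > CPU shows a "Virtualization" field instead.

```shell
#!/bin/sh
# Report whether a cpuinfo-style file lists the vmx (Intel) or svm (AMD) flag.
# $1: path to the file to inspect (defaults to /proc/cpuinfo).
virt_status() {
    if grep -Eqw 'vmx|svm' "${1:-/proc/cpuinfo}" 2>/dev/null; then
        echo "enabled"
    else
        echo "not detected"
    fi
}

echo "Hardware virtualization: $(virt_status)"
```

If the result is "not detected", review the virtualization option (often labelled Intel VT-x or AMD-V / SVM) in your BIOS.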
1.2/ Ubuntu install
If you have a Windows machine and want to use your GPU's processing power for AI workloads, there are a few prerequisites, since NVIDIA only exposes its GPUs to Docker containers on Linux, not on Windows.
Your Daemon can run on as many machines as you like. On each such machine, it requires a Linux/Ubuntu configuration.
If you are already on Linux, just make sure your installation has the requirements in the list below.
If you are on Windows, make sure you are running WSL 2 and Docker Desktop configured for WSL.
Windows 10 and 11 can run Linux distributions through WSL 2, so we will use this capability.
Your main goal is to install WSL 2 on your Windows machine (wsl --install), then install an Ubuntu 22.04 instance (wsl --install -d Ubuntu-22.04), and finally install Docker in this WSL session.
Follow the steps to install WSL 2 in the documentation below.
Make sure your default WSL distribution is set to the Ubuntu instance you just installed (wsl --set-default Ubuntu-22.04), so that you land in Ubuntu when you run "wsl".
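Once you are inside the wsl session, a small check like the one below can confirm that you landed where you expected. It is only a sketch: it assumes the stock Microsoft kernel, whose release string ends in "-microsoft-standard-WSL2" (a custom kernel may be named differently), and the helper name wsl_kind is illustrative.

```shell
#!/bin/sh
# Classify a kernel release string, as printed by `uname -r`.
wsl_kind() {
    case "$1" in
        *-microsoft-standard-WSL2*) echo "wsl2" ;;
        *[Mm]icrosoft*)             echo "wsl1-or-custom" ;;
        *)                          echo "not-wsl" ;;
    esac
}

echo "Kernel: $(wsl_kind "$(uname -r)")"

# The distribution describes itself in /etc/os-release.
if [ -r /etc/os-release ]; then
    . /etc/os-release
    echo "Distro: ${PRETTY_NAME:-unknown}"
fi
```

On the setup described here you would expect "wsl2" and an Ubuntu 22.04 distro name.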
1.3/ Docker install
We then need to install Docker. The preferred way is to install the Windows application "Docker Desktop" and activate the Ubuntu integration from there. If you would rather set up Docker directly in Ubuntu, here is a good link to do this:

If Docker is still not properly installed at this point, we strongly recommend reviewing the Windows "Docker Desktop" settings and setting it up from there. Run "Docker Desktop" and go to Settings > Resources. Make sure the toggle "Enable integration with additional distros" is set to ON, then relaunch Docker Desktop and open a new wsl session.
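To see whether the integration took effect, you can run a quick check from the wsl session. This is a minimal sketch, not an official diagnostic; have_cmd is a hypothetical helper name. It first looks for the docker CLI, then uses docker info, which only succeeds when the Docker daemon is reachable.

```shell
#!/bin/sh
# Print "found" or "missing" depending on whether a command is on PATH.
have_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "found"
    else
        echo "missing"
    fi
}

if [ "$(have_cmd docker)" = "found" ]; then
    # `docker info` queries the daemon, so it fails when Docker Desktop
    # is not running or WSL integration is off for this distro.
    if docker info >/dev/null 2>&1; then
        echo "docker: CLI found and daemon reachable"
    else
        echo "docker: CLI found, but the daemon is not reachable"
    fi
else
    echo "docker: CLI not found in this wsl session"
fi
```

If the CLI is missing or the daemon is unreachable, go back to the Docker Desktop settings described above.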

1.4/ Installing NVIDIA CUDA
You will need CUDA installed on your machine. Check the latest package at the following link.
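After installing, one way to see what the NVIDIA driver exposes is to read the header printed by nvidia-smi. The sketch below assumes the usual header layout containing "CUDA Version: X.Y"; the helper name cuda_version is illustrative. Note that this field is the highest CUDA version the driver supports, which is not necessarily the toolkit version you installed.

```shell
#!/bin/sh
# Extract the first "CUDA Version" value from nvidia-smi output read on stdin.
cuda_version() {
    sed -n 's/.*CUDA Version: *\([0-9][0-9.]*\).*/\1/p' | head -n 1
}

if command -v nvidia-smi >/dev/null 2>&1; then
    echo "CUDA version reported by the driver: $(nvidia-smi | cuda_version)"
else
    echo "nvidia-smi not found; the NVIDIA driver may not be visible in this session"
fi
```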
1.5/ Checking the config
A good way to test whether the whole installation was successful is to run the following command inside a wsl terminal:
docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark
This validates that Docker runs inside WSL and that containers can access the GPU through CUDA.
When the command runs successfully, the output looks like the example below. As you can see, the GPU (here an RTX 3090) was detected and accessible.
Unable to find image 'nvcr.io/nvidia/k8s/cuda-sample:nbody' locally
nbody: Pulling from nvidia/k8s/cuda-sample
22c5ef60a68e: Pull complete
1939e4248814: Pull complete
548afb82c856: Pull complete
a424d45fd86f: Pull complete
207b64ab7ce6: Pull complete
f65423f1b49b: Pull complete
2b60900a3ea5: Pull complete
e9bff09d04df: Pull complete
edc14edf1b04: Pull complete
1f37f461c076: Pull complete
9026fb14bf88: Pull complete
Digest: sha256:59261e419d6d48a772aad5bb213f9f1588fcdb042b115ceb7166c89a51f03363
Status: Downloaded newer image for nvcr.io/nvidia/k8s/cuda-sample:nbody
Run "nbody -benchmark [-numbodies=<numBodies>]" to measure performance.
-fullscreen (run n-body simulation in fullscreen mode)
-fp64 (use double precision floating point values for simulation)
-hostmem (stores simulation data in host memory)
-benchmark (run benchmark to measure performance)
-numbodies=<N> (number of bodies (>= 1) to run in simulation)
-device=<d> (where d=0,1,2.... for the CUDA device to use)
-numdevices=<i> (where i=(number of CUDA devices > 0) to use for simulation)
-compare (compares simulation results running once on the default GPU and once on the CPU)
-cpu (run n-body simulation on the CPU)
-tipsy=<file.bin> (load a tipsy model file for simulation)
NOTE: The CUDA Samples are not meant for performance measurements. Results may vary when GPU Boost is enabled.
> Windowed mode
> Simulation data stored in video memory
> Single precision floating point simulation
> 1 Devices used for simulation
GPU Device 0: "Ampere" with compute capability 8.6
> Compute 8.6 CUDA device: [NVIDIA GeForce RTX 3090]
83968 bodies, total time for 10 iterations: 74.207 ms
= 950.126 billion interactions per second
= 19002.529 single-precision GFLOP/s at 20 flops per interaction
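If you prefer not to read through the whole log, you can filter it for the line that names the detected GPU. This is a small sketch that assumes the output format shown above ("CUDA device: [name]"); the helper name detected_gpu is illustrative.

```shell
#!/bin/sh
# Print the GPU name(s) found in nbody benchmark output read on stdin.
detected_gpu() {
    sed -n 's/.*CUDA device: \[\([^]]*\)].*/\1/p'
}

# Real usage (needs the working Docker + GPU setup described above):
#   docker run --gpus all nvcr.io/nvidia/k8s/cuda-sample:nbody nbody -gpu -benchmark | detected_gpu
# Demonstration on the line taken from the sample output:
sample='> Compute 8.6 CUDA device: [NVIDIA GeForce RTX 3090]'
echo "$sample" | detected_gpu
```

An empty result means no CUDA device line was printed, which suggests the container could not see the GPU.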
If you run into a problem during setup, check all the steps again one by one. There is also this alternative step-by-step guide, although it is slightly more technical to follow.