# CUDA Modules

> [!NOTE]
> This document is meant to help CUDA maintainers understand the structure of
> the CUDA packages in Nixpkgs. It is not meant to be a user-facing document.
> For a user-facing document, see [the CUDA section of the manual](../../../doc/languages-frameworks/cuda.section.md).

The files in this directory are added (in some way) to the `cudaPackages`
package set by [cuda-packages.nix](../../top-level/cuda-packages.nix).

## Top-level directories

- `cuda`: CUDA redistributables. Provides an extension to the `cudaPackages`
    scope.
- `cudatoolkit`: monolithic CUDA Toolkit run-file installer. Provides an
    extension to the `cudaPackages` scope.
- `cudnn`: NVIDIA cuDNN library.
- `cutensor`: NVIDIA cuTENSOR library.
- `fixups`: Each file or directory (excluding `default.nix`) should contain a
    `callPackage`-able expression to be provided to the `overrideAttrs` attribute
    of a package produced by the generic manifest builder.
    These fixups are applied by `pname`, so packages with multiple versions
    (e.g., `cudnn` and `cudnn_8_9`) all share a single fixup function
    (i.e., `fixups/cudnn.nix`).
- `generic-builders`:
  - Contains a builder `manifest.nix` which operates on the `Manifest` type
      defined in `modules/generic/manifests`. Most packages are built using this
      builder.
  - Contains a builder `multiplex.nix` which leverages the Manifest builder. In
      short, the Multiplex builder adds multiple versions of a single package to
      a single instance of the CUDA Packages package set. It is used primarily
      for packages like `cudnn` and `cutensor`.
- `modules`: Nixpkgs modules to check the shape and content of CUDA
    redistributable and feature manifests. These modules additionally use shims
    provided by some CUDA packages to allow them to re-use the
    `genericManifestBuilder`, even if they don't have manifest files of their
    own. `cudnn` and `tensorrt` are examples of packages which provide such
    shims. These modules are further described in the
    [Modules](./modules/README.md) documentation.
- `packages`: Contains packages which exist in every instance of the CUDA
    package set. These packages are built in a `by-name` fashion.
- `setup-hooks`: Nixpkgs setup hooks for CUDA.
- `tensorrt`: NVIDIA TensorRT library.

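
To make the `fixups` convention above more concrete, a fixup file might look
roughly like the following. This is a minimal sketch rather than a copy of any
file in the tree: the file name `fixups/example.nix`, the extra `zlib`
dependency, and the exact function shape the builder expects are assumptions
made for illustration.

```nix
# fixups/example.nix (hypothetical; the file name would match the package's `pname`)
#
# The outer function is what `callPackage` fills in. The inner function is the
# part that would be handed to `overrideAttrs`.
{
  zlib, # assumed extra dependency, for illustration only
}:
prevAttrs: {
  # Append to whatever the generic manifest builder already set, instead of
  # replacing it.
  buildInputs = (prevAttrs.buildInputs or [ ]) ++ [ zlib ];
}
```

Because the lookup is by `pname`, every version of a multiplexed package would
receive the same function, so any version-specific logic would have to live
inside the fixup itself.
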
## Distinguished packages

### CUDA Compatibility

[CUDA Compatibility](https://docs.nvidia.com/deploy/cuda-compatibility/),
available as `cudaPackages.cuda_compat`, is a component which makes it possible
to run applications built against a newer CUDA toolkit (for example CUDA 12) on
a machine with an older CUDA driver (for example CUDA 11), which isn't possible
out of the box. At the time of writing, CUDA Compatibility is only available on
the NVIDIA Jetson architecture, but NVIDIA might release support for more
architectures in the future.

As CUDA Compatibility strictly increases the range of supported applications, we
try our best to enable it by default on supported platforms.

#### Functioning

`cuda_compat` simply provides a new `libcuda.so` (and associated variants) that
needs to be used in place of the default CUDA driver's `libcuda.so`. However,
the other shared libraries of the default driver must still be accessible:
`cuda_compat` isn't a complete drop-in replacement for the driver (and that's
the point; otherwise, it would just be a newer driver).

NVIDIA's recommendation is to set `LD_LIBRARY_PATH` to point to `cuda_compat`'s
driver. This is fine for manual, one-shot usage, but in general setting
`LD_LIBRARY_PATH` is a red flag. It is global state which short-circuits most
other dynamic library resolution mechanisms and can break things in
non-obvious ways, especially with other Nix-built software.

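
For reference, the manual, one-shot approach could be sketched as a throwaway
development shell. This is only an illustration of the `LD_LIBRARY_PATH`
recommendation, not a packaging pattern. It assumes a platform where
`cuda_compat` is available, and it assumes that the compat `libcuda.so` lives
under a `compat/` directory in the package output.

```nix
# shell.nix (hypothetical): throwaway shell for a one-off manual test.
{
  pkgs ? import <nixpkgs> { config.allowUnfree = true; },
}:
pkgs.mkShell {
  # Assumption: the replacement `libcuda.so` is found under `compat/`.
  # Note that this affects every dynamically linked program started from the
  # shell, which is exactly why it is discouraged outside of manual testing.
  LD_LIBRARY_PATH = "${pkgs.cudaPackages.cuda_compat}/compat";
}
```
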
#### CUDA Compat with Nix

Since `cuda_compat` is a known derivation, the easy way to do this in Nix would
be to add `cuda_compat` as a dependency of CUDA libraries and applications and
let Nix do its magic by filling the `DT_RUNPATH` fields. However,
`cuda_compat` itself depends on `libnvrm_mem` and `libnvrm_gpu`, which are loaded
dynamically at runtime from `/run/opengl-driver`. This doesn't please the Nix
sandbox when building, which can't find those (a second minor issue is that
`addOpenGLRunpathHook` prepends the `/run/opengl-driver` path, so that would
still take precedence).

The current solution is to do something similar to `addOpenGLRunpathHook`: the
`addCudaCompatRunpathHook` prepends the path to `cuda_compat`'s `libcuda.so`
to the `DT_RUNPATH` of whichever package includes the hook as a dependency, and
we include the hook by default for packages in `cudaPackages` (by adding it as
an input in `genericManifestBuilder`). We also make sure it's included after
`addOpenGLRunpathHook`, so that it appears _before_ it in the `DT_RUNPATH` and
takes precedence.
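
The ordering described above could be sketched for a package outside
`cudaPackages` that opts into both hooks explicitly. This is a hedged
illustration only: the package name, the `src`, and the attribute path
`cudaPackages.addCudaCompatRunpathHook` are assumptions, and the sketch relies
on the hooks running in the order in which they are listed.

```nix
# Hypothetical package that includes both hooks by hand.
{
  stdenv,
  addOpenGLRunpathHook,
  cudaPackages,
}:
stdenv.mkDerivation {
  pname = "example-cuda-app"; # hypothetical
  version = "0.1.0";
  src = ./.;

  nativeBuildInputs = [
    # Listed first: prepends the `/run/opengl-driver` path.
    addOpenGLRunpathHook
    # Listed second: prepends the `cuda_compat` path afterwards, so that entry
    # ends up earlier in `DT_RUNPATH` and is searched first.
    cudaPackages.addCudaCompatRunpathHook # assumed attribute path
  ];
}
```

One way to check the result on a built binary is `patchelf --print-rpath`,
which should list the `cuda_compat` entry ahead of the `/run/opengl-driver`
one if the ordering worked as intended.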