llvmPackages_rocm: compile as one derivation

Building as a single derivation is the supported configuration and the
way ROCm is tested.
It also makes packaging in Nix a *lot* easier (see the code size).
An important change is the dontLink detection in the clang/clang++
wrapper script: when compiling with --cuda-device-only,
the linker must not be set; otherwise, e.g., the Blender kernels fail to
compile.
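
For illustration, the dontLink detection described above could look
roughly like the sketch below. This is not the actual wrapper code: the
function name detect_dont_link is hypothetical, and the list of
compile-only flags is an assumption. It only shows the idea of treating
--cuda-device-only like other flags that imply no link step.

```bash
#!/usr/bin/env bash
# Hypothetical helper, NOT the real wrapper implementation.
# Prints 1 if the compiler arguments imply that no link step runs,
# and 0 otherwise.
detect_dont_link() {
    local dontLink=0 arg
    for arg in "$@"; do
        case "$arg" in
            # Usual compile-only / preprocess-only flags (assumed list).
            -c | -S | -E | -M | -MM) dontLink=1 ;;
            # Device-only compilation: the linker must not be set.
            --cuda-device-only) dontLink=1 ;;
        esac
    done
    echo "$dontLink"
}
```

With this sketch, `detect_dont_link -x hip --cuda-device-only kernel.cpp`
prints 1, while `detect_dont_link main.o -o main` prints 0, so a wrapper
could skip its linker setup in the first case.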