Update the change from commit 2a94c762ed (FindCUDAToolkit: Add support
for CUDA::nvrtc_static, 2023-01-20, v3.26.0-rc1~55^2). The lib is named
`libnvrtc-builtins_static.a`, not `libnvrtc_builtins_static.a`.
989d50d7fc FindCUDAToolkit: Support nvhpc splayed layouts without symlinks
207518b6e8 FindCUDAToolkit: Handle CUDAToolkit_TARGET_DIR dir being a symlink
Acked-by: Kitware Robot <kwrobot@kitware.com>
Merge-request: !7945
nvtx3 is a header-only replacement for the previous shared library
implementations.
I implemented it as a separate target since while the header names match and
ideally it should be API compatible, forcing its include directory into the old
target would lengthen the include search path and could cause confusion or
possible build differences for projects using multiple build systems. This
keeps it explicit as a developer opt-in.
Implements: #21377Resolves: #23835
CUDA's cupti library has its headers in a seperate directory on a
standard CUDA install, but `CUDA::cupti` only adds the default cuda
include directory.
Issue: #22761
Commit 14d8a276 (CUDA: Support nvcc 11.5 new -arch=all|all-major flags,
2021-08-17) added all and all-major options to CUDA_ARCHITECTURES. These are
fairly generic and likely to see real-world use by distributors. Thus it's
desirable to support these also for Clang and older NVCC versions.
The supported architectures are dependent on the toolkit version. We determine
the toolkit version prior to compiler detection. For NVCC we get the version
from the vendor identification output, but for Clang we need to invoke NVCC
separately.
The architecture information is mostly based on the Wikipedia list with the
earliest supported version being CUDA 7.0. This could be documented and
expanded in the future to allow projects to query CUDA toolkit version and
architecture information.
For Clang we additionally constrain based on its support.
Additionally the architecture mismatch detection logic is fixed, improved and
updated for generic support:
* Commit 01428c55 (CUDA: Fail fast if CMAKE_CUDA_ARCHITECTURES doesn't work
during detection, 2020-08-29) enabled CMAKE_CUDA_COMPILER_ID_REQUIRE_SUCCESS
if CMAKE_CUDA_ARCHITECTURES is specified. This results in
CMakeDetermineCompilerID.cmake printing the compiler error and our code for
presenting the mismatch in a user-friendly way being useless. The custom
logic seems preferable so go back to not enabling it.
* Commit 14d8a276 (CUDA: Support nvcc 11.5 new -arch=all|all-major flags,
2021-08-17) tried to support CMP0054 but forgot to add x to the interpolated
result. Thus the conditions would always evaluate to false. This is fixed as
a byproduct of removing NVIDIA specific checks, improving the error message
and replacing architectures_mode with a simpler architectures_explicit.
Visual Studio support omits testing the flags during detection due to
complexities in determining the toolkit version when using it.
A long-term proper implementation would be #23161.
Implements #22860.
The NVHPC packages bundle the CUDA math libraries in a sibling
directory (`math_libs`) instead of in with the rest of the
cuda libraries.
Depending on the NVHPC package the math_libs folder can have
versioned subdirectories, therefore we prefer finding the
same versions as the CUDA Toolkit and falling back to the
latest when not possible.
Before this a downstream code linking to `CUDA::cusparse_static` and
`CUDA::curand_static` would get a link line with `libcusparse_static.a`,
then `libculibos.a`, then `libcurand_static.a`. Use `IMPORTED_LOCATION`
to tell CMake about the proper dependency ordering where `libculibos.a`
comes last, because the other two libraries depend on `libculibos.a`.
Fixes: #22365
The original regular expression was greedy and would match any
environment variable ending with `TOP` (like `DESKTOP`). This is an
issue on windows where `nvcc -v` would output all environment variables
before the compiler's verbose output.
To resolve this issue we use a tighter match algorithm that looks
for `#$ TOP=` instead of `TOP=`.
Fixes: #22158
Since commit fb2afef620 (CUDA: Support nvcc symlinking to ccache,
2021-01-07) and commit 3cef91a321 (CUDA: Always extract CUDA Toolkit
root from nvcc verbose output, 2021-02-03) we always run the command
`nvcc -v __cmake_determine_cuda` to look for the toolkit root in its
stderr. On Windows, that command may print to stdout instead, so
capture that as well.
Fixes#21750, #21763
Given that NVCC can be provided by multiple different sources (NVIDIA HPC SDK, CUDA Toolkit, distro)
each of which has a different layout, we need to extract the CUDA toolkit root from the compiler
itself, allowing us to support numerious different scattered toolkit layouts.
The NVIDIA HPC SDK specifically ships two copies of nvcc one in
`compilers/bin/` and one in `cuda/bin`. Thus when using
`compilers/bin/nvcc` the Toolkit root logic fails.