When cross-compiling, we pass the sysroot.
We need to try various architecture flags. Clang doesn't automatically
select one that works. First try the ones that are more likely to work
for modern installations:
* <=sm_50 has been deprecated since CUDA 10.2, so try sm_52 first for
  future compatibility.
* <=sm_20 was removed in CUDA 9.0, so try sm_30 next.
Otherwise, fall back to Clang's current default. Currently that's `sm_20`,
the lowest it supports.
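
The probing order described above can be sketched roughly as follows.
This is an illustration only, not the code added by this commit: the
helper `sketch_try_cuda_flags`, the probe file name, and the use of
`execute_process` are assumptions, and `CMAKE_CUDA_COMPILER` is assumed
to point at a Clang that can find a CUDA toolkit. `--cuda-gpu-arch` is
Clang's flag for selecting the GPU architecture.

```cmake
# Hypothetical helper: sets <out_var> to TRUE if the compiler accepts a
# trivial CUDA source with the given extra flags (passed via ARGN).
function(sketch_try_cuda_flags out_var)
  set(src "${CMAKE_CURRENT_BINARY_DIR}/arch_probe.cu")
  file(WRITE "${src}" "__global__ void probe() {}\n")
  execute_process(
    COMMAND "${CMAKE_CUDA_COMPILER}" ${ARGN} -c "${src}" -o "${src}.o"
    RESULT_VARIABLE result
    OUTPUT_QUIET ERROR_QUIET)
  if(result EQUAL 0)
    set(${out_var} TRUE PARENT_SCOPE)
  else()
    set(${out_var} FALSE PARENT_SCOPE)
  endif()
endfunction()

# Try the architectures most likely to work on modern installations.
set(chosen_arch_flag "")
foreach(arch IN ITEMS sm_52 sm_30)
  sketch_try_cuda_flags(works "--cuda-gpu-arch=${arch}")
  if(works)
    set(chosen_arch_flag "--cuda-gpu-arch=${arch}")
    break()
  endif()
endforeach()

if(chosen_arch_flag STREQUAL "")
  # No explicit flag: let Clang use its own default architecture.
  sketch_try_cuda_flags(works)
endif()
```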
Separable compilation isn't supported yet.
Fixes: #16586
The special case added by commit 87df637078 (CUDA: Do not treat CUDA
toolkit include directories as implicit, 2020-02-02, v3.17.0-rc1~31^2)
breaks CMake's protections against changing the compiler's implicit
include directory order. Do this only for the NVIDIA compiler where it
is needed as a workaround to another problem. That compiler does not
put the host compiler's implicit include directories in `-I` paths so we
do not detect them as `CMAKE_CUDA_TOOLKIT_INCLUDE_DIRECTORIES` anyway.
CMake properly detects the toolkit directories as implicit system
includes, but CUDA compilers don't add explicit `-isystem` markups to
these directories when compiling CUDA code. Due to this limitation,
allow users to explicitly specify these directories as SYSTEM dirs.
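
As a sketch of the usage this enables, a project could mark the toolkit
directories as SYSTEM includes itself. The target name `my_kernels` and
the source `kernels.cu` are placeholders, and the minimum version shown
is an assumption:

```cmake
cmake_minimum_required(VERSION 3.17)
project(system_includes_sketch LANGUAGES CUDA)

# Placeholder target and source file.
add_library(my_kernels STATIC kernels.cu)

# Explicitly treat the CUDA toolkit headers as system includes.
target_include_directories(my_kernels SYSTEM PRIVATE
  ${CMAKE_CUDA_TOOLKIT_INCLUDE_DIRECTORIES})
```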
Fixes: #16464, #19864
Fixes: #17559
Replace our hard-coded default of cudart=static with a first-class
abstraction to select the runtime library from an enumeration of logical
names.
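
For illustration, released CMake (3.17 and later) exposes a runtime
library selection through the `CMAKE_CUDA_RUNTIME_LIBRARY` variable and
the `CUDA_RUNTIME_LIBRARY` target property. The message above does not
name them, so treat the mapping to this commit as an assumption. The
target `my_app` and source `main.cu` are placeholders:

```cmake
cmake_minimum_required(VERSION 3.17)
project(runtime_sketch LANGUAGES CUDA)

# Default for targets created after this point.
set(CMAKE_CUDA_RUNTIME_LIBRARY Shared)

# Placeholder target and source file.
add_executable(my_app main.cu)

# Per-target override; the documented values are None, Shared, and Static.
set_property(TARGET my_app PROPERTY CUDA_RUNTIME_LIBRARY Static)
```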
When we report that a compiler was unable to build a simple test
program, indent the output of the attempt so that our message formatting
will show it as a pre-formatted block.
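
A minimal sketch of such indentation, assuming the captured output is
held in a variable. The variable names and the sample text are
hypothetical, and this is not necessarily the exact code of the change:

```cmake
# Placeholder for the captured output of the failed attempt.
set(output "first line of output\nsecond line of output")

# Indent the first line and every line that follows a newline.
string(REPLACE "\n" "\n  " indented "  ${output}")

# Indented lines are shown as a pre-formatted block in the message.
message(WARNING "The compiler failed with the following output:\n${indented}")
```

The snippet can be run on its own with `cmake -P`.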