Run the `clang-format.bash` script to update all our C and C++ code to a
new style defined by `.clang-format`, now with "east const" enforcement.
Use `clang-format` version 18.
* If you reached this commit for a line in `git blame`, re-run the blame
operation starting at the parent of this commit to see older history
for the content.
* See the parent commit for instructions to rebase a change across this
style transition commit.
Issue: #26123
Architecture 30 was removed with CUDA 11, so most of the CUDA tests fail with
it.
Remove setting the architecture and bump the minimum version to 3.18, so
CMP0104 takes effect and we can rely on the default architecture, which is
guaranteed to be compilable.
Use of __ldg() in ProperLinkFlags was removed as it only affects performance
and is available only on sm_35 and above.
Testing the functionality of CUDA_ARCHITECTURES is already covered by
CudaOnly.Architecture and CudaOnly.CompileFlags.
Fixes#17559
Replace our hard-coded default of cudart=static with a first-class abstraction to select the runtime library from an enumeration of logical names.
Run the `clang-format.bash` script to update all our C and C++ code to a
new style defined by `.clang-format`. Use `clang-format` version 6.0.
* If you reached this commit for a line in `git blame`, re-run the blame
operation starting at the parent of this commit to see older history
for the content.
* See the parent commit for instructions to rebase a change across this
style transition commit.
As kernel launches are asynchronous, a `cudaGetLastError()` right after
the kernel launch might be executed while the kernel is still running.
Synchronizing the device will ensure that all the work is completed
before progressing further on, and allows to catch errors that were
previously missed.
The `cudaGetLastError()` after the `cudaDeviceSynchronize()` is there
to reset the error variable to `cudaSuccess`.