llvm-project/libc/startup/gpu/start.cpp
Joseph Huber 049cfda67c [LLVM] Port 'llvm-gpu-loader' to use LLVMOffload (#162739)
Summary:
This patch rewrites the `llvm-gpu-loader` utility to use the LLVMOffload
interface. This heavily simplifies it while re-using the already
existing support. Another benefit is that I can now easily do this
dynamically so we can always build this utility without needing to find
non-standard packages.

One issue is mentioned in
https://github.com/llvm/llvm-project/issues/159636 where this will now
take extra time if you have both CUDA and ROCm installed on the same
machine. This is only slightly annoying since most people don't have both
at the same time, so I don't consider it a blocker. I will work later to
address it.

The environment variable usage is slightly unfortunate; I will also expose
that better in the future.

Fixes: https://github.com/llvm/llvm-project/issues/132890
2026-02-24 08:44:29 -06:00

//===-- Implementation of crt for gpu -------------------------------------===//
//
// Part of the LLVM Project, under the Apache License v2.0 with LLVM Exceptions.
// See https://llvm.org/LICENSE.txt for license information.
// SPDX-License-Identifier: Apache-2.0 WITH LLVM-exception
//
//===----------------------------------------------------------------------===//
#include "config/gpu/app.h"
#include "src/__support/GPU/utils.h"
#include "src/__support/RPC/rpc_client.h"
#include "src/__support/macros/config.h"
#include "src/stdlib/atexit.h"
#include "src/stdlib/exit.h"
extern "C" int main(int argc, char **argv, char **envp);
extern "C" void __cxa_finalize(void *dso);
namespace LIBC_NAMESPACE_DECL {
DataEnvironment app;
} // namespace LIBC_NAMESPACE_DECL
extern "C" [[gnu::visibility("protected"), clang::device_kernel]]
void _begin(int, char **, char **env) {
  // The LLVM offloading runtime will automatically call any present global
  // constructors and destructors so we defer that handling.
  __atomic_store_n(&LIBC_NAMESPACE::app.env_ptr,
                   reinterpret_cast<uintptr_t *>(env), __ATOMIC_RELAXED);
}
extern "C" [[gnu::visibility("protected"), clang::device_kernel]] void
_start(int argc, char **argv, char **envp, int *ret) {
  // Invoke the 'main' function with every active thread that the user launched
  // the _start kernel with.
  __atomic_fetch_or(ret, main(argc, argv, envp), __ATOMIC_RELAXED);
}
extern "C" [[gnu::visibility("protected"), clang::device_kernel]]
void _end() {
  // Only a single thread should call the destructors registered with 'atexit'.
  // The loader utility will handle the actual exit and return code cleanly.
  __cxa_finalize(nullptr);
}
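
The `_start` kernel above merges each thread's `main` result into `*ret` with a relaxed atomic OR. The following is a minimal host-side sketch of that merging step only, not part of this file: `combine_exit_codes` is a hypothetical helper, `std::thread` stands in for GPU threads, and `std::atomic<int>` stands in for the `int *ret` kernel argument. It illustrates that the combined value is nonzero if any thread returns nonzero, and that distinct codes are merged bitwise rather than preserved individually.

```cpp
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical host-side stand-in for the '_start' kernel: each "GPU thread"
// is modelled by a std::thread that ORs its own result into a shared slot.
int combine_exit_codes(const std::vector<int> &per_thread_results) {
  std::atomic<int> ret{0}; // Plays the role of the 'int *ret' kernel argument.
  std::vector<std::thread> threads;
  threads.reserve(per_thread_results.size());
  for (int result : per_thread_results)
    threads.emplace_back([&ret, result] {
      // Mirrors __atomic_fetch_or(ret, main(...), __ATOMIC_RELAXED).
      ret.fetch_or(result, std::memory_order_relaxed);
    });
  for (std::thread &t : threads)
    t.join();
  return ret.load(std::memory_order_relaxed);
}
```

For example, `combine_exit_codes({0, 1, 0})` yields `1`, while `combine_exit_codes({1, 2})` yields `3`, so a caller can rely on zero versus nonzero but not on recovering any single thread's exact code.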