llvm-project/flang/docs/RuntimeEnvironment.md
Sairudra More 111bafff9b [flang] Add runtime trampoline pool for W^X compliance (#183108)
Flang currently lowers internal procedures passed as actual arguments
using LLVM's `llvm.init.trampoline` / `llvm.adjust.trampoline`
intrinsics, which require an executable stack. On modern Linux
toolchains and security-hardened kernels that enforce W^X (Write XOR
Execute), this causes link-time failures (`ld.lld: error: ... requires
an executable stack`) or runtime `SEGV` from NX violations.

This patch introduces a runtime trampoline pool that allocates
trampolines from a dedicated `mmap`'d region instead of the stack. The
pool toggles page permissions between writable (for patching) and
executable (for dispatch), so the stack stays non-executable throughout.
On macOS, `MAP_JIT` and `pthread_jit_write_protect_np` are used for the
same effect. An i-cache flush (`__builtin___clear_cache` on Linux,
`sys_icache_invalidate` on macOS) is performed after each write→exec
transition.

The feature is gated behind a new driver flag, `-fsafe-trampoline` (off
by default), which threads through the frontend into the
`BoxedProcedurePass`. When enabled, the pass emits calls to
`_FortranATrampolineInit`, `_FortranATrampolineAdjust`, and
`_FortranATrampolineFree` instead of the legacy intrinsics. The legacy
path is completely untouched when the flag is off.

The pool is a singleton with a fixed capacity (default 1024 slots,
overridable via `FLANG_TRAMPOLINE_POOL_SIZE`). Slot size varies by
target (32 bytes on x86-64/AArch64, 48 on PPC64, 64 fallback). Each slot
holds a small architecture-specific stub, currently x86-64 (17 bytes,
using `r10` as the nest/static-chain register) and AArch64 (24 bytes,
using `x15`). The implementation compiles on all architectures but will
crash at runtime with a clear diagnostic if trampoline emission is
actually attempted on an unsupported target. This avoids breaking the
flang-rt build on e.g. RISC-V or PPC64.

Freed slots are poisoned (the callee pointer is overwritten with a
sentinel) and recycled into a freelist, so the pool can sustain
long-running programs that repeatedly create and destroy closures.
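
The recycling scheme described above can be modelled in a few lines. This is only an illustrative sketch, not the flang-rt implementation: the class name `SlotPool`, the sentinel value `POISON`, and the list-based storage are all hypothetical, and the real pool manages executable memory rather than Python integers.

```python
POISON = 0xDEADBEEF  # hypothetical sentinel; the real value is not stated here


class SlotPool:
    """Fixed-capacity pool whose freed slots are poisoned and recycled."""

    def __init__(self, capacity=1024):
        self.capacity = capacity
        self.callee = [POISON] * capacity  # per-slot callee "pointer"
        self.next_unused = 0               # first slot never handed out yet
        self.free_list = []                # indices of slots that were freed

    def allocate(self, callee):
        if self.free_list:                 # prefer recycling a freed slot
            slot = self.free_list.pop()
        elif self.next_unused < self.capacity:
            slot = self.next_unused
            self.next_unused += 1
        else:
            raise RuntimeError(
                f"trampoline pool exhausted (capacity {self.capacity})")
        self.callee[slot] = callee
        return slot

    def free(self, slot):
        self.callee[slot] = POISON         # poison so stale use is detectable
        self.free_list.append(slot)
```

Because a freed slot goes back on the freelist, a loop that creates and frees one closure per iteration never consumes more than one slot, regardless of how long it runs.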

A few design choices worth calling out:

The runtime avoids all C++ runtime dependencies: no `std::mutex`, no
`operator new`, and no function-local statics with hidden guard variables.
Locking is via flang-rt's own `Lock` / `CriticalSection`, memory is via
`AllocateMemoryOrCrash` / `FreeMemory`, and the singleton uses explicit
double-checked locking with a raw pointer. This was done so the
trampoline pool links cleanly in minimal / freestanding flang-rt
configurations.

`_FortranATrampolineFree` calls are inserted immediately before every
`func.return` in the enclosing host function. This is a conservative but
correct strategy. The trampoline handle cannot outlive the host's stack
frame since the closure captures the host's local variables by
reference.

The GNU_STACK note is verified via a dedicated integration test
(`safe-trampoline-gnustack.f90`) that compiles and links a Fortran
program using the runtime path, then inspects the ELF with
`llvm-readelf` to confirm the stack segment is `RW` (not `RWE`).

**Test coverage:**

- `flang/test/Driver/fsafe-trampoline.f90` — flag forwarding (on, off,
default)
- `flang/test/Fir/boxproc-safe-trampoline.fir` — FIR-level FileCheck for
emitted runtime calls
- `flang/test/Lower/safe-trampoline.f90` — end-to-end lowering
- `flang-rt/test/Driver/safe-trampoline-gnustack.f90` — GNU_STACK ELF
verification

Closes #182813

Co-authored-by: Sairudra More <moresair@pe31.hpc.amslabs.hpecorp.net>
2026-03-10 16:16:05 +05:30


Environment variables of significance to Fortran execution

A few environment variables are queried by the Fortran runtime support library.

The following environment variables can affect the behavior of Fortran programs during execution.

DEFAULT_UTF8=1

Set DEFAULT_UTF8 to make formatted external input assume UTF-8 encoding and formatted external output use UTF-8 encoding.

FORT_CONVERT

Determines data conversions applied to unformatted I/O.

  • NATIVE: no conversions (default)
  • LITTLE_ENDIAN: assume input is little-endian; emit little-endian output
  • BIG_ENDIAN: assume input is big-endian; emit big-endian output
  • SWAP: reverse endianness (always convert)
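
To see why the conversion matters, the following sketch uses Python's standard `struct` module to show how the same 4-byte integer is laid out in each byte order, and what happens when the bytes are read with the wrong assumption. It illustrates byte order only; it does not model Fortran unformatted record framing or the runtime's actual conversion code.

```python
import struct

value = 1

# The same 32-bit integer in the two byte orders.
little = struct.pack("<i", value)  # bytes 01 00 00 00
big = struct.pack(">i", value)     # bytes 00 00 00 01

# Reading big-endian data as if it were little-endian yields a wrong value.
misread = struct.unpack("<i", big)[0]

# Reversing the bytes of the datum recovers the original value.
swapped = struct.unpack("<i", big[::-1])[0]

print(little.hex(), big.hex(), misread, swapped)
```

Here `misread` is 16777216 (0x01000000) while `swapped` is 1, which is the kind of correction a byte-order conversion applies to each datum.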

FORT_CHECK_POINTER_DEALLOCATION

Fortran requires that a pointer that appears in a DEALLOCATE statement must have been allocated in an ALLOCATE statement with the same declared type. The runtime support library validates this requirement by checking the size of the allocated data, and will fail with an error message if the deallocated pointer is not valid. Set FORT_CHECK_POINTER_DEALLOCATION=0 to disable this check.

FORT_FMT_RECL

Set to an integer value to specify the record length for list-directed and NAMELIST output. The default is 72.

NO_STOP_MESSAGE

Set NO_STOP_MESSAGE=1 to disable the extra information about IEEE floating-point exception flags that the Fortran language standard requires for STOP and ERROR STOP statements.

FORT_TRUNCATE_STREAM

Set FORT_TRUNCATE_STREAM=1 to make output to a formatted unit with ACCESS="STREAM" truncate the file when the unit has been repositioned via POS= to an earlier point in the file. This behavior is analogous to the implicit writing of an ENDFILE record when output takes place to a sequential unit after executing a BACKSPACE or REWIND statement. Truncation of a stream-access unit is common to several other compilers, but it is not mentioned in the standard.
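
The difference between the two behaviors can be imitated with ordinary file operations. The sketch below is an analogy in Python, not Flang runtime code; the helper name `rewrite_at` is hypothetical, and Python offsets are 0-based whereas Fortran's POS= counts from 1.

```python
import os
import tempfile


def rewrite_at(path, pos, data, truncate):
    """Overwrite bytes at offset pos, optionally truncating after the write."""
    with open(path, "r+b") as f:
        f.seek(pos)
        f.write(data)
        if truncate:
            f.truncate()  # cut the file at the current position
    with open(path, "rb") as f:
        return f.read()


# Start from a 10-byte file and write 2 bytes at offset 4.
path = os.path.join(tempfile.mkdtemp(), "stream.dat")
with open(path, "wb") as f:
    f.write(b"0123456789")
kept = rewrite_at(path, 4, b"AB", truncate=False)  # old tail survives

with open(path, "wb") as f:
    f.write(b"0123456789")
cut = rewrite_at(path, 4, b"AB", truncate=True)    # old tail is dropped
```

Without truncation the file still holds the old trailing bytes (`0123AB6789`); with truncation it ends right after the new data (`0123AB`).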

FORT_NO_EMPTY_ALLOCATION

Set FORT_NO_EMPTY_ALLOCATION=1 to cause ALLOCATE statements to fail when the requested allocation is empty (zero size).

FLANG_TRAMPOLINE_POOL_SIZE

Set FLANG_TRAMPOLINE_POOL_SIZE to an integer value to control the maximum number of runtime trampoline slots available when -fsafe-trampoline is enabled. Each slot consists of a small executable code stub (size varies by target; e.g. 32 bytes on x86-64 and AArch64) backed by a writable data entry. The default is 1024 slots, which is sufficient for typical Fortran programs. If more internal-procedure closures are alive simultaneously than the pool can hold, the runtime terminates with a diagnostic message that includes the current pool capacity.

Example: export FLANG_TRAMPOLINE_POOL_SIZE=4096
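
As a rough guide to choosing a value, the sketch below reads the variable with a fallback to the documented default and estimates the size of the code-stub region. The function names are hypothetical, and the handling of malformed or non-positive values (falling back to the default) is an assumption; the runtime's actual behavior for such input is not described here. The estimate covers only the executable stubs at the 32-byte slot size quoted for x86-64 and AArch64, not the writable data entries.

```python
import os

DEFAULT_POOL_SIZE = 1024  # documented default number of slots


def pool_size_from_env(environ=os.environ):
    """Return the requested slot count, or the default if unset or unusable."""
    raw = environ.get("FLANG_TRAMPOLINE_POOL_SIZE")
    if raw is None:
        return DEFAULT_POOL_SIZE
    try:
        slots = int(raw)
    except ValueError:
        return DEFAULT_POOL_SIZE  # assumption: malformed values are ignored
    # Assumption: non-positive values also fall back to the default.
    return slots if slots > 0 else DEFAULT_POOL_SIZE


def stub_region_bytes(slots, slot_size=32):
    """Approximate size of the executable stub region only."""
    return slots * slot_size


slots = pool_size_from_env({"FLANG_TRAMPOLINE_POOL_SIZE": "4096"})
print(slots, stub_region_bytes(slots))
```

With 4096 slots the stub region comes to 131072 bytes (128 KiB) at 32 bytes per slot, so raising the limit is cheap in memory terms.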