Commit Graph

578498 Commits

Author SHA1 Message Date
Paulius Velesko
264ac2d3af [HIP][MacOS] Mach-O support and Darwin toolchain fixes (#183991)
This PR adds support for HIP on macOS: Mach-O section naming, Darwin
host toolchain initialization guards, and HIPSPV behavior when Darwin is
the host.

This has been verified using chipStar on macOS via the PoCL OpenCL
implementation.

## Uninitialized target workaround
Darwin’s toolchain is only initialized when its own TranslateArgs runs.
For HIP/CUDA device jobs, Darwin is used as the HostTC and never gets
its args translated, so its target stays uninitialized. The new checks
avoid asserting on that uninitialized state. A better long-term fix is
to initialize Darwin earlier (see the FIXME in Driver.cpp
BuildJobsForAction).

- [ ] Initialize Darwin toolchain during construction instead of lazily
in TranslateArgs. See Driver.cpp BuildJobsForAction FIXME.

- [x] In Darwin’s addClangTargetOptions, skip host-stdlib flags when
DeviceOffloadKind != OFK_None so HIPSPV can safely delegate to the host.
2026-04-28 12:43:59 -05:00
Amina Chabane
dddd0da8e6 [BOLT][AArch64] Refuse to run IndirectCallPromotion pass (#194363)
`--icp=<value>`/`--indirect-call-promotion=<value>` results in an
`UNIMPLEMENTED` crash when invoked as it is unimplemented in AArch64.

- Guard IndirectCallPromotion for non-X86
- Update unsupported-passes.test with expected error
2026-04-28 18:39:15 +01:00
Benjamin Luke
9e0057bdc4 [clang] [fixit] Properly apply warning options during fixit-recompile (#190280)
Fixes https://github.com/llvm/llvm-project/issues/18707

During fixit recompile, the frontend was not reapplying command-line
diagnostic options, so the second pass could lose -Wno-* suppressions
and other warning configuration.

Added regression test to make sure that diagnostic options are properly
applied in the fixit-recompile path.
2026-04-28 10:38:16 -07:00
Grigory Pastukhov
b40c1d511b [LLVM] Fix use-after-free in AlwaysInliner flatten worklist (#194485)
Functions with both `alwaysinline` and `flatten` attributes were
collected into the `NeedFlattening` worklist, then erased during
always-inline processing, leaving dangling pointers. Fix by collecting
flatten functions after the always-inline loop, and eliminate the
separate worklist by iterating the module directly.
2026-04-28 10:35:29 -07:00
Hans Wennborg
cbb012fa03 [Support] Mark string-returning sys::path::native nodiscard (#194675)
To make it clear that it doesn't modify the path in place like the other
overloads. Follow-up to #193228
2026-04-28 17:29:24 +00:00
Eugene Epshteyn
dc1d85c055 [flang][NFC] Converted five tests from old lowering to new lowering (part 52) (#194525)
Converted Lower/user-defined-operators.f90,
Lower/variable-inquiries.f90, Lower/where-allocatable-assignments.f90,
Lower/where.f90, and Transforms/constant-argument-globalisation.fir from
legacy lowering (-hlfir=false / -flang-deprecated-no-hlfir) to new
lowering (-emit-hlfir or no flag for FIR-input tests).
2026-04-28 13:21:43 -04:00
Stephen Tozer
88b9b2533c [LLVM] Disable IO sandbox in symbolizeAddresses (#194597)
The function `symbolizeAddresses` is used by debugify to symbolize
addresses captured in the current invocation of LLVM, which it does by
executing llvm-symbolizer with temporary input and output files.
Creating the temporary files has an explicit sandbox exclusion, as
temporary files are necessarily not part of the compiler's formal
output, but attempting to read back the output file via MemoryBuffer
triggers a sandbox violation. Since we are always only operating on
temporary files within symbolizeAddresses, this patch disables the IO
sandbox in that function.
2026-04-28 18:15:52 +01:00
Hongyu Chen
ff6269d116 [BDCE] Avoid replacement of self-referential instructions (#194614)
Fixes #194564.
2026-04-29 01:12:17 +08:00
Krzysztof Parzyszek
928f70d38e [TableGen] Emit constexpr versions of some directive/clause functions (#194633)
A variant of https://github.com/llvm/llvm-project/pull/176253 with a
change to reduce compile-time impact.

Since "llvm_unreachable" is actually allowed in constexpr functions,
simply emit the bodies of the selected functions in the header file.

In the previous PR the `isAllowedClauseForDirective` function was made
constexpr, but since it was very long it had a significant impact on
compilation time. In this PR that function is no longer constexpr.
2026-04-28 12:11:21 -05:00
Jonas Paulsson
de6af1ff34 [SystemZ] Improved testing for memcpy/memmove/memset. (#194682)
This is a pre-commit for #187100.
2026-04-28 19:09:09 +02:00
Andy Kaylor
8ea2b587c0 [CIR] Avoid duplicate name collisions in LoweringPrepare (#194469)
This fixes a bug in the CIR LoweringPrepare pass where we were creating
multiple constant initializer global values with the same name, causing
references to them (specifically cir.get_global) to get the wrong value.

Assisted-by: Cursor / claude-4.7-opus-xhigh
2026-04-28 09:50:19 -07:00
Abid Qadeer
ba6861c2bc [OpenMPIRBuilder] Cast device num_threads to i32 for __kmpc_parallel_60 (#194634)
I observed a crash in device OpenMP lowering when compiling with
`-fdefault-integer-8`. In `targetParallelCallback`, `NumThreads` can be
`i64`, but `__kmpc_parallel_60` expects an `i32` `num_threads`
parameter, which caused a bad-signature assertion during call creation.

The fix is to use `CreateZExtOrTrunc(..., Int32)` for the `num_threads`
argument before building the runtime call. This matches the handling
used in clang in `CGOpenMPRuntimeGPU::emitParallelCall`.

The problem can be seen with the following testcase when compiled with
`flang -fopenmp --offload-arch=gfx90a test.f90 -fdefault-integer-8`:

```
program test
  implicit none
  integer :: nthreads
  integer :: i
  nthreads = 137
  !$omp target teams distribute parallel do num_threads(nthreads)
  do i = 1, 1
  end do
  !$omp end target teams distribute parallel do
end program test
```
2026-04-28 17:46:19 +01:00
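The width adjustment described in the commit above can be modelled on plain integers. This is only a sketch of what a zero-extend-or-truncate conversion does to a bit pattern, not the OpenMPIRBuilder code; the helper name `zext_or_trunc` is hypothetical.

```python
def zext_or_trunc(value, src_bits, dst_bits):
    """Model zero-extension or truncation of an unsigned bit pattern.

    Hypothetical helper: value is read as a src_bits-wide pattern and
    returned as a dst_bits-wide pattern.
    """
    value &= (1 << src_bits) - 1          # keep only the source width
    return value & ((1 << dst_bits) - 1)  # truncation drops the high bits;
                                          # zero-extension leaves the pattern as is

# An i64 num_threads of 137 still reads as 137 once narrowed to i32.
narrowed = zext_or_trunc(137, 64, 32)
# Bits above the destination width are discarded when truncating.
wrapped = zext_or_trunc((1 << 32) + 5, 64, 32)
# Widening an i8 pattern to i32 pads with zeros.
widened = zext_or_trunc(0xFF, 8, 32)
```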
Zhen Wang
4d676e56f0 [flang][cuda] Preserve fir.rebox captured by cuf.kernel via CUDAKernelOpInterface (#193890)
Reland of #193837 (reverted in #193855), now using a marker op interface
to avoid the link cycle that broke `BUILD_SHARED_LIBS=ON` builds.

`SimplifyArrayCoorOp` folded `fir.rebox` into `fir.array_coor` across a
`cuf.kernel` boundary. CUF lowering needs the captured rebox to
materialize a managed-memory descriptor for the kernel; folding it away
makes the kernel dereference the host-side descriptor and crash with
`cudaErrorIllegalAddress`.

Fix is to add `fir::CUDAKernelOpInterface`, a marker op interface
defined in FIRDialect and implemented by `cuf.kernel`. The
canonicalization guard queries the interface, so the `TypeIDResolver`
symbol lives in `libFIRDialect.so` and no `FIR -> CUF` link edge is
introduced.
2026-04-28 09:08:13 -07:00
Amit Tiwari
3c14034c55 [Flang][OpenMP] Validate omp_initial_device omp_invalid_device as device IDs (#193669)
As per OpenMP 5.2/6.0 the below are valid device values in a `#pragma
omp target` directive:

omp_initial_device (-1) -> refers to the host CPU.
omp_invalid_device (-2) -> an intentionally invalid device, used to
trigger a runtime error.

For the 2 values discussed above flang fails with:

```
error: The device expression of the DEVICE clause must be a positive integer expression
      !$OMP TARGET DEVICE(-1)
error: Must have INTEGER type, but is REAL(4)
      !$OMP TARGET DEVICE(OMP_INVALID_DEVICE)

```
Issue: https://github.com/llvm/llvm-project/issues/192989
2026-04-28 21:31:04 +05:30
Petr Kurapov
5320bda232 [AMDGPU] Enable lane masks tracking in coexec scheduler. (#194578)
Prevents the scheduler from silently producing invalid IR.
2026-04-28 17:50:29 +02:00
Louis Dionne
d7ed6d8c9f [libc++] Improvements to the benchmark runners (#194659)
- Run the ref workloads on SPEC
- Record the SPEC version in the machine info
- Allow filtering which benchmarks are run in run-benchbot
2026-04-28 11:50:06 -04:00
Ryosuke Niwa
ec0675591f Add the support for adoptCFNullable/adoptNSNullable (#194539)
These are two new "adopt" functions to be introduced in WebKit.
2026-04-28 08:48:16 -07:00
Ramkumar Ramachandra
251ed1eb84 [VPlan] Optz WideCanIV with SIVSteps over CanIV (#191276)
Replace WideCanonicalIV with a ScalarIVSteps over the CanonicalIV when
only the first lane is used. This is a preparatory step in enabling
expansion of WideCanonicalIV into executable recipes.
2026-04-28 15:46:48 +00:00
Benjamin Stott
7ceac4b1af [Lit] Open sub-processes with text=False (#194577)
This PR is part of a series of patches upgrading Lit's in-process
built-ins to be able to run with piped input/output and full redirection
support, and to allow custom in-process built-ins to be provided via the
Lit config. The remaining patches to Lit's test runner can be found here:
https://github.com/BStott6/llvm-project/compare/lit-inproc-builtins.

This is part of the Lit daemonized testing project:
https://discourse.llvm.org/t/88612

This PR makes Lit open all sub-processes with `text=False`, so that the
Python code will be able to read and write binary data to and from their
IO streams. This currently causes no functional change, as when Lit
reads output from the sub-processes, it already handles the case that
the read output is `bytes` by decoding it, but we will need to be able
to read binary data from a sub-process's STDOUT if its output, which may
be binary, is piped into an in-process built-in, and we will need to be
able to write binary data to a sub-process's STDIN if its input is
piped from an in-process built-in.

I have made sure that on Windows, when a sub-process invoked by Lit has
its output redirected to a file by Lit, the `\n -> \r\n` conversion is
performed as usual when writing to the file from the process - this
change only affects how the Python code interacts with the streams.
2026-04-28 16:36:28 +01:00
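The `text=False` behaviour described in the commit above can be seen with the standard `subprocess` module alone. The sketch below is not Lit's code; `run_binary` and `to_text` are hypothetical helpers that show bytes passing through the pipes unchanged and being decoded only when text is wanted.

```python
import subprocess
import sys

def run_binary(cmd, stdin_bytes=b""):
    """Run cmd with binary pipes and return (stdout, stderr) as bytes."""
    proc = subprocess.Popen(
        cmd,
        stdin=subprocess.PIPE,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        text=False,  # the pipes carry bytes, not str
    )
    return proc.communicate(stdin_bytes)

def to_text(data):
    """Decode bytes for display, leaving str untouched."""
    if isinstance(data, bytes):
        return data.decode("utf-8", errors="replace")
    return data

# A child process that copies its stdin to its stdout byte for byte.
child = [
    sys.executable,
    "-c",
    "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())",
]
out, err = run_binary(child, b"\x00\xffraw")
```

Because the payload contains bytes that are not valid UTF-8, the same round trip would not be reliable through text-mode pipes.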
Michael Klemm
40ad10a8ae [Flang] Fix -Wopen-mp-* and -Wopen-acc-* flag spellings (#188434)
The CamelCase-to-hyphenated conversion was incorrectly splitting
"OpenMP" and "OpenACC" into "open-mp" and "open-acc", producing wrong -W
flag names like -Wopen-mp-usage instead of -Wopenmp-usage. Fix the
conversion to treat these as compound names, keep the old spellings as
deprecated aliases, and emit a warning when deprecated spellings are
used.

---------

Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
2026-04-28 17:26:29 +02:00
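The spelling problem in the commit above can be reproduced with a small converter. This is a sketch, not Flang's implementation: the splitting regex, the `COMPOUND_NAMES` tuple, and the warning name `OpenMPUsage` are assumptions chosen to show why an acronym-aware splitter yields `open-mp` and how treating the names as compounds avoids it.

```python
import re

# Hypothetical list of names that must stay as one word.
COMPOUND_NAMES = ("OpenMP", "OpenACC")

# Split between a lowercase letter or digit and an uppercase letter, and
# between an acronym and the capitalized word that follows it.
_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])")

def camel_to_hyphenated(name, compounds=COMPOUND_NAMES):
    # Rewrite each compound as a single capitalized word ("OpenMP" becomes
    # "Openmp") so the splitter finds no boundary inside it.
    for compound in compounds:
        name = name.replace(compound, compound.capitalize())
    return _BOUNDARY.sub("-", name).lower()

fixed = camel_to_hyphenated("OpenMPUsage")
# Passing no compounds reproduces the old, now deprecated, spelling.
deprecated = camel_to_hyphenated("OpenMPUsage", compounds=())
```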
Ryan Buchner
eecec864d1 [SLP] Add tests for boundary case with MinProfitableStridedOps (#194507)
Currently we don't vectorize runtime strided loads when `VF == MinProfitableStridedOps`.
2026-04-28 08:19:57 -07:00
Luke Hutton
4c6ae8f5be [mlir][tosa] Verify the output shape of tosa.mul and tosa.rescale (#193952)
Verifying the provided output shape against an expected shape helps
diagnose issues on op construction.
2026-04-28 16:11:04 +01:00
Benjamin Stott
d769ce2176 [Lit] Change processRedirects to open all files in binary mode (#194368)
This PR is the second in a series of patches upgrading Lit's in-process
built-ins to be able to run with piped input/output and full redirection
support, and to allow custom in-process built-ins to be provided via the
Lit config. The remaining patches to Lit's test runner can be found here:
https://github.com/BStott6/llvm-project/compare/lit-inproc-builtins.

This is part of the Lit daemonized testing project:
https://discourse.llvm.org/t/88612.

This PR makes Lit's `processRedirects` function open all input/output
files in binary mode. This makes sure that in-process builtins have the
expected behaviour when reading and writing from them:

Newline translation is not required for any of the current in-process
built-ins; in fact, the in-process built-in for `echo`, which is the
only one that writes to `stdout`, explicitly re-opens the output file
with `newline=""` on Windows, to avoid newline translation. Also,
in-process builtins will eventually need to be able to read or write
binary data: for example, `opt` without `-S` running in daemon mode.

I believe this has no functional change for regular process invocations;
I have confirmed that programs invoked by Lit which write to files
opened in binary mode by Lit still have the newline translation
performed as normal on Windows, unless they change the mode of their
output stream themselves.
2026-04-28 15:50:05 +01:00
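What binary-mode redirection means for the files can be shown with Python's built-in `open`. The sketch below is not `processRedirects` itself; `write_redirected` and `read_redirected` are hypothetical helpers showing that a file opened with `"wb"`/`"rb"` stores and returns the payload without newline translation.

```python
import os
import tempfile

def write_redirected(path, payload):
    # "wb" disables newline translation, so the bytes land on disk unchanged.
    with open(path, "wb") as f:
        f.write(payload)

def read_redirected(path):
    # "rb" returns the stored bytes without decoding or newline handling.
    with open(path, "rb") as f:
        return f.read()

with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "out.bin")
    write_redirected(target, b"line\n\x00\x80")
    round_trip = read_redirected(target)
```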
Daniel Hernandez-Juarez
5d48370b26 [mlir][memref] Pass TypeConverter to ConvertMemrefStore (#194356)
Commit 20b925a28a dropped the TypeConverter from ConvertMemrefStore
when adding the disableAtomicRMW flag. Restore it.
2026-04-28 16:44:30 +02:00
Charles Zablit
a2f9da54cb [lldb][windows] fix race condition in ConPTY on process exit (#194631)
2026-04-28 15:39:47 +01:00
Srividya Sundaram
0052113fb6 [SYCL][Driver] Set -std=c++17 as default for SYCL compilations (#194014)
This PR ensures SYCL compilations default to C++17 when no explicit
standard is specified, and validates that user-provided standards meet
SYCL's C++17 minimum requirement. It also fixes Windows MSVC compilation
by enabling -fms-extensions for SYCL device code.
2026-04-28 14:39:26 +00:00
Nico Weber
d089e7397e [gn build] Port f5b6e4fc20 (#194645)
2026-04-28 10:37:55 -04:00
Andrzej Warzyński
bfaab0ec1d [clang][cir][nfc] Add missing comment (#194644)
2026-04-28 15:37:29 +01:00
Krzysztof Parzyszek
268bac6c25 [flang][OpenMP] Move implementation detail from header to source, NFC (#194638)
2026-04-28 09:29:31 -05:00
Adel Ejjeh
5b82a26451 [AMDGPU][NFC] Remove redundant Args.size() assertions from AMDGPUMCExpr (#194488)
Remove redundant `Args.size()` assertions from `AMDGPUMCExpr` evaluate
functions (`evaluateExtraSGPRs`, `evaluateTotalNumVGPR`,
`evaluateAlignTo`, `evaluateOccupancy`).

These assertions are redundant with the `zip_equal` size checking
performed in the `evaluateMCExprs` helper function introduced in
#193859.

---

*This PR was developed with AI assistance (GitHub Copilot).*
2026-04-28 09:27:22 -05:00
Stephan T. Lavavej
77cfc55ed4 [MLIR] Update minimatch dependency in VSCode plugin, resolving security alerts (#188613)
2026-04-28 07:12:59 -07:00
Alexey Bataev
61d795c797 [SLP][NFC]Cache IsExternallyUsed by Value in cost computation
Same V is commonly seen in multiple TEs (shared scalars), and the
expensive part of IsExternallyUsed walks V->users() with multiple
match() pattern checks plus per-user getTreeEntries lookups - all
V-only-dependent. Split out the V-dependent body and memoize by
Value pointer, leaving the TE-specific copyable check at the call
site. DeletedNodes is read-only during the cost loop, so caching
is safe.

Reviewers: 

Pull Request: https://github.com/llvm/llvm-project/pull/194637
2026-04-28 10:06:41 -04:00
Charles Zablit
b0f3cd1020 [lldb][windows] fix a race condition in IO reader thread (#194422)
2026-04-28 14:57:56 +01:00
Simon Pilgrim
da0455adab [PhaseOrdering][X86] vector-reductions-expanded.ll - use passes list instead of piped opt stages (#194608)
Cleanup to make it easier to regenerate checks for #194473
2026-04-28 13:52:29 +00:00
Anshul Nigham
e6e9e1f528 [Docs] Fixes indents for InstrRefDebugInfo and KeyInstructionsDebugInfo (#194532)
This distinguishes the doc title from the headers.

Fixes navigation indents for Furo theme update (see
https://github.com/llvm/llvm-project/pull/184440).
2026-04-28 06:44:47 -07:00
Felipe de Azevedo Piovezan
74781cf395 [lldb] Disable gdbremote test on windows (#194627)
This is causing bot failures.
2026-04-28 14:40:22 +01:00
Steven Perron
ca27dc2933 [SPIR-V] Matrix in struct pointer legalization (#193073)
When looking to load an object at the start of a struct, the types do
not always match exactly. When we have an HLSL matrix the type in the
load will not match the type in memory. We need to improve the pointer
legalization pass to look for any "compatible" type at the start of an
aggregate.

Two types are compatible if the pass knows how to convert from one to
the other.

This involves a refactoring of the code to make the check more general.

Assisted-by: Gemini
2026-04-28 09:36:55 -04:00
Kai Nacke
e459ce5077 Revert "[PowerPC] Enable using HwMode for instructions (#191051)" (#194464)
This reverts commit 2a83068537.

It causes test suite failures in the 7zip benchmark.
2026-04-28 09:35:38 -04:00
Alexey Bataev
4e030aeec0 [SLP][NFC]Cache MightBeIgnored result in gather-shuffle analysis
Each V in VL is queried up to 3 times for MightBeIgnored (direct +
NeighborMightBeIgnored from both neighbors), and the underlying
areAllUsersVectorized walks the instruction's user list. Memoize per
Value pointer to avoid the redundant walks.

Reviewers: 

Pull Request: https://github.com/llvm/llvm-project/pull/194619
2026-04-28 09:32:15 -04:00
Jeff Bailey
a2409e07ce [libc][NFC] Move sys/ucontext.h to YAML generation (#194573)
Renamed sys/ucontext.h to sys/ucontext.h.def and created a corresponding
sys/ucontext.yaml, following the pattern used by sys/prctl. Updated
CMakeLists.txt to use add_header_macro.

Also removed the orphaned top-level ucontext.h.def which was never
referenced by ucontext.yaml.
2026-04-28 13:29:08 +00:00
Naveen Seth Hanig
cd950962a9 [clang][modules-driver] Further constrain import-std test (#194604)
The root cause for the failing test was found in
https://github.com/llvm/llvm-project/pull/194475#issuecomment-4335023585.
The test uses `--target=x86_64-linux-gnu` which is only available with
`-DLLVM_TARGETS_TO_BUILD=all` or on native x86 targets.
2026-04-28 15:28:41 +02:00
Balázs Benics
b48aa05f39 [llvm] Mark IOSandbox::ScopedSetting nodiscard and maybe_unused (#194602)
The goal is to have the same attributes on ScopedSetting regardless of
whether this CMake setting is enabled.

Both of these should have nodiscard and maybe_unused attributes.
2026-04-28 14:14:58 +01:00
Serosh
83164a43a1 [Clang] fix assertion failure in ::template operator parsing (#194097)
When parsing an invalid `::template operator`, the parser incorrectly
kept the consumed tokens on error. This caused the token cache to go out
of sync and crash. This patch fixes it by reverting the tokens and
properly returning the error.
Fixes #186582.
2026-04-28 06:13:26 -07:00
Alexey Bataev
ad2390871a [SLP][NFC]Cache isUsedOutsideBlock results in gather-shuffle analysis
Hoist loop-invariant predicates and memoize per-UserTE
all_of(Scalars, isUsedOutsideBlock) in
isGatherShuffledSingleRegisterEntry and vectorizeTree to avoid
redundant walks over scalar user lists in the gather-shuffle hot path.

Reviewers: 

Pull Request: https://github.com/llvm/llvm-project/pull/194612
2026-04-28 09:05:27 -04:00
jeanPerier
a6df7eb063 [flang] allow rebox/embox of OPTIONAL (#194319)
Delay materialization of branches when building local temporary
descriptor for OPTIONAL from hlfir-to-fir until pre-cg-rewrite.
This makes the IR easier to analyze with OPTIONAL (for instance alias
analysis does not need to handle the branches to find the source).

This is done by adding an "optional" attribute to fir.embox, fir.rebox,
and fir.rebox_assumed_rank to indicate that their code generation must be
conditional.

The conditional aspect is implemented in pre-cg-rewrite to avoid
complexifying codegen and the fir.cg dialect.

Assisted by: Claude
2026-04-28 15:00:46 +02:00
Timm Baeder
9d3f2372ac [clang][bytecode] Don't start record field lifetime by default (#193496)
Even though we have per-field lifetime information we did not previously
diagnose this test:
```c++
  struct R {
    struct Inner { constexpr int f() const { return 0; } };
    int a = b.f();
    Inner b;
  };
  constexpr R r;
```
because the lifetime was started by default.

This patch makes record members be `Lifetime::NotStarted` by default
(unless they are primitive arrays) and then starts the lifetime in
`Pointer::initialize()`.
2026-04-28 14:57:59 +02:00
Simon Pilgrim
34e136bc84 [PhaseOrdering][X86] Copy backend horizontal min/max reduction tests to phaseordering (#194601)
As discussed on #194473, add middleend test coverage to ensure we're
creating vXi8/vXi16 llvm.vector.reduce calls that can be lowered to
PHMINPOS instructions.

Also demonstrates that we're still not matching partial reduction
patterns in vectorcombine.
2026-04-28 13:55:24 +01:00
hev
c5e941d722 [LoongArch] Support VBIT{CLR,SET,REV}I patterns for non-native element sizes (#193719)
Extend vsplat_uimm_{pow2,inv_pow2} matching to allow specifying an
explicit element bit width, enabling recognition of splat patterns whose
logical element size differs from the vector's native element type.

Introduce templated selectVSplatUimm{Pow2,InvPow2} helpers with an
optional EltSize parameter, and add corresponding ComplexPattern
definitions for i8/i16/i32 element widths. This allows TableGen patterns
to match cases such as operating on v8i32/v4i64 vectors with masks
derived from smaller element sizes.

With these changes, AND/OR/XOR operations using inverse power-of-two or
power-of-two splat masks are now correctly selected to VBITCLRI,
VBITSETI, and VBITREVI instructions instead of falling back to vector
logical operations with materialized constants.
2026-04-28 20:52:22 +08:00
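The mask recognition described in the commit above can be modelled on integers. This sketch is not the LoongArch selection code: `splat_at_width`, `pow2_bit_index`, and `inv_pow2_bit_index` are hypothetical helpers that show how a constant can fail the single-bit test at its native element width yet pass it when viewed as narrower elements.

```python
def splat_at_width(value, total_bits, elt_bits):
    """Return the repeated elt_bits-wide element if value is such a splat, else None."""
    elt_mask = (1 << elt_bits) - 1
    elt = value & elt_mask
    for shift in range(0, total_bits, elt_bits):
        if (value >> shift) & elt_mask != elt:
            return None
    return elt

def pow2_bit_index(mask, elt_bits):
    """Return the bit index if mask has exactly one set bit within elt_bits, else None."""
    mask &= (1 << elt_bits) - 1
    if mask == 0 or mask & (mask - 1):
        return None
    return mask.bit_length() - 1

def inv_pow2_bit_index(mask, elt_bits):
    """Return the bit index if mask has exactly one clear bit within elt_bits, else None."""
    full = (1 << elt_bits) - 1
    return pow2_bit_index(~mask & full, elt_bits)

# A 64-bit element whose two 32-bit halves each clear bit 0.
native = 0xFFFFFFFEFFFFFFFE
as_i64 = inv_pow2_bit_index(native, 64)   # two clear bits: not a match
half = splat_at_width(native, 64, 32)     # the same constant viewed as i32 elements
as_i32 = inv_pow2_bit_index(half, 32)     # one clear bit per element
```

Here an AND with `native` clears bit 0 of every 32-bit element, which corresponds to the bit-clear case the commit message describes for masks derived from smaller element sizes.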
Amilendra Kodithuwakku
378b411cf2 [clang][AArch64][SVE2p3][SME2p3] Add intrinsics for v9.7a shift operations (#186087)
Add the following new clang intrinsics based on the ACLE specification
https://github.com/ARM-software/acle/pull/428 (Add alpha support for 9.7
data processing intrinsics)

Multi-vector saturating rounding shift right narrow and interleave
instructions
- SQRSHRN
- svint8_t svqrshrn_s8(svint16x2_t, uint64_t) / svint8_t
svqrshrn_n_s8_s16_x2(svint16x2_t, uint64_t)

- UQRSHRN
- svuint8_t svqrshrn_u8(svuint16x2_t, uint64_t) / svuint8_t
svqrshrn_n_u8_u16_x2(svuint16x2_t, uint64_t)

- SQRSHRUN
- svuint8_t svqrshrun_u8(svint16x2_t, uint64_t) / svuint8_t
svqrshrun_n_u8_s16_x2(svint16x2_t, uint64_t)

Multi-vector saturating shift right narrow and interleave
- SQSHRN
- svint8_t svqshrn_s8(svint16x2_t, uint64_t) / svint8_t
svqshrn_n_s8_s16_x2(svint16x2_t, uint64_t)
- svint16_t svqshrn_s16(svint32x2_t, uint64_t) / svint16_t
svqshrn_n_s16_s32_x2(svint32x2_t, uint64_t)

- UQSHRN
- svuint8_t svqshrn_u8(svuint16x2_t, uint64_t) / svuint8_t
svqshrn_n_u8_u16_x2(svuint16x2_t, uint64_t)
- svuint16_t svqshrn_u16(svuint32x2_t, uint64_t) / svuint16_t
svqshrn_n_u16_u32_x2(svuint32x2_t, uint64_t)

- SQSHRUN
- svuint8_t svqshrun_u8(svint16x2_t, uint64_t) / svuint8_t
svqshrun_n_u8_s16_x2(svint16x2_t, uint64_t)
- svuint16_t svqshrun_u16(svint32x2_t, uint64_t) / svuint16_t
svqshrun_n_u16_s32_x2(svint32x2_t, uint64_t)
2026-04-28 13:48:48 +01:00
Zhige Chen
c28d9076ec [llubi] Implement vector reduction/manipulation intrinsics (#194345)
This PR implements vector reduction and manipulation intrinsics. 

Note that floating-point vector reduction intrinsics are not covered by
this change; they will be added in a follow-up PR after #188453 is
merged.
2026-04-28 14:39:58 +02:00