llvm-project

Author	SHA1	Message	Date
Paulius Velesko	264ac2d3af	[HIP][MacOS] Mach-O support and Darwin toolchain fixes (#183991 ) This PR adds support for HIP on macOS: Mach-O section naming, Darwin host toolchain initialization guards, and HIPSPV behavior when Darwin is the host. This has been verified using chipStar on MacOS via the PoCL OpenCL implementation. ## Uninitialized target workaround Darwin’s toolchain is only initialized when its own TranslateArgs runs. For HIP/CUDA device jobs, Darwin is used as the HostTC and never gets its args translated, so its target stays uninitialized. The new checks avoid asserting on that uninitialized state. A better long-term fix is to initialize Darwin earlier (see the FIXME in Driver.cpp BuildJobsForAction). - [ ] Initialize Darwin toolchain during construction instead of lazily in TranslateArgs. See Driver.cpp BuildJobsForAction FIXME. - [x] In Darwin’s addClangTargetOptions, skip host-stdlib flags when DeviceOffloadKind != OFK_None so HIPSPV can safely delegate to the host.	2026-04-28 12:43:59 -05:00
Amina Chabane	dddd0da8e6	[BOLT][AArch64] Refuse to run IndirectCallPromotion pass (#194363 ) `--icp=<value>`/`--indirect-call-promotion=<value>` results in an `UNIMPLEMENTED` crash when invoked as it is unimplemented in AArch64. - Guard IndirectCallPromotion for non-X86 - Update unsupported-passes.test with expected error	2026-04-28 18:39:15 +01:00
Benjamin Luke	9e0057bdc4	[clang] [fixit] Properly apply warning options during fixit-recompile (#190280 ) Fixes https://github.com/llvm/llvm-project/issues/18707 During fixit recompile, the frontend was not reapplying command-line diagnostic options, so the second pass could lose -Wno-* suppressions and other warning configuration. Added regression test to make sure that diagnostic options are properly applied in the fixit-recompile path.	2026-04-28 10:38:16 -07:00
Grigory Pastukhov	b40c1d511b	[LLVM] Fix use-after-free in AlwaysInliner flatten worklist (#194485 ) Functions with both `alwaysinline` and `flatten` attributes were collected into the `NeedFlattening` worklist, then erased during always-inline processing, leaving dangling pointers. Fix by collecting flatten functions after the always-inline loop, and eliminate the separate worklist by iterating the module directly.	2026-04-28 10:35:29 -07:00
Hans Wennborg	cbb012fa03	[Support] Mark string-returning sys::path::native nodiscard (#194675 ) To make it clear that it doesn't modify the path in place like the other overloads. Follow-up to #193228	2026-04-28 17:29:24 +00:00
Eugene Epshteyn	dc1d85c055	[flang][NFC] Converted five tests from old lowering to new lowering (part 52) (#194525 ) Converted Lower/user-defined-operators.f90, Lower/variable-inquiries.f90, Lower/where-allocatable-assignments.f90, Lower/where.f90, and Transforms/constant-argument-globalisation.fir from legacy lowering (-hlfir=false / -flang-deprecated-no-hlfir) to new lowering (-emit-hlfir or no flag for FIR-input tests).	2026-04-28 13:21:43 -04:00
Stephen Tozer	88b9b2533c	[LLVM] Disable IO sandbox in symbolizeAddresses (#194597 ) The function `symbolizeAddresses` is used by debugify to symbolize addresses captured in the current invocation of LLVM, which it does by executing llvm-symbolizer with temporary input and output files. Creating the temporary files has an explicit sandbox exclusion, as temporary files are necessarily not part of the compiler's formal output, but attempting to read back the output file via MemoryBuffer triggers a sandbox violation. Since we are always only operating on temporary files within symbolizeAddresses, this patch disables the IO sandbox in that function.	2026-04-28 18:15:52 +01:00
Hongyu Chen	ff6269d116	[BDCE] Avoid replacement of self-referential instructions (#194614 ) Fixes #194564.	2026-04-29 01:12:17 +08:00
Krzysztof Parzyszek	928f70d38e	[TableGen] Emit constexpr versions of some directive/clause functions (#194633 ) A variant of https://github.com/llvm/llvm-project/pull/176253 with a change to reduce compile-time impact. Since "llvm_unreachable" is actually allowed in constexpr functions, simply emit the bodies of the selected functions in the header file. In the previous PR the `isAllowedClauseForDirective` function was made constexpr, but since it was very long it had a significant impact on compilation time. In this PR that function is no longer constexpr.	2026-04-28 12:11:21 -05:00
Jonas Paulsson	de6af1ff34	[SystemZ] Improved testing for memcpy/memmove/memset. (#194682 ) This is a pre-commit for #187100.	2026-04-28 19:09:09 +02:00
Andy Kaylor	8ea2b587c0	[CIR] Avoid duplicate name collisions in LoweringPrepare (#194469 ) This fixes a bug in the CIR LoweringPrepare pass where we were creating multiple constant initializer global values with the same name, causing references to them (specifically cir.get_global) to get the wrong value. Assisted-by: Cursor / claude-4.7-opus-xhigh	2026-04-28 09:50:19 -07:00
Abid Qadeer	ba6861c2bc	[OpenMPIRBuilder] Cast device num_threads to i32 for __kmpc_parallel_60 (#194634 ) I observed a crash in device OpenMP lowering when compiling with `-fdefault-integer-8`. In `targetParallelCallback`, `NumThreads` can be `i64`, but `__kmpc_parallel_60` expects an `i32` `num_threads` parameter, which caused a bad-signature assertion during call creation. The fix is to use `CreateZExtOrTrunc(..., Int32)` for the `num_threads` argument before building the runtime call. This matches the handling used in clang in `CGOpenMPRuntimeGPU::emitParallelCall`. The problem can be seen with the following testcase when compiled with `flang -fopenmp --offload-arch=gfx90a test.f90 -fdefault-integer-8`: ``` program test implicit none integer :: nthreads integer :: i nthreads = 137 !$omp target teams distribute parallel do num_threads(nthreads) do i = 1, 1 end do !$omp end target teams distribute parallel do end program test ```	2026-04-28 17:46:19 +01:00
Zhen Wang	4d676e56f0	[flang][cuda] Preserve fir.rebox captured by cuf.kernel via CUDAKernelOpInterface (#193890 ) Reland of #193837 (reverted in #193855), now using a marker op interface to avoid the link cycle that broke `BUILD_SHARED_LIBS=ON` builds. `SimplifyArrayCoorOp` folded `fir.rebox` into `fir.array_coor` across a `cuf.kernel` boundary. CUF lowering needs the captured rebox to materialize a managed-memory descriptor for the kernel; folding it away makes the kernel dereference the host-side descriptor and crash with `cudaErrorIllegalAddress`. Fix is to add `fir::CUDAKernelOpInterface`, a marker op interface defined in FIRDialect and implemented by `cuf.kernel`. The canonicalization guard queries the interface, so the `TypeIDResolver` symbol lives in `libFIRDialect.so` and no `FIR -> CUF` link edge is introduced.	2026-04-28 09:08:13 -07:00
Amit Tiwari	3c14034c55	[Flang][OpenMP] Validate `omp_initial_device` `omp_invalid_device` as device IDs (#193669 ) As per OpenMP 5.2/6.0 the below are valid device values in a `#pragma omp target` directive: omp_initial_device (-1) -> refers to the host CPU. omp_invalid_device (-2) -> an intentionally invalid device, used to trigger a runtime error. For the 2 values discussed above flang fails with: ``` error: The device expression of the DEVICE clause must be a positive integer expression !$OMP TARGET DEVICE(-1) error: Must have INTEGER type, but is REAL(4) !$OMP TARGET DEVICE(OMP_INVALID_DEVICE) ``` Issue: https://github.com/llvm/llvm-project/issues/192989	2026-04-28 21:31:04 +05:30
Petr Kurapov	5320bda232	[AMDGPU] Enable lane masks tracking in coexec scheduler. (#194578 ) Prevents the scheduler from silently producing invalid IR.	2026-04-28 17:50:29 +02:00
Louis Dionne	d7ed6d8c9f	[libc++] Improvements to the benchmark runners (#194659 ) - Run the ref workloads on SPEC - Record the SPEC version in the machine info - Allow filtering which benchmarks are run in run-benchbot	2026-04-28 11:50:06 -04:00
Ryosuke Niwa	ec0675591f	Add the support for adoptCFNullable/adoptNSNullable (#194539 ) These are two new "adopt" functions to be introduced in WebKit.	2026-04-28 08:48:16 -07:00
Ramkumar Ramachandra	251ed1eb84	[VPlan] Optz WideCanIV with SIVSteps over CanIV (#191276 ) Replace WideCanonicalIV with a ScalarIVSteps over the CanonicalIV when only the first lane is used. This is a preparatory step in enabling expansion of WideCanonicalIV into executable recipes.	2026-04-28 15:46:48 +00:00
Benjamin Stott	7ceac4b1af	[Lit] Open sub-processes with text=`False` (#194577 ) This PR is part of a series of patches upgrading Lit's in-process built-ins to be able to run with piped input/output and full redirection support, and to allow custom in-process built-ins to be provided via the Lit config. The remaining patches to Lit's test runner can be found here: https://github.com/BStott6/llvm-project/compare/lit-inproc-builtins. This is part of the Lit daemonized testing project: https://discourse.llvm.org/t/88612 This PR makes Lit open all sub-processes with `text=False`, so that the Python code will be able to read and write binary data to and from their IO streams. This currently causes no functional change, as when Lit reads output from the sub-processes, it already handles the case that the read output is `bytes` by decoding it, but we will need to be able to read binary data from a sub-process's STDIN if its output, which may be binary, is piped into an in-process built-in, and we will need to be able to write binary data to a sub-process's STDOUT if its input is piped from an in-process built-in. I have made sure that on Windows, when a sub-process invoked by Lit has its output redirected to a file by Lit, the `\n -> \r\n` conversion is performed as usual when writing to the file from the process - this change only affects how the Python code interacts with the streams.	2026-04-28 16:36:28 +01:00
Michael Klemm	40ad10a8ae	[Flang] Fix -Wopen-mp-* and -Wopen-acc-* flag spellings (#188434 ) The CamelCase-to-hyphenated conversion was incorrectly splitting "OpenMP" and "OpenACC" into "open-mp" and "open-acc", producing wrong -W flag names like -Wopen-mp-usage instead of -Wopenmp-usage. Fix the conversion to treat these as compound names, keep the old spellings as deprecated aliases, and emit a warning when deprecated spellings are used. --------- Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>	2026-04-28 17:26:29 +02:00
Ryan Buchner	eecec864d1	[SLP] Add tests for boundary case with MinProfitableStridedOps (#194507 ) Currently we don't vectorize runtime strided loads when `VF == MinProfitableStridedOps`.	2026-04-28 08:19:57 -07:00
Luke Hutton	4c6ae8f5be	[mlir][tosa] Verify the output shape of tosa.mul and tosa.rescale (#193952 ) Verifying the provided output shape against an expected shape helps diagnose issues on op construction.	2026-04-28 16:11:04 +01:00
Benjamin Stott	d769ce2176	[Lit] Change processRedirects to open all files in binary mode (#194368 ) This PR is the second in a series of patches upgrading Lit's in-process built-ins to be able to run with piped input/output and full redirection support, and to allow custom in-process built-ins to be provided via the Lit config. The remaining patches to Lit's test runner can be found here: https://github.com/BStott6/llvm-project/compare/lit-inproc-builtins. This is part of the Lit daemonized testing project: https://discourse.llvm.org/t/88612. This PR makes Lit's `processRedirects` function open all input/output files in binary mode. This makes sure that in-process built-ins have the expected behaviour when reading and writing from them: newline translation is not required for any of the current in-process built-ins; in fact, the in-process built-in for `echo`, which is the only one that writes to `stdout`, explicitly re-opens the output file with `newline=""` on Windows, to avoid newline translation. Also, in-process built-ins will eventually need to be able to read or write binary data: for example, `opt` without `-S` running in daemon mode. I believe this has no functional change for regular process invocations; I have confirmed that programs invoked by Lit which write to files opened in binary mode by Lit still have the newline translation performed as normal on Windows, unless they change the mode of their output stream themselves.	2026-04-28 15:50:05 +01:00
Daniel Hernandez-Juarez	5d48370b26	[mlir][memref] Pass TypeConverter to ConvertMemrefStore (#194356 ) Commit `20b925a28a` dropped the TypeConverter from ConvertMemrefStore when adding the disableAtomicRMW flag. Restore it.	2026-04-28 16:44:30 +02:00
Charles Zablit	a2f9da54cb	[lldb][windows] fix race condition in ConPTY on process exit (#194631 )	2026-04-28 15:39:47 +01:00
Srividya Sundaram	0052113fb6	[SYCL][Driver] Set -std=c++17 as default for SYCL compilations (#194014 ) This PR ensures SYCL compilations default to C++17 when no explicit standard is specified, and validates that user-provided standards meet SYCL's C++17 minimum requirement. It also fixes Windows MSVC compilation by enabling -fms-extensions for SYCL device code.	2026-04-28 14:39:26 +00:00
Nico Weber	d089e7397e	[gn build] Port `f5b6e4fc20` (#194645 )	2026-04-28 10:37:55 -04:00
Andrzej Warzyński	bfaab0ec1d	[clang][cir][nfc] Add missing comment (#194644 )	2026-04-28 15:37:29 +01:00
Krzysztof Parzyszek	268bac6c25	[flang][OpenMP] Move implementation detail from header to source, NFC (#194638 )	2026-04-28 09:29:31 -05:00
Adel Ejjeh	5b82a26451	[AMDGPU][NFC] Remove redundant Args.size() assertions from AMDGPUMCExpr (#194488 ) Remove redundant `Args.size()` assertions from `AMDGPUMCExpr` evaluate functions (`evaluateExtraSGPRs`, `evaluateTotalNumVGPR`, `evaluateAlignTo`, `evaluateOccupancy`). These assertions are redundant with the `zip_equal` size checking performed in the `evaluateMCExprs` helper function introduced in #193859. --- This PR was developed with AI assistance (GitHub Copilot).	2026-04-28 09:27:22 -05:00
Stephan T. Lavavej	77cfc55ed4	[MLIR] Update minimatch dependency in VSCode plugin, resolving security alerts (#188613 )	2026-04-28 07:12:59 -07:00
Alexey Bataev	61d795c797	[SLP][NFC]Cache IsExternallyUsed by Value in cost computation Same V is commonly seen in multiple TEs (shared scalars), and the expensive part of IsExternallyUsed walks V->users() with multiple match() pattern checks plus per-user getTreeEntries lookups - all V-only-dependent. Split out the V-dependent body and memoize by Value pointer, leaving the TE-specific copyable check at the call site. DeletedNodes is read-only during the cost loop, so caching is safe. Reviewers: Pull Request: https://github.com/llvm/llvm-project/pull/194637	2026-04-28 10:06:41 -04:00
Charles Zablit	b0f3cd1020	[lldb][windows] fix a race condition in IO reader thread (#194422 )	2026-04-28 14:57:56 +01:00
Simon Pilgrim	da0455adab	[PhaseOrdering][X86] vector-reductions-expanded.ll - use passes list instead of piped opt stages (#194608 ) Cleanup to make it easier to regenerate checks for #194473	2026-04-28 13:52:29 +00:00
Anshul Nigham	e6e9e1f528	[Docs] Fixes indents for InstrRefDebugInfo and KeyInstructionsDebugInfo (#194532 ) This distinguishes the doc title from the headers. Fixes navigation indents for Furo theme update (see https://github.com/llvm/llvm-project/pull/184440).	2026-04-28 06:44:47 -07:00
Felipe de Azevedo Piovezan	74781cf395	[lldb] Disable gdbremote test on windows (#194627 ) This is causing bot failures.	2026-04-28 14:40:22 +01:00
Steven Perron	ca27dc2933	[SPIR-V] Matrix in struct pointer legalization (#193073 ) When looking to load an object at the start of a struct, the types do not always match exactly. When we have an HLSL matrix, the type in the load will not match the type in memory. We need to improve the pointer legalization pass to look for any "compatible" type at the start of an aggregate. Two types are compatible if the pass knows how to convert from one to the other. This involves a refactoring of the code to make the check more general. Assisted-by: Gemini	2026-04-28 09:36:55 -04:00
Kai Nacke	e459ce5077	Revert "[PowerPC] Enable using HwMode for instructions (#191051 )" (#194464 ) This reverts commit `2a83068537`. It causes test suite failures in the 7zip benchmark.	2026-04-28 09:35:38 -04:00
Alexey Bataev	4e030aeec0	[SLP][NFC]Cache MightBeIgnored result in gather-shuffle analysis Each V in VL is queried up to 3 times for MightBeIgnored (direct + NeighborMightBeIgnored from both neighbors), and the underlying areAllUsersVectorized walks the instruction's user list. Memoize per Value pointer to avoid the redundant walks. Reviewers: Pull Request: https://github.com/llvm/llvm-project/pull/194619	2026-04-28 09:32:15 -04:00
Jeff Bailey	a2409e07ce	[libc][NFC] Move sys/ucontext.h to YAML generation (#194573 ) Renamed sys/ucontext.h to sys/ucontext.h.def and created a corresponding sys/ucontext.yaml, following the pattern used by sys/prctl. Updated CMakeLists.txt to use add_header_macro. Also removed the orphaned top-level ucontext.h.def which was never referenced by ucontext.yaml.	2026-04-28 13:29:08 +00:00
Naveen Seth Hanig	cd950962a9	[clang][modules-driver] Further constrain import-std test (#194604 ) The root cause for the failing test was found in https://github.com/llvm/llvm-project/pull/194475#issuecomment-4335023585. The test uses `--target=x86_64-linux-gnu` which is only available with `-DLLVM_TARGETS_TO_BUILD=all` or on native x86 targets.	2026-04-28 15:28:41 +02:00
Balázs Benics	b48aa05f39	[llvm] Mark IOSandbox::ScopedSetting nodiscard and maybe_unused (#194602 ) The goal is to have the same attributes on ScopedSetting regardless if this cmake setting is enabled or not. Both of these should have nodiscard and maybe_unused attributes.	2026-04-28 14:14:58 +01:00
Serosh	83164a43a1	[Clang] fix assertion failure in ::template operator parsing (#194097 ) When parsing an invalid `::template operator`, the parser incorrectly kept the consumed tokens on error. This caused the token cache to go out of sync and crash. This patch fixes it by reverting the tokens and properly returning the error. Fixes #186582	2026-04-28 06:13:26 -07:00
Alexey Bataev	ad2390871a	[SLP][NFC]Cache isUsedOutsideBlock results in gather-shuffle analysis Hoist loop-invariant predicates and memoize per-UserTE all_of(Scalars, isUsedOutsideBlock) in isGatherShuffledSingleRegisterEntry and vectorizeTree to avoid redundant walks over scalar user lists in the gather-shuffle hot path. Reviewers: Pull Request: https://github.com/llvm/llvm-project/pull/194612	2026-04-28 09:05:27 -04:00
jeanPerier	a6df7eb063	[flang] allow rebox/embox of OPTIONAL (#194319 ) Delay materialization of branches when building local temporary descriptor for OPTIONAL from hlfir-to-fir until pre-cg-rewrite. This makes the IR easier to analyze with OPTIONAL (for instance, alias analysis does not need to handle the branches to find the source). This is done by adding an "optional" attribute to fir.embox, fir.rebox, and fir.rebox_assumed_rank to indicate that their code generation must be conditional. The conditional aspect is implemented in pre-cg-rewrite to avoid complicating codegen and the fir.cg dialect. Assisted by: Claude	2026-04-28 15:00:46 +02:00
Timm Baeder	9d3f2372ac	[clang][bytecode] Don't start record field lifetime by default (#193496 ) Even though we have per-field lifetime information we did not previously diagnose this test: ```c++ struct R { struct Inner { constexpr int f() const { return 0; } }; int a = b.f(); Inner b; }; constexpr R r; ``` because the lifetime was started by default. This patch makes record members be `Lifetime::NotStarted` by default (unless they are primitive arrays) and then starts the lifetime in `Pointer::initialize()`.	2026-04-28 14:57:59 +02:00
Simon Pilgrim	34e136bc84	[PhaseOrdering][X86] Copy backend horizontal min/max reduction tests to phaseordering (#194601 ) As discussed on #194473 - add middleend test coverage to ensure we're creating vXi8/vXi16 llvm.vector.reduce calls to ensure we can lower to PHMINPOS instructions Also demonstrates that we're still not matching partial reduction patterns in vectorcombine	2026-04-28 13:55:24 +01:00
hev	c5e941d722	[LoongArch] Support VBIT{CLR,SET,REV}I patterns for non-native element sizes (#193719 ) Extend vsplat_uimm_{pow2,inv_pow2} matching to allow specifying an explicit element bit width, enabling recognition of splat patterns whose logical element size differs from the vector's native element type. Introduce templated selectVSplatUimm{Pow2,InvPow2} helpers with an optional EltSize parameter, and add corresponding ComplexPattern definitions for i8/i16/i32 element widths. This allows TableGen patterns to match cases such as operating on v8i32/v4i64 vectors with masks derived from smaller element sizes. With these changes, AND/OR/XOR operations using inverse power-of-two or power-of-two splat masks are now correctly selected to VBITCLRI, VBITSETI, and VBITREVI instructions instead of falling back to vector logical operations with materialized constants.	2026-04-28 20:52:22 +08:00
Amilendra Kodithuwakku	378b411cf2	[clang][AArch64][SVE2p3][SME2p3] Add intrinsics for v9.7a shift operations (#186087 ) Add the following new clang intrinsics based on the ACLE specification https://github.com/ARM-software/acle/pull/428 (Add alpha support for 9.7 data processing intrinsics) Multi-vector saturating rounding shift right narrow and interleave instructions - SQRSHRN - svint8_t svqrshrn_s8(svint16x2_t, uint64_t) / svint8_t svqrshrn_n_s8_s16_x2(svint16x2_t, uint64_t) - UQRSHRN - svuint8_t svqrshrn_u8(svuint16x2_t, uint64_t) / svuint8_t svqrshrn_n_u8_u16_x2(svuint16x2_t, uint64_t) - SQRSHRUN - svuint8_t svqrshrun_u8(svint16x2_t, uint64_t) / svuint8_t svqrshrun_n_u8_s16_x2(svint16x2_t, uint64_t) Multi-vector saturating shift right narrow and interleave - SQSHRN - svint8_t svqshrn_s8(svint16x2_t, uint64_t) / svint8_t svqshrn_n_s8_s16_x2(svint16x2_t, uint64_t) - svint16_t svqshrn_s16(svint32x2_t, uint64_t) / svint16_t svqshrn_n_s16_s32_x2(svint32x2_t, uint64_t) - UQSHRN - svuint8_t svqshrn_u8(svuint16x2_t, uint64_t) / svuint8_t svqshrn_n_u8_u16_x2(svuint16x2_t, uint64_t) - svuint16_t svqshrn_u16(svuint32x2_t, uint64_t) / svuint16_t svqshrn_n_u16_u32_x2(svuint32x2_t, uint64_t) - SQSHRUN - svuint8_t svqshrun_u8(svint16x2_t, uint64_t) / svuint8_t svqshrun_n_u8_s16_x2(svint16x2_t, uint64_t) - svuint16_t svqshrun_u16(svint32x2_t, uint64_t) / svuint16_t svqshrun_n_u16_s32_x2(svint32x2_t, uint64_t)	2026-04-28 13:48:48 +01:00
Zhige Chen	c28d9076ec	[llubi] Implement vector reduction/manipulation intrinsics (#194345 ) This PR implements vector reduction and manipulation intrinsics. Note that floating-point vector reduction intrinsics are not covered by this change; they will be added in a follow-up PR after #188453 is merged.	2026-04-28 14:39:58 +02:00
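
The flag-spelling fix described in commit 40ad10a8ae above (CamelCase-to-hyphenated conversion that must treat "OpenMP" and "OpenACC" as single words) can be illustrated with a minimal sketch. This is not flang's implementation: the function names, the `COMPOUND_NAMES` table, and the input identifier `OpenMPUsage` are hypothetical and chosen only to reproduce the `-Wopen-mp-usage` versus `-Wopenmp-usage` spellings quoted in the commit message.

```python
import re

# Boundary between words in a CamelCase identifier: either lower/digit -> upper,
# or the end of an acronym (upper followed by upper+lower, as in "MP|Usage").
_WORD_BOUNDARY = re.compile(r"(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])")

# Hypothetical table of compound names that must not be split.
# Only the two names mentioned in the commit message are listed.
COMPOUND_NAMES = {"OpenMP": "Openmp", "OpenACC": "Openacc"}


def naive_hyphenate(name: str) -> str:
    """Split on every CamelCase boundary; this mis-splits compound names."""
    return _WORD_BOUNDARY.sub("-", name).lower()


def hyphenate(name: str) -> str:
    """Collapse known compound names into one word before splitting."""
    for compound, single_word in COMPOUND_NAMES.items():
        name = name.replace(compound, single_word)
    return _WORD_BOUNDARY.sub("-", name).lower()


if __name__ == "__main__":
    # "OpenMPUsage" is an assumed identifier, used for illustration only.
    print("-W" + naive_hyphenate("OpenMPUsage"))
    print("-W" + hyphenate("OpenMPUsage"))
```

Under these assumptions, the naive splitter yields the old `open-mp` spelling, while the version that collapses compound names first yields `openmp`. Keeping the old spellings as deprecated aliases, as the commit describes, would be a separate lookup step not shown here.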


578498 Commits