Previously, TheImageKind was set to IMG_None and relied on a runtime
heuristic to determine the correct image type. This commit sets it
explicitly to IMG_Object for AOT-compiled images and IMG_SPIRV for
SPIR-V images based on the IsAOTCompileNeeded flag.
It also adds a test for this change, which required minor changes in
OffloadBinary and OffloadDump.
We're trying to get the time it takes to run all the benchmarks down, so
that we can run them on a regular basis. This patch saves us ~18 minutes
per run.
We're trying to get the time it takes to run all the benchmarks down, so
that we can run them on a regular basis. This patch saves us ~80 seconds
per run.
Testing a bunch of range sizes has relatively little value. This reduces
the number of benchmarks so we can run them on a regular basis. This
saves ~10 minutes when running benchmarks.
Fixes #179698
This PR enables the `LIBCPP_SHARED_PTR_DEFINE_LEGACY_INLINE_FUNCTIONS`
macro on AIX because the functions guarded by this macro are required for
backward compatibility.
Unify the ad-hoc use of whitespace in `LLVM_DEBUG()` messages.
This should also make it easier to see which loop each debug message
corresponds to, and which part of the loop-unrolling heuristics produced
each message.
Fixes #151786
The original `ceilf` expansion lowers to `fptosi`, which produces poison
for Inf, and any subsequent use leads to undefined behavior. This patch
adds a safe path, similar to the existing `round` expansion, for large
or special inputs and avoids the UB.
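As a plain-C sketch of the hazard and the fix (the actual expansion works on LLVM IR, and the helper name here is hypothetical): a safe software ceil routes NaN, Inf, and large-magnitude inputs around the float-to-int conversion entirely.

```c
#include <math.h>

/* Sketch (not the actual IR expansion): for float, any |x| >= 2^23 is
 * already integral, and this test is also false for NaN and Inf, so all
 * of those pass through untouched and never reach the float-to-int
 * conversion that would otherwise produce poison for out-of-range inputs. */
float safe_ceilf(float x) {
    if (!(fabsf(x) < 8388608.0f))   /* NaN, Inf, or |x| >= 2^23 */
        return x;
    float t = (float)(long long)x;  /* truncate toward zero; in range, no UB */
    return (t < x) ? t + 1.0f : t;
}
```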
This addresses the following warning when PYTHON_EXECUTABLE is not set
in the host build:
```bash
CMake Warning:
Manually-specified variables were not used by the project:
PYTHON_EXECUTABLE
```
Reference:
https://github.com/llvm/llvm-project/pull/163574
Since #180397, all elements of a `DenseIntOrFPElementsAttr` are padded
to full bytes. This enables additional simplifications: whether a
`DenseIntOrFPElementsAttr` is a splat or not can now be inferred from
the size of the buffer. This was not possible before because a single
byte sometimes contained multiple `i1` elements.
Discussion:
https://discourse.llvm.org/t/denseelementsattr-i1-element-type/62525
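As an illustrative sketch (hypothetical helper, not the actual MLIR API), the splat inference that byte-padded elements enable boils down to a size comparison: a splat buffer stores exactly one padded element, a non-splat buffer stores one per value.

```c
#include <stdbool.h>
#include <stddef.h>

/* Sketch: with every element padded to whole bytes, a buffer holding
 * exactly one element's worth of bytes must be a splat. With packed i1,
 * a one-byte buffer was ambiguous (it could hold up to 8 elements). */
bool is_splat_buffer(size_t buffer_bytes, size_t elem_bytes, size_t num_elems) {
    return num_elems > 1 && buffer_bytes == elem_bytes;
}
```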
Basically https://github.com/llvm/llvm-project/pull/168506 but for
RISC-V, so to be clear, the hard work here is @heiher's. I figured we
may as well get some extra eyeballs on this from the RISC-V side too.
Previously the RISC-V backend could not handle `musttail` calls with more
arguments than fit in registers, or any explicit `byval` or `sret`
parameters/return values. Those have now been implemented.
This is part of my push to get more LLVM backends to support `byval` and
`sret` parameters so that rust can stabilize guaranteed tail call
support. See also:
- https://github.com/llvm/llvm-project/pull/168956
- https://github.com/rust-lang/rust/issues/148748
---------
Co-authored-by: WANG Rui <wangrui@loongson.cn>
Add CI job definitions using our new templated pipelines to
llvm-project; this way we can enable multi-branch pipelines that
trigger on changes to a given branch.
By storing the Jenkinsfile definitions in llvm-project, we gain the
benefit of enabling Jenkins multi-branch pipelines. This means that in
the future, expanding a job configuration to build a new branch is as
simple as updating a regular expression in Jenkins (the regular
expression specifies which branches should be built). The work required
to enable testing of new branches becomes minimal, and furthermore we
gain a great deal of confidence that job configurations across
branches remain identical.
I will verify these new Jenkinsfiles work before deprecating the old
definitions in zorg.
When alias analysis reports potential aliasing between the LHS and RHS
while inlining `hlfir.assign`, use `ArraySectionAnalyzer` to determine
whether the sections are disjoint or identical, either of which is safe
for element-wise assignment.
Co-authored-by: Delaram Talaashrafi <dtalaashrafi@rome5.pgi.net>
`DenseElementsAttr` stores elements in an `ArrayRef<char>` buffer, where
each element is padded to a full byte. Before this commit, there used to
be a special storage format for `i1` elements: they used to be densely
packed, i.e., 1 bit per element. This commit removes the dense packing
special case for `i1`.
This commit removes complexity from `DenseElementsAttr`. If dense
packing is needed in the future it could be implemented in a general way
that works for all element types (based on #179122).
Discussion:
https://discourse.llvm.org/t/denseelementsattr-i1-element-type/62525
When encountering a declaration without a type specifier, in contexts
where they could reasonably be assumed to default to int, clang emits a
diagnostic with FixIt. This FixIt does not produce working code.
This patch updates `SemaType` to correctly insert a single int type
specifier per group of declarations, and adds coverage in the FixIt lit test suite.
Fixes #179354
As seen in #177570, this code has a bunch of corner cases, does not
handle ANSI codes properly, and does not handle Unicode at all. There's
enough to fix that we first need some tests to make it clear where
we're starting from.
The body of OutputFormattedUsageText is moved into a utility in the
AnsiTerminal.h header, and tests are added to the existing
AnsiTerminalTest.cpp.
Some results are known to be wrong. Some that cause crashes are
commented out, to be enabled once fixed.
Update LowerMatrixIntrinsics to automatically use tiled loops for
larger matrices. The fully unrolled codegen creates a huge amount of
code, which performs noticeably worse than the tiled loop nest variant.
We now try to estimate the number of instructions needed for the
multiply, and if it is too large, tiled loops are used. The current
threshold is anything roughly larger than a 6x6x6 double multiply.
Eventually I think we want to only generate tiled loops. This patch is a
first step, trying to opt in for cases where we know it is beneficial.
Checked on AArch64, but should help on other architectures similarly,
and also drastically reduce binary size + compile time.
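As a rough sketch of the size heuristic described above (the 2-ops-per-product estimate and the exact cutoff are illustrative assumptions, not the pass's real cost model):

```c
#include <stdbool.h>

/* Illustrative sketch: estimate the fully-unrolled operation count of an
 * MxKxN multiply (one fmul + one fadd per product term) and switch to
 * tiled loops once it exceeds roughly a 6x6x6 double multiply. */
static bool should_use_tiled_loops(unsigned m, unsigned k, unsigned n) {
    unsigned est = 2u * m * k * n;
    return est > 2u * 6u * 6u * 6u;
}
```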
PR: https://github.com/llvm/llvm-project/pull/179325
This pass recently had NewPM coverage added, which means we can now see
profcheck issues with the pass. Disable it for now until we can get it
fixed, although it's not crucial for anything given that it is only run
for 32-bit X86 Windows.
Add a step to drain the init sequences emitted by the ConPTY before
attaching it to the debuggee.
A ConPTY (PseudoConsole) emits init sequences which flush the screen and
contain the name of the program (ESC[2J for clear screen, ESC[H for
cursor home and more). It's not desirable to filter them out: if a
debuggee also emits them, lldb would filter that output as well. To work
around this, the ConPTY is drained by attaching a dummy process to it,
consuming the init sequences and then attaching the actual debuggee.
---------
Co-authored-by: Nerixyz <nero.9@hotmail.de>
clang/AMDGPU: Do not look for rocm device libs if environment is llvm
Introduce usage of the llvm environment type. This will be useful as
a switch to eventually stop depending on externally provided libraries,
and only take bitcode from the resource directory.
I wasn't sure how to handle the confusing mess of -no-* flags. Try
to handle them all. I'm not sure --no-offloadlib makes sense for OpenCL
since it's not really offload, but interpret it anyway.
Summary:
The RPC interface is useful for forwarding functions. This PR adds
helper functions for doing a completely bare forwarding of a function
from the client to the server. This is intended to facilitate
heterogeneous libraries that implement host functions on the GPU (like
MPI or Fortran).
This commit implements the methods:
- SampleBias
- SampleCmp
- SampleCmpLevelZero
- SampleGrad
- SampleLevel
They are added to the Texture2D resource type, with all overloads
except those taking the `status` argument.
Part of https://github.com/llvm/llvm-project/issues/175630
Assisted-by: Gemini
---------
Co-authored-by: Helena Kotas <hekotas@microsoft.com>
Add support for expanding fptosi.sat and fptoui.sat via IR expansions.
Similar to fptosi/fptoui we would get legalization errors otherwise.
The previous expansion for fptosi/fptoui was already saturating -- but
those instructions do not actually require saturation, and the
implementation of the saturation was incorrect in lots of ways. What
this PR does is:
* For fptosi, remove the unnecessary saturation handling.
* For fptoui, remove the unnecessary saturation handling and sign
multiplication.
* For fptosi.sat, use the previous saturation handling with fixes: we
need to map NaNs to 0, and the saturation condition on the exponent was
incorrect. (I'm performing the NaN check via fcmp; there's no
requirement to do everything bitwise here.)
* For fptoui.sat, use a variation of the signed saturation handling:
negative values need to go to zero, and we saturate to unsigned max.
Proofs: https://alive2.llvm.org/ce/z/Xv9FNd
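A scalar C sketch of the saturating semantics described above for 32-bit results (not the IR expansion itself; the NaN handling mirrors the fcmp-based check):

```c
#include <limits.h>
#include <math.h>  /* NAN, for exercising the NaN paths */

/* fptoui.sat semantics: NaN and negative inputs go to 0, large inputs
 * saturate to UINT_MAX, everything else converts normally. */
unsigned sat_fptoui32(double x) {
    if (!(x > 0.0)) return 0;                /* NaN and x <= 0 */
    if (x >= 4294967296.0) return UINT_MAX;  /* >= 2^32 */
    return (unsigned)x;
}

/* fptosi.sat semantics: NaN goes to 0, out-of-range inputs clamp. */
int sat_fptosi32(double x) {
    if (x != x) return 0;                    /* NaN */
    if (x <= -2147483648.0) return INT_MIN;  /* <= -2^31 */
    if (x >= 2147483648.0) return INT_MAX;   /* >= 2^31 */
    return (int)x;
}
```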
This patch prioritizes lowering to `stnp` over `st2` for store
instructions marked `!nontemporal`.
From a performance perspective, we should conservatively prioritize
STNP lowering for non-temporal stores: currently, NT stores require
explicit use of the `__builtin_nontemporal_store()` intrinsic, so I
think it's reasonable to assume the developer explicitly intends to
optimize D-cache usage of some hot non-temporal execution. They can
roll back if it doesn't help.
The cost here is a few extra instructions of code size (thus we
predicate on not optimizing for code size), a few extra fast
instructions to execute, a few extra short dependency chains (which
should commonly be hidden by OOO execution), I-cache alignment effects,
and a few extra registers. In the future we may be able to approximate
a cost model to select by.
The patch implements an AArch64-specific static function to model which
NT stores are currently directly legal in the backend, and updates
`AArch64TargetLowering::lowerInterleavedStore` to conditionally skip
`st2` lowering.
This patch changes the return type of methods returning `std::wstring`
to `std::string` in `PythonPathSetup.cpp`.
This follows lldb's style of converting to `std::wstring` at the last
moment.
Use ConstantInt::getSigned instead of ConstantInt::get when creating a
negative alignment mask in EmitVAArgFromMemory. This is the same fix as
commit 8546294db9 (PR #176115) which addressed the issue in
EmitVAArgForHexagonLinux.
Added a test case that exercises the EmitVAArgFromMemory alignment path
using a struct that is both >8 bytes (to trigger EmitVAArgFromMemory)
and has 8-byte alignment (to trigger the alignment masking code).
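The masking in question aligns the va_arg pointer up to the argument's alignment; as a plain-C sketch (the commit's fix concerns materializing the `-align` mask, i.e. `~(align - 1)`, as a properly sign-extended constant):

```c
#include <stdint.h>

/* Align p up to a power-of-two alignment. The mask -align equals
 * ~(align - 1) and must be sign-extended when widened, which is why the
 * IR needs ConstantInt::getSigned rather than a zero-extending
 * ConstantInt::get. */
uintptr_t align_up(uintptr_t p, uintptr_t align) {
    return (p + align - 1) & -align;
}
```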
If the instructions are compatible but non-matching (e.g., a
zext-select pair), there is no need to perform operand analysis; just
return that they are matching.
This PR allows the expand op converter to consider the NoNaN fastmath
attribute to disable the runtime checks for NaNs in E8M0 types. Default
behaviour is still the same.
The OCP document provides all-ones as NaN for E8M0, but for pre-MX I8
quantization, the checks for NaNs are prohibitively expensive,
especially if the hardware doesn't have native support for that type.
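For reference, the per-element runtime check that the NoNaN attribute lets the converter skip boils down to this sketch (the OCP spec cited above defines the all-ones E8M0 encoding as NaN):

```c
#include <stdbool.h>
#include <stdint.h>

/* E8M0 is an 8-bit exponent-only format; per the OCP spec, the all-ones
 * encoding (0xFF) is NaN, and every other value is a power of two. */
bool e8m0_is_nan(uint8_t bits) {
    return bits == 0xFF;
}
```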
The perfect-matching patch revealed another bug where recursive
instantiations could let SFINAE errors escape, as shown in the issue.
Fixes https://github.com/llvm/llvm-project/issues/179118
This adds support for two syntaxes in Python-defined dialects.
First, traits can now be declared in class parameters, e.g.
```python
class ParentIsIfTrait(DynamicOpTrait):  # define a Python-side trait
    @staticmethod
    def verify_invariants(op) -> bool:
        if not isinstance(op.parent.opview, IfOp):
            op.location.emit_error(
                f"{op.name} should be put inside {IfOp.OPERATION_NAME}"
            )
            return False
        return True


class YieldOp(  # attach two traits: IsTerminatorTrait, ParentIsIfTrait
    TestRegion.Operation, name="yield", traits=[IsTerminatorTrait, ParentIsIfTrait]
):
    ...
```
Second, users can directly define
`verify_invariants`/`verify_region_invariants` methods on the operation
to add additional custom verification logic. This is implemented via
traits.
```python
class YieldOp(TestRegion.Operation, name="yield", ...):
    value: Operand[Any]

    def verify_invariants(self) -> bool:  # define a method directly
        if self.parent.results[0].type != self.value.type:
            self.location.emit_error(
                "result type mismatch between YieldOp and its parent IfOp"
            )
            return False
        return True
```
Previously we used `verify`/`verify_region` as method names (in
yesterday's PR #179705), but in this PR they are renamed to
`verify_invariants`/`verify_region_invariants` because the newly added
`verify` method conflicted with the `ir.OpView.verify` method:
- `verify_invariants` only attaches **additional** verification logic,
while `OpView.verify` constructs an OperationVerifier and performs full
verification of an operation, so the semantics of the two are not the
same. We should not shadow the `OpView.verify` method by defining a
new, semantically different `verify` method.
- Two `verify` methods with different meanings would confuse users.
- If users did not define a `verify` method in their Python-defined
operation, `DynamicOpTraits.attach(opname, MyOpCls)` would still
perform the attaching (because `hasattr("verify")` returns `True`) and
segfault (because we cannot attach `OpView.verify`).
---------
Co-authored-by: Rolf Morel <rolfmorel@gmail.com>
For example:
```
error: 'tosa.dim' op illegal: requires [bf16, shape] but not included in the profile compliance [shape]
%0 = tosa.dim %arg0 {axis = 4 : i32} : (tensor<4x5x8x8x6x4xbf16>) -> !tosa.shape<1>
```
Here dim requires support to be declared for the BF16 and SHAPE
extensions, but only SHAPE was specified in the op declaration.
The checks created by LAA only compute a pointer difference and do not
need to capture provenance. Use SCEVPtrToAddr instead of SCEVPtrToInt
for computations.
To avoid regressions while parts of SCEV are migrated to use PtrToAddr,
this adds logic to rewrite all PtrToInt to PtrToAddr where possible in
the created expressions.
Similarly, if in the original IR we have a PtrToInt, SCEVExpander tries
to re-use it if possible when expanding PtrToAddr.
Depends on https://github.com/llvm/llvm-project/pull/178727.
Fixes https://github.com/llvm/llvm-project/issues/156978.
PR: https://github.com/llvm/llvm-project/pull/178861
Testing a bunch of random types has relatively little value. This
reduces the number of benchmarks so we can run them on a regular basis.
This saves ~90 seconds when running the benchmarks.