171 Commits

Author SHA1 Message Date
Fangrui Song
13e98d8341 [MC] Take MCAsmInfo by reference in MCContext and TargetMachine. NFC (#194280)
Both MCContext::MCContext and TargetMachine::getMCAsmInfo treat
MCAsmInfo as a pointer that must be non-null. Make the contract
explicit:

* MCContext's constructor takes `const MCAsmInfo &MAI`.
* TargetMachine::getMCAsmInfo returns `const MCAsmInfo &`.

Make this change now since the MCContext ctor has recently been updated.
2026-04-27 07:48:54 +00:00
Fangrui Song
33f2036f35 [MC] Add MCTargetOptions to MCAsmInfo constructor. NFC (#194200)
Since #180464 the canonical MCTargetOptions pointer is stored in
MCAsmInfo, but it is bound after construction via `setTargetOptions`
called from TargetRegistry::createMCAsmInfo.

Direct constructions in unit tests can leave the pointer null, leading
to a runtime assert failure. Add MCTargetOptions to every MCAsmInfo
subclass constructor, store it as a reference in MCAsmInfo, and remove
`setTargetOptions()`.
2026-04-26 05:52:32 +00:00
Farid Zakaria
ec1e3aef9a [BOLT] Update LSDA encoding for x86-64 large code model (#190685)
BOLT hardcoded 4-byte LSDA (exception table) encoding for x86-64. This
is insufficient for large code model binaries where functions in .ltext
sections may be placed at addresses above 2GB, exceeding the range of
DW_EH_PE_udata4/DW_EH_PE_sdata4 encodings.

Detect large code model by checking for .ltext sections
(SHF_X86_64_LARGE) and update LSDAEncoding to use 8-byte pointers:
- Non-PIC: DW_EH_PE_absptr (8-byte absolute)
- PIC: DW_EH_PE_pcrel | DW_EH_PE_sdata8 (8-byte PC-relative)

This was pulled out from
https://github.com/llvm/llvm-project/pull/190637
2026-04-16 00:34:08 -07:00
Sergei Barannikov
f4e1a51d10 [bolt] Remove unused argument of DataExtractor constructor (NFC) (#191841)
`AddressSize` parameter is not used by `DataExtractor` and will be
removed in the future. See #190519 for more context.

I took the liberty of switching from using the `StringRef` constructor
overload to `ArrayRef` where appropriate.
2026-04-14 08:13:54 +03:00
Fangrui Song
1578bc684e [MC] Move MCTargetOptions pointer from MCContext to MCAsmInfo (#180464)
Except MC-internal `MCAsmInfo()` uses, MCAsmInfo is always constructed
with `const MCTargetOptions &` via `TargetRegistry::createMCAsmInfo`
(https://reviews.llvm.org/D41349). Store the pointer in MCAsmInfo and
change `MCContext::getTargetOptions()` to retrieve it from there,
removing the `MCTargetOptions const *TargetOptions` member from
MCContext.

MCContext's constructor still accepts an MCTargetOptions parameter
for now but is often omitted by call sites.
A subsequent change will remove this parameter and update all callers.
2026-04-08 04:35:58 +00:00
Fangrui Song
d1b9b4c548 [MC] Remove unused NoExecStack parameter from MCStreamer::initSections. NFC (#188184)
Unused after commit 34bc5d580b
2026-03-24 07:42:09 +00:00
Fangrui Song
c889454f1d [MC] Rename PrivateGlobalPrefix to InternalSymbolPrefix. NFC (#185164)
The "private global" terminology, likely came from
llvm/lib/IR/Mangler.cpp, is misleading: "private" is the opposite of
"global", and these prefixed symbols are not global in the object file
format sense (e.g. ELF has STB_GLOBAL while these symbols are always
STB_LOCAL). The term "internal symbol" better describes their purpose:
symbols for internal use by compilers and assemblers, not meant to be
visible externally.

This rename is a step toward adopting the "internal symbol prefix"
terminology agreed with GNU as
(https://sourceware.org/pipermail/binutils/2026-March/148448.html).
2026-03-10 01:03:27 -07:00
Asher Dobrescu
7bce678ec1 [BOLT] Check if symbol is in data area of function (#160143)
There are cases in which `getEntryIDForSymbol` is called, where the
given Symbol is in a constant island, and so BOLT can not find its
function. This causes BOLT to reach `llvm_unreachable("symbol not
found")` and crash. This patch adds a check that avoids this crash.
2026-03-06 10:37:54 +00:00
YongKang Zhu
95685ca52e [BOLT] Retain certain local symbols (#184074)
BOLT currently strips all STT_NOTYPE STB_LOCAL zero-sized symbols
that fall inside function bodies. Certain such symbols are named
labels (loop markers and subroutine entry points) or local function
symbols in hand-written assembly. We now keep them in local symbol
table in BOLT processed binaries for better symbolication.
2026-03-05 00:34:36 -08:00
Fangrui Song
6f0b0ecaba [NFC] Ensure MCTargetOptions outlives MCAsmInfo at createMCAsmInfo call sites (#180465)
Preparatory change for storing the MCTargetOptions pointer in MCAsmInfo
(#180464)
2026-02-17 21:48:22 -08:00
Maksim Panchenko
f80e3b3d7e [BOLT] Keep folded functions in BinaryFunctions map. NFC (#180392)
In relocation mode, keep folded functions in the BinaryFunctions map
instead of erasing them. Mark them as folded using setFolded() and skip
emitting them.
2026-02-10 14:56:26 -08:00
Maksim Panchenko
adaca1348e [BOLT] Introduce getOutputBinaryFunctions(). NFCI (#172174)
To gain better control over the functions that go into the output file
and their order, introduce `BinaryContext::getOutputBinaryFunctions()`.

The new API returns a modifiable list of functions in output order.

This list is filled by a new `PopulateOutputFunctions` pass and includes
emittable functions from the input file, plus functions added by BOLT
(injected functions).

The new functionality allows to freely intermix input functions with
injected ones in the output, which will be used in new PRs.

The new function replaces `BinaryContext::getSortedFunctions()`, but
unlike its predecessor, it includes injected functions in the returned
list.
2025-12-14 16:29:01 -08:00
Maksim Panchenko
3c2f81820c [BOLT] Introduce BinaryFunctionListType. NFC (#172128)
Use `BinaryFunctionListType` as an alias for `std::vector<BinaryFunction *>`.
2025-12-13 11:52:36 -08:00
Maksim Panchenko
6470d1bb98 [BOLT] Exclude BOLT injected functions from AssignSections. NFCI (#171579)
Assign output sections for injected functions explicitly, and don't
reassign in AssignSections pass.

This change is a prerequisite for further PRs where veneer functions are
created as injected functions and their code section depends on their
placement.
2025-12-10 10:30:07 -08:00
Maksim Panchenko
dda715df2d [BOLT][DWARF] Improve reporting on missing DWOs (#171506)
List all required missing DWO files and report a summary with
recommendations on how to proceed.
2025-12-09 15:46:43 -08:00
Jinjie Huang
33e0301b07 [BOLT] Add validation for direct call/branch targets (#165406)
In some edge cases, a binary may contain direct `branch` or `call`
instructions whose target do not point to a valid executable
instruction. This can occur due to compiler bugs, hand-written assembly,
obfuscation technique, **or when control flow targets a data by
mistake.**

We also encountered the problems as described in this
[issue](https://github.com/llvm/llvm-project/issues/149382), where "data
in code" within OpenSSL's hand-written assembly was misidentified as
instructions(island identification seems fail due to the absence of a
corresponding data symbol). The problem occurred because a data sequence
was incorrectly disassembled as a "jb" instruction.

The point here is that the data should not be pointed to by any edge, so
this patch tries to address this by validating the destination address
for **direct branches and calls**. If the target instruction is
invalid(implies a corrupted control flow), this function will be set
ignored.

Although this approach appears helpful for addressing the 'data in code'
problem, its validation might be compromised if the data can be
disassembled as normal instruction.
2025-12-09 16:17:19 +08:00
Maksim Panchenko
97c4f367b8 [BOLT] Fix comments for interprocedural branches. NFC (#170745) 2025-12-05 10:53:38 -08:00
Maksim Panchenko
af456dfa11 [BOLT] Refactor tracking internals of BinaryFunction. NFCI (#167074)
In addition to tracking offsets inside a `BinaryFunction` that are
referenced by data relocations, we need to track those relocations too.
Plus, we will need to map symbols referenced by such relocations back to
the containing function.

This change introduces `BinaryFunction::InternalRefDataRelocations` to
track the aforementioned relocations and expands
`BinaryContext::SymbolToFunctionMap` to include local/temp symbols
involved in relocation processing.

There is no functional change introduced that should affect the output.
Future PRs will use the new tracking capabilities.
2025-11-08 00:31:03 -08:00
YongKang Zhu
6fce53af84 [BOLT][AArch64] Skip as many zeros as possible in padding validation (#166467)
We are skipping four zero's at a time when validating code padding in
case that the next zero would be part of an instruction or constant
island, and for functions that have large amount of padding (like due to
hugify), this could be very slow. We now change the validation to skip
as many as possible but still need to be 4's exact multiple number of
zero's. No valid instruction has encoding as 0x00000000 and even if we
stumble into some constant island, the API
`BinaryFunction::isInConstantIsland()` has been made to find the size
between the asked address and the end of island (#164037), so this
should be safe.
2025-11-06 09:38:25 -08:00
YongKang Zhu
562e3bfcd4 [BOLT] Add an option for constant island cloning (#165778)
Avoid cloning constant island helps to reduce app size, especially for
BOLT optimization in which cloning would happen when a function is split
into multiple fragments. Add an option to make the cloning optional, and
we will introduce a new pass to handle the reference too far error that
may result from disabling constant island cloning (#165787).
2025-11-03 14:44:05 -08:00
Maksim Panchenko
97660c1094 [BOLT] Issue error on unclaimed PC-relative relocation (#166098)
Replace assert with an error and improve the report when unclaimed
PC-relative relocation is left in strict mode.
2025-11-03 09:19:33 -08:00
Maksim Panchenko
7c01a90545 [BOLT] Refactor handling of branch targets. NFCI (#165828)
Refactor code that verifies external branch destinations and creates
secondary entry points.
2025-10-31 08:56:30 -07:00
YongKang Zhu
e1ae126401 [BOLT][AArch64] Validate code padding (#164037)
Check whether AArch64 function code padding is valid,
and add an option to treat invalid code padding as error.
2025-10-22 20:25:06 -07:00
Asher Dobrescu
2bbc4ae850 [BOLT] Check entry point address is not in constant island (#163418)
There are cases where `addEntryPointAtOffset` is called with a given
`Offset` that points to an address within a constant island. This
triggers `assert(!isInConstantIsland(EntryPointAddress)` and causes BOLT
to crash. This patch adds a check which ignores functions that would add
such entry points and warns the user.
2025-10-21 11:08:10 +01:00
Christian Clauss
0fc05aa1c6 [bolt] Fix typos discovered by codespell (#124726)
https://github.com/codespell-project/codespell
```bash
codespell bolt --skip="*.yaml,Maintainers.txt" --write-changes \
    --ignore-words-list=acount,alledges,ans,archtype,defin,iself,mis,mmaped,othere,outweight,vas
```
2025-10-14 14:45:40 +02:00
Gergely Bálint
889bfd9172 Reapply "[BOLT][AArch64] Handle OpNegateRAState to enable optimizing binaries with pac-ret hardening" (#162353) (#162435)
Reapply "[BOLT][AArch64] Handle OpNegateRAState to enable optimizing
binaries with pac-ret hardening (#120064)" (#162353)

This reverts commit c7d776b068.

#120064 was reverted for breaking builders.

Fix: changed the mismatched type in MarkRAStates.cpp to `auto`.

---

Original message:

OpNegateRAState is an AArch64-specific DWARF CFI used to change the value
of the RA_SIGN_STATE pseudoregister. The RA_SIGN_STATE register records
whether the current return address has been signed with PAC.

OpNegateRAState requires special handling in BOLT because its placement
depends on the function layout. Since BOLT reorders basic blocks during
optimization, these CFIs must be regenerated after layout is finalized.

This patch introduces two new passes:

- MarkRAStates (runs before optimizations): assigns a signedness annotation to each
  instruction based on OpNegateRAState CFIs in the input binary.

- InsertNegateRAStates (runs after optimizations): reads the annotations and emits
  new OpNegateRAState CFIs where RA state changes between instructions.

Design details are described in: `bolt/docs/PacRetDesign.md`.
2025-10-08 11:05:41 +02:00
Gergely Bálint
c7d776b068 Revert "[BOLT][AArch64] Handle OpNegateRAState to enable optimizing binaries with pac-ret hardening" (#162353)
Reverts llvm/llvm-project#120064.

@gulfemsavrun reported that the patch broke toolchain builders.
2025-10-07 21:59:18 +02:00
Gergely Bálint
32eaf5b59c [BOLT][AArch64] Handle OpNegateRAState to enable optimizing binaries with pac-ret hardening (#120064)
OpNegateRAState is an AArch64-specific DWARF CFI used to change the value
of the RA_SIGN_STATE pseudoregister. The RA_SIGN_STATE register records
if the current return address has been signed with PAC.

OpNegateRAState requires special handling in BOLT because its placement
depends on the function layout. Since BOLT reorders basic blocks during
optimization, these CFIs must be regenerated after layout is finalized.

This patch introduces two new passes:

- MarkRAStates (runs before optimizations): assigns a signedness annotation to each
  instruction based on OpNegateRAState CFIs in the input binary.

- InsertNegateRAStates (runs after optimizations): reads the annotations and emits
  new OpNegateRAState CFIs where RA state changes between instructions.

Design details are described in: `bolt/docs/PacRetDesign.md`.
2025-10-07 10:22:14 +02:00
Jan Svoboda
f122484b99 [llvm][support] Move make_absolute from sys::fs to sys::path (#161459)
The `llvm::sys::fs::make_absolute(const Twine &, SmallVectorImpl<char>
&)` functions doesn't perform any FS access - it only modifies the
second parameter via path/string operations. This function should live
in the `llvm::sys::path` namespace for consistency and for making it
easier to spot function calls that perform IO.
2025-10-01 14:35:17 -07:00
Jinjie Huang
e6540d20cf [BOLT][DWARF] Skip processing DWARF CUs with a DWO ID but no DWO name (#154749)
This patch tries to skip processing DWARF CUs with a DWO ID but no DWO
name, and ensure them not included in the final binary.
2025-09-23 16:31:03 +08:00
Jinjie Huang
e2455bfc10 [BOLT][DWARF] Get DWO file via relative path if the CompilationDir does not exist (#154515)
In distributed builds, the DWARF CompilationDir is often invalid,
causing BOLT to fail when locating DWO files. If the default path does
not exist, it seems better to consider the DWOName as a relative path in
this case.

The implementation of this patch will try to search for the DWO file in
the following order:
1. CompDirOverride + DWOName (if CompDirOverride specified)
2. CompilationDir + DWOName (if CompilationDir exists)
3. **Current directory + DWOName (relative path as a fallback)**

This patch also fixes a crash that occurs when DWOName is an absolute path and a DWP file is provided.
2025-09-15 11:30:54 +08:00
Haibo Jiang
381e1bb461 [BOLT] fix print-mem-data not working (#156332)
This option `print-mem-data` is currently not working, use this fix to
restore its functionality.
2025-09-12 10:10:28 +01:00
Grigory Pastukhov
8c0f3b6e8f [BOLT] Fix debug line emission for functions in multiple compilation units (#151230)
This patch fixes a bug in BOLT's debug line emission where functions
that belong to multiple compilation units (such as inline functions in
header files) were not handled correctly. Previously, BOLT incorrectly
assumed that a binary function could belong to only one compilation
unit, leading to incomplete or incorrect debug line information.

### **Problem**

When a function appears in multiple compilation units (common scenarios
include):

*   Template instantiated functions
* Inline functions defined in header files included by multiple source
files

BOLT would only emit debug line information for one compilation unit,
losing debug information for other CUs where the function was compiled.
This resulted in incomplete debugging information and could cause
debuggers to fail to set breakpoints or show incorrect source locations.

### **Root Cause**

The issue was in BOLT's assumption that each binary function maps to
exactly one compilation unit. However, when the same function (e.g., an
inline function from a header) is compiled into multiple object files,
it legitimately belongs to multiple CUs in the final binary.
2025-09-11 10:41:11 -07:00
Matt Arsenault
67823469bd MC: Add Triple overloads for more MC constructors (#157321)
Avoids more Triple->string->Triple round trip. This
is a continuation of f137c3d592
2025-09-08 03:41:22 +00:00
Grigory Pastukhov
8c8d1d45a6 [BOLT] Fix DWARF4/5 file index handling in debug info functions (#151401)
Fix incorrect file index handling that differed between DWARF4 and
DWARF5.

DWARF4 file indices start at 1, while DWARF5 starts at 0. The code was 
manually adjusting indices with `Row.File - 1`, which works for DWARF4 
but breaks DWARF5.

Replace manual indexing with `getFileNameEntry()` which abstracts away 
the DWARF version differences.

Fixed in:
- printDebugInfo()  
- addDebugFilenameToUnit()
2025-08-29 16:54:57 -07:00
Fangrui Song
34c7b7ccae MCSymbol: Remove setUndefined
The name is misleading, as setting Fragment to nullptr does not
necessarily make it undefined - common and equated symbols have
a nullptr fragment as well.
2025-08-17 15:57:27 -07:00
Maksim Panchenko
06f13f8684 [BOLT] Fix references in ignored functions in CFG state (#140678)
When we call setIgnored() on functions that already have CFG built,
these functions are not going to get emitted and we risk missing
external function references being updated.

To mitigate the potential issues, run scanExternalRefs() on such
functions to create patches/relocations.

Since scanExternalRefs() relies on function relocations, we have to
preserve relocations until the function is emitted. As a result, the
memory overhead without debug info update could reach up to 2%.
2025-06-02 12:33:54 -07:00
Maksim Panchenko
778801cc84 [BOLT] Never call fixBranches() on non-simple functions (#141112)
We should never call fixBranches() on a function with invalid CFG. E.g.,
ValidateInternalCalls modifies CFG for its internal analysis purposes.
At the same time, it marks the function as non-simple with an assumption
that fixBranches() will never run on that function.

However, calculateEmittedSize() by default calls fixBranches() which can
lead to all sorts of issues, including assertions firing in
fixBranches().

The fix is to use the original size for non-simple functions in
calculateEmittedSize() since we are supposed to emit the function
unmodified. Additionally, add an assertion at the start of
fixBranches().
2025-05-22 14:01:54 -07:00
Kazu Hirata
0641ca1cd2 [BOLT] Avoid creating a temporary instance of std::string (NFC) (#140987)
lookupTarget takes StringRef and internally creates an instance of
std::string with the StringRef as part of constructing Triple, so we
don't need to create a temporary instance of std::string on our own.
2025-05-21 20:32:40 -07:00
Kazu Hirata
7940b0546b [BOLT] Fix warning
This patch fixes:

  bolt/lib/Core/BinaryContext.cpp:582:8: error: unused variable
  'printEntryDiagnostics' [-Werror,-Wunused-variable]

  bolt/lib/Core/BinaryContext.cpp:842:10: error: unused variable
  'isSibling' [-Werror,-Wunused-variable]
2025-04-12 23:35:49 -07:00
Amir Ayupov
ba93fe97c2 [BOLT][NFC] Simplify getOrCreate/analyze/populate/emitJumpTable (#132108) 2025-04-10 21:17:04 -07:00
Maksim Panchenko
96e5ee23a7 [BOLT][AArch64] Add partial support for lite mode (#133014)
In lite mode, we only emit code for a subset of functions while
preserving the original code in .bolt.org.text. This requires updating
code references in non-emitted functions to ensure that:

* Non-optimized versions of the optimized code never execute.
* Function pointer comparison semantics is preserved.

On x86-64, we can update code references in-place using "pending
relocations" added in scanExternalRefs(). However, on AArch64, this is
not always possible due to address range limitations and linker address
"relaxation".

There are two types of code-to-code references: control transfer (e.g.,
calls and branches) and function pointer materialization.
AArch64-specific control transfer instructions are covered by #116964.

For function pointer materialization, simply changing the immediate
field of an instruction is not always sufficient. In some cases, we need
to modify a pair of instructions, such as undoing linker relaxation and
converting NOP+ADR into ADRP+ADD sequence.

To achieve this, we use the instruction patch mechanism instead of
pending relocations. Instruction patches are emitted via the regular MC
layer, just like regular functions. However, they have a fixed address
and do not have an associated symbol table entry. This allows us to make
more complex changes to the code, ensuring that function pointers are
correctly updated. Such mechanism should also be portable to RISC-V and
other architectures.

To summarize, for AArch64, we extend the scanExternalRefs() process to
undo linker relaxation and use instruction patches to partially
overwrite unoptimized code.
2025-03-27 21:33:25 -07:00
Paschalis Mpeis
2f9d94981c [BOLT] Change Relocation Type to 32-bit NFCI (#130792) 2025-03-14 18:15:59 +00:00
chrisPyr
038fff3f24 [NFC][BOLT] Make file-local cl::opt global variables static (#126472)
#125983
2025-03-05 22:11:05 -08:00
Maksim Panchenko
5a11912ece [BOLT] Refactor interface for creating instruction patches. NFCI (#129404)
Add BinaryContext::createInstructionPatch() interface for patching parts
of the original binary with new instruction sequences. Refactor
PatchEntries pass to use the new interface.
2025-03-01 19:20:17 -08:00
Amir Ayupov
b884be8640 [BOLT] Exit with error code on missing DWO CU (#125976)
If BOLT fails to locate DWO CU when using split DWARF, this signifies an
issue with the input (missing .dwo) rather than an internal assertion.
2025-02-06 10:01:12 -08:00
Maksim Panchenko
137c3781e6 [BOLT][AArch64] Include constant islands in disassembly (#125961)
When printing disassembly of a function with constant islands, include
the island info in the dump.

At the moment, only print islands in pre-CFG state. Include islands that
are interleaved with instructions.
2025-02-05 22:41:40 -08:00
Franklin
6e8a1a45a7 [BOLT] Detect Linux kernel version if the binary is a Linux kernel (#119088)
This makes it easier to handle differences (e.g. of exception table
entry size) between versions of Linux kernel
2024-12-26 09:54:23 -08:00
Kristof Beyls
4111841f88 [BOLT] Correctly print preferred disassembly for annotated instructions (#120564)
This patch makes sure that `BinaryContext::printInstruction` prints the
preferred disassembly. Preferred disassembly only gets printed when
there are no annotations on the MCInst. Therefore, this patch
temporarily removes the annotations before printing it.

A few examples of before and after on AArch64 instructions are as
follows:

```
  BEFORE                     AFTER
                             (preferred disassembly)

  ret   x30                  ret
  orr   x30, xzr, x0         mov   x30, x0
  hint  #29                  autiasp
  hint  #12                  autia1716
```

Clearly, the preferred disassembly is easier for developers to read, and
is the disassembly that tools should be printing.

This patch is motivated as part of future work on the
llvm-bolt-binary-analysis tool, making sure that the reports it prints
do use preferred disassembly.

This patch was cherry-picked from
https://github.com/kbeyls/llvm-project/tree/bolt-gadget-scanner-prototype.

In this current patch, this only affects existing RISCV test cases.

This patch also does improve test cases in future patches that will
introduce a binary analysis for llvm-bolt-binary-analysis that checks
for correct application of pac-ret (pointer authentication on return
addresses).
2024-12-20 08:54:07 +00:00
Jared Wyles
2ccf7ed277 [JITLink] Switch to SymbolStringPtr for Symbol names (#115796)
Use SymbolStringPtr for Symbol names in LinkGraph. This reduces string interning
on the boundary between JITLink and ORC, and allows pointer comparisons (rather
than string comparisons) between Symbol names. This should improve the
performance and readability of code that bridges between JITLink and ORC (e.g.
ObjectLinkingLayer and ObjectLinkingLayer::Plugins).

To enable use of SymbolStringPtr a std::shared_ptr<SymbolStringPool> is added to
LinkGraph and threaded through to its construction sites in LLVM and Bolt. All
LinkGraphs that are to have symbol names compared by pointer equality must point
to the same SymbolStringPool instance, which in ORC sessions should be the pool
attached to the ExecutionSession.
---------

Co-authored-by: Lang Hames <lhames@gmail.com>
2024-12-06 10:22:09 +11:00