RegReAssign hits an unreachable on AArch64 as it is a pass
(conceptually) specific to X86.
- Add a guard to RegReAssign for non-X86
- Update unsupported-passes.test
`--icp=<value>`/`--indirect-call-promotion=<value>` crashes with an
`UNIMPLEMENTED` error when invoked, as the pass is not implemented for
AArch64.
- Guard IndirectCallPromotion for non-X86
- Update unsupported-passes.test with expected error
AsmPrinter::MAI is non-null. This became more explicit after PR #194523
changed TargetMachine::getMCAsmInfo to return a reference as part of
the recent MCAsmInfo/MCTargetOptions refactoring.
Convert the member from const MCAsmInfo * to const MCAsmInfo & and
update all consumers.
The MAI member is non-null. #194280 made this clearer by making the
MCContext constructor take MCAsmInfo by reference. Convert getAsmInfo to
return const MCAsmInfo & and the member to a reference.
`--cmov-conversion` is unsupported on AArch64, as
convertMoveToConditionalMove() is only overridden for X86.
- Add a guard for non-X86
- Update unsupported-passes.test with expected error
Both MCContext::MCContext and TargetMachine::getMCAsmInfo treat
MCAsmInfo as a pointer that must be non-null. Make the contract
explicit:
* MCContext's constructor takes `const MCAsmInfo &MAI`.
* TargetMachine::getMCAsmInfo returns `const MCAsmInfo &`.
Make this change now since the MCContext ctor has recently been updated.
Since #180464 the canonical MCTargetOptions pointer is stored in
MCAsmInfo, but it is bound after construction via `setTargetOptions`
called from TargetRegistry::createMCAsmInfo.
Direct constructions in unit tests can leave the pointer null, leading
to a runtime assert failure. Add MCTargetOptions to every MCAsmInfo
subclass constructor, store it as a reference in MCAsmInfo, and remove
`setTargetOptions()`.
Handle signed values in parseHexField by falling back to int64_t parsing
when uint64_t fails. This allows pre-aggregated profile tools to use -1
for BR_ONLY, -2 for FT_EXTERNAL_ORIGIN, -3 for FT_EXTERNAL_RETURN.
Guard the external address reset loop in parseAggregatedLBREntry to
preserve sentinel values (offsets >= FT_EXTERNAL_RETURN).
Add tests for -1/-2/-3 in parseHexField and T entries with -1,
ffffffffffffffff, and buildid:-1 as BR_ONLY.
JTFootprintReduction is a no-op on AArch64 because it calls
createIJmp32Frag(), which is unimplemented for AArch64 and only
overridden for x86.
- Add a guard for non-x86
- Update unsupported-passes.test with expected error message
Fix two null pointer dereferences in BOLT's DWP processing path that
cause SIGSEGV in worker threads when -update-debug-sections is used with
a co-located .dwp file.
1. getSliceData() in updateDebugData() dereferences the result of
getContribution() without checking for null. getContribution() returns
nullptr when the requested section kind (e.g. DW_SECT_LINE) is not
present as a column in the DWP CU index. When BOLT processes a DWP where
certain section kinds are absent from the index, every worker thread
that hits this path crashes simultaneously.
2. processSplitCU() dereferences getUnitDIEbyUnit() without checking for
null. If buildDWOUnit() fails for a CU, the returned DIE* is null and
the dereference crashes.
Crash signature from dmesg:
```
llvm-worker-*: segfault at 8 ip <offset> error 4 in llvm-bolt
(multiple worker threads crash at the same instruction)
```
The faulting address 0x8 corresponds to accessing the Length field
(offset 8) of a null `DWARFUnitIndex::Entry::SectionContribution*`.
At Meta, I reproduced this building hhvm with a co-located .dwp file and
the flags `update-debug-sections -debug-thread-count=80 -lite=0` with
profile data.
I confirmed that the unfixed BOLT crashes deterministically whereas the
fixed BOLT completes successfully.
The mold linker creates a relaxation stub from TLSDESC to LE (lld
relaxes it to IE) using the sequence NOP+NOP+MOVZ+MOVK. This in itself
is not an issue: when --emit-relocs is added, the relocations
R_AARCH64_TLSDESC_ADD_LO12 and R_AARCH64_TLSDESC_CALL are associated
with the useful MOVW instructions. However, BOLT does not check for
R_AARCH64_TLSDESC_ADD_LO12 in adjustRelocation() when disassembling the
file. This later triggers a bug when the relocation is patched: the
MOVK is patched with an S_LO12 fixup kind, which is invalid.
Refer to bug: https://github.com/llvm/llvm-project/issues/190366 for
details.
When writeEHFrameHeader needs to allocate new space for .eh_frame_hdr
(because the old section is too small), it calls appendPadding to align
NextAvailableAddress. appendPadding writes zero bytes at the current
stream position, but after the section write loop in rewriteFile the
stream is positioned at the end of the last section written in
BinarySection::operator< order — not at the file offset corresponding to
NextAvailableAddress.
In the common case (single loadObject call) the write order matches file
offset order, so the stream happens to be in the right place. But when a
runtime library adds sections via additional loadObject calls, the
operator< iteration order (code-before-data) can diverge from file
offset order: a runtime library code section may have a higher file
offset than a runtime library data section that comes after it in the
write loop. The stream then ends at a lower offset than expected, and
appendPadding's zeros overwrite the beginning of the code section.
Fix by seeking to the correct file offset before calling appendPadding.
Follow-up to #192289. Swap the remaining `std::unordered_set`/
`std::unordered_map` containers in `Instrumentation.cpp` for `DenseSet`/
`DenseMap`: the `BBToSkip` param and `Visited` local in
`hasAArch64ExclusiveMemop`, and `BBToSkip`, `BBToID`, `VisitedSet` in
`instrumentFunction`. Drop the now-unused `<unordered_set>` include.
The swap removes per-element heap allocations on the hot path, stops
inserting empty buckets on probes where a miss is possible, and replaces
hashed-bucket traversal over node-based storage with lookups over inline
`DenseMap` storage. `BBToID` reads keep `operator[]` since the map is
pre-populated for every basic block of the function, so no
default-construct path is ever taken. NFC.
Measured on `llvm-bolt -instrument` against a relocations-linked
clang-23: -1.3% instrumentation-pass wall time, peak RSS unchanged
(dominated by instrumentation output size).
BOLT hardcoded 4-byte LSDA (exception table) encoding for x86-64. This
is insufficient for large code model binaries where functions in .ltext
sections may be placed at addresses above 2GB, exceeding the range of
DW_EH_PE_udata4/DW_EH_PE_sdata4 encodings.
Detect large code model by checking for .ltext sections
(SHF_X86_64_LARGE) and update LSDAEncoding to use 8-byte pointers:
- Non-PIC: DW_EH_PE_absptr (8-byte absolute)
- PIC: DW_EH_PE_pcrel | DW_EH_PE_sdata8 (8-byte PC-relative)
This was pulled out from
https://github.com/llvm/llvm-project/pull/190637
Swap `std::unordered_map<…, std::set<…>>` for
`DenseMap<…, SmallVector<…>>` in `Instrumentation::instrumentFunction`
and switch read paths from `STOutSet[&BB]` to `find()`. This removes
per-set heap allocations, stops inserting empty buckets on every probe,
and replaces linear `is_contained()` scans over a red-black tree with
linear scans over inline `SmallVector` storage (most basic blocks have
at most a couple of spanning-tree out-edges). NFC.
`AddressSize` parameter is not used by `DataExtractor` and will be
removed in the future. See #190519 for more context.
I took the liberty of switching from using the `StringRef` constructor
overload to `ArrayRef` where appropriate.
Most clients don't have a notion of "address" and pass arbitrary values
(including `0` and `sizeof(void *)`) to `DataExtractor` constructors.
This makes address-extraction methods dangerous to use.
Those clients that do have a notion of address can use other methods
like `getUnsigned()` to extract an address, or they can derive from
`DataExtractor` and add convenience methods if extracting an address is
routine. `DWARFDataExtractor` is an example, where the removed methods
were actually moved.
This does not remove `AddressSize` argument of `DataExtractor`
constructors yet, but makes it unused and overloads constructors in
preparation for their deletion. I'll be removing uses of the
to-be-deleted constructors in follow-up patches.
Fix iterator misuse in five BOLT passes, caught by _GLIBCXX_DEBUG
(enabled via LLVM_ENABLE_EXPENSIVE_CHECKS=ON).
* AllocCombiner: combineAdjustments() erases instructions while
iterating in reverse via llvm::reverse(BB), invalidating the reverse
iterator. Defer erasures to after the loop using a SmallVector.
* ShrinkWrapping: processDeletions() uses
std::prev(BB.eraseInstruction(II)) which is undefined when II ==
begin(). Restructure to standard forward iteration with erase.
* DataflowAnalysis: run() unconditionally dereferences BB->rbegin(),
which crashes on empty basic blocks (possible after the ShrinkWrapping
fix). Guard with an emptiness check.
* IndirectCallPromotion: rewriteCall() dereferences the end iterator via
&(*IndCallBlock.end()). Replace with &IndCallBlock.back().
* TailDuplication: constantAndCopyPropagate() uses
std::prev(OriginalBB.eraseInstruction(Itr)) which is undefined when Itr
== begin(). Restructure to standard forward iteration with erase.
Replace the fragile filename-based check (ends_with(".so")) with
identify_magic()/file_magic::elf_shared_object to reliably detect
shared libraries when filtering pre-aggregated profile data by
build ID.
Test Plan: pre-aggregated-perf-shlib.test
The compareSections lambda in getCodeSections() violates the strict weak
ordering requirement: when A == B, the comparator can return true (e.g.
via the HotText mover name check), which triggers a _GLIBCXX_DEBUG
assertion on self-comparison.
Add an early identity check to satisfy irreflexivity.
Except MC-internal `MCAsmInfo()` uses, MCAsmInfo is always constructed
with `const MCTargetOptions &` via `TargetRegistry::createMCAsmInfo`
(https://reviews.llvm.org/D41349). Store the pointer in MCAsmInfo and
change `MCContext::getTargetOptions()` to retrieve it from there,
removing the `MCTargetOptions const *TargetOptions` member from
MCContext.
MCContext's constructor still accepts an MCTargetOptions parameter
for now but is often omitted by call sites.
A subsequent change will remove this parameter and update all callers.
On AArch64, logical immediate instructions are used to encode some
special immediate values. Even at `-O0`, the AArch64 backend would not
generate a 4-instruction sequence (movz, movk, movk, movk) to move such
a special value into a 64-bit register.
For example, to move the 64-bit value `0x0001000100010001` to `x0`, the
AArch64 backend would not choose a 4-instruction-sequence like
```
movz x0, 0x0001
movk x0, 0x0001, lsl 16
movk x0, 0x0001, lsl 32
movk x0, 0x0001, lsl 48
```
Actually, the AArch64 backend would choose to generate one instruction
```
mov x0, 0x0001000100010001
```
which is essentially
```
orr x0, xzr, 0x0001000100010001
```
We could refer to `AArch64ExpandPseudoImpl::expandMOVImm` and
`AArch64_IMM::expandMOVImm` for related implementation.
Therefore, we could consider leveraging `expandMOVImm` in LLVM to
optimize mov-immediate-to-register operations in BOLT, which would help
speed up BOLT-instrumented binaries.
Allow `parseString()` to return an empty `StringRef` when the delimiter
appears at position 0. This enables parsing pre-aggregated profile
addresses with an omitted buildid but preserved colon (`:addr` format),
where the empty buildid corresponds to the main binary.
Previously, `parseString()` rejected zero-length fields by treating
`StringEnd == 0` the same as `StringRef::npos` (delimiter not found).
These are distinct situations: `npos` means no delimiter exists, while
`0` means the field before the delimiter is empty. The fix removes the
`StringEnd == 0` sub-condition so only the missing-delimiter case
errors.
The existing test for buildid-prefixed addresses is extended to also
verify that `:addr` input produces identical output to the plain-address
and non-empty-buildid variants.
Test Plan:
Added empty-buildid input file and extended
`pre-aggregated-perf-buildid.test` to run perf2bolt with `:addr` format
and diff the fdata output against the existing buildid-prefixed result.
perf2bolt generates empty fdata files for small binaries, and right now
BOLT only detects this after parsing, via the `(!hasBranchData() &&
!hasMemData())` check. Instead, exit early with an error message as
soon as the data file buffer has been read and found to contain no
data.
Template patchELFPHDRTable, rewriteNoteSections, markGnuRelroSections,
and discoverStorage to support both ELF32LE and ELF64LE binaries.
Previously these functions were hardcoded for ELF64LE, causing crashes
when processing 32-bit ELF binaries.
The RewriteInstance constructor now accepts ELF32LE objects in addition
to ELF64LE. The ELF_FUNCTION macro is reused (and moved earlier in the
header) to dispatch to the correct template instantiation.
These changes are in preparation for adding support for the Hexagon
architecture to BOLT.
Summary:
When the disk runs out of space during output file writing, BOLT would
crash with SIGSEGV/SIGABRT because raw_fd_ostream silently records write
errors and only reports them via abort() in its destructor. This made it
difficult to distinguish real BOLT bugs from infrastructure issues in
production monitoring.
Add an explicit error check on the output stream before calling
Out->keep(), so BOLT exits cleanly with exit code 1 and a clear error
message instead.
Test: manually verified with a full filesystem that BOLT now prints
"BOLT-ERROR: failed to write output file: No space left on device" and
exits with code 1.
In this patch I am adding the missing target hooks required for the
liveness analysis to run on AArch64. These are
- getFlagsReg()
- getRegsUsedAsParams()
- getDefaultLiveOut()
- getGPRegs()
- isCleanRegXOR()
I am also introducing the following API in LivenessAnalysis
- BitVector getLiveIn/Out(const MCInst &)
- MCPhysReg scavengeRegFromState(BitVector &)
My intention is to allow the LongJmp pass to scavenge usable registers
when injecting code.
When the compact code model is used, LongJmpPass::relaxLocalBranches
attempts to reverseBranchCondition without calling isReversibleBranch,
resulting in a runtime error. With this patch I am adding an additional
trampoline to handle irreversible FEAT_CMPBR branches.
In the future the plan is to use liveness analysis and replace the
irreversible branch with compare followed by branch (see #185731) as
long as the condition flags are dead, or emit the additional trampoline
otherwise.
Sample addresses belonging to external DSOs (buildid doesn't match the
current file) are treated as external (0).
Buildid for the main binary is expected to be omitted.
Test Plan:
added pre-aggregated-perf-buildid.test
Reviewers:
paschalis-mpeis, maksfb, yavtuk, ayermolo, yozhu, rafaelauler, yota9
Reviewed By: paschalis-mpeis
Pull Request: https://github.com/llvm/llvm-project/pull/186931
Adding a generator to perf2bolt is the initial step toward supporting
large end-to-end tests for Arm SPE. It exercises the unified
pre-parsed profile format that perf2bolt is able to consume.
Why does the test need to have a textual format SPE profile?
* To collect an Arm SPE profile with Linux Perf, an Arm development
device with SPE support is needed.
* To decode SPE data, it also needs to have the proper version of
Linux Perf.
* The minimum required version of Linux Perf is v6.15.
To bypass these technical difficulties, it is easier to test against a
pre-generated textual profile.
The generator relies on the aggregator to spawn the required
perf-script jobs based on the aggregation type, and merges the results
of the perf-script jobs into a single file.
This hybrid profile will contain all events required for the
aggregation, such as BuildID, MMAP, TASK, BRSTACK, or MEM events.
Below are two examples of how to generate pre-parsed perf data as
input for Arm SPE aggregation:
`perf2bolt -p perf.data BINARY -o perf.text --spe
--generate-perf-script`
Or for basic aggregation:
`perf2bolt -p perf.data BINARY -o perf.text --ba --generate-perf-script`
Remove some unused code in BOLT:
- `RewriteInstance::linkRuntime` is declared but not defined
- `BranchContext` typedef is never used
- `FuncBranchData::getBranch` is defined but never used
- `FuncBranchData::getDirectCallBranch` is defined but never used
The assert condition (function is not split, or is split into fewer
than three fragments) no longer always holds now that we emit more
local symbols due to #184074.
This commit makes the instrumentation-file-append-pid and
instrumentation-sleep-time options compatible. It also requires keeping
the counters mapping between the watcher process and the instrumented
binary's process in shared mode. This is useful when we instrument a
shared library used by several tasks running on the target system. In
cases where we cannot wait for every task to complete, we must use the
sleep-time option; without the append-pid option, we would overwrite
the profile at the same path with data collected from different tasks,
leading to unexpected or suboptimal optimization effects.
Co-authored-by: Vasily Leonenko <vasily.leonenko@huawei.com>
Allow `--function-order` to be combined with `--reorder-functions`
algorithms. Functions listed in the order file are pinned first
(indices 0..N-1), then the selected algorithm orders remaining
functions starting at index N.
Add separate options to enable each of the available gadget detectors.
Furthermore, add two meta-options enabling all PtrAuth scanners and all
available scanners of any type (which is only PtrAuth for now, though).
This commit renames `pacret` option to `ptrauth-pac-ret` and `pauth` to
`ptrauth-all`.
Launch this perf job with the others at the beginning of the aggregation
process.
Extracting buildid-list from perf data is not a costly process, so it
can be performed by default. This provides a distinct advantage when
this dataset is required in other perf2bolt stages as well.
Please see PR #171144.
Some binaries are built using `-gz=zstd`, but when using
`--update-debug-sections` on said binaries BOLT crashes.
This patch fixes this issue by recognising compressed debug sections in
binaries via their flag `SHF_COMPRESSED` and appropriately erroring out.
Legacy GNU-style compression is not handled.
The "private global" terminology, which likely came from
llvm/lib/IR/Mangler.cpp, is misleading: "private" is the opposite of
"global", and these prefixed symbols are not global in the object file
format sense (e.g. ELF has STB_GLOBAL, while these symbols are always
STB_LOCAL). The term "internal symbol" better describes their purpose:
symbols for internal use by compilers and assemblers, not meant to be
visible externally.
This rename is a step toward adopting the "internal symbol prefix"
terminology agreed with GNU as
(https://sourceware.org/pipermail/binutils/2026-March/148448.html).
There are cases in which `getEntryIDForSymbol` is called where the
given symbol is in a constant island, so BOLT cannot find its
function. This causes BOLT to reach `llvm_unreachable("symbol not
found")` and crash. This patch adds a check that avoids the crash.
BOLT currently strips all STT_NOTYPE STB_LOCAL zero-sized symbols
that fall inside function bodies. Some of these symbols are named
labels (loop markers and subroutine entry points) or local function
symbols in hand-written assembly. We now keep them in the local symbol
table of BOLT-processed binaries for better symbolication.
`BinaryFunction::translateInputToOutputAddress()` contains fallback
logic for the case in which querying `IOAddressMap` doesn't yield an
output address. Because this function can be called in scenarios where
`IOAddressMap` won't be set up, we should check that the map actually
exists before the lookup.
When applying BTI fixups to indirect branch targets, ignored functions
are a special case:
- these hold no instructions,
- have no CFG,
- and are not emitted in the new text section.
The solution is to patch the entry points in the original location.
If such a situation occurs in a binary, recompilation with the
-fpatchable-function-entry flag is required. This places a nop at every
function start, which BOLT can use to patch the original section.
Without the extra nop, BOLT cannot safely patch the original .text
section.
An alternative solution could be to also ignore the function from
which the stub starts. This has not been tried, as the LongJmp pass
(where most stubs are inserted) is currently not equipped to ignore
functions.
Testing: both the success and failure cases are covered with lit tests.
Insert new PT_LOAD segments right after the last existing PT_LOAD in the
program header table, instead of before PT_DYNAMIC or at the end. This
maintains the ascending p_vaddr order required by the ELF specification.
Previously, new segments could end up breaking the ascending PT_LOAD
p_vaddr order when PT_LOAD segments followed PT_DYNAMIC or
PT_GNU_STACK. This led to the runtime loader incorrectly assessing the
dynamic object size and silently corrupting memory.
Summary:
When a .bolt_reserved section is defined in the linker script, there is
no way to mark the containing segment executable other than via the
PHDRS command, which overrides the program headers entirely and is
impractical.
Since .bolt_reserved contains executable code, mark segment executable
in BOLT.
Test Plan: bolt-reserved.test