Joseph Huber 7994ff27b9 [AMDGPU] Add clang builtin for generic AMDGPU shuffle (#185302)
Summary:
AMDGPU introduced a high-level intrinsic for shuffles. The main
advantage of this over the ds_bpermute path is that it is correctly
lowered for wave32 / wave64 and doesn't require scaling the lane index
into a four-byte offset. This PR adds '__builtin_amdgcn_wave_shuffle'
to access it.
2026-04-20 08:33:56 -05:00
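
The "four byte offset" point in the summary can be illustrated with a small host-side model. This is only a sketch that runs on the CPU and does not call the real builtins. The names `Wave`, `model_bpermute` and `model_wave_shuffle` are hypothetical, and the wrap-around for out-of-range lanes is an assumption made to keep the model total, not a statement about hardware behaviour.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical host-side model of one wavefront: one 32-bit value per lane.
template <std::size_t WaveSize>
using Wave = std::array<std::uint32_t, WaveSize>;

// ds_bpermute-style addressing: every lane supplies a BYTE address, so the
// caller has to multiply the source lane number by 4 first.
template <std::size_t WaveSize>
Wave<WaveSize> model_bpermute(const Wave<WaveSize>& byte_addr,
                              const Wave<WaveSize>& src) {
    Wave<WaveSize> out{};
    for (std::size_t lane = 0; lane < WaveSize; ++lane) {
        // Model assumption: addresses are dword aligned.
        assert(byte_addr[lane] % 4 == 0);
        // Model assumption: out-of-range lanes wrap around.
        out[lane] = src[(byte_addr[lane] / 4) % WaveSize];
    }
    return out;
}

// Shuffle-style addressing: every lane supplies a plain lane index,
// with no scaling by the element size.
template <std::size_t WaveSize>
Wave<WaveSize> model_wave_shuffle(const Wave<WaveSize>& src,
                                  const Wave<WaveSize>& index) {
    Wave<WaveSize> out{};
    for (std::size_t lane = 0; lane < WaveSize; ++lane) {
        // Model assumption: out-of-range lanes wrap around.
        out[lane] = src[index[lane] % WaveSize];
    }
    return out;
}
```

In this model the two functions return the same result whenever `byte_addr[lane] == index[lane] * 4`; the shuffle form simply removes that scaling step from the caller. The wave32 / wave64 lowering difference mentioned in the summary is not modelled here.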