Wenju He 7b94b9ae13 [libclc] Refine generic __clc_get_sub_group_size with fast full sub-group path (#188895)
Add a fast path for the common case where the total work-group size is
a multiple of the maximum sub-group size.

The fallback path is ported from amdgpu/workitem/clc_get_sub_group_size.cl.

The compiler can generate predicated instructions for the fallback path
to avoid branches.
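
For illustration only, the two paths described above could look roughly like
the following plain C sketch. It is not the libclc source: the function name
and parameters are hypothetical, and it assumes work-items are assigned to
sub-groups in linear local id order, that max_sg_size is nonzero, and that
lid is less than total_wg_size.

```c
#include <stdint.h>

/* Hypothetical helper, not the libclc implementation.
 * Returns the size of the sub-group containing the work-item whose
 * linear local id is `lid`.
 * Assumptions: max_sg_size > 0 and lid < total_wg_size. */
static uint32_t sub_group_size(uint32_t total_wg_size,
                               uint32_t max_sg_size,
                               uint32_t lid) {
    /* Fast path: the work-group divides evenly, so every sub-group is full. */
    if (total_wg_size % max_sg_size == 0)
        return max_sg_size;

    /* Fallback: only the last sub-group can be partial.
     * sg_base is the linear id of the first work-item in this sub-group. */
    uint32_t sg_base = lid - lid % max_sg_size;
    uint32_t remaining = total_wg_size - sg_base;

    /* A simple select like this is the kind of code a compiler may lower
     * to predicated instructions instead of a branch. */
    return remaining < max_sg_size ? remaining : max_sg_size;
}
```

With a work-group of 70 work-items and a maximum sub-group size of 32, this
sketch yields 32 for work-items in the first two sub-groups and 6 for those
in the last one.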
2026-04-13 08:16:06 +08:00