RVV does not have an instruction for performing a horizontal multiply reduction (either integer or floating point). However, a clang user can explicitly write at least the integer form via the __builtin_reduce_mul builtin, and we currently crash when compiling it. This change converts the crash into a functionally correct scalar loop that processes the elements one at a time at runtime. This will be slow, but at least it is correct. Note that, to my knowledge, the floating-point form can't be generated directly from C, but I decided to handle both for completeness while I was here. Written by Claude Code with guidance and review by me.
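
For illustration only, the semantics of the scalar fallback for the integer case can be modeled in plain C roughly as follows. This is a sketch, not the code in this change: the patch emits the loop during lowering rather than in C, and the names reduce_mul_scalar and reduce_mul_example are hypothetical. It assumes 32-bit unsigned elements and an element count known only at runtime, as is the case for a scalable vector.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper, not part of the patch: models an integer multiply
 * reduction computed one element at a time. */
uint32_t reduce_mul_scalar(const uint32_t *elts, size_t n) {
    uint32_t acc = 1; /* multiplicative identity, also the result for n == 0 */
    for (size_t i = 0; i < n; ++i)
        acc *= elts[i]; /* unsigned multiplication wraps modulo 2^32 */
    return acc;
}

/* Hypothetical usage: 1 * 2 * 3 * 4 == 24. */
void reduce_mul_example(void) {
    const uint32_t v[4] = {1, 2, 3, 4};
    assert(reduce_mul_scalar(v, 4) == 24u);
}
```

The loop performs one multiply per element, which is why the fallback is expected to be slow compared with reductions that have a dedicated RVV instruction.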