Device libs has a fast reciprocal macro that closely follows the fast division expansion but skips the final terms of the full division. The basic reciprocal handling produces output identical to this macro. The negative reciprocal case differs in fneg placement and has smaller code size, but I believe the result should be the same.
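
As a rough illustration of the relationship described above, here is a minimal numeric sketch. It is not the device libs macro or the compiler's actual expansion. All function names are hypothetical, the low-precision hardware reciprocal is approximated by rounding `1/x` to float32, and plain multiply/add stands in for fused multiply-add. It only shows the general shape: a refined reciprocal, a division that adds one more correction term on top, and the two possible fneg placements for the negative reciprocal.

```python
import struct


def approx_rcp(x: float) -> float:
    """Hypothetical stand-in for a low-precision hardware reciprocal.

    Assumption: modelled as 1/x rounded to float32; only valid while 1/x
    fits in the float32 range.
    """
    return struct.unpack("f", struct.pack("f", 1.0 / x))[0]


def fast_rcp(x: float) -> float:
    """Reciprocal-only path: initial estimate plus Newton-Raphson refinement."""
    r = approx_rcp(x)
    for _ in range(2):
        # r <- r + r * (1 - x * r); a real expansion would use fma here.
        r = r + r * (1.0 - x * r)
    return r


def fast_div(x: float, y: float) -> float:
    """Division path: same refinement, plus the final correction terms."""
    r = fast_rcp(y)
    q = x * r
    # Residual correction that the reciprocal-only path skips.
    return q + r * (x - y * q)


def neg_rcp_negate_input(x: float) -> float:
    """Negative reciprocal with the negation applied to the operand."""
    return fast_rcp(-x)


def neg_rcp_negate_result(x: float) -> float:
    """Negative reciprocal with the negation applied to the result."""
    return -fast_rcp(x)
```

In this sketch the two negative reciprocal variants agree exactly, because every step is symmetric under a sign flip of the input. That is the intuition behind expecting the different fneg placement not to change the result.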