When rematerializing S_MOV_B64 or S_MOV_B64_IMM_PSEUDO and only a single 32-bit lane of the result is used at the remat point, emit S_MOV_B32 with the appropriate half of the 64-bit immediate instead. This reduces register pressure by defining a 32-bit register instead of a 64-bit pair when the other half is unused.