Use concrete `I32` (the default) and `I64` (for clock64 and globaltimer) instead of the generic `LLVM_Type` for special-register op results. The dialect verifier now rejects mismatched result types up-front, and the Python op-binding generator emits the inferred-result form, so callers can write `nvvm.ThreadIdXOp()` with no arguments. This is a strict tightening: no valid existing IR is rejected.
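
To illustrate the intended behavior, here is a toy model in plain Python. It does not use the real MLIR Python bindings; `SpecialRegisterOp`, `build`, and the short register names are hypothetical stand-ins. It only shows the two effects described above: a result type that can be inferred from the op alone, and a mismatch that is rejected at construction time.

```python
# Toy model only: none of these names come from the MLIR Python bindings.
from dataclasses import dataclass
from typing import Optional

# Registers the description calls out as 64-bit; everything else is 32-bit.
I64_REGISTERS = {"clock64", "globaltimer"}


def expected_result_type(register: str) -> str:
    """Return the single result type allowed for a special-register read."""
    return "i64" if register in I64_REGISTERS else "i32"


@dataclass(frozen=True)
class SpecialRegisterOp:
    register: str      # hypothetical short name, e.g. "tid.x"
    result_type: str   # "i32" or "i64"


def build(register: str, result_type: Optional[str] = None) -> SpecialRegisterOp:
    """Build an op, inferring the result type when the caller omits it."""
    expected = expected_result_type(register)
    if result_type is None:
        # Inferred-result form: the type is fixed, so nothing needs passing.
        result_type = expected
    if result_type != expected:
        # Stand-in for the verifier rejecting a mismatch up-front.
        raise TypeError(
            f"{register}: expected result type {expected}, got {result_type}"
        )
    return SpecialRegisterOp(register, result_type)


tid = build("tid.x")                # result type inferred
clock = build("clock64")            # result type inferred
explicit = build("tid.x", "i32")    # matching explicit type is still accepted
```

In this model, `build("tid.x", "i64")` raises `TypeError`, while the explicit `i32` form keeps working, which mirrors the claim that the change only rejects IR that was already mismatched.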