In FoldReinterpretLoadFromConst, ReadDataFromGlobal bails out when
BytesLoaded exceeds 32 bytes. This prevents folding in our downstream
OpenCL case, where a global constant of type [16 x float] is loaded as a
float16 vector, which is 64 bytes.
This PR raises the BytesLoaded cap to 128 bytes to accommodate large
vectors, e.g. the double16 type in OpenCL. For scalar integer loads,
the limit remains 32 bytes to avoid regressions on loads from string
literals.
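
A minimal sketch of the size rule described above, for illustration only.
The helper name isFoldableLoadSize and the IsScalarIntLoad flag are
hypothetical and do not appear in the patch; the actual check in
FoldReinterpretLoadFromConst may be structured differently.

```cpp
#include <cstddef>

// Hypothetical helper (not LLVM code): returns true when a load of
// BytesLoaded bytes is within the cap described in this PR.
constexpr bool isFoldableLoadSize(std::size_t BytesLoaded,
                                  bool IsScalarIntLoad) {
  // Scalar integer loads keep the old 32-byte cap; other loads,
  // such as large vectors, may be up to 128 bytes.
  const std::size_t MaxBytes = IsScalarIntLoad ? 32 : 128;
  return BytesLoaded <= MaxBytes;
}

// float16: 16 elements * 4 bytes = 64 bytes, now within the cap.
static_assert(isFoldableLoadSize(16 * 4, /*IsScalarIntLoad=*/false),
              "64-byte vector load should be foldable");
// double16: 16 elements * 8 bytes = 128 bytes, exactly at the cap.
static_assert(isFoldableLoadSize(16 * 8, /*IsScalarIntLoad=*/false),
              "128-byte vector load should be foldable");
// A 64-byte scalar integer load is still rejected.
static_assert(!isFoldableLoadSize(64, /*IsScalarIntLoad=*/true),
              "scalar integer loads stay capped at 32 bytes");
```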
---------
Co-authored-by: Nikita Popov <github@npopov.com>