Roman-Pevnyi 9de84638b9 Extending UniformQuantizedType with interface-based support for new storage types in Quant dialect (#152966)
Currently, UniformQuantizedType only supports built-in MLIR storage
types such as Integer. Recent LLM quantization research has introduced
NF4 as a low-precision data type (see
https://arxiv.org/pdf/2305.14314). As more such types are added, there
is a growing need to make the system extensible and maintainable.
Ensuring that MLIR can natively support NF4 through a clean, extensible
interface is essential for both current and future quantization workflows.

**Current Approach and Its Limitations:**

- The present implementation relies on dynamic checks (e.g., type
switches or if-else chains) to determine the storage type and retrieve
type-specific information for legality checks.

- This approach works for a small, fixed set of types, but as the number
of supported types grows, the code becomes harder to read, maintain, and
extend.
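
To illustrate the limitation, here is a minimal standalone sketch of a switch-based legality check. It does not use MLIR; `StorageKind`, `StorageRange`, `defaultRangeFor`, and `isLegalRangeChained` are hypothetical names invented for this example, and the check itself (requested range must fit in the type's default range) is an assumed simplification of what the real code verifies. The float8 bounds are the largest finite values of each format.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical closed set of storage kinds (not MLIR code).
enum class StorageKind { SignedInt8, Float8E5M2, Float8E4M3FN };

struct StorageRange {
  std::int64_t min;
  std::int64_t max;
};

// Every supported kind needs its own case here. Adding a new kind such as
// NF4 means editing this function and every other switch like it.
StorageRange defaultRangeFor(StorageKind kind) {
  switch (kind) {
  case StorageKind::SignedInt8:
    return {-128, 127};
  case StorageKind::Float8E5M2:
    return {-57344, 57344}; // largest finite E5M2 value
  case StorageKind::Float8E4M3FN:
    return {-448, 448}; // largest finite E4M3FN value
  }
  return {0, 0}; // not reached for valid enumerators
}

// Assumed legality rule: the requested range must lie inside the default one.
bool isLegalRangeChained(StorageKind kind, std::int64_t lo, std::int64_t hi) {
  StorageRange range = defaultRangeFor(kind);
  assert(range.min <= range.max && "table entry must be a valid range");
  return lo <= hi && lo >= range.min && hi <= range.max;
}
```

The core check and the per-type data live in the same place, so the set of supported types cannot grow without touching shared code.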

**Proposed Interface-Based Approach:**

- Define a StorageTypeInterface that specifies the required methods any
storage type must implement to be used in UniformQuantizedType.
- Each storage type (Integer, Float8E5M2, Float8E4M3FN, and new types
like NF4) would implement this interface, encapsulating its own
type-specific logic.
- When UniformQuantizedType needs to check legality or retrieve
information, it can use MLIR’s dyn_cast mechanism to check if the type
implements the interface and then call the required methods.
- This design decouples UniformQuantizedType from the specifics of each
storage type, making it easy to add new types (such as NF4) without
modifying the core logic or introducing more type checks.
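
The shape of this design can be sketched in plain C++, outside of MLIR. This is only an illustration under stated assumptions: the method names (`storageWidth`, `defaultMinimum`, `defaultMaximum`), the `Type`, `SignedIntegerType`, `NF4Type`, and `OpaqueType` classes, and the `isLegalStorageRange` helper are all hypothetical, standard `dynamic_cast` stands in for MLIR's `dyn_cast`, and NF4 is modeled simply as 16 code-book indices.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the proposed StorageTypeInterface.
struct StorageTypeInterface {
  virtual ~StorageTypeInterface() = default;
  virtual unsigned storageWidth() const = 0;
  virtual std::int64_t defaultMinimum() const = 0;
  virtual std::int64_t defaultMaximum() const = 0;
};

// Generic type base; not every type is usable as quantized storage.
struct Type {
  virtual ~Type() = default;
};

struct SignedIntegerType : Type, StorageTypeInterface {
  explicit SignedIntegerType(unsigned bits) : width(bits) {
    assert(bits >= 1 && bits <= 63 && "sketch supports 1..63 bits");
  }
  unsigned storageWidth() const override { return width; }
  std::int64_t defaultMinimum() const override {
    return -(std::int64_t{1} << (width - 1));
  }
  std::int64_t defaultMaximum() const override {
    return (std::int64_t{1} << (width - 1)) - 1;
  }
  unsigned width;
};

// Assumption: NF4 storage values are treated as indices 0..15 into a table.
struct NF4Type : Type, StorageTypeInterface {
  unsigned storageWidth() const override { return 4; }
  std::int64_t defaultMinimum() const override { return 0; }
  std::int64_t defaultMaximum() const override { return 15; }
};

// A type that does not implement the interface.
struct OpaqueType : Type {};

// Core check: no knowledge of concrete storage types, only of the interface.
bool isLegalStorageRange(const Type &type, std::int64_t lo, std::int64_t hi) {
  const auto *storage = dynamic_cast<const StorageTypeInterface *>(&type);
  if (!storage)
    return false; // type cannot be used as storage at all
  return lo <= hi && lo >= storage->defaultMinimum() &&
         hi <= storage->defaultMaximum();
}
```

In this sketch, supporting `NF4Type` required no change to `isLegalStorageRange`, which is the decoupling the proposal aims for.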

**Benefits:**

- Extensibility: New storage types can be added by simply implementing
the interface, without touching the core UniformQuantizedType logic.
- Readability: The code is cleaner, as it avoids large switch statements
or if-else chains.
- Maintainability: Type-specific logic is encapsulated within each type,
reducing the risk of errors and making the codebase easier to understand
and update.
2026-02-10 09:56:24 +00:00