April 08, 2019 —
Posted by The TensorFlow MLIR Team
The TensorFlow ecosystem contains a number of compilers and optimizers that operate at multiple levels of the software and hardware stack. For a day-to-day user of TensorFlow, this multi-level stack might manifest itself as hard-to-understand compiler and runtime errors when using different kinds of hardware (GPUs, TPUs, mobile).
These components, starting from t…
It's actually more complicated than this
In addition, there are other even more sophisticated paths, including multiple rounds of optimization within each layer, such as the Grappler framework that optimizes tensor layout and operations in TensorFlow today.
While these numerous compiler and representation implementations substantially improve performance, this heterogeneous world can cause issues for end users, such as producing confusing error messages at the boundary between these systems. Also, new hardware and software stack creators must rebuild optimization and transformation passes for each new path.
With all this in mind, we’d like to announce MLIR, or Multi-Level Intermediate Representation. This is a representation format and library of compiler utilities that sits between the model representation and low-level compilers/executors that generate hardware-specific code. With MLIR, we want to enable novel explorations in optimizing compiler design and implementation, backed by production-quality components.
We expect MLIR to be of interest to many groups, including:
MLIR is, at its heart, a flexible infrastructure for modern optimizing compilers. This means it consists of a specification for intermediate representations (IR) and a code toolkit to perform transformations on that representation. (In compiler parlance, as you move from higher-level representations to lower-level representations, these transformations can be called “lowerings”, and we’ll use that term ahead.)
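To make the idea of a lowering concrete, here is a minimal sketch in plain Python. It does not use MLIR's actual APIs or syntax: the `Op` class, the `hl.add_n` and `ll.add` operation names, and the `lower_add_n` function are all hypothetical and exist only for illustration. The sketch rewrites one higher-level operation (an n-ary add) into a chain of lower-level binary adds and passes everything else through unchanged.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Op:
    """A toy operation; this is NOT MLIR's real data model."""
    name: str        # hypothetical op name, e.g. "hl.add_n" or "ll.add"
    operands: tuple  # names of the input values
    result: str      # name of the value this op produces


def lower_add_n(ops):
    """Lower each hypothetical 'hl.add_n' into a chain of binary 'll.add' ops.

    Assumes every 'hl.add_n' has at least two operands.
    """
    lowered = []
    for op in ops:
        if op.name != "hl.add_n":
            lowered.append(op)  # not handled by this lowering; keep as-is
            continue
        acc = op.operands[0]
        last_index = len(op.operands) - 1
        for i, operand in enumerate(op.operands[1:], start=1):
            # The final add writes the original result name; earlier adds
            # write temporaries derived from it.
            out = op.result if i == last_index else f"{op.result}_tmp{i}"
            lowered.append(Op("ll.add", (acc, operand), out))
            acc = out
    return lowered


program = [Op("hl.add_n", ("a", "b", "c"), "sum")]
lowered = lower_add_n(program)
for op in lowered:
    print(op)
```

The point of the sketch is only the shape of the transformation: one operation at a higher level of abstraction becomes several operations at a lower level, while the value it produces keeps the same name for later users.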
MLIR is highly influenced by LLVM and unabashedly reuses many great ideas from it. It has a flexible type system, and allows representing, analyzing and transforming graphs combining multiple levels of abstraction in the same compilation unit. These abstractions include TensorFlow operations, nested polyhedral loop regions, and even LLVM instructions and fixed hardware operations and types.
If you want to connect a new low-level compiler, you would create a new dialect and the lowerings between the TensorFlow Graph dialect and your dialect. This smooths the path for hardware and compiler makers. You can even target dialects at different levels in the same model; the higher-level optimizers will respect the unfamiliar parts of the IR and wait for a lower level to handle it.
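The way a higher-level optimizer can coexist with operations it does not understand can be sketched as follows. Again, this is a toy model in plain Python and not MLIR's API: the `Op` tuple, the `myhw` dialect, the `fused_conv` operation, and the `fold_tf_identity` pass are assumptions made up for this example. The pass simplifies operations from the dialect it knows and keeps operations from the unfamiliar dialect in place for a later, lower-level stage.

```python
from typing import NamedTuple


class Op(NamedTuple):
    """A toy operation tagged with a dialect; NOT MLIR's real data model."""
    dialect: str     # hypothetical dialect tag, e.g. "tf" or "myhw"
    name: str
    operands: tuple  # names of the input values
    result: str      # name of the value this op produces


def fold_tf_identity(ops):
    """Drop 'identity' ops in the 'tf' dialect and forward their input.

    Ops from any other dialect are never removed or reinterpreted; only
    their operand names are updated to point at the forwarded value.
    """
    forwarded = {}  # maps a removed result name to the value replacing it
    kept = []
    for op in ops:
        operands = tuple(forwarded.get(v, v) for v in op.operands)
        if op.dialect == "tf" and op.name == "identity":
            forwarded[op.result] = operands[0]
            continue
        kept.append(op._replace(operands=operands))
    return kept


module = [
    Op("tf", "identity", ("x",), "x1"),
    Op("myhw", "fused_conv", ("x1", "w"), "y"),  # unfamiliar to this pass
    Op("tf", "relu", ("y",), "z"),
]
optimized = fold_tf_identity(module)
for op in optimized:
    print(op)
```

In this sketch the `myhw` operation survives the pass untouched apart from its renamed input, which is the behavior the paragraph above describes: the higher-level pass does its own work and leaves the rest for a lowering that understands it.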
For compiler researchers and framework makers, MLIR allows you to compose transformations at every level, and you can even define your own operations and abstractions in the IR — allowing you to best model the domain of problems you are trying to solve. In this way, MLIR is more of a pure compiler infrastructure than LLVM.
While MLIR acts as a compiler for ML, we also see it enabling the use of machine learning techniques within compilers! This is particularly important because the number of engineers developing numerical libraries does not scale at the same rate as the diversification of ML models or hardware. The extensibility of MLIR makes it easier to explore code lowering strategies and to lower progressively across abstractions.