
MLIR Language Reference
MLIR (Multi-Level IR) is a compiler intermediate representation with similarities to traditional three-address SSA representations (like LLVM IR or SIL), but which introduces notions from polyhedral loop optimization as first-class concepts. This hybrid design is optimized to represent, analyze, and transform high level dataflow graphs as well as target-specific code generated for high performance data parallel systems. Beyond its representational capabilities, its single continuous design provides a framework to lower from dataflow graphs to high-performance target-specific code.
This document defines and describes the key concepts in MLIR, and is intended to be a dry reference document - the rationale documentation, glossary, and other content are hosted elsewhere.
MLIR is designed to be used in three different forms: a human-readable textual form suitable for debugging, an in-memory form suitable for programmatic transformations and analysis, and a compact serialized form suitable for storage and transport. The different forms all describe the same semantic content. This document describes the human-readable textual form.
High-Level Structure
MLIR is fundamentally based on a graph-like data structure of nodes, called Operations, and edges, called Values. Each Value is the result of exactly one Operation or Block Argument, and has a Value Type defined by the type system. Operations are contained in Blocks and Blocks are contained in Regions. Operations are also ordered within their containing block and Blocks are ordered in their containing region, although this order may or may not be semantically meaningful in a given kind of region. Operations may also contain regions, enabling hierarchical structures to be represented.
Operations can represent many different concepts, from higher-level concepts like function definitions, function calls, buffer allocations, views or slices of buffers, and process creation, to lower-level concepts like target-independent arithmetic, target-specific instructions, configuration registers, and logic gates. These different concepts are represented by different operations in MLIR and the set of operations usable in MLIR can be arbitrarily extended.
MLIR also provides an extensible framework for transformations on operations, using familiar concepts of compiler Passes. Enabling an arbitrary set of passes on an arbitrary set of operations results in a significant scaling challenge, since each transformation must potentially take into account the semantics of any operation. MLIR addresses this complexity by allowing operation semantics to be described abstractly using Traits and Interfaces, enabling transformations to operate on operations more generically. Traits often describe verification constraints on valid IR, enabling complex invariants to be captured and checked. (see Op vs Operation)
One obvious application of MLIR is to represent an SSA-based IR, like the LLVM core IR, with appropriate choice of operation types to define Modules, Functions, Branches, Memory Allocation, and verification constraints to ensure the SSA Dominance property. MLIR includes a collection of dialects which defines just such structures. However, MLIR is intended to be general enough to represent other compiler-like data structures, such as Abstract Syntax Trees in a language frontend, generated instructions in a target-specific backend, or circuits in a High-Level Synthesis tool.
Here’s an example of an MLIR module:
// Compute A*B using an implementation of multiply kernel and print the
// result using a TensorFlow op. The dimensions of A and B are partially
// known. The shapes are assumed to match.
func.func @mul(%A: tensor<100x?xf32>, %B: tensor<?x50xf32>) -> (tensor<100x50xf32>) {
  // Compute the inner dimension of %A using the dim operation.
  %n = memref.dim %A, 1 : tensor<100x?xf32>

  // Allocate addressable "buffers" and copy tensors %A and %B into them.
  %A_m = memref.alloc(%n) : memref<100x?xf32>
  memref.tensor_store %A to %A_m : memref<100x?xf32>

  %B_m = memref.alloc(%n) : memref<?x50xf32>
  memref.tensor_store %B to %B_m : memref<?x50xf32>

  // Call function @multiply passing memrefs as arguments,
  // and getting returned the result of the multiplication.
  %C_m = call @multiply(%A_m, %B_m)
          : (memref<100x?xf32>, memref<?x50xf32>) -> (memref<100x50xf32>)

  memref.dealloc %A_m : memref<100x?xf32>
  memref.dealloc %B_m : memref<?x50xf32>

  // Load the buffer data into a higher level "tensor" value.
  %C = memref.tensor_load %C_m : memref<100x50xf32>
  memref.dealloc %C_m : memref<100x50xf32>

  // Call TensorFlow built-in function to print the result tensor.
  "tf.Print"(%C){message: "mul result"} : (tensor<100x50xf32>) -> (tensor<100x50xf32>)

  return %C : tensor<100x50xf32>
}

// A function that multiplies two memrefs and returns the result.
func.func @multiply(%A: memref<100x?xf32>, %B: memref<?x50xf32>)
          -> (memref<100x50xf32>) {
  // Compute the inner dimension of %A.
  %n = memref.dim %A, 1 : memref<100x?xf32>

  // Allocate memory for the multiplication result.
  %C = memref.alloc() : memref<100x50xf32>

  // Multiplication loop nest.
  affine.for %i = 0 to 100 {
    affine.for %j = 0 to 50 {
      memref.store 0 to %C[%i, %j] : memref<100x50xf32>
      affine.for %k = 0 to %n {
        %a_v  = memref.load %A[%i, %k] : memref<100x?xf32>
        %b_v  = memref.load %B[%k, %j] : memref<?x50xf32>
        %prod = arith.mulf %a_v, %b_v : f32
        %c_v  = memref.load %C[%i, %j] : memref<100x50xf32>
        %sum  = arith.addf %c_v, %prod : f32
        memref.store %sum, %C[%i, %j] : memref<100x50xf32>
      }
    }
  }
  return %C : memref<100x50xf32>
}
MLIR has a simple and unambiguous grammar, allowing it to reliably round-trip through a textual form. This is important for development of the compiler - e.g. for understanding the state of code as it is being transformed and writing test cases.
This document describes the grammar using Extended Backus-Naur Form (EBNF).
This is the EBNF grammar used in this document, presented in yellow boxes.
alternation ::= expr0 | expr1 | expr2  // Either expr0 or expr1 or expr2.
sequence    ::= expr0 expr1 expr2      // Sequence of expr0 expr1 expr2.
repetition0 ::= expr*                  // 0 or more occurrences.
repetition1 ::= expr+                  // 1 or more occurrences.
optionality ::= expr?                  // 0 or 1 occurrence.
grouping    ::= (expr)                 // Everything inside parens is grouped together.
literal     ::= `abcd`                 // Matches the literal `abcd`.
Code examples are presented in blue boxes.
// This is an example use of the grammar above:
// This matches things like: ba, bana, boma, banana, banoma, bomana...
example ::= `b` (`an` | `om`)* `a`
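As a rough cross-check of the example production, it can be transcribed into a regular expression. The sketch below uses Python's `re` module; the names `EXAMPLE` and `matches_example` are chosen here for illustration and are not part of MLIR.

```python
import re

# Transcription of: example ::= `b` (`an` | `om`)* `a`
EXAMPLE = re.compile(r"b(?:an|om)*a")


def matches_example(text: str) -> bool:
    """Return True if the whole string can be derived from `example`."""
    return EXAMPLE.fullmatch(text) is not None
```

Each EBNF operator used in the production (`*`, `|`, grouping, literals) has a direct regex counterpart, which is why a one-line pattern suffices here.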
Common syntax
The following core grammar productions are used in this document:
// TODO: Clarify the split between lexing (tokens) and parsing (grammar).
digit     ::= [0-9]
hex_digit ::= [0-9a-fA-F]
letter    ::= [a-zA-Z]
id-punct  ::= [$._-]

integer-literal     ::= decimal-literal | hexadecimal-literal
decimal-literal     ::= digit+
hexadecimal-literal ::= `0x` hex_digit+
float-literal       ::= [-+]?[0-9]+[.][0-9]*([eE][-+]?[0-9]+)?
string-literal      ::= `"` [^"\n\f\v\r]* `"`   TODO: define escaping rules
Not listed here, but MLIR does support comments. They use standard BCPL syntax, starting with a // and going until the end of the line.
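The literal productions above can be tried out with a small classifier. This is a minimal sketch that transcribes the productions into Python regular expressions; it is not MLIR's lexer, the function name `classify_literal` is hypothetical, and string escaping is ignored because the grammar leaves it undefined.

```python
import re

# Regex transcriptions of the literal productions (illustrative only).
INTEGER_LITERAL = re.compile(r"(?:0x[0-9a-fA-F]+|[0-9]+)")
FLOAT_LITERAL = re.compile(r"[-+]?[0-9]+[.][0-9]*(?:[eE][-+]?[0-9]+)?")
STRING_LITERAL = re.compile(r'"[^"\n\f\v\r]*"')


def classify_literal(text: str) -> str:
    """Name the literal production that matches the whole of `text`."""
    if INTEGER_LITERAL.fullmatch(text):
        return "integer"
    if FLOAT_LITERAL.fullmatch(text):
        return "float"
    if STRING_LITERAL.fullmatch(text):
        return "string"
    return "unknown"
```

Note that, per the grammar, a float literal needs at least one digit before the decimal point, so `.5` is rejected while `3.` is accepted.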
Top level Productions
// Top level production
toplevel ::= (operation | attribute-alias-def | type-alias-def)*
The production toplevel is the top-level production parsed by any parser consuming the MLIR syntax. Operations, attribute aliases, and type aliases can be declared at the top level.
Identifiers and keywords
// Identifiers
bare-id ::= (letter|[_]) (letter|digit|[_$.])*
bare-id-list ::= bare-id (`,` bare-id)*
value-id ::= `%` suffix-id
alias-name ::= bare-id
suffix-id ::= (digit+ | ((letter|id-punct) (letter|id-punct|digit)*))

symbol-ref-id ::= `@` (suffix-id | string-literal) (`::` symbol-ref-id)?
value-id-list ::= value-id (`,` value-id)*

// Uses of value, e.g. in an operand list to an operation.
value-use ::= value-id
value-use-list ::= value-use (`,` value-use)*
Identifiers name entities such as values, types and functions, and are chosen by the writer of MLIR code. Identifiers may be descriptive (e.g. %batch_size, @matmul), or may be non-descriptive when they are auto-generated (e.g. %23, @func42). Identifier names for values may be used in an MLIR text file but are not persisted as part of the IR - the printer will give them anonymous names like %42.
MLIR guarantees identifiers never collide with keywords by prefixing identifiers with a sigil (e.g. %, #, @, ^, !). In certain unambiguous contexts (e.g. affine expressions), identifiers are not prefixed, for brevity. New keywords may be added to future versions of MLIR without danger of collision with existing identifiers.
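The identifier productions and the sigil convention can be illustrated with a short Python sketch. The regexes transcribe `bare-id`, `suffix-id`, and `value-id` from the grammar above; the helper `sigil_kind` and its category names are assumptions made here for illustration, not MLIR API.

```python
import re

# suffix-id ::= (digit+ | ((letter|id-punct) (letter|id-punct|digit)*))
SUFFIX_ID = r"(?:[0-9]+|[a-zA-Z$._-][a-zA-Z$._0-9-]*)"
# bare-id ::= (letter|[_]) (letter|digit|[_$.])*
BARE_ID = re.compile(r"[a-zA-Z_][a-zA-Z0-9_$.]*")
# value-id ::= `%` suffix-id
VALUE_ID = re.compile(rf"%{SUFFIX_ID}")

# Hypothetical mapping from sigil to the kind of entity it introduces.
SIGILS = {"%": "value", "@": "symbol", "^": "block", "!": "type", "#": "attribute"}


def sigil_kind(identifier: str) -> str:
    """Classify an identifier by its leading sigil; unprefixed names are 'bare'."""
    return SIGILS.get(identifier[:1], "bare")
```

Because every user-chosen name carries a sigil, a bare word such as `for` can only ever be a keyword or part of an operation name, which is what allows new keywords to be added safely.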
Value identifiers are only in scope for the (nested) region in which they are defined and cannot be accessed or referenced outside of that region. Argument identifiers in mapping functions are in scope for the mapping body. Particular operations may further limit which identifiers are in scope in their regions. For instance, the scope of values in a region with SSA control flow semantics is constrained according to the standard definition of SSA dominance. Another example is the IsolatedFromAbove trait, which restricts directly accessing values defined in containing regions.
Function identifiers and mapping identifiers are associated with Symbols and have scoping rules dependent on symbol attributes.
Dialects
Dialects are the mechanism by which to engage with and extend the MLIR ecosystem. They allow for defining new operations, as well as attributes and types. Each dialect is given a unique namespace that is prefixed to each defined attribute/operation/type. For example, the Affine dialect defines the namespace: affine.
MLIR allows for multiple dialects, even those outside of the main tree, to co-exist together within one module. Dialects are produced and consumed by certain passes. MLIR provides a framework to convert between, and within, different dialects.
A few of the dialects supported by MLIR:
Target specific operations
Dialects provide a modular way in which targets can expose target-specific operations directly through to MLIR. As an example, some targets go through LLVM. LLVM has a rich set of intrinsics for certain target-independent operations (e.g. addition with overflow check) as well as providing access to target-specific operations for the targets it supports (e.g. vector permutation operations). LLVM intrinsics in MLIR are represented via operations that start with an “llvm.” name.
// LLVM: %x = call {i16, i1} @llvm.sadd.with.overflow.i16(i16 %a, i16 %b)
%x:2 = "llvm.sadd.with.overflow.i16"(%a, %b) : (i16, i16) -> (i16, i1)
These operations only work when targeting LLVM as a backend (e.g. for CPUs and GPUs), and are required to align with the LLVM definition of these intrinsics.
Operations
operation            ::= op-result-list? (generic-operation | custom-operation)
                         trailing-location?
generic-operation    ::= string-literal `(` value-use-list? `)` successor-list?
                         region-list? dictionary-attribute? `:` function-type
custom-operation     ::= bare-id custom-operation-format
op-result-list       ::= op-result (`,` op-result)* `=`
op-result            ::= value-id (`:` integer-literal)?
successor-list       ::= `[` successor (`,` successor)* `]`
successor            ::= caret-id (`:` block-arg-list)?
region-list          ::= `(` region (`,` region)* `)`
dictionary-attribute ::= `{` (attribute-entry (`,` attribute-entry)*)? `}`
trailing-location    ::= (`loc` `(` location `)`)?
MLIR introduces a uniform concept called operations to enable describing many different levels of abstractions and computations. Operations in MLIR are fully extensible (there is no fixed list of operations) and have application-specific semantics. For example, MLIR supports target-independent operations, affine operations, and target-specific machine operations.
The internal representation of an operation is simple: an operation is identified by a unique string (e.g. dim, tf.Conv2d, x86.repmovsb, ppc.eieio, etc), can return zero or more results, take zero or more operands, has a dictionary of attributes, has zero or more successors, and zero or more enclosed regions. The generic printing form includes all these elements literally, with a function type to indicate the types of the results and operands.
// An operation that produces two results.
// The results of %result can be accessed via the <name> `#` <opNo> syntax.
%result:2 = "foo_div"() : () -> (f32, i32)

// Pretty form that defines a unique name for each result.
%foo, %bar = "foo_div"() : () -> (f32, i32)

// Invoke a TensorFlow function called tf.scramble with two inputs
// and an attribute "fruit".
%2 = "tf.scramble"(%result#0, %bar) {fruit = "banana"} : (f32, i32) -> f32
In addition to the basic syntax above, dialects may register known operations. This allows those dialects to support custom assembly form for parsing and printing operations. In the operation sets listed below, we show both forms.
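To make the generic form concrete, the following Python sketch models an operation as a plain record and prints it in a shape that follows the `generic-operation` production. The `Operation` class and its field names are hypothetical and unrelated to MLIR's C++ or Python APIs; successors, regions, and locations are omitted for brevity.

```python
from dataclasses import dataclass, field


@dataclass
class Operation:
    """Toy model of an operation: name, results, operands, attributes, types."""
    name: str
    results: list = field(default_factory=list)        # e.g. ["%2"]
    operands: list = field(default_factory=list)       # e.g. ["%a", "%b"]
    operand_types: list = field(default_factory=list)  # e.g. ["f32", "i32"]
    result_types: list = field(default_factory=list)   # e.g. ["f32"]
    attributes: dict = field(default_factory=dict)     # name -> printed value

    def generic_form(self) -> str:
        lhs = f"{', '.join(self.results)} = " if self.results else ""
        attrs = ""
        if self.attributes:
            entries = ", ".join(f"{k} = {v}" for k, v in self.attributes.items())
            attrs = f" {{{entries}}}"
        ins = f"({', '.join(self.operand_types)})"
        # A single result type is printed bare; zero or several are parenthesized.
        if len(self.result_types) == 1:
            outs = self.result_types[0]
        else:
            outs = f"({', '.join(self.result_types)})"
        return f'{lhs}"{self.name}"({", ".join(self.operands)}){attrs} : {ins} -> {outs}'
```

Building the `tf.scramble` operation from the example above with this record and calling `generic_form()` reproduces the same textual line, which illustrates that the generic form is a literal dump of the operation's components.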
Builtin Operations
The builtin dialect defines a select few operations that are widely applicable by MLIR dialects, such as a universal conversion cast operation that simplifies inter/intra dialect conversion. This dialect also defines a top-level module operation, that represents a useful IR container.
Blocks
block           ::= block-label operation+
block-label     ::= block-id block-arg-list? `:`
block-id        ::= caret-id
caret-id        ::= `^` suffix-id
value-id-and-type ::= value-id `:` type

// Non-empty list of names and types.
value-id-and-type-list ::= value-id-and-type (`,` value-id-and-type)*

block-arg-list ::= `(` value-id-and-type-list? `)`
A Block is a list of operations. In SSACFG regions, each block represents a compiler basic block where instructions inside the block are executed in order and terminator operations implement control flow branches between basic blocks.
The last operation in a block must be a terminator operation. A region with a single block may opt out of this requirement by attaching the NoTerminator trait to the enclosing op. The top-level ModuleOp is an example of such an operation: it defines this trait, and its block body does not have a terminator.
Blocks in MLIR take a list of block arguments, notated in a function-like way. Block arguments are bound to values specified by the semantics of individual operations. Block arguments of the entry block of a region are also arguments to the region and the values bound to these arguments are determined by the semantics of the containing operation. Block arguments of other blocks are determined by the semantics of terminator operations, e.g. Branches, which have the block as a successor. In regions with control flow, MLIR leverages this structure to implicitly represent the passage of control-flow dependent values without the complex nuances of PHI nodes in traditional SSA representations. Note that values which are not control-flow dependent can be referenced directly and do not need to be passed through block arguments.
Here is a simple example function showing branches, returns, and block arguments:
func.func @simple(i64, i1) -> i64 {
^bb0(%a: i64, %cond: i1): // Code dominated by ^bb0 may refer to %a
  cf.cond_br %cond, ^bb1, ^bb2

^bb1:
  cf.br ^bb3(%a: i64)    // Branch passes %a as the argument

^bb2:
  %b = arith.addi %a, %a : i64
  cf.br ^bb3(%b: i64)    // Branch passes %b as the argument

// ^bb3 receives an argument, named %c, from predecessors
// and passes it on to bb4 along with %a. %a is referenced
// directly from its defining operation and is not passed through
// an argument of ^bb3.
^bb3(%c: i64):
  cf.br ^bb4(%c, %a : i64, i64)

^bb4(%d : i64, %e : i64):
  %0 = arith.addi %d, %e : i64
  return %0 : i64   // Return is also a terminator.
}
Context: The “block argument” representation eliminates a number of special cases from the IR compared to traditional “PHI nodes are operations” SSA IRs (like LLVM). For example, the parallel copy semantics of SSA is immediately apparent, and function arguments are no longer a special case: they become arguments to the entry block. Blocks are also a fundamental concept that cannot be represented by operations because values defined in an operation cannot be accessed outside the operation.
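The way branches bind block arguments can be mimicked with a tiny interpreter. The Python sketch below hand-translates a function with the same shape as @simple: each block is a function whose parameters are its block arguments, and each terminator returns the successor label together with the values passed to it. The encoding (labels as strings, `None` for return, an `env` dictionary for values referenced by dominance) is an assumption made for illustration.

```python
def run_simple(a: int, cond: bool) -> int:
    """Interpret a hand-translated @simple; block arguments are bound per branch."""
    env = {}  # values referenced directly because their definition dominates the use

    def bb0(a_, cond_):
        env["%a"] = a_
        return ("bb1" if cond_ else "bb2"), ()   # conditional branch

    def bb1():
        return "bb3", (env["%a"],)               # branch passes %a

    def bb2():
        b = env["%a"] + env["%a"]
        return "bb3", (b,)                       # branch passes %b

    def bb3(c):
        return "bb4", (c, env["%a"])             # %a is used directly, not via an argument

    def bb4(d, e):
        return None, (d + e,)                    # return terminator

    blocks = {"bb0": bb0, "bb1": bb1, "bb2": bb2, "bb3": bb3, "bb4": bb4}
    label, args = "bb0", (a, cond)
    while label is not None:
        label, args = blocks[label](*args)
    return args[0]
```

With `a = 5`, the true path yields `5 + 5 = 10` and the false path yields `10 + 5 = 15`; no PHI-like merge step is needed because each predecessor supplies the value for `%c` when it branches.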
Regions
A region is an ordered list of MLIR Blocks. The semantics within a region is not imposed by the IR. Instead, the containing operation defines the semantics of the regions it contains. MLIR currently defines two kinds of regions: SSACFG regions, which describe control flow between blocks, and Graph regions, which do not require control flow between blocks. The kinds of regions within an operation are described using the RegionKindInterface.
Regions do not have a name or an address, only the blocks contained in a region do. Regions must be contained within operations and have no type or attributes. The first block in the region is a special block called the ‘entry block’. The arguments to the entry block are also the arguments of the region itself. The entry block cannot be listed as a successor of any other block. The syntax for a region is as follows:
region      ::= `{` entry-block? block* `}`
entry-block ::= operation+
A function body is an example of a region: it consists of a CFG of blocks and has additional semantic restrictions that other types of regions may not have. For example, in a function body, block terminators must either branch to a different block, or return from a function where the types of the return arguments must match the result types of the function signature. Similarly, the function arguments must match the types and count of the region arguments. In general, operations with regions can define these correspondences arbitrarily.
An entry block is a block with no label and no arguments that may occur at the beginning of a region. It enables a common pattern of using a region to open a new scope.
Value Scoping
Regions provide hierarchical encapsulation of programs: it is impossible to reference, i.e. branch to, a block which is not in the same region as the source of the reference, i.e. a terminator operation. Similarly, regions provide a natural scoping for value visibility: values defined in a region don’t escape to the enclosing region, if any. By default, operations inside a region can reference values defined outside of the region whenever it would have been legal for operands of the enclosing operation to reference those values, but this can be restricted using traits, such as OpTrait::IsolatedFromAbove, or a custom verifier.
"any_op"(%a) ({ // if %a is in-scope in the containing region...
  // then %a is in-scope here too.
  %new_value = "another_op"(%a) : (i64) -> (i64)
}) : (i64) -> (i64)
MLIR defines a generalized ‘hierarchical dominance’ concept that operates across hierarchy and defines whether a value is ‘in scope’ and can be used by a particular operation. Whether a value can be used by another operation in the same region is defined by the kind of region. A value defined in a region can be used by an operation which has a parent in the same region, if and only if the parent could use the value. A value defined by an argument to a region can always be used by any operation deeply contained in the region. A value defined in a region can never be used outside of the region.
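The region-nesting part of these rules can be sketched in a few lines of Python. This is a deliberately simplified model: it tracks only which region defines a value and whether a region is isolated from above, and it ignores dominance between operations inside a single region. The `Region` class and its fields are hypothetical.

```python
class Region:
    """Toy region: a set of locally defined values plus a link to the parent region."""

    def __init__(self, parent=None, isolated_from_above=False):
        self.parent = parent
        self.isolated_from_above = isolated_from_above
        self.defined = set()

    def can_use(self, value: str) -> bool:
        """Walk outward through enclosing regions, stopping at an isolated boundary."""
        region = self
        while region is not None:
            if value in region.defined:
                return True
            if region.isolated_from_above:
                return False  # values from enclosing regions are not visible
            region = region.parent
        return False
```

The walk only ever moves outward, which captures the rule that a value defined in a region can never be used outside of that region.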
Control Flow and SSACFG Regions
In MLIR, control flow semantics of a region is indicated by RegionKind::SSACFG. Informally, these regions support semantics where operations in a region ‘execute sequentially’. Before an operation executes, its operands have well-defined values. After an operation executes, the operands have the same values and results also have well-defined values. After an operation executes, the next operation in the block executes until the operation is the terminator operation at the end of a block, in which case some other operation will execute. The determination of the next instruction to execute is the ‘passing of control flow’.
In general, when control flow is passed to an operation, MLIR does not restrict when control flow enters or exits the regions contained in that operation. However, when control flow enters a region, it always begins in the first block of the region, called the entry block. Terminator operations ending each block represent control flow by explicitly specifying the successor blocks of the block. Control flow can only pass to one of the specified successor blocks as in a branch operation, or back to the containing operation as in a return operation. Terminator operations without successors can only pass control back to the containing operation. Within these restrictions, the particular semantics of terminator operations is determined by the specific dialect operations involved. Blocks (other than the entry block) that are not listed as a successor of a terminator operation are defined to be unreachable and can be removed without affecting the semantics of the containing operation.
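The notion of unreachable blocks follows directly from the successor lists. As a minimal sketch, assuming the control-flow graph is given as a dictionary from block label to successor labels (a representation chosen here for illustration), the reachable set is a standard worklist traversal from the entry block:

```python
def reachable_blocks(successors: dict, entry: str) -> set:
    """Return the labels of all blocks reachable from `entry` via successor edges."""
    seen = {entry}
    worklist = [entry]
    while worklist:
        block = worklist.pop()
        for succ in successors.get(block, ()):
            if succ not in seen:
                seen.add(succ)
                worklist.append(succ)
    return seen
```

Any block outside the returned set is never listed as a successor along a path from the entry block, so removing it does not change the semantics of the containing operation.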
Although control flow always enters a region through the entry block, control flow may exit a region through any block with an appropriate terminator. The standard dialect leverages this capability to define operations with Single-Entry-Multiple-Exit (SEME) regions, possibly flowing through different blocks in the region and exiting through any block with a return operation. This behavior is similar to that of a function body in most programming languages. In addition, control flow may also not reach the end of a block or region, for example if a function call does not return.
func.func @accelerator_compute(i64, i1) -> i64 { // An SSACFG region
^bb0(%a: i64, %cond: i1): // Code dominated by ^bb0 may refer to %a
  cf.cond_br %cond, ^bb1, ^bb2

^bb1:
  // This def for %value does not dominate ^bb2
  %value = "op.convert"(%a) : (i64) -> i64
  cf.br ^bb3(%a: i64)    // Branch passes %a as the argument

^bb2:
  accelerator.launch() { // An SSACFG region
    ^bb0:
      // Region of code nested under "accelerator.launch", it can reference %a but
      // not %value.
      %new_value = "accelerator.do_something"(%a) : (i64) -> ()
  }
  // %new_value cannot be referenced outside of the region

^bb3:
  ...
}
Operations with Multiple Regions
An operation containing multiple regions also completely determines the semantics of those regions. In particular, when control flow is passed to an operation, it may transfer control flow to any contained region. When control flow exits a region and is returned to the containing operation, the containing operation may pass control flow to any region in the same operation. An operation may also pass control flow to multiple contained regions concurrently. An operation may also pass control flow into regions that were specified in other operations, in particular those that defined the values or symbols the given operation uses as in a call operation. This passage of control is generally independent of passage of control flow through the basic blocks of the containing region.
Regions allow defining an operation that creates a closure, for example by “boxing” the body of the region into a value it produces. It remains up to the operation to define its semantics. Note that if an operation triggers asynchronous execution of the region, it is the responsibility of the operation's caller to wait for the region to finish executing, guaranteeing that any directly used values remain live.
Graph Regions
In MLIR, graph-like semantics in a region is indicated by RegionKind::Graph. Graph regions are appropriate for concurrent semantics without control flow, or for modeling generic directed graph data structures. Graph regions are appropriate for representing cyclic relationships between coupled values where there is no fundamental order to the relationships. For instance, operations in a graph region may represent independent threads of control with values representing streams of data. As usual in MLIR, the particular semantics of a region is completely determined by its containing operation. Graph regions may only contain a single basic block (the entry block).
Rationale: Currently graph regions are arbitrarily limited to a single basic block, although there is no particular semantic reason for this limitation. This limitation has been added to make it easier to stabilize the pass infrastructure and commonly used passes for processing graph regions to properly handle feedback loops. Multi-block regions may be allowed in the future if use cases that require it arise.
In graph regions, MLIR operations naturally represent nodes, while each MLIR value represents a multi-edge connecting a single source node and multiple destination nodes. All values defined in the region as results of operations are in scope within the region and can be accessed by any other operation in the region. In graph regions, the order of operations within a block and the order of blocks in a region is not semantically meaningful and non-terminator operations may be freely reordered, for instance, by canonicalization. Other kinds of graphs, such as graphs with multiple source nodes and multiple destination nodes, can also be represented by representing graph edges as MLIR operations.
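The difference in value visibility between the two region kinds can be sketched for a single block. In the Python sketch below, a block is a list of `(result, operands)` pairs in textual order; this encoding and the function name `invalid_uses` are assumptions for illustration. In a Graph region every value defined in the block is visible everywhere, whereas in an SSACFG block an operand must be defined by an earlier operation.

```python
def invalid_uses(ops, region_kind: str):
    """Return (user_result, operand) pairs whose operand is not visible to the user."""
    all_defs = {result for result, _ in ops}
    seen = set()
    bad = []
    for result, operands in ops:
        # Graph regions ignore textual order; SSACFG blocks require def-before-use.
        visible = all_defs if region_kind == "Graph" else seen
        for operand in operands:
            if operand not in visible:
                bad.append((result, operand))
        seen.add(result)
    return bad


# A cyclic set of operations: %1 uses itself and %3, %3 uses %4, and so on.
CYCLIC_OPS = [("%1", ["%1", "%3"]), ("%3", ["%1", "%4"]), ("%4", ["%1"])]
```

The same list of operations is accepted as a Graph region and rejected as an SSACFG block, which is the practical meaning of "order is not semantically meaningful" in graph regions.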
Note that cycles can occur within a single block in a graph region, or between basic blocks.
"test.graph_region"() ({ // A Graph region
  %1 = "op1"(%1, %3) : (i32, i32) -> (i32)  // OK: %1, %3 allowed here
  %2 = "test.ssacfg_region"() ({
     %5 = "op2"(%1, %2, %3, %4) : (i32, i32, i32, i32) -> (i32) // OK: %1, %2, %3, %4 all defined in the containing region
  }) : () -> (i32)
  %3 = "op2"(%1, %4) : (i32, i32) -> (i32)  // OK: %4 allowed here
  %4 = "op3"(%1) : (i32) -> (i32)
}) : () -> ()
Arguments and Results
The arguments of the first block of a region are treated as arguments of the region. The source of these arguments is defined by the semantics of the parent operation. They may correspond to some of the values the operation itself uses.
Regions produce a (possibly empty) list of values. The operation semantics defines the relation between the region results and the operation results.
Type System
Each value in MLIR has a type defined by the type system. MLIR has an open type system (i.e. there is no fixed list of types), and types may have application-specific semantics. MLIR dialects may define any number of types with no restrictions on the abstractions they represent.
type ::= type-alias | dialect-type | builtin-type

type-list-no-parens ::= type (`,` type)*
type-list-parens ::= `(` `)` | `(` type-list-no-parens `)`

// This is a common way to refer to a value with a specified type.
ssa-use-and-type ::= ssa-use `:` type
ssa-use ::= value-use

// Non-empty list of names and types.
ssa-use-and-type-list ::= ssa-use-and-type (`,` ssa-use-and-type)*

function-type ::= (type | type-list-parens) `->` (type | type-list-parens)
Type Aliases
type-alias-def ::= '!' alias-name '=' type
type-alias ::= '!' alias-name
MLIR supports defining named aliases for types. A type alias is an identifier that can be used in the place of the type that it defines. These aliases must be defined before their uses. Alias names may not contain a ‘.’, since those names are reserved for dialect types.
!avx_m128 = vector<4 x f32>

// Using the original type.
"foo"(%x) : vector<4 x f32> -> ()

// Using the type alias.
"foo"(%x) : !avx_m128 -> ()
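Since an alias simply stands in for the type it names, alias use can be pictured as textual substitution. The Python sketch below is a rough approximation under that assumption: it replaces `!name` with its definition when `name` is a known alias, and leaves dialect types (which are followed by `.` or `<`) untouched. The function `expand_type_aliases` is hypothetical and does not reflect how the MLIR parser resolves aliases internally.

```python
import re

# `!` followed by an alias name (no '.'), not continued by more name characters,
# a '.', or a '<' (either of which would indicate a dialect type instead).
ALIAS_USE = re.compile(r"!([a-zA-Z_][a-zA-Z0-9_$]*)(?![a-zA-Z0-9_$.<])")


def expand_type_aliases(text: str, aliases: dict) -> str:
    """Replace each known `!alias` in `text` with the type it was defined as."""
    def substitute(match):
        return aliases.get(match.group(1), match.group(0))
    return ALIAS_USE.sub(substitute, text)
```

The lookahead is what enforces the rule that names containing `.` are reserved for dialect types: `!tf.string` is never treated as a use of an alias named `tf`.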
Dialect Types
Similarly to operations, dialects may define custom extensions to the type system.
dialect-namespace ::= bare-id

dialect-type ::= '!' (opaque-dialect-type | pretty-dialect-type)
opaque-dialect-type ::= dialect-namespace dialect-type-body
pretty-dialect-type ::= dialect-namespace '.' pretty-dialect-type-lead-ident
                                              dialect-type-body?
pretty-dialect-type-lead-ident ::= '[A-Za-z][A-Za-z0-9._]*'

dialect-type-body ::= '<' dialect-type-contents+ '>'
dialect-type-contents ::= dialect-type-body
                            | '(' dialect-type-contents+ ')'
                            | '[' dialect-type-contents+ ']'
                            | '{' dialect-type-contents+ '}'
                            | '[^\[<({\]>)}\0]+'
Dialect types are generally specified in an opaque form, where the contents of the type are defined within a body wrapped with the dialect namespace and <>. Consider the following examples:
// A tensorflow string type.
!tf<string>

// A type with complex components.
!foo<something<abcd>>

// An even more complex type.
!foo<"a123^^^" + bar>
Dialect types that are simple enough may use a prettier format, which unwraps part of the syntax into an equivalent, but lighter weight form:
// A tensorflow string type.
!tf.string

// A type with complex components.
!foo.something<abcd>
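The relationship between the opaque and pretty forms can be illustrated by splitting a dialect type into its parts. The Python sketch below is an approximation: it assumes the dialect namespace contains no `.`, and it does not check that brackets inside the body are balanced. The function `parse_dialect_type` and the tuple layout `(namespace, lead identifier, body)` are hypothetical.

```python
import re

DIALECT_TYPE = re.compile(
    r"!(?P<ns>[a-zA-Z_][a-zA-Z0-9_$]*)"          # dialect namespace
    r"(?:\.(?P<lead>[A-Za-z][A-Za-z0-9._]*))?"   # pretty form: lead identifier
    r"(?:<(?P<body>.*)>)?"                       # optional `<...>` body
)


def parse_dialect_type(text: str):
    """Split a dialect type into (namespace, lead_ident, body), or return None."""
    m = DIALECT_TYPE.fullmatch(text)
    if m is None or (m["lead"] is None and m["body"] is None):
        return None  # e.g. a plain type alias such as !avx_m128
    return m["ns"], m["lead"], m["body"]
```

Under this model, `!tf<string>` and `!tf.string` share the namespace `tf` and differ only in whether `string` appears as the body or as the lead identifier.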
See here to learn how to define dialect types.
Builtin Types
The builtin dialect defines a set of types that are directly usable by any other dialect in MLIR. These types range from primitive integer and floating-point types to function types and more.
Attributes
attribute-entry ::= (bare-id | string-literal) `=` attribute-value
attribute-value ::= attribute-alias | dialect-attribute | builtin-attribute
Attributes are the mechanism for specifying constant data on operations in places where a variable is never allowed - e.g. the comparison predicate of a cmpi operation. Each operation has an attribute dictionary, which associates a set of attribute names to attribute values. MLIR’s builtin dialect provides a rich set of builtin attribute values out of the box (such as arrays, dictionaries, strings, etc.). Additionally, dialects can define their own dialect attribute values.
The top-level attribute dictionary attached to an operation has special semantics. The attribute entries are considered to be of two different kinds based on whether their dictionary key has a dialect prefix:
  • inherent attributes are inherent to the definition of an operation’s semantics. The operation itself is expected to verify the consistency of these attributes. An example is the predicate attribute of the arith.cmpi op. These attributes must have names that do not start with a dialect prefix.
  • discardable attributes have semantics defined externally to the operation itself, but must be compatible with the operation’s semantics. These attributes must have names that start with a dialect prefix. The dialect indicated by the dialect prefix is expected to verify these attributes. An example is the gpu.container_module attribute.
Note that attribute values are allowed to themselves be dictionary attributes, but only the top-level dictionary attribute attached to the operation is subject to the classification above.
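The classification above depends only on the attribute name. As a minimal sketch, assuming a dialect prefix can be recognized by a `.` in the name (a simplification made here, not a rule stated by the text), the top-level dictionary can be partitioned as follows; the function name is hypothetical.

```python
def split_attributes(attributes: dict):
    """Partition a top-level attribute dictionary into (inherent, discardable)."""
    inherent, discardable = {}, {}
    for name, value in attributes.items():
        # Assumption: names of the form `dialect.name` carry a dialect prefix.
        target = discardable if "." in name else inherent
        target[name] = value
    return inherent, discardable
```

Only the top-level dictionary is partitioned; a dictionary nested inside an attribute value is passed through unchanged, matching the note above.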
Attribute Value Aliases
attribute-alias-def ::= '#' alias-name '=' attribute-value
attribute-alias ::= '#' alias-name
MLIR supports defining named aliases for attribute values. An attribute alias is an identifier that can be used in the place of the attribute that it defines. These aliases must be defined before their uses. Alias names may not contain a ‘.’, since those names are reserved for dialect attributes.
#map = affine_map<(d0) -> (d0 + 10)>

// Using the original attribute.
%b = affine.apply affine_map<(d0) -> (d0 + 10)> (%a)

// Using the attribute alias.
%b = affine.apply #map(%a)
Dialect Attribute Values
Similarly to operations, dialects may define custom attribute values.
dialect-namespace ::= bare-id

dialect-attribute ::= '#' (opaque-dialect-attribute | pretty-dialect-attribute)
opaque-dialect-attribute ::= dialect-namespace dialect-attribute-body
pretty-dialect-attribute ::= dialect-namespace '.' pretty-dialect-attribute-lead-ident
                                              dialect-attribute-body?
pretty-dialect-attribute-lead-ident ::= '[A-Za-z][A-Za-z0-9._]*'

dialect-attribute-body ::= '<' dialect-attribute-contents+ '>'
dialect-attribute-contents ::= dialect-attribute-body
                            | '(' dialect-attribute-contents+ ')'
                            | '[' dialect-attribute-contents+ ']'
                            | '{' dialect-attribute-contents+ '}'
                            | '[^\[<({\]>)}\0]+'
Dialect attributes are generally specified in an opaque form, where the contents of the attribute are defined within a body wrapped with the dialect namespace and <>. Consider the following examples:
// A string attribute.
#foo<string<"">>

// A complex attribute.
#foo<"a123^^^" + bar>
Dialect attributes that are simple enough may use a prettier format, which unwraps part of the syntax into an equivalent, but lighter weight form:
// A string attribute.
#foo.string<"">
See here on how to define dialect attribute values.
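The relationship between the pretty and opaque forms can be sketched as a plain string rewrite: `#ns.ident<...>` corresponds to `#ns<ident<...>>`. The helper below is a hypothetical illustration (the name `toOpaqueForm` is invented here) and does not validate the full grammar; it only moves the lead identifier and optional body inside the angle brackets.

```cpp
#include <string>

// Hypothetical helper: rewrites `#ns.ident<...>` into `#ns<ident<...>>`.
// Text that does not look like the pretty form is returned unchanged.
std::string toOpaqueForm(const std::string &attr) {
  if (attr.empty() || attr[0] != '#')
    return attr;
  std::string::size_type dot = attr.find('.');
  std::string::size_type angle = attr.find('<');
  // No '.', or a '<' before the first '.': not the pretty form.
  if (dot == std::string::npos ||
      (angle != std::string::npos && angle < dot))
    return attr;
  // An empty namespace or an empty lead identifier is not a pretty form.
  if (dot == 1 || dot + 1 == attr.size())
    return attr;
  std::string ns = attr.substr(1, dot - 1);
  std::string rest = attr.substr(dot + 1);
  return "#" + ns + "<" + rest + ">";
}
```

Applied to the example above, `#foo.string<"">` becomes `#foo<string<"">>`, while an attribute already in opaque form is left as it is.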
Builtin Attribute Values
The builtin dialect defines a set of attribute values that are directly usable by any other dialect in MLIR. These attributes range from primitive integer and floating-point values to attribute dictionaries, dense multi-dimensional arrays, and more.
IR Versioning
A dialect can opt in to versioning through the BytecodeDialectInterface. A few hooks are exposed to the dialect to manage a version encoded into the bytecode file. The version is loaded lazily, which makes the version information available while parsing the input IR and gives each dialect for which a version is present an opportunity to perform IR upgrades after parsing through the upgradeFromVersion method. Custom Attribute and Type encodings can also be upgraded according to the dialect version using the readAttribute and readType methods.
There is no restriction on what kind of information a dialect is allowed to encode to model its versioning. Currently, versioning is supported only for bytecode formats.
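[MLIR] Dialect and Operation in Detail
1. Introduction to MLIR
MLIR stands for Multi-Level Intermediate Representation and is a new kind of compiler framework.
1.1 What is an IR
IR stands for Intermediate Representation. It can be viewed as a data format that serves as the intermediate form in an end-to-end conversion. For example, deep learning models are usually expressed as computation graphs, and any data structure able to represent a computation graph can be called an IR, for example ONNX, TorchScript, and TVM Relay.
Computation graph
· ONNX (Open Neural Network Exchange): The ONNX protocol was first proposed by Microsoft and Meta. It defines a standard format (for example, operator semantics) that is independent of environment and platform. After training, a model from a supported framework (PyTorch, TensorFlow, etc.) can be converted into an ONNX file for storage. An ONNX file stores not only the weights of the neural network but also the model structure and the inputs and outputs of every layer.
· TorchScript: PyTorch's biggest selling point is its support for dynamic networks, which gives it a lower learning cost than frameworks that require building a static network. In dynamic-graph mode, however, the computation graph is rebuilt on every execution, and the non-fixed network structure makes the network hard to analyze and optimize. TorchScript was created to solve this problem; it covers code tracing and parsing, IR generation, model optimization, serialization, and more.
· Relay IR: Tied to the TVM framework, Relay is a functional, differentiable, statically typed domain-specific language for machine learning. It addresses the lack of support for control flow and dynamic shapes in ordinary DL frameworks and uses lambda calculus as its base IR.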
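1.2 Common IR systems
(1) When C or C++ source code is converted directly to an AST, no language-specific optimization takes place; optimization is concentrated in the LLVM IR stage. LLVM IR, however, sits at a low level and loses part of the information in the source (such as information needed for error reporting), which leads to insufficient optimization.
(2) Frameworks such as TensorFlow and Keras first convert the model into a computation graph and then perform some graph-level optimizations. The graph stage lacks information about hardware deployment, so the graph is later converted into the internal representation of a particular backend, where optimizations such as operator fusion are applied for the specific hardware (TPU, phone).
Using multiple IRs in this way brings several problems:
· Poor reusability: passes (optimizations) developed for different kinds of IR may duplicate each other, yet passes of the same kind on different IRs may not be compatible.
· Opacity: pass optimizations performed on an earlier IR are not visible in a later one, which can cause duplicated optimization.
· High conversion overhead: the pipeline involves several IRs, and converting between these different IR types is expensive.
1.3 The introduction of MLIR
The TensorFlow team adopted multiple IRs early on, which led to fairly severe software fragmentation.
The TensorFlow team therefore proposed MLIR, mainly to unify the various IR formats and coordinate conversions among them, enabling more efficient optimization.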

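Note: SSA stands for static single assignment; it is covered later.
2. Dialect and Operation in Detail
2.1 Dialect
1. What is a Dialect?
Going from a source program to a target program requires a series of abstractions and analyses, with Lowering Passes converting one IR into another. Conversion between IRs, however, needs a common format, and the first step toward unifying IRs is unifying the "language": the IRs previously cooperated poorly and could not understand each other precisely because they did not share a "language".
MLIR therefore introduced the Dialect. Each IR can be converted into a corresponding MLIR Dialect, which not only eases conversion but is also freely extensible. A dialect can be thought of as a black box with the expressive power of an IR; the rest of the compilation flow is then a sequence of conversions between dialects.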


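2. How does a dialect work?
Dialects place all IRs in a single namespace, define production rules for each IR, and bind the corresponding operations, thereby producing an MLIR model.
The dialect for each language (such as the TensorFlow dialect, HLO dialect, or LLVM IR dialect) derives from mlir::Dialect and registers attributes, operations, and data types; it can also override virtual functions to change some generic behavior.
The overall compilation process: generate an AST (Abstract Syntax Tree) from the source language, traverse the AST with the help of the dialect to produce MLIR expressions (here several IR levels may be processed in turn through Lowering Passes), and finally pass through the MLIR analyzer to generate the program for the target hardware.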

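3. Internal structure of a dialect

A dialect consists mainly of custom Types, Attributes, Interfaces, and operations. An operation is further described by Attributes, Types, Constraints, Interfaces, and Traits. There are also two important modules, ODS and DRR, both built on TableGen: ODS is used to define operations, and DRR is used to implement conversions between two dialects.

2.2 Operation

An Operation is an important part of a Dialect. It is the core unit of abstraction and computation and can be seen as the basic element of a dialect's semantics.
The result produced is %t_tensor; the operation is transpose from the xxx dialect; the input is %tensor; it converts data of type tensor<2x3xf64> into data of type tensor<3x2xf64>; and this transpose is located in "example/file/path", line 12, character 1.
IR is at the core of LLVM's design. It uses SSA (Static Single Assignment) form and has two important properties: code is organized as three-address instructions, and there is an unlimited number of registers.
(2) "xxx.transpose": the name of the operation, which should be a unique string prefixed by the dialect namespace and a '.'. Here it denotes the transpose operation of the xxx Dialect: the part before the '.' is the Dialect namespace and the part after it is the operation name.
(4) {inplace = true}: the attribute dictionary, which defines a boolean attribute named inplace whose constant value is true.
(5) (tensor<2x3xf64>) -> tensor<3x2xf64>: the type of the operation in functional form, with the inputs first and the outputs second. The content between the angle brackets, 2x3xf64, describes the tensor's shape (2x3) and the element type stored in the tensor (f64), joined by x.
(6) loc("example/file/path":12:1): the location in the source code from which this operation originated. Every operation has a mandatory source location associated with it; this is a core requirement in MLIR, and the APIs depend on and manipulate it. For example, if a transformation replaces an operation with another, a location must be attached to the new operation so that its origin can be traced. Note that mlir-opt does not print location information by default; add the -mlir-print-debuginfo flag to include it.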
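To see how the pieces described above fit together in one line of generic-form IR, the sketch below models them as plain strings and prints them in order. It is a hypothetical, self-contained illustration and uses no MLIR API: the struct `GenericOp` and the function `printGeneric` are names invented here.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical model of the parts of a generic-form operation.
struct GenericOp {
  std::string result;                // e.g. "%t_tensor"
  std::string name;                  // e.g. "xxx.transpose"
  std::vector<std::string> operands; // e.g. {"%tensor"}
  std::string attributes;            // e.g. "inplace = true"
  std::string inputTypes;            // e.g. "tensor<2x3xf64>"
  std::string resultType;            // e.g. "tensor<3x2xf64>"
  std::string file;                  // source file of the location
  unsigned line;                     // source line of the location
  unsigned column;                   // source column of the location
};

// Assembles: result = "name"(operands) {attrs} : (inputs) -> output loc(...)
std::string printGeneric(const GenericOp &op) {
  std::string text = op.result + " = \"" + op.name + "\"(";
  for (std::size_t i = 0; i < op.operands.size(); ++i) {
    if (i != 0)
      text += ", ";
    text += op.operands[i];
  }
  text += ")";
  if (!op.attributes.empty())
    text += " {" + op.attributes + "}";
  text += " : (" + op.inputTypes + ") -> " + op.resultType;
  text += " loc(\"" + op.file + "\":" + std::to_string(op.line) + ":" +
          std::to_string(op.column) + ")";
  return text;
}
```

Filled with the values discussed above, `printGeneric` produces `%t_tensor = "xxx.transpose"(%tensor) {inplace = true} : (tensor<2x3xf64>) -> tensor<3x2xf64> loc("example/file/path":12:1)`.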

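3. Creating a new dialect (adding new operations)
This section covers two ways to create a new dialect: writing the C++ by hand and generating it with the ODS framework.
ODS stands for Operation Definition Specification. The developer only needs to fill in a .td file according to the conventions of the operation definition framework, and MLIR's TableGen tool then generates the corresponding C++ code automatically.
This section closely follows the official documentation: Chapter 2: Emitting Basic MLIR - MLIR
This section uses the xxx language as an example to demonstrate how to build the xxx Dialect and add Operations to it.
The xxx language has the following features:
- Mix of scalar and array computations, as well as I/O
- Array shape inference
- Generic functions
- Very limited set of operators and features
3.1 Defining the xxx Dialect
The Dialect models the structure of the xxx language and provides a convenient path for high-level analysis and transformation.
1. Writing it by hand in C++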
// The official xxx Dialect definition, located by default at ../mlir/examples/xxx/Ch2/include/xxx/Dialect.h
class XxxDialect : public mlir::Dialect {
public:
  explicit XxxDialect(mlir::MLIRContext *ctx);

  /// Provide a utility accessor to the dialect namespace.
  static llvm::StringRef getDialectNamespace() { return "xxx"; }

  /// An initializer called from the constructor of XxxDialect that is used to
  /// register attributes, operations, types, and more within the xxx dialect.
  void initialize();
};
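2. Generating it with the ODS framework
The ODS definitions used here all live in Ops.td, located by default at ../mlir/examples/xxx/Ch2/include/xxx/
The code block below defines a Dialect named xxx in the ODS framework. Fields are set with the form let <...> = "..." or let <...> = [{...}];, here specifying in turn name, summary, description, and cppNamespace (the C++ namespace in which the Dialect class resides).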
def xxx_Dialect : Dialect {
  // The namespace of our dialect, this corresponds 1-1 with the string we
  // provided in `XxxDialect::getDialectNamespace`.
  let name = "xxx";

  // A short one-line summary of our dialect.
  let summary = "A high-level dialect for analyzing and optimizing the "
                "xxx language";

  // A much longer description of our dialect.
  let description = [{
    The xxx language is a tensor-based language that allows you to define
    functions, perform some math computation, and print results. This dialect
    provides a representation of the language that is amenable to analysis and
    optimization.
  }];

  // The C++ namespace that the dialect class definition resides in.
  let cppNamespace = "xxx";
}
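The framework then generates the corresponding C++ code at build time. Alternatively, run the following command to obtain the generated C++ code directly.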
${build_root}/bin/mlir-tblgen -gen-dialect-decls ${mlir_src_root}/examples/xxx/Ch2/include/xxx/ -I ${mlir_src_root}/include/
In the figure below, the ODS definition is on the right and the automatically generated C++ code is on the left.


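3.2 Loading into the MLIRContext
Once the Dialect is defined, it must be loaded into an MLIRContext. By default an MLIRContext loads only the builtin Dialect, so a custom Dialect has to be loaded explicitly.
// The code here differs slightly from the official documentation but has the same meaning.
// It lives in xxxc.cpp, located by default at ../mlir/examples/xxx/Ch2/xxxc.cpp.
int dumpMLIR() {
  mlir::MLIRContext context;
  // Load our Dialect in this MLIR Context.
  // (XxxDialect is written without namespace qualification here.)
  context.getOrLoadDialect<XxxDialect>();
  // ...
}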
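3.3 Defining an operation
With the xxx Dialect in place, operations can now be defined. The official documentation uses the definition of xxx.constant (ConstantOp) to show how to define an operation directly in C++.
// This operation takes no inputs and returns a constant.
%4 = "xxx.constant"() {value = dense<1.0> : tensor<2x3xf64>} : () -> tensor<2x3xf64>
1. Writing it by hand in C++
An operation class derives from the CRTP class mlir::Op and takes optional traits that define its behavior. Below is the official definition of ConstantOp: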
// `mlir::Op` is a CRTP class
class ConstantOp : public mlir::Op<
                       ConstantOp,                  // The ConstantOp
                       mlir::OpTrait::ZeroOperands, // takes zero input operands
                       mlir::OpTrait::OneResult,    // returns a single result.
                       mlir::OpTrait::OneTypedResult<TensorType>::Impl> {
public:
  // Op inherit the constructors from the base Op class.
  using Op::Op;
  // Return a unique name of the operation
  static llvm::StringRef getOperationName() { return "xxx.constant"; }
  // Return a value by fetching it from the attribute
  mlir::DenseElementsAttr getValue();
  // Operations may provide additional verification beyond what the attached traits provide.
  LogicalResult verifyInvariants();

  // Provide an interface to build this operation from a set of input values.
  // mlir::OpBuilder::create<ConstantOp>(...)
  // Build a constant with the given return type and `value` attribute.
  static void build(mlir::OpBuilder &builder, mlir::OperationState &state,
                    mlir::Type result, mlir::DenseElementsAttr value);
  // Build a constant and reuse the type from the given 'value'.
  static void build(mlir::OpBuilder &builder, mlir::OperationState &state,
                    mlir::DenseElementsAttr value);
  // Build a constant by broadcasting the given 'value'.
  static void build(mlir::OpBuilder &builder, mlir::OperationState &state,
                    double value);
};
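The CRTP-plus-traits pattern used by the class above can be illustrated without any MLIR dependency. The sketch below is a simplified stand-in, not MLIR's implementation: `OpBase`, `ZeroOperands`, `OneResult`, and `SketchConstantOp` are hypothetical names defined only for this example. It shows how a base template receives the derived class and a list of trait templates, inherits from each trait, and calls into the derived class without virtual functions.

```cpp
#include <string>

// Hypothetical stand-ins; none of these are MLIR classes.
// A trait is a class template parameterized on the concrete op type.
template <typename ConcreteOp>
struct ZeroOperands {
  static unsigned numOperands() { return 0; }
};

template <typename ConcreteOp>
struct OneResult {
  static unsigned numResults() { return 1; }
};

// CRTP base: takes the derived class plus a list of trait templates and
// inherits from every trait instantiated with the derived class.
template <typename ConcreteOp, template <typename> class... Traits>
struct OpBase : Traits<ConcreteOp>... {
  // The base reaches into the derived class statically, with no virtual call.
  static std::string name() { return ConcreteOp::getOperationName(); }
};

// The concrete op passes itself as the first template argument.
struct SketchConstantOp
    : OpBase<SketchConstantOp, ZeroOperands, OneResult> {
  static std::string getOperationName() { return "xxx.constant"; }
};
```

In this sketch `SketchConstantOp::name()` returns "xxx.constant", and the trait members `numOperands()` and `numResults()` become available on the op purely through inheritance.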
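Once the operation's behavior is defined, it can be registered in the initialize function of the xxx Dialect; only then can ConstantOp be used normally within the xxx Dialect.
// Located at ../mlir/examples/xxx/Ch2/mlir/Dialect.cpp
void XxxDialect::initialize() {
  addOperations<ConstantOp>();
}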
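2. Generating it with the ODS framework
First, define in ODS a base class xxx_Op that derives from the Op class.
Difference between Operation and Op
Op: every specific operation class derives from the Op class. Op is also a wrapper around Operation *, which means that when we define an Operation for a Dialect, we are really providing an interface to the Operation class.
The Op class is defined in a file located by default under ../mlir/include/mlir/IR/.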
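The idea that an Op is a thin wrapper around an Operation * can be sketched in a few lines of standalone C++. Everything here is hypothetical and greatly simplified: the `Operation` struct, the `ConstantOpView` wrapper, and `dynCastConstant` are invented for illustration and do not reflect MLIR's real class layout.

```cpp
#include <string>

// Hypothetical stand-in for the generic operation storage.
struct Operation {
  std::string name; // fully qualified operation name
  double value;     // payload used by the typed accessor below
};

// A pointer-sized wrapper that exposes a typed API over a generic Operation.
class ConstantOpView {
public:
  explicit ConstantOpView(Operation *op) : op(op) {}

  // Reports whether a generic operation is an xxx.constant.
  static bool classof(const Operation *op) {
    return op != nullptr && op->name == "xxx.constant";
  }

  double getValue() const { return op->value; }
  Operation *getOperation() const { return op; }
  explicit operator bool() const { return op != nullptr; }

private:
  Operation *op; // the wrapper owns nothing; it only points at the operation
};

// Returns a null view when the operation is not an xxx.constant, similar in
// spirit to a checked downcast.
ConstantOpView dynCastConstant(Operation *op) {
  return ConstantOpView(ConstantOpView::classof(op) ? op : nullptr);
}
```

Because the wrapper holds only a pointer, it is cheap to copy and pass by value, and the typed accessors are the only op-specific interface layered over the generic storage.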
The code below all lives in Ops.td, located by default at ../mlir/examples/xxx/Ch2/include/xxx/
class xxx_Op<string mnemonic, list<OpTrait> traits = []> :
    Op<xxx_Dialect, mnemonic, traits>;
// xxx_Dialect : the Dialect that the operation belongs to
// mnemonic : the mnemonic, usually a single word given as a string, naming the operation
// traits : a list of the operation's traits
def ConstantOp : xxx_Op<"constant", [NoSideEffect]> {
  // "constant" is the mnemonic; [NoSideEffect] states one trait of this operation.
  // Provide a summary and description for this operation.
  let summary = "constant";
  let description = [{
    Constant operation turns a literal into an SSA value. The data is attached
    to the operation as an attribute. For example:
      %0 = xxx.constant dense<[[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]>
                        : tensor<2x3xf64>
  }];

  // General form of the arguments field:
  //   let arguments = (ins <data_type><data_attribute>:$<variable_name>);
  //   - ins: inputs (the results field uses outs instead)
  //   - <data_type>: the data type
  //   - <data_attribute>: the attribute kind
  //       - ElementsAttr: an elements attribute (for example, dense elements)
  //   - <variable_name>: the variable name
  // The constant operation takes an attribute as the only input.
  // `F64ElementsAttr` corresponds to a 64-bit floating-point ElementsAttr.
  let arguments = (ins F64ElementsAttr:$value);
  // The constant operation returns a single value of TensorType.
  let results = (outs F64Tensor);

  // Divert the printer and parser to `parse` and `print` methods on our operation.
  let hasCustomAssemblyFormat = 1;
  // A custom assembly format makes the printed IR more concise and readable.
  let parser = [{ return ::parseConstantOp(parser, result); }];
  let printer = [{ return ::print(p, *this); }];

  // ODS can generate some simple build methods automatically; custom builders can also be added.
  let builders = [
    // Build a constant with a given constant tensor value.
    OpBuilderDAG<(ins "DenseElementsAttr":$value), [{
      build($_builder, $_state, value.getType(), value);
    }]>,
    // Build a constant with a given constant floating-point value.
    OpBuilderDAG<(ins "double":$value)>
  ];

  // Add additional verification logic to the constant operation.
  // will generate a `::mlir::LogicalResult verify()`
  let hasVerifier = 1;
}
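The framework then generates the corresponding C++ code at build time. Alternatively, run the following command to obtain the generated C++ code directly.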
${build_root}/bin/mlir-tblgen -gen-op-defs ${mlir_src_root}/examples/xxx/Ch2/include/xxx/ -I ${mlir_src_root}/include/
In the figure below, the ODS definition is on the right and the automatically generated C++ code is on the left.


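3.4 Summary of the creation workflow (using ODS)

The whole TableGen module is written around, and operates through, the ODS (Operation Definition Specification) framework. TableGen enables automatic generation, reduces manual development of operations, and avoids redundant work.
Taking the addition of the xxx Dialect as an example, the workflow is summarized as follows:
Ops.td is located by default at ../mlir/examples/xxx/Ch2/include/xxx/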

① (In Ops.td) Define the link to the xxx Dialect
def xxx_Dialect : Dialect {
  let name = "xxx";
  let cppNamespace = "xxx";
}
② (In Ops.td) Create the base class for xxx Dialect Operations
class xxx_Op<string mnemonic, list<OpTrait> traits = []> :
    Op<xxx_Dialect, mnemonic, traits>;
③ (In Ops.td) Create the individual Operations of the xxx Dialect
def ConstantOp : xxx_Op<"constant", [NoSideEffect]> {
  let summary = "constant";
  let arguments = (ins F64ElementsAttr:$value);
  let results = (outs F64Tensor);
  let builders = [
    OpBuilder<"Builder *b, OperationState &state, Value input">
  ];
  let verifier = [{ return ::verify(*this); }];
}
④ Generate the C++ files with the mlir-tblgen tool
Use the mlir-tblgen -gen-dialect-decls command to generate the corresponding file.
Use the mlir-tblgen -gen-op-defs command to generate the corresponding file.

⑤ Include the generated files directly with #include

Posted by 吴建明wujianming
