MLTree Extensions
Why ExtensionsPattern matching over the MLTREE intermediate representation may not be sufficient to provide access to all the registers or operations provided on a specific architecture. MLTREE extensions is a method of extending the MLTREE intermediate language so that it is a better match for the target architecture.For example there may be special registers to support the increment-and-test operation on loop indices, or support for complex mathematical functions such as square root, or access to hardware specific registers such as the current register window pointer on the SPARC architecture. It is not usually possible to write expression trees that would directly generate these instructions. Some complex operations can be generated by performing a peephole optimization over simpler instructions, however this is not always the most convenient or simple thing to do. Cyclic DependencyThe easiest way to provide extensions is to parameterize the MLTREE interface with types that extend the various kinds of trees. Thus if the type sext represented statement extensions, we might define MLTREE statement trees as :
datatype stm
= ...
| SEXT of sext * mlrisc list * stm list
and mlrisc = GPR of rexp | FPR of fexp | CCR of ccexp
where the constructor SEXT applies the extension to a list of
arguments. This approach is unsatisfactory in several ways, for
example, if one wanted to extend MLTREEs with for-loops, then the
following could be generated:
SEXT(FORLOOP, [GPR from, GPR to, GPR step], body)First, the loop arguments have to be wrapped up in GPR and there is little self documentation on the order of elements that are arguments to the for-loop. It would be better to be able to write something like:
SEXT(FORLOOP{from=f, to=t, step=s, body=b})
Where f, t, and s are rexp trees representing
the loop index start, end, and step size; b is a stm list
representing the body of the loop. Unfortunately, there is a cyclic
dependency as MLTREEs are defined in terms of sext, and {
sext} is defined in terms of MLTREEs. The usual way to deal with
cyclic dependencies is to use polymorphic type variables.
MLTREE EXTENSIONThe statement extension type sext, is now a type constructor with arity four, i.e. ('s, 'r, 'f, 'c) sx where sx is used instead of { sext}, and 's, 'r, 'f, and 'c represents MLTREE statement expressions, register expressions, floating point expressions, and condition code expressions. Thus the for-loop extension could be declared using something like:
datatype ('s,'r,'f,'c) sx
= FORLOOP of {from: 'r, to: 'r, step: 'r, body: 's}
and the MLTREE interface is defined as:
signature MLTREE = sig
type ('s, 'r, 'f, 'c) sx
datatype stm =
= ...
| SEXT of sext
withtype sext = (stm, rexp, fexp, cexp) sx
end
where sext is the user defined statement extension but the
type variables have been instantiated to the final form the the MLTREE
stm, rexp, fexp, and cexp components.
CompilationThere are dedicated modules that perform pattern matching over MLTREEs and emit native instructions, and similar modules must be written for extensions. However, the same kinds of choices used in regular MLTREE patterns must be repeated for extensions. For example, one may define an extension for the Intel IA32 of the form:
Each extension must provide a function compileSext that compiles
a statement extension into native instructions. In the
MLTREE_EXTENSION_COMP interface we have:
Multiple ExtensionsMultiple extensions are handled in a similar fashion, except that the extension type used to specialize MLTREEs is a tagged union of the individual extensions. The functor to compile the extension dispatches to the compilation modules for the individual extensions.ExampleSuppose you are in the process of writing a compiler for a digital signal processing(DSP) programming language using the MLRISC framework. This wonderful language that you are developing allows the programmer to specify high level looping and iteration, and aggregation constructs that are common in DSP applications. Furthermore, since saturated and fixed point arithmetic are common constructs in DSP applications, the language and consequently the compiler should directly support these operators. For simplicity, we would like to have a unified intermediate representation that can be used to directly represent high level constructs in our language, and low level constructs that are already present in MLTree. Since, MLTree does not directly support these constructs, it seems that it is not possible to use MLRISC for such a compiler infrastructure without substantial rewrite of the core components.
Let us suppose that for illustration that we would like to
implement high level looping and aggregation constructs such as
Here is a first attempt:
The following is an example of how these new constructors that we have defined can be used. Suppose the source program in our DSP language is:
|