
The Brisk Kernel Language

This is a working paper describing the Brisk Kernel Language. The Brisk Kernel Language is a very low level functional language intended as a compilation target. It strongly resembles the STG language used for the same purpose in the Glasgow GHC compiler, but is lower level still. The purpose is to allow more low level transformations in the compiler, to make the run time system and code generator simpler, and to make it easier to add experimental extensions to the language.

Introduction

The Brisk Kernel Language, or BKL, is based on the STG language used in the Glasgow compiler [1], but is lower level. The issues of how to compile into BKL, and how to generate code from it, are discussed very briefly to motivate the description of BKL, but are dealt with more fully elsewhere.

The BKL Grammar

The BKL language can be regarded as a minimal subset of Haskell [2]. A simplified syntax for BKL is given here, in which various issues such as modules, types and literals are omitted in order to concentrate on the essential features:
   prog   --->  decl ";" ... ";" decl                 ( >= 1 )

   decl   --->  fun arg ... arg "=" expr              ( >= 0 )

   expr   --->  let bind ";" ... ";" bind "in" expr   ( >= 1 )
           |    "case" var "of" alt ";" ... ";" alt   ( >= 1 )
           |    appl

   bind   --->  var "=" appl

   alt    --->  con arg ... arg "->" expr             ( >= 0 )
           |    var "->" expr

   appl   --->  fun arg ... arg                       ( >= 0 )

   fun    --->  var
   con    --->  var
   arg    --->  var
The BKL language assumes that evaluation is handled explicitly using primitive functions to be described later. A global function definition represents a single piece of code containing no sub-evaluations, making code generation relatively easy.

The local definitions in a let expression correspond to building suspensions. A case expression corresponds to a simple switch; the case variable is assumed to have been explicitly evaluated beforehand. An application expression corresponds to a tail call.
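
For example, the following small definition combines the three forms (a sketch only; g, h and the constructor C are assumed to be defined elsewhere, and x is assumed to have been evaluated before f is called, as discussed under Strictness below):
   f x = let y = g x in       -- build a suspension for the application g x
         case x of            -- simple switch on the evaluated variable x
            C a b -> h a y    -- tail call to h
            v     -> y        -- default alternative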

No distinction is made in the syntax between defined functions, constructor functions, and primitive functions; it is assumed here that they are handled uniformly for simplicity.

There are restrictions on BKL programs which are not captured by this syntax. The main restrictions concern arities and strictness, and these are described in more detail below.

Arities

In BKL, each function has an arity determined by its definition, and in every application of the function, the correct number of arguments must be supplied. The arity depends not just on the conventional Haskell type of a function, but also on the form of the function's definition. For example, suppose we have Haskell definitions:
   f x y = x + y

   g x = \y -> x + y
Assuming that the left hand sides of these definitions are not affected by the compiler during the transformation into BKL, the first will have arity 2 and the second will have arity 1, despite the fact that f and g have the same type. We will indicate this using type declarations which use conventional Haskell syntax, but in which bracketing is taken to be significant. The arity of a function is indicated by the number of unbracketed -> operators in its type. For example, the functions f and g have types:
   f :: Int -> Int -> Int

   g :: Int -> (Int -> Int)
The code generator produces code for a function which matches its arity information. The code for f takes two arguments and returns an integer, whereas the code for g takes one argument and returns a function.

This arity restriction also holds for higher order functions. If the function map has type:

   map :: (a->b) -> [a] -> [b]
then the first argument in every call to map must be a function with arity 1, and code is generated for map which assumes this.
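
For example, with f and g as defined above, and some list xs of integers, the following sketch illustrates the restriction:
   ys = map g xs      -- legal: g has arity 1, as map's type requires
   zs = map f xs      -- illegal as it stands: f has arity 2, so this call must
                      -- first be transformed, eg into map g xs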

Strictness

In BKL, some subexpressions are assumed to have been evaluated, ie to be in head normal form. One restriction which we have already mentioned is that the variable in a case expression is assumed to be evaluated, which corresponds to the fact that all evaluation is explicit.

The second main restriction is that in any function application, including suspensions on the right hand side of local let definitions, the function must be known to be evaluated. The motivation is that we want an application f x y to be represented by a 3-word heap node in which the first word points to a node representing f which also acts as a descriptor for the application node. For this, we need f to be an evaluated function node with a special format and not a suspension node.
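
For example, the intended representation of such an application is roughly (a sketch):
   f x y   ~~>   [ f | x | y ]    -- a 3-word heap node; the first word points to the
                                  -- evaluated function node for f, which doubles as
                                  -- the descriptor of this application node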

To express these restrictions more precisely, we will assume that the type system is extended so that every normal type t has a corresponding strict type written !t representing expressions of type t which are known to be in head normal form. This notation is already used in Haskell for the argument types of strict constructors. A similar notation is used to describe unboxing in the Glasgow compiler [3], and many of the issues are the same, but we will assume for the moment that all values are boxed.

Explicit evaluation is carried out in BKL by calling one of a family of primitive functions akin to the standard strict function in Haskell. For example:

   strict01 :: !(!(!a -> b) -> a -> b)

The call strict01 f x first evaluates x and then makes the call f x in which the code for f can assume that x has been evaluated. In the type, the first ! indicates that the function strict01 itself is known to be in evaluated form. The second specifies that the function f must be evaluated before passing it to strict01, and the third indicates that f expects x to be evaluated before it is called.

The names of the functions in the family consist of strict followed by a number of 0s and 1s, one for the function and one for each of its arguments, to specify whether the item is to be evaluated or not. Another example is:

   strict11 :: !((!a -> b) -> a -> b)
Here, in a call strict11 f x, the function f expects an evaluated argument, but is not itself known to be in evaluated form, so strict11 evaluates both f and x before making the call.
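
For intuition, and ignoring the strict type annotations, the behaviour of these two functions corresponds to the following ordinary Haskell definitions (a sketch of the intended evaluation order only, not of the actual implementation):
   strict01, strict11 :: (a -> b) -> a -> b
   strict01 f x = x `seq` f x             -- evaluate the argument, then call f
   strict11 f x = f `seq` x `seq` f x     -- evaluate both f and the argument first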

The strict family of functions can be used to make definitions such as:

   (+) :: !(Int -> Int -> Int)
   x + y = strict011 (!+) x y

   (!+) :: !(!Int -> !Int -> Int)
   x !+ y = primitive ...

The function (!+) is a primitive one which expects its arguments to be evaluated by its caller. In general, we can arrange that primitive functions do not need to make calls to evaluate arguments.
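
For example, a call x + y in which x and y are bound to unevaluated suspensions proceeds roughly as follows (a sketch of the evaluation steps):
   x + y  =   strict011 (!+) x y    -- by the definition of (+)
          --> (!+) x' y'            -- strict011 evaluates x and y to values x' and y'
          --> ...                   -- (!+) then adds the evaluated integers directly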

As a further example, the definition of the standard map function can be translated into BKL as:

   map f xs = strict001 map' f xs

   map' f xs = case xs of
      [] -> []
      y : ys -> let z = strict10 f y
                    zs = map f ys
                in z : zs
The function strict001 is used to evaluate the list argument and pass the result to map', which does a case analysis on it. In building a suspension for f y, the function f is not known to be evaluated, but laziness dictates that it should not be evaluated until the suspension itself is demanded, so the suspension strict10 f y is created; when entered, strict10 evaluates f before making the call f y.

If the notation !t were to be made available to programmers, there would need to be restrictions placed on it so that the type-checker could prevent any misuse. The main restrictions would be that polymorphic type variables should not be allowed to range over strict types, and that results of functions should not have strict types. These two restrictions ensure that suspensions do not have strict types. For example, consider the expression:

   let x = id y in ...
Since x is represented as a suspension, it should not have a strict type. The identity function id has polymorphic type a -> a, so if y were allowed to have a strict type, then x could be given a strict type. The restrictions prevent this.

The notation for strict types can be used for unboxing as well as for expressing restrictions in evaluation ordering, in essentially the same manner as in the Glasgow compiler. For each strict type, a decision can be made as to how values of that type are represented, as a number of words holding raw data and/or a number of words pointing to heap nodes. Then, for each function, it is known what pattern of raw and pointer words to expect in the arguments to every call. The code generator can arrange for these to be shuffled so that the raw words occur before the pointer words, allowing a relatively simple and uniform run time representation. If the Int type is defined by:

   data Int = MkInt !Int
and !Int is unboxed, (!+) can be defined by:
   (!+) :: !(!Int -> !Int -> Int)
   x !+ y = let z = x !+! y in MkInt z

   (!+!) :: !(!Int -> !Int -> !Int)
   x !+! y = primitive ...
The function (!+!) adds unboxed integers and can be replaced by inline code. It violates the restriction that functions should not return strict types, but we can relax the restriction for primitive functions which are inlined.

Compiling into the Kernel Language

In compiling programs into BKL, the compiler has to choose an arity for every function, and then ensure that every call to it has the right number of arguments. This involves creating explicit partial applications. For example, if f has arity 2, then an expression f x must be transformed, eg by:
   f x    --->    pap1of2 f x
The primitive function pap1of2 takes a function of arity 2 and its first argument and returns a function of arity 1 which expects the second argument. The compiler must also deal with over-applications. For example, if f has arity 2 and returns a function as its result, then an expression f x y z must be transformed, eg by:
   f x y z    --->    let g = f x y in strict10 g z
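For intuition, the behaviour of pap1of2 corresponds roughly to the following ordinary Haskell definition (a sketch of its meaning only; the actual representation of the resulting partial application is a run time matter):
   pap1of2 :: (a -> b -> c) -> a -> (b -> c)
   pap1of2 f x = \y -> f x y    -- given an arity-2 function and its first argument,
                                -- yield an arity-1 function expecting the second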
The compiler also needs to ensure that functions of the right arity are passed to higher order functions. Thus map (+) xs would have to be transformed:
   map (+) xs    --->    let f x = \y -> x+y in map f xs
The compiler also needs to make evaluation explicit. Conventional transformations can be used to leave a situation where only the case construct indicates evaluations. Functions involving case can be split in two and explicit calls to members of the strict family can be added. For example:
   f x = ... case e of ...      --->      f x = strict001 f' x e
                                          f' x y = ... case y of ...
It may be that various items besides e have to be added as extra arguments to f' and passed to it by f. This corresponds directly to the usual stack frame analysis which determines which items need to be saved and restored across an evaluation call.
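
For example, a local variable which is live across the evaluation is passed along in the same way (a sketch; g, h, k and the constructor C are assumed to be defined elsewhere):
   f x = let w = g x in
         case h x of
            C a -> k a w
            v   -> w

   --->

   f x = let w = g x; e = h x in strict001 f' w e

   f' w y = case y of
      C a -> k a w
      v   -> w
Here w must survive the evaluation of h x, so it becomes an extra argument of f', just as it would be saved in a stack frame in a conventional scheme.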

Another important issue for the compiler is lifting. Suppose that there is a local function definition:

   ... let f x y = ... g ... h ... in e
In the conventional approach to lifting, the free variables (or maximal free subexpressions) g and h in the body of f would be added as extra arguments and f would be made global to give:
   ... let f = f' g h in e

   f' g h x y = ... g ... h ...
There is a variation on lifting described by Peyton Jones and Lester [4] in which the code for f' gets its free variables from a heap node and its original arguments x and y from an argument stack.

This idea fits in well with a feature of the Brisk run time system in which each global function node contains references to the other global functions it uses. In many compilers this is not done, because it is assumed that global functions are fixed and can refer to each other directly. However, in Brisk things may be more dynamic; new functions may appear in the heap at any time as a result of dynamic compilation and loading, or may be passed from process to process in a distributed program. Thus functions are represented as nodes in the heap, and refer to each other via pointers.

In the context of lifting, exactly the same mechanism can be used to create dynamically a function with a given collection of references. This can be expressed by extending the BKL language so that the lifted version of the example above can be expressed as, eg:

   ... let f = f' (g,h) in e

   f' (g,h) x y = ... g ... h ...
The definition of f' indicates that code should be generated which assumes that g and h appear in the function node, while the local definition of f indicates that a function node should be created with suitable references.

Optimisation

It is not the intention here to describe the run time system in detail; we do that elsewhere. However, we want to demonstrate that despite the simplicity of the run time system, many of the most important common optimisations can still be achieved.

The run time environment consists of a heap and a stack. The heap contains nodes which correspond in a very direct way to graph reduction of shared expressions. The stack records the path from the root node of the program down to the subexpression which is currently under evaluation. Each stack entry records a heap node, and the position in it at which the pointer to the next subexpression appears.
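
As a rough Haskell-style sketch of the stack entries just described (the names here are illustrative only and are not taken from the Brisk implementation):
   type NodePtr = Int   -- stand-in for a real heap address

   -- Each stack entry records the parent heap node and the position within it
   -- of the pointer to the subexpression currently under evaluation.
   data StackEntry = StackEntry
      { parentNode :: NodePtr
      , position   :: Int
      }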

The arity restrictions and the explicit nature of evaluation mean that sometimes a separate call and return are needed to evaluate a function before it can be applied. It is common in other compilers to avoid this, implementing a call to an unevaluated function simply as a jump. However, there are two potential efficiency advantages to the Brisk approach. The first is that complications to do with dynamically created partial applications are avoided. For example, in the Glasgow compiler, most functions begin with code to test whether there are enough arguments on the argument stack; if not, a partial application is created before re-trying. The Brisk approach avoids such tests. The second advantage is that since partial applications are created explicitly by the compiler, they can in principle be manipulated and optimised, eg lifted out of loops. This makes it difficult to judge which approach is more efficient overall, but the conceptual simplicity of the Brisk approach is a great advantage when adding experimental features.

Another issue is that of intermediate expressions. Suppose that an expression f x y evaluates to g z and then h u v w before reaching head normal form as s:t. In what has been described so far, each of these intermediate expressions is represented directly as a heap node, leading to a high turnover of heap space. Many compilers represent these intermediate expressions as temporary items on stacks or in registers. Fortunately, a simple trick can be used to recover the situation, while retaining the simple and uniform treatment of representing every subexpression as a heap node. When a function does a tail call to another function which is not a constructor, it creates a temporary node instead of a permanent one. In other words, it allocates and fills in the node as usual, but does not advance the heap pointer. When the temporary node is evaluated, it re-uses the same space for its own results. For safety, the code for a constructor, or for any other function which assumes that its call node is going to continue to exist, makes a quick register-register test to see if the call node is a temporary one, and if so makes the node permanent. The same can be done at any interruption of execution such as a garbage collection.

It is possible to express sharing and updating in terms of a primitive share function which can be applied to expressions which may need to be shared. For example, we might write:

   let x = share x'; x' = f y z in ...
The purpose of share is to put an update frame on the stack before continuing with evaluation of the argument x' so that when the evaluation returns, the node representing share x' can be over-written with an indirection to the result.
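
Denotationally, share is just the identity function; its effect is purely operational. A sketch:
   share :: a -> a
   share x = x    -- semantically the identity; operationally, entering a share node
                  -- pushes an update frame, so that the node can later be over-written
                  -- with an indirection to the result of evaluating x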

However, this represents rather a high space overhead: two extra words for every suspension. In the absence of any sophisticated sharing analysis, every unevaluated node, ie every node which does not represent a constructor call, has to be assumed to need sharing. This makes it attractive to dispense with update frames and build updating into the return mechanism instead.

The standard return mechanism uses the top entry on the stack to find out which node caused evaluation of the current subexpression, and which word in that node points to the original version of the subexpression. That word is then updated to point to the result, the stack entry is popped, and evaluation of the parent node is resumed.

Updating can be added to the return mechanism by over-writing the original version of the subexpression with an indirection to the result, after a quick comparison test to ensure that the result node is actually different from the original one. The overhead of this test is compensated for by the fact that no time or space is wasted in constructing update frames.

Bibliography

  1. S.L. Peyton Jones, "Implementing lazy functional languages on stock hardware: the Spineless Tagless G-machine", Journal of Functional Programming 2(2), Apr 1992, pp127-202.
  2. "Haskell: A Purely Functional Language", Web site describing Haskell at http://haskell.org/ (version 1.4 at time of writing).
  3. S.L. Peyton Jones and J. Launchbury, "Unboxed values as first class citizens", Functional Programming Languages and Computer Architecture (FPCA'91), Boston, LNCS 523, Springer Verlag, Sept 1991, pp636-666.
  4. S.L. Peyton Jones and D. Lester, "A modular fully-lazy lambda lifter in Haskell", Software Practice and Experience 21(5), May 1991, pp479-506.

Dr. Ian Holyer, ian@cs.bris.ac.uk. Last modified on Tuesday 23 September 1997 at 11:57.