`Bindlib`

The `Bindlib`

library provides support for free and bound variables in the OCaml language. The main application is the representation of types with a binding structure (e.g., abstract syntax trees).

The `Bindlib`

library provides two type constructors for building abstract syntax trees: type `'a var`

representing a free variable of type `'a`

, and type `('a,'b) binder`

representing a binding of a variable of type `'a`

in a value of type `'b`

. Intuitively, type `('a,'b) binder`

can be thought of as `'a -> 'b`

. Types `'a mvar`

and `('a,'b) mbinder`

are also provided for handling arrays of variables more efficiently.

`type 'a mvar = 'a var array`

Type of an array of variables of type `'a`

.

Type of a binder for a variable of type `'a`

into an element of type `'b`

. In terms of higher order abstract syntax, it can be seen as `'a -> 'b`

.

Type of a binder for an array of variables of type `'a`

into an element of type `'b`

. This representation is more efficient than a nesting of binders of type `('a,'b) binder`

, but substitution can only be performed the whole array of variables at once.

As an example, a type representing the terms of the pure λ-calculus can be defined as follows using types `'a var`

and `('a,'b) binder`

.

```
type term =
| Var of term var
| Abs of (term, term) binder
| App of term * term
```

`val subst : ('a, 'b) binder -> 'a -> 'b`

`subst b v`

substitutes the variable bound by `b`

with the value `v`

. This operation is very efficient.

`val msubst : ('a, 'b) mbinder -> 'a array -> 'b`

`msubst b vs`

substitutes the variables bound by `b`

with the values `vs`

. This operation is very efficient. Note however that the length of the `vs`

array should match the arity of `b`

(as given by `mbinder_arity b`

). If it is not the case, the exception `Invalid_argument "Bad arity in msubst"`

is raised.

Coming back to our pure λ-calculus example, call-by-name evaluation can be defined as a simple recursive function using `subst`

.

```
let rec eval : term -> term = fun t ->
match t with
| App(f,a) ->
begin
match eval f with
| Abs(b) -> eval (subst b a)
| _ -> t
end
| _ -> t
```

`new_var mkfree name`

creates a new variable using a function `mkfree`

and a `name`

. The `mkfree`

function is used to inject variables in the type of the corresponding objects: it is a form of syntactic wrapper that is often defined to be a constructor (see the example below).

`new_mvar mkfree names`

creates an array of new variables using a function `mkfree`

(see `new_var`

) and an array of variable `names`

.

For our pure λ-calculus example, where variables have type `term var`

, the `mkfree`

function would simply be defined as follows.

`let mkfree : term var -> term = fun x -> Var(x) `

`val name_of : 'a var -> string`

`name_of x`

returns the name corresponding to variable `x`

. Note that this name is generally not safe for printing since names are not updated when a substitution is performed. For instance, variables may appear to have been captured in the text representation of a structure with variable bindings. To circumvent this issue, one must be particularly careful when converting binders into text, and rely on contexts (see type `ctxt`

below).

`val names_of : 'a mvar -> string array`

`names_of xs`

returns names corresponding to variables `xs`

. As when using `name_of`

, care should be taken when converting objects with bindings into a text representation.

`unbind b`

substitutes the binder `b`

using a fresh variable. The variable and the result of the substitution are returned. Note that the name of the fresh variable is based on that of the binder, but it is not guaranteed to be safe for printing (see `new_var`

and `unbind_in`

). However, there is no problem for other operations such as re-establishing the binding. When the fresh variable is created, the `mkfree`

function that is used is that that was specified when creating the variable that was originally bound by `b`

, at the time of its creation (see `new_var`

and `bind_var`

).

`unbind2 f g`

is similar to `unbind f`

, but it substitutes two binders `f`

and `g`

at once using the same fresh variable. The name of the variable is based on that of the binder `f`

. Similarly, the `mkfree`

syntactic wrapper that is used for the fresh variable is the one that was given for creating the variable that was bound to construct `f`

(see `bind_var`

and `new_var`

for details on this process). In particular, the use of `unbind2`

may lead to unexpected results if the binders `f`

and `g`

were not built using free variables created with the same `mkfree`

. Moreover, as with `unbind`

, the name of the fresh variable is not guaranteed to be safe for printing.

`eq_binder eq f g`

tests the equality between `f`

and `g`

. The binders are first substituted with the same fresh variable (using `unbind2`

), and `eq`

is called on the resulting values. Note that `eq_binder`

may not have the expected result if `f`

and `g`

were not built by binding variables with an identical `mkfree`

syntactic wrapper.

`unmbind b`

substitutes the multiple binder `b`

with fresh variables. This function is analogous to `unbind`

for binders. Note that the names used to create the fresh variables are based on those of the multiple binder. The syntactic wrapper (of `mkfree`

) that is used to build the variables is the one that was given when creating the multiple variables that were bound in `b`

(see `new_mvar`

and `bind_mvar`

). Moreover, note that as for `unbind`

, the names of the fresh variables may not be safe for printing.

`unmbind2 f g`

is similar to `unmbind f`

, but it substitutes two multiple binder `f`

and `g`

at once, using the same fresh variables. Note that the two binders must have the same arity. This function may have an unexpected results in some cases, and the fresh variables may have names that are not safe for printing (see `unbind2`

).

`eq_mbinder eq f g`

tests the equality of the two multiple binders `f`

and `g`

. They are substituted with the same fresh variables (using `unmbind2`

) and `eq`

is called on the resulting values. This function may not have the expected result in some cases, for reasons explained in the documentation of `eq_binder`

. It is safe to use this function on multiple binders with a different arity (they are considered different).

The `unbind`

function is often useful for term traversals. For instance, a function computing the size of a pure λ-term can be defined as follows.

```
let rec size : term -> int = fun t ->
match t with
| Var(_) -> 0
| Abs(b) -> let (_,t) = unbind b in
1 + size t
| App(t,u) -> 1 + size t + size u
```

Note however that it is not a good idea to define a term-printing function using a combination of `unbind`

and `name_of`

. Indeed, as explained in the documentation of `name_of`

, contexts must be used to ensure that displayed variable names are correct.

To obtain fast substitutions, a price must be paid at the construction of terms. Indeed, binders (i.e., element of type `('a,'b) binder`

) cannot be defined directly. Instead, they are put together in the type `'a box`

. It correspond to a term of type `'a`

whose free variables may be bound.

Type of a term of type `'a`

under construction. Using this representation, the free variables of the term can be bound easily.

`box_var x`

injects variable `x`

into the `'a box`

type, so that it can be bound using `bind_var`

.

`val box : 'a -> 'a box`

`box e`

injects the value `e`

into the `'a box`

type, assuming that it is closed. Thus, if `e`

contains variables, then they will not be considered free. This means that no variable of `e`

will be available for binding.

`apply_box bf ba`

applies the boxed function `bf`

to a boxed argument `ba`

inside the `box`

type. This function is used to build new expressions by applying a function with free variables to an argument with free variables (the `'a box`

type is an applicative functor whose application operator is `apply_box`

, and whose unit is `box`

).

`box_apply f ba`

applies the function `f`

to a boxed argument `ba`

. It is equivalent to `apply_box (box f) ba`

, but is more efficient.

`box_apply2 f ba bb`

applies the function `f`

to two boxed arguments: `ba`

and `bb`

. It is equivalent to `apply_box (apply_box (box f) ba) bb`

but it is more efficient.

`bind_var x b`

binds the variable `x`

in `b`

, producing a boxed binder.

`bind_mvar xs b`

binds the variables of `xs`

in `b`

to get a boxed binder. It is the equivalent of `bind_var`

to build a multiple binder.

`box_binder f b`

boxes the binder `b`

using the boxing function `f`

. Note that when `b`

is closed, it is immediately boxed using the `box`

function. In that case, the function `f`

is not used at all.

`box_mbinder f b`

boxes the multiple binder `b`

using the boxing function `f`

. Note that if `b`

is closed then it is immediately boxed (with `box`

), without relying on `f`

at all.

As mentioned earlier, terms with bound variables can only be built in the `'a box`

type. To ease the construction of terms, it is a good practice to implement “smart constructors” at the `'a box`

level. Coming back to our λ-calculus example, we can give the following smart constructors.

```
let var : term var -> term box =
fun x -> box_var x
let abs_raw : (term, term) binder box -> term box =
fun b -> box_apply (fun b -> Abs(b)) b
let abs : term var -> term box -> term box =
fun x t -> abs_raw (bind_var x t)
let app : term box -> term box -> term box =
fun t u -> box_apply2 (fun t u -> App(t,u)) t u
```

Additionally, it is a good idea to define a “boxing function”: a function used to turn terms into a boxed terms. It is sometimes necessary to do so, as we will see when we define a printing function.

```
let rec box_term : term -> term box = fun t ->
match t with
| Var(x) -> var x
| Abs(b) -> abs_raw (box_binder box_term b)
| App(t,u) -> app (box_term t) (box_term u)
```

`val unbox : 'a box -> 'a`

`unbox e`

can be called when the construction of a term is finished (i.e., when the desired variable bindings have been created).

We can then easily define terms of the λ-calculus as follows.

```
(* λx.x *)
let id : term =
let x = new_var mkfree "x" in
unbox (abs x (var x))
(* λx.λy.x *)
let fst : term =
let x = new_var mkfree "x" in
let y = new_var mkfree "y" in
unbox (abs x (abs y (var x)))
(* λx.(x) x) (boxed) *)
let delta : term box =
let x = new_var mkfree "x" in
abs x (app (var x) (var x))
(* (λx.(x) x) λx.(x) x *)
let omega : term =
unbox (app delta delta)
(* λx.(x) x) *)
let delta : term =
unbox delta
```

For variable substitution to be as fast as possible, the `Bindlib`

library does not do any work to maintain variable names at substitution time. This work is instead delayed until it becomes necessary: at the time of turning objects with binders into a textual representation (e.g., for printing the result of a computation). Such operations must hence maintain a context of variable names using the functions of this section.

`val empty_ctxt : ctxt`

`empty_ctxt`

denotes the empty context.

`reserve_name name ctxt`

extends context `ctxt`

by reserving variable name `name`

.

`new_var_in ctxt mkfree name`

is similar to `new_var mkfree name`

, but the name actually used for the newly created variable is chosen to not collide with the variables of context `ctxt`

. Said otherwise, argument `name`

only gives a preferred name: if it is not available then a fresh name is picked by appending a decimal number at the end of `name`

. Moreover, the obtained variable has a name that is safe for printing (see `name_of`

), at least as long as binders are not substituted in objects containing it. Finally, the context that is returned is extended to contain the new variable name.

`new_mvar_in ctxt mkfree names`

is similar to `new_mvar mkfree names`

, but it handles renaming based on context `ctxt`

(see `new_var_in`

).

`unbind_in ctxt b`

is similar to `unbind b`

, but it handles the context as explained in the documentation of `new_mvar_in`

. This function can be used for maintaining correct names in printing functions: it is safe to use the `name_of`

function on the returned variable (as long as no substitution is performed in the involved objects, see `new_var_in`

).

`unbind2_in ctxt f g`

is similar to `unbind2 f g`

, but handles the context as explained in the documentation of `new_mvar_in`

and `unbind_in`

.

`unmbind_in ctxt b`

is similar to `unmbind b`

, but it handles the context as is explained in the documentation of `new_mvar_in`

. As `unbind_in`

, the `unmbind_in`

function can be used to implement printing functions.

`unmbind2_in ctxt f g`

is similar to `unmbind2 f g`

, but it uses a context similrly to `new_mvar_in`

or `unmbind_in`

.

Going back to our λ-calculus example, the `unbind_in`

function can be used to implement the following function transforming a λ-term into a `string`

. Here, thanks to the use of `unbind_in`

, it is safe to rely on `name_of`

to print variables.

```
let rec to_string : ctxt -> term -> string = fun ctxt t ->
match t with
| Var(x) -> name_of x
| Abs(b) -> let (x,t,ctxt) = unbind_in ctxt b in
"λ" ^ name_of x ^ "." ^ to_string ctxt t
| App(t,u) -> "(" ^ to_string ctxt t ^ ") " ^ to_string ctxt u
```

The above `to_string`

must not only receive the term to display (i.e., the second argument), but also a context containing all free variables in the term. To avoid maintain such a context we can rely on function `free_vars`

together with the term boxing function `box_term`

defined earlier.

```
let to_string : term -> string = fun t ->
to_string (free_vars (box_term t)) t
```

In general, it is not difficult to use the `box`

and `apply_box`

functions to manipulate any kind of data in the `'a box`

type. However, working with these functions alone can be tedious. The following functions can be used to manipulate standard data types in an optimised way.

`box_rev_list bs`

is similar to `box_list bs`

, but the produced boxed list is reversed (it is hence more efficient).

`box_apply3`

is similar to `box_apply2`

.

`box_apply4`

is similar to `box_apply2`

and `box_apply3`

.

`box_pair ba bb`

is the same as `box_apply2 (fun a b -> (a,b)) ba bb`

.

`box_triple`

is similar to `box_pair`

, but for triples.

`module type Map = sig ... end`

Type of a module equipped with a `map`

function.

Functorial interface used to build lifting functions for any type equipped with a `map`

function. In other words, this function can be used to allow the permutation of the `'a box`

type with another type constructor.

`module type Map2 = sig ... end`

Type of a module equipped with a "binary" `map`

function.

Similar to the `Lift`

functor, but handles "binary" `map`

functions.

`val hash_var : 'a var -> int`

`hash_var x`

computes a hash for variable `x`

. Note that this function can be used with the `Hashtbl`

module.

`compare_vars x y`

safely compares `x`

and `y`

. Note that it is unsafe to compare variables using `Stdlib.compare`

.

`eq_vars x y`

safely computes the equality of `x`

and `y`

. Note that it is unsafe to compare variables with `Stdlib.(=)`

.

`val binder_name : ('a, 'b) binder -> string`

`binder_name b`

returns the name of the variable bound by binder `b`

. This name is generally not safe for printing, since it is not updated after the binder is created, even when a substitution is performed (see `unbind_in`

and `name_of`

to learn how to print terms with binders).

`val binder_occur : ('a, 'b) binder -> bool`

`binder_occur b`

returns a boolean indicating if the variable bound by `b`

occurs (i.e., is used). This is a constant time operation.

`val binder_constant : ('a, 'b) binder -> bool`

`binder_constant b`

is the same as `not (binder_occur b)`

.

`val binder_closed : ('a, 'b) binder -> bool`

`binder_closed b`

indicates whether the `b`

is closed (i.e., does not have any free variables). This is a constant time operation.

`val binder_rank : ('a, 'b) binder -> int`

`binder_rank b`

gives the number of free variables contained in `b`

. This is a constant time operation.

`val mbinder_arity : ('a, 'b) mbinder -> int`

`mbinder_arity b`

gives the arity of `b`

(i.e., the number of variables it is binding). This is a constant time operation.

`val mbinder_names : ('a, 'b) mbinder -> string array`

`mbinder_names b`

returns the names of the variables bound by the multiple binder `b`

as an array. Similarly the result of `binder_name`

, these names are not generally safe for printing.

`val mbinder_occurs : ('a, 'b) mbinder -> bool array`

`mbinder_occurs b`

returns an array of booleans indicating whether each of the variables that are bound occur (i.e., are used). It is a constant time operation.

`val mbinder_constant : ('a, 'b) mbinder -> bool`

`mbinder_constant b`

indicates whether the `mbinder`

`b`

is constant. This means that none of its variables are used.

`val mbinder_closed : ('a, 'b) mbinder -> bool`

`mbinder_closed b`

indicates whether the multiple binder `b`

is closed. It is a constant time operation.

`val mbinder_rank : ('a, 'b) mbinder -> int`

`mbinder_rank b`

gives the number of free variables contained in `b`

. This is a constant time operation.

`val is_closed : 'a box -> bool`

`is_closed b`

checks whether `b`

is closed (in constant time).

`occur x b`

indicates whether variable `x`

occurs in `b`

. This is done in linear time with respect to the number of free variables in `b`

.

`bind_apply bb barg`

is the same as `box_apply2 subst bb barg`

.

`mbind_apply bb bargs`

is the same as `box_apply2 msubst bb bargs`

.

The variable renaming performed by functions like `new_var_in`

is somewhat arbitrary, and may not fit every application. As a consequence, we provide a functor `Ctxt`

that can be used to define a new renaming policy based on a custom notion of context and several configuration options.

`module type Renaming = sig ... end`

Module type giving the specification of a renaming policy, to be used with the `Ctxt`

functor.

The renaming policy used by the default context-manipulating function like `new_var_in`

or `unbind_in`

uses the following configuration: the value of `reset_context_for_closed_terms = false`

, `skip_constant_binders = false`

, and `constant_binder_name = None`

.

The default `new_name`

function is not exposed in the interface. It splits the `name`

into a pair `(prefix,n)`

containing the largest `n`

and longest `prefix`

such that `name = prefix ^ string_of_int n`

. For instance, if we have `name = "xy023"`

then `(prefix,n) = ("xy0",23)`

. In the case where no `n`

exists, we take `(prefix,n) = (name,0)`

. The prefix is preserved, and only the value of `n`

is changed for renaming.

A functor that can be used to obtain context-manipulating functions, given the specification of a renaming policy. The defined `ctxt`

type as well as the obtained functions can then be used as a drop-in replacement for their default counterparts (found at the top level of the `Bindlib`

module).

`val uid_of : 'a var -> int`

`uid_of x`

returns the unique identifier of the given variable.

`val uids_of : 'a mvar -> int array`

`uids_of xs`

returns the unique identifiers of the variables of `xs`

.

`copy_var x mkfree name`

makes a copy of variable `x`

, with a potentially different name and `mkfree`

function. However, the copy is treated exactly as the original in terms of binding and substitution. The main application of this function is for translating abstract syntax trees while preserving binders. In particular, variables at two different types should never live together (this may produce segmentation faults).

`reset_counter ()`

resets the unique identifier counter on which `Bindlib`

relies. This function should only be called when previously generated data (e.g., variables) cannot be accessed anymore.

`val dummy_box : 'a box`

`dummy_box`

can be used for initialising structures like arrays. Note that if `unbox`

is called on a data structure containing `dummy_box`

, then the exception `Failure "Invalid use of dummy_box"`

is raised.

`binder_compose b f`

postcomposes the binder `b`

with the function `f`

. In the process, the binding structure is not changed. Note that this function is not always safe. Use it with care.

`mbinder_compose b f`

postcomposes the multiple binder `b`

with `f`

. This function is similar to `binder_compose`

, and it is not always safe.

`raw_binder name bind rank mkfree value`

builds a binder using the `value`

function as its definition. The parameter `name`

correspond to a preferred name of the bound variable, the boolean `bind`

indicates whether the bound variable occurs, and `rank`

gives the number of distinct free variables in the produced binder. The `mkfree`

function injecting variables in the type `'a`

of the domain of the binder must also be given. This function must be considered unsafe because it is the responsibility of the user to give the accurate value for `bind`

and `rank`

.

```
val raw_mbinder :
string array ->
bool array ->
int ->
('a var -> 'a) ->
('a array -> 'b) ->
('a, 'b) mbinder
```

`raw_mbinder names binds rank mk_free value`

is similar to `raw_binder`

, but it is applied to a multiple binder. As for `raw_binder`

, this function has to be considered unsafe because the user must enforce invariants.