10 releases (4 breaking)

Uses old Rust 2015

0.5.0 Oct 13, 2018
0.4.0 Oct 1, 2018
0.3.2 Aug 21, 2018
0.2.3 Jul 29, 2018
0.1.0 Jul 22, 2018

#629 in Data structures

Apache-2.0

71KB
1.5K SLoC

Moniker

Build Status Crates.io Docs.rs Gitter

Moniker makes it simple to track variables across nested scopes in programming language implementations.

Instead of implementing name-handling code by hand (which is often tedious and error-prone), Moniker provides a number of types and traits for declaratively describing name binding directly in your language's abstract syntax tree. From this we can derive the corresponding name-handling code automatically!

Motivation

It's interesting to note that the idea of a 'scope' comes up quite often in programming:

Description Rust Example
case match arms match expr { x => /* `x` bound here */ }
let bindings let x = ...; /* `x` bound here */
recursive functions fn foo() { /* `foo` bound here */ }
functions parameters fn foo(x: T) { /* `x` bound here */ }
recursive types enum List { Nil, /* `List` bound here */
type parameters struct Point<T> { /* `T` bound here */ }

For example, let's take a silly example of a Rust function:

type Count = u32;

fn silly<T>((count, data): (Count, T)) -> T {
    match count {
        0 => data,
        count => silly((count - 1, data)),
    }
}

There's actually lots of name-binding at play here! Let's connect the binders to their corresponding binders:

//            ?
//            |
//            v
type Count = u32;
//     |
//     '----------------------.
//  .-------------------------|------------------.
//  |    .--------------------|----*------.      |
//  |    |                    |    |      |      |
//  |    |                    v    v      v      |
fn silly<T>((count, data): (Count, T)) -> T { // |
//             |      |                          |
//          .--'      |                          |
//          |         *--------------.           |
//          v         |              |           |
    match count { //  |              |           |
//             .------'              |           |
//             |                     |           |
//             v                     |           |
        0 => data, //                |           |
//         .--------------.          |           |
//         |              |          |           |
//         |              v          v           |
        count => silly((count - 1, data)), //    |
//                 ^                             |
//                 |                             |
//                 '-----------------------------'
    }
}

Keeping track of the relationships between these variables can be a pain, and can become especially error-prone when implementing evaluators and type checkers. Moniker aims to support all of these binding structures, with minimal pain!

Example

Here is how we would use Moniker to describe a very small functional language, with variables, anonymous functions, applications, and let bindings:

#[macro_use]
extern crate moniker;

use moniker::{Embed, Binder, Rec, Scope, Var};
use std::rc::Rc;

/// Expressions
///
/// ```text
/// e ::= x               variables
///     | \x => e         anonymous functions
///     | e₁ e₂           function application
/// ````
#[derive(Debug, Clone, BoundTerm)]
//                        ^
//                        |
//              The derived `BoundTerm` implementation
//              does all of the heavy-lifting!
pub enum Expr {
    /// Variables
    Var(Var<String>),
    /// Anonymous functions (ie. lambda expressions)
    Lam(Scope<Binder<String>, RcExpr>),
    /// Function applications
    App(RcExpr, RcExpr),
}

pub type RcExpr = Rc<Expr>;

We can now construct lambda expressions by doing the following:

let f = FreeVar::fresh(Some("f".to_string()));
let x = FreeVar::fresh(Some("x".to_string()));

// \f => \x => f x
Rc::new(Expr::Lam(Scope::new(
    Binder(f.clone()),
    Rc::new(Expr::Lam(Scope::new(
        Binder(x.clone()),
        Rc::new(Expr::App(
            Rc::new(Expr::Var(Var::Free(f.clone()))),
            Rc::new(Expr::Var(Var::Free(x.clone()))),
        )),
    )))
)))

More Complete Examples

More complete examples, including evaluators and type checkers, can be found under the moniker/examples directory.

Example Name Description
lc untyped lambda calculus
lc_let untyped lambda calculus with nested let bindings
lc_letrec untyped lambda calculus with mutually recursive bindings
lc_multi untyped lambda calculus with multi-binders
stlc simply typed lambda calculus with literals
stlc_data simply typed lambda calculus with records, variants, literals, and pattern matching
stlc_data_isorec simply typed lambda calculus with records, variants, literals, pattern matching, and iso-recursive types

Projects using Moniker

Moniker is currently used in the following Rust projects:

  • Pikelet: A dependently typed systems programming language

Overview of traits and data types

We separate data types into terms and patterns:

Terms

Terms are data types that implement the BoundTerm trait.

Implementations for tuples, strings, numbers, slices, vectors, and mart pointers are also provided for convenience.

Patterns

Patterns are data types that implement the BoundPattern trait.

Implementations for tuples, strings, numbers, slices, vectors, and mart pointers are also provided for convenience.

Roadmap

Moniker is currently good enough to use for initial language prototypes, but there is more work that we'd like to do to make it a more fully-featured name binding and scope handling toolkit.

  • Initial implementation using a locally nameless representation
    • Implement basic type combinators
      • Embed
      • Ignore
      • Nest
      • Rec
      • Scope
    • Automatically derive traits
      • BoundTerm
      • BoundPattern
      • Subst
    • Allow derives to use identifier types other than String
    • Implement namespaced variables and binders
    • Performance optimizations
      • Cache max-depth of terms
      • Cache free variables of terms
      • Perform multiple-opening/closing
      • Use visitors
  • Explore implementing other name-binding schemes
    • Named with unique indices
    • Scope Graphs
    • ...?

References

Here is a list of interesting references and prior art that was helpful when building Moniker. Note that it isn't necessary to read and understand these to use the library, but they might be useful if you would like to contribute!

Papers

Blog Posts

Other Libraries

The API was mainly inspired by the Unbound and Unbound-Generics libraries for Haskell, with some differences. The main change that we make is to have two separate traits (BoundTerm and BoundPattern) in place of Unbound's single Alpha type class. We've found that this better captures the semantics of the library, and greatly cuts down on the potential for accidental misuse.

Other auto-binding libraries exist for a number of different languages:

  • Unbound: Specify the binding structure of your data type with an expressive set of type combinators, and Unbound handles the rest! Automatically derives alpha-equivalence, free variable calculation, capture-avoiding substitution, and more.
  • Unbound-Generics: an independent re-implementation of Unbound but using GHC.Generics instead of RepLib.
  • Cαml: a tool that turns a so-called "binding specification" into an OCaml compilation unit.
  • alphaLib: An OCaml library that helps deal with binding constructs in abstract syntax trees.
  • abbot: Generation of abstract binding trees for SML
  • rabbot: A port of SML's Abbot to Rust
  • DBLib: Facilities for working with de Bruijn indices in Coq
  • Bound: DeBruijn indices for Haskell
  • Metalib: The Penn Locally Nameless Metatheory Library
  • LibLN: Locally nameless representation with cofinite quantification

Contributing

We really want to encourage new contributors to help out! Please come chat with us on our Gitter channel - if you have any questions about the project, or just want to say hi!

Acknowledgments

YesLogic Logo

This work was done in part with the generous support of YesLogic.

Dependencies

~2.5MB
~56K SLoC