Building Canrun: A statically typed logic programming library for Rust
Update 2 Jan, 2023: Many aspects of this library have been redesigned, especially relating to the domain stuff: Simplifying a toy logic programming library
Canrun is a new logic programming library for Rust with static types and constraints.
In my initial post I mentioned going through a few fundamental revisions before settling on the current approach. Here Iâll try to do a quick recap for two reasons: 1) I think itâs neat, and 2) with luck Iâll snag the attention of someone with deeper experience and a willingness to share some tips.
My first successful attempt was actually based on an article by Tom Stuart titled âHello, declarative worldâ1, which described a way to implement ÎŒKanren in Ruby2. I recommend reading that if you find yourself confused about any of the âwhyâ I gloss over here.
Starting Out
The core moving part is the State
. In its most basic form, it consists of a mapping between âlogic variablesâ (weâll call them LVar
) and âlogic valuesâ (or Val<T>
). Note that a Val<T>
can contain an LVar
or a resolved T
.
Note: These code samples are heavily abridged. Many aspects including lifetime annotations have been elided for the sake of clarity.
pub struct State<T> {
values: HashMap<LVar, Val<T>>,
}
pub enum Val<T> {
Var(LVar),
Resolved(Rc<T>),
}
The public API essentially provides a way to âattemptâ the addition of new bindings (through unification) and resolve existing bindings.
impl State<T> {
pub fn unify(&self, a: &Val<T>, b: &Val<T>) -> StateIter<T>;
pub fn resolve(&self, val: &LVar) -> Option<Val<T>>;
}
A key characteristic of this model is that .unify(...)
returns a new State
instead of mutating Self
.
The return type of .unify(...)
is an alias:
pub type StateIter<T> = Box<dyn Iterator<Item = State<T>>>;
A Goal
is a struct that simply returns an Iterator
of potential states (yes, I said âsimply returns an Iterator
â⊠whatâs so funny?).
pub enum Goal<T> {
Unify {a: Val<T>, b: Val<T>},
Either { a: Box<Goal<T>>, b: Box<Goal<T>> },
// ... truncated
}
impl Goal<T> {
fn run<T: CanT>(&self, state: &State<T>) -> StateIter<T> {
match self {
Unify {a, b} => {
Box::new(once(state.unify(a, b)))
},
Either {a, b} => {
let a_iter = a.run(state);
let b_iter = b.run(state);
Box::new(a_iter.interleave(b_iter))
}
// ... truncated
}
}
}
These goals can be combined arbitrarily to create more complicated goals.
let goal = Either {
a: Unify { a: x, b: 1},
b: Both {
a: Unify { a: x, b: 2 },
a: Unify { a: y, b: 1 },
},
};
A goal can return as many new states as needed. It worked! Well, eventually. After a small war of attrition against the borrow checker, I finally managed get the streams flowing. Many clones were sacrificed.
With a few helper functions and some trait driven type coercion, the API wasnât too bad!
let state = State::new();
let goal: Goal<i32> = either(
unify(x, 1),
both(unify(x, 2 ), unify(y, 1 )),
);
goal.run(state) // <- returns an iterator of potential states
Smooth Sailing?
I pushed forward, implementing the other types of goals and learning more about ownership, how and when to use enums vs trait objects and more. But I couldnât shake the feeling that I was fighting more than just the borrow checker.
On top of excessive cloning, I was also struggling with a lack of type safety. My initial plan was for the T
type parameter to be a user defined enum.
enum MyType {
Number(i32),
Word(String),
}
// This compiles, but can never succeed!
let goal: Goal<MyType> = unify(
MyType::Number(42),
MyType::Word("42"),
);
Essentially, all relations in the âlogic worldâ were dynamically typed.
I needed a new approach.
Rust is Different
An interesting thing about Rust is that it pulls so much from functional programming, but trying to use a âpure immutableâ style as in a typical garbage collected language can get really uncomfortable. This sort of pain is usually a sign that something needs to change.
My new mutable State
adds a .fork(...)
operation and tweaks .unify(...)
:
impl State<T> {
pub fn unify(mut self, a: &Val<T>, b: &Val<T>) -> Option<Self>;
pub fn fork(mut self, fork: Rc<dyn Fork<T>>) -> Option<Self>;
}
pub trait Fork<T> {
fn run(&self, state: State<T>) -> StateIter<T>;
}
Note that both functions take a mut self
and return an Option<Self>
. The .unify(...)
function eagerly attempts to reconcile bindings with those already contained in the state. If unification fails, the entire state is now invalid. We can bail right away with None
and avoid processing any additional updates. Used with the ?
try operator, this can actually be quite smooth.
State::new()
.unify(x, 1)? // <- returns Some(state)
.unify(x, 2)? // <- returns None
The simplest Goal
that would use Fork
is Either
, where youâll get zero or more results states for both sides:
let goal = either(
unify(x, 1),
unify(x, 2),
);
By deferring evaluation of the Fork
objects, we do as much work to disprove as many goals as possible before we eventually split into an arbitrary number of result states.
The Fork
traitâs .run(...)
is not invoked until we query for results. At that point, we recurse our way through the list of Fork
items as a queue, branching out at each iteration.
fn iter_forks(mut self) -> StateIter<'a, D> {
let fork = self.forks.pop_front();
match fork {
None => Box::new(once(self)),
Some(fork) => Box::new(fork.run(self).flat_map(State::iter_forks)),
}
}
Note that iter_forks(..)
does not clone the State
. The struct that implements Fork
may clone the State
at the last moment, but only if actually needed.
In a nutshell:
.unify(...)
eagerly converges, constraining theState
into zero or one possible outcomes..fork(...)
lazily diverges, splitting aState
into zero or more alternate possibilities.
Typing the Domain
Statically typed logic programming seems to be a small niche within the niche that is regular logic programming.3
After a walking down a few dead ends, I settled on an approached based on procedural macros.
domain! {
pub MyDomain {
i32,
String
}
}
The domain!
macro can create a struct with associated functions that enable it to store and retrieve collections of values compatible with a user defined domain by type4. The generated struct looks (roughly) like this:
pub struct MyDomain {
t0: HashMap<LVar<i32>, Val<i32>>,
t1: HashMap<LVar<String>, Val<String>>,
}
With pseudo-private5 accessors that can be used to retreive the inner containers:
impl<'a> DomainType<'a, i32> for MyDomain {
fn values_as_ref(&self) -> &HashMap<LVar<i32>, Val<i32>> {
&self.t0
}
fn values_as_mut(&mut self) -> &mut HashMap<LVar<i32>, Val<i32>> {
&mut self.t0
}
}
The State
struct is now parameterized with a Domain
type, to which it delegates management of the individual Val<T>
containers (some of this is actually spread into other traits/impls, but this is essentially what happens):
impl <D: Domain> State<D> {
pub fn resolve_val<T>(&self, val: &Val<T>) -> &Val<T>
where
D: DomainType<T>,
{
match val {
Val::Var(var) => {
let resolved = self.domain.values_as_ref().get(var);
match resolved {
// We found another Var, try to resolve deeper
Some(found) => self.resolve_val(found),
// We didn't find a binding, return the Var
None => val,
}
}
// This isn't a Var, just return it
value => value,
}
}
}
Goals are also parameterized by Domain
:
// Compiles!
let goal = Goal<MyDomain> = unify(x, 1);
// Does not compile :)
let goal = Goal<MyDomain> = unify(x, vec![1, 2]);
While the macro approach feels a bit idiosyncratic, it does have a few really nice properties. Most notably: explicitly defining all of the types that should be valid in the logic domain greatly increases the helpfulness of compiler errors.
Takeaways
- In Rust (as elsewhere), pain is an indication that you may want to reconsider your approach.
- Studying prior art is important, but take its original context into account to avoid imitating badly. A different context might require different choices.
- David Tolnayâs quote and syn libraries make my rather raw proc macro way nicer than it has any business being.
Onward!
As time allows, Iâll dig into constraints, the UnifyIn
/Query
/ReifyIn
traits, collections and more!
- Github: https://github.com/tgecho/canrun_rs
- Crate: https://crates.io/crates/canrun
- Docs: https://docs.rs/crate/canrun
Footnotes
-
Iâm sorry to say that despite The Reasoned Schemerâs status as the classic book on miniKanren, it never really clicked for me. Similarly, most of the other resources Iâve found on were either too deep or too toylike. This is not a critism or complaint! Logic programming is a niche, and Iâm very grateful to those who have shared their work. â©
-
I actually used TypeScript for some of the very earliest prototyping due to its familiarity and relative malleability. So I learned about a logic programming approach that originated in Lisp from a Rubyist, started coding in TypeScript and finally built the thing in Rust. âcause why not? â©
-
The most substantial prior art for statically typed logic programming I found was Mercury and OCanren. As is typical, I did not spend nearly enough time trying to glean insight before I set out on my own. â©
-
This is not quite the same as something like AnyMap, which depends on types having a
'static
lifetime for reasons I donât claim to fully understand. â© -
Macro generated code is odd. Since it exists inside the the userâs module, normal visibility tools do not really work for the macro author. My actual implementation of the
domain!
macro has some additional indirection that has a secondary effect of providing a bit of privacy. Ultimately I just had to slap#[doc(hidden)]
all over it and ask nicely to be left alone. â©