Since the last blogpost the focus has been heads down on using Bosque to implement Bosque features. In that time we have written approximately 21Kloc of Bosque code(!!), consisting of several compiler passes and emitters:

  • Creating a set of Intermediate Representations (IRs) for Bosque code that are amenable to lightweight static analysis or LLM support.
  • A translation of Bosque code into SMTLib for logical analysis – both formal validation and testing applications.
  • A translation of Bosque code into C++ as part of our research into next generation language runtime systems.

A key question in this phase of work was: is Bosque an effective and usable language at scale? After 21+Kloc we can pretty confidently say yes!

A Few Bumps

Not surprisingly there is some learning curve to get into the mindset of writing Bosque code, but this is a pretty quick and small hill to climb. We also found some patterns with rough edges as we wrote larger amounts of code. One of these was the need to handle contexts while performing iterative processing, and the other (related) issue was accumulation operations.

let octx, conds = stmt.condflow.reduce<(|SMTTransformerCtx, SMTAssembly::Operation|)>((|ectx, elseop@some|), fn(acc, flow) => {
    let ntexp, cctx, cobinds = this.processUnaryArg(flow.0, acc.0);
    let nctx, nblock = this.transformBlockStatement(flow.1, tail, cctx);

    let ifop = SMTAssembly::IfThenElseOperation{ntexp, nblock@some, acc.1};
    if(cobinds)@none {
        return nctx, ifop;
    }
    else {
        return nctx, SMTAssembly::LetWErrorsOperation{$cobinds, ifop};
    }
});

This code highlights the issue: the reduce needs to explicitly and manually push the context ectx through each iteration as the first element of the accumulator tuple. Our next roadmap includes completing support for ref arguments, which would allow us to rewrite this as:

let conds = stmt.condflow.reduce<SMTAssembly::Operation>(ref ectx, elseop@some, fn(acc, flow) => {
    let ntexp, cobinds = this.processUnaryArg(flow.0, ref ectx);
    let nblock = this.transformBlockStatement(flow.1, tail, ref ectx);

    let ifop = SMTAssembly::IfThenElseOperation{ntexp, nblock@some, acc};
    if(cobinds)@none {
        return ifop;
    }
    else {
        return SMTAssembly::LetWErrorsOperation{$cobinds, ifop};
    }
});

In this version the ref parameter keyword (currently partially supported) instructs the compiler to automatically thread (monadically, if you prefer) the updates and uses of the ectx variable through each iteration of the reduce function.

In this example using reduce is the preferred approach, as there is a true linear dependence between the statements being processed in sequence. However, there are many algorithms where forced sequentialization obscures the intent of the computation (and prohibits optimization).
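As a small illustration of the pattern, consider collecting the declarations contributed by each statement in a block. The sketch below is TypeScript rather than Bosque, and the helper names are made up; the point is that the per-statement results are independent, yet folding them with reduce bakes in a left-to-right order the computation does not actually need.

// Hypothetical helper: the names a statement declares (all typed "Int" here).
function declsOf(stmt: string): Map<string, string> {
    return new Map(stmt.split(/\s+/).map((name) => [name, "Int"] as [string, string]));
}

const stmts = ["x y", "y z", "w"];

// reduce works, but it over-specifies the computation: the merges below are
// order-independent and could be grouped (or parallelized) in any way.
const allDecls = stmts
    .map(declsOf)
    .reduce((acc, ds) => new Map([...acc, ...ds]), new Map<string, string>());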

Supporting these patterns requires some new work in Bosque. The current plan is to add new restriction operations for a parametric type T that indicate what kinds of merge operations can be performed on the type. This functionality currently exists as several specialized methods for maps:

function mergeIdempotent(...maps: List<Map<K, V>>): Map<K, V> { ... }

function mergeAllConfluent(maps: List<Map<K, V>>, merge: fn(V, V) -> V): Map<K, V> { ... }

In the first version it is assumed that if two keys are equal then their associated values are as well. In the second case, the merge function is assumed to be associative and commutative (and thus values can be combined in any order). As currently implemented this involves a lot of picking the right method and hand-writing lambdas. By adding a template parameter restriction that ensures these properties, we can handle these optimized merges implicitly and dispatch automatically.
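To make the intended semantics concrete, here is a minimal TypeScript sketch (not Bosque; the name only mirrors the signature above) of a confluent merge over maps. The value-combining function is assumed to be associative and commutative, so neither the order of the maps nor the order in which colliding values are combined affects the result.

// Sketch of a confluent merge: combine a list of maps, resolving key
// collisions with the provided merge function.
function mergeAllConfluent<K, V>(maps: Map<K, V>[], merge: (a: V, b: V) => V): Map<K, V> {
    const result = new Map<K, V>();
    for (const m of maps) {
        for (const [k, v] of m) {
            const prev = result.get(k);
            result.set(k, prev === undefined ? v : merge(prev, v));
        }
    }
    return result;
}

// Example: per-pass counts combined with max, which is associative and commutative.
const combined = mergeAllConfluent(
    [new Map([["pass1", 2]]), new Map([["pass1", 5], ["pass2", 1]])],
    (a, b) => Math.max(a, b)
);
// combined is Map { "pass1" => 5, "pass2" => 1 }

With a restriction certifying that the supplied merge function really has these properties, the library can select an implementation like this automatically and is free to reorder (or parallelize) the combination.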

This idea is not altogether new; adding annotations to support parallelism has been examined multiple times. The issue comes down to subtle failures to satisfy the needed commutativity, independence, etc. properties, which of course leads to subtle bugs, as in the case of custom implementations of ordering or equality.

However, Bosque has a superpower for exactly this type of situation! Bosque code (or at least smaller methods) can be statically checked to determine if a given property is satisfied. In general this is used to check for possible runtime failures and user-provided assertions, but in special cases like this we can write a specialized checker that verifies whether user-defined implementations of merge are in fact confluent!

We define confluence, from a value lattice perspective, as the property that for all inputs a, b, and c the following hold:

  • merge(a, merge(b, c)) == merge(merge(a, b), c) (associativity)
  • merge(a, b) == merge(b, a) (commutativity)
  • a <= merge(a, b) /\ b <= merge(a, b) (the merge is inflationary, moving up the lattice)
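The eventual checker will establish these properties symbolically through the SMT translation, but even a plain sample-based check (a TypeScript sketch with hypothetical names, not the real checker) conveys what is being verified:

// Check the three confluence properties over a finite set of sample values.
// (Uses === equality, so this sketch is only meaningful for primitive value types.)
function checkConfluence<T>(merge: (a: T, b: T) => T, leq: (a: T, b: T) => boolean, samples: T[]): boolean {
    for (const a of samples) {
        for (const b of samples) {
            // commutativity: merge(a, b) == merge(b, a)
            if (merge(a, b) !== merge(b, a)) { return false; }
            // inflation: a <= merge(a, b) and b <= merge(a, b)
            if (!leq(a, merge(a, b)) || !leq(b, merge(a, b))) { return false; }
            for (const c of samples) {
                // associativity: merge(a, merge(b, c)) == merge(merge(a, b), c)
                if (merge(a, merge(b, c)) !== merge(merge(a, b), c)) { return false; }
            }
        }
    }
    return true;
}

// max is confluent under the usual <= ordering on numbers.
console.log(checkConfluence<number>(Math.max, (a, b) => a <= b, [0, 1, 3, 7]));  // true

The Bosque checker would establish the same three properties for all possible inputs rather than a finite sample set, which is where the SMT-based analysis comes in.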

More about this later!

Roadmap

Lots of neat stuff is inbound! A cool AoT compilation and GC system is being finished up, and amazing test-generation capabilities are on the way! Paper deadline crunch!

I’ll be updating the Bosque landing page + issue tracker with work items, but the focus will be on cleaning up these issues, some misc. language improvements, improvements to the SMT solver, and adding API specs to match the Data Specs. From this we should be able to launch some neat testing, SemVer, and Agentic AI features!