One of the key goals of the Bosque project is to create a language, and development stack, where formal methods can be widely and effectively applied at scale. This goal is to be agnostic about the form of underlying reasoning, from simple rule based linting, through abstract-interpretation based tools, to full logical reasoning systems, all of them should be easier to build and more effective/practical in the Bosque ecosystem.

However, first-order logic based reasoning is one of the oldest, and theoretically most powerful approaches to formal analysis and verification (or validation) of software systems. In this context, building a first-order logic based framework for reasoning about Bosque code is a key component of the project. A practical limit to this type of system is the fundamental undecidability of the problem in full generality! It may be interesting in the future to explore a fully-verified software system story for Bosque, using proof techniques similar to Dafny or F*. However, for most scenarios and developers, an incomplete but simple and fully automated formalism is more useful.

The most well-know and (sometimes) used in practice approaches to incomplete symbolic reasoning about program behavior are based on symbolic model checking. Instead of proving that a given property holds for all possible executions of a program these techniques symbolically (or concretely) unroll and explore possible paths in the program until they find a violation of the property or exhaust the search resources – cbmc and similar tools for Rust (Kani) or Java (JBMC). A challenge in this approach is how to explore possible paths in programs – particularly as in loops or recursive functions where the path is not syntactically bounded. In these cases the checker can either try to unroll the loop (expand the call) another time or decide to exit and explore deeper into the code. In some cases only one option is possible but in general the both choices may be possible.

As a result the formal guarantee that a model checker makes on the existence of an error is that it will find one if it exists and it is reachable within a bounded number of unrollings (or exploration time). This is useful but, on some level, unsatisfying as there may be small/simple inputs that cause the program to enter a loop that is too deep to explore in the time budget and will be missed.

This is an interesting issue and I am trying to formulate more precisely what/why this is a problem. So, let me share an experimental viewpoint on the topic. Conceptually, if we view the program input space as being a space that can be partitioned into a finite cover then a code unrolling based exploration does not guarantee the coverage of any particular partition of the input space. Instead each run covers some part of the execution trace but, as the unrollings are unbounded, the trace coverage will not be complete (in some cases the use of interpolants enable completion but this is not guaranteed). Conversely, if we believe the input equivalence partition hypothesis then we can say that if we verify the absence of errors over each input partition then we can guarantee the absence of errors for all inputs – and one particular input partition is widely hypothesized to be related to the vast majority of errors.

The small-model hypothesis or small-repro requirement when filling a bug report is the principle that any bug can be reproduced with a “reasonably” simple set of steps. In practice the number of issues that satisfy this requirement is very high. Thus, taking the compactness viewpoint, a high value input partition to cover is the set of small inputs! With this perspective we can see that the inability of model checking to reliably cover this input partition, instead of all small execution unrollings, leads to low coverage of simple (and high likelyhood) bugs.

Bosque presents a unique opportunity as the source language is specifically designed to translate 1-1 into (mostly) efficiently decidable SAT Modulo-Theory Formula. For example addition, x + 5 => the theory of integers and linear arithmatic while objects go to the theory of constructors. The major exception to this mapping are collections, List<T> or Map<K,V> and iterative operations on them, in general these language features do not (and theoretically cannot) map to any decidable theory. Instead, for Bosque, we take an approach inspired by the concept of compactness and the small-model hypothesis. Specifically, while we may not be able identify to cover all input partitions we do know there is one that is particularly important – the set of small inputs!

With this observation we can then solve the problem of how to encode collections by using the theory of Sequences and providing a finite bound on their input sizes (but not on the any sizes for those created internally). This allows us to convert an arbitrary Bosque program into a decidable SAT Modulo-Theory formula with the property that – if there is an error that can be triggered with a small-reproduction (small model) then it can be found and, if the solver returns unsat, then the small model input partition is fully covered and we are guaranteed that no such error exists. This is a powerful guarantee and one that is not possible in general for model checking tools.

This notion of a compactly solvable language – one that provides a decidable theory for the problem of small input partition errors, as described here, or more generally program equivalence – is a powerful concept. It can be used for checking semantic versioning, mock or test generation, and more. Further, it can be used as the basis for a trivial semi-decision procedure (via incrementally increasing the bound on input size) and theoretically opens the possibility of more comprehensive proof procedure if we are able to develop a suitable theory for input space partitioning and compactness.