Quickstart

Coma (which stands for Continuation Machine) is an Intermediate Verification Language implemented as a plugin in the Why3 platform.

Continuation-passing style allows us to devise an extremely economical abstract syntax for a generic verification language. This syntax is flexible enough to naturally express conditionals, loops, (higher-order) function calls, and exception handling. It is type-agnostic and state-agnostic, which means that we can combine it with a wide range of type and effect systems.

Shiele — Four Trees

Shiele — Four Trees

Coma language

Top level declarations in Coma files can be of two kinds:

  • Logical declarations introduce data types, logical functions and predicates, as well as axioms and lemmas. These declarations are used to specify the Coma programs. Their syntax and semantics are precisely those of WhyML.
  • Program declarations introduce Coma handlers which perform stateful computations and whose correctness we prove when we verify a Coma program.

Logical declarations

Coma admits WhyML logical declarations such as

  • Module imports

    use int.Int
    use list.List as L
    
  • Types

    • abstract types
      type char
      
    • aliases
      type word = seq char
      
    • algebraic data types
      type regexp =
      | Empty
      | Epsilon
      | Char    char
      | Alt     regexp regexp
      | Concat  regexp regexp
      | Star    regexp
      
  • Pure functions and predicates

    function sum (f: int -> int) (a b: int): int
      variant { b - a }
    = if b <= a then 0 else sum f a (b - 1) + f (b - 1)
    
  • Inductive predicates

    inductive mem (w: word) (r: regexp) =
      | mem_eps:  mem empty Epsilon
      | mem_char: forall c: char. mem (singleton c) (Char c)
      | …
    
  • Lemmas and axioms

    lemma sum_left:
      forall f: int -> int, a b: int.
      a < b -> sum f a b = f a + sum f (a + 1) b
    

Coma Programs

Coma programs are executable expressions we want to specify and prove.

Coma Syntax

A Coma program is a collection of handlers.

A top level handler definition is of the form: let <prototype> = <body>. A handler can be recursive using the let rec binding and multiple top-level handlers can be mutually recursive using

let rec <prototype> = <body>
with <prototype> = <body>
with <prototype> = <body>

As a first example, let's look at the definition of the cons handler.

let cons < 'a >  (h: 'a)  (t: list 'a)  (ret (l: list 'a)) = ret {Cons h t}
    ─┬── ─┬────  ─────┬─  ─────┬──────  ───┬────────────┬─   ─┬────────────
     │    │           │        │           │            │     │
    name  │           │        │   outcome parameter    │     │
     │    └ type      │        │                        │     │
     │      parameter │        │                        │     │
     │                └ term parameters                 │     │
     │                                                  │     │
     └───┬──────────────────────────────────────────────┘     │
         └ <prototype>                                        └ <body>

The prototype follows the grammar:

<prototype> ::= h π
                ┬ ┬
                │ │
                │ └ signature
                │
                └ handler name

          π ::= < 'a >* (x: τ)* (&x: τ)* { t }* <prototype>*
                ──┬───  ──┬───  ───┬───  ──┬──  ──┬────────
                  │       │        │       │      │
   type parameter ┘       │        │       │      │
                          │        │       │      │
           term parameter ┘        │       │      │
                                   │       │      │
               reference parameter ┘       │      │
                                           │      │
                              precondition ┘      │
                                                  │
                                handler parameter ┘

where τ is a term type:

  τ ::= 'a                               type variable
      | int | bool | …                   basic type
      | list τ                           type application
      | …

The body is a Coma expression (e), which follows the grammar:

  t ::= WhyML terms¹

  e ::= h                                  handler
      | e < t >                            application to a type `t`
      | e { t }                            application to a term `t`
      | e &r                               application to a reference `r`
      | e₁ e₂                              application to an expression
      | (! e)                              abstraction barrier
      |  { t }  e                          assertion
      | -{ t }- e                          assumption
      | fun π -> e                         anonymous handler
      | (-> e)                             anonymous nullary handler
      | e [ h π₁ =  e₁ | g π₂ =  e₂ ]      group of mutually recursive definitions²
      | e [ h π₁ -> e₁ | g π₂ -> e₂ ]      group of nonrecursive definitions
      | e [ x:  τ = t ]                    `let x = t in e`     (let binding)
      | e [ &x: τ = t ]                    reference allocation
      | [ &x <- t ] e                      reference modification
      | e₁ . e₂                            syntactic sugar for `e₁ (e₂)`

¹Check the syntax of WhyML terms.

²The user can read “e where h π₁ = e₁ and g π₂ = e₂ are mutually recursive handlers

Remarks

  1. the terms must be written inside curly braces { x+2 }, { sorted l } (with the exception of those in let and set)

  2. the let and set can be done multiple times in one with the syntax

        e [ x: τ = 34 | y: τ = 55 ]            parallel let-binding
        e [ &x <-  89 | &y <- 144 ]            parallel set
    

Examples

Many examples can be found in https://gitlab.inria.fr/why3/why3/-/blob/coma-only/examples/coma/.

Program definition

  • Top-level handler definition:

    • two parameters x and y, single continuation ret
      let add (x y: int) (return (z: int)) = return {x+y}
      
    • same example using mutable state (only z is mutable)
      let add (x y &z: int) (return) = [&z <- x+y] return
      
  • Handler declaration (specification without implementation)

    let unList < 'a > (l: list 'a)
      (onCons (h: 'a) (t: list 'a) {l = Cons h t})
      (onNil {l = Nil})
    = any
    
  • Element insertion in a sorted list

    let rec insert (x: int) (l: list int) {sorted l}
                   (return  (r: list int) {sorted r && permut r (Cons x l)})
    = unList <int> {l}
        (fun (h: int) (t: list int) ->
          if {x < h} (-> return {Cons x l}) .
          insert {x} {t} . fun (r: list int) ->
          return {Cons h r})
        (return {Cons x Nil})
    

Coma specification

In Coma, the abstraction barrier is flexible.

The barrier

In a function definition, the abstraction barrier divides the part that is visible to the caller from the implementation part which is hidden from the caller.

In traditional contract-based languages, this barrier is implicitly set at the level of the = symbol, for example in WhyML:

  let rec fact n
    requires { n >= 0 }
    ensures  { result = n! }
  = if n = 0 then 1 else n * fact (n-1)
   🯊
there is an implicit abstraction barrier here

Here the caller only sees the specification, but not the implementation.

Later in the code:

  (* here we have to prove the precondition (55 >= 0) *)
  let f55 = fact 55 in
  (* here we learn the postcondition (f55 = 55!) *)

Examples in Coma

If you want the traditional barrier and specification

let rec fact (n: int) { n >= 0 } (return (r: int) { r = n! })
                      ─────┬────                  ────┬─────
                           │                precondition of `return`
    precondition of `fact` ┘              ≡ postcondition of `fact`
= if {n = 0} (-> return {1})
             (-> fact {n-1} (fun (r: int) -> return {n*r})

Remark: the latter is syntactic sugar for

let rec fact (n: int) (return (r: int))
= { n >= 0 }
  (! if {n = 0} (-> out {1})
                (-> fact {n-1} (fun (r: int) -> out {n*r}))
  [ out (res: int) -> { r = n! } (! return {res})]

where the precondition is an assertion over a abstraction barrier which hides the implementation and the postcondition is put in a wrapper over the call of the outcome.

Or else, you can remove the assertion without touching the barrier

let rec fact (n: int) {} (return (r: int) { r = n! })
= if {n = 0} (-> return {1})
             (-> fact {n-1} (fun (r: int) -> return {n*r})

Here again, the body is hidden (which is what we want since the handler is recursive). It is the same as before but with a precondition equivalent to true.

Or else, you can push the barrier inside the code

let rec fact (n: int) (return (r: int) { r = n! })
= if {n = 0} (-> return {1})
             (-> { n > 0 } (! fact {n-1} (fun (r: int) -> return {n*r}))
                 ^^^^^^^^^ ^^                                         ^

The precondition will only have to be proved in the else case. The then case is inlined (from the verification point of view).

Or else, do not put the barrier at all, which will lead to a fully inlined handler

Consider the rotate_left handler, which one can use for balancing a binary tree.

let rotate_left (t: tree) (return (r: tree))
= unTree {t} (fun (a:tree) (x:elt) (r:tree) ->
  unTree {r} (fun (b:tree) (y:elt) (c:tree) ->
  return {node (node a x b) y c})
  fail) fail

This is a good example where the specification would be the same as the code. Thus we prefer not to put a barrier, which has the effect of fully inlining the handler at the call position.

Notice that a barrier is mandatory in the definition of fact because it is recursive. Any recursive call must be protected by a barrier.

Handler declaration

We can declare and specify a handler without implementing it.

let if (b: bool) (then {b}) (else {not b}) = any

Here {b} (resp. {not b}) is the postcondition of if in the then (resp. else) branch. This is syntactic sugar for:

let if (b: bool) (then) (else) =
  any
  [ tt -> {b}     (! then)
  | ff -> {not b} (! else) ]

Coma standard library

The standard library of Coma must be explicitly loaded with use coma.Std.

It includes basic handlers such as:

  • halt: no parameters and no VC, can be used to end the program as the “final continuation”
  • fail: no parameters but false as precondition, can be used as an assert false
  • if: has the prototype (b: bool) (then {b}) (else {not b})

Coma as an intermediate verification language

Coma is designed as an intermediate language for the purposes of verification conditions generation. Indeed, minimality and simplicity make Coma a good IVL candidate.

Currently it is used as a backend language for verification of Rust programs in the Creusot tool.

We are also working on verifying x86-64 assembler programs using Coma.

Rust Verification with Creusot

Creusot is a deductive verifier for Rust code. It verifies your code is safe from panics, overflows, and assertion failures. By adding annotations you can take it further and verify your code does the correct thing. Creusot works by translating Rust code to Coma.

x86-64 assembly

Micro-x86-64

Micro-x86-64 is a valid subset of x86-64. Hence, micro-x86-64 files can be passed to as.

Syntax of micro-x86-64

TODO

Specification

TODO

spec ::=  "#@" "assume"     term
        | "#@" "assert"     term
        | "#@" "invariant"  term

+ Multiline comments: "/*@" "assume" term "*/", …

term ::= identifier
       | "true" | "false"
       | 1 | 2 | …
       | rax | eax | … | rbx | …
       | term "binop" term
       | "let" identifier "=" term "in" term
       | "if" term "then" term "else term

binop ::=
        | "/\" | "&&"
        | "\/" | "||"
        | "->"
        | "=" | "<" | ">" | ">=" | "<="
        | "+" | "-" | "*" | "/"

Note that you must explicitly use the mach.asm.X86 module in your code (which should be the case once you actually prove things, but not necessarily when you initially set up a project), such as with the line:

#@ use mach.asm.X86

or you will get a compilation error complaining that the mach.asm.X86 module is not loaded.

Example

#@ use int.Int
#@ use mach.asm.X86

    .text
    .globl  isqrt

isqrt:
#@ assume 0 < rdi < i32_max
        xorq    %rax, %rax
        movq    $1,  %rcx
        jmp     T1
L1:     incq    %rax
        leaq    1(%rcx, %rax, 2), %rcx
T1:
#@ invariant 0 <= rax
#@ invariant rax * rax <= rdi < i32_max
#@ invariant rcx = (rax + 1) * (rax + 1)
        cmpq    %rdi, %rcx
        jle     L1
#@ assert let a = rax in a * a <= rdi < (a+1)*(a+1)
        ret

Limitations

TODO

Contributors

Here is a list of the contributors who have helped improving Coma.

  • Andrei Paskevich
  • Jean-Christophe Filliâtre
  • Paul Patault
  • Xavier Denis