Notes
This course is about writing formal specifications, and the techniques we can use to improve
the reliability of our code using those specifications. This topic, broadly, is called
formal methods. The premise of all formal methods is to develop a precise formal description of how
a piece of software is supposed to work. This description is called the
specification, and the desire to make it precise and unambiguous makes
logic a very natural fit for it, since logical statements have a
completely unambiguous meaning. For example, A \lor B is true if A is
true and also true if B is true, including when both are true.
Specifications need not be complete. For example, given a function with the following signature and purpose statement:
; fastsquare : Integer -> Integer
; Squares the input number
I could give it both of the following specifications:
\forall~ x : \text{Integer}.~ \mathtt{fastsquare}~ x \geq 0
\forall~ x : \text{Integer}.~ \mathtt{fastsquare}~ x = x~ *~ x
Both use universal quantification (\forall) from first-order logic, which says that for any integer x, the given logical statement should hold. What does each say? The first says that the result of calling fastsquare should be at least 0, but says nothing about what the actual result should be. This is true, but it also admits many functions that likely would not satisfy us: for example, a constant function that always returns 0.
The second specification is better: it says that running fastsquare on x should produce x * x. This is a statement of correctness: it specifies exactly what the result should be, but makes no statement about how fastsquare should arrive at that result. Perhaps our implementation accomplishes it with bit shifting, relying on knowledge of the underlying representation. A correctness specification may be enough for some purposes, but for others we might need still other specifications. For example, if we were writing high-security code, we might want to ensure properties that relate to information leakage.
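To make the difference between the two specifications concrete, here is a minimal Python sketch. Everything in it is an assumption made for illustration: the shift-and-add body of fastsquare, the always_zero function, and the helper names spec_nonnegative, spec_correct, and holds_on are all hypothetical. Note also that checking a finite sample of inputs is much weaker than the \forall in the specifications; it can refute a specification but cannot establish it.

```python
def fastsquare(x: int) -> int:
    """Hypothetical implementation: square x using only shifts and adds."""
    n = abs(x)  # (-x) * (-x) == x * x, so we can work with the magnitude
    result = 0
    shift = 0
    m = n
    while m:
        if m & 1:
            # This bit of the multiplier is set: add n * 2**shift.
            result += n << shift
        m >>= 1
        shift += 1
    return result


def always_zero(x: int) -> int:
    """A constant function: clearly not a squaring function."""
    return 0


def spec_nonnegative(f, x: int) -> bool:
    """First specification at a single input: f x >= 0."""
    return f(x) >= 0


def spec_correct(f, x: int) -> bool:
    """Second specification at a single input: f x = x * x."""
    return f(x) == x * x


def holds_on(spec, f, inputs) -> bool:
    """Check a specification on a finite collection of inputs only."""
    return all(spec(f, x) for x in inputs)


# An arbitrary finite sample; the real specifications range over all integers.
SAMPLE = range(-1000, 1001)
```

On this sample, both fastsquare and always_zero pass spec_nonnegative, but only fastsquare passes spec_correct: always_zero already fails at x = 1.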
Once we have a good enough specification of what a program should do, we need to use that specification to increase our trust in our code. Note that I do not say that we prove our code correct: trust is always relative, and the goal can essentially never be absolute trust. But if we want to put the effort in, we can get things to be good enough that bugs are vanishingly unlikely. This is the point where the field of formal methods splits into many different directions, some fundamental, and some based on tools.
In this course, we will cover three fundamental approaches. First, we will cover property-based testing (PBT), which is probably the formal methods technique with the highest efficacy-to-weight ratio. Next, we’ll see how SAT/SMT solvers can be used to scale testing approaches to very large input spaces and to solve classes of problems automatically. Lastly, we will see how to use the theorem prover Lean to prove that code satisfies very expressive specifications. Provided there aren’t bugs in the Lean system, these proofs are highly trustworthy, but constructing proofs is difficult and requires specialized tools, so it is a more specialized technique than SMT, and certainly than property-based testing, which can be used easily in almost any setting.
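As a rough preview of the first approach, the core idea of property-based testing can be sketched in a few lines of Python: generate many random inputs and check that a specification holds on each one. This is a hand-rolled illustration using only the standard library, not the API of any particular PBT library; the names check_property, gen_int, square_reference, and buggy_square are hypothetical, and real PBT libraries offer much more (for instance, shrinking a failing input to a smaller one).

```python
import random


def check_property(prop, gen, trials=1000, seed=0):
    """Run prop on randomly generated inputs.

    Returns the first input for which prop is false (a counterexample),
    or None if every trial passed.
    """
    rng = random.Random(seed)  # fixed seed so that runs are repeatable
    for _ in range(trials):
        x = gen(rng)
        if not prop(x):
            return x
    return None


def gen_int(rng):
    """Hypothetical generator: a random integer in a fixed range."""
    return rng.randint(-10**6, 10**6)


def square_reference(x):
    """Stand-in for a correct implementation."""
    return x * x


def buggy_square(x):
    """Deliberately wrong for negative inputs: returns -(x * x) there."""
    return x * abs(x)


def is_square_of_input(f):
    """Build the correctness property 'f x = x * x' for a given f."""
    return lambda x: f(x) == x * x


ok = check_property(is_square_of_input(square_reference), gen_int)
counterexample = check_property(is_square_of_input(buggy_square), gen_int)
```

Here ok is None because no trial fails, while counterexample is some negative integer, since buggy_square only disagrees with x * x on negative inputs. Passing all trials increases trust in the code but does not prove the specification.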
To begin, however, we need a foundation: propositional logic.