Proof checkers can be used to verify the correctness of proofs, which, in turn, can prove the correctness of programs. In theory. In practice, there is a slight problem: all of the modern proof checkers are implemented in some high-level functional language, their runtimes tend to be rather complicated, and the checkers themselves are not verified.
There is the common concept of the "trusted codebase": you have to start with something, and the trusted codebase is the amount of program code you have to trust. Unfortunately, keeping the trusted codebase small and keeping proofs simple are contrary goals.
Coq addresses this problem by having a small kernel that checks the proofs, and lots of unverified code that produces proofs. This way, proving even complicated theorems becomes bearable. However, it can also result in regressions with newer versions of Coq: tactics might do things differently in different versions, and automatically generated variable names might change. The common solution is to write proofs that do not depend on those names, or at least to give these names explicitly. That is, however, not always possible. In theory, it is possible to export the proofs into a common format which is (mostly) version independent, but doing that is like saving compiled libraries to work around compiler incompatibilities: specifications tend to change, so proofs should be maintainable anyway, and maintainable proofs can always be ported to newer versions of Coq.
The situation is not perfect, but at least no false proofs are accepted – mostly. There have been bugs that allowed deriving False, and therefore everything else. This is a common criticism of program verification: how trustworthy can an unsound proof checker be? But think about what we are actually talking about: people have written an algorithm, have thought about an informal proof ("proof in words") that would otherwise be accepted, and have then formalized it for a proof checker whose bug is usually some exploit of an edge case. I think a scenario in which this actually leads to a bug in the resulting program code is highly unlikely, and I wonder whether something like that has ever even occurred. In conclusion, one can say that a verified implementation is usually formally correct, and it is almost impossible for it to not satisfy the specification. And, above all, you do not lose anything: you have to write the code anyway, and the process of verifying it will not introduce additional bugs.
That being said, it is – of course – very desirable to have a correct proof checker. Ideally, it should be metacircular. To make it less prone to bugs, it is desirable to keep it as simple as possible. I tried to implement a simple proof checker several times, but I always failed because at some point I didn't find it sufficiently simple anymore. In a functional language with algebraic types and automatic garbage collection, it might not be too hard — but also not elementary. OCaml, Standard ML and Haskell have sophisticated runtimes, which may be a good thing when writing unverified high-level code, but for my taste they are rather complicated for a proof checker. So I was thinking about what is needed to write a realistic proof checker.
I think, in the current situation, it is unrealistic to go to a lower abstraction level than C: C is the lingua franca. However, it is probably a good idea not to use C as a "portable assembler", but rather to adopt a clean and obvious programming style instead of focusing on efficiency.
One reasonable way of encoding the various recursive definitions is to use structs, unions and enums. It is probably the easiest and most readable representation, though probably not the most efficient one, and it should be portable. However, it results in complicated memory management: any implementation would have lots of allocations of small structs that point to each other. Allocating objects can be done with malloc(3), but malloc(3) is usually not meant to manage many small objects. A slice allocator on top of malloc(3) would solve that problem, but then again, malloc(3) is itself just a layer on top of mmap(2), and another layer would make the implementation more complex.
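To illustrate, here is a minimal sketch of what such an encoding could look like, assuming the checker manipulates lambda terms with de Bruijn indices; the names (term, term_kind, mk_app) are made up for this example:

    #include <stdlib.h>

    /* One tagged union per recursive definition; here: lambda terms. */
    enum term_kind { TERM_VAR, TERM_APP, TERM_LAM };

    struct term {
        enum term_kind kind;
        union {
            unsigned var;                            /* de Bruijn index */
            struct { struct term *fun, *arg; } app;  /* application     */
            struct { struct term *body; } lam;       /* abstraction     */
        } u;
    };

    /* Each constructor is one small heap allocation pointing to others. */
    static struct term *mk_app(struct term *fun, struct term *arg)
    {
        struct term *t = malloc(sizeof *t);
        if (!t) abort();
        t->kind = TERM_APP;
        t->u.app.fun = fun;
        t->u.app.arg = arg;
        return t;
    }

This is exactly the pattern that produces many small, interlinked allocations.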
Allocating and deallocating objects can be done directly in the code, but as this happens frequently, it is probably not a good idea to always do it explicitly. A copying garbage collector would probably be best for this situation, and a simple copying garbage collector is probably bearable – however, it would increase the trusted codebase. Reference counting would also be sufficient, and it could be simplified a lot using __attribute__((__cleanup__(...))) from GCC. Still, all of those possibilities – including manual memory management – are rather complicated.
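For illustration, here is a minimal sketch of how reference counting could be simplified with GCC's cleanup attribute; the names (obj, obj_unref, autounref) are hypothetical, only the attribute itself is the GCC feature:

    #include <stdlib.h>

    struct obj {
        int refcount;
        /* ... payload, e.g. a term as sketched above ... */
    };

    static struct obj *obj_new(void)
    {
        struct obj *o = calloc(1, sizeof *o);
        if (!o) abort();
        o->refcount = 1;
        return o;
    }

    static void obj_unref(struct obj *o)
    {
        if (o && --o->refcount == 0)
            free(o);
    }

    /* GCC calls this with a pointer to the variable leaving scope. */
    static void unref_cleanup(struct obj **p) { obj_unref(*p); }
    #define autounref __attribute__((__cleanup__(unref_cleanup)))

    static int check_something(void)
    {
        autounref struct obj *tmp = obj_new();
        /* ... use tmp; obj_unref(tmp) runs on every return path ... */
        return tmp->refcount == 1;
    }

The attribute removes the need for an explicit unref on every exit path, but the counting logic itself remains part of the code one has to trust.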
The question is whether deallocating objects is necessary at all. As The Old New Thing points out, the null garbage collector is correct. A proof checker gets axioms and already proved theorems, a conclusion and a proof as input, and its only objective is to check whether this proof is correct, and maybe to return an error message that indicates the error. A simple stack-oriented language sent through stdio should be sufficient, and therefore it might be sufficient to never actually deallocate. This would probably be the UNIX way: do one thing and do it right. If not enough memory is available for some proof, one could preload libgc.
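To make this concrete, here is a minimal sketch of the never-deallocate approach: a bump allocator on top of malloc(3) that hands out memory and never frees it; the names and the chunk size are arbitrary choices for this example:

    #include <stdlib.h>

    #define CHUNK_SIZE (1u << 20)        /* 1 MiB per chunk, arbitrary */

    static char  *chunk;                 /* current chunk              */
    static size_t used = CHUNK_SIZE;     /* force a chunk on first use */

    /* Hand out n bytes; nothing is ever freed before the process exits. */
    static void *bump_alloc(size_t n)
    {
        n = (n + 15) & ~(size_t)15;      /* keep 16-byte alignment */
        if (used + n > CHUNK_SIZE) {
            chunk = malloc(n > CHUNK_SIZE ? n : CHUNK_SIZE);
            if (!chunk) abort();
            used = 0;
        }
        void *p = chunk + used;
        used += n;
        return p;
    }

With such an allocator, the malloc(3) calls in the term constructors would simply become bump_alloc calls, and the operating system reclaims everything when the process exits.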
At some point, the question is whether all of this is worthwhile at all. One could just use C++ and std::shared_ptr and be happy. But then one gets the additional complexity of the C++ runtime. Well, LEAN appears to be written in C++, but I don't know about its memory management.
At this point, it is questionable whether it is really worthwhile to try to write a proof checker using only low-level programming techniques, rather than relying on well-tested abstractions. Besides the already mentioned small likelihood of a critical bug in proof checkers that leads to wrong proofs of algorithms, even in the presence of such a bug, it would only have to be fixed once; so after 30 years without critical bugs, one could be pretty sure that everything works fine.
Of course, this is not the only reason for wanting an implementation of a proof checker in C. All of these systems appear to try to create their own programming language, whereas I would rather have a library that I can use in the language of my choice. On the other hand, Coq has a way of calling external programs to provide proofs, which is in many cases equivalent to such a library.