### How to publish research accompanied by mechanically-checkable proofs

I've both written and reviewed research papers that incorporate mechanically-checked proofs (either as the main subject of the research or as supporting the correctness of other contributions). Computer-aided formalization and proof are increasingly part of the mainstream of (the formal side of) programming languages research, and seem to be being especially enthusiastically adopted by students and junior researchers. However, just having mechanically verified a proof of (some aspect of) a system is

**not enough**- all of the standard considerations regarding the reporting of scientific research, on-paper proofs, or research code still apply.

This post collects some guidelines that I hope will be helpful for researchers working on papers accompanied by formal proofs. I would have thought most of these were obvious, but have encountered enough counterexamples to disabuse me of this naive assumption. The following is probably not an exhaustive list, but is offered as a starting point.

### Make your formalization easy to verify

Assume the reader or reviewer will check your proofs (using the same tool used to produce them). That was the point of making them available, right?**Submit your formalization with the paper, or in an easy to access place.**This one really does go without saying, and I haven't seen anyone forget to do this, but I thought I'd mention this in the spirit of making things as obvious as possible.

**Include a README file**listing

*at least*the names of all salient proof files, their contents, and the version of the system you used to generate (and that can be used to check) the proofs. Most of the theorem provers are active research projects, and can change significantly between releases. So, stating the version used is necessary for reproducibility, and likely helpful as a memory aid later if reviewers have questions that you have to answer quickly. If you want to be especially nice to reviewers, include a command line (or shell script) that can be run to check the proofs automatically, assuming an appropriate version of the relevant system is in the environment. If your proofs build on someone else's, state these dependencies and give links.

### Make your formalization easy to read

Formal proof scripts are, by their nature, sometimes opaque or incomprehensible. Some systems, such as Coq tactics or Isabelle apply-scripts, essentially result in traces of imperative manipulations of the proof state, so are nearly impossible to read without re-executing the proofs interactively Other systems, such as Isabelle's ISAR, provide a proof language that more closely resembles mathematical writing, but putting proofs into this form also takes a bit more effort. In any case, it is generally safe to assume that readers will mostly focus on the statements of lemmas and the structure of proofs, so make this explicit - comment, format and organize the proofs much as you would code or technical material in a paper to make it easier to read and understand. In Isabelle apply-scripts, for example, I tend to insert comments in longer inductive proofs whenever a new case is started. This not only makes it easy to see which part of the proof handles each case, but also helps diagnose problems and adapt the proof (e.g. due to a change in an earlier definition).**Use syntactic flexibility judiciously.**Some systems, particularly Isabelle and Agda, now support very flexible parsing and pretty-printing of tokens (including Unicode symbols and mixfix operations taking multiple arguments). This can lead to elegant, easy-to-read formulas and it can also lead to gobbledegook. My experience is generally that a little goes a long way - a sensible rule of thumb is to introduce special notation for the 5-10 most frequently used relations or operations (e.g. those featuring prominently in the paper) and use mnemonic names for everything else. If you feel that you must have 15 different typing relations, include a table somewhere (e.g. in the README) mapping the notations to their informal meanings and the file in which their definition is found.

### Don't oversell the proof

If your paper states that all of the results have been formalized, the accompanying formalization had better back this up: it has to include formal statements corresponding to all of the "theorems", "lemmas" etc. in the paper. Likewise, presenting results about a full system in a paper and giving an accompanying proof for a simpler subsystem (without explaining this) is also misrepresentation. It is far better to be honest about any limitations in the formalization than to falsely claim that everything has been formalized when it hasn't - the latter is, in my view, clear grounds for rejection.**Explain representation issues and any resulting differences between the paper and formal presentation.**De Bruijn and locally nameless representations tend to be superficially different than the paper syntax - e.g. due to extra constructors for parameters. Higher-order abstract syntax representations sometimes have an even bigger gap between the natural paper formulation and the formal version. State which representation technique is being used and identify any artifacts related to it - particularly justifying any differences that might lead readers to conclude that you haven't proved what you claim to have proved. For bonus points, sketch a proof of adequacy.

**Make it easy for reviewers (and future readers) to determine which**

**"paper" theorem corresponds to which "formal" theorem.**This saves reader effort and, more importantly, heads off reviewers who might otherwise get confused and think that you haven't proved something you said you proved. This can be done several ways: in the README, or by leaving comments near all the formal statements (saying "this is theorem 4.2 in the paper"), or by using theorem names that make the relationship blindingly obvious (such as "theorem_4_2"). The third option can be hard to maintain, though. Another helpful thing to do is to group the results in the formalization into separate files corresponding to the relevant sections of the paper, but this is not always sensible.

**Every use of "escape clauses"**

**must be justified.**Most systems include some feature that instructs the prover to accept some statement as proved without justification. These can be useful during development of a proof (e.g. so that you can use a lemma to see if it is helpful in a larger proof, before putting the effort into trying to prove it.) However, because these constructs allow you to "prove" anything, their use without explanation in a claimed "full formalization" is a huge red flag. There are reasonable and justifiable uses for these, such as a semantic argument that the foundational system being used is sound if certain axioms are added, or a "paper" proof of a result that seems particularly tricky to prove formally. But such uses are potential sources of error that should be acknowledged and justified.

## 2 Comments:

Thanks for this. See also http://math.andrej.com/2013/08/19/how-to-review-formalized-mathematics/ — while there's lots of agreement between these posts, each has some points missing from the other.

Thanks! I hadn't seen this and should have looked around to see if this topic had already been covered. I definitely agree with Andrej's points, especially concerning the fact that reviewers and journals/publishers also have to think about their role, and concerning archivability/copyright. (This is a potentially tangled issue having a lot to do with the challenges of reproducibility for CS/computational science, which I touched on in a previous post.)

## Post a Comment

Subscribe to Post Comments [Atom]

<< Home