The CompCert verified compiler

Commented Coq development

Version 3.13, 2023-07-04

Chamois fork for the KVX backend: version 2024-04-29

The Chamois CompCert fork for the KVX backend

This web page is a patched version of the table of contents of the official CompCert sources documentation, as given on the CompCert Web site. The unmodified parts of this table appear in gray.

A high-level view of this CompCert backend is provided by the following documents:

The people responsible for this version are

and formerly

with contributions of:

Introduction

CompCert is a compiler that generates ARM, PowerPC, RISC-V and x86 assembly code from CompCert C, a large subset of the C programming language. The particularity of this compiler is that it is written mostly within the specification language of the Coq proof assistant, and its correctness --- the fact that the generated assembly code is semantically equivalent to its source program --- was entirely proved within the Coq proof assistant.

High-level descriptions of the CompCert compiler and its proof of correctness can be found in the following papers (in increasing order of technical details):

This Web site gives a commented listing of the underlying Coq specifications and proofs. Proof scripts are folded by default, but can be viewed by clicking on "Proof". Some modules (written in italics below) differ between the five target architectures. The PowerPC versions of these modules are shown below; the AArch64, ARM, x86 and RISC-V versions can be found in the source distribution.

This development is a work in progress; some parts have substantially changed since the overview papers above were written.

The complete sources for CompCert can be downloaded from the Git repository or the CompCert Web site.

This document and the CompCert sources are copyright Institut National de Recherche en Informatique et en Automatique (INRIA) and AbsInt Angewandte Informatik GmbH, and are distributed under the terms of the following license.

Table of contents

Compiler options specific to the Chamois fork

See this page.

Oracles' documentation

The various oracles validated by translation in our CompCert version are documented here. Most of them rely on the symbolic execution validator provided with the BTL IR, but there are also some specialized oracles helping in specific passes. Note that this documentation only covers oracles from the Chamois fork, not those of the mainline CompCert.

General-purpose libraries, data structures and algorithms

The Impure library (of Boulmé)

The abstractbb library, introduced for Aarch64 and KVX cores

This library introduces a block intermediate representation used for postpass scheduling verification. It might be extended to others backends.

Definitions and theorems used in many parts of the development

Source, intermediate and target languages: syntax and semantics

Languages introduced in the Chamois version

Generic (or possibly adaptable) Intermediate Representations (IR): Specific to KVX:

Compiler passes

Pass Source & target Compiler code Correctness proof
Pulling side-effects out of expressions;
fixing an evaluation order
CompCert C to Clight SimplExpr SimplExprspec
SimplExprproof
Pulling non-adressable scalar local variables out of memory Clight to Clight SimplLocals SimplLocalsproof
Simplification of control structures;
explication of type-dependent computations
Clight to Csharpminor Cshmgen Cshmgenproof
Stack allocation of local variables
whose address is taken;
simplification of switch statements
Csharpminor to Cminor Cminorgen Cminorgenproof
Recognition of operators
and addressing modes;
if-conversion
Cminor to CminorSel Selection
SelectOp
SelectLong
SelectDiv
SplitLong
Selectionproof
SelectOpproof
SelectLongproof
SelectDivproof
SplitLongproof
Construction of the CFG,
3-address code generation
CminorSel to RTL RTLgen RTLgenspec
RTLgenproof
Recognition of tail calls RTL to RTL Tailcall Tailcallproof
Recognition of tail recursion calls RTL to RTL Tailrec Tailrecproof
Function inlining RTL to RTL Inlining Inliningspec
Inliningproof
Passes introduced for profiling (for later use in trace selection)
Insert profiling annotations (for recording experiments -- see PROFILE.md). RTL to RTL Profiling Profilingproof
Update ICond nodes (from recorded experiments -- see PROFILE.md). RTL to RTL ProfilingExploit ProfilingExploitproof
Postorder renumbering of the CFG (1) RTL to RTL Renumber Renumberproof
[CSE1] Common subexpression elimination (1) RTL to RTL CSE
CombineOp
CSEproof
CombineOpproof
[Duplicate pass] Static Prediction + Unroll single
Add static prediction information to Icond nodes
Unrolls a single iteration of innermost loops
RTL to RTL Duplicate (generic checker) Duplicateproof (generic proof)
Duplicatepasses (several passes from several oracles)
Postorder renumbering of the CFG (2) RTL to RTL Renumber Renumberproof
[Duplicate pass] Tail Duplication
Performs tail duplication on interesting traces to form superblocks RTL to RTL Duplicate (generic checker) Duplicateproof (generic proof)
Duplicatepasses (several passes from several oracles)
Postorder renumbering of the CFG (3) RTL to RTL Renumber Renumberproof
[Duplicate pass] Unroll Body
Unrolling the body of innermost loops RTL to RTL Duplicate (generic checker) Duplicateproof (generic proof)
Duplicatepasses (several passes from several oracles)
Postorder renumbering of the CFG (4) RTL to RTL Renumber Renumberproof
[Duplicate pass] Loop Rotate
Loop rotation RTL to RTL Duplicate (generic checker) Duplicateproof (generic proof)
Duplicatepasses (several passes from several oracles)
Postorder renumbering of the CFG (5) RTL to RTL Renumber Renumberproof
Constant propagation RTL to RTL Constprop
ConstpropOp
Constpropproof
ConstproppOproof
Integer Promotion Pass (RISC-V only) RTL to RTL IntPromotionCommon IntPromotion
Postorder renumbering of the CFG (6) RTL to RTL Renumber Renumberproof
[CSE1] Common subexpression elimination (2) RTL to RTL CSE
CombineOp
CSEproof
CombineOpproof
[CSE2] Common subexpression elimination RTL to RTL CSE2
CSE2proof
[CSE3] Common subexpression elimination (1) RTL to RTL CSE3
CSE3proof
KillUselessMoves: removing useless moves instructions after CSE3 RTL to RTL KillUselessMoves
KillUselessMovesproof
Forwarding Moves RTL to RTL ForwardMoves
ForwardMovesproof
Redundancy elimination (1) RTL to RTL Deadcode Deadcodeproof
RTL Tunneling RTL to RTL RTLTunneling
RTLTunnelingproof
Allnontrap: transforming loads to non-trapping ones RTL to RTL Allnontrap
Allnontrapproof
Removal of unused static globals RTL to RTL Unusedglob Unusedglobproof
Passes introduced for BTL
Block selection (with Liveness information) RTL to BTL RTLtoBTL RTLtoBTLproof
BTLmatchRTL
Block Optimizations BTL to BTL BTL_BlockSimulation BTL_BlockSimulationproof
with BTL_Invariants (syntax of symbolic expressions and invariants);
BTL_SEtheory (the theory of symbolic execution on BTL_);
BTL_SEsimuref (the low-level specifications of the simulation checker);
BTL_SEimpl_prelude (the implementation's common definitions);
BTL_SEimpl_SE (the Symbolic Execution implementation);
BTL_SEimpl_SI (the Symbolic Invariants implementation);
and BTL_SEimpl_check (the simulation checker with hash-consing).
Forgeting blocks BTL to RTL BTLtoRTL BTLtoRTLproof
BTLmatchRTL
Postorder renumbering of the CFG (7) RTL to RTL Renumber Renumberproof
Passes from register allocation
Register allocation (validation a posteriori) RTL to LTL Allocation Allocationproof
Branch tunneling LTL to LTL LTLTunneling LTLTunnelingproof
Linearization of the CFG LTL to Linear Linearize Linearizeproof
Removal of unreferenced labels Linear to Linear CleanupLabels CleanupLabelsproof
Synthesis of debugging information Linear to Linear Debugvar Debugvarproof
Laying out the activation records Linear to Mach Stacking
Bounds
Stacklayout
Stackingproof
Separation
Passes introduced for backends with postpass scheduling
Reconstruction of basic-blocks at Mach level Mach to Machblock Machblockgen ForwardSimulationBlock
Machblockgenproof
Emission of purely sequential assembly code Machblock to Asmblock Asmblockgen Asmblockgenproof0
Asmblockgenproof1
Asmblockgenproof
Bundling (and basic-block scheduling) Asmblock to Asmvliw PostpassScheduling
using Asmblockdeps
and the abstractbb library
PostpassSchedulingproof
Asmblockprops
Flattening bundles (only a bureaucratic operation) Asmvliw to Asm Asmgen Asmgenproof (whole simulation proof from Mach to Asm)

All together (there are many more passes than on vanilla CompCert: their order is specified in Compiler)

Static analyses

The following static analyses are performed over the RTL intermediate representation to support optimizations such as constant propagation, CSE, and dead code elimination.

Type systems

The type system of CompCert C is fully formalized. For some intermediate languages of the back-end, simpler type systems are used to statically capture well-formedness conditions.