-And I cannot blame them; the way compiler development typically works nowadays, I think bugs like this are inevitable.
-Pointer provenance is just a particularly good example of where the current approach failed, but it is not the only case.
-For example, [§2.3 of this paper](https://plv.mpi-sws.org/validc/paper.pdf) (see Figure 3 for the code) shows how a sequence of two optimizations can lead to a miscompilation, where the first optimization is correct under the LLVM concurrency model, and the second optimization is correct under the C++11 concurrency model---but there is no concurrency model under which *both* optimizations are correct, so each compiler (or rather, each compiler IR) needs to pick one or the other.
-
-Which brings me to my main conclusion for this post: I think the way optimizing compilers for these low-level languages are built is fundamentally flawed.
-The current approach, which one might call "optimizations-first", is to largely think of the semantics of a compiler IR in terms of the set of optimizations that it enables.[^weak-mem]
-However, each additional optimization puts another constraint on the IR semantics---how can we be sure that there even is a way to satisfy all constraints in a single language?
-The only way I know to ensure that the constraint set remains satisfiable, i.e., to ensure that performing all these optimizations in any order does not introduce bugs, is to pick a consistent semantics for the IR and then show each optimization to be *correct* under those semantics, in the sense defined above.
-
-[^weak-mem]: We can also clearly see this approach in most discussions around weak memory concurrency effects, which typically are all about which reorderings the compiler is allowed to perform and how barriers can be used to prevent the reorderings. This is the optimizations-first approach in action.
-
-To avoid the problem of incompatible optimizations, I think we need to take compiler IRs more serious as programming languages in their own right, and properly define the Abstract Machine that describes their semantics---including all the UB!
-I call this "language-first".
-This language definition does not need to be a formal artifact in a proof assistant, but the description needs to be precise enough such that there are no ambiguities, and such that for each desired optimization we can evaluate whether it is *correct* for this IR or not.
-One great way to achieve this is to implement a *reference interpreter* for the IR, an interpreter that not only evaluates IR programs but also says if there was any UB triggered by this execution.
-Writing such an interpreter is a great exercise because it requires explicitly writing out what one could call the "state space" of the Abstract Machine:
-just what *is* a value, which pieces of data are needed to fully describe it, what kind of information is stored in memory, and so on.
-Doing so makes it *immediately obvious* that a pointer has provenance, since otherwise it is impossible to correctly check for out-of-bounds accesses.
-
+And I cannot blame them; the way compiler development typically works, I think bugs like this are inevitable: when exactly UB arises in an IR is often only loosely specified, so evaluating whether some optimization is *correct* in the sense defined above is basically impossible.
+Pointer provenance is just a particularly good (and subtle) example.
+The warm-up above is another trivial case of this (albeit one where the existing specification is sufficient): loop-invariant code motion of arithmetic operations and UB on arithmetic overflow can both be *correct*, but not in the same IR.
+For another example, see [§2.3 of this paper](https://plv.mpi-sws.org/validc/paper.pdf) (Figure 3 contains the code) which shows how a sequence of two optimizations can lead to a miscompilation, where the first optimization is *correct* under the LLVM concurrency model, and the second optimization is *correct* under the C++11 concurrency model---but there is no concurrency model under which *both* optimizations are correct, so each compiler (or rather, each compiler IR) needs to pick one or the other.
+Finally, this [paper on `undef` and `poison`](https://www.cs.utah.edu/~regehr/papers/undef-pldi17.pdf) gives examples for optimizations that are broken by the presence of `undef` in LLVM, and describes some of the trade-offs that arise when defining the semantics of `poison`.
+
+Which brings me to my main conclusion for this post: to avoid the problem of incompatible optimizations, I think we need to take compiler IRs more serious as programming languages in their own right, and give them a precise specification---including all the UB.
+Now, you may object by saying that LLVM has an [extensive LangRef](https://llvm.org/docs/LangRef.html), and still, by reading the LLVM specification one could convince oneself that each of the three optimizations above is correct, which as we have seen is contradictory.
+What is missing? Do we need a formal mathematical definition to avoid such ambiguities?
+I do not think so; I think there is something simpler that will already help a lot.
+The source of the contradiction is that some parts of the specification *implicitly* assume that pointers have provenance, which is easy to forget when considering other operations.
+That's why I think it is very important to make such assumptions more explicit: the specification should *explicitly* describe the information that "makes up" a value, which will include things like provenance and [whether the value is (wholly or partially) initialized]({% post_url 2018-07-24-pointers-and-bytes %}).[^C]
+This information needs to be extensive enough such that a hypothetical interpreter can use it to detect all the UB.
+Doing so makes it obvious that a pointer has provenance, since otherwise it is impossible to correctly check for out-of-bounds pointer arithmetic while also permitting one-past-the-end pointers.
+
+[^C]: The astute reader will notice that this information is also absent in the C and C++ specifications. That is indeed one of my main criticisms of these specifications. It leads to many open questions not only when discussing provenance (including `restrict`, which I did not even go into here) but also when discussing indeterminate and unspecified values or the rules around effective types.
+
+This is my bar for what I consider a sufficiently precise language specification: it needs to contain all the information needed such that writing a UB-checking interpreter is just a matter of "putting in the work", but does not require introducing anything new that is not described in the specification.
+
+Ideally, the interpreter is not just hypothetical.