-Which brings me to my main conclusion for this post: I think the way optimizing compilers for these low-level languages are built is fundamentally flawed.
-The current approach, which one might call "optimizations-first", is to largely think of the semantics of a compiler IR in terms of the set of optimizations that it enables.[^weak-mem]
-However, each additional optimization puts another constraint on the IR semantics---how can we be sure that there even is a way to satisfy all constraints in a single language?
-The only way I know to ensure that the constraint set remains satisfiable, i.e., to ensure that performing all these optimizations in any order does not introduce bugs, is to pick a consistent semantics for the IR and then show each optimization to be *correct* under those semantics, in the sense defined above.
+Which brings me to my main conclusion for this post: to avoid the problem of incompatible optimizations, I think we need to take compiler IRs more serious as programming languages in their own right, and give them a precise specification---including all the UB.
+Now, you may object by saying that LLVM has an extensive LangRef.
+And still, by reading the LLVM specification one could convince oneself that each of the three optimizations above is correct, which as we have seen is contradictory.
+What is missing? Do we need a formal mathematical definition to avoid such ambiguities?
+I do not think so; I think there is something simpler that will already help a lot.
+The source of the contradiction is that some parts of the specification *implicitly* assume that pointers have provenance, which is easy to forget when considering other operations.
+That's why I think it is very important to make such assumptions more explicit: the specification should *explicitly* describe the information that "makes up" a value, which will include things like provenance and whether the value is (wholly or partially) initialized.[^C]
+This information needs to be extensive enough such that a hypothetical interpreter can use it to detect all the UB.
+Doing so makes it obvious that a pointer has provenance, since otherwise it is impossible to correctly check for out-of-bounds pointer arithmetic while also permitting one-past-the-end pointers.