From: Ralf Jung Date: Mon, 11 Apr 2022 20:29:51 +0000 (-0400) Subject: more precision around restrict X-Git-Url: https://git.ralfj.de/web.git/commitdiff_plain/1a099d3b4bc546d4265da1cfcb180b6373d65701 more precision around restrict --- diff --git a/personal/_posts/2022-04-11-provenance-exposed.md b/personal/_posts/2022-04-11-provenance-exposed.md index 0dcf9e0..56c7f8f 100644 --- a/personal/_posts/2022-04-11-provenance-exposed.md +++ b/personal/_posts/2022-04-11-provenance-exposed.md @@ -23,12 +23,14 @@ but even when compiling for AMD64 CPUs, compilers act "as if" pointers had such ## Dead cast elimination considered harmful -The key ingredient that will help us understand the nuances of provenance is `restrict`, a C keyword to promise that a given pointer `x` does not alias any other pointer not derived from `x`. +The key ingredient that will help us understand the nuances of provenance is `restrict`, a C keyword to promise that a given pointer `x` does not alias any other pointer not derived from `x`.[^restrict] This is comparable to the promise that a `&mut T` in Rust is unique. However, just like last time, we want to consider the limits that `restrict` combined with integer-pointer casts put on an optimizing compiler -- so the actual programming language that we have to be concerned with is the IR of that compiler. Nevertheless I will use the more familiar C syntax to write down this example; you should think of this just being notation for the "obvious" equivalent function in LLVM IR, where `restrict` is expressed via `noalias`. Of course, if we learn that the IR has to put some limitations on what code may do, this also applies to the surface language -- so we will be talking about all three (Rust, C, LLVM) quite a bit. +[^restrict]: The exact semantics of `restrict` are subtle and I am not aware of a formal definition. (Sadly, the one in the C standard does not really work, as you can see when you try to apply it to my example.) My understanding is as follows: `restrict` promises that this pointer, and all pointers derived from it, will not be used to perform memory accesses that *conflict* with any access done by pointers outside of that set. A "conflict" arises when two memory accesses overlap and at least one of them is a write. This promise is scoped to the duration of the function call when `restrict` appears in an argument type; I have no good idea for what the scope of the promise is in other situations. + With all that out of the way, consider the following program: {% highlight c %} #include @@ -161,7 +163,7 @@ To explain what that side-effect is, we have to get deep into the pointer proven Specifically, `x` has permission to access `i[0]` (declared in `main`), and `y` has permission to access `i[1]`.[^dyn] `y2` just inherits the permission from `y`. -[^dyn]: Actually, this is not quite how `restrict` works. The exact set of locations these pointers can access is determined *dynamically*, and the only constraint is that they cannot be used to access *the same location* (except if both are just doing a load). However, I carefully picked this example so that these subtleties do not change anything. +[^dyn]: As mentioned in a previous footnote, this is not actually how `restrict` works. The exact set of locations these pointers can access is determined *dynamically*, and the only constraint is that they cannot be used to access *the same location* (except if both are just doing a load). However, I carefully picked this example so that these subtleties should not change anything. But which permission does `ptr` get? Since integers do not carry provenance, the details of this permission information are lost during a pointer-integer cast, and have to somehow be 'restored' at the integer-pointer cast.