X-Git-Url: https://git.ralfj.de/web.git/blobdiff_plain/c0e17f43f2947869ac855e6b886512de1c772682..2c1c756ec793faa5df5a1dd7abfc3f5c92f1f924:/personal/_posts/2018-07-24-pointers-and-bytes.md?ds=sidebyside diff --git a/personal/_posts/2018-07-24-pointers-and-bytes.md b/personal/_posts/2018-07-24-pointers-and-bytes.md index 8f7089e..23371fe 100644 --- a/personal/_posts/2018-07-24-pointers-and-bytes.md +++ b/personal/_posts/2018-07-24-pointers-and-bytes.md @@ -1,6 +1,6 @@ --- title: "Pointers Are Complicated, or: What's in a Byte?" -categories: internship rust +categories: internship rust programming forum: https://internals.rust-lang.org/t/pointers-are-complicated-or-whats-in-a-byte/8045 --- @@ -32,13 +32,18 @@ int test() { } {% endhighlight %} It would be beneficial to be able to optimize the final read of `y[0]` to just return `42`. +C++ compilers regularly perform such optimizations as they are crucial for generating high-quality assembly.[^perf] The justification for this optimization is that writing to `x_ptr`, which points into `x`, cannot change `y`. +[^perf]: To be fair, the are *claimed* to be crucial for generating high-quality assembly. The claim sounds plausible to me, but unfortunately, I do not know of a systematic study exploring the performance benefits of such optimizations. + However, given how low-level a language C++ is, we can actually break this assumption by setting `i` to `y-x`. Since `&x[i]` is the same as `x+i`, this means we are actually writing `23` to `&y[0]`. Of course, that does not stop C++ compilers from doing these optimizations. -To allow this, the standard declares our code to have [undefined behavior]({% post_url 2017-07-14-undefined-behavior %}). +To allow this, the standard declares our code to have [undefined behavior]({% post_url 2017-07-14-undefined-behavior %}).[^0] + +[^0]: An argument could be made that compilers should just not do such optimizations to make the programming model simpler. This is a discussion worth having, but the point of this post is not to explore this trade-off, it is to explore the consequences of the choices made in C++. First of all, it is not allowed to perform pointer arithmetic (like `&x[i]` does) that goes [beyond either end of the array it started in](https://timsong-cpp.github.io/cppwp/n4140/expr.add#5). Our program violates this rule: `x[i]` is outside of `x`, so this is undefined behavior. @@ -74,7 +79,7 @@ It does *not* point at an actual element of another object *even if they have th The key point here is that just because `x_ptr` and `&y[0]` point to the same *address*, that does not make them *the same pointer*, i.e., they cannot be used interchangeably: `&y[0]` points to the first element of `y`; `x_ptr` points past the end of `x`. -If we replace `*x_ptr = 23` by `*&y[0] = 0`, we change the meaning of the program, even though the two pointers have been tested for equality. +If we replace `*x_ptr = 23` by `*&y[0] = 23`, we change the meaning of the program, even though the two pointers have been tested for equality. This is worth repeating: @@ -121,6 +126,13 @@ This is another example of using a "virtual machine" that's different from the r Here's a simple proposal (in fact, this is the model of pointers used in [CompCert](https://hal.inria.fr/hal-00703441/document) and my [RustBelt work]({% post_url 2017-07-08-rustbelt %}), and it is also how [miri](https://github.com/solson/miri/) implements [pointers](https://github.com/rust-lang/rust/blob/fefe81605d6111faa8dbb3635ab2c51d59de740a/src/librustc/mir/interpret/mod.rs#L121-L124)): A pointer is a pair of some kind of ID uniquely identifying the *allocation*, and an *offset* into the allocation. +If we defined this in Rust, we might write +{% highlight rust %} +struct Pointer { + alloc_id: usize, + offset: isize, +} +{% endhighlight %} Adding/subtracting an integer to/from a pointer just acts on the offset, and can thus never leave the allocation. Subtracting a pointer from another is only allowed when both point to the same allocation (matching [C++](https://timsong-cpp.github.io/cppwp/n4140/expr.add#6)).[^2] @@ -166,7 +178,7 @@ We have to say what the value of `v` is, so we have to find some way to answer t (And this is an entirely separate issue from the problem with multiplication that came up in the last section. We just assume some abstract type `Pointer`.) We cannot represent a byte of a pointer as an element of `0..256`. -Essentially, if we use a naive model of memory, the extra "hidden" part of a pointer (the one that makes it more than just an integer) would be lost whne a pointer is stored to memory and loaded again. +Essentially, if we use a naive model of memory, the extra "hidden" part of a pointer (the one that makes it more than just an integer) would be lost when a pointer is stored to memory and loaded again. We have to fix this, so we have to extend our notion of a "byte" to accomodate that extra state. So, a byte is now *either* an element of `0..256` ("raw bits"), *or* the n-th byte of some abstract pointer. If we were to implement our memory model in Rust, this might look as follows: @@ -230,6 +242,8 @@ Finally, `Uninit` is also a better choice for interpreters like miri. Such interpreters have a hard time dealing with operations of the form "just choose any of these values" (i.e., non-deterministic operations), because if they want to fully explore all possible program executions, that means they have to try every possible value. Using `Uninit` instead of an arbitrary bit pattern means miri can, in a single execution, reliably tell you if your programs uses uninitialized values incorrectly. +**Update:** Since writing this section, I have written an entire [post dedicated to uninitialized memory and "real hardware"]({% post_url 2019-07-14-uninit %}) with more details, examples and references. **/Update** + ## Conclusion We have seen that in languages like C++ and Rust (unlike on real hardware), pointers can be different even when they point to the same address, and that a byte is more than just a number in `0..256`.