Do we have to optimize this code less just because of corner cases like the above?
As it turns out, no, we don't -- there are some situations where it is perfectly fine to do a pointer-integer cast *without* having the "exposure" side-effect.
Specifically, this is the case if we never intend to cast the integer back to a pointer!
-That might seem like a niche case, but it turns out that most of the time, we can avoid 'bare' integer-pointer casts, and instead use an operation like [`with_addr`](https://doc.rust-lang.org/nightly/std/primitive.pointer.html#method.with_addr) that explicitly specifies which provenance to use for the newly created pointer.
+That might seem like a niche case, but it turns out that most of the time, we can avoid 'bare' integer-pointer casts, and instead use an operation like [`with_addr`](https://doc.rust-lang.org/nightly/std/primitive.pointer.html#method.with_addr) that explicitly specifies which provenance to use for the newly created pointer.[^with_addr]
This is more than enough for low-level pointer shenanigans like pointer tagging, as [Gankra demonstrated](https://gankra.github.io/blah/tower-of-weakenings/#strict-provenance-no-more-getting-lucky).
Rust's [Strict Provenance experiment](https://doc.rust-lang.org/nightly/std/ptr/index.html#strict-provenance) aims to determine whether we can use operations like `with_addr` to replace basically all integer-pointer casts.
+[^with_addr]: `with_addr` was added to the Rust standard library very recently, as an unstable API. Such an operation has been floating around in various discussions in the Rust community for quite a while, and it has even made it into [an academic paper](https://iris-project.org/pdfs/2022-popl-vip.pdf) under the name of `copy_alloc_id`. Who knows, maybe one day it will find its way into the C standard as well. :)
+
As part of Strict Provenance, Rust now has a second way of casting pointers to integers, `ptr.addr()`, which does *not* "expose" the permission of the underlying pointer, and hence can be treated like a pure operation![^experiment]
We can do shenanigans on the integer representation of a pointer *and* have all these juicy optimizations, as long as we don't expect bare integer-pointer casts to work.
As a bonus, this also makes Rust work nicely on CHERI *without* a 128-bit wide `usize`, and it helps Miri, too.
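To make this concrete, here is a minimal sketch of pointer tagging that never uses a bare integer-pointer cast. The names `TAG_MASK`, `tag`, `is_tagged`, and `untag` are made up for illustration; `addr`, `with_addr`, and `map_addr` are the standard library methods discussed above. They were unstable when this was written, so the sketch assumes a toolchain where they are available (to my knowledge, stable since Rust 1.84).

```rust
/// Hypothetical tag bit: the lowest address bit, which is always zero
/// for a `u32` pointer because `u32` has an alignment of 4.
const TAG_MASK: usize = 0b1;

/// Sets the tag bit. `map_addr(f)` is shorthand for
/// `ptr.with_addr(f(ptr.addr()))`, so the provenance of `ptr` is kept.
fn tag(ptr: *mut u32) -> *mut u32 {
    ptr.map_addr(|addr| addr | TAG_MASK)
}

/// Reads the tag through `addr()`, which does not expose the provenance.
fn is_tagged(ptr: *mut u32) -> bool {
    ptr.addr() & TAG_MASK != 0
}

/// Clears the tag bit, explicitly reusing the provenance of `ptr`
/// for the newly created pointer.
fn untag(ptr: *mut u32) -> *mut u32 {
    let clean = ptr.addr() & !TAG_MASK;
    ptr.with_addr(clean)
}

fn main() {
    let mut value = 42u32;
    let ptr: *mut u32 = &mut value;

    let tagged = tag(ptr);
    assert!(is_tagged(tagged));

    let restored = untag(tagged);
    // SAFETY: `restored` has the same address and provenance as `ptr`,
    // which points to the live local `value`.
    unsafe { *restored += 1 };
    assert_eq!(value, 43);
}
```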
(The set of valid guesses in C is just a lot more restricted since they do not have `wrapping_offset`, and the model does not cover `restrict`.
That means they can actually feasibly give an algorithm for how to do the guessing.
They don't have to invoke scary terms like "angelic non-determinism", but the end result is the same -- and to me, the fact that it is equivalent to angelic non-determinism is what justifies this as a reasonable semantics.
-Presenting this as a concrete algorithm to pick a suitable provenance is then just a stylistic choice.)
+Presenting this as a concrete algorithm to pick a suitable provenance is then just a stylistic choice.
+Kudos go to Michael Sammler for opening my eyes to this interpretation of "user disambiguation".)
What is left is the question of how to handle pointer-integer transmutation, and this is where the roads are forking.
PNVI-ae-udi explicitly says loading from a union field at integer type exposes the provenance of the pointer being loaded, if any.
So far, this all applies to LLVM as a Rust and C backend equally, so I don't think there are any good alternatives.
On the plus side, adapting this strategy for `inttoptr` and `ptrtoint` means that the recent LLVM ["Full Restrict Support"](https://lists.llvm.org/pipermail/llvm-dev/2019-March/131127.html) can also handle pointer-integer round-trips "for free"!
+Adding `with_addr`/`copy_alloc_id` to LLVM is not strictly necessary, since it can be implemented with `getelementptr` (without `inbounds`).
+However, optimizations don't always seem to deal well with that pattern, so it might still be a good idea to add this as a primitive operation to LLVM.
+
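The same idea can be sketched on the Rust side, where wrapping pointer arithmetic plays the role of `getelementptr` without `inbounds` (that correspondence is my understanding of how Rust lowers to LLVM, not something this post establishes). The helper `with_addr_via_offset` is a hypothetical name, and the sketch assumes a toolchain where `addr` and `with_addr` are available.

```rust
/// Hypothetical helper: builds a pointer with the provenance of `ptr`
/// and the address `addr`, using only wrapping pointer arithmetic.
fn with_addr_via_offset<T>(ptr: *const T, addr: usize) -> *const T {
    // Byte distance from the current address to the target address;
    // wrapping subtraction keeps this well-defined in both directions.
    let delta = addr.wrapping_sub(ptr.addr());
    // Wrapping arithmetic on a byte pointer keeps the provenance and is
    // allowed to leave the allocation, unlike `add`/`offset`.
    ptr.cast::<u8>().wrapping_add(delta).cast::<T>()
}

fn main() {
    let data = [10u8, 20, 30, 40];
    let base = data.as_ptr();
    let target = base.addr() + 2;

    let p = with_addr_via_offset(base, target);
    assert_eq!(p.addr(), target);
    // Same address as the result of the library operation.
    assert_eq!(p, base.with_addr(target));
    // SAFETY: `p` points to `data[2]` and carries the provenance of `base`.
    assert_eq!(unsafe { *p }, 30);
}
```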
Where things become more subtle is around pointer-integer transmutation.
If LLVM wants to keep doing replacement of `==`-equal integers (which I strongly assume to be the case), *something* needs to give: my first example, with casts replaced by transmutation, shows a miscompilation.
If we focus on doing an `i64` load of a pointer value (e.g. as in the LLVM IR produced by `transmute_union`, or pointer-based transmutation in Rust), what are the options?