+++ /dev/null
----
-title: "Stacked Borrows Implemented"
-categories: internship rust
----
-
-Three months ago, I proposed [Stacked Borrows]({% post_url
-2018-08-07-stacked-borrows %}) as a model for defining what kinds of aliasing
-are allowed in Rust, and the idea of a [validity invariant]({% post_url
-2018-08-22-two-kinds-of-invariants %}) that has to be maintained by all code at
-all times. Since then I have been busy implementing both of these, and
-developed Stacked Borrows further in doing so. This post describes the latest
-version of Stacked Borrows, and reports my findings from the implementation
-phase: What worked, what did not, and what remains to be done. There will also
-be an opportunity for you to help the effort!
-
-<!-- MORE -->
-
-What Stacked Borrows does is that it defines a semantics for Rust programs such
-that some things about references always hold true for every valid execution
-(meaning executions where no [undefined behavior]({% post_url
-2017-07-14-undefined-behavior %}) occurred): `&mut` references are unique (we
-can rely on no accesses by other functions happening to the memory they point
-to), and `&` references are immutable (we can rely on no writes happening to the
-memory they point to, unless there is an `UnsafeCell`). Usually we have the
-borrow checker guarding us against such nefarious violations of reference type
-guarantees, but alas, when we are writing unsafe code, the borrow checker cannot
-help us. We have to define a set of rules that makes sense even for unsafe
-code.
-
-I will try to explain at least parts of this model again in this post. The
-explanation is not going to be the same as last time, not only because it
-changed a bit, but also because I think I understand the model better myself
-now.
-
-Ready? Let's get started. I hope you brought some time, because this is a
-rather lengthy post. If you are not interested in a detailed description of
-Stacked Borrows, you can skip most of this post and go right to [section 4]. If
-you only want to know how to help, jump to [section 6].
-
-## 1 Enforcing Uniqueness
-
-Let us first ignore the part about `&` references being immutable and focus on
-uniqueness of mutable references. Namely, we want to define our model in a way
-that calling the following function will trigger undefined behavior:
-
-{% highlight rust %}
-fn demo0() {
- let x = &mut 1u8;
- let y = &mut *x;
- *y = 5;
- // Write through a pointer aliasing `y`
- *x = 3;
- // Use `y` again, asserting it is still exclusive
- let _val = *y;
-}
-{% endhighlight %}
-
-We want this function to be disallowed because between two uses of `y`, there is
-a use of another pointer for the same location, violating the fact that `y`
-should be unique.
-
-Notice that this function does not compile, the borrow checker won't allow it.
-That's great! It is undefined behavior, after all. But the entire point of
-this exercise is to explain *why* we have undefined behavior here *without*
-referring to the borrow checker, because we want to have rules that also work
-for unsafe code.
-
-To be able to do this, we have to pretend our machine has two things which real
-CPUs do not have. This is an example of adding "shadow state" or "instrumented
-state" to the "virtual machine" that we [use to specify Rust]({% post_url
-2017-06-06-MIR-semantics %}). This is not an uncommon approach, often times
-source languages make distinctions that do not appear in the actual hardware. A
-related example is
-[valgrind's memcheck](http://valgrind.org/docs/manual/mc-manual.html) which
-keeps track of which memory is initialized to be able to detect memory errors:
-During a normal execution, uninitialized memory looks just like all other
-memory, but to figure out whether the program is violating C's memory rules, we
-have to keep track of some extra state.
-
-For stacked borrows, the extra state looks as follows:
-
-1. For every pointer, we keep track of an extra "tag" that records when and how
- this pointer was created.
-2. For every location in memory, we keep track of a stack of tags, indicating
- which tag a pointer must have to be allowed to access this location.
-
-These exist separately, i.e., when a pointer is stored in memory, then we both
-have a tag stored as part of this pointer value, and every byte occupied by the
-pointer has a stack regulating access to this location. Remember,
-[a byte is more than a `u8`]({% post_url 2018-07-24-pointers-and-bytes %}).
-Also these two do not interact, i.e., when loading a pointer from memory, we
-just load the tag that was stored as part of this pointer. The stack of a
-location, and the tag of a pointer stored at some location, do not have any
-effect on each other.
-
-In our example, there are two pointers (`x` and `y`) and one location of
-interest (the one both of these pointers point to, initialized with `1u8`).
-When we initially create `x`, it gets tagged `Uniq(0)` to indicate that it is a
-unique reference, and the location's stack has `Uniq(0)` at its top to indicate
-that this is the latest reference allowed to access said location. When we
-create `y`, it gets a new tag, `Uniq(1)`, so that we can distinguish it from
-`x`. We also push `Uniq(1)` onto the stack, indicating not only that `Uniq(1)`
-is the latest reference allow to access, but also that it is "derived from"
-`Uniq(0)`: The tags higher up in the stack are descendants of the ones further
-down.
-
-So we have: `x` tagged `Uniq(0)`, `y` tagged `Uniq(1)`, and the stack contains
-`[Uniq(0), Uniq(1)]`. (Top of the stack is on the right.)
-
-When we use `y` to access the location, we make sure its tag is at the top of
-the stack: Check, no problem here. When we use `x`, we do the same thing: Since
-it is not at the top yet, we pop the stack until it is, which is easy. Now the
-stack is just `[Uniq(0)]`. Now we use `y` again and... blast! Its tag is not
-on the stack. We have undefined behavior.
-
-In case you got lost, here is the source code with comments indicating the tags
-and the stack of the one location that interests us:
-
-{% highlight rust %}
-fn demo0() {
- let x = &mut 1u8; // tag: `Uniq(0)`
- // stack: [Uniq(0)]
-
- let y = &mut *x; // tag: `Uniq(1)`
- // stack: [Uniq(0), Uniq(1)]
-
- // Pop until `Uniq(1)`, the tag of `y`, is on top of the stack:
- // Nothing changes.
- *y = 5;
- // stack: [Uniq(0), Uniq(1)]
-
- // Pop until `Uniq(0)`, the tag of `x`, is on top of the stack:
- // We pop `Uniq(1)`.
- *x = 3;
- // stack: [Uniq(0)]
-
- // Pop until `Uniq(1)`, the tag of `y`, is on top of the stack:
- // That is not possible, hence we have undefined behavior.
- let _val = *y;
-}
-{% endhighlight %}
-
-Well, actually having undefined behavior here is good news, since that's what we
-wanted from the start! And since there is an implementation of the model in
-[miri](https://github.com/solson/miri/), you can try this yourself: The amazing
-@shepmaster has integrated miri into the playground, so you can
-[put the example there](https://play.rust-lang.org/?version=stable&mode=debug&edition=2015&gist=d15868687f79072688a0d0dd1e053721)
-(adjusting it slightly to circumvent the borrow checker), then select "Tools -
-Miri" and it will complain (together with a rather unreadable backtrace, we sure
-have to improve that one):
-
-```
- --> src/main.rs:6:14
- |
-6 | let _val = *y;
- | ^^ Encountered reference with non-reactivatable tag: Borrow-to-reactivate Uniq(1245) does not exist on the stack
- |
-```
-
-## 2 Enabling Sharing
-
-If we just had unique pointers, Rust would be a rather dull language. Luckily
-enough, there are also two ways to have shared access to a location: through
-shared references (safely), and through raw pointers (unsafely). Moreover,
-shared references *sometimes* (but not when they point to an `UnsafeCell`)
-assert an additional guarantee: Their destination is immutable.
-
-For example, we want the following code to be allowed -- not least because this
-is actually safe code accepted by the borrow checker, so we better make sure
-this is not undefined behavior:
-
-{% highlight rust %}
-fn demo1() {
- let x = &mut 1u8;
- // Create several shared references, and we can also still read from `x`
- let y1 = &*x;
- let _val = *x;
- let y2 = &*x;
- let _val = *y1;
- let _val = *y2;
-}
-{% endhighlight %}
-
-However, the following code is *not* okay:
-
-{% highlight rust %}
-fn demo2() {
- let x = &mut 1u8;
- let y = &*x;
- // Create raw reference aliasing `y` and write through it
- let z = x as *const u8 as *mut u8;
- unsafe { *z = 3; }
- // Use `y` again, asserting it still points to the same value
- let _val = *y;
-}
-{% endhighlight %}
-
-If you
-[try this in miri](https://play.rust-lang.org/?version=stable&mode=debug&edition=2015&gist=1bc8c2f432941d02246fea0808e2e4f4),
-you will see it complain:
-
-```
- --> src/main.rs:6:14
- |
-6 | let _val = *y;
- | ^^ Location is not frozen long enough
- |
-```
-
-How is it doing that, and what is a "frozen" location?
-
-To explain this, we have to extend the "shadow state" of our "virtual machine" a
-bit. First of all, we introduce a new kind of tag that a pointer can carry: A
-*shared* tag. The following Rust type describes the possible tags of a pointer:
-
-{% highlight rust %}
-pub type Timestamp = u64;
-pub enum Borrow {
- Uniq(Timestamp),
- Shr(Option<Timestamp>),
-}
-{% endhighlight %}
-
-You can think of the timestamp as a unique ID, but as we will see, for shared
-references, it is also important to be able to determine which of these IDs was
-created first. The timestamp is optional in the shared tag because that tag is
-also used by raw pointers, and for raw pointers, we are often not able to track
-when and how they are created (for example, when raw pointers are converted to
-integers and back).
-
-We use a separate type for the items on our stack, because there we do not need
-a timestamp for shared pointers:
-
-{% highlight rust %}
-pub enum BorStackItem {
- Uniq(Timestamp),
- Shr,
-}
-{% endhighlight %}
-
-And finally, a "borrow stack" consists of a stack of `BorStackItem`, together
-with an indication of whether the stack (and the location it governs) is
-currently *frozen*, meaning it may only be read, not written:
-
-{% highlight rust %}
-pub struct Stack {
- borrows: Vec<BorStackItem>, // used as a stack; never empty
- frozen_since: Option<Timestamp>, // virtual frozen "item" on top of the stack
-}
-{% endhighlight %}
-
-### 2.1 Executing the Examples
-
-Let us now look at what happens when we execute our two example programs. To
-this end, I will embed comments in the source code. There is only one location
-of interest here, so whenever I talk about a "stack", I am referring to the
-stack of that location.
-
-{% highlight rust %}
-fn demo1() {
- let x = &mut 1u8; // tag: `Uniq(0)`
- // stack: [Uniq(0)]; not frozen
-
- let y1 = &*x; // tag: `Shr(Some(1))`
- // stack: [Uniq(0), Shr]; frozen since 1
-
- // Access through `x`. We first check whether its tag `Uniq(0)` is in the
- // stack (it is). Next, we make sure that either our item *or* `Shr` is on
- // top *or* the location is frozen. The latter is the case, so we go on.
- let _val = *x;
- // stack: [Uniq(0), Shr]; frozen since 1
-
- // This is not an access, but we still dereference `x`, so we do the same
- // actions as on a read. Just like in the previous line, nothing happens.
- let y2 = &*x; // tag: `Shr(Some(2))`
- // stack: [Uniq(0), Shr]; frozen since 1
-
- // Access through `y1`. Since the shared tag has a timestamp (1) and the type
- // (`u8`) does not allow interior mutability (no `UnsafeCell`), we check that
- // the location is frozen since (at least) that timestamp. It is.
- let _val = *y1;
- // stack: [Uniq(0), Shr]; frozen since 1
-
- // Same as with `y2`: The location is frozen at least since 2 (actually, it
- // is frozen since 1), so we are good.
- let _val = *y2;
- // stack: [Uniq(0), Shr]; frozen since 1
-}
-{% endhighlight %}
-
-This example demonstrates a few new aspects. First of all, there are actually
-two operations that perform tag-related checks in this model (so far):
-Dereferencing a pointer (whether you have a `*`, also implicitly), and actual
-memory accesses. Operations like `&*x` are an example of operations that
-dereference a pointer without accessing memory. Secondly, *reading* through a
-mutable reference is actually okay *even when that reference is not exclusive*.
-It is only *writing* through a mutable reference that "re-asserts" its
-exclusivity. I will come back to these points later, but let us first go
-through another example.
-
-{% highlight rust %}
-fn demo2() {
- let x = &mut 1u8; // tag: `Uniq(0)`
- // stack: [Uniq(0)]; not frozen
-
- let y = &*x; // tag: `Shr(Some(1))`
- // stack: [Uniq(0), Shr]; frozen since 1
-
- // The `x` here really is a `&*x`, but we have already seen above what
- // happens: `Uniq(0)` must be in the stack, but we leave it unchanged.
- let z = x as *const u8 as *mut u8; // tag irrelevant because raw
- // stack: [Uniq(0), Shr]; frozen since 1
-
- // A write access through a raw pointer: Unfreeze the location and make sure
- // that `Shr` is at the top of the stack.
- unsafe { *z = 3; }
- // stack: [Uniq(0), Shr]; not frozen
-
- // Access through `y`. There is a timestamp in the `Shr` tag, and the type
- // `u8` does not allow interior mutability, but the location is not frozen.
- // This is undefined behavior.
- let _val = *y;
-}
-{% endhighlight %}
-
-### 2.2 Dereferencing a Pointer
-[section 2.2]: #22-dereferencing-a-pointer
-
-As we have seen, we consider the tag of a pointer already when dereferencing it,
-before any memory access happens. The operation on a dereference never mutates
-the stack, but it performs some basic checks that might declare the program UB.
-The reason for this is twofold: First of all, I think we should require some
-basic validity for pointers that are dereferenced even when they do not access
-memory. Secondly, there is the practical concern for the implementation in miri:
-When we dereference a pointer, we are guaranteed to have type information
-available (crucial for things that depend on the presence of an `UnsafeCell`),
-whereas having type information on every memory access would be quite hard to
-achieve in miri.
-
-Notice that on a dereference, we have *both* a tag at the pointer *and* the type
-of a pointer, and the two might not agree which we do not always want to rule
-out (we might have raw or shared pointers with a unique tag, for example).
-
-The following checks are done on every pointer dereference, for every location
-covered by the pointer (`size_of_val` tells us how many bytes the pointer
-covers):
-
-1. If this is a raw pointer, do nothing and reset the tag used for the access to
- `Shr(None)`. Raw accesses are checked as little as possible.
-2. If this is a unique reference and the tag is `Shr(Some(_))`, that's an error.
-3. If the tag is `Uniq`, make sure there is a matching `Uniq` item with the same
- ID on the stack.
-4. If the tag is `Shr(None)`, make sure that either the location is frozen or
- else there is a `Shr` item on the stack.
-5. If the tag is `Shr(Some(t))`, then the check depends on whether the location
- is inside an `UnsafeCell` or not, according to the type of the reference.
- - Locations outside `UnsafeCell` must have `frozen_since` set to `t` or an
- older timestamp.
- - `UnsafeCell` locations must either be frozen or else have a `Shr` item in
- their stack (same check as if the tag had no timestamp).
-
-### 2.3 Accessing Memory
-[section 2.3]: #23-accessing-memory
-
-On an actual memory access, we know the tag of the pointer that was used to
-access (unless it was a raw pointer, in which case the tag we see is
-`Shr(None)`), and we know whether we are reading from or writing to the current
-location. We perform the following operations on all locations affected by the
-access:
-
-1. If the location is frozen and this is a read access, nothing happens. (even
- if the tag is `Uniq`).
-2. Unfreeze the location (set `frozen_since` to `None`). Either the location is
- already unfrozen, or this is a write.
-3. Pop the stack until the top item matches the tag of the pointer.
- - A `Uniq` item matches a `Uniq` tag with the same ID.
- - A `Shr` item matches any `Shr` tag (with or without timestamp).
- - When we are reading, a `Shr` item matches a `Uniq` tag.
-
- If we pop the entire stack without finding a match, then we have undefined
- behavior.
-
-To understand these rules better, try going back through the three examples we
-have seen so far and applying these rules for dereferencing pointers and
-accessing memory to understand how they interact.
-
-The only thing that is subtle and potentially surprising here is that we make a
-`Uniq` tag match a `Shr` item and also accept `Uniq` reads on frozen locations.
-This is required to make `demo1` work: Rust permits read accesses through
-mutable references even when they are not currently actually unique. Our model
-hence has to do the same.
-
-## 3 Retagging and Creating Raw Pointers
-
-We have talked quite a bit about what happens when we *use* a pointer. It is
-time we take a close look at *how pointers are created*. However, before we go
-there, I would like us to consider one more example:
-
-{% highlight rust %}
-fn demo3(x: &mut u8) -> u8 {
- some_function();
- *x
-}
-{% endhighlight %}
-
-The question is: Can we move the load of `x` to before the function call?
-Remember that the entire point of Stacked Borrows is to enforce a certain
-discipline when using references, in particular, to enforce uniqueness of
-mutable references. So we should hope that the answer to that question is "yes"
-(and that, in turn, is good because we might use it for optimizations).
-Unfortunately, things are not so easy.
-
-The uniqueness of mutable references entirely rests on the fact that the pointer
-has a unique tag: If our tag is at the top of the stack (and the location is not
-frozen), then any access with another tag will pop our item from the stack (or
-cause undefined behavior). This is ensured by the memory access checks (and the
-exception for matching `Uniq` tags with `Shr` items on reads does not affect
-this property). Hence, if our tag is *still* on the stack after some other
-accesses happened (and we know it is still on the stack every time we
-dereference the pointer, as per the dereference checks described above), we know
-that no access through a pointer with a different tag can have happened.
-
-### 3.1 Guaranteed Freshness
-
-However, what if `some_function` has an exact copy of `x`? We got `x` from our
-caller (whom we do not trust), maybe they used that same tag for another
-reference (copied it with `transmute_copy` or so) and gave that to
-`some_function`? There is a simple way we can circumvent this concern: Generate
-a new tag for `x`. If *we* generate the tag (and we know generation never emits
-the same tag twice, which is easy), we can be sure this tag is not used for any
-other reference. So let us make this explicit by putting a `Retag` instruction
-into the code where we generate new tags:
-
-{% highlight rust %}
-fn demo3(x: &mut u8) -> u8 {
- Retag(x);
- some_function();
- *x
-}
-{% endhighlight %}
-
-These `Retag` instructions are inserted by the compiler pretty much any time
-references are copied: At the beginning of every function, all inputs of
-reference type get retagged. On every assignment, if the assigned value is of
-reference type, it gets retagged. Moreover, we do this even when the reference
-value is inside the field of a `struct` or `enum`, to make sure we really cover
-all references. (This recursive descent is already implemented, but the
-implementation has not landed yet.) However, we do *not* descend recursively
-through references: Retagging a `&mut &mut u8` will only retag the *outer*
-reference.
-
-Retagging is the *only* operation that generates fresh tags. Taking a reference
-simply forwards the tag of the pointer we are basing this reference on.
-
-Here is our very first example with explicit retagging:
-
-{% highlight rust %}
-fn demo0() {
- let x = &mut 1u8;
- Retag(x); // tag of `x` gets changed to `Uniq(0)`
- // stack: [Uniq(0)]; not frozen
-
- let y = &mut *x;
- Retag(y); // tag of `y` gets changed to `Uniq(1)`
- // stack: [Uniq(0), Uniq(1)]; not frozen
-
- // Check that `Uniq(1)` is on the stack, then pop to bring it to the top.
- *y = 5;
- // stack: [Uniq(0), Uniq(1)]; not frozen
-
- // Check that `Uniq(0)` is on the stack, then pop to bring it to the top.
- *x = 3;
- // stack: [Uniq(0)]; not frozen
-
- // Check that `Uniq(1)` is on the stack -- it is not, hence UB.
- let _val = *y;
-}
-{% endhighlight %}
-
-For each reference, `Retag` does the following (we will slightly refine these
-instructions later) on all locations covered by the reference (again, according
-to `size_of_val`):
-
-1. Compute a fresh tag, `Uniq(_)` for a mutable reference and `Shr(Some(_))` for
- a shared reference.
-2. Perform the checks that would also happen when we dereference this reference.
-3. Perform the actions that would also happen when an actual access happens
- through this reference (for shared references a read access, for mutable
- references a write access).
-4. If the new tag is `Uniq`, push it onto the stack. (The location cannot be
- frozen: `Uniq` tags are only created for mutable references, and we just
- performed the actions of a write access to memory, which unfreezes
- locations.)
-5. If the new tag is `Shr`:
- - If the location is already frozen, we do nothing.
- - Otherwise:
- 1. Push a `Shr` item to the stack.
- 2. If the location is outside of `UnsafeCell`, it gets frozen with the
- timestamp of the new reference.
-
-One high-level way to think about retagging is that it computes a fresh tag, and
-then performs a reborrow of the old reference with the new tag.
-
-### 3.2 When Pointers Escape
-
-Creating a shared reference is not the only way to share a location: We can also
-create raw pointers, and if we are careful enough, use them to access a location
-from different aliasing pointers. (Of course, "careful enough" is not very
-precise, but the precise answer is the very model I am describing here.)
-
-To account for this, we need one final ingredient in our model: a special
-instruction that indicates that a reference was cast to a raw pointer, and may
-thus be accessed from these raw pointers in a shared way. Consider the
-[following example](https://play.rust-lang.org/?version=stable&mode=debug&edition=2015&gist=253868e96b7eba85ef28e1eabd557f66):
-
-{% highlight rust %}
-fn demo4() {
- let x = &mut 1u8;
- Retag(x); // tag of `x` gets changed to `Uniq(0)`
- // stack: [Uniq(0)]; not frozen
-
- // Make sure what `x` points to is accessible through raw pointers.
- EscapeToRaw(x);
- // stack: [Uniq(0), Shr]; not frozen
-
- let y1 = x as *mut u8;
- let y2 = y1;
- unsafe {
- // All of these first dereference a raw pointer (no checks, tag gets
- // ignored) and then perform a read or write access with `Shr(None)` as
- // the tag, which is already the top of the stack so nothing changes.
- *y1 = 3;
- *y2 = 5;
- *y2 = *y1;
- }
-
- // Writing to `x` again pops `Shr` off the stack, as per the rules for
- // write accesses.
- *x = 7;
- // stack: [Uniq(0)]; not frozen
-
- // Any further access through the raw pointers is undefined behavior, even
- // reads: The write to `x` re-asserted that `x` is the unique reference for
- // this memory.
- let _val = unsafe { *y1 };
-}
-{% endhighlight %}
-
-The behavior of `EscapeToRaw` is best described as "reborrowing for a raw
-pointer": The steps are the same as for `Retag` above, except that the new
-pointer's tag is `Shr(None)` and we do not freeze (i.e., we behave as if the
-entire pointee was inside an `UnsafeCell`).
-
-Knowing about both `Retag` and `EscapeToRaw`, you can now go back to `demo2` and
-should be able to fully explain why the stack changes the way it does not that
-example.
-
-### 3.3 The Case of the Aliasing References
-
-Everything I described so far was pretty much in working condition as of about a
-week ago. However, there was one thorny problem that I only discovered fairly
-late, and as usual it is best demonstrated by an example -- entirely in safe
-code:
-
-{% highlight rust %}
-fn demo_refcell() {
- let rc = &mut RefCell::new(23u8);
- Retag(rc); // tag gets changed to `Uniq(0)`
- // We will consider the stack of the location where `23` is stored; the
- // `RefCell` bookkeeping counters are not of interest.
- // stack: [Uniq(0)]
-
- // Taking a shared reference shares the location but does not freeze, due
- // to the `UnsafeCell`.
- let rc_shr = &*rc;
- Retag(rc_shr); // tag gets changed to `Shr(Some(1))`
- // stack: [Uniq(0), Shr]; not frozen
-
- // Lots of stuff happens here but it does not matter for this example.
- let mut bmut = rc_shr.borrow_mut();
-
- // Obtain a mutable reference into the `RefCell`.
- let mut_ref = &mut *bmut;
- Retag(mut_ref); // tag gets changed to `Uniq(2)`
- // stack: [Uniq(0), Shr, Uniq(2)]; not frozen
-
- // And at the same time, a fresh shared reference to its outside!
- // This counts as a read access through `rc`, so we have to pop until
- // at least a `Shr` is at the top of the stack.
- let shr_ref = &*rc; // tag gets changed to `Shr(Some(3))`
- Retag(shr_ref);
- // stack: [Uniq(0), Shr]; not frozen
-
- // Now using `mut_ref` is UB because its tag is no longer on the stack. But
- // that is bad, because it is usable in safe code.
- *mut_ref += 19;
-}
-{% endhighlight %}
-
-Notice how `mut_ref` and `shr_ref` alias! And yet, creating a shared reference
-to the memory already covered by our unique `mut_ref` must not invalidate
-`mut_ref`. If we follow the instructions above, when we retag `shr_ref` after
-it got created, we have no choice but pop the item matching `mut_ref` off the
-stack. Ouch.
-
-This made me realize that creating a shared reference has to be very weak when
-on locations inside `UnsafeCell`. In fact, it is entirely equivalent to
-`EscapeToRaw`: We just have to make sure some kind of shared access is possible,
-but we have to accept that there might be active mutable references assuming
-exclusive access to the same locations. That on its own is not enough, though.
-
-I also added a new check to the retagging procedure: Before taking any action
-(i.e., before step 3, which could pop items off the stack), we check if the
-reborrow is redundant: If the new reference we want to create is already
-dereferencable (because its item is already on the stack and, if applicable, the
-stack is already frozen), *and* if the item that justifies this is moreover
-"derived from" the item that corresponds to the old reference, then we just do
-nothing. Here, "derived from" means "further up the stack". Basically, the
-reborrow has already happened and the new reference is ready for use, and
-(because of that "derived from" check), we know that using the new reference
-will *not* pop the item corresponding to the old reference off the stack. In
-that case, we avoid popping anything, to keep other references valid.
-
-It may seem like this rule can never apply, because how can our fresh tag match
-something that's already on the stack? This is indeed impossible for `Uniq`
-tags, but for `Shr` tags, matching is more liberal. For example, this rule
-applies in our example above when we create `shr_ref` from `mut_ref`. We do not
-require freezing (because there is an `UnsafeCell`), there is already a `Shr` on
-the stack (so the new reference is dereferencable) and the item matching the old
-reference (`Uniq(0)`) is below that `Shr` (so after using the new reference, the
-old one remains dereferencable). Hence we do nothing, keeping the `Uniq(2)` on
-the stack, such that the access through `mut_ref` at the end remains valid.
-
-This may sound like a weird rule, and it is. I would surely not have thought of
-this if `RefCell` would not force our hands here. However, as we shall see in
-[section 5], it also does not to break any of the important properties of the
-model (mutable references being unique and shared references being immutable
-except for `UnsafeCell`). Moreover, when pushing an item to the stack (at the
-end of the retag action), we can be sure that the stack is not yet frozen: if it
-were frozen, the reborrow would be redundant.
-
-With this extension, the instructions for retagging and `EscapeToRaw` now look
-as follows (again executed on all locations covered by the reference, according
-to `size_of_val`):
-
-1. Compute a fresh tag: `Uniq(_)` for a mutable reference, `Shr(Some(_))` for a
- shared reference, `Shr(None)` if this is `EscapeToRaw`.
-2. Perform the checks that would also happen when we dereference this reference.
- Remember the position of the item matching the tag in the stack.
-3. Redundancy check: If the new tag passes the checks performed on a
- dereference, and if the item that makes this check succeed is *above* the one
- we remembered in step 2 (where the "frozen" state is considered above every
- item in the stack), then we stop. We are done for this location.
-4. Perform the actions that would also happen when an actual access happens
- through this reference (for shared references a read access, for mutable
- references a write access).
- Now the location cannot be frozen any more: If the fresh tag is `Uniq`, we
- just unfroze, if the fresh tag is `Shr` and the location was already frozen
- then the redundancy check (step 3) would have kicked in.
-5. If the new tag is `Uniq`, push it onto the stack.
-6. If the new tag is `Shr`, push a `Shr` item to the stack. Then, if the
- location is outside of `UnsafeCell`, it gets frozen with the timestamp of the
- new reference.
-
-The one thing I find slightly unsatisfying about the redundancy check is that it
-seems to overlap a bit with the rule that on a *read* access, a `Shr` item
-matches a `Uniq` tag. Both of these together enable the read-only use of
-mutable references that have already been shared; I would prefer to have a
-single condition enabling that instead of two working together.
-
-## 4 Differences to the Original Proposal
-[section 4]: #4-differences-to-the-original-proposal
-
-The key differences to the original proposal is that the check performed on a
-dereference, and the check performed on an access, are not the same check. This
-means there are more "moving parts" in the model, but it also means we do not
-need a weird special exception for `demo1` any more like the original proposal
-did. The main reason for this, however, is that on an access, we just do not
-know if we are inside an `UnsafeCell` or not, so we cannot do all the checks we
-would like to do. Accordingly, I also rearranged terminology a bit. There is
-no longer one "reactivation" action, instead there is a "deref" check and an
-"access" action, as described above in sections [2.2][section 2.2] and
-[2.3][section 2.3].
-
-Beyond that, I made the behavior of shared references and raw pointers more
-uniform. This helped to fix test failures around `iter_mut` on slices, which
-first creates a raw reference and then a shared reference: In the original
-model, creating the shared reference invalidates previously created raw
-pointers. As a result of the more uniform treatment, this no longer happens.
-(Coincidentally, I did not make this change with the intention of fixing
-`iter_mut`. I did this change because I wanted to reduce the number of case
-distinctions in the model. Then I realized the relevant test suddenly passed
-even with the full model enabled, investigated what happened, and realized I
-accidentally had had a great idea. :D )
-
-The tag is now "typed" (`Uniq` vs `Shr`) to be able to support `transmute`
-between references and shared pointers. Such `transmute` were an open question
-in the original model and some people raised concerns about it in the ensuing
-discussion. I invite all of you to come up with strange things you think you
-should be able to `transmute` and throw them at miri so that we can see if your
-use-cases are covered. :)
-
-Creating a shared reference now always pushes a `Shr` item onto the stack, even
-when there is no `UnsafeCell`. This means that starting with a mutable reference
-`x`, `&*x as *const _ as *mut _` is pretty much equivalent to `x as *mut`. This
-came up during the implementation because I realized that in `x as *const _` on
-a mutable reference, `x` actually first gets coerced to shared reference, which
-then gets cast to a raw pointer. This happens in `NonNull::from`, so if you
-later write to that `NonNull`, you end up writing to a raw pointer that was
-created from a shared reference. Originally I intended this to be strictly
-illegal. This is writing to a shared reference after all, how dare you!
-However, it turns out it's actually no big deal *if the shared reference does
-not get used again later*. This is an access-based model after all, if a
-reference never gets used again we do not care much about enforcing any
-guarantees for it. (This is another example of a coincidental fix, where I had
-a surprisingly passing test case and then investigated what happened.)
-
-The redundancy check during retagging can be seen as refining a similar check
-that the original model did whenever a new reference was created (where we
-wouldn't change the state if the new borrow is already active).
-
-Finally, the notion of "function barriers" from the original Stacked Borrows has
-not been implemented yet. This is the next item on my todo list.
-
-## 5 Key Properties
-[section 5]: #5-key-properties
-
-Let us look at the two key properties that I set out as design goals, and see
-how the model guarantees that they hold true in all valid (UB-free) executions.
-
-### 5.1 Mutable References are Unique
-
-The property I would like to establish here is that: After creating (retagging,
-really) a `&mut`, if we then run some unknown code *that does not get passed the
-reference* nor do we derive another reference from ours, and then we use the
-reference again (reading or writing), we can be sure that this unknown code did
-not access the memory behind our mutable reference at all (or we have UB). For
-example:
-
-{% highlight rust %}
-fn demo_mut_unique(our: &mut i32) -> i32 {
- Retag(our); // So we can be sure the tag is unique
-
- *our = 5;
-
- unknown_code();
-
- // We know this will return 5, and moreover if `unknown_code` does not panic
- // we know we could do the write after calling `unknown_code` (because it
- // cannot even read from `our`).
- *our
-}
-{% endhighlight %}
-
-The proof sketch goes as follows: After retagging the reference, we know it is
-at the top of the stack and the location is not frozen. (The "redundant
-reborrow" rule does not apply because a fresh `Uniq` tag can never be
-redundant.) For any access performed by the unknown code, we know that access
-cannot use the tag of our reference because the tags are unique and not
-forgeable. Hence if the unknown code accesses our locations, that would pop our
-tag from the stack. When we use our reference again, we know it is on the
-stack, and hence has not been popped off. Thus there cannot have been an access
-from the unknown code.
-
-Actually this theorem applies *any time* we have a reference whose tag we can be
-sure has not been leaked to anyone else, and which points to locations which
-have this tag at the top of the (unfrozen) stack. This is not just the case
-immediately after retagging. We know our reference is at the top of the stack
-after writing to it, so in the following example we know that `unknown_code_2`
-cannot access `our`:
-
-{% highlight rust %}
-fn demo_mut_advanced_unique(our: &mut u8) -> u8 {
- Retag(our); // So we can be sure the tag is unique
-
- unknown_code_1(&*our);
-
- // This "re-asserts" uniqueness of the reference: After writing, we know
- // our tag is at the top of the stack.
- *our = 5;
-
- unknown_code_2();
-
- // We know this will return 5
- *our
-}
-{% endhighlight %}
-
-### 5.2 Shared References (without `UnsafeCell)` are Immutable
-
-The key property of shared references is that: After creating (retagging,
-really) a shared reference, if we then run some unknown code (it can even have
-our reference if it wants), and then we use the reference again, we know that
-the value pointed to by the reference has not been changed. For example:
-
-{% highlight rust %}
-fn demo_shr_frozen(our: &u8) -> u8 {
- Retag(our); // So we can be sure the tag actually carries a timestamp
-
- // See what's in there.
- let val = *our;
-
- unknown_code(our);
-
- // We know this will return `val`
- *our
-}
-{% endhighlight %}
-
-The proof sketch goes as follows: After retagging the reference, we know the
-location is frozen (this is the case even if the "redundant reborrow" rule
-applies). If the unknown code does any write, we know this will unfreeze the
-location. The location might get re-frozen, but only at the then-current
-timestamp. When we do our read after coming back from the unknown code, this
-checks that the location is frozen *at least* since the timestamp given in its
-tag, so if the location is unfrozen or got re-frozen by the unknown code, the
-check would fail. Thus the unknown code cannot have written to the location.
-
-One interesting observation here for both of these proofs is that all we rely on
-when the unknown code is executed are the actions performed on every memory
-access. The additional checks that happen when a pointer is dereferenced only
-matter in *our* code, not in the foreign code. This indicates that we could see
-the checks on pointer dereference as another "shadow state operation" next to
-`Retag` and `EscapeToRaw`, and then these three operations plus the actions on
-memory accesses are all that there is to Stacked Borrows. This is difficult to
-implement in miri because dereferences can happen any time a path is evaluated,
-but it is nevertheless interesting and might be useful in a "lower-level MIR"
-that does not permit dereferences in paths.
-
-## 6 Evaluation, and How You Can Help
-[section 6]: #6-evaluation-and-how-you-can-help
-
-I have implemented both the validity invariant and the model as described above
-in miri. This [uncovered](https://github.com/rust-lang/rust/issues/54908) two
-[issues](https://github.com/rust-lang/rust/issues/54957) in the standard
-library, but both were related to validity invariants, not Stacked Borrows.
-With these exceptions, the model passes the entire test suite. There were some
-more test failures in earlier versions (as mentioned in [section 4]), but the
-final model accepts all the code covered by miri's test suite. Moreover I wrote
-a bunch of compile-fail tests to make sure the model catches various violations
-of the key properties it should ensure.
-
-However, miri's test suite is tiny, and I have but one brain to come up with
-counterexamples! In fact I am quite a bit worried because I literally came up
-with `demo_refcell` less than two weeks ago, so what else might I have missed?
-This where you come in. Please test this model! Come up with something funny
-you think should work (I am thinking about funny `transmute` in particular,
-using type punning through unions or raw pointers if you prefer that), or maybe
-you have some crate that has some unsafe code and a test suite (you do have a
-test suite, right?) that might run under miri.
-
-The easiest way to try the model is the
-[playground](https://play.rust-lang.org/): Type the code, select "Tools - Miri",
-and you'll see what it does.
-
-For things that are too long for the playground, you have to install miri on
-your own computer. miri depends on rustc nightly and has to be updated
-regularly to keep working, so it is not currently on crates.io. Installation
-instructions for miri are provided
-[in the README](https://github.com/solson/miri/#running-miri). Please let me
-know if you are having trouble with anything. You can report issues, comment on
-this post or find me in chat (as of recently, I am partial to Zulip where we
-have an
-[unsafe code guidelines stream](https://rust-lang.zulipchat.com/#narrow/stream/136281-wg-unsafe-code-guidelines)).
-
-With miri installed, you can `cargo miri` a project with a binary to run it in
-miri. Dependencies should be fully supported, so you can use any crate you
-like. It is not unlikely, however, that you will run into issues because miri
-does not support some operation. In that case please search the
-[issue tracker](https://github.com/solson/miri/issues) and report the issue if
-it is new. We cannot support everything, but we might be able to do something
-for your case.
-
-Unfortunately, `cargo miri test` is currently broken; if you want to help with
-that [here are some details](https://github.com/solson/miri/issues/479).
-Moreover, wouldn't it be nice if we could run the entire libcore, liballoc and
-libstd test suite in miri? There are tons of interesting cases of Rust's core
-data structures being exercise there, and the comparatively tiny miri test suite
-has already helped to find two soundness bugs, so there are probably more. Once
-`cargo miri test` works again, it would be great to find a way to run it on the
-standard library test suites, and set up something so that this happens
-automatically on a regular basis (so that we notice regressions).
-
-As you can see, there is more than enough work for everyone. Don't be shy! I
-have a mere three weeks left on this internship, after which I will have to
-significantly reduce my Rust activities in favor of finishing my PhD. I won't
-disappear entirely though, don't worry -- I will still be able to mentor you if
-you want to help with any of the above tasks. :)
-
-Thanks to @nikomatsakis for feedback on a draft of this post.
-<!-- If you want to help or report results of your experiments, if you have any questions or comments, please join the [discussion in the forums](). -->