---
title: "Barriers and Two-phase Borrows in Stacked Borrows"
categories: internship rust
forum: https://internals.rust-lang.org/t/barriers-and-two-phase-borrows-in-stacked-borrows/9100
---

My internship ("research assistantship") with Mozilla has ended several weeks ago, and this post is a report of the most recent tweaks I made to Miri and Stacked Borrows.
Neither project is by any means "done", of course.
However, both have reached a fairly reasonable state, so I felt some kind of closing report made sense.
Also, if I ever want to finish my PhD, I'll have to seriously scale down the amount of time I work on Rust -- so at least from my side, things will move more slowly from now on.

In particular, installing Miri and running your test suite in it is now just a single command away!
Scroll all the way down if you are not interested in the rest.

<!-- MORE -->

## Stacked Borrows Tweaks

I made some small tweaks to Stacked Borrows which mostly serve to make things overall more consistent.
All of these have been incorporated into my [previous post on Stacked Borrows]({% post_url 2018-11-16-stacked-borrows-implementation %}).

The only noticeable functional change affects creating a shared reference: previously, that would push a `Shr` item to the stack before freezing the location.
That would, however, permit mutation like in the following example:

{% highlight rust %}
fn demo_shared_mutation(ptr: &mut u8) {
    let shared = &*ptr; // pushes a `Shr` item
    let raw = shared as *const u8 as *mut u8;
    unsafe { *raw = 5; } // allowed: unfreezing, then matching the `Shr` item
}
{% endhighlight %}

Since "no mutation through (pointers derived from) shared references" has been a rule in Rust pretty much from the beginning, I changed creating a shared reference to no longer unconditionally push a `Shr` item.
That now only happens when the pointee is mutable because of an `UnsafeCell`.
As a consequence, the code above is now forbidden.

## Barriers

Beyond these small tweaks, I implemented the *barriers* that I already described in my [original proposal]({% post_url 2018-08-07-stacked-borrows %}).
The purpose of barriers is to handle code like the following:

{% highlight rust %}
fn aliasing(ptr: &mut u8, raw: *mut u8) -> u8 {
    let val = *ptr;
    unsafe { *raw = 4; }
    val // we would like to move the `ptr` load down here
}

fn demo_barrier() -> u8 {
    let x = &mut 8u8;
    let raw = x as *mut u8;
    aliasing(unsafe { &mut *raw }, raw)
}
{% endhighlight %}

`aliasing` here gets two aliasing pointers as argument.
This means writing to `raw` changes the value in `ptr`, so the desired optimization would change the behavior of `demo_barrier` (currently it returns 8, after moving the load of `ptr` down it returns 4).
Stacked Borrows without barriers would allow this code because `ptr` does not get used again after `raw`.

To disallow this code and enable the desired optimization, we introduce a new kind of item that can be pushed to the per-location stack: a *function barrier* referring to a particular function call.
While the function is running, the barrier may not be popped.
The type of items on the borrow stack therefore now looks as follows:

{% highlight rust %}
pub enum BorStackItem {
    Uniq(Timestamp),
    Shr,
    FnBarrier(CallId),
}
{% endhighlight %}

These barriers get pushed when doing the initial retagging of a function (remember that we insert `Retag` statements for all arguments at the beginning of each function).
We do this only for references (not for `Box`), and we do *not* push a barrier to locations inside a shared `UnsafeCell`.
Concretely, between steps 4 and 5 of retagging (after we performed the action of an access and before we push the new item), we push a `FnBarrier` referring to the current function call.
Moreover, when retagging with barriers, we do not perform the usual redundancy check -- even if the reborrow is redundant, pushing the barrier might be necessary.

Barriers have no effect when a pointer gets dereferenced: looking for the item in the stack just skips over barriers.
However, when performing a memory access and popping items from the stack, we stop when we encounter an active barrier, i.e., a barrier for a function call that is still ongoing.
If that ever happens, we have undefined behavior.

Together, these rules make the above example invalid code, and hence enable our optimization: when doing the memory access in `*raw = 4`, we pop until we find a `Shr` item.
That will hit the barrier that was added when `aliasing` did the initial retagging of `val`, triggering UB.

An interesting (and positive!) consequence of barriers is that they make the following code UB:

{% highlight rust %}
fn aliasing(ptr1: &mut u8, ptr2: &mut u8) {}

fn demo_barrier() -> u8 {
    let x = &mut 8u8;
    let raw = x as *mut u8;
    aliasing(unsafe { &mut *raw }, x)
}
{% endhighlight %}

Without barriers, there is no problem with this code because `aliasing` does not actually *use* the two aliasing pointers in conflicting ways.
With barriers, the initial retag of `ptr2` goes wrong because it tries to pop off the barrier that just got pushed for `ptr1`.
That's great, because we don't actually want to allow passing two aliasing pointers to `aliasing`.

Barriers have one more effect: when memory gets deallocated, it must not have any active barriers left in its stacks.
Barriers so far ensure that a reference passed to a function does not get *invalidated* during that function call; this extra check ensures that it also does not get *deallocated*.
This is important, because we tell LLVM via the `dereferencable` attribute that the reference is, well, dereferencable for the duration of the function call.
To justify this attribute, we have to argue that the program really cannot deallocate this memory


### Intuition and Design Decisions

One high-level intuition for barriers is that they encode the rule that a reference passed to a function outlives the function call.
The item matching that reference can never get popped while the function is still being executed: if it would get popped, we would hit the barrier and trigger UB.
This is also why we don't have barriers for `Box`: they don't have any such relationship to function calls.

The other exception, about not having barriers for shared `UnsafeCell`, is more subtle.
In my [previous post]({% post_url 2018-11-16-stacked-borrows-implementation %}), I had an example of a function where a shared and a mutable reference alias (the shared reference being `&RefCell` and the mutable reference pointing into it).
This example crucially relies on the redundancy check during retagging, where the shared reference does *not* get reborrowed (which would invalidate the mutable reference).
If we wanted to add a barrier for that shared reference, we could not push it in top of the stack; we would have to add it somewhere in the middle next to the `Shr` item matching this reference.
For now, I am trying to avoid adding items to the middle of the stack.

Another problem with shared `UnsafeCell` and barriers is summarized in [this issue](https://github.com/rust-lang/rust/issues/55005): in `Arc`, it can actually be the case that the memory referenced by a shared reference (but inside an `UnsafeCell`) gets deallocated while the function (that got this shared reference as argument) is still running.
This violated LLVM's expectations for `dereferencable` pointers, and it would also violate the rule that memory with active barriers cannot be deallocated.
We will have to either declare the current implementation of `Arc` incorrect or change the attributes that we use with LLVM.
The model currently takes the position that `Arc` is correct, deallocating data behind a shared `UnsafeCell` is allowed, and the attribute is emitted incorrectly.

## Two-phase Borrows

During the RustFest Rome, I was reminded that Stacked Borrows does not currently support [two-phase borrows](http://smallcultfollowing.com/babysteps/blog/2017/03/01/nested-method-calls-via-two-phase-borrowing/).
This means that the following code (which is safe code in the 2018 edition) would get rejected by Miri:
{% highlight rust %}
fn two_phase() {
    let mut v = vec![];
    v.push(v.len());
}
{% endhighlight %}

To fix that, I just had to make a very tiny change.
Retagging now has a special mode for two-phase borrows (it would be used in the above example when retagging the temporary containing the first argument of `push`), and in that mode, after completing all the usual steps, we create a shared reborrow of the newly created mutable borrow, with all the usual effects (freezing or pushing a `Shr` onto the stack.
Thanks to the rule that reading through a mutable reference does not unfreeze or pop a `Shr` off the stack, this means that the original `v` can be used for reading until the time that the newly created (two-phase) borrow is written to for the first time.
Writing will unfreeze and pop the `Shr` off the stack, disallowing any further reads through other references.
This corresponds to the "activation" of a two-phase borrow.

Unfortunately, this model is not quite enough to handle all the code that rustc currently accepts.
See [this issue](https://github.com/rust-lang/rust/issues/56254) for further discussion on this subject.

## Stacked Borrows Specification

Between the various changes described here, there is now a noticeable discrepancy between the only self-contained description of Stacked Borrows in my previous blog post(s) and what is actually implemented.
I will try to find a good spot to put a "living description" of whatever version of Stacked Borrows is currently implemented in Miri, so that there is more than just the code to look at.
I'll let you know when I found one!

## Making Miri Easier to Use

Finally, I decided to spend some time making Miri easier to install and use and fixing the problems with running test suites.
Now all you have to do is run

```
cargo +nightly install --force --git https://github.com/solson/miri miri
```

and then go to your project directory and run `cargo +nightly miri test` (you may have to `cargo clean` first to make sure all your dependencies are built the right way for Miri).
On first launch, Miri will (with your consent) install xargo and prepare a libstd suited for executing Rust programs in the interpreter, and then it will run your test suite.
@oli-obk is currently working on making this even easier by making Miri a component that you can install via `rustup`.
Give it a try and let us know what you think!
