X-Git-Url: https://git.ralfj.de/web.git/blobdiff_plain/8f86a17ac63242f48437c48eb8b1ce1c6653d921..refs/heads/master:/ralf/_posts/2018-12-26-stacked-borrows-barriers.md diff --git a/ralf/_posts/2018-12-26-stacked-borrows-barriers.md b/ralf/_posts/2018-12-26-stacked-borrows-barriers.md deleted file mode 100644 index e68663d..0000000 --- a/ralf/_posts/2018-12-26-stacked-borrows-barriers.md +++ /dev/null @@ -1,160 +0,0 @@ ---- -title: "Barriers and Two-phase Borrows in Stacked Borrows" -categories: internship rust ---- - -My internship ("research assistantship") with Mozilla has ended several weeks ago, and this post is a report of the most recent tweaks I made to Miri and Stacked Borrows. -Neither project is by any means "done", of course. -However, both have reached a fairly reasonable state, so I felt some kind of closing report made sense. -Also, if I ever want to finish my PhD, I'll have to seriously scale down the amount of time I work on Rust -- so at least from my side, things will move more slowly from now on. - -In particular, installing Miri and running your test suite in it is now just a single command away! -Scroll all the way down if you are not interested in the rest. - - - -## Stacked Borrows Tweaks - -I made some small tweaks to Stacked Borrows which mostly serve to make things overall more consistent. -All of these have been incorporated into my [previous post on Stacked Borrows]({% post_url 2018-11-16-stacked-borrows-implementation %}). - -The only noticeable functional change affects creating a shared reference: previously, that would push a `Shr` item to the stack before freezing the location. -That would, however, permit mutation like in the following example: - -{% highlight rust %} -fn demo_shared_mutation(ptr: &mut u8) { - let shared = &*ptr; // pushes a `Shr` item - let raw = shared as *const u8 as *mut u8; - unsafe { *raw = 5; } // allowed: unfreezing, then matching the `Shr` item -} -{% endhighlight %} - -Since "no mutation through (pointers derived from) shared references" has been a rule in Rust pretty much from the beginning, I changed creating a shared reference to no longer unconditionally push a `Shr` item. -That now only happens when the pointee is mutable because of an `UnsafeCell`. -As a consequence, the code above is now forbidden. - -## Barriers - -Beyond these small tweaks, I implemented the *barriers* that I already described in my [original proposal]({% post_url 2018-08-07-stacked-borrows %}). -The purpose of barriers is to handle code like the following: - -{% highlight rust %} -fn aliasing(ptr: &mut u8, raw: *mut u8) -> u8 { - let val = *ptr; - unsafe { *raw = 4; } - val // we would like to move the `ptr` load down here -} - -fn demo_barrier() -> u8 { - let x = &mut 8u8; - let raw = x as *mut u8; - aliasing(unsafe { &mut *raw }, raw) -} -{% endhighlight %} - -`aliasing` here gets two aliasing pointers as argument. -This means writing to `raw` changes the value in `ptr`, so the desired optimization would change the behavior of `demo_barrier` (currently it returns 8, after moving the load of `ptr` down it returns 4). -Stacked Borrows without barriers would allow this code because `ptr` does not get used again after `raw`. - -To disallow this code and enable the desired optimization, we introduce a new kind of item that can be pushed to the per-location stack: a *function barrier* referring to a particular function call. -While the function is running, the barrier may not be popped. -The type of items on the borrow stack therefore now looks as follows: - -{% highlight rust %} -pub enum BorStackItem { - Uniq(Timestamp), - Shr, - FnBarrier(CallId), -} -{% endhighlight %} - -These barriers get pushed when doing the initial retagging of a function (remember that we insert `Retag` statements for all arguments at the beginning of each function). -We do this only for references (not for `Box`), and we do *not* push a barrier to locations inside a shared `UnsafeCell`. -Concretely, between steps 4 and 5 of retagging (after we performed the action of an access and before we push the new item), we push a `FnBarrier` referring to the current function call. -Moreover, when retagging with barriers, we do not perform the usual redundancy check -- even if the reborrow is redundant, pushing the barrier might be necessary. - -Barriers have no effect when a pointer gets dereferenced: looking for the item in the stack just skips over barriers. -However, when performing a memory access and popping items from the stack, we stop when we encounter an active barrier, i.e., a barrier for a function call that is still ongoing. -If that ever happens, we have undefined behavior. - -Together, these rules make the above example invalid code, and hence enable our optimization: when doing the memory access in `*raw = 4`, we pop until we find a `Shr` item. -That will hit the barrier that was added when `aliasing` did the initial retagging of `val`, triggering UB. - -An interesting (and positive!) consequence of barriers is that they make the following code UB: - -{% highlight rust %} -fn aliasing(ptr1: &mut u8, ptr2: &mut u8) {} - -fn demo_barrier() -> u8 { - let x = &mut 8u8; - let raw = x as *mut u8; - aliasing(unsafe { &mut *raw }, x) -} -{% endhighlight %} - -Without barriers, there is no problem with this code because `aliasing` does not actually *use* the two aliasing pointers in conflicting ways. -With barriers, the initial retag of `ptr2` goes wrong because it tries to pop off the barrier that just got pushed for `ptr1`. -That's great, because we don't actually want to allow passing two aliasing pointers to `aliasing`. - -Barriers have one more effect: when memory gets deallocated, it must not have any active barriers left in its stacks. -Barriers so far ensure that a reference passed to a function does not get *invalidated* during that function call; this extra check ensures that it also does not get *deallocated*. -This is important, because we tell LLVM via the `dereferencable` attribute that the reference is, well, dereferencable for the duration of the function call. -To justify this attribute, we have to argue that the program really cannot deallocate this memory - - -### Intuition and Design Decisions - -One high-level intuition for barriers is that they encode the rule that a reference passed to a function outlives the function call. -The item matching that reference can never get popped while the function is still being executed: if it would get popped, we would hit the barrier and trigger UB. -This is also why we don't have barriers for `Box`: they don't have any such relationship to function calls. - -The other exception, about not having barriers for shared `UnsafeCell`, is more subtle. -In my [previous post]({% post_url 2018-11-16-stacked-borrows-implementation %}), I had an example of a function where a shared and a mutable reference alias (the shared reference being `&RefCell` and the mutable reference pointing into it). -This example crucially relies on the redundancy check during retagging, where the shared reference does *not* get reborrowed (which would invalidate the mutable reference). -If we wanted to add a barrier for that shared reference, we could not push it in top of the stack; we would have to add it somewhere in the middle next to the `Shr` item matching this reference. -For now, I am trying to avoid adding items to the middle of the stack. - -Another problem with shared `UnsafeCell` and barriers is summarized in [this issue](https://github.com/rust-lang/rust/issues/55005): in `Arc`, it can actually be the case that the memory referenced by a shared reference (but inside an `UnsafeCell`) gets deallocated while the function (that got this shared reference as argument) is still running. -This violated LLVM's expectations for `dereferencable` pointers, and it would also violate the rule that memory with active barriers cannot be deallocated. -We will have to either declare the current implementation of `Arc` incorrect or change the attributes that we use with LLVM. -The model currently takes the position that `Arc` is correct, deallocating data behind a shared `UnsafeCell` is allowed, and the attribute is emitted incorrectly. - -## Two-phase Borrows - -During the RustFest Rome, I was reminded that Stacked Borrows does not currently support [two-phase borrows](http://smallcultfollowing.com/babysteps/blog/2017/03/01/nested-method-calls-via-two-phase-borrowing/). -This means that the following code (which is safe code in the 2018 edition) would get rejected by Miri: -{% highlight rust %} -fn two_phase() { - let mut v = vec![]; - v.push(v.len()); -} -{% endhighlight %} - -To fix that, I just had to make a very tiny change. -Retagging now has a special mode for two-phase borrows (it would be used in the above example when retagging the temporary containing the first argument of `push`), and in that mode, after completing all the usual steps, we create a shared reborrow of the newly created mutable borrow, with all the usual effects (freezing or pushing a `Shr` onto the stack. -Thanks to the rule that reading through a mutable reference does not unfreeze or pop a `Shr` off the stack, this means that the original `v` can be used for reading until the time that the newly created (two-phase) borrow is written to for the first time. -Writing will unfreeze and pop the `Shr` off the stack, disallowing any further reads through other references. -This corresponds to the "activation" of a two-phase borrow. - -Unfortunately, this model is not quite enough to handle all the code that rustc currently accepts. -See [this issue](https://github.com/rust-lang/rust/issues/56254) for further discussion on this subject. - -## Stacked Borrows Specification - -Between the various changes described here, there is now a noticeable discrepancy between the only self-contained description of Stacked Borrows in my previous blog post(s) and what is actually implemented. -I will try to find a good spot to put a "living description" of whatever version of Stacked Borrows is currently implemented in Miri, so that there is more than just the code to look at. -I'll let you know when I found one! - -## Making Miri Easier to Use - -Finally, I decided to spend some time making Miri easier to install and use and fixing the problems with running test suites. -Now all you have to do is run - -``` -cargo +nightly install --git https://github.com/solson/miri/ miri -``` - -and then go to your project directory and run `cargo +nightly miri test` (you may have to `cargo clean` first to make sure all your dependencies are built the right way for Miri). -On first launch, Miri will (with your consent) install xargo and prepare a libstd suited for executing Rust programs in the interpreter, and then it will run your test suite. -@oli-obk is currently working on making this even easier by making Miri a component that you can install via `rustup`. -Give it a try and let us know what you think!