---
title: "Stacked Borrows Implemented"
categories: internship rust
---

Three months ago, I proposed [Stacked Borrows]({% post_url
2018-08-07-stacked-borrows %}) as a model for defining what kinds of aliasing
are allowed in Rust, and the idea of a [validity invariant]({% post_url
2018-08-22-two-kinds-of-invariants %}) that has to be maintained by all code at
all times.  Since then I have been busy implementing both of these, and
developed Stacked Borrows further in doing so.  This post describes the latest
version of Stacked Borrows, and reports my findings from the implementation
phase: What worked, what did not, and what remains to be done.  There will also
be an opportunity for you to help the effort!

<!-- MORE -->

Stacked Borrows defines a semantics for Rust programs such that some things
about references always hold true for every valid execution
(meaning executions where no [undefined behavior]({% post_url
2017-07-14-undefined-behavior %}) occurred): `&mut` references are unique (we
can rely on no accesses by other functions happening to the memory they point
to), and `&` references are read-only (we can rely on no writes happening to the
memory they point to, unless there is an `UnsafeCell`).  Usually we have the
borrow checker guarding us against such nefarious violations of reference type
guarantees, but alas, when we are writing unsafe code, the borrow checker cannot
help us.  We have to define a set of rules that makes sense even for unsafe
code.

I will try to explain at least parts of this model again in this post.  The
explanation is not going to be the same as last time, not only because the model
changed a bit, but also because I think I understand the model better myself
now.

Ready?  Let's get started.  I hope you brought some time, because this is a
rather lengthy post.  If you are not interested in a detailed description of
Stacked Borrows, you can skip most of this post and go right to [section 4].  If
you only want to know how to help, jump to [section 6].

## 1 Enforcing Uniqueness

Let us first ignore the part about `&` references being read-only and focus on
uniqueness of mutable references.  Namely, we want to define our model in a way
that calling the following function will trigger undefined behavior:

{% highlight rust %}
fn demo0() {
  let x = &mut 1u8;
  let y = &mut *x;
  *y = 5;
  // Write through a pointer aliasing `y`
  *x = 3;
  // Use `y` again, asserting it is still exclusive
  let _val = *y;
}
{% endhighlight %}

We want this function to be disallowed because between two uses of `y`, there is
a use of another pointer for the same location, violating the fact that `y`
should be unique.

Notice that this function does not compile: the borrow checker won't allow it.
That's great!  It is undefined behavior, after all.  But the entire point of
this exercise is to explain *why* we have undefined behavior here *without*
referring to the borrow checker, because we want to have rules that also work
for unsafe code.

To be able to do this, we have to pretend our machine has two things which real
CPUs do not have.  This is an example of adding "shadow state" or "instrumented
state" to the "virtual machine" that we [use to specify Rust]({% post_url
2017-06-06-MIR-semantics %}).  This is not an uncommon approach; oftentimes
source languages make distinctions that do not appear in the actual hardware.  A
related example is
[valgrind's memcheck](http://valgrind.org/docs/manual/mc-manual.html) which
keeps track of which memory is initialized to be able to detect memory errors:
During a normal execution, uninitialized memory looks just like all other
memory, but to figure out whether the program is violating C's memory rules, we
have to keep track of some extra state.

For Stacked Borrows, the extra state looks as follows:

1. For every pointer, we keep track of an extra "tag" that records when and how
   this pointer was created.
2. For every location in memory, we keep track of a stack of tags, indicating
   which tag a pointer must have to be allowed to access this location.

These exist separately, i.e., when a pointer is stored in memory, then we both
have a tag stored as part of this pointer value, and every byte occupied by the
pointer has a stack regulating access to this location.  Remember,
[a byte is more than a `u8`]({% post_url 2018-07-24-pointers-and-bytes %}).
Also, these two do not interact, i.e., when loading a pointer from memory, we
just load the tag that was stored as part of this pointer.  The stack of a
location, and the tag of a pointer stored at some location, do not have any
effect on each other.

In our example, there are two pointers (`x` and `y`) and one location of
interest (the one both of these pointers point to, initialized with `1u8`).
When we initially create `x`, it gets tagged `Uniq(0)` to indicate that it is a
unique reference, and the location's stack has `Uniq(0)` at its top to indicate
that this is the latest reference allowed to access said location.  When we
create `y`, it gets a new tag, `Uniq(1)`, so that we can distinguish it from
`x`.  We also push `Uniq(1)` onto the stack, indicating not only that `Uniq(1)`
is the latest reference allowed to access, but also that it is "derived from"
`Uniq(0)`: The tags higher up in the stack are descendants of the ones further
down.

So we have: `x` tagged `Uniq(0)`, `y` tagged `Uniq(1)`, and the stack contains
`[Uniq(0), Uniq(1)]`. (Top of the stack is on the right.)

When we use `y` to access the location, we make sure its tag is at the top of
the stack: Check, no problem here.  When we use `x`, we do the same thing: Since
it is not at the top yet, we pop the stack until it is, which is easy.  Now the
stack is just `[Uniq(0)]`.  Now we use `y` again and... blast!  Its tag is not
on the stack.  We have undefined behavior.

In case you got lost, here is the source code with comments indicating the tags
and the stack of the one location that interests us:

{% highlight rust %}
fn demo0() {
  let x = &mut 1u8; // tag: `Uniq(0)`
  // stack: [Uniq(0)]

  let y = &mut *x; // tag: `Uniq(1)`
  // stack: [Uniq(0), Uniq(1)]

  // Pop until `Uniq(1)`, the tag of `y`, is on top of the stack:
  // Nothing changes.
  *y = 5;
  // stack: [Uniq(0), Uniq(1)]

  // Pop until `Uniq(0)`, the tag of `x`, is on top of the stack:
  // We pop `Uniq(1)`.
  *x = 3;
  // stack: [Uniq(0)]

  // Pop until `Uniq(1)`, the tag of `y`, is on top of the stack:
  // That is not possible, hence we have undefined behavior.
  let _val = *y;
}
{% endhighlight %}

Well, actually having undefined behavior here is good news, since that's what we
wanted from the start!  And since there is an implementation of the model in
[miri](https://github.com/solson/miri/), you can try this yourself: The amazing
@shepmaster has integrated miri into the playground, so you can
[put the example there](https://play.rust-lang.org/?version=stable&mode=debug&edition=2015&gist=d15868687f79072688a0d0dd1e053721)
(adjusting it slightly to circumvent the borrow checker), then select "Tools -
Miri" and it will complain (together with a rather unreadable backtrace, we sure
have to improve that one):

```
 --> src/main.rs:6:14
  |
6 |   let _val = *y;
  |              ^^ Encountered reference with non-reactivatable tag: Borrow-to-reactivate Uniq(1245) does not exist on the stack
  |
```
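
To make the bookkeeping concrete, here is a minimal sketch of the simplified model from this section, with only `Uniq` tags and a single location.  This is not the code in miri: the names `UniqStack`, `use_tag` and `replay_demo0` are made up for this illustration, and an `Err` stands in for undefined behavior.

```rust
/// Simplified per-location shadow state: a stack of the IDs of `Uniq` tags.
/// The top of the stack is the end of the `Vec`.
#[derive(Debug, Default)]
struct UniqStack {
    items: Vec<u64>,
}

impl UniqStack {
    /// Creating a new unique reference pushes its fresh tag.
    fn push(&mut self, tag: u64) {
        self.items.push(tag);
    }

    /// Using a pointer: pop until its tag is on top of the stack.
    /// `Err` stands in for undefined behavior (the tag is not on the stack).
    fn use_tag(&mut self, tag: u64) -> Result<(), String> {
        match self.items.iter().rposition(|&t| t == tag) {
            Some(pos) => {
                // Drop everything above the matching item.
                self.items.truncate(pos + 1);
                Ok(())
            }
            None => Err(format!("Uniq({}) is not on the stack", tag)),
        }
    }
}

/// Replays `demo0` and returns the outcome of each of its three accesses.
fn replay_demo0() -> Vec<Result<(), String>> {
    let mut stack = UniqStack::default();
    stack.push(0); // let x = &mut 1u8;
    stack.push(1); // let y = &mut *x;
    vec![
        stack.use_tag(1), // *y = 5;
        stack.use_tag(0), // *x = 3;  (pops `Uniq(1)`)
        stack.use_tag(1), // let _val = *y;  (tag is gone)
    ]
}
```

Calling `replay_demo0()` yields `Ok` for the first two accesses and `Err` for the final read through `y`, mirroring the annotated walkthrough above.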

## 2 Enabling Sharing

If we just had unique pointers, Rust would be a rather dull language.  Luckily
enough, there are also two ways to have shared access to a location: Through
shared references (safely), and through raw pointers (unsafely).  Moreover,
shared references *sometimes* (but not when they point to an `UnsafeCell`)
assert an additional guarantee: Their destination is read-only.

For example, we want the following code to be allowed -- not least because this
is actually safe code accepted by the borrow checker, so we better make sure
this is not undefined behavior:

{% highlight rust %}
fn demo1() {
  let x = &mut 1u8;
  // Create several shared references, and we can also still read from `x`
  let y1 = &*x;
  let _val = *x;
  let y2 = &*x;
  let _val = *y1;
  let _val = *y2;
}
{% endhighlight %}

However, the following code is *not* okay:

{% highlight rust %}
fn demo2() {
  let x = &mut 1u8;
  let y = &*x;
  // Create raw pointer aliasing `y` and write through it
  let z = x as *const u8 as *mut u8;
  unsafe { *z = 3; }
  // Use `y` again, asserting it still points to the same value
  let _val = *y;
}
{% endhighlight %}

If you
[try this in miri](https://play.rust-lang.org/?version=stable&mode=debug&edition=2015&gist=1bc8c2f432941d02246fea0808e2e4f4),
you will see it complain:

```
 --> src/main.rs:6:14
  |
6 |   let _val = *y;
  |              ^^ Location is not frozen long enough
  |
```

How is it doing that, and what is a "frozen" location?

To explain this, we have to extend the "shadow state" of our "virtual machine" a
bit.  First of all, we introduce a new kind of tag that a pointer can carry: A
*shared* tag.  The following Rust type describes the possible tags of a pointer:

{% highlight rust %}
pub type Timestamp = u64;
pub enum Borrow {
    Uniq(Timestamp),
    Shr(Option<Timestamp>),
}
{% endhighlight %}

You can think of the timestamp as a unique ID, but as we will see, for shared
references, it is also important to be able to determine which of these IDs was
created first.  The timestamp is optional in the shared tag because that tag is
also used by raw pointers, and for raw pointers, we are often not able to track
when and how they are created (for example, when raw pointers are converted to
integers and back).

We use a separate type for the items on our stack, because there we do not need
a timestamp for shared pointers:

{% highlight rust %}
pub enum BorStackItem {
    Uniq(Timestamp),
    Shr,
}
{% endhighlight %}

And finally, a "borrow stack" consists of a stack of `BorStackItem`, together
with an indication of whether the stack (and the location it governs) is
currently *frozen*, meaning it may only be read, not written:

{% highlight rust %}
pub struct Stack {
    borrows: Vec<BorStackItem>, // used as a stack; never empty
    frozen_since: Option<Timestamp>, // virtual frozen "item" on top of the stack
}
{% endhighlight %}

### 2.1 Executing the Examples

Let us now look at what happens when we execute our two example programs.  To
this end, I will embed comments in the source code.  There is only one location
of interest here, so whenever I talk about a "stack", I am referring to the
stack of that location.

{% highlight rust %}
fn demo1() {
  let x = &mut 1u8; // tag: `Uniq(0)`
  // stack: [Uniq(0)]; not frozen

  let y1 = &*x; // tag: `Shr(Some(1))`
  // stack: [Uniq(0), Shr]; frozen since 1

  // Access through `x`.  We first check whether its tag `Uniq(0)` is in the
  // stack (it is).  Next, we make sure that either our item *or* `Shr` is on
  // top *or* the location is frozen.  The latter is the case, so we go on.
  let _val = *x;
  // stack: [Uniq(0), Shr]; frozen since 1

  // This is not an access, but we still dereference `x`, so we do the same
  // actions as on a read.  Just like in the previous line, nothing happens.
  let y2 = &*x; // tag: `Shr(Some(2))`
  // stack: [Uniq(0), Shr]; frozen since 1

  // Access through `y1`.  Since the shared tag has a timestamp (1) and the type
  // (`u8`) does not allow interior mutability (no `UnsafeCell`), we check that
  // the location is frozen since (at least) that timestamp.  It is.
  let _val = *y1;
  // stack: [Uniq(0), Shr]; frozen since 1

  // Same as with `y1`: The location is frozen at least since 2 (actually, it
  // is frozen since 1), so we are good.
  let _val = *y2;
  // stack: [Uniq(0), Shr]; frozen since 1
}
{% endhighlight %}

This example demonstrates a few new aspects.  First of all, there are actually
two operations that perform tag-related checks in this model (so far):
Dereferencing a pointer (whenever you have a `*`, also implicitly), and actual
memory accesses.  Operations like `&*x` are an example of operations that
dereference a pointer without accessing memory.  Secondly, *reading* through a
mutable reference is actually okay *even when that reference is not exclusive*.
It is only *writing* through a mutable reference that "re-asserts" its
exclusivity.  I will come back to these points later, but let us first go
through another example.

{% highlight rust %}
fn demo2() {
  let x = &mut 1u8; // tag: `Uniq(0)`
  // stack: [Uniq(0)]; not frozen

  let y = &*x; // tag: `Shr(Some(1))`
  // stack: [Uniq(0), Shr]; frozen since 1

  // The `x` here really is a `&*x`, but we have already seen above what
  // happens: `Uniq(0)` must be in the stack, but we leave it unchanged.
  let z = x as *const u8 as *mut u8; // tag irrelevant because raw
  // stack: [Uniq(0), Shr]; frozen since 1

  // A write access through a raw pointer: Unfreeze the location and make sure
  // that `Shr` is at the top of the stack.
  unsafe { *z = 3; }
  // stack: [Uniq(0), Shr]; not frozen

  // Access through `y`.  There is a timestamp in the `Shr` tag, and the type
  // `u8` does not allow interior mutability, but the location is not frozen.
  // This is undefined behavior.
  let _val = *y;
}
{% endhighlight %}

### 2.2 Dereferencing a Pointer
[section 2.2]: #22-dereferencing-a-pointer

As we have seen, we consider the tag of a pointer already when dereferencing it,
before any memory access happens.  The operation on a dereference never mutates
the stack, but it performs some basic checks that might declare the program UB.
The reason for this is twofold: First of all, I think we should require some
basic validity for pointers that are dereferenced even when they do not access
memory. Secondly, there is the practical concern for the implementation in miri:
When we dereference a pointer, we are guaranteed to have type information
available (crucial for things that depend on the presence of an `UnsafeCell`),
whereas having type information on every memory access would be quite hard to
achieve in miri.

Notice that on a dereference, we have *both* a tag at the pointer *and* the type
of a pointer, and the two might not agree, which we do not always want to rule
out (we might have raw or shared pointers with a unique tag, for example).

The following checks are done on every pointer dereference:

1. If this is a raw pointer, do nothing and reset the tag used for the access to
   `Shr(None)`.  Raw accesses are checked as little as possible.
2. If this is a unique reference and the tag is `Shr(Some(_))`, that's an error.
3. If the tag is `Uniq`, make sure there is a matching `Uniq` item with the same
   ID on the stack of every location this reference points to (the size is
   determined with `size_of_val`).
4. If the tag is `Shr(None)`, make sure that either the location is frozen or
   else there is a `Shr` item on the stack of every location.
5. If the tag is `Shr(Some(t))`, then the check depends on whether a location is
   inside an `UnsafeCell` or not, according to the type of the reference.
    - Locations outside `UnsafeCell` must have `frozen_since` set to `t` or an
      older timestamp.
    - `UnsafeCell` locations must either be frozen or else have a `Shr` item in
      their stack (same check as if the tag had no timestamp).
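
The five checks above can be sketched as a method on the `Stack` type from earlier in this section.  This is a rough illustration for a single location, not miri's actual code: the `PtrKind` enum, the `in_cell` parameter, the helper `shr_ok` and the error strings are assumptions of mine, and a real implementation would run the check for every location the pointer covers.

```rust
type Timestamp = u64;

#[derive(Clone, Copy, Debug, PartialEq)]
enum Borrow {
    Uniq(Timestamp),
    Shr(Option<Timestamp>),
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum BorStackItem {
    Uniq(Timestamp),
    Shr,
}

struct Stack {
    borrows: Vec<BorStackItem>,
    frozen_since: Option<Timestamp>,
}

/// Hypothetical: the static type of the pointer being dereferenced.
#[derive(Clone, Copy, PartialEq)]
enum PtrKind {
    Raw,
    Unique,
    Shared,
}

impl Stack {
    /// The check of rule 4: frozen, or a `Shr` item somewhere on the stack.
    fn shr_ok(&self) -> Result<(), &'static str> {
        if self.frozen_since.is_some() || self.borrows.contains(&BorStackItem::Shr) {
            Ok(())
        } else {
            Err("location is neither frozen nor shared")
        }
    }

    /// Dereference check for one location.  `in_cell` says whether the type of
    /// the reference puts this location inside an `UnsafeCell`.  Returns the
    /// tag to use for a subsequent access; `Err` stands in for UB.
    fn deref(&self, kind: PtrKind, tag: Borrow, in_cell: bool) -> Result<Borrow, &'static str> {
        // Rule 1: raw pointers are not checked and access with `Shr(None)`.
        if kind == PtrKind::Raw {
            return Ok(Borrow::Shr(None));
        }
        match tag {
            // Rule 2: a unique reference must not carry a timestamped shared tag.
            Borrow::Shr(Some(_)) if kind == PtrKind::Unique => {
                return Err("unique reference carries a timestamped shared tag");
            }
            // Rule 3: a `Uniq` tag needs its item somewhere on the stack.
            Borrow::Uniq(id) => {
                if !self.borrows.contains(&BorStackItem::Uniq(id)) {
                    return Err("no matching `Uniq` item on the stack");
                }
            }
            // Rule 5, outside `UnsafeCell`: frozen since `t` or earlier.
            Borrow::Shr(Some(t)) if !in_cell => match self.frozen_since {
                Some(since) if since <= t => {}
                _ => return Err("location is not frozen long enough"),
            },
            // Rule 4, and rule 5 inside `UnsafeCell`.
            Borrow::Shr(_) => self.shr_ok()?,
        }
        Ok(tag)
    }
}
```

For the final read in `demo2`, the stack is `[Uniq(0), Shr]` and not frozen, so `deref(PtrKind::Shared, Borrow::Shr(Some(1)), false)` returns the "not frozen long enough" error in this sketch.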

### 2.3 Accessing Memory
[section 2.3]: #23-accessing-memory

On an actual memory access, we know the tag of the pointer that was used to
access (unless it was a raw pointer, in which case the tag we see is
`Shr(None)`), and we know whether we are reading from or writing to the current
location.  We perform the following operations:

1. If the location is frozen and this is a read access, nothing happens and we
   are done (even if the tag is `Uniq`).
2. Unfreeze the location (set `frozen_since` to `None`).
3. Pop the stack until the top item matches the tag of the pointer.
    - A `Uniq` item matches a `Uniq` tag with the same ID.
    - A `Shr` item matches any `Shr` tag (with or without timestamp).
    - When we are reading, a `Shr` item matches a `Uniq` tag.

    If, popping the stack, we make it empty, then we have undefined behavior.
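
As a sketch, the access rules above could look as follows for a single location.  Again this is only an illustration and not the code in miri: the method name `access`, the `is_write` flag and the error string are my own choices, and the dereference check from section 2.2 is assumed to have passed already.

```rust
type Timestamp = u64;

#[derive(Clone, Copy, Debug, PartialEq)]
enum Borrow {
    Uniq(Timestamp),
    Shr(Option<Timestamp>),
}

#[derive(Clone, Copy, Debug, PartialEq)]
enum BorStackItem {
    Uniq(Timestamp),
    Shr,
}

#[derive(Debug)]
struct Stack {
    borrows: Vec<BorStackItem>,
    frozen_since: Option<Timestamp>,
}

impl Stack {
    /// Memory access with the given tag; `Err` stands in for UB.
    fn access(&mut self, tag: Borrow, is_write: bool) -> Result<(), &'static str> {
        // Rule 1: reads from a frozen location change nothing.
        if self.frozen_since.is_some() && !is_write {
            return Ok(());
        }
        // Rule 2: unfreeze.
        self.frozen_since = None;
        // Rule 3: pop until the top item matches the tag.
        while let Some(&top) = self.borrows.last() {
            let is_match = match (top, tag) {
                (BorStackItem::Uniq(a), Borrow::Uniq(b)) => a == b,
                (BorStackItem::Shr, Borrow::Shr(_)) => true,
                // Only reads let a `Shr` item match a `Uniq` tag.
                (BorStackItem::Shr, Borrow::Uniq(_)) => !is_write,
                (BorStackItem::Uniq(_), Borrow::Shr(_)) => false,
            };
            if is_match {
                return Ok(());
            }
            self.borrows.pop();
        }
        Err("stack became empty: no item matches the tag")
    }
}
```

Starting from `[Uniq(0), Shr]`, a write with `Shr(None)` leaves the stack alone, a write with `Uniq(0)` pops the `Shr`, and a later read with `Shr(None)` then empties the stack and fails, which is the sequence of events at the end of `demo4` in section 3.2.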

To understand these rules better, try going back through the three examples we
have seen so far and applying these rules for dereferencing pointers and
accessing memory to understand how they interact.

The only thing that is subtle and potentially surprising here is that we make a
`Uniq` tag match a `Shr` item and also accept `Uniq` reads on frozen locations.
This is required to make `demo1` work: Rust permits read accesses through
mutable references even when they are not currently actually unique.  Our model
hence has to do the same.

## 3 Retagging and Creating Raw Pointers

We have talked quite a bit about what happens when we *use* a pointer.  It is
time we take a close look at *how pointers are created*.  However, before we go
there, I would like us to consider one more example:

{% highlight rust %}
fn demo3(x: &mut u8) -> u8 {
    some_function();
    *x
}
{% endhighlight %}

The question is: Can we move the load of `x` to before the function call?
Remember that the entire point of Stacked Borrows is to enforce a certain
discipline when using references, in particular, to enforce uniqueness of
mutable references.  So we should hope that the answer to that question is "yes"
(and that, in turn, is good because we might use it for optimizations).
Unfortunately, things are not so easy.

The uniqueness of mutable references entirely rests on the fact that the pointer
has a unique tag: If our tag is at the top of the stack (and the location is not
frozen), then any access with another tag will pop our item from the stack (or
cause undefined behavior).  This is ensured by the memory access checks (and the
exception for matching `Uniq` tags with `Shr` items on reads does not affect
this property).  Hence, if our tag is *still* on the stack after some other
accesses happened (and we know it is still on the stack every time we
dereference the pointer, as per the dereference checks described above), we know
that no access through a pointer with a different tag can have happened.

### 3.1 Guaranteed Freshness

However, what if `some_function` has an exact copy of `x`?  We got `x` from our
caller (whom we do not trust), maybe they used that same tag for another
reference (copied it with `transmute_copy` or so) and gave that to
`some_function`?  There is a simple way we can circumvent this concern: Generate
a new tag for `x`.  If *we* generate the tag (and we know generation never emits
the same tag twice, which is easy), we can be sure this tag is not used for any
other reference.  So let us make this explicit by putting a `Retag` instruction
into the code where we generate new tags:

{% highlight rust %}
fn demo3(x: &mut u8) -> u8 {
    Retag(x);
    some_function();
    *x
}
{% endhighlight %}

These `Retag` instructions are inserted by the compiler pretty much any time
references are copied: At the beginning of every function, all inputs of
reference type get retagged.  On every assignment, if the assigned value is of
reference type, it gets retagged.  Moreover, we do this even when the reference
value is inside the field of a `struct` or `enum`, to make sure we really cover
all references.  (This recursive descent is already implemented, but the
implementation has not landed yet.)  However, we do *not* descend recursively
through references: Retagging a `&mut &mut u8` will only retag the *outer*
reference.

Retagging is the *only* operation that generates fresh tags.  Taking a reference
simply forwards the tag of the pointer we are basing this reference on.
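
One easy way to guarantee that tag generation never emits the same tag twice is a monotonically increasing counter, which also gives us the "created first" ordering on timestamps for free.  The sketch below assumes a hypothetical `Clock` type with a `fresh_tag` method; neither name is taken from miri.

```rust
type Timestamp = u64;

#[derive(Clone, Copy, Debug, PartialEq)]
enum Borrow {
    Uniq(Timestamp),
    Shr(Option<Timestamp>),
}

/// Hypothetical global clock handing out timestamps in increasing order.
#[derive(Default)]
struct Clock {
    next: Timestamp,
}

impl Clock {
    /// Returns the current timestamp and advances the clock.
    fn tick(&mut self) -> Timestamp {
        let t = self.next;
        self.next += 1;
        t
    }

    /// Fresh tag for retagging a reference: `Uniq(_)` for `&mut`,
    /// `Shr(Some(_))` for `&`.
    fn fresh_tag(&mut self, mutable: bool) -> Borrow {
        let t = self.tick();
        if mutable {
            Borrow::Uniq(t)
        } else {
            Borrow::Shr(Some(t))
        }
    }
}
```

Because both kinds of tags draw from the same counter, a later retag always gets a larger timestamp than an earlier one, matching the numbering used in the annotated examples of this post.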

Here is our very first example with explicit retagging:

{% highlight rust %}
fn demo0() {
  let x = &mut 1u8;
  Retag(x); // tag of `x` gets changed to `Uniq(0)`
  // stack: [Uniq(0)]; not frozen

  let y = &mut *x;
  Retag(y); // tag of `y` gets changed to `Uniq(1)`
  // stack: [Uniq(0), Uniq(1)]; not frozen

  // Check that `Uniq(1)` is on the stack, then pop to bring it to the top.
  *y = 5;
  // stack: [Uniq(0), Uniq(1)]; not frozen

  // Check that `Uniq(0)` is on the stack, then pop to bring it to the top.
  *x = 3;
  // stack: [Uniq(0)]; not frozen

  // Check that `Uniq(1)` is on the stack -- it is not, hence UB.
  let _val = *y;
}
{% endhighlight %}

For each reference, `Retag` does the following (we will slightly refine these
instructions later):

1. Compute a fresh tag, `Uniq(_)` for a mutable reference and `Shr(Some(_))` for
   a shared reference.
2. Perform the checks that would also happen when we dereference this reference.
3. Perform the actions that would also happen when an actual access happens
   through this reference (for shared references a read access, for mutable
   references a write access).
4. If the new tag is `Uniq`, push it onto the stack.  (The location cannot be
   frozen: `Uniq` tags are only created for mutable references, and we just
   performed the actions of a write access to memory, which unfreezes
   locations.)
5. If the new tag is `Shr`:
    - If the location is already frozen, we do nothing.
    - Otherwise:
      1. Push a `Shr` item to the stack.
      2. If the location is outside of `UnsafeCell`, it gets frozen with the
         timestamp of the new reference.

One high-level way to think about retagging is that it computes a fresh tag, and
then performs a reborrow of the old reference with the new tag.

### 3.2 When Pointers Escape

Creating a shared reference is not the only way to share a location: We can also
create raw pointers, and if we are careful enough, use them to access a location
from different aliasing pointers.  (Of course, "careful enough" is not very
precise, but the precise answer is the very model I am describing here.)

To account for this, we need one final ingredient in our model: A special
instruction that indicates that a reference was cast to a raw pointer, and may
thus be accessed from these raw pointers in a shared way.  Consider the
[following example](https://play.rust-lang.org/?version=stable&mode=debug&edition=2015&gist=253868e96b7eba85ef28e1eabd557f66):

{% highlight rust %}
fn demo4() {
  let x = &mut 1u8;
  Retag(x); // tag of `x` gets changed to `Uniq(0)`
  // stack: [Uniq(0)]; not frozen

  // Make sure what `x` points to is accessible through raw pointers.
  EscapeToRaw(x);
  // stack: [Uniq(0), Shr]; not frozen

  let y1 = x as *mut u8;
  let y2 = y1;
  unsafe {
    // All of these first dereference a raw pointer (no checks, tag gets
    // ignored) and then perform a read or write access with `Shr(None)` as
    // the tag, which is already the top of the stack so nothing changes.
    *y1 = 3;
    *y2 = 5;
    *y2 = *y1;
  }

  // Writing to `x` again pops `Shr` off the stack, as per the rules for
  // write accesses.
  *x = 7;
  // stack: [Uniq(0)]; not frozen

  // Any further access through the raw pointers is undefined behavior, even
  // reads: The write to `x` re-asserted that `x` is the unique reference for
  // this memory.
  let _val = unsafe { *y1 };
}
{% endhighlight %}

The behavior of `EscapeToRaw` is best described as "reborrowing for a raw
pointer": The steps are the same as for `Retag` above, except that the new
pointer's tag is `Shr(None)` and we do not freeze (i.e., we behave as if the
entire pointee was inside an `UnsafeCell`).

Knowing about both `Retag` and `EscapeToRaw`, you can now go back to `demo2` and
should be able to fully explain why the stack changes the way it does in that
example.

### 3.3 The Case of the Aliasing References

Everything I described so far was pretty much in working condition as of about a
week ago.  However, there was one thorny problem that I only discovered fairly
late, and as usual it is best demonstrated by an example -- entirely in safe
code:

{% highlight rust %}
fn demo_refcell() {
  let rc = &mut RefCell::new(23u8);
  Retag(rc); // tag gets changed to `Uniq(0)`
  // We will consider the stack of the location where `23` is stored; the
  // `RefCell` bookkeeping counters are not of interest.
  // stack: [Uniq(0)]

  // Taking a shared reference shares the location but does not freeze, due
  // to the `UnsafeCell`.
  let rc_shr = &*rc;
  Retag(rc_shr); // tag gets changed to `Shr(Some(1))`
  // stack: [Uniq(0), Shr]; not frozen

  // Lots of stuff happens here but it does not matter for this example.
  let mut bmut = rc_shr.borrow_mut();

  // Obtain a mutable reference into the `RefCell`.
  let mut_ref = &mut *bmut;
  Retag(mut_ref); // tag gets changed to `Uniq(2)`
  // stack: [Uniq(0), Shr, Uniq(2)]; not frozen

  // And at the same time, a fresh shared reference to its outside!
  // This counts as a read access through `rc`, so we have to pop until
  // at least a `Shr` is at the top of the stack.
  let shr_ref = &*rc;
  Retag(shr_ref); // tag gets changed to `Shr(Some(3))`
  // stack: [Uniq(0), Shr]; not frozen

  // Now using `mut_ref` is UB because its tag is no longer on the stack.  But
  // that is bad, because it is usable in safe code.
  *mut_ref += 19;
}
{% endhighlight %}

Notice how `mut_ref` and `shr_ref` alias!  And yet, creating a shared reference
to the memory already covered by our unique `mut_ref` must not invalidate
`mut_ref`.  If we follow the instructions above, when we retag `shr_ref` after
it got created, we have no choice but to pop the item matching `mut_ref` off the
stack.  Ouch.

This made me realize that creating a shared reference has to be very weak for
locations inside `UnsafeCell`.  In fact, it is entirely equivalent to
`EscapeToRaw`: We just have to make sure some kind of shared access is possible,
but we have to accept that there might be active mutable references assuming
exclusive access to the same locations.  That on its own is not enough, though.

I also added a new check to the retagging procedure: Before taking any action
(i.e., before step 3, which could pop items off the stack), we check if the
reborrow is redundant: If the new reference we want to create is already
dereferencable (because its item is already on the stack and, if applicable, the
stack is already frozen), *and* if the item that justifies this is moreover
"derived from" the item that corresponds to the old reference, then we just do
nothing.  Here, "derived from" means "further up the stack".  Basically, the
reborrow has already happened and the new reference is ready for use, and
(because of that "derived from" check), we know that using the new reference
will *not* pop the item corresponding to the old reference off the stack.  In
that case, we avoid popping anything, to keep other references valid.

This rule applies in our example above when we create `shr_ref` from `rc`.
There is already a `Shr` on the stack (so the new reference is dereferencable),
and the item matching the old reference (`Uniq(0)`) is below that `Shr` (so
after using the new reference, the old one remains dereferencable).  Hence we do
nothing, keeping the `Uniq(2)` on the stack, such that the access through
`mut_ref` at the end remains valid.

This may sound like a weird rule, and it is.  I would surely not have thought of
this if `RefCell` did not force our hand here.  However, as we shall see in
[section 5], it also does not break any of the important properties of the
model (mutable references being unique and shared references being read-only
except for `UnsafeCell`).  Moreover, when pushing an item to the stack (at the
end of the retag action), we can be sure that the stack is not yet frozen: If it
was frozen, the reborrow would be redundant.
 633
 634 With this extension, the instructions for retagging and `EscapeToRaw` now look
 635 as follows:
 636
1. Compute a fresh tag: `Uniq(_)` for a mutable reference, `Shr(Some(_))` for a
   shared reference, `Shr(None)` if this is `EscapeToRaw`.
2. Perform the checks that would also happen when we dereference this reference.
   Remember the position of the item matching the tag in the stack.
3. Redundancy check: If the new tag passes the checks performed on a
   dereference, and if the item that makes this check succeed is *above* the one
   we remembered in step 2 (where the "frozen" state is considered above every
   item in the stack), then we stop.  We are done for this location.
4. Perform the actions that would also happen when an actual access happens
   through this reference (for shared references a read access, for mutable
   references a write access).
   Now the location cannot be frozen any more: If the fresh tag is `Uniq`, we
   just unfroze; if the fresh tag is `Shr` and the location was already frozen,
   then the redundancy check (step 3) would have kicked in.
5. If the new tag is `Uniq`, push it onto the stack.
6. If the new tag is `Shr`, push a `Shr` item to the stack.  Then, if the
   location is outside of `UnsafeCell`, it gets frozen with the timestamp of the
   new reference.
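
To make the six steps concrete, here is a minimal sketch of them for a single
memory location.  It is not miri's actual code: the names `Loc`, `Pos`, `deref`,
`access` and `retag` are made up for illustration, and the deref check and
access action are condensed from the rules in sections 2.2 and 2.3.  One
assumption deserves a loud warning: the sketch reads "above" in step 3 as "at or
above", because a new `Shr` tag that matches the very same `Shr` item as the old
tag would otherwise never be redundant.  The tag numbers in `refcell_example`
are illustrative as well.

```rust
#[derive(Clone, Copy, Debug, PartialEq)]
enum Tag { Uniq(u64), Shr(Option<u64>) }

#[derive(Clone, Copy, Debug, PartialEq)]
enum Item { Uniq(u64), Shr }

/// What made a deref check succeed.  The derived order places `Frozen` above
/// every stack index, as step 3 requires.
#[derive(Clone, Copy, Debug, PartialEq, PartialOrd)]
enum Pos { At(usize), Frozen }

/// Shadow state of one memory location.
#[derive(Debug)]
struct Loc { stack: Vec<Item>, frozen_since: Option<u64> }

impl Loc {
    /// Deref check: is `tag` still valid for this location, and thanks to what?
    fn deref(&self, tag: Tag, in_cell: bool) -> Result<Pos, &'static str> {
        match (tag, self.frozen_since) {
            // Timestamped tag outside `UnsafeCell`: frozen since `t` or earlier.
            (Tag::Shr(Some(t)), frozen) if !in_cell => match frozen {
                Some(since) if since <= t => Ok(Pos::Frozen),
                _ => Err("location is not frozen (long enough)"),
            },
            // Any other shared tag is satisfied by a frozen location.
            (Tag::Shr(_), Some(_)) => Ok(Pos::Frozen),
            // Otherwise search for a matching item, starting at the top.
            _ => self.stack.iter()
                .rposition(|&item| match (item, tag) {
                    (Item::Uniq(a), Tag::Uniq(b)) => a == b,
                    (Item::Shr, Tag::Shr(_)) => true,
                    _ => false,
                })
                .map(Pos::At)
                .ok_or("no matching item on the stack"),
        }
    }

    /// Access action: unfreeze, then pop until the top item matches `tag`.
    fn access(&mut self, tag: Tag, write: bool) -> Result<(), &'static str> {
        if !write && self.frozen_since.is_some() {
            return Ok(()); // reading a frozen location changes nothing
        }
        self.frozen_since = None;
        while let Some(&top) = self.stack.last() {
            let matches = match (top, tag) {
                (Item::Uniq(a), Tag::Uniq(b)) => a == b,
                (Item::Shr, Tag::Shr(_)) => true,
                (Item::Shr, Tag::Uniq(_)) => !write, // reads may use a `Shr` item
                _ => false,
            };
            if matches { return Ok(()); }
            self.stack.pop();
        }
        Err("undefined behavior: no matching item")
    }

    /// Steps 2 to 6; the caller has already picked the fresh tag `new` (step 1).
    fn retag(&mut self, old: Tag, new: Tag, in_cell: bool) -> Result<(), &'static str> {
        let old_pos = self.deref(old, in_cell)?;               // step 2
        if let Ok(new_pos) = self.deref(new, in_cell) {        // step 3
            if new_pos >= old_pos { return Ok(()); }           // assumption: "at or above"
        }
        self.access(old, matches!(new, Tag::Uniq(_)))?;        // step 4
        match new {
            Tag::Uniq(id) => self.stack.push(Item::Uniq(id)),  // step 5
            Tag::Shr(ts) => {                                  // step 6
                self.stack.push(Item::Shr);
                if !in_cell { self.frozen_since = ts; }
            }
        }
        Ok(())
    }
}

/// Replays a `RefCell`-like scenario on the location holding the cell's content.
fn refcell_example() -> Result<Loc, &'static str> {
    let mut loc = Loc { stack: vec![Item::Uniq(0)], frozen_since: None };
    loc.retag(Tag::Uniq(0), Tag::Shr(Some(1)), true)?;      // share the `RefCell`
    loc.retag(Tag::Shr(Some(1)), Tag::Uniq(2), true)?;      // mutable borrow of the content
    loc.retag(Tag::Shr(Some(1)), Tag::Shr(Some(3)), true)?; // redundant shared reborrow
    loc.access(Tag::Uniq(2), true)?;                        // the mutable borrow still works
    Ok(loc)
}
```

In `refcell_example`, the third `retag` hits the redundancy check and leaves the
stack alone, which is why the final write through `Uniq(2)` still succeeds.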
 655
 656 The one thing I find slightly unsatisfying about the redundancy check is that it
 657 seems to overlap a bit with the rule that on a *read* access, a `Shr` item
 658 matches a `Uniq` tag.  Both of these together enable the read-only use of
 659 mutable references that have already been shared; I would prefer to have a
 660 single condition enabling that instead of two working together.
 661
## 4 Differences to the Original Proposal
[section 4]: #4-differences-to-the-original-proposal
 664
The key difference to the original proposal is that the check performed on a
dereference, and the check performed on an access, are not the same check.  This
means there are more "moving parts" in the model, but it also means we do not
need a weird special exception for `demo1` any more like the original proposal
did.  The main reason for this, however, is that on an access, we just do not
know if we are inside an `UnsafeCell` or not, so we cannot do all the checks we
would like to do.  Accordingly, I also rearranged terminology a bit.  There is
no longer one "reactivation" action; instead there is a "deref" check and an
"access" action, as described above in sections [2.2][section 2.2] and
[2.3][section 2.3].
 675
Beyond that, I made the behavior of shared references and raw pointers more
uniform.  This helped to fix test failures around `iter_mut` on slices, which
first creates a raw pointer and then a shared reference: In the original
model, creating the shared reference invalidates previously created raw
pointers.  As part of unifying the two, this no longer happens.
(Incidentally, I did not make this change with the intention of fixing
`iter_mut`.  I made this change because I wanted to reduce the number of case
distinctions in the model.  Then I realized the relevant test suddenly passed
even with the full model enabled, investigated what happened, and realized I
had accidentally had a great idea. :D )
 686
The tag is now "typed" (`Uniq` vs `Shr`) to be able to support `transmute`
between references and shared pointers.  Such `transmute`s were an open question
in the original model, and some people raised concerns about them in the ensuing
discussion.  I invite all of you to come up with strange things you think you
should be able to `transmute` and throw them at miri so that we can see if your
use-cases are covered. :)
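
As a starting point, here is a small sketch of the kind of test case one might
throw at miri.  The function name `write_via_transmuted` is made up, and the
example only covers the simplest case (a mutable reference turned into a raw
pointer via `transmute` instead of `as`); whether a given variation is accepted
is exactly what running it under miri would tell you.

```rust
use std::mem;

/// Turns a mutable reference into a raw pointer without going through `as`,
/// then writes and reads through that pointer.
fn write_via_transmuted(x: &mut u8) -> u8 {
    // Hypothetical test case: the raw pointer is produced by `transmute`, so no
    // explicit reference-to-raw cast ever happens.
    let raw: *mut u8 = unsafe { mem::transmute::<&mut u8, *mut u8>(x) };
    unsafe { *raw = 42 };
    unsafe { *raw }
}
```

Variations worth trying include transmuting in the other direction, or going
through a `union` instead of `transmute`.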
 693
Creating a shared reference now always pushes a `Shr` item onto the stack, even
when there is no `UnsafeCell`.  This means that starting with a mutable reference
`x`, `&*x as *const _ as *mut _` is pretty much equivalent to `x as *mut _`.  This
came up during the implementation because I realized that in `x as *const _` on
a mutable reference, `x` actually first gets coerced to a shared reference, which
then gets cast to a raw pointer.  This happens in `NonNull::from`, so if you
later write to that `NonNull`, you end up writing to a raw pointer that was
created from a shared reference.  Originally I intended this to be strictly
illegal.  This is writing to a shared reference after all, how dare you!
However, it turns out it's actually no big deal *if the shared reference does
not get used again later*.  This is an access-based model after all: if a
reference never gets used again, we do not care much about enforcing any
guarantees for it.  (This is another example of a coincidental fix, where I had
a surprisingly passing test case and then investigated what happened.)
 708
 709 The redundancy check during retagging can be seen as refining a similar check
 710 that the original model did whenever a new reference was created (where we
 711 wouldn't change the state if the new borrow is already active).
 712
 713 Finally, the notion of "function barriers" from the original Stacked Borrows has
 714 not been implemented yet.  This is the next item on my todo list.
 715
## 5 Key Properties
[section 5]: #5-key-properties
 718
 719 Let us look at the two key properties that I set out as design goals, and see
 720 how the model guarantees that they hold true in all valid (UB-free) executions.
 721
### 5.1 Mutable References are Unique
 723
The property I would like to establish here is the following: After creating
(retagging, really) a `&mut`, if we then run some unknown code *that does not
get passed the reference*, do not derive another reference from ours in the
meantime, and then use the reference again (reading or writing), we can be sure
that this unknown code did not access the memory behind our mutable reference at
all (or we have UB).  For example:
 730
{% highlight rust %}
fn demo_mut_unique(our: &mut i32) -> i32 {
  Retag(our); // So we can be sure the tag is unique

  *our = 5;

  unknown_code();

  // We know this will return 5, and moreover if `unknown_code` does not panic
  // we know we could do the write after calling `unknown_code` (because it
  // cannot even read from `our`).
  *our
}
{% endhighlight %}
 745
 746 The proof sketch goes as follows: After retagging the reference, we know it is
 747 at the top of the stack and the location is not frozen.  (The "redundant
 748 reborrow" rule does not apply because a fresh `Uniq` tag can never be
 749 redundant.)  For any access performed by the unknown code, we know that access
 750 cannot use the tag of our reference because the tags are unique and not
 751 forgeable.  Hence if the unknown code accesses our locations, that would pop our
 752 tag from the stack.  When we use our reference again, we know it is on the
 753 stack, and hence has not been popped off.  Thus there cannot have been an access
 754 from the unknown code.
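
The argument can be replayed on a toy model of a single, unfrozen location.
This is only a sketch under simplifying assumptions: `Bor`, `access`, `deref`
and `uniqueness_demo` are made-up names, items and tags share one type, and the
timestamps on `Shr` tags are dropped because they play no role here.

```rust
/// Items and tags share one type in this sketch.
#[derive(Clone, Copy, Debug, PartialEq)]
enum Bor { Uniq(u64), Shr }

/// Access action on an unfrozen location: pop until the top item matches.
fn access(stack: &mut Vec<Bor>, tag: Bor, write: bool) -> Result<(), &'static str> {
    while let Some(&top) = stack.last() {
        // Equal items match; on a read, a `Shr` item also matches a `Uniq` tag.
        if top == tag || (top == Bor::Shr && !write) {
            return Ok(());
        }
        stack.pop();
    }
    Err("undefined behavior: no matching item")
}

/// Deref check: the tag's item must still be somewhere on the stack.
fn deref(stack: &[Bor], tag: Bor) -> Result<(), &'static str> {
    if stack.contains(&tag) { Ok(()) } else { Err("undefined behavior: tag was popped") }
}

/// `our` carries `Uniq(1)`; the unknown code can only use other tags.
fn uniqueness_demo(unknown_read: Option<Bor>) -> Result<(), &'static str> {
    let mut stack = vec![Bor::Uniq(0), Bor::Shr, Bor::Uniq(1)];
    access(&mut stack, Bor::Uniq(1), true)?;   // *our = 5
    if let Some(tag) = unknown_read {
        access(&mut stack, tag, false)?;       // unknown code reads the location
    }
    deref(&stack, Bor::Uniq(1))?;              // using `our` again ...
    access(&mut stack, Bor::Uniq(1), false)    // ... to read from it
}
```

With `unknown_read` set to `None`, the final use succeeds.  With any foreign tag,
even a mere read pops `Uniq(1)`, so the final deref check reports UB.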
 755
 756 Actually this theorem applies *any time* we have a reference whose tag we can be
 757 sure has not been leaked to anyone else, and which points to locations which
 758 have this tag at the top of the (unfrozen) stack.  This is not just the case
 759 immediately after retagging.  We know our reference is at the top of the stack
 760 after writing to it, so in the following example we know that `unknown_code_2`
 761 cannot access `our`:
 762
{% highlight rust %}
fn demo_mut_advanced_unique(our: &mut u8) -> u8 {
  Retag(our); // So we can be sure the tag is unique

  unknown_code_1(&*our);

  // This "re-asserts" uniqueness of the reference: After writing, we know
  // our tag is at the top of the stack.
  *our = 5;

  unknown_code_2();

  // We know this will return 5
  *our
}
{% endhighlight %}
 779
### 5.2 Shared References (without `UnsafeCell`) are Read-only
 781
 782 The key property of shared references is that: After creating (retagging,
 783 really) a shared reference, if we then run some unknown code (it can even have
 784 our reference if it wants), and then we use the reference again, we know that
 785 the value pointed to by the reference has not been changed.  For example:
 786
{% highlight rust %}
fn demo_shr_frozen(our: &u8) -> u8 {
  Retag(our); // So we can be sure the tag actually carries a timestamp

  // See what's in there.
  let val = *our;

  unknown_code(our);

  // We know this will return `val`
  *our
}
{% endhighlight %}
 800
 801 The proof sketch goes as follows: After retagging the reference, we know the
 802 location is frozen (this is the case even if the "redundant reborrow" rule
 803 applies).  If the unknown code does any write, we know this will unfreeze the
 804 location.  The location might get re-frozen, but only at the then-current
 805 timestamp.  When we do our read after coming back from the unknown code, this
 806 checks that the location is frozen *at least* since the timestamp given in its
 807 tag, so if the location is unfrozen or got re-frozen by the unknown code, the
 808 check would fail.  Thus the unknown code cannot have written to the location.
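
The timestamp argument can be sketched by tracking only the freeze state of one
location outside `UnsafeCell`.  The names `FreezeState` and `read_only_demo` and
the concrete timestamps are made up for illustration, and the borrow stack is
left out entirely since the argument does not need it.

```rust
/// Freeze state of one location outside `UnsafeCell`.
struct FreezeState { frozen_since: Option<u64> }

impl FreezeState {
    /// Any write unfreezes the location.
    fn write(&mut self) {
        self.frozen_since = None;
    }

    /// Creating a shared reference at time `now` freezes an unfrozen location.
    fn freeze(&mut self, now: u64) {
        if self.frozen_since.is_none() {
            self.frozen_since = Some(now);
        }
    }

    /// Deref check for a `Shr(Some(t))` tag: frozen at least since `t`.
    fn deref_shr(&self, t: u64) -> Result<(), &'static str> {
        match self.frozen_since {
            Some(since) if since <= t => Ok(()),
            _ => Err("undefined behavior: location was written to"),
        }
    }
}

/// `our` carries `Shr(Some(1))`; the unknown code may or may not write.
fn read_only_demo(unknown_writes: bool) -> Result<(), &'static str> {
    let mut loc = FreezeState { frozen_since: None };
    loc.freeze(1);        // Retag(our)
    loc.deref_shr(1)?;    // let val = *our;
    if unknown_writes {
        loc.write();      // unknown code writes ...
        loc.freeze(2);    // ... and re-freezes, but only at a later timestamp
    }
    loc.deref_shr(1)      // final `*our`
}
```

Re-freezing cannot hide the write: the new `frozen_since` is later than the
timestamp in our tag, so the final check fails.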
 809
 810 One interesting observation here for both of these proofs is that all we rely on
 811 when the unknown code is executed are the actions performed on every memory
 812 access.  The additional checks that happen when a pointer is dereferenced only
 813 matter in *our* code, not in the foreign code.  This indicates that we could see
 814 the checks on pointer dereference as another "shadow state operation" next to
 815 `Retag` and `EscapeToRaw`, and then these three operations plus the actions on
 816 memory accesses are all that there is to Stacked Borrows.  This is difficult to
 817 implement in miri because dereferences can happen any time a path is evaluated,
 818 but it is nevertheless interesting and might be useful in a "lower-level MIR"
 819 that does not permit dereferences in paths.
 820
## 6 Evaluation, and How You Can Help
[section 6]: #6-evaluation-and-how-you-can-help
 823
 824 I have implemented both the validity invariant and the model as described above
 825 in miri. This [uncovered](https://github.com/rust-lang/rust/issues/54908) two
 826 [issues](https://github.com/rust-lang/rust/issues/54957) in the standard
 827 library, but both were related to validity invariants, not Stacked Borrows.
 828 With these exceptions, the model passes the entire test suite.  There were some
 829 more test failures in earlier versions (as mentioned in [section 4]), but the
 830 final model accepts all the code covered by miri's test suite.  Moreover I wrote
 831 a bunch of compile-fail tests to make sure the model catches various violations
 832 of the key properties it should ensure.
 833
However, miri's test suite is tiny, and I have but one brain to come up with
counterexamples!  In fact I am quite a bit worried because I literally came up
with `demo_refcell` less than two weeks ago, so what else might I have missed?
This is where you come in.  Please test this model!  Come up with something funny
you think should work (I am thinking about funny `transmute` in particular,
using type punning through unions or raw pointers if you prefer that), or maybe
you have some crate that has some unsafe code and a test suite (you do have a
test suite, right?) that might run under miri.
 842
 843 The easiest way to try the model is the
 844 [playground](https://play.rust-lang.org/): Type the code, select "Tools - Miri",
 845 and you'll see what it does.
 846
 847 For things that are too long for the playground, you have to install miri on
 848 your own computer.  miri depends on rustc nightly and has to be updated
 849 regularly to keep working, so it is not currently on crates.io.  Installation
 850 instructions for miri are provided
 851 [in the README](https://github.com/solson/miri/#running-miri).  Please let me
 852 know if you are having trouble with anything.  You can report issues, comment on
 853 this post or find me in chat (as of recently, I am partial to Zulip where we
 854 have an
 855 [unsafe code guidelines stream](https://rust-lang.zulipchat.com/#narrow/stream/136281-wg-unsafe-code-guidelines)).
 856
 857 With miri installed, you can `cargo miri` a project with a binary to run it in
 858 miri.  Dependencies should be fully supported, so you can use any crate you
 859 like.  It is not unlikely, however, that you will run into issues because miri
 860 does not support some operation.  In that case please search the
 861 [issue tracker](https://github.com/solson/miri/issues) and report the issue if
 862 it is new.  We cannot support everything, but we might be able to do something
 863 for your case.
 864
Unfortunately, `cargo miri test` is currently broken; if you want to help with
that [here are some details](https://github.com/solson/miri/issues/479).
Moreover, wouldn't it be nice if we could run the entire libcore, liballoc and
libstd test suites in miri?  There are tons of interesting cases of Rust's core
data structures being exercised there, and the comparatively tiny miri test suite
has already helped to find two soundness bugs, so there are probably more.  Once
`cargo miri test` works again, it would be great to find a way to run it on the
standard library test suites, and set up something so that this happens
automatically on a regular basis (so that we notice regressions).
 874
 875 As you can see, there is more than enough work for everyone.  Don't be shy!  I
 876 have a mere three weeks left on this internship, after which I will have to
 877 significantly reduce my Rust activities in favor of finishing my PhD.  I won't
 878 disappear entirely though, don't worry -- I will still be able to mentor you if
 879 you want to help with any of the above tasks. :)