From: Ralf Jung Date: Wed, 31 Jan 2018 10:25:05 +0000 (+0100) Subject: New post: Sharing for a lifetime X-Git-Url: https://git.ralfj.de/web.git/commitdiff_plain/aacb1fee402c329bb4a85d5f02527a72d86ac9c3?ds=sidebyside New post: Sharing for a lifetime --- diff --git a/ralf/_posts/2018-01-31-sharing-for-a-lifetime.md b/ralf/_posts/2018-01-31-sharing-for-a-lifetime.md new file mode 100644 index 0000000..25b0bec --- /dev/null +++ b/ralf/_posts/2018-01-31-sharing-for-a-lifetime.md @@ -0,0 +1,158 @@ +--- +title: "Sharing for a Lifetime" +categories: research rust +--- + +This post is about an aspect of the [RustBelt paper]({{ site.baseurl }}{% post_url 2017-07-08-rustbelt %}). +Concretely, I'd like to share some of our thoughts on the nature of types and shared references. +Let's see how well this goes. :) + +Shared references are an extremely powerful mechanism in the Rust type system, and we've had quite some trouble fining a good way for handling them in our [formal model]({{ site.baseurl }}{% post_url 2015-10-12-formalizing-rust %}). +In this post, I will present the model that we came up with. + +Let's get started by discussing what a "model" of a type looks like, and what's so hard about handling shared references. + +## Immutable Sharing + +The reason shared references are so subtle is [interior mutability](https://doc.rust-lang.org/1.22.0/reference/interior-mutability.html). +This fundamentally changes the nature of shared references. +To see how, I will briefly assume that Rust does *not* permit interior mutability, and describe how a model of Rust types could roughly look like. + +### What's in a Type? + +I've already [written]({{ site.baseurl }}{% post_url 2015-10-12-formalizing-rust %}) [about]({{ site.baseurl }}{% post_url 2016-01-09-the-scope-of-unsafe %}) the idea of "semantic types", which essentially boils down to the concept that types can impose *additional invariants* on their fields, like `self.cap >= self.len` (for `Vec`) or `!is_zero(self.0)` (for `NonZero`). +So, whenever I say "(semantic) type", I really mean "a type's (syntactic) definition (aka the data layout) **and** the additional invariants that the module imposes on this type". +Two types may be syntactically equal but have different invariants, and then clearly casting from one to the other would be horribly unsafe -- so, these would be completely different types despite their syntactic similarity. +What really matters for a type is the answer to the question: *Given some sequence of bytes, is this sequence a valid inhabitant of the type?*[^1] +If we can answer this question for every Rust type, we have found a "(semantic) model" of the Rust type system. +The (syntactic) data layout imposes some basic structure on the permitted sequences (like their length), but the invariants can arbitrarily *refine* the set of sequences that are allowed at the given type by imposing additional requirements. +(Indeed, [refinement types](https://en.wikipedia.org/wiki/Refinement_(computing)#Refinement_types) are a way to reflect these additional restrictions into the type system, but that's not what I want to talk about today.) + +[^1]: I should mention at this point that "sequences of bytes" is a subtle concept because of complications like uninitialized memory and pointers. I hope to come back to this in a future post; it should not matter much here. + +For example, the sequence of four `0` bytes is valid at `i32` but not at `NonZero`. +As another example, a pointer is valid at `Box` if it points to four bytes of *owned* memory[^2], and these four bytes are a valid `i32`. +Finally, a pointer is valid at `&'a mut T` if it points to `size_of::()` many bytes of memory[^3] that are *owned for lifetime `'a`*, and those bytes are valid at `T`. + +[^2]: I will also mostly ignore how the concept of *ownership* can be made more precise. If you want to dig deeper, the keyword is [separation logic](https://en.wikipedia.org/wiki/Separation_logic). We had to extend separation logic to be able to handle "ownership for some lifetime", but you don't have to understand any of that to follow this post. That's the plan, anyway. + +[^3]: Yes, we do ignore unsized types here. + +So, the path to handling shared references in our model seems clear: We have to find out when a pointer is valid at `&'a T`. +Since we ignore interior mutability, we can answer this question fairly easily (at least if don't want to go fully formal on this, which right now, we do not ;): While the lifetime `'a` lasts, the pointer has to point to `size_of::()` many *immutable* bytes of memory (meaning the bytes do not change during this lifetime; we impose no restrictions on earlier or later modifications); and moreover, those bytes are valid at `T`. + +Given this model, we can now explain why `*x` works for *any* `x: &'a T` if `T: Copy`: +The access can only happen during the lifetime `'a`, so we can dereference the pointer. +Moreover, since `T: Copy`, we know that we can copy the bytes and have *both* sequences (the original and the copy) be valid at `T` (remember that "being a valid sequence" could impose ownership requirements like `Box` does, so this kind of duplication is not possible in general for all `T`). +Finally, since we know the memory is immutable, we can do this access even though we don't fully own the memory: We can rely on everyone else (i.e., other threads) also just performing read operations on this memory. +Overall, we conclude that we did not violate any of the invariants imposed on `x`, and the data we copied is a valid `T`. + +## Interior Mutability + +Interior mutability however breaks this model of `&'a T`: We just cannot declare that the pointed-to memory is read-only in general. +If we did, [`Cell::set`](https://doc.rust-lang.org/beta/std/cell/struct.Cell.html#method.set) and many more operations would be blatantly unsafe. +Indeed this is the reason shared references are not called "immutable references" even though they are generally considered the dual of mutable references. +(My personal thinking is that mutable references should really be called "unique references" to instill the point that the distinction between the two is about uniqueness, not about mutability. Oh well.) + +So, with our old model broken, what could we do instead? Let us consider some examples. + +### `&Cell` + +Why is `Cell::set` a safe operation? +The reason is twofold. +First of all, `Cell` is not `Sync`, so it is not possible to send an `&Cell` from one thread to another. +As a consequence, all shared references to a given `Cell` always reside in the same thread. +This already excludes the possibility of data races. +Secondly, `Cell` does not permit obtaining a *deep pointer* into the data. +So, changing a `Cell>` from `Ok(true)` to `Err(42)` cannot invalidate any pointers that still expect a `bool` in the `Result`. + +Coming back to our model, we could say that `&'a Cell` is a pointer valid for lifetime `'a` that points to `size_of::()` many bytes of memory that (for this lifetime) may *be accessed by any code running in the current thread, but not by any other code*; and moreover, those bytes are valid at `T`. +This definition forces `&Cell` to *not* be `Send` (because the model refers to the "current thread", so changing that thread could violate the invariant) and hence `Cell` must not be `Sync`. +It also rules out pointers into a `Cell>`: Say we would want to have a pointer to the `Ok` data, which would have type `&Cell`. +We could not show that this pointer is actually valid at the given type, because anyone in the current thread could modify the (outer) `Cell>` at any time, and if they write an `Err(42)` in it they will violate the requirements that an `&Cell` only ever points to a valid `bool`! +So, both of the restrictions that we discussed above follow directly from the invariant for `&Cell`. + +### `&Mutex` + +Coming to `Mutex`, the justification for why `Mutex::lock` and `MutexGuard::deref_mut` are safe operations is again different: +A `Mutex` implements *mutual exclusion* (hence the name), so only one thread can ever hold the lock. +As a result, giving that thread mutable access to the inner data cannot result in any data races or other conflicts with other threads. + +In the model (and I'm going to gloss over more and more details here), we will say that `&'a Mutex` is valid if everyone accessing the memory it points to follows the locking discipline. +A `MutexGuard` is valid if we actually own the lock of the corresponding `Mutex`. +Together, this is sufficient to justify safety of all the `Mutex` and `MutexGuard` operations. + +## Sharing is Different + +We now come back to the question: *When is a pointer valid at type `&'a T`?* +As we have seen above, the answer to this question actually depends significantly on the type `T`! +For `i32` or `bool`, our old read-only model still works, but for `Cell` or `Mutex`, we need entirely different invariants that are very specific to the respective type. + +To be able to handle this diversity, we have to *change our idea of what a type is*. +I've said above that a type is all about the question: *Given some sequence of bytes, is this sequence a valid inhabitant of the type?* +We now extend this by saying a type *also* has to be able to answer the question: *Given a pointer and a lifetime, is this pointer a valid shared reference to this type for the given lifetime?* +In other words, **every type gets to *pick* for itself what it means to share this type**. + +While I let that sink in, let me remark that this also explains why Rust has `Send` *and* `Sync`, and why neither implies the other even though they seem so closely related. +This has certainly puzzled me when I first learned about these marker traits. +The reason is that `T` and `&T` are really very different types; the author of `T` picks the invariants for both of these types (mostly) *independently* -- so one of them being safe to send to another thread has no bearing on the other one. + +The main consequence of this choice is that for every type with extra invariants, we have to pick *two* invariants: +1. An invariant specifying the sequences of bytes that are valid for the *owned* case (this includes mutable references), and +1. An invariant specifying the pointers that are valid for the *shared* case at a given lifetime. + +In principle, this applies to *all* types, even types like `Vec` that don't really care about interior mutability themselves. +However, because we have to handle types like `&Vec>`, we cannot just skip this problem; interior mutability as a feature "infects" the entire type system. +This should however usually not be a big problem: `&'a Vec` can just have the invariant that `len` and `cap` are read-only, and all the initialized elements of the vector are themselves shared for lifetime `'a`. +This strategy of just using the shared invariant of the "inner type" and making the rest read-only should be applicable to the vast majority of types. + +## Unsharing + +Of course, the two invariants making up a type cannot be entirely unrelated. +First of all, given some `y: T`, it must be possible to somehow "start sharing" it and obtain `&y: &T`. +Moreover, if `T: Copy`, then it must be possible to read `*x` where `x: &T` and obtain a copy of a `T`. +This was automatically the case in our first model where sharing was uniformly defined as being read-only memory, but now that we made sharing "configurable", we have to explicitly require this property. + +However, that's not enough: At some point, sharing will end, and some part of the program expects the owned invariant to hold again. +Somehow, everything that happened when sharing started must be un-done -- I think of this as "un-sharing". +This is tricky because shared references can be, well, shared (d'oh). +Even for the simple case of read-only sharing, we need to make sure that *nobody* still uses the old read-only access to the memory covered by the shared reference. +In Rust, this works because the lifetime of these old shared references will have expired when the "un-sharing" happens. + +This is why, in our model, *sharing is inherently connected to a lifetime*. +If you go back to what I wrote above about a type being two invariants, you will notice that the second invariant is about sharing an inhabitant of `T` *for some lifetime*. +There is no sharing without a lifetime. +Whenever you start sharing something (i.e., whenever a `&x` is created from previously owned data), our model demands that you announce the lifetime for which this sharing will last. +Once the lifetime ends, un-sharing happens automatically, so we don't need to explicitly talk about it in our model. +(Of course, the proof effort related to this does not just go away. Indeed we now have to follow the rules for what one can do with things that are only owned for some lifetime, and those rules are restrictive enough to guarantee that un-sharing is always possible.) + +## Conclusion + +I have explained the idea of types having a separate "sharing invariant" that we came up with in the RustBelt paper to be able to handle interior mutability. +To my knowledge, this is the only formal treatment of interior mutability to date, so we were all kind of stabbing in the dark until we found an approach that worked. +This is not so say that the last word on interior mutability has been spoken; there may well be other ways of justifying why `Mutex` or `Cell` are safe. + +Either way, this model has certainly helped me not only in the formal work on RustBelt, but also in thinking about the Rust type system itself. +For example, I briefly wondered why `Cell` is not covariant. +Indeed, if `U <: T`, then it is perfectly safe to turn a `Cell` into a `Cell` -- that is to say, whenever some sequence of bytes is valid at `Cell`, it is also valid at `Cell`. +(In fact, the owned invariant for `Cell` is *exactly the same* as the one for `T`!) +That's precisely what we need for covariance, right? +Well, not exactly -- because our types have two invariants, we also have to require that whenever a pointer is a shared `Cell`, it is a shared `Cell`, and that is decisively *not* true! +So, *owned* `Cell`s are covariant, but shared `Cell`s are not. +One could think of an extension to the Rust type system that separates owned subtyping from shared subtyping, and then `Cell` could indeed be "owned-covariant". + +In closing this post, I'd like to leave you with some more general thoughts. +One way to explain the role of shared references in the Rust type system more abstractly is that `&T` switches `T` into "shared mode". +Every type comes with two modes, "owned" and "shared", and we typically think only of the "owned" mode because that's the one that's relevant for `x: T`. +However, Rust provides a way to switch into the other mode, and types can have a completely different "face" there -- for example, `Cell` and `T` are *equal* in "owned" mode, but very different in "shared" mode. +I don't know of any other type system having a comparable concept. +However, I've also not seen Rust's shared references being discussed as a super-special magic feature (though, after doing this research, I think they are!) -- it's rather the mutable references that seem special due to their unique nature (and this *was* a revolution, called mutpocalypse). +So, maybe this is some extremely ingenious idea that the Rust devs had without even realizing, or maybe it's just some artifact of our model that's not worth losing many words about. +(I do have my opinion, of course. ;) + +And then there is the question: Why exactly two modes? Would it be worth having more? +I honestly don't know, but I had to think of this a while back when I read [this post by glaebhoerl](https://internals.rust-lang.org/t/impl-t-clone-for-cell-rc-t/5397), the first part of which I would translate as "`Cell` could be considered a mode". + +If you have any comments or thoughts on this, please join the discussion! +I will create a thread in the Rust internal forums shortly. +I'd also be interested in feedback on how understandable this post is; this is my first attempt at translating research results into a blog post.