X-Git-Url: https://git.ralfj.de/web.git/blobdiff_plain/021466ba05c160cfa526c7946a14c6ac4763b480..2b47ffb5fd8cc2b4027bb9347c6144bb9c66e846:/personal/_posts/2018-04-10-safe-intrusive-collections-with-pinning.md?ds=sidebyside diff --git a/personal/_posts/2018-04-10-safe-intrusive-collections-with-pinning.md b/personal/_posts/2018-04-10-safe-intrusive-collections-with-pinning.md index a3f0cca..0ffae02 100644 --- a/personal/_posts/2018-04-10-safe-intrusive-collections-with-pinning.md +++ b/personal/_posts/2018-04-10-safe-intrusive-collections-with-pinning.md @@ -4,7 +4,7 @@ categories: research rust forum: https://internals.rust-lang.org/t/safe-intrusive-collections-with-pinning/7281 --- -In my [last post]({{ site.baseurl }}{% post_url 2018-04-05-a-formal-look-at-pinning %}), I talked about the new ["pinned references"](https://github.com/rust-lang/rfcs/blob/master/text/2349-pin.md) which guarantee that the data at the memory it points to will not, ever, be moved elsewhere. +In my [last post]({% post_url 2018-04-05-a-formal-look-at-pinning %}), I talked about the new ["pinned references"](https://github.com/rust-lang/rfcs/blob/master/text/2349-pin.md) which guarantee that the data at the memory it points to will not, ever, be moved elsewhere. I explained how they enable giving a safe API to code that could previously only be exposed with `unsafe`, and how one could go about proving such a thing. This post is about another application of pinned references---another API whose safety relies on the pinning guarantees: Intrusive collections. It turns out that pinned references can *almost* be used for this, but not quite. @@ -141,7 +141,7 @@ This is possible despite there being no guarantee that the entry will outlive th Then we `drop` the entry while the collection still exists, and we can see it has vanished from the collection as well. Notice that using `Pin` in the `insert` method above is crucial: If the collection of the entry were to move around, their respective pointers would get stale! -This is fundamentally the same problem as [`SelfReferential` in my previous post]({{ site.baseurl }}{% post_url 2018-04-05-a-formal-look-at-pinning %}), and `Pin` helps. +This is fundamentally the same problem as [`SelfReferential` in my previous post]({% post_url 2018-04-05-a-formal-look-at-pinning %}), and `Pin` helps. Thanks to this guarantee, and unlike in the intrusive-collections crate, `insert` can be called with entries *that do not outlive the collection*. With an [API for stack pinning](https://github.com/rust-lang/rfcs/blob/master/text/2349-pin.md#stack-pinning-api-potential-future-extension), we could even have put the `entry` in `main` on the stack. @@ -179,7 +179,7 @@ fn main() { } {% endhighlight %} -Now, `PinBox::deallocate` is pretty artificial, but in fact the [API for stack pinning](https://github.com/rust-lang/rfcs/blob/master/text/2349-pin.md#stack-pinning-api-potential-future-extension) that is drafted in the RFC, together with [`ManuallyDrop`](https://doc.rust-lang.org/beta/std/mem/union.ManuallyDrop.html), make it possible to obtain a `Pin` to a stack-allocated `T` and subsequently pop the stack frame without calling `drop` on the `T`. +Now, `PinBox::deallocate` is pretty artificial, but in fact the [API for stack pinning](https://github.com/rust-lang/rfcs/blob/master/text/2349-pin.md#stack-pinning-api-potential-future-extension) that is drafted in the RFC, together with [`ManuallyDrop`](https://doc.rust-lang.org/stable/std/mem/struct.ManuallyDrop.html), make it possible to obtain a `Pin` to a stack-allocated `T` and subsequently pop the stack frame without calling `drop` on the `T`. That has the same effect as `PinBox::deallocate`: It renders our collection interface unsound. The concern about dropping pinned data is real. @@ -234,7 +234,7 @@ For now, that seems worth it; if one day we decide that pinning ought to be more ## 2 The Formal Perspective -In this second part of the post, we are going to try and apply the formal methodology from the [previous post]({{ site.baseurl }}{% post_url 2018-04-05-a-formal-look-at-pinning %}) to the intrusive collection example above. +In this second part of the post, we are going to try and apply the formal methodology from the [previous post]({% post_url 2018-04-05-a-formal-look-at-pinning %}) to the intrusive collection example above. I am going to assume that you have read that post. ### 2.1 The Intrusive Collection Invariant @@ -266,7 +266,7 @@ Collection.pin(ptr) := exists |entries: List| ) ``` Notice how we iterate over the elements of the list and make sure that we own whatever memory is to own there. -(I love how `all` [really exists for iterators](https://doc.rust-lang.org/beta/std/iter/trait.Iterator.html#method.all) so I can express quantification over lists without any hassle. :) +(I love how `all` [really exists for iterators](https://doc.rust-lang.org/stable/std/iter/trait.Iterator.html#method.all) so I can express quantification over lists without any hassle. :) This invariant already justifies `print_all`: All the entries we are going to see there are in the list, so we have their `T.pin` at our disposal and are free to call `Debug::fmt`. @@ -401,7 +401,7 @@ That's a good question! It seems perfectly fine to change `insert` to take `&Pin>`. I can't come up with any counter-example. However, the formal model also cannot justify this variant of `insert`: Our model as defind previously defines `Pin<'a, T>.shr` in terms of `T.shr`. -That made it easy to justify [`Pin::deref`](https://doc.rust-lang.org/nightly/std/mem/struct.Pin.html#method.deref). +That made it easy to justify `Pin::deref`. However, as a consequence, `Pin<'a, T>.shr` and `(&'a T).shr` are literally the same invariant, and hence `&Pin` and `&&T` *are the same type*. We could literally write functions transmuting between the two, and we could justify them in my model. Clearly, `insert` taking `entry: &&T` would be unsound as nothing stops the entry from being moved later: @@ -450,7 +450,7 @@ I feel bad about that. Can we still fix this before everything gets stabilized? Others [have](https://github.com/rust-lang/rfcs/pull/2349#issuecomment-373206171) [argued](https://github.com/rust-lang/rfcs/pull/2349#issuecomment-378555691) for a shared pinned reference after it got removed from the API, and I argued against them as I did not understand how it could be useful. Thanks for not being convinced by me! -However, there is one strange aspect to this "shared pinned" typestate: Due to [`Pin::deref`](https://doc.rust-lang.org/beta/std/mem/struct.Pin.html#method.deref), we can turn a "shared pinned" reference into a shared reference. +However, there is one strange aspect to this "shared pinned" typestate: Due to `Pin::deref`, we can turn a "shared pinned" reference into a shared reference. We can go from `&Pin` to `&&T`. In other words, the shared pinned typestate invariant must *imply* the invariant for the (general, unpinned) shared typestate. The reason for `Pin::deref` to exist is mostly a rustc implementation detail [related to using `Pin` as the type of `self`](https://github.com/rust-lang/rfcs/pull/2349#issuecomment-372475895), but this detail has some significant consequences: Even with a separate shared pinned typestate, we can still not define `RefCell` in a way that a pinned `RefCell` pins its contents. @@ -459,7 +459,7 @@ Removing `Pin::deref` (or restricting it to types that implement `Unpin`) would I spelled out the details [in the RFC issue](https://github.com/rust-lang/rfcs/pull/2349#issuecomment-372109981). So, if we want to declare that shared pinning is a typestate in its own right---which overall seems desirable---do we want it to be restricted like this due to an implementation detail of arbitrary self types? -**Update:** @Diggsey [points out](https://github.com/rust-lang/rfcs/pull/2349#issuecomment-379230538) that we can still have a `PinRefCell` with a method like `fn get_pin(self: Pin>) -> Pin`, as long as the `PinRefCell` never gives out mutable references. So it turns out that combining interior mutability and pinning should work fine. Later, @glaebhoerl suggested we can even [combine `RefCell` and `PinRefCell` into one type if we dynamically track the pinning state](https://internals.rust-lang.org/t/safe-intrusive-collections-with-pinning/7281/11?u=ralfjung). **/Update** +**Update:** @Diggsey [points out](https://github.com/rust-lang/rfcs/pull/2349#issuecomment-379230538) that we can still have a `PinRefCell` with a method like `fn get_pin(self: Pin>) -> Pin`, as long as the `PinRefCell` never gives out mutable references. So it turns out that combining interior mutability and pinning should work fine. Later, @glaebhoerl suggested we can even [combine `RefCell` and `PinRefCell` into one type if we dynamically track the pinning state](https://internals.rust-lang.org/t/safe-intrusive-collections-with-pinning/7281/11?u=ralfjung). **/Update** ## 4 Conclusion @@ -472,4 +472,9 @@ The situation around shared pinning is still open, and it seems we need to have Anyway, as usual, please [let me know what you think](https://internals.rust-lang.org/t/safe-intrusive-collections-with-pinning/7281)! +**Update:** Years later, I finally realized that there still is a major problem with intrusive collections -- it is basically impossible to have a by-reference iterator! +With the collection not actually owning the memory the elements are stored in, it is basically impossible to guarantee that during the iteration, elements do not get removed from the collection, which would invalidate the reference. +However, for many use-cases (like an intrusive queue) it is sufficient to just support by-value operations such as "enqueue" and "dequeue" over such a collection, and those can be given an entirely safe interface. +If iteration is required, then more advanced [techniques to control sharing](https://plv.mpi-sws.org/rustbelt/ghostcell/) seem to be needed. **/Update** + #### Footnotes