From: Ralf Jung Date: Thu, 5 Apr 2018 15:41:24 +0000 (+0200) Subject: explicit ownership: points_to_owned X-Git-Url: https://git.ralfj.de/web.git/commitdiff_plain/a08040916214f848a42b4b9fdb1038596f491714 explicit ownership: points_to_owned --- diff --git a/ralf/_posts/2018-04-05-a-formal-look-at-pinning.md b/ralf/_posts/2018-04-05-a-formal-look-at-pinning.md index e3c7412..4aed608 100644 --- a/ralf/_posts/2018-04-05-a-formal-look-at-pinning.md +++ b/ralf/_posts/2018-04-05-a-formal-look-at-pinning.md @@ -118,31 +118,31 @@ The full formalism we use in the paper is probably overkill, so I will go with a For example, the axiom (a) stated above would look as follows: ``` forall |'a, ptr| - borrowed('a, exists |bytes| ptr.points_to(bytes) && T.own(bytes)) + borrowed('a, exists |bytes| ptr.points_to_owned(bytes) && T.own(bytes)) -> T.shr('a, ptr) ``` I am using the usual mathematical quantifiers, with a Rust-y syntax (`forall |var| P` and `exists |var| P`), and `->` for implication. For disjunction (`||`) and conjunction (`&&`), I follow Rust. We also need some way to talk about ownership and contents of memory: -I will write `ptr.points_to_bytes(bytes)` (where `ptr: Pointer` and `bytes: List`) to express that `ptr` points to `bytes.len()` many bytes of memory containing the given `bytes` of data, and that moreover, we *own* that memory. +I will write `ptr.points_to_owned_bytes(bytes)` (where `ptr: Pointer` and `bytes: List`) to express that `ptr` points to `bytes.len()` many bytes of memory containing the given `bytes` of data, and that moreover, we *own* that memory. Ownership here means that we are free to read, write and deallocate that memory. Frequently, it is more convenient to not talk about raw lists of bytes but about some higher-level representation of the bytes, and not care about how exactly that data is laid out in memory. -For example, we might want to say that `ptr` points to `42` of type `i32` by saying `ptr.points_to(42i32)`. -We can define a suitable `points_to` as +For example, we might want to say that `ptr` points to `42` of type `i32` by saying `ptr.points_to_owned(42i32)`. +We can define a suitable `points_to_owned` as ``` -ptr.points_to(data: U) where List: TryInto := - exists |bytes| bytes.try_into() == Ok(data) && ptr.points_to_bytes(bytes) +ptr.points_to_owned(data: U) where List: TryInto := + exists |bytes| bytes.try_into() == Ok(data) && ptr.points_to_owned_bytes(bytes) ``` Here, we (ab)use the `TryInto` trait to convert a properly laid out list of bytes into something of type `U`. The conversion fails, in particular, if the list does not have exactly the right length. -If `U == List`, this is just the same as `ptr.points_to_bytes`. +If `U == List`, this is just the same as `ptr.points_to_owned_bytes`. -**Update:** In a previous version I called these predicates `mem_own_bytes` and `mem_own`. I decided that `points_to` was actually easier to understand, and it also matches the standard terminology in the literature, so I switched to using that term instead. **/Update** +**Update:** In a previous version I called these predicates `mem_own_bytes` and `mem_own`. I decided that `points_to_owned` was actually easier to understand, and it also matches the standard terminology in the literature, so I switched to using that term instead. **/Update** Finally, `borrowed('a, P)` says that `P` is only temporarily owned, i.e., borrowed, for lifetime `'a`. -`P` here is a *proposition* or *assertion*, i.e., a statement about what we expect to own. The axiom above is a proposition, as is `ptr.points_to(42i32)`. +`P` here is a *proposition* or *assertion*, i.e., a statement about what we expect to own. The axiom above is a proposition, as is `ptr.points_to_owned(42i32)`. You can think of propositions as a fancy version of `bool` where we can use things like quantifiers or borrowing, and talk about memory and ownership. Let us see how we can define the owned typestate of `Box` and mutable reference using this notation: @@ -152,10 +152,11 @@ A `Box` is a list of bytes that make up a pointer, such that we own the memor `&'a mut T` is almost the same, except that memory and `T.own` are only borrowed for lifetime `'a`. Formally: ``` -Box.own(bytes) := exists |ptr, inner_bytes| - bytes.try_into() == Ok(ptr) && ptr.points_to(inner_bytes) && T.own(inner_bytes) +Box.own(bytes) := exists |ptr, inner_bytes| bytes.try_into() == Ok(ptr) && + ptr.points_to_owned(inner_bytes) && T.own(inner_bytes) (&'a mut T).own(bytes) := exists |ptr| bytes.try_into() == Ok(ptr) && - borrowed('a, exists |inner_bytes| ptr.points_to(inner_bytes) && T.own(inner_bytes)) + borrowed('a, + exists |inner_bytes| ptr.points_to_owned(inner_bytes) && T.own(inner_bytes)) ``` It turns out that using `try_into` like we do above is actually a common pattern. @@ -164,9 +165,9 @@ To facilitate this, we will typically define `T.own(data: U)` for some `U` such This could look as follows: ``` Box.own(ptr: Pointer) := - exists |inner_bytes| ptr.points_to(inner_bytes) && T.own(inner_bytes) + exists |inner_bytes| ptr.points_to_owned(inner_bytes) && T.own(inner_bytes) (&'a mut T).own(ptr: Pointer) := - borrowed('a, exists |inner_bytes| ptr.points_to(inner_bytes) && T.own(inner_bytes)) + borrowed('a, exists |inner_bytes| ptr.points_to_owned(inner_bytes) && T.own(inner_bytes)) ``` The actual ownership predicate is then defined as ``` @@ -197,7 +198,7 @@ This gives us the expressivity we need to talk about immovable data: Moreover, if `self_ref` is set to `Some(data_ptr)`, then `data_ptr` is the address of the first field of the pair: ``` SelfReferential.pin(ptr) := exists |data: i32, self_ref: Option>| - ptr.points_to((data, self_ref)) && + ptr.points_to_owned((data, self_ref)) && match self_ref { Some(data_ptr) => data_ptr.as_ptr() == ptr.offset(0), None => True } ``` The most important part of this is the last line, saying that if `data_ptr` is a `Some`, it actually points to the first field (at offset `0`). @@ -235,7 +236,7 @@ This seems like a reasonable principle to require in general, so we add it to ou > {:type="a" start="2"} 2. If we own a pointer `ptr` pointing to a list of bytes `bytes` such that `T.own(bytes)` holds, then we can move to the pinned typestate and obtain `T.pin(ptr)`. Formally: ``` -forall |ptr, bytes| (ptr.points_to(bytes) && T.own(bytes)) -> T.pin(ptr) +forall |ptr, bytes| (ptr.points_to_owned(bytes) && T.own(bytes)) -> T.pin(ptr) ``` `PinBox::new` can now be easily implemented using `Box::new` and `PinBox::from`. @@ -243,7 +244,7 @@ forall |ptr, bytes| (ptr.points_to(bytes) && T.own(bytes)) -> T.pin(ptr) For `PinBox::as_pin`, we start with a `&'a mut PinBox`, which is a pointer-to-pointer, and return the inner pointer as `Pin<'a, T>`. This is justified because we start with a borrow for lifetime `'a` of a pointer to some `bytes` that satisfy (for the given lifetime) `PinBox.own(bytes)`: ``` -borrowed('a, exists |bytes| ptr.points_to(bytes) && PinBox.own(bytes)) +borrowed('a, exists |bytes| ptr.points_to_owned(bytes) && PinBox.own(bytes)) ``` Unfolding the `PinBox.own`, we can deduce that `bytes` actually form a pointer `inner_ptr` such that `T.pin(inner_ptr)` (for the given lifetime). This exactly matches our return type, so we're good! @@ -275,7 +276,7 @@ PinBox.shr('a, ptr) := exists |inner_ptr| ptr.points_to_read_only('a, inner_ptr) && T.shr('a, inner_ptr) ``` This requires a way to talk about memory that is read-only for the duration of some lifetime. -We assume we have a predicate `ptr.points_to_only_bytes('a: Lifetime, bytes: List)` saying that `ptr` points to `bytes.len()` many bytes of valid memory containing `bytes`, and that for lifetime `'a`, we are free to read that memory and it will not change. +We assume we have a predicate `ptr.points_to_read_only_bytes('a: Lifetime, bytes: List)` saying that `ptr` points to `bytes.len()` many bytes of valid memory containing `bytes`, and that for lifetime `'a`, we are free to read that memory and it will not change. We then define a convenient variant based on higher-level memory representations as usual: ``` ptr.points_to_read_only('a: Lifetime, data: U) where List: TryInto := @@ -296,7 +297,7 @@ Formally speaking: > **Definition 6: `Unpin`.** A type `T` is `Unpin` if from `T.pin(ptr)` we can deduce that we own the pointer `ptr` and it points to a list of bytes `bytes` such that `T.own(bytes)` holds. Formally: ``` -forall |ptr| T.pin(ptr) -> (exists |bytes| ptr.points_to(bytes) && T.own(bytes)) +forall |ptr| T.pin(ptr) -> (exists |bytes| ptr.points_to_owned(bytes) && T.own(bytes)) ``` Note that this is exactly the inverse direction of axiom (b) added in definition 2b: For `Unpin` types, we can freely move between the owned and pinned typestate. @@ -313,7 +314,7 @@ One advantage of this proposal is that it is *local*: Existing types (and new types) that are designed without considering `Pin` remain sound in the presence of this new typestate, even if we automatically derive the `Unpin` trait for these types. The reason this works is that we can always define `T.pin(ptr)` as saying that we fully own `ptr` pointing to `bytes` such that we have `T.own(bytes)`: ``` -T.pin(ptr) := exists |bytes| ptr.points_to(bytes) && T.own(bytes) +T.pin(ptr) := exists |bytes| ptr.points_to_owned(bytes) && T.own(bytes) ``` This satisfies the additional axioms (b) and (c) connecting the pinned typestate to the others, and it also satisfies `Unpin`. The latter is crucial, because it means we can automatically derive `Unpin` instances through the auto trait mechanism and do not have to review the entire ecosystem for `Pin`-compatibility.