link to sloonz's script

[web.git] / ralf / _posts / 2017-07-17-types-as-contracts.md
diff --git a/ralf/_posts/2017-07-17-types-as-contracts.md b/ralf/_posts/2017-07-17-types-as-contracts.md

deleted file mode 100644 (file)

index 8ad613f..0000000
--- a/ralf/_posts/2017-07-17-types-as-contracts.md
+++ /dev/null
@@ -1,402 +0,0 @@
----
-title: "Types as Contracts"
-categories: internship rust
-forum: https://internals.rust-lang.org/t/types-as-contracts/5562
----
-
-Over the last couple of weeks of my [internship]({{ site.baseurl }}{% post_url
-2017-05-23-internship-starting %}), I have been working on a proposal for the "Unsafe Code Guidelines".
-I got some very encouraging feedback at the [Mozilla All-Hands](https://internals.rust-lang.org/t/recapping-this-weeks-mozilla-all-hands/5465),
-and now I'd like to put this proposal out there and start having a discussion.
-<!-- MORE -->
-Warning: Long post is long.  You have been warned.
-
-If you need more context on this topic, I suggest you read my post on [unsafe code guidelines and undefined behavior]({{ site.baseurl }}{% post_url 2017-07-14-undefined-behavior %}) first.
-
-## 1 It's in the Types
-
-Consider the following simple function:
-{% highlight rust %}
-fn simple(x: &mut i32, y: &mut f32) -> i32 {
-    *x = 3;
-    *y = 4.0;
-    *x
-}
-{% endhighlight %}
-
-As discussed in my previous post, we would like an unsafe Rust program that calls this function with aliasing pointers to be UB.
-I'd like to propose that the fundamental reason this is UB is the *type* of the function and, in particular, its arguments: `&mut` expresses uniqueness, so the memory pointed to by an `&mut` must not alias with anything else.
-Borrowing some terminology from related fields, we can see the function type as stating a *contract*:  The `&mut` say that `simple` expects to be called with non-aliasing pointers.
-If we violate the contract, all bets are off.
-
-To realize this idea, we need to define, for every type, what is the contract associated with that type. When are elements of any given type *valid*?
-For some types, that's simple: `i32` is any four-byte value. `bool` is one byte, either `0` or `1`.
-Compound types like `(T, U)` are defined in terms of their constituent types:  The first component must be a valid `T`, the second component a valid `U`.
-
-Where it gets interesting is the reference types: `&mut T` says that we have a (non-null, properly aligned) pointer.
-Whatever the pointer points to must be a valid `T`.
-*Moreover*, the memory range covered by this pointer (whose extent is given by the size of `T`) is exclusively accessed by this function -- no concurrent thread performs any access, and when we call unknown code and do *not* pass on the reference, we can rely on that unknown code not accessing this memory either.
-
-Similarly, for `&T`, we can rely on no-one performing any write access to this memory.
-At the same time, the contract for `&T` also says that we ourselves will not write to this memory either.
-This is a nice example showing that contracts go both ways:  The function can rely on the other party (the caller) having done their part by providing valid arguments, but the function also has to make guarantees itself.
-Another example of a guarantee provided by functions is their promise to provide a valid return value when they are done.
-
-My proposal for the unsafe code guidelines then is to validate the contract described by a function's types at the function boundaries, and to make it UB for the contract to be violated.
-In particular, `simple` would validate `x` and `y` immediately after being called, and it would be UB for them to alias.
-
-
-## 2 Specification Framework
-
-I have [previously written about my thoughts on undefined behavior]({{ site.baseurl }}{% post_url 2017-07-14-undefined-behavior %}):
-
-> I also think that tooling to *detect* UB is of paramount importance [...].
-In fact, specifying a dynamic UB checker is a very good way to specify UB!
-Such a specification would describe the additional state that is needed at run-time to then *check* at every operation whether we are running into UB.
-
-Following this, I will in the following describe how we could check the contracts expressed by Rust's types for violations.
-I am considering a MIR program; if a Rust program is supposed to be checked, it has to be translated to MIR first.
-(You can [read up on my explanation]({{ site.baseurl }}{% post_url 2017-05-23-internship-starting %}) of why I think this is a good way to proceed.)
-The validation of reference types (in particular, their exclusive / read-only nature) will need some extra "instrumented" state -- also see [this previous post on the topic]({{ site.baseurl }}{% post_url 2017-06-06-MIR-semantics %}).[^1]
-The extra state in my proposal consists of something akin to a reader-writer lock for every memory location.
-I am first going to describe these locks and how they affect program behavior, before explaining contract validation.
-
-[^1]: If at this point you are under the impression that all these previous posts have been collectively building up towards this proposal, you are not entirely wrong.
-
-This is not (yet) a proper specification in that some important details are still left open.
-The goal is to lay down a framework as the basis for discussion.
-Many things already work as expected, and hopefully the remaining issues can be sorted out by tuning some of the details.
-
-### 2.1 Memory Locks
-
-Every location comes with an additional reader-writer "lock".
-These locks have nothing to do with concurrency, in fact they are more like `RefCell`.
-Locks are used to verify the guarantees provided by references:
-Shared references point to memory that is not mutated by anyone, and mutable references point to memory that is not *accessed* (read from or written to) by anyone *else*.
-Notably, however, unlocked memory is free to be accessed by anyone -- nobody is making any particular assumptions about this memory.
-
-To track these locks, we store a `locks: Vec<LockInfo>` for every location, with `LockInfo` defined as:
-{% highlight rust %}
-struct LockInfo {
-    write_lock: bool,
-    frame: usize,
-    lft: Option<Lifetime>,
-}
-{% endhighlight %}
-(Notice that an implementation of this specification is free to track locks in a different way, as long as the effective behavior is the same.
-The reason I am giving concrete types here is just to make the specification more precise.)
-
-When a lock is acquired, we record which stack frame is the owner of this lock (starting at 0 for `main`, and counting up).
-Now, on every read access, we check whether there is a write lock of this location held *by another function*.
-If yes, the access is UB.
-Furthermore, write accesses and acquiring a write lock are UB if there is a read lock held *or* if there is a write lock held by another function.
-Finally, acquiring a read lock is UB if there is a write lock held by anyone, including the function requesting a read lock.
-
-Locks are typically released automatically when they expire.
-To this end, the function acquiring a lock can optionally specify the *lifetime* of the lock.
-Not specifying a lifetime implies that the lock is held until the function returns.
-For the purpose of this specification, it does not matter what exactly a lifetime is; it is only relevant at which point it *ends*.
-Lucky enough, MIR recently gained [`EndRegion` statements](https://github.com/rust-lang/rust/pull/39409) that we can use for this purpose:
-When `EndRegion(lft)` is executed, all locks with lifetime `lft` are released.
-Beyond this, a write lock can also be explicitly released by the function that acquired it, even before the lock's lifetime ends.
-(It does not turn out to ever be necessary to do this for read locks in my proposal.)
-
-There is one more mechanism we need to introduce:  Acquired write locks can be *suspended* for a lifetime.
-This means the lock is temporarily released, but meant to be reacquired later.
-To this end, the `LockInfo` is removed from the location's `locks` and put into a `HashMap<Lifetime, (Location, LockInfo)>` which is stored somewhere separately from the program's memory.
-When the `EndRegion` corresponding to the given lifetime is reached, the lock is reacquired and put back into the `locks` of the given location.
-
-### 2.2 Validation
-
-With the lock mechanism established, we can now look at the core of the proposal:  Validating the contract described by a type.
-This is done by a new MIR statement, `Validate`, which takes two arguments:  An *operation* (which we will discuss shortly), and a list of `(Lvalue, Ty)` pairs defining which lvalues is supposed to be validated at which type.
-Crucially, this is a non-erased type -- all the lifetime information must be present.
-
-The most important validation operation is *acquire validation*.
-As the name indicates, this is about acquiring the appropriate locks.
-It also otherwise makes sure that values are valid at the given type, e.g., references have to be non-NULL and aligned.
-
-The core of acquire validation is an operation with a signature roughly like this:
-{% highlight rust %}
-fn acquire_validate(lval: Lvalue, ty: Ty, mutbl: Mutability, lft: Option<Lifetime>)
-{% endhighlight %}
-Why are mutability and lifetime showing up here?
-The reason is that reference types have a "transitive" effect down the type hierarchy: `&T` makes everything reachable through this reference read-only, even if `T` involves `&mut`.
-Similarly, in case of nested references, it is the *outermost* reference that defines the lifetime for which access through this pointer is granted.
-(The compiler makes sure that inner references always have longer lifetimes than outer references.)
-
-`acquire_validate` recursively traverses the type hierarchy.
-When it encounters a primitive type like `i32`, it acquires a lock of `size_of::<ty>` many locations starting at `lval` for lifetime `lft` -- this will be a write or read lock depending on whether `mutbl` says this data is mutable or not.[^2]
-In other words, memory is locked "at the leaves" of the type structure, rather than locking immediately when we see a reference type.
-We will come back later to why this is necessary.
-Validation also makes sure that the value stored in memory is valid at the given type.
-For a `u32`, this just amounts to checking that the memory is properly initialized.
-
-[^2]: For simplicity, we pretend lvalues are just locations in memory.  Actually, lvalues can also refer to variables not in memory (like non-addressable stack locals), but these do not need to be locked.  Furthermore, lvalues for unsized types carry the additional information stored in a fat pointer, i.e., a length (for slices) or a vtable (for trait objects); this is handled appropriately when validating the respective types.
-
-For compound types like tuples, `struct` or `enum`, validation proceeds recursively on the fields.
-In particular, the enum discriminant is checked to be in-range (in particular, this empty enums always invalid).
-However, because the recursive validation will only lock memory that's actually covered by a field, we also have to acquire the appropriate lock for padding bytes and the enum discriminant here.
-
-Finally, at a reference type, two things happen.
-First, the reference itself is stored in memory somewhere; this memory has to be locked just like the validation of `i32` locks the memory used to store the integer.
-The reference is also checked to be non-NULL and properly aligned for the type it points to.
-Furthermore, validation proceeds recursively after *dereferencing* the reference.
-Crucially, the `mutbl` and `lft` for this recursive call are taking the reference type into account:
-If `lft` was `None` and this reference's lifetime ends within the function (i.e., there is a corresponding `EndRegion` somewhere), it is now set to the reference's lifetime.
-If `mutbl` was mutable, it becomes immutable when following a shared reference.
-
-There are two other validation operations besides acquire validation: release validation, and suspend validation.
-These serve to release the appropriate locks before handing references to foreign code, and to handle the operation of taking a reference.
-(Bear with me for just a little longer, we will see examples of this soon.)
-Suspend validation takes an additional parameter, namely the lifetime for which locks are to be suspended.
-Both operations recursively follow the type just like acquire validation, and for all memory they visit, they release/suspend *write* locks.
-Read locks are not touched, which corresponds to the fact that shared references are `Copy` and hence never "given up":
-Shared references always remain usable until their lifetime expires, at which point the read lock is released automatically.
-
-Validation statements are emitted during MIR construction in the following places:
-- At the beginning of the function, we perform acquire validation of the arguments.
-- When taking a reference, we suspend the referee for the reference lifetime and acquire the referent.
-- When calling another function, we release the arguments and acquire the return value.
-
-### 2.3 Examples
-
-After all this rather dry theory, let us look at some examples demonstrating how validation works, and how it enables optimizations.
-
-#### Validating function arguments
-
-Here's the sample code from above with validation annotated explicitly:
-{% highlight rust %}
-fn simple(x: &mut i32, y: &mut f32) -> i32 {
-    Validate(Acquire, [x, y]);
-    *x = 3;
-    *y = 4.0;
-    *x
-}
-{% endhighlight %}
-For simplicity's sake, I just pretend that validation commands were present not just in MIR, but also in the Rust surface language. Also, while validation takes *pairs* of lvalues and types, the types can be inferred, so I am only listing lvalues here.
-
-Let us now see what happens when `x` and `y` alias.
-I mentioned above that we want this case to be UB, because we want to be able to reorder the two stores.
-Indeed, it will turn out that the validation describes above triggers UB here.
-
-There are no references taken and no other function is called, so the only validation that happens is acquiring the arguments at the beginning of the function.
-This validates our contract with the caller.
-`x` is of reference type, so validation will first acquire the write lock for the memory used to store `x` itself (that's somewhere on the stack).
-Next, it will follow the reference and recursively validate whatever `*x` is valid at type `i32`.
-In the recursive call, the `lft` will still be `None` because the lifetime of the reference outlives the function call.
-The recursive call then acquires the write lock for `*x` and checks that memory is initialized.
-This completes validating `x`.
-Validation of `y` proceeds exactly the same way.
-In particular, validation will acquire a write lock for `*y`.
-However, since `x` and `y` alias, this will overlap the write lock that was acquired for `*x`!
-This is how we get the UB that we were looking for.
-
-In other words, acquiring locks for everything we can reach through references implicitly checks for aliasing, making sure that mutable references do not overlap with anything else.
-
-#### Validating function calls
-
-The second example demonstrates how validation, and memory locks in particular, let us reason about the actions of unknown external functions:
-{% highlight rust %}
-// Defined elsewhere: fn fun(x: &i32, z: &mut (i32, i32)) { ... }
-
-fn example(x: &i32, y: &mut i32, z: &mut (i32, i32)) {
-    Validate(Acquire, [x, y, z]);
-    let x_val = *x;
-    let y_val = *y;
-
-    Validate(Release, [x, z]);
-    fun(x, {z});
-
-    let x_val_2 = *x;
-    *y = y_val;
-}
-{% endhighlight %}
-We would like to optimize this code and eliminate the write to `y` at the end of the function, arguing that we are just writing back the value that's already stored there.
-The reason we think we can do that is that we hold an exclusive reference to `*y`.
-We do not pass `y` to `fun`, and there most not be any other reference to this memory, so `fun` cannot touch `y`.
-And indeed, the validation rules make it so that it is UB for `fun` to access `y`.
-
-Furthermore, we want to eliminate the load from `x` at the end, and instead re-use `x_val`, arguing that since `x` is a shared reference, it is UB for `nop` to change its value.
-
-In the beginning of `example`, during acquire validation, we acquire the read lock for `*x` and the write lock for `*y` and `*z`.
-The release validation before calling `fun` releases all write locks covered by `x` and `z`.
-For `x`, this doesn't actually do anything, but the write lock on `z` is released here.
-(We assume there is no implicit reborrowing of `z` going on here. The curly braces make it so that `z` is moved, not borrowed, to `fun`.)
-So, when `fun` is called, we hold the read lock for `x` and the write lock for `y`.
-Consequently, if `fun` were to access `y` or write to `x`, we know that's UB -- so we can rely on this not to happen and perform our desired optimizations.
-
-Notice that this optimization is extremely hard to perform e.g. in C, because to know that `fun` cannot access `y`, the compiler needs to do whole-program alias analysis -- a notoriously hard problem.
-In contrast, Rust's type system allows us to perform such optimizations intra-procedurally, without looking at other functions' code at all.
-
-#### Validating taking a reference
-
-In the last example, we deliberately *moved* `z` to `fun` to side-step any discussions about borrowing.
-However, typically, you would actually *reborrow* `z`, leading to additional validation calls around the operation that takes a reference.
-We will look at that case now.
-Let us also say that `fun` actually returns a reference with the same lifetime as `z`.
-Overall, our code now looks like this (with all reborrowing being explicit):
-{% highlight rust %}
-// Defined elsewhere: fn fun<'z>(x: &i32, z: &'z mut (i32, i32)) -> &'z mut i32  { ... }
-
-fn example(x: &i32, y: &mut i32, z: &mut (i32, i32)) {
-    Validate(Acquire, [x, y]);
-    let x_val = *x;
-    let y_val = *y;
-
-    {   // 'inner begins here
-        Validate(Suspend('z), [z]);
-        let z_for_fun = &'inner mut z;
-        Validate(Acquire, [z_for_fun]);
-
-        Validate(Release, [x, z_for_fun]);
-        let z_inner : &'inner mut i32 = fun(x, {z_for_fun});
-        Validate(Acquire, [z_inner]);
-
-        *z_inner = 42;
-    }   // 'inner ends here
-
-    let x_val_2 = *x;
-    *y = y_val;
-}
-{% endhighlight %}
-Again we would like to eliminate the write to `y` and the read from `x`.
-To eliminate the write to `y` as being redundant, we have to prove, in particular, that the write to `z_inner` cannot affect `*y`.
-
-As usual, we start by acquiring the read lock for `*x` and the write lock for `*y` and `*z`.
-When `z` gets reborrowed to be passed to `fun` with lifetime `'inner`, we *suspend* our lock of `*z` for this lifetime.
-This models the fact that, no matter what we do with the newly created reference, we know that when `'inner` ends, we get our exclusive access back.
-After the suspension, we don't hold a lock any more for what is now `*z_for_fun`, so we immediately acquire a write lock again.
-However, following the type of `z_for_fun`, we will acquire this lock only for lifetime `'inner`.
-(Releasing and then acquiring the write lock may sound somewhat redundant.
-However, notice that, if we were to take a shared reference, we would now acquire a read lock rather than a write lock.)
-
-Next, we call `fun`.
-At this point, we release our write lock on `*z_for_fun` again.
-However, notice that the suspended lock of `*z` stays around -- so while we gave up control of `z` (aka `z_for_fun`) for lifetime `'inner`, we still know that we will get control back when `'inner` ends!
-`fun` is now free to do whatever it wants, as long as it does not violate our still-existing write lock of `*y` and the read lock of `*x`.
-
-When `fun` returns, we obtain a new mutable reference.
-At this point, we again run acquire validation.
-In this case, we want to ensure that `z_inner` does not alias `y`.
-The locks ensure that `fun` cannot possibly change `y`, but we (`example`) can -- and if writing to `z_inner` were to change `*y`, that access would come from the stack frame holding the lock on `y`, so it would be permitted!
-However, we do want to still optimize away the write to `y` at the end of `example` as being redundant.
-To this end, we acquire-validate all data returned to us from other functions, making sure that references we obtain this way to not conflict with other references we already hold.
-Following the type of `z_inner`, we only acquire this lock for `'inner`.
-
-Another way to look at acquiring the return value is that we are validating the contract we have with `fun`: The contract says that `fun` should return a reference to us that comes with the uniqueness guarantees of `&mut`, so we better make sure that `fun` actually delivered what it said it would.
-
-Finally, `'inner` ends.
-Multiple things happen now:
-First, our lock of `z_inner` expires and is hence released.
-(We already released the lock of `z_for_fun` ourselves, so nothing is to be done there.)
-Next, remember that we suspended our lock of `z` for `'inner`; so now is the time that this lock gets reacquired.
-In other words, as the lifetime of the reborrow from `z` expires, we release all locks of memory that is tied to that lifetime, and then re-gain full exclusive access to `z`.
-
-When we come to the write to `*y`, we know that this write is redundant because (a) we held the write lock for `*y` all the time, so nobody else could have written to it without triggering UB, and (b) we know `z_inner` does not alias with `y` (or else we would already have UB), so we ourselves have not written to `*y` either.
-Hence, we may optimize away the write.
-
-### 2.4 `UnsafeCell`
-
-This concludes the examples.
-I hope you now have an idea of how memory locks and validation affect the execution when "bad" references are passed around, and how they validate the assumptions we need to make for optimizations that reorder memory accesses.
-
-One topic that did not come up yet is interior mutability.
-If a function takes an `x: &Cell<i32>`, following the rules above, it will acquire a read lock of `*x` for the duration of the function call, making `*x` immutable.
-Clearly, we do not want to do that -- calling `x.set` *will* actually mutate `*x`, and mutating through a shared reference is exactly the point of using `Cell`!
-
-Lucky enough, the compiler *already* says that interior mutability is only allowed via [`UnsafeCell`](https://doc.rust-lang.org/beta/core/cell/struct.UnsafeCell.html).
-We can use this for our purposes: To adjust validation for interior mutability, we *stop* our recursive descent and do not do anything when reaching an `UnsafeCell`.
-In particular, no locks are acquired.
-This justifies calling `set` on a shared reference and having the value changed.
-Of course, it also means we cannot do some of the optimizations we discussed above -- but that's actually exactly what we want!
-These optimizations are unsound in the presence of interior mutability.
-
-Notice that for this to work, it is crucial that we acquire locks "at the leaves" of the type structure, and not just when we encounter a reference.
-For example, if we validate something at type `&(i32, Cell<i32>)`, we *do* want to acquire a read lock for the first component, but not for the second.
-
-### 2.5 Unsafe Code
-
-There's one last issue that I want to cover before (finally) concluding this post: Unsafe code.
-So far, we have only considered entirely safe code, and we have seen how validation and memory locks enforce the guarantees associated with reference type.
-The type checker already ensures that safe code will never trigger a lock failure -- actually, you can see the validation statements as some form of "dynamic borrow checking", enforcing the same rules as the borrow checker does statically.
-The only way that these checks could ever fail is for unsafe code to feed "bad" values into safe code, and that is exactly the point:
-These proposed "unsafe code guidelines" restrict what unsafe code can do when interfacing with safe code.
-
-However, *within* unsafe code, we have to give programmers a little more leeway.
-We cannot always make as strong assumptions as we did in safe code:
-For example, let us look at `mem::swap` (somewhat desugared):
-{% highlight rust %}
-// Defined elsewhere: unsafe fn ptr::swap_nonoverlapping<T>(x: *mut T, y: *mut T) { ... }
-
-pub fn swap<T>(x: &mut T, y: &mut T) {
-    Validate(Acquire, [x, y]);
-    let x = x as *mut T;
-    let y = y as *mut T;
-    unsafe {
-        Validate(Release, [x, y]);
-        ptr::swap_nonoverlapping(x, y, 1);
-    }
-}
-{% endhighlight %}
-In the beginning, we acquire the write lock for `x` and `y`, establishing that the two pointers do not alias -- this is exactly as expected.
-However, when we release the arguments to `ptr::swap_nonoverlaping`, nothing actually happens:
-We are only passing raw pointers that do not come with any guarantees.
-In particular, we keep the write locks on `x` and `y`!
-Now, that's a problem -- `ptr::swap_nonoverlaping` will of course write to both of them when it swaps their content, triggering UB.
-
-First of all, let me note that this example actually demonstrates that we are validating the borrow checking gurantees correctly.
-As far as the borrow checker is concerned, `x` and `y` have never been borrowed or moved to anyone else, they were fully owned by `swap` all the time.
-
-The trouble is that the ownership transfer to `ptr::swap_nonoverlaping` is not described by its type, and hence cannot be known to the compiler.
-This is possible because `ptr::swap_nonoverlapping` is an *unsafe* function that can make more assumptions than its type says, and rely on the caller to uphold these assumptions.
-The compiler has no way to know what these assumptions are.
-
-To accommodate such code, I suggest that we emit fewer validation instruction when building MIR for functions containing unsafe blocks.
-For example, we could imagine that if a function contains an unsafe block, instead of the usual validation, we only perform an acquire in the beginning of the function, immediately followed by a release.
-This would verify that our arguments are well-formed, but would otherwise acknowledge that we just do not know how ownership is flowing inside such a function.
-Of course, this also disables some optimizations -- but maybe these are optimizations that we do not want to perform anyway in unsafe code, because they could end up breaking the program.
-
-Anyway, this part of my proposal is certainly not yet final.
-This topic needs input from many people so as to safely cover the unsafe code people write while still (hopefully) permitting the optimizations people expect to happen.
-One potential middle-ground here is having a function attribute that controls which validation statements are emitted, or even providing some form of intrinsics for programmers to explicitly perform validation.
-
-## 3 Conclusion
-
-I have described a mechanism to validate the contract expressed by Rust's types, in particular, the uniqueness and read-only guarantees provided by reference types.
-This validation describes what (safe) code can assume about its variables at run-time, and it enables some code motion optimizations that would otherwise require whole-program alias analysis.
-
-
-I said before that I think providing a way to *detect* UB is crucial, and the framework laid out above is designed with this in mind.
-In fact, I have been working on a miri-based implementation of this mechanism:
-[This rustc branch](https://github.com/RalfJung/rust/tree/mir-validate) adds the new `Validate` statement to MIR and adds a MIR pass to insert validation statements as described above, while [this miri branch](https://github.com/RalfJung/miri/tree/mir-validate) adds memory locks to miri and implements type-based validation.
-(Both of these branches are subject to frequent rebasing, so expect force pushes if you track them.)
-The implementation is not yet complete, and some of the terminology changed (it turns out writing things down helps you find good names :), so don't be confused if it looks a little different.
-Other than that though, feel free to play around and provide feedback.
-I hope the implementation can help us check whether unsafe code follows the rules I laid down here, and whether these rules match programmer intuition.
-(In fact, this is how I figured out that `mem::swap` violates the most strict interpretation of the rules -- I did expect a violation somewhere in libstd's unsafe code, but had no idea where it would show up.) 
-We could even have a test suite to aid in fine-tuning the rules.
-
-I also said that "we should strive for programmers' intuition agreeing with the [...] compiler".
-This is of course hard to measure, but one property that I think goes a long way here is that in this proposal, pointers can remain "thin".
-I probably should write a whole separate blog post on this topic, but the short version is that in my proposal, if a function has permission to access certain region of memory, then it does not matter how it performs the access -- if there are multiple pointers to this region around, they are all equally valid to use.
-This is in contrast to e.g. C's `restrict` or type-based alias analysis, which make a distinction based on *which pointer* is used to access certain regions of memory.
-Such models are fairly hard to write checkers for, as witness by there being no sanitizer.
-Moreover, they typically (from what I have seen so far) have some trouble supporting integer-pointer casts.
-If pointers have an "identity" beyond the address they point to, then which "identity" does a pointer that's casted from an integer obtain?
-I am not saying it is impossible to solve this issue, but it is a source of complication.
-It is also quickly a source of programmer confusion.
-In my experience, programmers that use pointers casted from integers typically like to think of pointers as being "just integers" and not having any "identity" beyond the address they point to (and indeed, the fact that such casts work seems to indicate that they are right).
-I would be much less worried about unsafe Rust code being broken by optimizations if we can make it so that this mental model is actually right.
-Wouldn't it be great if we could add "Has the simple model of pointers that everybody incorrectly *thinks* C has" to the list of Rust selling points?
-I think this aligns well with Rust's slogan "Hacking without fear" -- and without compromise; remember we still get all the optimizations!
-
-Congratulations!
-You have reached the end of this blog post.
-Thanks a lot for reading.
-If you have any comments, please raise them -- the entire purpose of this post is to get the proposal out there so that we can see if this approach can suit the needs of Rust programmers and compiler writers alike.
-If there are explanations you think are missing or confusing, please bring that up as well; I may then update this post accordingly.
-The concrete proposal will sure need some fixing here and there, but my hope is that the general approach of a contract-like, type-driven validation mechanism ends up being useful.
-So, keep the comments flowing -- and safe hacking.
-
-#### Footnotes