+++ /dev/null
----
-title: "Types as Contracts"
-categories: internship rust
----
-
-Over the last couple of weeks of my [internship]({{ site.baseurl }}{% post_url
-2017-05-23-internship-starting %}), I have been working on a proposal for the "Unsafe Code Guidelines".
-I got some very encouraging feedback at the [Mozilla All-Hands](https://internals.rust-lang.org/t/recapping-this-weeks-mozilla-all-hands/5465),
-and now I'd like to put this proposal out there and start having a discussion.
-<!-- MORE -->
-Warning: Long post is long. You have been warned.
-
-If you need more context on unsafe code guidelines, I suggest you read my post on [undefined behavior]({{ site.baseurl }}{% post_url 2017-07-14-undefined-behavior %}) first.
-
-## It's in the Types
-
-Consider the following simple function:
-{% highlight rust %}
-fn simple(x: &mut i32, y: &mut f32) -> i32 {
- *x = 3;
- *y = 4.0;
- *x
-}
-{% endhighlight %}
-
-As discussed in my previous post, we would like an unsafe Rust program that calls this function with aliasing pointers to be UB.
-In my view, the reason this is UB is given by the *type* of the function and, in particular, its arguments: `&mut` expresses uniqueness, so the memory pointed to by an `&mut` must not alias with anything else.
-Borrowing some terminology from related fields, we can see the function type as stating a *contract*: We can call `simple` with non-aliasing pointers, and the function will behave.
-If we violate the contract, all bets are off.
-
-To realize this idea, we need to define, for every type, what is the contract associated with that type -- when are elements of this type *valid*?
-For some types, that's simple: `i32` is any four-byte value. `bool` is one byte, either `0` or `1`.
-Compound types like `(T, U)` are defined in terms of their constituent types: The first component must be a valid `T`, the second component a valid `U`.
-
-Where it gets interesting is the reference types: `&mut T` says that we have a (non-null, properly aligned) pointer.
-Whatever the pointer points to, must be a valid `T`.
-*Moreover*, the memory range covered by this pointer (whose extent is given by the size of `T`) is exclusively accessed by this function -- no concurrent thread performs any access, and when we call unknown code and do *not* pass on the reference, we can rely on that unknown code not accessing this memory either.
-
-Similarly, for `&T`, we can rely on no-one performing any write access to this memory.
-At the same time, the contract for `&T` also says that we ourselves will not write to this memory either.
-This is a nice example showing that contracts go both ways: The function can rely on the other party (the caller) having done their part by providing valid arguments, but the function also has to make guarantees itself.
-Another example of a guarantee provided by functions is their promise to provide a valid return value when they are done.
-
-My proposal for the unsafe code guidelines then is to validate the contract described by a function's types at the function boundaries, and to make it UB for the contract to be violated.
-In particular, `simple` would validate `x` and `y` immediately after being called, and it would be UB for them to alias.
-
-
-## Specification
-
-I have [previously written]({{ site.baseurl }}{% post_url 2017-07-14-undefined-behavior %}):
-
-> I also think that tooling to *detect* UB is of paramount importance [...].
-In fact, specifying a dynamic UB checker is a very good way to specify UB!
-Such a specification would describe the additional state that is needed at run-time to then *check* at every operation whether we are running into UB.
-
-Following this, I will in the following describe how to check for contract violations.
-I am considering a MIR program; if a Rust program is supposed to be checked, it has to be translated to MIR first.
-(You can [read up on my explanation]({{ site.baseurl }}{% post_url 2017-05-23-internship-starting %}) of why I think this is a good way to proceed.)
-The validation of reference types (in particular, their exclusive / read-only nature) will need some extra "instrumented" state -- also see [this previous post on the topic]({{ site.baseurl }}{% post_url 2017-06-06-MIR-semantics %}).[^1]
-The extra state in my proposal consists of something akin to a reader-writer lock for every memory location.
-I am first going to describe these locks and how they affect program behavior, before explaining contract validation.
-
-[^1]: If at this point you are under the impression that all these previous posts have been collectively building up towards this proposal, you are not entirely wrong.
-
-### Memory locks
-
-### Validation
-