The "scope" in the title refers to the extent of the code that has to be manually checked for correctness, once `unsafe` is used.
What I am saying is that the scope of `unsafe` is larger than the `unsafe` block itself.
-It turns out that the underlying reason for this observation is also a nice illustration for the concept of *semantic types* that comes up in my [work on formalizing Rust]({{ site.baseurl }}{% post_url 2015-10-12-formalizing-rust %}) (or rather, its type system).
+It turns out that the underlying reason for this observation is also a nice illustration for the concept of *semantic types* that comes up in my [work on formalizing Rust]({% post_url 2015-10-12-formalizing-rust %}) (or rather, its type system).
Finally, this discussion will once again lead us to realize that we rely on our type systems to provide much more than just type safety.
**Update (Jan 11th):** Clarified the role of privacy; argued why `evil` is the problem.
Still, `MyType::evil` is a perfectly benign function (despite its name).
How can this be?
-Remember that in a [previous blog post]({{ site.baseurl }}{% post_url 2015-10-12-formalizing-rust %}), I argued that types have a *semantic* aspect.
+Remember that in a [previous blog post]({% post_url 2015-10-12-formalizing-rust %}), I argued that types have a *semantic* aspect.
For example, a function is semantically well-typed if it *behaves* properly on all valid arguments, independently of how, *syntactically*, the function body has been written down.
The reason for `MyType::evil` being fine, while `Vec::evil` is bad, is that semantically speaking, `Vec` is very different from `MyType` -- even though they look so similar.
To formally establish safety, one would have to literally go over the entire program and prove that it doesn't misuse `Vec`.
The safety promise of Rust would be pretty much useless.
-This should not be entirely surprising if you read the aforementioned [post about formalizing Rust's type system]({{ site.baseurl }}{% post_url 2015-10-12-formalizing-rust %}), where I already argued that a proof of (syntactic) type safety does not help to justify safety of most Rust programs out there.
+This should not be entirely surprising if you read the aforementioned [post about formalizing Rust's type system]({% post_url 2015-10-12-formalizing-rust %}), where I already argued that a proof of (syntactic) type safety does not help to justify safety of most Rust programs out there.
I am now making a similar point, coming from a different angle.
The fact that Rust programmers *can* use `Vec` and many other types without much care is a property of the type system that is independent of type safety.
This week, I have been at the Paris Rust Meetup. Meeting all sorts of Rust people was great fun, and the Mozilla offices in Paris are absolutely impressive.
You should totally check them out if you have a chance.
-On that meetup, I gave a short talk about the current status of my [formalization of the Rust type system]({{ site.baseurl }}{% post_url 2015-10-12-formalizing-rust %}).
+On that meetup, I gave a short talk about the current status of my [formalization of the Rust type system]({% post_url 2015-10-12-formalizing-rust %}).
<!-- MORE -->
That was rather spontaneously arranged just two days before the meetup, but it was definitely worth it.
If you are interested in what this work is all about and where we are currently, you can go watch the recording of the talk at [Air Mozilla](https://air.mozilla.org/rust-paris-meetup-35-2017-01-19/).
(But don't tell Aaron, my manager, or Niko, my mentor -- I was of course supposed to be working. ;)
So, what concretely will I be doing in the next three months?
-Based on [my prior posts]({{ site.baseurl }}{% post_url 2016-01-09-the-scope-of-unsafe %}) and me being in the [unsafe code guidelines strike team](https://internals.rust-lang.org/t/next-steps-for-unsafe-code-guidelines/3864) (which I totally forgot to announce here... I'm not good at this blogging thing, am I?), it should not be surprising that my main project is going to be related to unsafe code.
+Based on [my prior posts]({% post_url 2016-01-09-the-scope-of-unsafe %}) and me being in the [unsafe code guidelines strike team](https://internals.rust-lang.org/t/next-steps-for-unsafe-code-guidelines/3864) (which I totally forgot to announce here... I'm not good at this blogging thing, am I?), it should not be surprising that my main project is going to be related to unsafe code.
More specifically, we want to figure out ways to specify what unsafe code is and what it is not allowed to do (i.e., when its behavior is well-defined, and when it is undefined).
Ultimately, the way I think this should be specified (and lucky enough, my new bosses agree on this) is by defining, for every possible Rust (or rather, MIR) program one could write, what that program is going to do when it is executed.
It's now been two weeks since my internship started (two weeks already, can you believe it?).
In other words, if I want to post "weekly or bi-weekly" updates, I better write one today ;) .
-As [already mentioned]({{ site.baseurl }}{% post_url
+As [already mentioned]({% post_url
2017-05-23-internship-starting %}), the goal for this internship is to
experiment with
[unsafe code guidelines](https://internals.rust-lang.org/t/next-steps-for-unsafe-code-guidelines/3864)
reddit: /rust/comments/6gavfe/how_mutexguard_was_sync_when_it_should_not_have/
---
-A couple of weeks ago, our ongoing effort to [formalize Rust's type system]({{ site.baseurl }}{% post_url 2015-10-12-formalizing-rust %}) lead to us actually discovering a bug in the Rust standard library:
+A couple of weeks ago, our ongoing effort to [formalize Rust's type system]({% post_url 2015-10-12-formalizing-rust %}) lead to us actually discovering a bug in the Rust standard library:
`MutexGuard` implemented `Sync` in cases where it should not. This could lead to data races in safe programs. Ouch.
<!-- MORE -->
Ideally, types like `MutexGuard<T>` would *not* get these automatic implementations of `Send` and `Sync`.
That would make sure that, if some type incorrectly implements these traits, at least it comes from some piece of incorrect `unsafe` somewhere.
This is exactly the point of `unsafe`: Marking clearly the points in the code that have to be audited extra-carefully.
-The reality is [somewhat more subtle than this]({{ site.baseurl }}{% post_url 2016-01-09-the-scope-of-unsafe %}), but nevertheless, `unsafe` does its job pretty well -- except for when it comes to auto traits.
+The reality is [somewhat more subtle than this]({% post_url 2016-01-09-the-scope-of-unsafe %}), but nevertheless, `unsafe` does its job pretty well -- except for when it comes to auto traits.
To make things worse, imagine what happens when auto traits are stabilized, so that crates can introduce their own auto traits.
Some of them may want to express additional guarantees that the compiler cannot check, so some of these auto traits may be unsafe.
Again, I think the solution here is that types like `MutexGuard<T>` should not automatically implement unsafe auto traits.
What do I mean by "types like `MutexGuard<T>`"?
The point that makes `MutexGuard<T>` special here is that there is an invariant attached with this type that all instances of `MutexGuard<T>` satisfy.
-(I describe this concept of types with additional invariants in more detail [in one of my previous blog posts]({{ site.baseurl }}{% post_url 2016-01-09-the-scope-of-unsafe %})).
+(I describe this concept of types with additional invariants in more detail [in one of my previous blog posts]({% post_url 2016-01-09-the-scope-of-unsafe %})).
My proposal is to *tell the compiler* that this is the case.
If the compiler understands that a type is "more than what it seems to be", we could know during compilation that we have to be more cautious when handling this type -- and in particular, we could have such types *not* automatically implement unsafe auto traits.
In fact, I [proposed such an annotation](https://internals.rust-lang.org/t/pre-rfc-unsafe-types/3073) previously for different reasons, [and so did others](https://github.com/rust-lang/rfcs/issues/381).[^1]
<!-- MORE -->
-This paper is the result of almost two years of work by the [RustBelt](https://plv.mpi-sws.org/rustbelt/) research project to [formalize Rust's type system]({{ site.baseurl }}{% post_url 2015-10-12-formalizing-rust %}).
+This paper is the result of almost two years of work by the [RustBelt](https://plv.mpi-sws.org/rustbelt/) research project to [formalize Rust's type system]({% post_url 2015-10-12-formalizing-rust %}).
The paper is now undergoing peer review; some time in fall we will be notified whether the paper got accepted or not.
In case you wondered which "important libraries" we verified, the full list is `Rc`, `Arc`, `Cell` (including [`alias::one`](https://huonw.github.io/alias/alias/fn.one.html), which was recently [accepted into the standard library](https://github.com/rust-lang/rfcs/pull/1789)), `RefCell`, `Mutex`, `RwLock`, `thread::spawn`, `mem::swap`, [`take_mut::take`](https://docs.rs/take_mut/0.1.3/take_mut/fn.take.html) as well as converting `&&T` into `&Box<T>` (inspired by [Abomonation](http://www.frankmcsherry.org/serialization/2015/05/04/unsafe-at-any-speed.html)).
-Our model of Rust is somewhat simplified (e.g., we don't support unwinding after panics); still, we were actually able to [find a real bug]({{ site.baseurl }}{% post_url 2017-06-09-mutexguard-sync %}).
+Our model of Rust is somewhat simplified (e.g., we don't support unwinding after panics); still, we were actually able to [find a real bug]({% post_url 2017-06-09-mutexguard-sync %}).
For all the details, have a [look at the paper](https://www.mpi-sws.org/~dreyer/papers/rustbelt/paper.pdf).
If that's not enough details for your taste, you can also check out [all our formal proofs](https://gitlab.mpi-sws.org/FP/LambdaRust-coq/).
## Undefined Behavior in Rust
Coming back to Rust, where are we at?
-Safe Rust is [free from UB]({{ site.baseurl }}{% post_url 2017-07-08-rustbelt %}), but we still have to worry about unsafe Rust.
+Safe Rust is [free from UB]({% post_url 2017-07-08-rustbelt %}), but we still have to worry about unsafe Rust.
For example, what if unsafe code crafts two aliasing mutable references (something that is prevented in safe Rust) and passes them to our `simple` function?
This violates the assumptions we made when we reordered the two writes.
If we want to permit this optimization (which we do!), we have to argue why it cannot change program behavior.
To this end, the specification should be written in a way that such tooling is feasible.
In fact, specifying a dynamic UB checker is a very good way to specify UB!
Such a specification would describe the additional state that is needed at run-time to then *check* at every operation whether we are running into UB.
-It is with such considerations in my mind that I have previously written about [miri as an executable specification]({{ site.baseurl }}{% post_url 2017-06-06-MIR-semantics %}).
+It is with such considerations in my mind that I have previously written about [miri as an executable specification]({% post_url 2017-06-06-MIR-semantics %}).
-Coming up next on this channel: During my [internship]({{ site.baseurl }}{% post_url 2017-05-23-internship-starting %}), I am working on such a specification.
+Coming up next on this channel: During my [internship]({% post_url 2017-05-23-internship-starting %}), I am working on such a specification.
My ideas are concrete enough now that I can write down a draft, which I will share with the world to see what the world thinks about it.
-**Update:** [Writing down has happened]({{ site.baseurl }}{% post_url 2017-07-17-types-as-contracts %}).
+**Update:** [Writing down has happened]({% post_url 2017-07-17-types-as-contracts %}).
**Update:** Clarified "Shifting Responsibility".
forum: https://internals.rust-lang.org/t/types-as-contracts/5562
---
-Over the last couple of weeks of my [internship]({{ site.baseurl }}{% post_url
+Over the last couple of weeks of my [internship]({% post_url
2017-05-23-internship-starting %}), I have been working on a proposal for the "Unsafe Code Guidelines".
I got some very encouraging feedback at the [Mozilla All-Hands](https://internals.rust-lang.org/t/recapping-this-weeks-mozilla-all-hands/5465),
and now I'd like to put this proposal out there and start having a discussion.
<!-- MORE -->
Warning: Long post is long. You have been warned.
-If you need more context on this topic, I suggest you read my post on [unsafe code guidelines and undefined behavior]({{ site.baseurl }}{% post_url 2017-07-14-undefined-behavior %}) first.
+If you need more context on this topic, I suggest you read my post on [unsafe code guidelines and undefined behavior]({% post_url 2017-07-14-undefined-behavior %}) first.
## 1 It's in the Types
## 2 Specification Framework
-I have [previously written about my thoughts on undefined behavior]({{ site.baseurl }}{% post_url 2017-07-14-undefined-behavior %}):
+I have [previously written about my thoughts on undefined behavior]({% post_url 2017-07-14-undefined-behavior %}):
> I also think that tooling to *detect* UB is of paramount importance [...].
In fact, specifying a dynamic UB checker is a very good way to specify UB!
Following this, I will in the following describe how we could check the contracts expressed by Rust's types for violations.
I am considering a MIR program; if a Rust program is supposed to be checked, it has to be translated to MIR first.
-(You can [read up on my explanation]({{ site.baseurl }}{% post_url 2017-05-23-internship-starting %}) of why I think this is a good way to proceed.)
-The validation of reference types (in particular, their exclusive / read-only nature) will need some extra "instrumented" state -- also see [this previous post on the topic]({{ site.baseurl }}{% post_url 2017-06-06-MIR-semantics %}).[^1]
+(You can [read up on my explanation]({% post_url 2017-05-23-internship-starting %}) of why I think this is a good way to proceed.)
+The validation of reference types (in particular, their exclusive / read-only nature) will need some extra "instrumented" state -- also see [this previous post on the topic]({% post_url 2017-06-06-MIR-semantics %}).[^1]
The extra state in my proposal consists of something akin to a reader-writer lock for every memory location.
I am first going to describe these locks and how they affect program behavior, before explaining contract validation.
forum: https://internals.rust-lang.org/t/https-www-ralfj-de-blog-2017-08-11-types-as-contracts-evaluation-html/5753
---
-Some weeks ago, I described [Types as Contracts]({{ site.baseurl }}{% post_url 2017-07-17-types-as-contracts %}) as an approach for how to go about defining Rust's aliasing-related [undefined behavior]({{ site.baseurl }}{% post_url 2017-07-14-undefined-behavior %}).
+Some weeks ago, I described [Types as Contracts]({% post_url 2017-07-17-types-as-contracts %}) as an approach for how to go about defining Rust's aliasing-related [undefined behavior]({% post_url 2017-07-14-undefined-behavior %}).
One key property of this approach is that it is *executable*, which means we can actually have a program (in this case, [miri](https://github.com/solson/miri/)) tell us whether some particular Rust test case exhibits undefined behavior or not.
I have since then spent most of my time completing and refining this implementation, and running it on miri's test suite to exercise various bits of the standard library and see whether they are actually following the rules I have been suggesting.
Unsurprisingly, some of miri's test cases started to fail once I properly implemented the rules I wanted to have checked.
Some of these turned out to be [compiler](https://github.com/rust-lang/rust/issues/43457) [bugs](https://github.com/rust-lang/rust/issues/43481), but others are actually examples of code violating the proposed rules.
In the following, I describe the patterns that I found.
-I assume that you are familiar with Types as Contracts as introduced in my [previous post]({{ site.baseurl }}{% post_url 2017-07-17-types-as-contracts %}).
+I assume that you are familiar with Types as Contracts as introduced in my [previous post]({% post_url 2017-07-17-types-as-contracts %}).
### 1.1 Ownership Passing Via Raw Pointers
categories: internship rust
---
-Yesterday was the last day of my [Rust internship at Mozilla]({{ site.baseurl }}{% post_url 2017-05-23-internship-starting %}).
+Yesterday was the last day of my [Rust internship at Mozilla]({% post_url 2017-05-23-internship-starting %}).
I can hardly believe that twelve weeks have passed since that post, but my calendar insists.
-If you want to see what I have done, you can go read the [posts]({{ site.baseurl }}{% post_url 2017-05-23-internship-starting %}) [that]({{ site.baseurl }}{% post_url 2017-06-06-MIR-semantics %}) I [wrote]({{ site.baseurl }}{% post_url 2017-07-08-rustbelt %}) [during]({{ site.baseurl }}{% post_url 2017-07-14-undefined-behavior %}) [the]({{ site.baseurl }}{% post_url 2017-07-17-types-as-contracts %}) [internship]({{ site.baseurl }}{% post_url 2017-08-11-types-as-contracts-evaluation %}).
+If you want to see what I have done, you can go read the [posts]({% post_url 2017-05-23-internship-starting %}) [that]({% post_url 2017-06-06-MIR-semantics %}) I [wrote]({% post_url 2017-07-08-rustbelt %}) [during]({% post_url 2017-07-14-undefined-behavior %}) [the]({% post_url 2017-07-17-types-as-contracts %}) [internship]({% post_url 2017-08-11-types-as-contracts-evaluation %}).
(I *almost* kept up with the plan of bi-weekly blog posts...)
You can also watch the 10-minute video version of this that I made for the internship presentation.
The recording is available on [Air Mozilla](https://air.mozilla.org/intern-presentations-round-2-thursday-august-1st/); my talk starts at 2h 14:30.
categories: research
---
-[POPL 2018](https://popl18.sigplan.org/home), the conference where I will present our [RustBelt paper]({{ site.baseurl }}{% post_url 2017-07-08-rustbelt %}), is doing a series of interviews with senior and junior people from the community: [People of Programming Languages](https://www.cs.cmu.edu/~popl-interviews/).
+[POPL 2018](https://popl18.sigplan.org/home), the conference where I will present our [RustBelt paper]({% post_url 2017-07-08-rustbelt %}), is doing a series of interviews with senior and junior people from the community: [People of Programming Languages](https://www.cs.cmu.edu/~popl-interviews/).
They also asked me if I wanted to be part of this, and of course I accepted.
This was the first time I was asked to give an interview, so needless to say, I was super excited!
I got the opportunity to talk about how I came to do research in PL, what my research is about, and ramble a bit about why I think Coq is great.
forum: https://internals.rust-lang.org/t/sharing-for-a-lifetime/6675
---
-This post is about an aspect of the [RustBelt paper]({{ site.baseurl }}{% post_url 2017-07-08-rustbelt %}).
+This post is about an aspect of the [RustBelt paper]({% post_url 2017-07-08-rustbelt %}).
Concretely, I'd like to share some of our thoughts on the nature of types and shared references.
Let's see how well this goes. :)
-Shared references are an extremely powerful mechanism in the Rust type system, and we've had quite some trouble finding a good way of handling them in our [formal model]({{ site.baseurl }}{% post_url 2015-10-12-formalizing-rust %}).
+Shared references are an extremely powerful mechanism in the Rust type system, and we've had quite some trouble finding a good way of handling them in our [formal model]({% post_url 2015-10-12-formalizing-rust %}).
In this post, I will present the model that we came up with.
<!-- MORE -->
Let's get started by discussing what a "model" of a type looks like, and what's so hard about handling shared references.
### What's in a Type?
-I've already [written]({{ site.baseurl }}{% post_url 2015-10-12-formalizing-rust %}) [about]({{ site.baseurl }}{% post_url 2016-01-09-the-scope-of-unsafe %}) the idea of "semantic types", which essentially boils down to the concept that types can impose *additional invariants* on their fields, like `self.cap >= self.len` (for `Vec`) or `!is_zero(self.0)` (for `NonZero`).
+I've already [written]({% post_url 2015-10-12-formalizing-rust %}) [about]({% post_url 2016-01-09-the-scope-of-unsafe %}) the idea of "semantic types", which essentially boils down to the concept that types can impose *additional invariants* on their fields, like `self.cap >= self.len` (for `Vec`) or `!is_zero(self.0)` (for `NonZero`).
So, whenever I say "(semantic) type", I really mean "a type's (syntactic) definition (aka the data layout) **and** the additional invariants that the module imposes on this type".
Two types may be syntactically equal but have different invariants, and then clearly casting from one to the other would be horribly unsafe -- so, these would be completely different types despite their syntactic similarity.
What really matters for a type is the answer to the question: *Given some sequence of bytes, is this sequence a valid inhabitant of the type?*[^1]
## Rust types: Recap
-I have discussed my model of Rust types [in a previous post]({{ site.baseurl }}{% post_url 2018-01-31-sharing-for-a-lifetime %}); this may me a good time to read that post as I will be using it as a starting point.
+I have discussed my model of Rust types [in a previous post]({% post_url 2018-01-31-sharing-for-a-lifetime %}); this may me a good time to read that post as I will be using it as a starting point.
The short version is that I view Rust types with private invariants as not having just a single invariant, but different invariants that reflect the different "modes" the type can be in.
@cramertj suggested to use "typestate" as terminology here, and I agree that this makes sense.
`Pin` lets us give a type to `SelfReferantial` that makes it safe to use.
This is in the best tradition of Rust: We are using an expressive type system to provide safe APIs for operations that only have unsafe APIs in other languages (e.g., iterators that avoid iterator invalidation which plague even memory safe languages like Java).
In the following, I will explain how one can prove that our claim of safe encapsulation actually holds true.
-This is building on the framework that we developed for the [RustBelt paper]({{ site.baseurl }}{% post_url 2017-07-08-rustbelt %}).
+This is building on the framework that we developed for the [RustBelt paper]({% post_url 2017-07-08-rustbelt %}).
## Formal Notation
forum: https://internals.rust-lang.org/t/safe-intrusive-collections-with-pinning/7281
---
-In my [last post]({{ site.baseurl }}{% post_url 2018-04-05-a-formal-look-at-pinning %}), I talked about the new ["pinned references"](https://github.com/rust-lang/rfcs/blob/master/text/2349-pin.md) which guarantee that the data at the memory it points to will not, ever, be moved elsewhere.
+In my [last post]({% post_url 2018-04-05-a-formal-look-at-pinning %}), I talked about the new ["pinned references"](https://github.com/rust-lang/rfcs/blob/master/text/2349-pin.md) which guarantee that the data at the memory it points to will not, ever, be moved elsewhere.
I explained how they enable giving a safe API to code that could previously only be exposed with `unsafe`, and how one could go about proving such a thing.
This post is about another application of pinned references---another API whose safety relies on the pinning guarantees: Intrusive collections.
It turns out that pinned references can *almost* be used for this, but not quite.
Then we `drop` the entry while the collection still exists, and we can see it has vanished from the collection as well.
Notice that using `Pin` in the `insert` method above is crucial: If the collection of the entry were to move around, their respective pointers would get stale!
-This is fundamentally the same problem as [`SelfReferential` in my previous post]({{ site.baseurl }}{% post_url 2018-04-05-a-formal-look-at-pinning %}), and `Pin` helps.
+This is fundamentally the same problem as [`SelfReferential` in my previous post]({% post_url 2018-04-05-a-formal-look-at-pinning %}), and `Pin` helps.
Thanks to this guarantee, and unlike in the intrusive-collections crate, `insert` can be called with entries *that do not outlive the collection*.
With an [API for stack pinning](https://github.com/rust-lang/rfcs/blob/master/text/2349-pin.md#stack-pinning-api-potential-future-extension), we could even have put the `entry` in `main` on the stack.
## 2 The Formal Perspective
-In this second part of the post, we are going to try and apply the formal methodology from the [previous post]({{ site.baseurl }}{% post_url 2018-04-05-a-formal-look-at-pinning %}) to the intrusive collection example above.
+In this second part of the post, we are going to try and apply the formal methodology from the [previous post]({% post_url 2018-04-05-a-formal-look-at-pinning %}) to the intrusive collection example above.
I am going to assume that you have read that post.
### 2.1 The Intrusive Collection Invariant
We are going to obtain an SSL certificate for `$HOST` *for your laptop*, and use that to secure the connection to `https://$HOST`.
Because only the laptop has the key to this certificate, the server at `$IP` cannot actually decipher the connection, it just forwards the encrypted bytes to the laptop where they are decrypted.
The easiest way to obtain such a certificate is using [Let's Encrypt](https://letsencrypt.org/).
-I am using my own [Let's Encrypt Tiny]({{ site.baseurl }}{% post_url 2017-12-26-lets-encrypt %}) for this purpose, but you can use any other Let's Encrypt client as well.
+I am using my own [Let's Encrypt Tiny]({% post_url 2017-12-26-lets-encrypt %}) for this purpose, but you can use any other Let's Encrypt client as well.
Since `$HOST:80` legitimately *is* your laptop at this point, the laptop should be able to obtain a certificate just fine.
If you are using Radicale like me, just putting Radicale on port 80 is not going to work though as that provides no way to serve the ACME challenge file needed for Let's Encrypt.
categories: internship rust
---
-After my [internship last year has ended]({{ site.baseurl }}{% post_url 2017-08-12-internship-ending %}), I naturally became somewhat less active in the Rust community as I could not work on Rust full-time any more.
+After my [internship last year has ended]({% post_url 2017-08-12-internship-ending %}), I naturally became somewhat less active in the Rust community as I could not work on Rust full-time any more.
Well, for the following months I am going to be back full-time. :)
Thanks to @aturon, I am able to work as a research assistant for Mozilla during this summer (until the end of November).
I don't really know what a "research assistant" is, but I am going to continue the work on Rust memory models, and hopefully also have some time to make progress on `union` semantics.
<!-- MORE -->
-After exploring [a "validity"-based model]({{ site.baseurl }}{% post_url 2017-08-11-types-as-contracts-evaluation %}) last year, I am going to be looking at an "alias"-based model this year.
+After exploring [a "validity"-based model]({% post_url 2017-08-11-types-as-contracts-evaluation %}) last year, I am going to be looking at an "alias"-based model this year.
That's the kind of model @arielb1, @ubsan and others have been proposing, and I am going to build on top of their work and hopefully come up with something we can actually implement a checker for
-(staying true to the [vision I laid out previously]({{ site.baseurl }}{% post_url 2017-06-06-MIR-semantics %}) that we should have an executable operational semantics for MIR, including its [undefined behavior]({{ site.baseurl }}{% post_url 2017-07-14-undefined-behavior %})).
+(staying true to the [vision I laid out previously]({% post_url 2017-06-06-MIR-semantics %}) that we should have an executable operational semantics for MIR, including its [undefined behavior]({% post_url 2017-07-14-undefined-behavior %})).
Expect a blog post soon for what I mean by "validity"-based vs. "alias"-based, and for a first draft of such an "alias"-based model.
Until then, stay tuned!
reddit: /rust/comments/8ykuhv/the_tale_of_a_bug_in_arc_synchronization_and_data/
---
-While I was busy doing Rust-unrelated research, [RustBelt](https://plv.mpi-sws.org/rustbelt/) continues to move and recently found another bug (after [a missing `impl !Sync` that we found previously]({{ site.baseurl }}{% post_url 2017-06-09-mutexguard-sync %})): It turns out that `Arc::get_mut` did [not perform sufficient synchronization](https://github.com/rust-lang/rust/issues/51780), leading to a data race.
+While I was busy doing Rust-unrelated research, [RustBelt](https://plv.mpi-sws.org/rustbelt/) continues to move and recently found another bug (after [a missing `impl !Sync` that we found previously]({% post_url 2017-06-09-mutexguard-sync %})): It turns out that `Arc::get_mut` did [not perform sufficient synchronization](https://github.com/rust-lang/rust/issues/51780), leading to a data race.
Notice that I am just the messenger here, the bug was actually found by [Hai](https://people.mpi-sws.org/~haidang/) and [Jacques-Henri](https://jhjourdan.mketjh.fr/).
Still, I'd like to use this opportunity to talk a bit about weak memory, synchronization and data races.
[^1]: "Mostly", you may wonder? Unfortunately it turns out that there is [one `Relaxed` access in `make_mut`](https://github.com/rust-lang/rust/blob/c0955a34bcb17f0b31d7b86522a520ebe7fa93ac/src/liballoc/sync.rs#L793) that Hai and Jacques-Henri have not yet been able to prove correct. However, this is likely fixable by making the entire proof more complicated. The version where that `Relaxed` is also replaced by `Acquire` *has* been proven correct, albeit still against a model of relaxed memory accesses that is not quite as weak as C11.
One last thing:
-I have previously claimed that our [first RustBelt paper]({{ site.baseurl }}{% post_url 2017-07-08-rustbelt %}) already verified correctness of `Arc`, how can there still be bugs?
+I have previously claimed that our [first RustBelt paper]({% post_url 2017-07-08-rustbelt %}) already verified correctness of `Arc`, how can there still be bugs?
In that paper, we did not (yet) have the tools to reason realistically about these ordering issues we have been discussing here, so instead we worked with a "sequentially consistent" logic which essentially assumes the strongest possible mode, `SeqCst`, for all atomic accesses.
(We did not discuss `SeqCst` above, and in fact there are not many cases where it is really needed. [`Release` and `Acquire` are enough most of the time](https://internals.rust-lang.org/t/sync-once-per-instance/7918/8?u=ralfjung).)
This is one of the simplifications we made compared to real Rust to make the verification feasible.
-We were realistic enough to find [another bug]({{ site.baseurl }}{% post_url 2017-06-09-mutexguard-sync %}), but not realistic enough for this one.
+We were realistic enough to find [another bug]({% post_url 2017-06-09-mutexguard-sync %}), but not realistic enough for this one.
Hai and Jacques-Henri are currently working on remedying this particular simplification by extending the first RustBelt paper to also cover weak memory, and that's when they ran into this problem.
**Update:** Turns out Servo has a [copy of `Arc`](https://doc.servo.org/servo_arc/index.html) that [has the same problem](https://github.com/servo/servo/issues/21186). So we got two bugs for the price of one. :) **/Update**
forum: https://internals.rust-lang.org/t/pointers-are-complicated-or-whats-in-a-byte/8045
---
-This summer, I am again [working on Rust full-time]({{ site.baseurl }}{% post_url 2018-07-11-research-assistant %}), and again I will work (amongst other things) on a "memory model" for Rust/MIR.
+This summer, I am again [working on Rust full-time]({% post_url 2018-07-11-research-assistant %}), and again I will work (amongst other things) on a "memory model" for Rust/MIR.
However, before I can talk about the ideas I have for this year, I have to finally take the time and dispel the myth that "pointers are simple: they are just integers".
Both parts of this statement are false, at least in languages with unsafe features like Rust or C: Pointers are neither simple nor (just) integers.
Since `&x[i]` is the same as `x+i`, this means we are actually writing `23` to `&y[0]`.
Of course, that does not stop C++ compilers from doing these optimizations.
-To allow this, the standard declares our code to have [undefined behavior]({{ site.baseurl }}{% post_url 2017-07-14-undefined-behavior %}).
+To allow this, the standard declares our code to have [undefined behavior]({% post_url 2017-07-14-undefined-behavior %}).
First of all, it is not allowed to perform pointer arithmetic (like `&x[i]` does) that goes [beyond either end of the array it started in](https://timsong-cpp.github.io/cppwp/n4140/expr.add#5).
Our program violates this rule: `x[i]` is outside of `x`, so this is undefined behavior.
If we wrote the above programs in assembly, there would be no UB, and no optimizations.
C++ and Rust employ a more "high-level" view of memory and pointers, restricting the programmer for the benefit of optimizations.
When formally describing what the programmer may and may not do in these languages, as we have seen, the model of pointers as integers falls apart, so we have to look for something else.
-This is another example of using a "virtual machine" that's different from the real machine for specification purposes, which is an idea [I have blogged about before]({{ site.baseurl }}{% post_url 2017-06-06-MIR-semantics %}).
+This is another example of using a "virtual machine" that's different from the real machine for specification purposes, which is an idea [I have blogged about before]({% post_url 2017-06-06-MIR-semantics %}).
-Here's a simple proposal (in fact, this is the model of pointers used in [CompCert](https://hal.inria.fr/hal-00703441/document) and my [RustBelt work]({{ site.baseurl }}{% post_url 2017-07-08-rustbelt %}), and it is also how [miri](https://github.com/solson/miri/) implements pointers):
+Here's a simple proposal (in fact, this is the model of pointers used in [CompCert](https://hal.inria.fr/hal-00703441/document) and my [RustBelt work]({% post_url 2017-07-08-rustbelt %}), and it is also how [miri](https://github.com/solson/miri/) implements pointers):
A pointer is a pair of some kind of ID uniquely identifying the *allocation*, and an *offset* into the allocation.
Adding/subtracting an integer to/from a pointer just acts on the offset, and can thus never leave the allocation.
Subtracting a pointer from another is only allowed when both point to the same allocation (matching [C++](https://timsong-cpp.github.io/cppwp/n4140/expr.add#6)).[^2]