From 463d6fdbce98018490edd126a79d60137f29b7db Mon Sep 17 00:00:00 2001 From: Ralf Jung Date: Fri, 1 Jul 2016 14:18:43 +0200 Subject: [PATCH 01/16] it's ICFP 2016 --- research/publications.html | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/research/publications.html b/research/publications.html index 0140db2..e8cb324 100644 --- a/research/publications.html +++ b/research/publications.html @@ -8,7 +8,7 @@ slug: Publications -- 2.30.2 From 1b292c9b8cc968d424cb9b50b1e58510b36975ed Mon Sep 17 00:00:00 2001 From: Ralf Jung Date: Tue, 11 Oct 2016 18:44:42 +0200 Subject: [PATCH 02/16] papers update: add ICFP talk; add refinement paper --- research/publications.html | 13 ++++++++++--- 1 file changed, 10 insertions(+), 3 deletions(-) diff --git a/research/publications.html b/research/publications.html index e8cb324..53de065 100644 --- a/research/publications.html +++ b/research/publications.html @@ -5,11 +5,18 @@ slug: Publications

2016

+ +

2015

@@ -18,14 +25,14 @@ slug: Publications Unifying Worlds and Resources
Ralf Jung, Derek Dreyer
In HOPE 2015: 4th ACM SIGPLAN Workshop on Higher-Order Programming with Effects
- [talk] [talk (slides)] + [talk (YouTube)] [talk (slides)]

2013

-- 2.30.2 From 2f390db778cc8d05beb9c0562a14962238bb5ba9 Mon Sep 17 00:00:00 2001 From: Ralf Jung Date: Tue, 11 Oct 2016 18:46:14 +0200 Subject: [PATCH 03/16] we dont actually have an appendix for the refinement paper --- research/publications.html | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/research/publications.html b/research/publications.html index 53de065..6ac54fa 100644 --- a/research/publications.html +++ b/research/publications.html @@ -9,7 +9,7 @@ slug: Publications A Higher-Order Logic for Concurrent Termination-Preserving Refinement
Joseph Tassarotti, Ralf Jung, Robert Harper
Draft under submission
- [paper] [paper website (incl. appendix and Coq formalization)] + [paper] [paper website (incl. Coq formalization)]
  • -- 2.30.2 From de05330b63481d7edebb5338bf98760c19b9b339 Mon Sep 17 00:00:00 2001 From: Ralf Jung Date: Mon, 19 Dec 2016 16:29:02 +0100 Subject: [PATCH 04/16] update papers --- research/publications.html | 15 ++++++++++++--- 1 file changed, 12 insertions(+), 3 deletions(-) diff --git a/research/publications.html b/research/publications.html index 6ac54fa..ea3a8b6 100644 --- a/research/publications.html +++ b/research/publications.html @@ -3,20 +3,29 @@ title: Thesis, Publications, Workshop Submissions slug: Publications --- -

    2016

    +

    2017

    + +
    • + The Essence of Higher-Order Concurrent Separation Logic
      + Robbert Krebbers, Ralf Jung, Aleš Bizjak, Jacques-Henri Jourdan, Derek Dreyer, Lars Birkedal
      + Draft under submission
      + [paper] [project website (incl. appendix and Coq formalization)] +
    • A Higher-Order Logic for Concurrent Termination-Preserving Refinement
      Joseph Tassarotti, Ralf Jung, Robert Harper
      Draft under submission
      - [paper] [paper website (incl. Coq formalization)] + [paper] [paper website (incl. Coq formalization)]
    +

    2016

    +

    2015

    -- 2.30.2 From f194a3027a535695c5981a18c8930ad825415750 Mon Sep 17 00:00:00 2001 From: Ralf Jung Date: Tue, 24 Jan 2017 10:56:22 +0100 Subject: [PATCH 05/16] ESOP papers are final --- research/publications.html | 8 ++++---- 1 file changed, 4 insertions(+), 4 deletions(-) diff --git a/research/publications.html b/research/publications.html index ea3a8b6..1e6887d 100644 --- a/research/publications.html +++ b/research/publications.html @@ -8,15 +8,15 @@ slug: Publications
    • The Essence of Higher-Order Concurrent Separation Logic
      Robbert Krebbers, Ralf Jung, Aleš Bizjak, Jacques-Henri Jourdan, Derek Dreyer, Lars Birkedal
      - Draft under submission
      - [paper] [project website (incl. appendix and Coq formalization)] + In ESOP 2017: 26th European Symposium on Programming
      + [paper] [project website (incl. appendix and Coq formalization)]
    • A Higher-Order Logic for Concurrent Termination-Preserving Refinement
      Joseph Tassarotti, Ralf Jung, Robert Harper
      - Draft under submission
      - [paper] [paper website (incl. Coq formalization)] + In ESOP 2017: 26th European Symposium on Programming
      + [paper] [paper website (incl. Coq formalization)]

    2016

    -- 2.30.2 From 7f73a40904783c3c8b3f79447cdf1f8785c24c6e Mon Sep 17 00:00:00 2001 From: Ralf Jung Date: Tue, 24 Jan 2017 10:59:22 +0100 Subject: [PATCH 06/16] link directly to iris-project.org --- research/publications.html | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/research/publications.html b/research/publications.html index 1e6887d..a3b8395 100644 --- a/research/publications.html +++ b/research/publications.html @@ -25,7 +25,7 @@ slug: Publications Higher-Order Ghost State
    Ralf Jung, Robbert Krebbers, Lars Birkedal, Derek Dreyer
    In ICFP 2016: 21st ACM SIGPLAN International Conference on Functional Programming
    - [paper] [project website (incl. appendix and Coq formalization)] [talk (YouTube)] [talk (slides)] + [paper] [project website (incl. appendix and Coq formalization)] [talk (YouTube)] [talk (slides)]

2015

@@ -41,7 +41,7 @@ slug: Publications Iris: Monoids and Invariants as an Orthogonal Basis for Concurrent Reasoning
Ralf Jung, David Swasey, Filip Sieczkowski, Kasper Svendsen, Aaron Turon, Lars Birkedal, Derek Dreyer
In POPL 2015: 42nd ACM SIGPLAN-SIGACT Symposium on Principles of Programming Languages
- [paper] [project website (incl. appendix and Coq formalization)] [talk (slides)] + [paper] [project website (incl. appendix and Coq formalization)] [talk (slides)]

2013

-- 2.30.2 From b8503445011b3426167a0ce4628de2398ab5e088 Mon Sep 17 00:00:00 2001 From: Ralf Jung Date: Tue, 23 May 2017 13:58:28 -0700 Subject: [PATCH 07/16] add post on Paris meetup that I forgot to add --- .../_posts/2017-01-20-paris-rust-meetup.md | 18 ++++++++++++++++++ 1 file changed, 18 insertions(+) create mode 100644 personal/_posts/2017-01-20-paris-rust-meetup.md diff --git a/personal/_posts/2017-01-20-paris-rust-meetup.md b/personal/_posts/2017-01-20-paris-rust-meetup.md new file mode 100644 index 0000000..8b6800f --- /dev/null +++ b/personal/_posts/2017-01-20-paris-rust-meetup.md @@ -0,0 +1,18 @@ +--- +title: Talk @ Paris Rust Meetup +categories: research rust +reddit: /rust/comments/5p714x/rust_paris_meetup_talks_on_rust_formalization/ +--- + +This week, I have been at the Paris Rust Meetup. Meeting all sorts of Rust people was great fun, and the Mozilla offices in Paris are absolutely impressive. +You should totally check them out if you have a chance. + +On that meetup, I gave a short talk about the current status of my [formalization of the Rust type system]({{ site.baseurl }}{% post_url 2015-10-12-formalizing-rust %}). + +That was rather spontaneously arranged just two days before the meetup, but it was definitely worth it. +If you are interested in what this work is all about and where we are currently, you can go watch the recording of the talk at [Air Mozilla](https://air.mozilla.org/rust-paris-meetup-35-2017-01-19/). +My talk covers the first 20 minutes of that recording, but you should just keep on watching as I was followed by Niko and Aaron talking about [Rayon](https://github.com/nikomatsakis/rayon) and [Tokio](https://tokio.rs/). + +I am aware I have been rather silent here over the last year. +I do have some topics in the pipeline that I'd like to write more extensive posts on (some of which I hinted at during the talk). +Hopefully I can find some time to write them Soon (TM). -- 2.30.2 From 9fd69675c3e14edd5da5f905feddf5dd53297607 Mon Sep 17 00:00:00 2001 From: Ralf Jung Date: Tue, 23 May 2017 13:58:46 -0700 Subject: [PATCH 08/16] day 1 of my internship --- .../_posts/2017-05-23-internship-starting.md | 38 +++++++++++++++++++ 1 file changed, 38 insertions(+) create mode 100644 personal/_posts/2017-05-23-internship-starting.md diff --git a/personal/_posts/2017-05-23-internship-starting.md b/personal/_posts/2017-05-23-internship-starting.md new file mode 100644 index 0000000..1b7269b --- /dev/null +++ b/personal/_posts/2017-05-23-internship-starting.md @@ -0,0 +1,38 @@ +--- +title: Day 1 of my Mozilla Internship +categories: internship rust +--- + +This summer, I am given the awesome opportunity of spending three months in the Mozilla offices in Portland, working on Rust. +I am extremely grateful that Mozilla is providing this opportunity to me; hopefully I can give something back by making some hands-on contributions to the Rust ecosystem. + + +Today is day 1 of my internship -- at least if you start counting at 0, which of course we all do. +I meant to write this yesterday, but the morning was filled with bureaucratics and in the afternoon I got so jetlagged, I essentially was sleeping with my eyes open. +(But don't tell Aaron, my manager, or Niko, my mentor -- I was of course supposed to be working. ;) + +So, what concretely will I be doing in the next three months? +Based on [my prior posts]({{ site.baseurl }}{% post_url 2016-01-09-the-scope-of-unsafe %}) and me being in the [unsafe code guidelines strike team](https://internals.rust-lang.org/t/next-steps-for-unsafe-code-guidelines/3864) (which I totally forgot to announce here... I'm not good at this blogging thing, am I?), it should not be surprising that my main project is going to be related to unsafe code. +More specifically, we want to figure out ways to specify what unsafe code is and what it is not allowed to do (i.e., when its behavior is well-defined, and when it is undefined). +Ultimately, the way I think this should be specified (and lucky enough, my new bosses agree on this) is by defining, for every possible Rust (or rather, MIR) program one could write, what that program is going to do when it is executed. + +Now, one could go all committee and initiate a huge process involving drafting a standard and writing a [700 page document](http://www.open-std.org/jtc1/sc22/wg14/www/docs/n1570.pdf) in English prose, but experience shows that this is not a good way to go: English prose is ambiguous, and it is very easy to forget about some odd combination of corner cases which then ends up not being described by the standard. +One could also go all math and write down a beautifully complicated definition that rigorously defines everything. +That would be huge amounts of work, and in the end, virtually nobody would be able to read such a specification. +(Which doesn't mean I think this shouldn't also happen; such a definition is indeed needed to perform formal proofs -- but at this point, we are not yet concerned about proofs.) + +We are going to do neither of these. +Instead, the goal is to have an *executable specification* for MIR -- in other words, an interpreter. +To understand an interpreter, one has to just read code, so this should be much more accessible to programmers than mathematical definitions. +At the same time, if we forget to cover a case, this will be blatantly obvious -- in fact, most of the time, exhaustiveness checking will catch this. +And finally, an interpreter can be used for testing: We could actually check whether some particular unsafe code has undefined behavior or not. +This will be very important while we are still developing the specification (to check whether our intuition for what should and should not be allowed matches the interpreter), but it is also extremely useful later for authors of unsafe code to check whether they are violating the rules we are going to put in place. + +Lucky enough, such an interpreter already exists: [miri](https://github.com/solson/miri)! +That's great, because it means I do not have to write an interpreter from scratch, defining what a stack is and how to perform integers operations and whatnot. +Instead, I can concentrate on the interesting questions coming up in the unsafe code guidelines: +What exactly do the guarantees "mutable borrows don't have aliases" and "the pointees of shared borrows are not mutated" *mean*? How should they be reflected in a semantics of MIR -- in miri -- such that the desired [optimizations](https://github.com/nikomatsakis/rust-memory-model/tree/master/optimizations) are actually legal, while at the same time the [unsafe code](https://doc.rust-lang.org/nomicon/) people write has the desired behavior? + +That's it for now. +I don't have the answers to these questions, but hopefully my work will help getting closer to an answer. +I will keep you posted on my progress (or lack thereof), probably on a weekly or bi-weekly basis. -- 2.30.2 From a7be6d6b7e841ec4ea671fc38fb5902e7b7cfbea Mon Sep 17 00:00:00 2001 From: Ralf Jung Date: Tue, 6 Jun 2017 18:24:54 -0700 Subject: [PATCH 09/16] Capitalize Things --- personal/_posts/2015-10-12-formalizing-rust.md | 8 ++++---- personal/_posts/2017-05-23-internship-starting.md | 2 +- 2 files changed, 5 insertions(+), 5 deletions(-) diff --git a/personal/_posts/2015-10-12-formalizing-rust.md b/personal/_posts/2015-10-12-formalizing-rust.md index 9f5dc7a..b867ffb 100644 --- a/personal/_posts/2015-10-12-formalizing-rust.md +++ b/personal/_posts/2015-10-12-formalizing-rust.md @@ -37,13 +37,13 @@ The first version is also not going to cover automatic destructors, i.e., the `D This is, of course, a significant gap, even more so since dropck may well be the most subtle part of the type system, and the current version (as of Rust 1.3) is [known to be unsound](https://github.com/rust-lang/rust/issues/26656). But we have to start somewhere, and even without `Drop`, there is a lot to be done. -## So what's left? +## So What's Left? You may wonder now, what are we even doing then? The goal is to prove that programs following Rust's rules of ownership, borrowing and lifetimes are *memory and thread safe*, which, roughly speaking, means that all executed pointer accesses are valid, and there are no data races. I will often just say "memory safety", but concurrency is definitely part of the story. -## The syntactic approach and its limitations +## The Syntactic Approach and Its Limitations We could now be looking at the checks the Rust compiler is doing, and prove directly that these checks imply memory safety. Let us call these checks Rust's type system, then we are looking for the classic result of *type soundness*: Well-typed programs don't go wrong. @@ -54,7 +54,7 @@ In the following, we will consider programs containing `unsafe` as not being wel (We could also consider them well-typed in a more liberal type system that above result does not, and cannot, apply to. This is closer to what the Rust compiler does, but it's not helpful in our context.) Note that this issue is viral: `Vec`, `RefCell`, `Rc` -- all these and many more standard library data structures use `unsafe`, any if your program uses any of them, you're out of luck. -## Semantics to the rescue +## Semantics to the Rescue Intuitively, why do we even think that a program that uses `Vec`, but contains no `unsafe` block *itself*, should be safe? It's not just the compiler checks making sure that, e.g., we don't call `push` on a shared borrow. @@ -76,7 +76,7 @@ This is in contrast to the *syntactic* notion of well-typedness of a function, w Only recently, research has been able to scale up such semantic methods to languages that combine state (i.e., mutable variables, a heap) with higher-order functions -- languages like Rust, where I can take a closure (a `Fn{,Mut,Once}`) and store it in memory, for later use. Lucky enough, my advisor [Derek Dreyer](http://www.mpi-sws.org/~dreyer/) is one of the world experts in this field, which permits me to intensively study Rust (which I would have done anyways) and call it research! -## What we are doing +## What We Are Doing So, that's what we are trying to do: Come up not only with a formal version of Rust's (syntactic) type system, but also with appropriate *semantics* for these types. We then have to show that the syntactic types make semantic sense. This recovers, in a very roundabout way, the result I mentioned above: diff --git a/personal/_posts/2017-05-23-internship-starting.md b/personal/_posts/2017-05-23-internship-starting.md index 1b7269b..54361c2 100644 --- a/personal/_posts/2017-05-23-internship-starting.md +++ b/personal/_posts/2017-05-23-internship-starting.md @@ -1,5 +1,5 @@ --- -title: Day 1 of my Mozilla Internship +title: Day 1 of My Mozilla Internship categories: internship rust --- -- 2.30.2 From 4041df2de22d545046c3e8343df59924a68fc269 Mon Sep 17 00:00:00 2001 From: Ralf Jung Date: Tue, 6 Jun 2017 19:21:44 -0700 Subject: [PATCH 10/16] blog post about MIR and miri and semantics and instrumentation --- personal/_posts/2017-06-06-MIR-semantics.md | 177 ++++++++++++++++++++ 1 file changed, 177 insertions(+) create mode 100644 personal/_posts/2017-06-06-MIR-semantics.md diff --git a/personal/_posts/2017-06-06-MIR-semantics.md b/personal/_posts/2017-06-06-MIR-semantics.md new file mode 100644 index 0000000..e092120 --- /dev/null +++ b/personal/_posts/2017-06-06-MIR-semantics.md @@ -0,0 +1,177 @@ +--- +title: Exploring MIR Semantics through miri +categories: internship rust +--- + +It's now been two weeks since my internship started (two weeks already, can you believe it?). +In other words, if I want to post "weekly or bi-weekly" updates, I better write one today ;) . + +As [already mentioned]({{ site.baseurl }}{% post_url +2017-05-23-internship-starting %}), the goal for this internship is to +experiment with +[unsafe code guidelines](https://internals.rust-lang.org/t/next-steps-for-unsafe-code-guidelines/3864) +by implementing them in [miri](https://github.com/solson/miri). Before I tackle that, however, +it seemed sensible for me to grab some low-hanging fruit in miri just to make +myself familiar with the codebase. It turns out I entered "unsafe code +guidelines" territory much quicker than expected. +This post is about what I found, and it also serves as a nice demonstration of how we envision my unsafe code guidelines workflow to look like. + + +## Live and Let Die + +The first miri task that I took on was for miri to do something sensible with MIR's [`StroageLive` and `StroageDead` statements](https://github.com/solson/miri/issues/49). +If you haven't seen these before, that's because they do not come up in Rust code; instead, `StroageLive` and `StroageDead` are inserted by the compiler. +For example, the following Rust code +{% highlight rust %} +fn main() { + let x = 13; + let y = 3*x + 3; +} +{% endhighlight %} +compiles to the following MIR (in release mode)[^1]: + +[^1]: You can run `rustc -O --emit=mir` to reproduce this. `-O` turns on release mode, which turns off overflow checking, which makes the MIR code significantly shorter. Alternatively, you can compile Rust code to MIR on the [playground](https://play.rust-lang.org/). + +``` + bb0: { + StorageLive(_1); + _1 = const 13i32; // let x = 13; + StorageLive(_2); + StorageLive(_3); + StorageLive(_4); + _4 = _1; // let tmp_4 = x; + _3 = Mul(const 3i32, _4); // let tmp_3 = 3 * tmp_4; + StorageDead(_4); + _2 = Add(_3, const 3i32); // let y = tmp_3 + x; + StorageDead(_3); + _0 = (); + StorageDead(_2); + StorageDead(_1); + return; + } +``` +That's quite verbose, but you can still see our little test program in there: There's a `const 13i32`, there's a `Mul` by `3i32`, and there's an `Add` with `3i32`. +The surrounding `bb0` hints at the fact that MIR is a language built around [basic blocks](https://en.wikipedia.org/wiki/Basic_block). +That's not really relevant right now, so we'll gloss over this. + +What *is* relevant is that the MIR code contains all these extra statements we did not write, all these `StorageLive(...)` and `StorageDead(...)`. +They indicate when a variable is *live*, i.e., when the value of a variable is relevant for the program execution. +For example, the compiler marks `_4` as dead right after the multiplication. +`_4` is not used again by the program, so there is no point in keeping its value around. + +Liveness information is important because it is forwarded to LLVM, which then uses this information to allocate variables into stack slots. +If two variables are never live at the same time, the compiler can assign both of them the *same* slot, which is great because it makes the stack frame smaller. +To this end, rustc will translate `StorageLive` to [`llvm.lifetime.start`](http://llvm.org/docs/LangRef.html#llvm-lifetime-start-intrinsic) and `StroageDead` to [`llvm.lifetime.end`](http://llvm.org/docs/LangRef.html#llvm-lifetime-end-intrinsic). +These are intrinsics that LLVM understands and takes into account when computing the layout of the stack frame. + +## Giving It Semantics + +Now, the trouble is: How do we know if rustc made a mistake when it added the liveness information? +That would be bad, because it could lead to miscompilation. +However, to figure out whether the information is "wrong", we first have to find a definition of what it means for the information to be "right". +You may think that's obvious: Just make sure that every access to a variable is after the `StorageLive`, and before the `StorageDead`, done -- right? +Unfortunately, it's not always so easy. What if we say `StorageLive` twice on the same variable, is that allowed? What if there is a `StorageLive` in a loop? + +One way to give a definite answer to all such questions is to give these annotations an actual *behavior*, i.e., to give them *semantics*. +When the program above runs, we actually imagine that every local variable comes with an additional bit storing whether the variable is currently live, and `StorageLive` and `StorageDead` modify said bit. +When a variable is accessed, we can check that bit to make sure the variable is currently live. +And we can actually *implement* this behavior, so that questions like the ones I raised above can be answered by just executing programs that do these things, and see what happens. +So that's exactly what I did: I [extended miri](https://github.com/solson/miri/pull/176) to no longer just ignore `StorageLive` and `StorageDead`, but instead keep track of liveness of all local variables. + +It is in the implementation of `StorageLive` that the aforementioned questions come up: When we see `StorageLive` on a variable that *already is live*, what do we do? +We have three options (well, I am sure we have more, but these are the three I considered): +1. Do nothing. The variable is already live, it is marked live again, so we are good. +2. Bail out. Marking something as live *twice* is bogus, MIR programs should not do that. +3. Forget the value that is currently in the local variable, but otherwise keep it live. + +## LLVM Chiming In + +I started out with the first option. +However, it turns out that's a bad choice. +The trouble is, we are not entirely unconstrained in what we allow and forbid. +After all, `StorageLive` and `StorageDead` are compiled into LLVM intrinsics, and LLVM has some rules for what these intrinsics mean and when they are allowed to be used -- and when not. +Here's what the [LLVM specification of `llvm.lifetime.start`](http://llvm.org/docs/LangRef.html#llvm-lifetime-start-intrinsic) has to say: + +> This intrinsic indicates that before this point in the code, the value of the memory pointed to by `ptr` is dead. This means that it is known to never be used and has an undefined value. A load from the pointer that precedes this intrinsic can be replaced with `undef`. + +Unfortunately, this is rather short and does not cover lots of interesting questions. +For example: Is it allowed to mark a variable as live, and then dead, and then live again? +The text somewhat reads like it is not, but that would make it illegal to use these intrinsics in a loop, which is probably not the intended meaning. +Such imprecision in natural language specifications is the entire reason why we also want to have an *executable* specification of Rust. +(This is not to say that natural language specifications are useless. Ideally, you have them both.) + +Anyway, what does seem clear from this specification is that adding a redundant `StorageLive` at the wrong place can have bad side-effects, like make preceding loads yield `undef`. +We'd rather not have that happen in code produced by rustc. +To guarantee this doesn't happen, we should define the semantics of these MIR statements such that they forbid redundant `StorageLive`. +As a consequence of this, we can be sure that if we start with a "valid" MIR program -- a program that does not execute forbidden instructions, like a redundant `StorageLive` -- then the resulting LLVM will also be "valid". + +Executing such bad instructions is often referred to as the program having "undefined behavior". +The reason for this is that we just don't want to talk about the possible behaviors of such programs, we just give the compiler the license to do whatever. +For this reason, it is crucially important that the programs we care about do *not* have undefined behavior (or, for short: UB). + +How is making `StorageLive` UB helpful, you may wonder? Didn't we just shift the problem from rustc producing invalid LLVM to rustc producing invalid MIR? +In a sense, that's true, but I would still argue this is helpful. +First of all, it is now clear where the bug (if there is one) has to be fixed: +The pass that lowers HIR to MIR should make sure that it emits liveness annotations that match MIR's rules. +All the passes that transform MIR (e.g. to optimize it) have a clear idea of what they are working against. +They can all *rely* on the MIR program they get to not to have UB, and they have to *guarantee* that they do not change the program in a way that it now does have UB. +Code generation from MIR to LLVM is right when it emits the aforementioned intrinsics, because all it relies on is the MIR program not having UB. + +And, secondly, we actually now have a way to test *if* the compiler emits incorrect liveness annotations for a give piece of code: +We implement the MIR rules in miri, and then we run miri on a body of representative MIR code produced by the latest compiler to see whether UB arises or not. +miri's test suite is tiny compared to the body of Rust code out there, but it exercises pretty much all syntactic features of core Rust and some standard library code (mostly around `Vec` as well as all the code needed to even get a program running and to tear it down again -- there's quite a few things going on before and after your `fn main()`, but that would be a topic for another blog post). +It's a good enough start. + +So, I implemented the second of the three behaviors of `StorageLive` that I outlined above: +Marking something as live *twice* is bogus, MIR programs should not do that. +The reasoning here is to play safe; this is as conservative as it gets. +If a program is fine under these rules, we can be fairly certain that LLVM also considers this program to be fine. + +Of course, the inevitable happened and... + +## rustc Disagrees, A Compromise Arises + +That's right, there was a test failure. +It turns out rustc *does* produce code that violates the rules we picked.[^2] +So, did we find a bug? + +[^2]: In case you are curious, it was [this test](https://github.com/solson/miri/blob/ec7f1d5248fe1cee904a6d8af094167ded779781/tests/run-pass/loops.rs) that failed. Somewhat unsurprisingly, loops are involved. + +Well, maybe. +Or maybe the rules we picked were just too conservative. +At this point, I ended up in a lengthy discussion with @eddyb and @arielb1, who both know approximately infinitely more about LLVM and rustc than I do, and this is how the third option in the list arose: +When performing `StorageLive` on a variable that already is live, forget the value that is currently in the local variable, but otherwise keep it live. +This is consistent with what we have caught LLVM doing. +It is hard to get any more definite than this. +The "real" semantics of the relevant LLVM intrinsics is implicitly hidden between the lines of the LLVM code that uses these intrinsics to perform analyses. +Often, not even the LLVM developers will be sure what is allowed and what is not. (In fact, this is exactly what happened here.) +This is not very satisfying, but lacking a more precise description of the LLVM semantics, this is all we can do. + +The good news is that with this choice of MIR semantics, miri's test suite passes. +We can thus be sure (well, insofar as the test suite is representative -- this will hopefully get better over time) that rustc produces code that follows our new rules. + +## And the moral of this story + +So, what did we learn? + +First of all, we learned that it is frustrating not to know whether a certain concrete program, with given concrete input, is actually triggering undefined behavior or not. +This kind of uncertainty is even more worrying if we consider that tons and tons of code out there is compiled by LLVM -- most of it is probably fine, but still, I think we should be able to be somewhat more certain about whether the programs we write and run are actually doing what we think they do. +(Because, remember -- if a program has UB, the compiler has license to do literally anything to your code, and most modern compilers are making use of this possibility without a second thought. To be fair, they kind of have to, because all the other compilers do, and this is how they make programs go fast and win benchmarks, and this is what everybody cares about when it comes to compilers. Sad world.) +In Rust, we want to do better than this, and that's in fact the entire point of the [unsafe code guidelines](https://internals.rust-lang.org/t/next-steps-for-unsafe-code-guidelines/3864) project. +If we can realize the Rust specification in miri's code (besides having a natural language description), that should get us a long way towards reducing ambiguity and gaps in our specification. +For the near future, LLVM and its issues (and speed!) will still remain the pipeline, but [who knows what happens long-term](https://github.com/stoklund/cretonne)? + +Secondly, you have seen an example of how an "instrumented machine" looks like. +I didn't actually call it that way, but notice how miri now keeps track of the liveness of a local variable. +That's not something that the final program does. +And that's fine, because entirely ignoring all liveness information just means that programs that used to be UB, are now fully defined -- which is always a legal thing to do. (You may however lose some optimization potential on the way.) +However, miri has the goal of detecting *all* liveness-related UB, and to this end, it "instruments" the program (or the execution, or the machine -- however you look at it) with additional information, additional state. +This is a very common approach. +It is used, for example, by valgrind to tell you when your program's memory accesses are invalid, but it is also used in formal verification of programs (in this domain, we often refer to the additional state of the instrumented machine as "logical state" or "ghost state"). + +And finally, the workflow above is actually somewhat representative of how my contribution to the unsafe code guidelines within the next few months will hopefully look like: +The plan is to add additional instructions to MIR that express certain facts the compiler believes to be true (just like the liveness annotations do), and to give these instructions a semantics in miri (this will involve adding state that the actual program does not have, i.e., adding more instrumentation). +There will be a tension between the semantics being "conservative enough" such that LLVM's assumptions are satisfied (for example, some of this will relate to the [`noalias`](http://llvm.org/docs/LangRef.html#noalias-and-alias-scope-metadata) flag that rustc emits), and such that rustc can perform some clever optimizations we really want it to perform, while at the same time the semantics should be "permissive enough" such that the code all the Rust developers out there write, and the MIR that rustc generates from that code, is actually permitted. +There will be back and forth between various alternatives, and in the end, hopefully there will be a compromise that everybody can accept. + +Happy safe hacking! -- 2.30.2 From cce059c41f3db0d8a1d551cde49f500344682bae Mon Sep 17 00:00:00 2001 From: Ralf Jung Date: Tue, 6 Jun 2017 20:42:06 -0700 Subject: [PATCH 11/16] add link to reddit --- personal/_posts/2017-06-06-MIR-semantics.md | 1 + 1 file changed, 1 insertion(+) diff --git a/personal/_posts/2017-06-06-MIR-semantics.md b/personal/_posts/2017-06-06-MIR-semantics.md index e092120..13f51db 100644 --- a/personal/_posts/2017-06-06-MIR-semantics.md +++ b/personal/_posts/2017-06-06-MIR-semantics.md @@ -1,6 +1,7 @@ --- title: Exploring MIR Semantics through miri categories: internship rust +reddit: /rust/comments/6fqzub/exploring_mir_semantics_through_miri/ --- It's now been two weeks since my internship started (two weeks already, can you believe it?). -- 2.30.2 From 290953129a0985db864021b1bad4a43bd8ba1284 Mon Sep 17 00:00:00 2001 From: Ralf Jung Date: Tue, 6 Jun 2017 21:17:09 -0700 Subject: [PATCH 12/16] fix some typos --- personal/_posts/2017-06-06-MIR-semantics.md | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/personal/_posts/2017-06-06-MIR-semantics.md b/personal/_posts/2017-06-06-MIR-semantics.md index 13f51db..6ddc745 100644 --- a/personal/_posts/2017-06-06-MIR-semantics.md +++ b/personal/_posts/2017-06-06-MIR-semantics.md @@ -20,8 +20,8 @@ This post is about what I found, and it also serves as a nice demonstration of h ## Live and Let Die -The first miri task that I took on was for miri to do something sensible with MIR's [`StroageLive` and `StroageDead` statements](https://github.com/solson/miri/issues/49). -If you haven't seen these before, that's because they do not come up in Rust code; instead, `StroageLive` and `StroageDead` are inserted by the compiler. +The first miri task that I took on was for miri to do something sensible with MIR's [`StorageLive` and `StorageDead` statements](https://github.com/solson/miri/issues/49). +If you haven't seen these before, that's because they do not come up in Rust code; instead, `StorageLive` and `StorageDead` are inserted by the compiler. For example, the following Rust code {% highlight rust %} fn main() { @@ -62,7 +62,7 @@ For example, the compiler marks `_4` as dead right after the multiplication. Liveness information is important because it is forwarded to LLVM, which then uses this information to allocate variables into stack slots. If two variables are never live at the same time, the compiler can assign both of them the *same* slot, which is great because it makes the stack frame smaller. -To this end, rustc will translate `StorageLive` to [`llvm.lifetime.start`](http://llvm.org/docs/LangRef.html#llvm-lifetime-start-intrinsic) and `StroageDead` to [`llvm.lifetime.end`](http://llvm.org/docs/LangRef.html#llvm-lifetime-end-intrinsic). +To this end, rustc will translate `StorageLive` to [`llvm.lifetime.start`](http://llvm.org/docs/LangRef.html#llvm-lifetime-start-intrinsic) and `StorageDead` to [`llvm.lifetime.end`](http://llvm.org/docs/LangRef.html#llvm-lifetime-end-intrinsic). These are intrinsics that LLVM understands and takes into account when computing the layout of the stack frame. ## Giving It Semantics -- 2.30.2 From f5c3b86e8bba2645a7f55c0413bdba86c2d65177 Mon Sep 17 00:00:00 2001 From: Ralf Jung Date: Wed, 7 Jun 2017 09:51:48 -0700 Subject: [PATCH 13/16] #rustc was also involved --- personal/_posts/2017-06-06-MIR-semantics.md | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git a/personal/_posts/2017-06-06-MIR-semantics.md b/personal/_posts/2017-06-06-MIR-semantics.md index 6ddc745..f631e7a 100644 --- a/personal/_posts/2017-06-06-MIR-semantics.md +++ b/personal/_posts/2017-06-06-MIR-semantics.md @@ -140,7 +140,7 @@ So, did we find a bug? Well, maybe. Or maybe the rules we picked were just too conservative. -At this point, I ended up in a lengthy discussion with @eddyb and @arielb1, who both know approximately infinitely more about LLVM and rustc than I do, and this is how the third option in the list arose: +At this point, I ended up in a lengthy discussion with @eddyb and @arielb1 and some folks in #rustc, who know approximately infinitely more about LLVM and rustc than I do, and this is how the third option in the list arose: When performing `StorageLive` on a variable that already is live, forget the value that is currently in the local variable, but otherwise keep it live. This is consistent with what we have caught LLVM doing. It is hard to get any more definite than this. @@ -151,7 +151,7 @@ This is not very satisfying, but lacking a more precise description of the LLVM The good news is that with this choice of MIR semantics, miri's test suite passes. We can thus be sure (well, insofar as the test suite is representative -- this will hopefully get better over time) that rustc produces code that follows our new rules. -## And the moral of this story +## And the Moral of This Story So, what did we learn? -- 2.30.2 From fa00d89780c6a7e2add6a860b05fd3a0997b3e77 Mon Sep 17 00:00:00 2001 From: Ralf Jung Date: Wed, 7 Jun 2017 13:32:34 -0700 Subject: [PATCH 14/16] expand on the MIR-spec part --- personal/_posts/2017-05-23-internship-starting.md | 15 +++++++++++++++ 1 file changed, 15 insertions(+) diff --git a/personal/_posts/2017-05-23-internship-starting.md b/personal/_posts/2017-05-23-internship-starting.md index 54361c2..7a4eea0 100644 --- a/personal/_posts/2017-05-23-internship-starting.md +++ b/personal/_posts/2017-05-23-internship-starting.md @@ -36,3 +36,18 @@ What exactly do the guarantees "mutable borrows don't have aliases" and "the poi That's it for now. I don't have the answers to these questions, but hopefully my work will help getting closer to an answer. I will keep you posted on my progress (or lack thereof), probably on a weekly or bi-weekly basis. + +**Update.** I realized I should probably expand on the parenthetical remark about specifying MIR rather than specifying Rust. +What we are planning to do here is to specify Rust by specifying (a) how Rust code translates to MIR, and (b) specifying MIR. +This has two advantages. +First of all, part (a) actually is already done and implemented in the Rust compiler! +We *have* a complete and non-ambiguous definition of how Rust code is to be translated to MIR code. +(Currently, this definition exists only in code; ideally, one day, it will also exist in natural language as part of a specification document.) +As a second advantage, MIR is *much* simpler than Rust: It has way fewer operations, no syntactic sugar, less redundancy. +(In fact, that's the entire reason why MIR was introduced into the compiler.) +Specifying the behavior of a MIR program can concentrate on the parts that really matter, without focusing on details like how exactly a `for`-loop desugars into lower-level operations. + +Now, this is *not* to say that every Rust compiler has to use MIR in its compilation pipeline. +It just means that the way I imagine a specification of Rust to look like is as consisting of two parts: The Rust-to-MIR translation, and a specification for MIR. +If another compiler uses a different implementation strategy, it can still be compliant with the specification; it just has to ensure that Rust programs behave as specified. +This is a common approach that can also be found, e.g., in the specification of CPU instruction sets: The specification describes the behavior of a complex instruction as a series of commands in some lower-level language. The CPU does not actually use that language as part of its implementation, but *it behaves as if it would*, and that's the only part that matters. -- 2.30.2 From 3ca8cb9f45c400a037a36edc1eeeb0cadfe32825 Mon Sep 17 00:00:00 2001 From: Ralf Jung Date: Fri, 9 Jun 2017 12:44:10 -0700 Subject: [PATCH 15/16] More Capitalization --- personal/_posts/2016-01-09-the-scope-of-unsafe.md | 6 +++--- personal/_posts/2017-05-23-internship-starting.md | 2 +- 2 files changed, 4 insertions(+), 4 deletions(-) diff --git a/personal/_posts/2016-01-09-the-scope-of-unsafe.md b/personal/_posts/2016-01-09-the-scope-of-unsafe.md index 757bd73..59a318b 100644 --- a/personal/_posts/2016-01-09-the-scope-of-unsafe.md +++ b/personal/_posts/2016-01-09-the-scope-of-unsafe.md @@ -51,7 +51,7 @@ Oops! So, this example clearly shows that to evaluate the safety of types like `Vec`, we have to look at *every single function* provided by that data structure, even if it does not contain any `unsafe` code. -## The reason why +## The Reason Why Why is it the case that a safe function can break `Vec`? How can we even say that it is the safe function which is at fault, rather than some piece of `unsafe` code elsewhere in `Vec`? @@ -74,7 +74,7 @@ Even academic researchers working on Rust got this wrong, arguing that in order That's why I think it's worth dedicating an entire blog post to this point. But we are not done yet, we can actually use this observation to learn more about types and type systems. -## The semantic perspective +## The Semantic Perspective There is another way to phrase the intuition of types having additional invariants: @@ -129,7 +129,7 @@ This is just as bad as a function assigning `*f = 42u32` to a `f: &mut f32`. The difference between `Vec` and `MyType` is no less significant than the difference between `f32` and `u32`. It's just that the compiler has been specifically taught about `f32`, while it doesn't know enough about the semantics of `Vec`. -## The actual scope of unsafe +## The Actual Scope of Unsafe At this point, you may be slightly worried about the safety of the Rust ecosystem. I just spent two sections arguing that the Rust compiler actually doesn't know what it is doing when it comes to checking functions that work on `Vec`. diff --git a/personal/_posts/2017-05-23-internship-starting.md b/personal/_posts/2017-05-23-internship-starting.md index 7a4eea0..c8a7252 100644 --- a/personal/_posts/2017-05-23-internship-starting.md +++ b/personal/_posts/2017-05-23-internship-starting.md @@ -1,5 +1,5 @@ --- -title: Day 1 of My Mozilla Internship +title: "Day 1 of My Mozilla Internship, or: How to Specify Program Behavior?" categories: internship rust --- -- 2.30.2 From 2898555a14938b304571b1b717dae95541972f51 Mon Sep 17 00:00:00 2001 From: Ralf Jung Date: Fri, 9 Jun 2017 13:00:26 -0700 Subject: [PATCH 16/16] Post on the MutexGuard-Sync-bug --- personal/_posts/2017-06-09-mutexguard-sync.md | 130 ++++++++++++++++++ 1 file changed, 130 insertions(+) create mode 100644 personal/_posts/2017-06-09-mutexguard-sync.md diff --git a/personal/_posts/2017-06-09-mutexguard-sync.md b/personal/_posts/2017-06-09-mutexguard-sync.md new file mode 100644 index 0000000..a01980f --- /dev/null +++ b/personal/_posts/2017-06-09-mutexguard-sync.md @@ -0,0 +1,130 @@ +--- +title: How MutexGuard was Sync When It Should Not Have Been +categories: rust +--- + +A couple of weeks ago, our ongoing effort to [formalize Rust's type system]({{ site.baseurl }}{% post_url 2015-10-12-formalizing-rust %}) lead to us actually discovering a bug in the Rust standard library: +`MutexGuard` implemented `Sync` in cases where it should not. This could lead to data races in safe programs. Ouch. + + +I have to admit that I was somewhat hoping that this would be the case, after all, our paper will "sell" much better if we can point to an actual bug that was found through our effort ;) . +I was, however, not giving this a high chance; after all, many eyes have looked at the standard library. +Well, it turned out the trouble was not some code that someone had written and others had overlooked, the trouble was some code that had *not* been written... but let me start in the beginning. + +## The Bug + +Here's a piece of code demonstrating the [problem](https://github.com/rust-lang/rust/issues/41622): +{% highlight rust %} +use std::sync::Mutex; +use std::cell::Cell; + +extern crate rayon; + +fn main() +{ + let m = Mutex::new(Cell::new(0)); + let g : MutexGuard> = m.lock().unwrap(); + { + rayon::join( + || { g.set(g.get() + 1); println!("Thread 1: {:?}", g.get()) }, + || { g.set(g.get() + 1); println!("Thread 2: {:?}", g.get()) }); + } +} +{% endhighlight %} +Let us carefully step through this code to see what is going on. +First, we create a `Mutex>`. +Then we acquire the lock on this mutex, and store the `MutexGuard>`. +This is guaranteed to succeed immediately, after all, there is no other thread that could even try to acquire this lock. +Next, we run two closures in parallel (that's what `rayon::join` does), both accessing the same `MutexGuard` `g`. +This is somewhat unusual: Typically, each thread acquires the lock for itself, performs some operation, and then unlocks the mutex. +Here, the main thread acquires the lock, and then it hands a shared reference to the guard to two other threads. +This may sound odd, but it can in fact be a perfectly safe and pretty powerful pattern *if both threads only perform read operations*. + +The trouble is, in the above example, both threads use `Cell::set` to change the content of the cell. +`Cell` uses unsychronized accesses, so this is a classical example of a data race: Two threads accessing the same location, and at least one of them writing to it. +This is exactly the kind of concurrency bug that Rust set out to prevent. +How does it come that the compiler accepted this code? + +As already mentioned, the two closures both capture a shared reference to `g`. +In other words, they both capture something of type `&MutexGuard>`. +Following the [type of `rayon::join`](https://docs.rs/rayon/0.7.1/rayon/fn.join.html), the compiler checks whether that type is `Send`. +This is exactly the check that is supposed to prevent data races. +Because `g` is a shared reference, it proceeds by checking whether `MutexGuard>` is `Sync`, and this is where things go wrong. +If we look at the [source code of `Mutex` before the fix](https://github.com/rust-lang/rust/blob/3d60bf45f42fe3ab6f6a64a983ed256d86ffdbbe/src/libstd/sync/mutex.rs), we can see that there is *no* implementation of `Sync` for `MutexGuard`. +Now, `Sync` is an "auto trait" (also often called OIBIT, but that's really not a great name). +This means that the compiler considers a type like `MutexGuard` to be `Sync` if all its fields are `Sync`. +It turns out that if you follow this through, the compiler considers `MutexGuard` to be `Sync` if `T` is `Send`. +It then checks whether `Cell` is `Send` -- which it is -- and accepts our program. + +## How the Bug Was Found + +I found this bug in the process of proving safety of `Mutex`. +In particular, this involves proving that the `unsafe impl {Send, Sync}` are actually justified. +Now, `MutexGuard` *did not have* an `unsafe impl Sync` -- but that does not mean that it doesn't implement `Sync`! +Because of the way auto traits work, it could still implement `Sync` automatically, based on the types of its fields. +The trouble is, just because the compiler implements `Sync` automatically for us (so we don't have to write an `unsafe impl`) doesn't mean that this automatic `Sync` is correct! +The compiler does not understand what `MutexGuad` actually means, which invariants it maintains, nor does it understand the precise guarantees provided by `Send` and `Sync`. +It cannot possibly check how these two interact. + +So what I did was write some little test programs to check when the compiler considers `MutexGuard` to be `Sync`, and determined that it is sufficient for `T` to be `Send`. +(It turns out that it is much easier to figure this out by experiments than by staring at the source, and it is absolutely impossible to determine this based on the documentation. I would argue that is a [problem on its own](https://github.com/rust-lang/rust/issues/41537).) +Then I looked at what I had to prove, and quickly determined that this can't possibly work -- I needed `T` to be `Sync`, not `Send`! +Given this observation, it wasn't hard to "weaponize" this bug and write the racy program that you can see above. + +## The Fix + +Clearly, to fix this problem, we have to make it so that `MutexGuard>` is *not* `Sync`. +The question is: When, in general, can we make `MutexGuard` `Sync`? +To answer this question, we have to consider which operations can be performed on `&MutexGuard`. +We can consider the type `Sync` if all these operations are thread-safe. +In this case, the only interesting operation is that we can use `Deref::deref` to obtain an `&T`. +In other words, for `&MutexGuard` to be thread-safe, we need `&T` to be thread-safe. +This is exactly what it means for `T` to be `Sync`! + +To sum this all up, we want the implementation of `Sync` for `MutexGuard` to look something like this: +{% highlight rust %} +unsafe impl Sync for MutexGuard { } +{% endhighlight %} +Notice the `unsafe` here: The compiler cannot actually check whether we made any mistake in the analysis that we performed above, so we have to tell it to trust us in our assessment. +The [actual fix](https://github.com/rust-lang/rust/pull/41624) is somewhat more verbose because `MutexGuard` supports unsized types and because there is a lifetime involved, but that doesn't really matter for our discussion here. + +## So What Now? + +This fix landed in Rust master a while ago and will be part of Rust 1.19. +If you read this post in the future and try the code above, you are likely using a sufficiently recent compiler for the code to be rejected -- which means you are safe from this particular bug! +So far, I've seen no reports of code that stopped compiling because of this fix, and the pattern involved in abusing the bug is sufficiently obscure that it seems unlikely for any code to actually be broken because of this. + +So all's well that ends well? +Not really. +Notice how the bug is not caused by an incorrect piece of code somewhere in the implementation of `Mutex`. +One could go over that entire file, and prove every single line of it correct, and all proofs would go through. +The bug is in a line of code that was *not* written. +I don't think it is very surprising that this was overlooked. + +Ideally, types like `MutexGuard` would *not* get these automatic implementations of `Send` and `Sync`. +That would make sure that, if some type incorrectly implements these traits, at least it comes from some piece of incorrect `unsafe` somewhere. +This is exactly the point of `unsafe`: Marking clearly the points in the code that have to be audited extra-carefully. +The reality is [somewhat more subtle than this]({{ site.baseurl }}{% post_url 2016-01-09-the-scope-of-unsafe %}), but nevertheless, `unsafe` does its job pretty well -- except for when it comes to auto traits. + +To make things worse, imagine what happens when auto traits are stabilized, so that crates can introduce their own auto traits. +Some of them may want to express additional guarantees that the compiler cannot check, so some of these auto traits may be unsafe. +It is impossible to predict how the guarantees of these unsafe auto traits interact with the invariants of types like `MutexGuard` -- but still, the compiler will consider the trait to be implemented for `MutexGuard` if it is implemented for all fields. +The types of these fields are an implementation detail and not even visible to other crates. +Nothing outside of the standard library should care about such details. +However, as we have seen, with user-defined auto traits, we actually care a lot! + +Again, I think the solution here is that types like `MutexGuard` should not automatically implement unsafe auto traits. +What do I mean by "types like `MutexGuard`"? +The point that makes `MutexGuard` special here is that there is an invariant attached with this type that all instances of `MutexGuard` satisfy. +(I describe this concept of types with additional invariants in more detail [in one of my previous blog posts]({{ site.baseurl }}{% post_url 2016-01-09-the-scope-of-unsafe %})). +My proposal is to *tell the compiler* that this is the case. +If the compiler understands that a type is "more than what it seems to be", we could know during compilation that we have to be more cautious when handling this type -- and in particular, we could have such types *not* automatically implement unsafe auto traits. +In fact, I [proposed such an annotation](https://internals.rust-lang.org/t/pre-rfc-unsafe-types/3073) previously for different reasons, [and so did others](https://github.com/rust-lang/rfcs/issues/381).[^1] + +[^1]: I think having a notion of unsafe types can also help solve some of the [unsafe code guidelines](https://internals.rust-lang.org/t/next-steps-for-unsafe-code-guidelines/3864) questions. But that is content for another blog post. + +This of course significantly weakens the power of user-defined auto traits. +However, I hope I convinced you that if we don't act, errors like the one described here are bound to happen. +That said, such decisions are of course going to go through the usual RFC process. +It's certainly possible that someone comes up with a compromise that preserves some of the usefulness of auto traits, without putting safety at risk. + -- 2.30.2