X-Git-Url: https://git.ralfj.de/web.git/blobdiff_plain/d4d922768b9995886fa4da911eb90decdd759d88..ba2d965151a9e8312f4538380392e5b82e52fffd:/ralf/_posts/2016-01-09-the-scope-of-unsafe.md

diff --git a/ralf/_posts/2016-01-09-the-scope-of-unsafe.md b/ralf/_posts/2016-01-09-the-scope-of-unsafe.md
index 6c69d09..403d6b3 100644
--- a/ralf/_posts/2016-01-09-the-scope-of-unsafe.md
+++ b/ralf/_posts/2016-01-09-the-scope-of-unsafe.md
@@ -1,6 +1,7 @@
 ---
 title: The Scope of Unsafe
 categories: research rust
+reddit: /rust/comments/4065l2/the_scope_of_unsafe/
 ---
 
 I'd like to talk about an important aspect of dealing with unsafe code, that still regularly seems to catch people on the wrong foot:
@@ -35,7 +36,7 @@ Roughly speaking, `ptr` points to the heap-allocated block of memory holding the
 It is very easy to add a function to `Vec` that contains no `unsafe` code, and still breaks the safety of the data structure:
 {% highlight rust %}
 impl Vec {
-    fn evil(&mut self) {
+    pub fn evil(&mut self) {
         self.len += 2;
     }
 }
@@ -59,7 +60,7 @@ More precisely speaking, `ptr` points to an array of type `T` and size `cap`, of
 The function `evil` above violates this invariant, while all the functions actually provided by `Vec` (including the ones that are implemented unsafely) preserve the invariant.
 That's why `evil` is the bad guy. (The name kind of gave it away, didn't it?)
 
-This may seem obvious in hindsight, but I think it is actually fairly subtle.
+This may seem obvious in hindsight (and it is also [discussed in the Rustonomicon](https://doc.rust-lang.org/nightly/nomicon/working-with-unsafe.html)), but I think it is actually fairly subtle.
 There used to be claims on the interwebs that "if a Rust program crashes, the bug must be in some `unsafe` block". (And there probably still are.)
 Even academic researchers working on Rust got this wrong, arguing that in order to detect bugs in data structures like `Vec` it suffices to check functions involving unsafe code.
 That's why I think it's worth dedicating an entire blog post to this point.
@@ -82,7 +83,7 @@ pub struct MyType {
 We will define only one function for this type:
 {% highlight rust %}
 impl MyType {
-    fn evil(&mut self) {
+    pub fn evil(&mut self) {
         self.len += 2;
     }
 }
@@ -129,9 +130,10 @@ Or, to put it slightly differently: If the scope of `unsafe` grows beyond the sy
 Does it sprawl through all our code, silently infecting everything we write -- or is there some limit to its effect?
 As you probably imagined, of course there *is* a limit.
 Rust would not be a useful language otherwise.
-The scope of `unsafe` ends at the next *abstraction boundary*.
-This means that everything outside of the `std::vec` module does not have to worry about `Vec`.
-Due to the privacy rules enforced by the compiler, code outside of that module cannot access the private fields of `Vec`, and hence it cannot tell the difference between the syntactic appearance of `Vec` and its actual, semantic meaning.
+*If* all your additional invariants are about *private* fields of your data structure, then the scope of `unsafe` ends at the next *abstraction boundary*.
+This means that everything outside of the `std::vec` module does not have to worry about `Vec`.
+Due to the privacy rules enforced by the compiler, code outside of that module cannot access the private fields of `Vec`.
+That code does not have a chance to violate the additional invariants of `Vec` -- it cannot tell the difference between the syntactic appearance of `Vec` and its actual, semantic meaning.
 Of course, this also means that *everything* inside `std::vec` is potentially dangerous and needs to be proven to respect the semantics of `Vec`.
 
 ## Abstraction Safety
@@ -142,7 +144,8 @@ This nicely brings us to another important point, which I can only glimpse at he
 If the type system of Rust lacked a mechanism to establish abstraction (i.e., if there were no private fields), type safety would not be affected.
 However, it would be very dangerous to write a type like `Vec` that has a semantic meaning beyond its syntactic appearance.
-Since users of `Vec` can accidentally perform invalid operations, there would actually be *no bound to the scope of `unsafe`*.
+All code could perform invalid operations like `Vec::evil`, operations that rely on the assumption that `Vec` is just like `MyType`.
+There would actually be *no bound to the scope of `unsafe`*.
 To formally establish safety, one would have to literally go over the entire program and prove that it doesn't misuse `Vec`.
 The safety promise of Rust would be pretty much useless.
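
The abstraction-boundary argument in the last two hunks can be sketched in a few lines of Rust. This is only an illustration under stated assumptions: the module `tiny`, the type `TinyVec`, and the function `demo` are hypothetical names that do not appear in the post, and the fixed-size array stands in for the heap allocation behind the real `Vec`. The point it tries to show is that the `unsafe` block in `last` is only correct because every function inside the module keeps `len` within bounds, and that code outside the module cannot break this because `len` is private.

```rust
// Hypothetical module standing in for `std::vec`; not taken from the post.
mod tiny {
    /// Fixed-capacity buffer. Additional invariant, invisible to the type
    /// system: `len <= data.len()`.
    pub struct TinyVec {
        data: [u32; 4], // private field
        len: usize,     // private field
    }

    impl TinyVec {
        pub fn new() -> Self {
            TinyVec { data: [0; 4], len: 0 }
        }

        /// Entirely safe code, but it must still preserve the invariant.
        pub fn push(&mut self, x: u32) -> bool {
            if self.len == self.data.len() {
                return false; // full: refuse rather than break the invariant
            }
            self.data[self.len] = x;
            self.len += 1;
            true
        }

        pub fn last(&self) -> Option<u32> {
            if self.len == 0 {
                return None;
            }
            // SAFETY: relies on the invariant `len <= data.len()`,
            // so `len - 1` is a valid index.
            Some(unsafe { *self.data.get_unchecked(self.len - 1) })
        }

        pub fn len(&self) -> usize {
            self.len
        }
    }
}

fn demo() -> (usize, Option<u32>) {
    let mut v = tiny::TinyVec::new();
    v.push(7);
    v.push(9);
    // v.len += 2; // rejected by the compiler here: field `len` is private
    (v.len(), v.last())
}

fn main() {
    println!("{:?}", demo());
}
```

Adding a `pub fn evil(&mut self) { self.len += 2; }` inside `mod tiny` would compile without any `unsafe` and would make `last` read out of bounds, which is why the whole module, and not just the `unsafe` block, has to be audited.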