X-Git-Url: https://git.ralfj.de/web.git/blobdiff_plain/7f79b2443f56a844eb41561dbed9fa3991a5b5b7..f5649511ac4cb0b606060b4fa28cc0f174dd8d89:/ralf/_drafts/unused-data.md diff --git a/ralf/_drafts/unused-data.md b/ralf/_drafts/unused-data.md deleted file mode 100644 index 2c06b68..0000000 --- a/ralf/_drafts/unused-data.md +++ /dev/null @@ -1,61 +0,0 @@ ---- -title: "Why even unused data needs to be valid" -categories: rust ---- - -`unsafe` code in Rust has a few ways that it can trigger [Undefined Behavior][ub], i.e., there are a few assumptions that the compiler makes about all code, and that for `unsafe` code the programmer is responsible for upholding. -Those assumptions are [listed in the Rust reference](https://doc.rust-lang.org/reference/behavior-considered-undefined.html). -The one that seems to be most surprising to many people is the clause which says that unsafe code may not produce "[...] an invalid value, even in private fields and locals". -The reference goes on to explain that "*producing* a value happens any time a value is assigned to or read from a place, passed to a function/primitive operation or returned from a function/primitive operation". -In other words, even just *constructing*, for example, an invalid `bool`, is Undefined Behavior---no matter whether that `bool` is ever actually "used" by the program. -The purpose of this post is to explain why. - -[ub]: https://rust-lang.github.io/unsafe-code-guidelines/glossary.html#undefined-behavior - - - -First of all, let me clarify what is meant by "used" here, as that term is used to mean very different things. -The following code "uses" `b`: - -{% highlight rust %} -fn example(b: bool) -> i32 { - if b { 42 } else { 23 } -} -{% endhighlight %} - -I hope it is not very surprising that calling `example` on, e.g., `3` transmuted to `bool` is Undefined Behavior (UB). -When compiling `if`, the compiler assumes that `0` and `1` are the only possible values; there is no saying what could go wrong when that assumption is violated. - -What is less obvious is why calling `example` on `3` is UB even when there is no such `if` being executed. -To understand why that is important, let us consider the following example: - -{% highlight rust %} -fn example(b: bool, num: u32) -> i32 { - let mut acc = 0; - for _i in 0..num { - acc += if b { 42 } else { 23 }; - } - acc -} -{% endhighlight %} - -Now assume we were working in a slightly different version of Rust, where transmuting `3` to a `bool` is fine as long as you do not "use" the `bool`. -That would mean that calling `example(transmute(3u8), 0)` is actually allowed, because in that case the loop never gets executed, so we never "use" `b`. - -However, this is a problem for a very important transformation called [loop-invariant code motion](https://en.wikipedia.org/wiki/Loop-invariant_code_motion). -That transformation can be used to turn our `example` function into the following: - -{% highlight rust %} -fn example(b: bool, num: u32) -> i32 { - let mut acc = 0; - let incr = if b { 42 } else { 23 } - for _i in 0..num { - acc += incr; - } - acc -} -{% endhighlight %} - -The increment `if b { 42 } else { 23 }` is "invariant" during the execution of the loop, and thus computing the increment can be moved out. -Why is this a good transformation? -Instead of determining the increment each time around the loop, we do that just once, thus saving a lot of conditional jumps that the CPU is unhappy about.