X-Git-Url: https://git.ralfj.de/web.git/blobdiff_plain/04b77c3671c98373e8a152d9965f2c8f238cb558..a7b32383dbc01227516b29f7c7e022f067888b28:/personal/_posts/2018-07-19-const.md?ds=inline

diff --git a/personal/_posts/2018-07-19-const.md b/personal/_posts/2018-07-19-const.md
index d93a6cd..0a3ad30 100644
--- a/personal/_posts/2018-07-19-const.md
+++ b/personal/_posts/2018-07-19-const.md
@@ -16,7 +16,7 @@ Expect something like a structured brain dump, so there are some unanswered ques
 CTFE is the mechanism used by the compiler, primarily, to evaluate items like `const x: T = ...;`.
 The `...` here is going to be Rust code that must be "run" at compile-time, because it can be used as a constant in the code -- for example, it can be used for array lengths.
-Notice that CTFE is *not* the same as constant propagation: Constant propagation is an optimization pass done by LLVM that will opportunistically change code like `3 + 4` into `7` to avoid run-time work.
+Notice that CTFE is *not* the same as constant propagation: Constant propagation is an optimization pass done by compilers like LLVM that will opportunistically change code like `3 + 4` into `7` to avoid run-time work.
 Being an optimization, constant propagation must, by definition, not change program behavior and will not be observable at all (other than performance).
 CTFE, on the other hand, is about code that *must* be executed at compile-time because the compiler needs to know its result to proceed -- for example, it needs to know the size of an array to compute how to lay out data in memory.
 You can statically see, just from the syntax of the code, whether CTFE applies to some piece of code or not:
@@ -33,9 +33,13 @@ We say that the `3 + 4` above is in *const context* and hence subject to CTFE, b
 Not all operations can be used in const context.
 For example, it makes no sense to compute your array length as "please go read that file from disk and compute something" -- we can't know what will be on the disk when the program actually runs.
-We could use the disk of the machine compiling the program, but that does not sound very appearling either.
-In fact, it would also be grossly unsafe:
+We could use the disk of the machine compiling the program, but that does not sound very appealing either.
+Things get even worse when you consider letting the program send information to the network.
+Clearly, we don't want CTFE to have actually observable side-effects outside of compilation.
+
+In fact, just naively letting programs read files would also be grossly unsafe:
 When computing the length of an array twice, it is important that we obtain the same result.
+**Update:** As @eddyb points out, things get even worse once you consider const generics, traits, and coherence: At that point, you have to [rely on evaluating the same expression in different crates to produce the same result](https://internals.rust-lang.org/t/mir-constant-evaluation/3143/47). **/Update**
 > *CTFE must be deterministic.*
@@ -91,9 +95,9 @@ We will definitely want to allow this code.
 Why should `==` or `%` not be const-safe?
 Well, we could call our function as follows:
 {% highlight rust %}
-is_eight_mod_256(Box::new(0).into_raw() as usize);
+is_eight_mod_256(Box::into_raw(Box::new(0)) as usize);
 {% endhighlight %}
-That statement is certainly *not* const-safe as the result depends on where exactly the allocator puts our `Box`.
+That statement is certainly *not* const-safe as the result depends on where exactly the allocator puts our [`Box`](https://doc.rust-lang.org/stable/std/boxed/struct.Box.html).
 However, we want to blame the `as usize` for this issue, not the `is_eight_mod_256`.
 The solution is for the const type system to not just have separate rules about which operations are allowed, we also must change our notion of which values are "valid" for a given type.
@@ -135,7 +139,8 @@ So, we will likely have to live with either considering floating point operation
 I think it is possible to achieve CTFE correctness for all other operations, and I think we should strive to do so.
 Before we go on, notice that CTFE correctness as defined above does not say anything about the case where CTFE fails with an error, e.g. because of an unsupported operation.
-That is a deliberate choice because it lets us gradually improve the operations supported by CTFE, but it is a choice that not everyone might agree with.
+CTFE would be trivially correct (in the above sense) if it just always immediately returned an error.
+However, since const-safe programs cannot error during CTFE, we know from CTFE correctness that *those* programs *do* in fact behave exactly the same at compile-time and at run-time.
 ## Unsafe Blocks in Const Context
@@ -185,7 +190,7 @@ As usual when writing `unsafe` code, we have to be careful not to violate the sa
 We have to manually ensure that, *if* our inputs are const-valid, then we will not trigger a CTFE error and return a const-valid result.
 For this example, the reason no CTFE error can arise is that references cannot dangle.
 We can thus provide `ptr_eq` as an abstraction that is entirely safe to use in const context, even though it contains a potentially const-unsafe operation.
-This is, again, in perfect analogy with types like `Vec` being entirely safe to use from safe Rust even though `Vec` internally uses plenty of potentially unsafe operations.
+This is, again, in perfect analogy with types like [`Vec`](https://doc.rust-lang.org/stable/std/vec/struct.Vec.html) being entirely safe to use from safe Rust even though `Vec` internally uses plenty of potentially unsafe operations.
 Whenever I said above that some operation must be rejected by the const type system, what that really means is that the operation should be unsafe in const context.
 Even pointer-to-integer casts can be used internally in const-safe code, for example to pack additional bits into the aligned part of a pointer in a perfectly deterministic way.
@@ -211,7 +216,7 @@ Const soundness already says that this is a way to ensure const safety.
 I propose to only ever promote values that are *safely* const-well-typed.
 (So, we will not promote values involving const-unsafe operations even when we are in an unsafe block.)
 When there are function calls, the function must be a safe `const fn` and all arguments, again, const-well-typed.
-For example, `&is_eight_mod_256(13)` would be promoted but `&is_eight_mod_256(Box::new(0).into_raw() as usize)` would not.
+For example, `&is_eight_mod_256(13)` would be promoted but `&is_eight_mod_256(Box::into_raw(Box::new(0)) as usize)` would not.
 As usual for type systems, this is an entirely local analysis that does not look into other functions' bodies.
 Assuming our const type system is sound, the only way we could possibly have a CTFE error from promotion is when there is a safe `const fn` with an unsound `unsafe` block.
@@ -228,8 +233,10 @@ I am not sure which effect that should or will have for promotion.
 ## Conclusion
 I have discussed the notions of CTFE determinism and CTFE correctness (which are properties of a CTFE engine like miri), as well as const safety (property of a piece of code) and const soundness (property of a type system).
+In particular, I propose that *when type-checking safe code in const context, we guarantee that this code is const-safe*, i.e., that it will not hit a CTFE error (though panics are allowed, just like they are in "run-time" Rust code).
+
 There are still plenty of open questions, in particular around the interaction of [`const fn` and traits](https://github.com/rust-lang/rust/issues/24111#issuecomment-311029471), but I hope this terminology is useful when having those discussions.
 Let the type systems guide us :)
-Thanks to @oli-obk for feedback on a draft of this post.
+Thanks to @oli-obk for feedback on a draft of this post, and to @centril for interesting discussion in #rust-lang that triggered me into developing these ideas and terminology.
 If you have feedback or questions, [let's discuss in the internals forum](https://internals.rust-lang.org/t/thoughts-on-compile-time-function-evaluation-and-type-systems/8004)!