X-Git-Url: https://git.ralfj.de/web.git/blobdiff_plain/41115b0943afc6292b7fe0e205f79ec97e95b479..7799968779ac037b9200b42448ea81bbfa8974fa:/personal/_posts/2026-03-13-inline-asm.md?ds=sidebyside diff --git a/personal/_posts/2026-03-13-inline-asm.md b/personal/_posts/2026-03-13-inline-asm.md index f577a1d..09ffd36 100644 --- a/personal/_posts/2026-03-13-inline-asm.md +++ b/personal/_posts/2026-03-13-inline-asm.md @@ -204,7 +204,7 @@ Therefore, one can write a Rust program that seems to be data-race-free (accordi In other words, rule 3 (the inline asm block must refine the story code) is violated. The principled fix for this is to extend the C++ memory model (which is shared by Rust) with a notion of non-temporal stores so that one can reason about how they interact with everything else that can go on in a concurrent program. -I am not aware of anyone having actually done this---extending or even just slightly adjusting the C++ memory model is an [enormous undertaking](https://plv.mpi-sws.org/scfix/paper.pdf). +This is [possible](https://people.mpi-sws.org/~viktor/papers/oopsla2024-inline.pdf), but it requires re-proving compiler correctness results, and at least the approach taken in that paper is architecture-specific which does not scale to the large number of architectures Rust supports. However, there is a simpler alternative: we can try coming up with a more complicated story such that rule 3 is not violated. This is exactly what a bunch of folks did when the issue around non-temporal stores was discovered. The story says that doing a non-temporal store corresponds to *spawning a thread* that will asynchronously perform the actual store,