In 2002, a group of researchers presented a paper on Cyclone, a safe dialect of C [1]. While (manually) porting code from C to Cyclone, they found safety bugs in the C code.
These kinds of manual or automated conversions from C to <safer language> therefore have the potential not only to increase adoption of safer languages but also to uncover existing bugs.
[1] https://www.researchgate.net/profile/James-Cheney-2/publicat...
I've ported some projects to Rust (including from C, where I've used C2Rust as a first step), and I've drawn some conclusions.
1. Converting a C program to Rust, even if it includes unsafe code, often uncovers bugs quickly thanks to Rust’s stringent constraints (bounds checking, strict signatures, etc.).
2. Automated C to Rust conversion is IMO something that will never be solved entirely, because the design of a C program is fundamentally different from that of a Rust program; such conversions require a significant redesign to be made safe (of course, not all C programs are the same).
3. In some cases, it’s plain impossible to port a program from C to Rust while preserving the exact semantics, because unsafety can be inherent in the design.
That said, tooling is essential to porting, and as tools continue to evolve, the process will become more streamlined.
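To illustrate point 1, here is a minimal sketch of how bounds checking surfaces an off-by-one that C would silently tolerate. The function `sum_first` and the scenario (a C loop that read one element too many) are made up for illustration, not taken from any real port.

```rust
/// Hypothetical port of a C loop that summed `count` elements of a buffer,
/// where the caller sometimes passed a `count` one larger than the buffer.
/// Returns None instead of reading out of bounds (or on arithmetic overflow).
fn sum_first(buf: &[u32], count: usize) -> Option<u32> {
    let mut total: u32 = 0;
    for i in 0..count {
        // `get` returns None for an out-of-range index; plain `buf[i]`
        // would panic at run time instead of reading past the end.
        let value = *buf.get(i)?;
        total = total.checked_add(value)?;
    }
    Some(total)
}
```

Calling `sum_first(&[1, 2, 3], 4)` yields `None`, whereas the equivalent C loop would read whatever happens to sit after the array.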
Author here, I thought it'd be helpful to address a few of the points brought up in the various comment threads.
1. This is an academic paper that we posted on arxiv, not a release announcement for a new product where we claim we have solved C to Rust. We submitted to a PL conference, not an open-source meeting like e.g. FOSDEM -- this is not the same audience at all, and the expectations are very different.
2. Our story is simple. We start from the constraint of translating C to /safe/ Rust, and see what this entails: a small well-behaved subset of C, inference of slice splitting, a translation that may error out, and a program that may abort (plus a few other things described in the paper). We evaluate our ideas on what we have (C embedded in F*), and show that it scales decently with those constraints in mind, on a large-scale C library that is used in Firefox, Python, and many other pieces of mainstream software. We don't claim we can rewrite e.g. Firefox in Rust automatically.
3. This is how research works. We think we have an interesting point in the design space; we don't claim we solve every issue, but think this is an interesting idea that may unlock further progress in the space of C to Rust translation, and we think it's worth putting out there for others to take inspiration from. Who knows, maybe some existing tool will use this approach for parts that fit in the subset, and fall back to unsafe Rust for other parts that don't fit! This is a very active area: if we can contribute something that other tools / researchers can use, great.
4. This is not the final story, and again this is how research works. We are working on an actual C frontend via libclang, and are exploring how to guarantee, e.g., that the generated Rust does not perform out-of-bounds accesses, perhaps by emitting verification conditions to Z3 (speculating on future work here). If the reviewers think more work is needed, that's fine, and we'll resubmit with enhancements. If the reviewers think this is an active area and others could benefit from our ideas, and take the paper, even better.
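For readers wondering what "slice splitting" and "a program that may abort" in point 2 look like in practice, here is a rough hand-written sketch. It is not output from the tool described in the paper; `xor_halves` is a hypothetical function. In C, the same routine would typically hold two pointers into one buffer (`buf` and `buf + mid`), which safe Rust does not allow as two live mutable borrows.

```rust
/// Hypothetical example: XOR the bytes after `mid` into the bytes before it.
/// The C version would use two pointers into the same buffer.
fn xor_halves(buf: &mut [u8], mid: usize) {
    // `split_at_mut` hands back two non-overlapping slices, so the borrow
    // checker accepts mutating one while reading the other.
    // It panics if `mid > buf.len()`, which is the "may abort" trade-off.
    let (lo, hi) = buf.split_at_mut(mid);
    for (a, b) in lo.iter_mut().zip(hi.iter()) {
        *a ^= *b;
    }
}
```

The interesting part for a translator is working out where such a split has to be inserted, since the C source never states it explicitly.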
The thing I wonder about is why we would do this. The technology to really convert industrial-grade apps from C to Rust could probably bullet proof the C apps more easily. They’d just have to do some analyses that fed into existing tooling, like static analyzers and test generators.
Similarly, they might generate safe wrappers that let teams write new code in Rust side by side with the field-proven C. New code has the full benefits, old code is proven safe, and the interfaces are safer.
A full-on translator might be an ideal option. We’d want one language for the codebase in the future. Push-button safety with a low false-positive rate for existing C and C++ is still the greatest need, though. Maybe auto-correcting bad structure right in the C, too, like Google’s compiler tool and ForAllSecure’s Mayhem do.
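A minimal sketch of the safe-wrapper idea, under some loud assumptions: `legacy_max` is a stand-in written in Rust so the example is self-contained, whereas in a real project it would be declared in an `extern "C"` block and linked from the existing C library. Both function names are hypothetical.

```rust
/// Stand-in for a field-proven C routine (assumption: really lives in C).
/// Safety contract: `ptr` must point to `len` readable `i32` values and
/// `len` must be at least 1.
unsafe extern "C" fn legacy_max(ptr: *const i32, len: usize) -> i32 {
    let mut best = unsafe { *ptr };
    for i in 1..len {
        let v = unsafe { *ptr.add(i) };
        if v > best {
            best = v;
        }
    }
    best
}

/// Safe wrapper: the slice type guarantees pointer and length agree, and the
/// empty case is rejected before crossing into the legacy code.
fn max_of(values: &[i32]) -> Option<i32> {
    if values.is_empty() {
        return None;
    }
    // SAFETY: `values` is non-empty and valid for `values.len()` reads.
    Some(unsafe { legacy_max(values.as_ptr(), values.len()) })
}
```

New Rust code calls only `max_of`, so the unsafe surface that needs auditing shrinks to the wrapper itself.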
If you used a naïve translation to Rust, wouldn’t you get parts that are safe and parts that are unsafe? So your manual job would need to be only verifying safety in the unsafe regions (same as when writing rust to begin with)?
Seems it would be a win even if the unsafe portion is quite large. Obviously not if it’s 90% of the end result.
Compiling a tiny subset of C, that is. It might be so tiny as to be useless in practice.
I have low hopes for this kind of approach; it’s sure to hit the limits of what’s possible with static analysis of C code. Also, choosing Rust as the target makes the problem unnecessarily hard because Rust’s ownership model is so foreign to how real C programs work.
I wonder how this compares to the zig-to-C translate function.
Zig seems to be awesome at creating mixed environs of zig for new code and C for old, and translating or interop, plus being a C compiler.
There must be some very good reasons why Linux kernel maintainers aren't looking to zig as a C replacement rather than Rust.
I don't know enough to even speculate, so I would appreciate those with more knowledge and experience weighing in.
Can something like `C2Rust` then use this to generate formally correct code?
Also, was much of what the authors did manual, or was it run through something to produce the Rust code? If so, where is the code that generates Rust? I do not see any links to any source repos.
I wonder: if a C library is working (i.e. it is not formally proven to be free of problems, but works in most ways), why shouldn't we translate it using unsafe Rust? I would say there is value in it, as Rust generally lacks libraries. And this would not be different from using a dll/so that was written in C and can be unsafe in some circumstances, after all.
Interesting how higher optimization levels didn’t really help speed up rust in the O level comparison
Interesting concept. But for a working system in C, why do we need to "convert" it to Rust? Seems like an effort where the juice isn't worth the squeeze. Probably will create more problems than we're fixing.
I wonder how well O3 can do just compiling C to rust in one shot
c2rust.com, but it uses things like libc::c_int
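To give a feel for the difference, here is a hand-written approximation of what a mechanical translation of `int sum(const int *xs, int n)` tends to look like, next to an idiomatic rewrite. This is not actual C2Rust output, and it uses `std::ffi::c_int` rather than `libc::c_int` only to avoid an external crate; the sketch assumes a mainstream platform where `c_int` is `i32`.

```rust
use std::ffi::c_int;

/// Approximation of a mechanically translated C function: raw pointers,
/// C integer types, and a C-style loop, all behind `unsafe`.
/// Safety contract: `xs` must point to at least `n` readable values.
unsafe fn sum_raw(xs: *const c_int, n: c_int) -> c_int {
    let mut total: c_int = 0;
    let mut i: c_int = 0;
    while i < n {
        total += unsafe { *xs.offset(i as isize) };
        i += 1;
    }
    total
}

/// Idiomatic rewrite: the slice carries its own length, so no unsafe is needed.
fn sum(xs: &[i32]) -> i32 {
    xs.iter().sum()
}
```

Getting from the first form to the second is the part that still needs a human or a smarter tool.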
Ugh. They didn't compile any C to Rust. They modified the F*-to-C compiler to emit Rust instead. So they compiled F* to safe Rust. And they couldn't even do that 100% reliably; some valid F* constructs couldn't be translated into Rust properly. They could either translate it into Rust code that wouldn't compile, or translate it into similar-looking Rust code that would compile, but would produce incorrect results.
Flagged, this is just a lie of a title.
Note that this is done for “existing formally verified C codebases” which is a lot different from typical systems C code which is not formally verified.