The US government's view on memory safe code is not the whole story
There’s a lot of noise about how “Joe Biden’s government wants you to move away from using C++”. This is a result of the US government releasing a report, Back to the Building Blocks: A Path Toward Secure and Measurable Software, which suggests:
“Memory safety vulnerabilities are a class of vulnerability affecting how memory can be accessed, written, allocated, or deallocated in unintended ways. Experts have identified a few programming languages that both lack traits associated with memory safety and also have high proliferation across critical systems, such as C and C++. Choosing to use memory safe programming languages at the outset, as recommended by the Cybersecurity and Infrastructure Security Agency’s (CISA) Open-Source Software Security Roadmap is one example of developing software in a secure-by-design manner.”
The technical press seems to have focused on the message that “you shouldn’t use C and C++ anymore” and should instead be using Rust, Go, C# or something else. This is likely a good idea, where it works, and many of the places that I’ve worked in over the years moved away from C and C++ to Java and C# back in the 90s. If nothing else, it’s often easier and faster to produce software this way. The problem is that often that’s all that happens, and some people think that’s enough. It’s not.
It’s worth reading the report and the roadmap; they’re not that long, and they’re sensible stuff that covers far more than just stopping writing code in C++. What these people are saying is important and phrased in a way that shows they understand the problem. The idea of Cybersecurity Quality Metrics for software components is interesting, though likely a long way off. It hints at the problem of knowing the provenance of all the components that you use and how each one could be an attack vector.

This is one of the things that bugs me a little about the Rust ecosystem. It’s SO easy to just pull in a crate to do “the thing you need” that you soon find that even the smallest Rust program pulls in tons of extra code that you will never so much as look at, let alone audit. All of this code comes from places that are, by and large, insecure. How long will it be before a “bad actor” causes chaos by adjusting a core dependency? I’m not blaming Rust here; it’s just that modern package management is often a black box of shortcuts: code that you don’t need to write or think about, but that ends up in your system.
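As a rough illustration (a hypothetical project, nothing taken from the report): add one perfectly ordinary dependency and see what comes with it.

```toml
# Cargo.toml for a hypothetical, near-empty project
[package]
name = "tiny-example"
version = "0.1.0"
edition = "2021"

[dependencies]
# One direct dependency, chosen deliberately...
reqwest = "0.11"
```

Run `cargo tree` (built into Cargo) and the whole transitive graph scrolls past: dozens of crates you never chose, written by people you’ve never heard of, each one a potential way in. Tools like `cargo audit` from the RustSec project exist precisely because nobody reviews all of that by hand.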
Formal proofs tested by the compiler sound wonderful, as does new hardware that protects us from memory bugs, but we have some of this already, and none of it is a panacea.
It’s a pity that the report gives such short shrift to testing. I would personally increase the importance placed on unit testing at various sizes of “unit”. Again, it’s not a panacea, but it’s something that should be stressed. The problem is that testing is harder, and less glamorous, than a move to a new “safe” language.
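As a tiny example of what I mean, the test below is dull, cheap to write, and exactly where off-by-one and “input too short” mistakes get caught. It’s a minimal Rust sketch; `read_u16_le` is a hypothetical helper, not code from the report or from any real project.

```rust
/// Read a little-endian u16 length field from the start of a buffer,
/// refusing to read past the end of short input.
fn read_u16_le(buf: &[u8]) -> Option<u16> {
    if buf.len() < 2 {
        return None;
    }
    Some(u16::from_le_bytes([buf[0], buf[1]]))
}

#[cfg(test)]
mod tests {
    use super::*;

    #[test]
    fn too_short_input_is_rejected_not_read_past() {
        assert_eq!(read_u16_le(&[]), None);
        assert_eq!(read_u16_le(&[0x01]), None);
    }

    #[test]
    fn two_bytes_decode_as_little_endian() {
        assert_eq!(read_u16_le(&[0x34, 0x12]), Some(0x1234));
    }
}
```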
But back to memory-safe languages… As with garbage collection, I have my doubts. Yes, it helps. No, it’s not enough. The problem is that, at least with the popular languages, there’s always the option of just a bit of “unsafe” code when you really need it…
This is mentioned in the Cybersecurity and Infrastructure Security Agency’s (CISA) Open-Source Software Security Roadmap:
“Even with a memory safe language, memory management is not entirely memory safe. Most memory safe languages recognize that software sometimes needs to perform an unsafe memory management function to accomplish certain tasks. As a result, classes or functions are available that are recognized as non-memory safe and allow the programmer to perform a potentially unsafe memory management task. Some languages require anything memory unsafe to be explicitly annotated as such to make the programmer and any reviewers of the program aware that it is unsafe. Memory safe languages can also use libraries written in non-memory safe languages and thus can contain unsafe memory functionality. Although these ways of including memory unsafe mechanisms subvert the inherent memory safety, they help to localize where memory problems could exist, allowing for extra scrutiny on those sections of code.”
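To make that concrete, here’s a minimal Rust sketch of the escape hatch the roadmap describes: safe code calling into a library written in a non-memory-safe language through an `unsafe` block. The `legacy_checksum` function is a hypothetical placeholder for whatever native routine you depend on, and the sketch assumes that library is actually linked into the build.

```rust
use std::os::raw::{c_uchar, c_ulong};

extern "C" {
    // Declared here, implemented in C somewhere else; the compiler has to
    // take our word for it that the signature and the behaviour match.
    fn legacy_checksum(data: *const c_uchar, len: c_ulong) -> c_ulong;
}

// A safe wrapper. The `unsafe` block localises the trust boundary, which is
// exactly the “extra scrutiny on those sections of code” the roadmap mentions.
pub fn checksum(data: &[u8]) -> u64 {
    unsafe { legacy_checksum(data.as_ptr(), data.len() as c_ulong) as u64 }
}
```

The point isn’t that FFI is bad; it’s that the compiler can no longer vouch for anything that happens on the other side of that call.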
Of course, you can ban unsafe code in these languages, but that requires more knowledge from the people in charge, and often that’s where the real problem lies anyway. So, switching to a memory-safe language isn’t enough on its own; you also need:
- a desire to produce secure software
- control over when it’s OK to use “unsafe” code
- code reviews
- unit testing
- adequate training and skill levels
- reasonable timescales and expectations (so that corners don’t need to be cut)
And finally, accountability all the way up the management chain.
If you get all of these but not a memory-safe language, then the need for one is lessened. If all you get is a memory-safe language then, IMHO, you’re not really much better off.
This is all especially pertinent for me today, as I’ve just located a classic buffer overrun bug in a client’s code: written in a memory-safe language, but using an ‘unsafe’ construct. The bug has been present for a little over a year. It was always very intermittent, recently manifesting as crashes during managed garbage collection. The memory-safe language was never a suspect. We have a system where the networking is done by native C++ code that hosts the memory-safe execution environment, and so everyone assumed that the C++ code was where the bug had to be…
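For illustration only (this is a hypothetical Rust sketch, not the client’s code), this is the shape of bug that an ‘unsafe’ construct lets back in: a length that arrives over the network, gets trusted, and is used to copy into a fixed-size buffer.

```rust
use std::ptr;

/// Hypothetical: copy a network payload into a fixed-size, stack-allocated buffer.
fn store_packet(payload: &[u8]) -> [u8; 64] {
    let mut buffer = [0u8; 64];
    // BUG: nothing checks payload.len() against the buffer size. A payload
    // longer than 64 bytes writes past the end of `buffer`, silently
    // corrupting adjacent memory; the crash shows up much later, somewhere
    // entirely unrelated, and nobody suspects this function.
    unsafe {
        ptr::copy_nonoverlapping(payload.as_ptr(), buffer.as_mut_ptr(), payload.len());
    }
    buffer
}
```

The safe spelling, `buffer[..payload.len()].copy_from_slice(payload)`, would have panicked loudly on the first oversized packet instead of corrupting memory and hiding for a year.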