I have some very exciting news to share: the “Writing a C Compiler” series is now a book!
Writing a C Compiler: Build a Real Programming Language from Scratch is coming out from No Starch Press next January. You can preorder at the link to get early access to the first few chapters.
In the last post in the series, I said that I was going to take a six-month break to figure out how to finish the compiler. Instead, I took a three-year break, reworked the backend, implemented the rest of the features I wanted to add (well, most of them), and wrote a book. If you were already following the series, you can jump to “What if I’ve already done the series?” below to learn what’s changed. Otherwise, read on for an elevator pitch!
What’s the deal with this book?
Writing a C Compiler is a hands-on guide to, well, writing your own C compiler. It takes the same basic approach as the series of blog posts I published here a few years ago. You start out by compiling the tiniest possible C program to x64 assembly, then add a new feature in each chapter. This book is all about compiling a real, widely used programming language into real assembly code, with all the low-level details and ugly edge cases that entails.
At the same time, I wanted to write this book for a broad audience, not just people who already know assembly code or have the C standard memorized. So I’ve tried to lay out the whole process, ugly edge cases included, in a way that’s accessible, easy to follow, and maybe even fun. The implementation code in the book is all pseudocode, so you can implement your compiler in whatever language you want!
The book has three parts:
- Part I introduces the basics, like expressions, variables, control flow statements, and function calls.
- Part II adds more types, including floating-point numbers, arrays and pointers, and structs.
- Part III covers a few classic optimizations, like constant folding, dead code elimination, and register allocation.
I didn’t include every feature in the C standard, but I wanted the end result to feel complete. I’ve also tried to cover the fundamentals that you’ll need to know if you want to keep building out new features on your own.
What if I’ve already done the series?
When I started working on the book, I thought that I’d just be building on the existing series. But the implementation in the book quickly diverged from what I’d originally posted. The most obvious problem was that the original design produced 32-bit x86 assembly, which was already becoming obsolete when I first started the project back in 2017.
The other problem was that I needed a new intermediate representation. Converting the AST directly to assembly worked well for the first few chapters, but got more and more unwieldy as the project went on. I knew that things would only get worse as I started to add new types, and optimizations were going to be really difficult. The new implementation converts the program to three-address code before it generates assembly.
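To make the idea of three-address code concrete, here is a minimal Python sketch of lowering a tiny expression AST into a flat list of instructions, each with at most one operator and a temporary destination. The names here (`Constant`, `Binary`, `lower`, `to_tac`, the `tmp.N` naming) are made up for illustration; this is not the IR or the pseudocode from the book, just one plausible shape for this kind of pass.

```python
from dataclasses import dataclass
from itertools import count
from typing import Union


# Hypothetical AST nodes for illustration only.
@dataclass
class Constant:
    value: int


@dataclass
class Binary:
    op: str
    left: "Expr"
    right: "Expr"


Expr = Union[Constant, Binary]


def lower(expr, instructions, temps):
    """Append instructions that compute expr; return the operand holding the result."""
    if isinstance(expr, Constant):
        # Constants need no instruction; they can be used directly as operands.
        return str(expr.value)
    # Lower both operands first, so each instruction only refers to
    # constants or temporaries that have already been computed.
    left = lower(expr.left, instructions, temps)
    right = lower(expr.right, instructions, temps)
    dst = f"tmp.{next(temps)}"
    instructions.append(f"{dst} = {left} {expr.op} {right}")
    return dst


def to_tac(expr):
    """Lower a whole expression and return its result, as a list of strings."""
    instructions = []
    result = lower(expr, instructions, count())
    instructions.append(f"return {result}")
    return instructions


# The expression 1 + 2 * 3
example = Binary("+", Constant(1), Binary("*", Constant(2), Constant(3)))
for line in to_tac(example):
    print(line)
```

Because nested expressions are flattened into a sequence of simple steps, a later pass can turn each step into a handful of assembly instructions, or rewrite the steps during optimization, without walking a tree.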
The upshot is that I won’t be continuing the series on this blog. The good news, of course, is that you can finish your compiler by working through the book, which covers a lot more ground! The bad news is that you won’t be able to skip straight to Part II; you’ll have to bring your backend in line with the implementation described in Part I first. Hopefully, the payoff of finishing your compiler will be well worth the extra work!