C as a target

Saturday, December 1st, 2007

Much of the code I write must be in C. I’ve decided C needs a metalanguage. The preprocessor just doesn’t cut it. I want a full language. I don’t care if the C preprocessor is Turing complete; it simply doesn’t suit my needs.

Lately, this has resulted in me spending time developing a code generator that spits out C based on a domain-specific description. The generator has worked far better than I expected. It shields me from writing the many safety checks required when doing C buffer arithmetic. It also writes the code necessary to sanitize parameters and adds any extra safety checks I’ve decided are necessary for certain types of data. Apart from carrying the burden of writing the tedious and error-prone safety code, the generator abstracts certain implementation decisions I’ve had to make along the way. Since the generator is designed to make these decisions easy to swap, comparing two methods or, more likely, accommodating a customer’s last-minute change becomes easy.

Additionally, the current generator can emit both C and Python. The Python code accomplishes the same task and can be used to verify other parts of our system. Finally, pre-computing pieces of data (used to boost performance) is safe in this scheme: there is no need to worry about recomputing a table, length, or offset by hand every time x, or anything x depends on, changes. None of this is a new idea. Domain-specific languages began receiving attention in the late 90s. I am just now starting to see their benefit.
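To make the idea concrete, here is a minimal sketch in Python of what such a generator might look like. It is not the actual generator, which isn't shown in this post. The `Field` description, the `layout` helper, and the `read_<name>` functions it emits are all hypothetical names chosen for illustration. It assumes a record is just a sequence of fixed-size byte fields. The offsets and total length are computed once from the description, so they can't drift out of sync with the emitted checks, and the same description drives both a C and a Python backend.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Field:
    """One entry of the (hypothetical) domain-specific description."""
    name: str
    size: int  # size in bytes


def layout(fields):
    """Precompute (field, offset) pairs and the total record length."""
    offsets = []
    offset = 0
    for f in fields:
        offsets.append((f, offset))
        offset += f.size
    return offsets, offset


def emit_c_reader(struct_name, fields):
    """Emit C source that copies each field out of a buffer after checking its length."""
    offsets, total = layout(fields)
    lines = [
        "#include <stddef.h>",
        "#include <string.h>",
        "",
        f"struct {struct_name} {{",
    ]
    lines += [f"    unsigned char {f.name}[{f.size}];" for f in fields]
    lines += [
        "};",
        "",
        f"int read_{struct_name}(const unsigned char *buf, size_t len, struct {struct_name} *out)",
        "{",
        "    if (buf == NULL || out == NULL)",
        "        return -1;",
        f"    if (len < {total})",  # generated bounds check
        "        return -1;",
    ]
    lines += [f"    memcpy(out->{f.name}, buf + {off}, {f.size});" for f, off in offsets]
    lines += ["    return 0;", "}"]
    return "\n".join(lines) + "\n"


def emit_py_reader(struct_name, fields):
    """Emit Python source for the same task, driven by the same description."""
    offsets, total = layout(fields)
    lines = [
        f"def read_{struct_name}(buf):",
        f"    if len(buf) < {total}:",
        "        raise ValueError('buffer too short')",
        "    return {",
    ]
    lines += [f"        {f.name!r}: bytes(buf[{off}:{off + f.size}])," for f, off in offsets]
    lines += ["    }"]
    return "\n".join(lines) + "\n"
```

Calling `emit_c_reader("msg", [Field("tag", 2), Field("payload", 6)])` returns a string of C source. Changing a field size in the description changes every dependent offset and length check on the next run.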

Much of the design of my generator (for lack of a better word) is inspired by the code I’ve been writing while working through Simon Peyton Jones and David Lester’s Implementing Functional Languages: A Tutorial. The book has been great. It’s the most Haskell I’ve written, and I’m really enjoying it. The combination of the iterative design and the exercises with expanding scope makes it easy to learn. Even the model for the pretty printer was useful for the C code generation. The patterns (I hate that word in this context) used are easy to recognize and I had no problem implementing tweaked versions in Python.
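For readers who haven't seen the book, a pretty printer in that style builds output from a small document type and flattens it in a single pass at the end. The sketch below is a loose Python rendering of that idea, not the book's code and not the version used in the generator. The class names (`Nil`, `Str`, `Append`, `Newline`, `Indent`) and the helpers `concat` and `c_block` are made up for this example. It also simplifies indentation to a fixed step per nesting level.

```python
from dataclasses import dataclass
from functools import reduce


class Doc:
    """Base class for pretty-printer documents."""


@dataclass(frozen=True)
class Nil(Doc):
    pass


@dataclass(frozen=True)
class Str(Doc):
    text: str


@dataclass(frozen=True)
class Append(Doc):
    left: Doc
    right: Doc


@dataclass(frozen=True)
class Newline(Doc):
    pass


@dataclass(frozen=True)
class Indent(Doc):
    body: Doc


def concat(docs):
    """Join any iterable of documents, left to right."""
    return reduce(Append, docs, Nil())


def display(doc, step=4):
    """Flatten a document to a string using a work stack of (doc, indent) pairs."""
    out = []
    work = [(doc, 0)]
    while work:
        d, ind = work.pop()
        if isinstance(d, Str):
            out.append(d.text)
        elif isinstance(d, Newline):
            out.append("\n" + " " * ind)  # a newline carries the current indentation
        elif isinstance(d, Append):
            work.append((d.right, ind))   # pushed first so the left side is emitted first
            work.append((d.left, ind))
        elif isinstance(d, Indent):
            work.append((d.body, ind + step))
        # Nil contributes nothing
    return "".join(out)


def c_block(header, statements):
    """Build a braced C block whose statements are indented one level."""
    body = concat(Append(Newline(), Str(s)) for s in statements)
    return concat([Str(header + " {"), Indent(body), Newline(), Str("}")])
```

The appeal for code generation is that the code building a block never has to know how deeply it will end up nested. Indentation is resolved only when `display` walks the finished document.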

One last thing: the map/reduce combination is useful. It made much of my Python easy to write and understand.
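As a small illustration of why the combination reads well in a generator, the sketch below maps each parameter description to a C range check and then reduces the checks into a single guard. The parameter tuples and the function names `range_check`, `any_of`, and `emit_guard` are hypothetical. It assumes the parameters are signed integers and that the list is non-empty (`reduce` without an initial value raises `TypeError` on an empty sequence).

```python
from functools import reduce

# Hypothetical parameter descriptions: (name, lowest allowed, highest allowed).
params = [("offset", 0, 4096), ("count", 1, 64)]


def range_check(param):
    """Map step: turn one description into a C expression that is true when out of range."""
    name, lo, hi = param
    return f"({name} < {lo} || {name} > {hi})"


def any_of(left, right):
    """Reduce step: combine two C conditions with logical OR."""
    return f"{left} || {right}"


def emit_guard(params):
    """Emit one C guard that rejects the call if any parameter is out of range."""
    condition = reduce(any_of, map(range_check, params))
    return f"if ({condition})\n    return -1;"
```

Each step is a one-line function with an obvious job, so adding a new kind of check usually means writing one more small function to map over.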

Notes for next time: why doesn’t he like a direct tree grammar?

Remember to: Add references to all the ML compiled to C papers I’ve been reading.