It's programming all the way down

The main point of programming is to produce instructions that are executed by a computer [1]. By this definition, you are programming if you are creating or changing the instructions that a computer executes.

Compile-time programming

One thing that's pretty hip these days is type systems. The main thing to know about them is that they can be pretty powerful, in some cases powerful enough to be Turing complete. This is a testament to the fact that Turing completeness is not so difficult to realise. That's the first point.

Observe that Rust's type system does for Rust what static analysis tools like Cppcheck do for C++. More generally, type checking is a (very privileged) form of static analysis. That's the second point.

From these two points we can conclude that type checking and programming are really two sides of the same coin. Type-level programming is just programming. So what's the difference? The time when the code is executed. With type-level programming, the code is executed during compilation of the underlying program (or when running the type checker). With regular programming, the code is executed when the program is run. Type checking and static analysis typically don't modify the code that is executed when a program is run, but compile-time programming that does modify it is common and popular. Macro systems (such as Rust's or the ones found in various Lisps) are a form of compile-time programming, as are C++ templates and the C preprocessor.

This gives us great power because we get to modify the code as it's being compiled, but it also produces difficulties. The compile-time programming language is often a different language than the run-time programming language. Macros can be difficult to learn, and type-level programs often look like something from another world. This is why Zig's comptime keyword is so brilliant. It allows you to execute Zig code (some terms and conditions apply, naturally) while your Zig program is compiling. This neatly sidesteps the multiple language issue, while providing you with the full power that compile-time programming can bring. This also provides a hook for static analyzers to get involved with the compilation process, which I think sounds very exciting.

Run-time programming

You can do run-time programming in languages that have a mechanism to dynamically generate code while the program is running, and execute it. For instance, you can do this in Python with strings and eval. This allows you to tailor the code that is being run to the data in your program. It's a bit less hip these days because it tends to lead to gigantic security holes if you don't do it very carefully.
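As a minimal sketch of the strings-and-eval approach: the code below builds a filter function whose source text depends on data that is only known while the program runs. The name make_row_filter and the sample rows are made up for illustration, and the inputs are assumed to be trusted.

```python
def make_row_filter(column, threshold):
    """Build a filter specialised to one column and one threshold."""
    # The source text is assembled from run-time data. repr() (via !r)
    # turns the values into valid Python literals.
    source = f"lambda row: row[{column!r}] > {threshold!r}"
    # eval turns the string into a real function object.
    # Never do this with input you do not control.
    return eval(source)


keep_expensive = make_row_filter("price", 10.0)
rows = [{"price": 5.0}, {"price": 12.5}]
expensive = [row for row in rows if keep_expensive(row)]
```

The generated function has the column name and threshold baked into its code, which is exactly the "tailoring" described above, and also exactly why untrusted input here is dangerous.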

However, this is how almost all programs get executed. If you write Java code, compiling it turns it into JVM bytecode. This JVM bytecode gets executed by the JVM runtime (nomen est omen). We say "executed", but what that really means is that the JVM runtime converts bytecode into actual computer instructions. That's creating instructions that a computer executes. Run-time programming! Python does something similar: when the interpreter sees code, it first compiles it to Python bytecode, which is then executed on a virtual machine.
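You can watch the Python half of this happen from inside Python itself. The sketch below assumes CPython (other implementations may expose bytecode differently) and uses the built-in compile function and the standard dis module; the variable names are just for illustration.

```python
import dis

source = "total = price * quantity"

# Step 1: compile the source text into a code object.
code = compile(source, "<example>", "exec")

# The raw bytecode is a plain bytes object attached to the code object.
raw_bytecode = code.co_code

# dis renders the same bytecode as a human-readable listing.
listing = dis.Bytecode(code).dis()

# Step 2: the virtual machine executes the compiled code object.
namespace = {"price": 3, "quantity": 4}
exec(code, namespace)
```

Printing listing shows the individual virtual machine instructions; the exact opcodes differ between Python versions.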

A natural question to ask is whether we can move run-time programming to compile-time, and we can (usually). This can be done in Python with tools like Numba or Nuitka, but it is often difficult because the code produced by run-time programming can depend on user input. Compiling happens ahead of time [2], so the compiler does not have access to that input.

This observation immediately leads to another: the process of someone using the program or something providing input to the program is also a form of programming! Even though the program is created in an extremely domain-specific language, hand-crafted for the task at hand, it is programming nonetheless.

Programming-time programming

The problem with both run-time programming and compile-time programming is that it can become very hard to figure out what exactly is going to happen: you never know if some library is going to hook into your code and rewrite all your classes' __init__ methods. And even if you are "only" using macros or comptime, you are prevented from straightforwardly reading the code top-to-bottom to see what is happening (a lot of Object Oriented code is like that regardless, but I digress). Aside from the human element, it also makes static analysis more difficult, meaning your development tools are able to help you less.
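To make the __init__ scenario concrete, here is a small sketch of what such a hook can look like. The decorator patch_init stands in for a hypothetical library; it is not taken from any real package.

```python
import functools


def patch_init(cls):
    """Hypothetical library hook that swaps out a class's __init__."""
    original_init = cls.__init__

    @functools.wraps(original_init)
    def wrapped_init(self, *args, **kwargs):
        original_init(self, *args, **kwargs)
        # Behaviour that appears nowhere in the class body below.
        self.created_by_hook = True

    # The class now runs code its author never wrote.
    cls.__init__ = wrapped_init
    return cls


@patch_init
class User:
    def __init__(self, name):
        self.name = name


user = User("ada")
```

Reading the User class top-to-bottom gives no hint that instances carry a created_by_hook attribute, and a static analyzer looking only at the class body has the same problem.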

An alternative pattern I've taken to using (for instance in Pydantic-SQL-bridge) is to generate the code, and keep that as part of the source for the program. I use the computer to write part of the program for me, leading to what you might term "programming-time programming". To the rest of the system that ensures that my code does things (really, code is a mass of inert text that requires tremendous machinery to effect some real change in the world), it is the same as code that I wrote myself. Any and all static analysis can be brought to bear on it: it's just code.

The tradeoff is that you have to step a bit more carefully around the code that you generate. [3] If you modify it by hand, the generating code becomes useless: you can no longer reproduce the generated code from it, should you need to change the repetitive pattern that made you generate the code in the first place. In practice I find this a worthwhile tradeoff: I work in small teams with responsible adults who know to read the comment at the top of the file that says "GENERATED BY script_x.py DO NOT MODIFY".
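A generating script in this style can be very small. The sketch below is not how Pydantic-SQL-bridge works; it is a hypothetical stand-in for script_x.py that turns a made-up field description into dataclass source code, with the warning comment at the top.

```python
def generate_model_source(class_name, fields):
    """Return Python source for a dataclass with the given fields."""
    lines = [
        "# GENERATED BY script_x.py DO NOT MODIFY",
        "from dataclasses import dataclass",
        "",
        "",
        "@dataclass",
        f"class {class_name}:",
    ]
    for field_name, type_name in fields.items():
        lines.append(f"    {field_name}: {type_name}")
    return "\n".join(lines) + "\n"


# Hypothetical schema; in practice this might come from a database.
source = generate_model_source("Customer", {"id": "int", "name": "str"})

# A real script would write the result to a file kept under version
# control, for example:
#     pathlib.Path("customer_model.py").write_text(source)
```

Once written to disk, the output is an ordinary Python file that type checkers and linters can read like any other.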

Why is this relevant?

It's programming all the way down. Writing code, generating boilerplate in your IDE, modifying code at compile-time, executing the code at run-time, and using the program are all ways to create or modify instructions that the computer executes. This means you should be applying your intuitions about one of these to all the others. Some concrete advice:

That's it! I hope this helps you think about your programs in a new way, and makes you ask new questions of your development process. Here's a diagram to finish it up.

[Diagram of the points discussed in the blog post]

Footnotes

  1. If the code doesn't execute, it cannot affect the world and it is useless. Therefore the single most important property of a program is that it should execute correctly.
  2. Let's not get into the weeds with JIT compilers here.
  3. This is an example of one of the two hard things in computer science, cache invalidation. We are caching the generated code by writing it to a file on disk. We could generate it on the fly or at compile-time and make sure the generated code always stays in sync with the generating process. But then we cannot easily use static analyzers on the generated code, so it's a tradeoff.