Zig + STM32F4Discovery = Blink

Posted on Apr 13, 2021

I recently discovered the Zig programming language thanks to this post, which first caught my attention because of its mechanical keyboard background. I liked Zig’s aim to be a “better C”, as opposed to other languages (e.g. Rust) that lean more towards “better C++”, so I decided to give Zig a try and use it for something I would usually use C for: programming a microcontroller.

This post will walk through the example, discussing some of the choices I’ve made and the tools I used. You can find the final source code here; I suggest you go through it while reading the post. I won’t provide super-detailed steps to build the example from the ground up, but you can look at git log to follow all my attempts (including failures).

Project init

The post assumes you have already downloaded and installed Zig. I used the dev version of the language (specifically version 0.8.0-dev.1509+b54514d9d) but everything should work using the stable version (0.7.1).

You can create a project directory and initialize it for the generation of an executable with these commands:

mkdir zig-stm32-blink
cd zig-stm32-blink
zig init-exe

This creates the basic structure of the project, and running zig build run should greet you with:

info: All your codebase are belong to us.

The next step is setting up all the stuff we need for cross-compilation, which fortunately with Zig is not a lot.

Setting up cross-compilation

Zig’s cross-compilation experience really pleased me. You can list all supported targets with zig targets and for most targets you just have to run zig build-exe -target <target-triple> to cross-compile. Since we are building for a specific target, we can just define a fixed target in build.zig.

The STM32F4 Discovery uses an STM32F407VG, which is an ARM Cortex-M4 CPU, so cpu_arch will be thumb and cpu_model will target the cortex_m4 CPU. The code will run as bare metal, with no OS involved, so os_tag will be freestanding.

Update 04/06/2021: I discovered that the correct cpu_arch to use here is thumb and not arm, since Cortex-M CPUs only support Thumb (and Thumb-2) instructions (see here).

The last choice is the ABI (Application Binary Interface). We’re running on bare metal with no libc, so we can choose between eabi (i.e. “soft-float”) and eabihf (i.e. “hard-float”). Since the STM32F407VG has a hardware Floating Point Unit and we will enable it in the system initialization, we will go for eabihf.

Putting this all together leads to this target definition in build.zig:

const target = .{
    .cpu_arch = .thumb,
    .cpu_model = .{ .explicit = &std.Target.arm.cpu.cortex_m4 },
    .os_tag = .freestanding,
    .abi = .eabihf,
};

Now that we can build for the target we need some more pieces to produce an executable that can be run on our bare metal board.

Linker script

The linker script is a file that contains a definition of the memories available in the device (e.g. RAM, ROM, etc) and tells the linker how to place the object code in them. It also provides other information, like the name of the entry point function, and it can export symbols marking memory sections, which can be used from Zig (or C) code to perform the device initialization.

For my example, I took the linker script generated by libopencm3, which takes this generic one and adds the correct memory definitions for the CPU. I then changed ENTRY to match my resetHandler name and added some other stuff to provide default exception handlers (more on that in the next section).

If you want to go deeper down the linker script rabbit hole, I suggest “The Most Thoroughly Commented Linker Script in The World”.

To tell Zig to use the linker script while building the executable, we’ll call setLinkerScriptPath in build.zig, passing the path of the linker script.
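As a rough sketch, a build.zig that combines the fixed target with the linker script could look like the following. It is written against the std.build API as it existed around Zig 0.7.1 / 0.8.0-dev (later releases changed this API considerably). The paths src/startup.zig and src/linker.ld, as well as the output name, are assumptions for illustration and not necessarily the ones used in the repo.

```zig
// build.zig: sketch for the std.build API of Zig 0.7.1 / 0.8.0-dev.
const std = @import("std");

pub fn build(b: *std.build.Builder) void {
    // Fixed target: Cortex-M4, bare metal, hard-float ABI.
    const target = .{
        .cpu_arch = .thumb,
        .cpu_model = .{ .explicit = &std.Target.arm.cpu.cortex_m4 },
        .os_tag = .freestanding,
        .abi = .eabihf,
    };

    // Lets the user pick Debug/ReleaseSafe/ReleaseFast/ReleaseSmall.
    const mode = b.standardReleaseOptions();

    // Assumed file layout: the root source file holds the reset handler.
    const elf = b.addExecutable("zig-stm32-blink.elf", "src/startup.zig");
    elf.setTarget(target);
    elf.setBuildMode(mode);

    // Use our own memory layout instead of the default one.
    elf.setLinkerScriptPath("src/linker.ld");

    elf.install();
}
```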

Vector table

The vector table is a data structure containing addresses to functions that get executed when an exception is triggered by the CPU or by an external event. Exceptions caused by external events are usually called interrupts, while exception is generally used for exceptions triggered by the CPU itself (e.g. an illegal instruction or a division by zero).

In the repo I added the bare minimum to make the CPU happy: I implemented a resetHandler, while pointing to (overridable) empty handlers for the other exceptions and ignoring all other interrupts.

It’s possible to see the structure of the STM32F407 vector table in Section 2.3.4 of the STM32 Cortex M4 Programming manual. The vector table starts at the address 0x00000000, which is why the linker script emits the .vectors section as the first section in the ROM. The entry at 0x00000000 is the initial stack pointer, and after that, there are 15 32-bit words representing pointers to the handlers (with some reserved space in between).

To implement this with Zig in vector.zig we export an array of optional function pointers with C calling convention, targeting the .vectors linksection. The stack pointer symbol is exported by the linker script and we pretend it’s a function pointer to be able to put it in the array.

All handlers are extern and the linker script exports weakly linked symbols that point to either a blockingHandler (for fault handlers) or a nullHandler (for system handlers).
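A minimal sketch of such a table, in the syntax of Zig 0.7.1 / 0.8.0-dev (the way function pointer types are written changed in later releases), might look like this. Apart from resetHandler, the handler names and the _stack symbol name are assumptions. They only need to match whatever the linker script actually exports.

```zig
// vector.zig: sketch of a Cortex-M4 vector table (system exceptions only).

// Exported by the linker script; it is really an address, not a function,
// but declaring it as one lets it share the array with the handlers.
extern fn _stack() callconv(.C) void;

// Defined in the startup code.
extern fn resetHandler() callconv(.C) void;

// Hypothetical names: weak defaults are assumed to come from the linker script.
extern fn nmiHandler() callconv(.C) void;
extern fn hardFaultHandler() callconv(.C) void;
extern fn memoryManagementFaultHandler() callconv(.C) void;
extern fn busFaultHandler() callconv(.C) void;
extern fn usageFaultHandler() callconv(.C) void;
extern fn svCallHandler() callconv(.C) void;
extern fn debugMonitorHandler() callconv(.C) void;
extern fn pendSVHandler() callconv(.C) void;
extern fn sysTickHandler() callconv(.C) void;

const Handler = fn () callconv(.C) void;

// 1 initial stack pointer + 15 exception slots; null marks reserved entries.
export const vector_table linksection(".vectors") = [_]?Handler{
    _stack,
    resetHandler,
    nmiHandler,
    hardFaultHandler,
    memoryManagementFaultHandler,
    busFaultHandler,
    usageFaultHandler,
    null,
    null,
    null,
    null,
    svCallHandler,
    debugMonitorHandler,
    null,
    pendSVHandler,
    sysTickHandler,
};
```

An optional function pointer has the same size as a plain pointer, with null stored as 0, which is exactly what the reserved slots need.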

Reset handler

resetHandler (which is in startup.zig) gets executed after system reset. Here we perform some initialization steps that are needed before proceeding to execute the main function.

The first step is initializing the .data section. This section contains global variables that are initialized with a specific value. For example, if in our code we have a global variable like

var the_answer: u32 = 42;

the 42 will be saved in the ROM while the space for the variable will be reserved in RAM, and our initialization code is going to be responsible for copying the initial value to the RAM to initialize the variable.

The .bss section requires a similar initialization, but it contains uninitialized (or zero-initialized) data. In this case we don’t need to copy data from somewhere else: we just need to scan through .bss and set everything to 0.
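The two initialization steps can be sketched with plain slices, so that the logic runs on a desktop machine too. On the device, the slices would instead be built from the section boundary symbols exported by the linker script. This is a simplified model of the idea, not the code in startup.zig.

```zig
const std = @import("std");

/// Copy the initial values of .data from their load address (ROM) into RAM.
fn initData(ram: []u8, rom_image: []const u8) void {
    var i: usize = 0;
    while (i < rom_image.len) : (i += 1) {
        ram[i] = rom_image[i];
    }
}

/// Zero every byte of the .bss region.
fn clearBss(bss: []u8) void {
    var i: usize = 0;
    while (i < bss.len) : (i += 1) {
        bss[i] = 0;
    }
}

pub fn main() void {
    // Stand-ins for the real memory regions: after reset, RAM holds garbage.
    const rom_image = [_]u8{ 42, 0, 0, 0 }; // a little-endian u32 with value 42
    var data_ram = [_]u8{ 0xAA, 0xAA, 0xAA, 0xAA };
    var bss_ram = [_]u8{ 0xAA, 0xAA, 0xAA, 0xAA };

    initData(&data_ram, &rom_image);
    clearBss(&bss_ram);

    std.debug.print("data[0] = {}, bss[0] = {}\n", .{ data_ram[0], bss_ram[0] });
}
```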

At this point, we have the bare minimum to run arbitrary Zig code. The only thing missing is a way to access memory-mapped peripherals. We could do what is usually done in C: search for addresses in the datasheet (or use a vendor-provided set of #defines), access them as u32 values, and perform bitwise operations using flags and masks, but is there a better way?
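For comparison, here is roughly what the C-style approach looks like when transliterated to Zig. The addresses and bit positions are the ones I would expect from the STM32F407 reference manual (double-check them before relying on them), and the helper names are made up for this sketch. The helpers take a pointer, so they can be tried on a desktop machine against a plain variable.

```zig
const std = @import("std");

// Register addresses (assumed from the STM32F407 reference manual).
const RCC_AHB1ENR_ADDR: usize = 0x40023830;
const GPIOD_MODER_ADDR: usize = 0x40020C00;

// Flags and masks, as a C header would define them.
const RCC_AHB1ENR_GPIODEN: u32 = 1 << 3;
const GPIO_MODER_PIN12_MASK: u32 = 0b11 << 24;
const GPIO_MODER_PIN12_OUTPUT: u32 = 0b01 << 24;

/// Set the bits in `mask`, leaving all the others untouched.
fn setBits(reg: *volatile u32, mask: u32) void {
    reg.* |= mask;
}

/// Replace the bits covered by `mask` with the matching bits of `value`.
fn writeField(reg: *volatile u32, mask: u32, value: u32) void {
    reg.* = (reg.* & ~mask) | (value & mask);
}

pub fn main() void {
    // On the device these pointers would be built from the addresses above
    // with @intToPtr; here ordinary variables stand in for the hardware.
    var fake_ahb1enr: u32 = 0;
    var fake_moder: u32 = 0xFFFFFFFF;

    setBits(&fake_ahb1enr, RCC_AHB1ENR_GPIODEN);
    writeField(&fake_moder, GPIO_MODER_PIN12_MASK, GPIO_MODER_PIN12_OUTPUT);

    std.debug.print("AHB1ENR = 0x{x}, MODER = 0x{x}\n", .{ fake_ahb1enr, fake_moder });
}
```

It works, but every field access needs a hand-written mask, and nothing stops you from applying a mask to the wrong register.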

Memory-mapped IO using packed structs

I lied in the introduction: the post that actually convinced me to try Zig on an embedded board was this other one, and I recommend you check it out. I find the packed struct API extremely ergonomic for memory-mapped IO, especially the modify function: it doesn’t require you to fiddle to preserve what you don’t want to change, and lets you concentrate on what you do want to change (unlike the usual MMIO done with bit-shifts and flags).

To generate the structs to access the registers, I used my own svd4zig tool, which started as a fork of the svd2zig tool developed by justinbalexander crossed with the register output format of the svd2zig tool developed by lynaghk, which is the one described in the post above (hence svd2zig * 2 = svd4zig).

The tool takes an svd file as input, which is an XML file describing the device peripherals, registers etc, and generates a Zig source file which allows accessing registers using packed structs. The generated code is the one contained in registers.zig.
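To give an idea of the pattern, here is a small sketch of a packed-struct register with read, write and modify. It uses the builtin syntax of Zig 0.7.1 / 0.8.0-dev (@bitCast and @typeInfo look different in later releases), the Ctrl register is entirely hypothetical, and the code generated by svd4zig may well differ in its details. A plain variable stands in for the hardware register so the example can run on a desktop machine.

```zig
const std = @import("std");

// Hypothetical 32-bit control register, one struct field per bit field.
const Ctrl = packed struct {
    enable: u1 = 0,
    mode: u2 = 0,
    prescaler: u5 = 0,
    _reserved0: u8 = 0,
    _reserved1: u16 = 0,
};

fn Register(comptime Fields: type) type {
    return struct {
        raw: *volatile u32,

        const Self = @This();

        fn read(self: Self) Fields {
            return @bitCast(Fields, self.raw.*);
        }

        fn write(self: Self, value: Fields) void {
            self.raw.* = @bitCast(u32, value);
        }

        /// Read-modify-write: only the fields named in `changes` are touched.
        fn modify(self: Self, changes: anytype) void {
            var value = self.read();
            inline for (@typeInfo(@TypeOf(changes)).Struct.fields) |field| {
                @field(value, field.name) = @field(changes, field.name);
            }
            self.write(value);
        }
    };
}

pub fn main() void {
    var fake_register: u32 = 0; // stand-in for a memory-mapped register
    const ctrl = Register(Ctrl){ .raw = &fake_register };

    ctrl.write(.{ .enable = 1, .mode = 0b10, .prescaler = 7 });
    ctrl.modify(.{ .mode = 0b01 }); // enable and prescaler are preserved

    const v = ctrl.read();
    std.debug.print("enable={} mode={} prescaler={}\n", .{ v.enable, v.mode, v.prescaler });
}
```

A nice side effect is that a typo in a field name, or a value that doesn't fit in the field, becomes a compile error instead of a silently wrong mask.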

Armed with a handy way to access registers, we can do some more system initialization and finally blink some LEDs.

System init and blinking LEDs

The main function calls the systemInit function. The comments in the function itself should be quite self-explanatory: first of all, we enable the FPU coprocessor. This is needed since we are using eabihf and must be done before executing any code which deals with floating-point numbers.
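Enabling the FPU boils down to a single register write. The sketch below shows that write with raw bits rather than with the generated register structs: on ARMv7-M the Coprocessor Access Control Register (CPACR) lives at 0xE000ED88, and bits 20 to 23 grant full access to coprocessors CP10 and CP11, which together form the FPU. The function takes the register as a pointer so it can be exercised on a desktop machine; the function and constant names are my own.

```zig
const std = @import("std");

// Address of CPACR in the System Control Space (ARMv7-M).
const CPACR_ADDR: usize = 0xE000ED88;

// Two access bits per coprocessor; 0b11 means "full access".
const CP10_CP11_FULL_ACCESS: u32 = 0xF << 20;

fn enableFpu(cpacr: *volatile u32) void {
    cpacr.* |= CP10_CP11_FULL_ACCESS;
    // On the device, this write should be followed by `dsb` and `isb`
    // barrier instructions (via inline assembly), so that it has taken
    // effect before the first floating-point instruction executes.
}

pub fn main() void {
    // On the device the pointer would be created from CPACR_ADDR;
    // here a plain variable stands in for the register.
    var fake_cpacr: u32 = 0;
    enableFpu(&fake_cpacr);
    std.debug.print("CPACR = 0x{x}\n", .{fake_cpacr});
}
```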

After that, the whole dance until the end of the function is there to initialize the CPU to use the external clock, reaching a clock speed of 168 MHz. This is not strictly necessary: if we skipped that code, the board would just run at the default speed of 16 MHz using the internal clock. If you want a handy tool to generate all the values needed to initialize all clock domains without having to calculate everything by hand, the CubeMX tool by ST has your back.
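To see where the 168 MHz comes from, here is the PLL arithmetic as a small sketch. The board has an 8 MHz external crystal, and the PLL computes VCO = input / M * N, with the system clock being VCO / P and the USB clock VCO / Q. The divider values below (M=8, N=336, P=2, Q=7) are a commonly used combination and an assumption on my part, not values read from the repo.

```zig
const std = @import("std");

const PllConfig = struct {
    input_hz: u32, // frequency of the clock feeding the PLL
    m: u32, // input divider
    n: u32, // VCO multiplier
    p: u32, // system clock divider
    q: u32, // USB/SDIO clock divider
};

fn vcoHz(cfg: PllConfig) u32 {
    return cfg.input_hz / cfg.m * cfg.n;
}

fn sysclkHz(cfg: PllConfig) u32 {
    return vcoHz(cfg) / cfg.p;
}

fn usbClkHz(cfg: PllConfig) u32 {
    return vcoHz(cfg) / cfg.q;
}

pub fn main() void {
    // Assumed values: 8 MHz crystal, dividers chosen to reach 168 MHz.
    const cfg = PllConfig{ .input_hz = 8000000, .m = 8, .n = 336, .p = 2, .q = 7 };

    std.debug.print("VCO    = {} Hz\n", .{vcoHz(cfg)});
    std.debug.print("SYSCLK = {} Hz\n", .{sysclkHz(cfg)});
    std.debug.print("USB    = {} Hz\n", .{usbClkHz(cfg)});
}
```

With these values the VCO runs at 336 MHz, the system clock at 168 MHz, and the USB clock lands on the 48 MHz it needs.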

Back to main, we are finally going to blink some LEDs. We enable the clock to the GPIOD peripheral, where the LEDs are connected (on pins 12, 13, 14, and 15). Then we set the mode of those pins to “General purpose output” and we light up two of the four LEDs. From there we start an infinite while loop that just flips the LEDs on and off in a cross pattern.
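The cross pattern itself is just an XOR on the four LED bits of the output data register. The sketch below models it as a pure function on a u32, so it can be run on a desktop machine. It assumes pins 12 and 14 are the two LEDs lit at the start; on the board, each step would be a write to the GPIOD output register followed by a delay loop.

```zig
const std = @import("std");

// The four LEDs sit on PD12..PD15.
const LED_MASK: u32 = 0xF << 12;

// Assumed starting state: pins 12 and 14 on, pins 13 and 15 off.
const INITIAL_LEDS: u32 = (1 << 12) | (1 << 14);

/// Flip all four LED bits, leaving the other pins untouched.
fn toggleLeds(odr: u32) u32 {
    return odr ^ LED_MASK;
}

pub fn main() void {
    var odr: u32 = INITIAL_LEDS;
    var step: u32 = 0;
    while (step < 4) : (step += 1) {
        std.debug.print("step {}: ODR = 0x{x}\n", .{ step, odr });
        odr = toggleLeds(odr);
    }
}
```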

Flashing the code

To flash the code, I added a custom build step which calls st-flash, a tool that is part of the open-source stlink tools. Those tools are usually also available in your distro’s repositories.

Integrating a custom command in the Zig build process is really easy. After installing the tool, just run zig build flash. This will produce a raw binary from the ELF, which is needed by st-flash, and then it will flash it using st-flash.
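A custom step of this kind can be sketched as a small helper to call from build(). It is written against the std.build API of Zig 0.7.1 / 0.8.0-dev, to the best of my recollection of it, and later releases reworked both the raw-binary step and system commands. The helper name and the .bin file name are assumptions; 0x8000000 is the start of the flash memory on STM32 devices.

```zig
const std = @import("std");
const Builder = std.build.Builder;
const LibExeObjStep = std.build.LibExeObjStep;

/// Hypothetical helper: adds a `flash` step for an already configured ELF.
fn addFlashStep(b: *Builder, elf: *LibExeObjStep) void {
    // st-flash wants a raw binary, so first strip the ELF container.
    const bin = b.addInstallRaw(elf, "zig-stm32-blink.bin");

    // Equivalent to running: st-flash write <path-to-bin> 0x8000000
    const flash_cmd = b.addSystemCommand(&[_][]const u8{
        "st-flash",
        "write",
        b.getInstallPath(bin.dest_dir, bin.dest_filename),
        "0x8000000",
    });
    // The binary must exist before we try to flash it.
    flash_cmd.step.dependOn(&bin.step);

    // Expose everything as `zig build flash`.
    const flash_step = b.step("flash", "Flash the board using st-flash");
    flash_step.dependOn(&flash_cmd.step);
}
```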

Hooray, blinking LEDs!

Debugging

If you want to debug the code running on your board you can do so using openocd and gdb-multiarch. In a terminal, run:

openocd -f board/stm32f4discovery.cfg

Then from another terminal navigate to the directory containing the ELF output (i.e. zig-cache/bin) and run:

gdb-multiarch zig-stm32-blink.elf -ex "target remote :3333"

You can move around with the usual gdb commands, and you can even use the @breakpoint() builtin Zig function to manually insert a breakpoint in a specific place in the source code.

Zig or Zign’t?

Let’s start by saying that I think that Zig is a really cool language. It’s simple enough that I felt confident tackling the issues with the svd2zig tool after less than a week of using it. The documentation is still a little lacking, especially for the std library, but this is compensated by the fact that you can actually read the std library source code and understand what it’s doing.

The main issue I encountered during this process was the fact that packed structs are currently sometimes broken, which required some workarounds to use the struct-based MMIO in the generated code. Even then, I’m not sure svd4zig will work for all possible CPUs, so please let me know if it doesn’t by opening an issue.

It would also be cool if Zig supported Xtensa (since ESP32 is my main go-to platform for embedded stuff these days). Some progress is being made and some of it depends on some pending stuff in esp-idf, so I guess it’s just a matter of time.

Overall, though, I enjoyed the experience of working with Zig on an embedded device. The build system and the cross-compilation tooling are really seamless, and I think the language strikes the right balance for this kind of device. I tried learning some Rust last year and, while I appreciated some of its insights, I didn’t enjoy the experience of trying it on an embedded device as much as I did with Zig.

So I think I’ll keep tinkering with Zig on some embedded boards for a while. I’d like to explore the comptime stuff combined with the interrupts struct generated by svd4zig to provide a nice API to implement interrupt handlers, and maybe try to make some sounds with it.