June 08, 2025
Author: Andrew Kelley
Now, when you target x86_64, by default, Zig will use its own x86 backend rather than using LLVM to lower a bitcode file to an object file.
The default is not changed on Windows yet, because more COFF linker work needs to be done first.
The x86 backend is now passing 1987 behavior tests, versus 1980 passed by the LLVM backend. In reality there are 2084 behavior tests, but the extra ones there are generally redundant with LLVM’s own test suite for its own x86 backend, so we only run those when testing with self-hosted x86. Anyway, my point is that Zig’s x86 backend is now more robust than its LLVM backend in terms of implementing the Zig language.
Why compete with LLVM on code generation? There are a handful of reasons, but mainly, because we can dramatically outperform LLVM at compilation speed.
Benchmark 1 (6 runs): zig build-exe hello.zig -fllvm
  measurement         mean ± σ            min …    max   outliers  delta
  wall_time          918ms ± 32.8ms     892ms …  984ms   0 ( 0%)   0%
  peak_rss           214MB ± 629KB      213MB …  215MB   0 ( 0%)   0%
  cpu_cycles         4.53G ± 12.7M      4.52G …  4.55G   0 ( 0%)   0%
  instructions       8.50G ± 3.27M      8.50G …  8.51G   0 ( 0%)   0%
  cache_references    356M ± 1.52M       355M …   359M   0 ( 0%)   0%
  cache_misses       75.6M ± 290K       75.3M …  76.1M   0 ( 0%)   0%
  branch_misses      42.5M ± 49.2K      42.4M …  42.5M   0 ( 0%)   0%
Benchmark 2 (19 runs): zig build-exe hello.zig
  measurement         mean ± σ            min …    max   outliers  delta
  wall_time          275ms ± 4.94ms     268ms …  283ms   0 ( 0%)   ⚡- 70.1% ± 1.7%
  peak_rss           137MB ± 677KB      135MB …  138MB   0 ( 0%)   ⚡- 36.2% ± 0.3%
  cpu_cycles         1.57G ± 9.60M      1.56G …  1.59G   0 ( 0%)   ⚡- 65.2% ± 0.2%
  instructions       3.21G ± 126K       3.21G …  3.21G   1 ( 5%)   ⚡- 62.2% ± 0.0%
  cache_references    112M ± 758K        110M …   113M   0 ( 0%)   ⚡- 68.7% ± 0.3%
  cache_misses       10.5M ± 102K       10.4M …  10.8M   1 ( 5%)   ⚡- 86.1% ± 0.2%
  branch_misses      9.22M ± 52.0K      9.14M …  9.31M   0 ( 0%)   ⚡- 78.3% ± 0.1%
For a larger project like the Zig compiler itself, it takes the time down from 75 seconds to 20 seconds.
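For readers unfamiliar with the delta column: it is the relative change of each measurement against Benchmark 1. A minimal sketch of that arithmetic in C follows. The names percent_delta and speedup_factor are hypothetical helpers, not part of any benchmarking tool. Feeding it the rounded means from the table gives roughly -70.0% for wall time, slightly off from the reported -70.1%, presumably because the tool works from unrounded samples.

```c
#include <assert.h>

/* Relative change of `candidate` against `baseline`, in percent.
 * Negative values mean the candidate is smaller (faster, leaner). */
double percent_delta(double baseline, double candidate) {
    assert(baseline != 0.0);
    return (candidate - baseline) / baseline * 100.0;
}

/* How many times smaller the candidate measurement is than the baseline. */
double speedup_factor(double baseline, double candidate) {
    assert(candidate != 0.0);
    return baseline / candidate;
}
```

With the compiler build times quoted above, percent_delta(75.0, 20.0) comes out to about -73.3%, and speedup_factor(75.0, 20.0) to 3.75.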
We’re only just getting started. We’ve already started work on fully parallelizing code generation. We’re also just a few linker enhancements and bug fixes away from making incremental compilation stable and robust in combination with this backend. There is still low-hanging fruit for improving the generated x86 code quality. And we’re looking at aarch64 next - work that is expected to be accelerated thanks to our new Legalize pass.
The CI has finished building the respective commit, so you can try this out yourself by fetching the latest master branch build from the download page.
Finally, here’s a gentle reminder that Zig Software Foundation is a 501(c)(3) non-profit that funds its development with donations from generous people like you. If you like what we’re doing, please help keep us financially sustainable!
Author: Loris Cro
A few days ago I released a new video on YouTube where I show how to get started with the Zig build system, for those who have not grokked it yet.
In the video I show how to create a package that exposes a Zig module and then how to import that module in another Zig project. After June I will add more videos to the series in order to cover more of the build system.
Here’s the video: https://youtu.be/jy7w_7JZYyw
Author: Alex Rønne Petersen
Pull requests #23835 and #23913 have now been merged. This means that, using zig cc or zig build, you can now build binaries targeting FreeBSD 14.0.0+ and NetBSD 10.1+ from any machine, just as you’ve been able to for Linux, macOS, and Windows for a long time now.
This builds on the strategy we were already using for glibc and will soon be using for other targets as well. For any given FreeBSD/NetBSD release, we build libc and related libraries for every supported target, and then extract public symbol information from the resulting ELF files. We then combine all that information into a very compact abilists file that gets shipped with Zig. Finally, when the user asks to link libc while cross-compiling, we load the abilists file and build a stub library for each constituent libc library (libc.so, libm.so, etc), making sure that it accurately reflects the symbols provided by libc for the target architecture and OS version, and has the expected soname. This is all quite similar to how the llvm-ifs tool works.
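The final step described above, turning a symbol list into a stub, can be sketched roughly as follows. This is a sketch under assumptions, not Zig's implementation: StubSymbol, emit_stub_source, the version encoding, and the example symbol names are all hypothetical, and the real abilists format is not shown here. The idea is simply to emit an empty definition for every symbol that exists in the targeted OS version, so that the linker can resolve references without the real library being present.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical record: one exported function symbol and the first OS
 * release (encoded as major * 100 + minor) in which it appeared. */
typedef struct {
    const char *name;
    int introduced_in;
} StubSymbol;

/* Writes GNU assembler text that defines an empty global function symbol
 * for every entry available in `target_version`. Returns the number of
 * symbols emitted, or -1 if `out` is too small to hold the text. */
int emit_stub_source(const StubSymbol *symbols, size_t count,
                     int target_version, char *out, size_t out_size) {
    size_t used = 0;
    int emitted = 0;

    assert(out != NULL);
    assert(symbols != NULL || count == 0);
    if (out_size == 0) return -1;
    out[0] = '\0';

    for (size_t i = 0; i < count; i++) {
        /* Skip symbols that the targeted release does not provide yet. */
        if (symbols[i].introduced_in > target_version) continue;

        int n = snprintf(out + used, out_size - used,
                         ".globl %s\n.type %s, %%function\n%s:\n",
                         symbols[i].name, symbols[i].name, symbols[i].name);
        if (n < 0 || (size_t)n >= out_size - used) return -1;
        used += (size_t)n;
        emitted++;
    }
    return emitted;
}
```

Assembling the generated text into a shared object with the expected soname is left out of the sketch; that part depends on the toolchain.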
We currently import crt0 code from the latest known FreeBSD/NetBSD release and manually apply any patches needed to make it work with any OS version that we support cross-compilation to. This is necessary because the OS sometimes changes the crt0 ABI. We’d like to eventually reimplement the crt0 code in Zig.
We also ship FreeBSD/NetBSD system and libc headers with the Zig compiler. Unlike the stub libraries we produce, however, we always import headers from the latest version of the OS. This is because it would be far too space-inefficient to ship separate headers for every OS version, and we realistically don’t have the time to audit the headers on every import and add appropriate version guards to all new declarations. The good news, though, is that we do accept patches to add version guards when necessary; we’ve already had many contributions of this sort in our imported glibc headers.
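To illustrate what a version guard in a header looks like, here is a small sketch. Everything in it is hypothetical: EXAMPLE_TARGET_OS_MAJOR stands in for whatever macro the toolchain uses to communicate the targeted OS version, and the two example_* functions are made-up declarations, not real libc symbols.

```c
/* Hypothetical: set by the toolchain to the targeted OS major version. */
#ifndef EXAMPLE_TARGET_OS_MAJOR
#define EXAMPLE_TARGET_OS_MAJOR 13
#endif

/* Present in every supported release. */
int example_old_call(int fd);

#if EXAMPLE_TARGET_OS_MAJOR >= 14
/* Only declared when the target is new enough to provide the symbol, so
 * code that uses it on an older target fails at compile time instead of
 * failing later at link time or at run time. */
int example_new_call(int fd, unsigned flags);
#define EXAMPLE_HAVE_NEW_CALL 1
#else
#define EXAMPLE_HAVE_NEW_CALL 0
#endif

/* Lets callers branch on availability without repeating the version test. */
int example_has_new_call(void) {
    return EXAMPLE_HAVE_NEW_CALL;
}
```

Without such a guard, a header imported from the latest OS release would happily declare a function that an older target does not have.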
Please take this for a spin and report any bugs you find!
We would like to also add support for OpenBSD libc and Dragonfly BSD libc, but because these BSDs cannot be conveniently cross-compiled from Linux, we need motivated users of them to chip in. Besides those, we are also looking into SerenityOS, Android, and Fuchsia libc support.
Author: Loris Cro
The official Zig website now builds using standalone Zine. A lot of code got rewritten so if you see regressions on the website, please open an issue. Regressions only please, thanks!
Normally a Zine update would not be worthy of a devlog entry, but the recent update to it was pretty big as Zine went from being a funky Zig build script to a standalone executable. If you were interested in Zine before but never got the time to try it out, this milestone is a great moment to give it a shot. Run zine init to get a sample website that also implements a devlog for you out of the box.
P.S. I’ve also added dates to each entry on the page, people were asking for this for a while :^)
The 0.14.0 release is coming shortly. We didn’t get the release notes done yet, and I’m calling it a day.
Tomorrow morning I’ll make the tag, kick off the CI, and then work to finish the release notes while it builds.
I know there were a lot of things that sadly didn’t make the cut. Let’s try to get them into 0.14.1 or 0.15.0. Meanwhile, there are a ton of major and minor enhancements that have already landed, and will debut tomorrow.
Author: David Rubin
Lately, I’ve been extensively working with C interop, and one thing that’s been sorely missing is clear error messages from UBSan. When compiling C with zig cc, Zig provides better defaults, including implicitly enabling -fsanitize=undefined. This has been great for catching subtle bugs and makes working with C more bearable. However, due to the lack of a UBSan runtime, all undefined behavior was previously caught with a trap instruction.
For example, consider this C program:
#include <stdio.h>

int foo(int x, int y) {
    return x + y;
}

int main() {
    int result = foo(0x7fffffff, 0x7fffffff);
    printf("%d\n", result);
}
Running this with zig cc used to result in an unhelpful error:
$ zig run test.c -lc
fish: Job 1, 'zig run empty.c -lc' terminated by signal SIGILL (Illegal instruction)
Not exactly informative! To understand what went wrong, you’d have to run the executable in a debugger. Even then, tracking down the root cause could be daunting. Many newcomers ran into this Illegal instruction error without realizing that UBSan was enabled by default, leading to confusion. This issue was common enough to warrant a dedicated Wiki page.
With the new UBSan runtime merged, the experience has completely changed. Now instead of an obscure SIGILL, you get a much more helpful error message:
$ zig run test.c -lc
thread 208135 panic: signed integer overflow: 2147483647 + 2147483647 cannot be represented in type 'int'
/home/david/Code/zig/build/test.c:4:14: 0x1013e41 in foo (test.c)
return x + y;
^
/home/david/Code/zig/build/test.c:8:18: 0x1013e63 in main (test.c)
int result = foo(0x7fffffff, 0x7fffffff);
^
../sysdeps/nptl/libc_start_call_main.h:58:16: 0x7fca4c42e1c9 in __libc_start_call_main (../sysdeps/x86/libc-start.c)
../csu/libc-start.c:360:3: 0x7fca4c42e28a in __libc_start_main_impl (../sysdeps/x86/libc-start.c)
???:?:?: 0x1013de4 in ??? (???)
???:?:?: 0x0 in ??? (???)
fish: Job 1, 'zig run test.c -lc' terminated by signal SIGABRT (Abort)
Now, not only do we see what went wrong (signed integer overflow), but we also see where it happened – two critical pieces of information that were previously missing.
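Once the runtime has pointed at the offending addition, one conventional way to remove the undefined behavior is to check the operands before adding. The sketch below is one such fix, not something taken from the devlog; checked_add is a hypothetical helper name.

```c
#include <assert.h>
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Adds x and y without ever executing an overflowing signed addition.
 * Returns true and stores the sum in *result when it fits in an int;
 * returns false and leaves *result untouched otherwise. */
bool checked_add(int x, int y, int *result) {
    assert(result != NULL);
    if ((y > 0 && x > INT_MAX - y) || (y < 0 && x < INT_MIN - y)) {
        return false;
    }
    *result = x + y;
    return true;
}
```

Calling checked_add(0x7fffffff, 0x7fffffff, &result) from the program above would return false instead of tripping the sanitizer, leaving the caller to decide how to report the error.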
While the new runtime vastly improves debugging, there are still two features that LLVM’s UBSan runtime provides which ours doesn’t support yet:
assume_aligned and __nonnull. This should be relatively straightforward to add, and contributions are welcome!
If you’ve ever been frustrated by cryptic SIGILL errors while trying out Zig, this update should make debugging undefined behavior a lot easier!
Author: Andrew Kelley
Alright, I know I’m supposed to be focused on issue triage and merging PRs for the upcoming release this month, but in my defense, I do some of my best work while procrastinating.
Jokes aside, this week we had CI failures due to Zig’s debug allocator creating too many memory mappings. This was interfering with Jacob’s work on the x86 backend, so I spent the time to rework the debug allocator.
Since this was a chance to eliminate the dependency on a compile-time known page size, I based my work on contributor archbirdplus’s patch to add runtime-known page size support to the Zig standard library. With this change landed, it means Zig finally works on Asahi Linux. My fault for originally making page size compile-time known. Sorry about that!
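For comparison, this is roughly what asking the OS for the page size at runtime looks like in C on a POSIX system. It is only an illustration of the idea and not how the Zig standard library implements it; runtime_page_size and round_up_to_page are hypothetical names. The point is that a binary built this way keeps working on systems whose pages are larger than the common 4 KiB (16 KiB on Apple Silicon, for example).

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/* Asks the operating system for the page size instead of baking a
 * constant into the binary. */
size_t runtime_page_size(void) {
    long value = sysconf(_SC_PAGESIZE);
    assert(value > 0);
    return (size_t)value;
}

/* Rounds `len` up to a whole number of pages. Page sizes are powers of
 * two, so the rounding can be done with a mask. */
size_t round_up_to_page(size_t len) {
    size_t page = runtime_page_size();
    return (len + page - 1) & ~(page - 1);
}
```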
Along with detecting page size at runtime, the new implementation no longer memsets each page to 0xaa bytes then back to 0x00 bytes, no longer searches when freeing, and no longer depends on a treap data structure. Instead, the allocation metadata is stored inline, on the page, using a pre-cached lookup table that is computed at compile-time:
fn calculateSlotCount(size_class_index: usize) SlotIndex {
    const size_class = @as(usize, 1) << @as(Log2USize, @intCast(size_class_index));
    var lower: usize = 1 << minimum_slots_per_bucket_log2;
    var upper: usize = (page_size - bucketSize(lower)) / size_class;
    while (upper > lower) {
        const proposed: usize = lower + (upper - lower) / 2;
        if (proposed == lower) return lower;
        const slots_end = proposed * size_class;
        const header_begin = mem.alignForward(usize, slots_end, @alignOf(BucketHeader));
        const end = header_begin + bucketSize(proposed);
        if (end > page_size) {
            upper = proposed - 1;
        } else {
            lower = proposed;
        }
    }
    const slots_end = lower * size_class;
    const header_begin = mem.alignForward(usize, slots_end, @alignOf(BucketHeader));
    const end = header_begin + bucketSize(lower);
    assert(end <= page_size);
    return lower;
}
It’s pretty nice because you can tweak some global constants and then get optimal slot sizes. That assert at the end means if the constraints could not be satisfied you get a compile error. Meanwhile in C land, equivalent code has to resort to handcrafted lookup tables. Just look at the top of malloc.c from musl:
const uint16_t size_classes[] = {
    1, 2, 3, 4, 5, 6, 7, 8,
    9, 10, 12, 15,
    18, 20, 25, 31,
    36, 42, 50, 63,
    72, 84, 102, 127,
    146, 170, 204, 255,
    292, 340, 409, 511,
    584, 682, 818, 1023,
    1169, 1364, 1637, 2047,
    2340, 2730, 3276, 4095,
    4680, 5460, 6552, 8191,
};
Not nearly as nice to experiment with different size classes. The water’s warm, Rich, come on in! 😛
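To make the comparison concrete, here is a rough C rendering of the same idea: search for the largest number of slots that still leaves room for the metadata on the page. It is a sketch, not a port. The metadata layout (EXAMPLE_HEADER_FIXED_BYTES plus one "used" bit per slot), the fixed 4096-byte page, and all function names are assumptions made for illustration. In C this kind of function would typically run once at startup or in a separate generator step rather than inside the compiler.

```c
#include <assert.h>
#include <stddef.h>

#define EXAMPLE_PAGE_SIZE 4096u          /* assumption for the sketch */
#define EXAMPLE_HEADER_ALIGN 8u          /* hypothetical header alignment */
#define EXAMPLE_HEADER_FIXED_BYTES 32u   /* hypothetical fixed header size */

/* Rounds `addr` up to a multiple of `alignment` (a power of two). */
size_t align_forward(size_t addr, size_t alignment) {
    return (addr + alignment - 1) & ~(alignment - 1);
}

/* Hypothetical metadata size: fixed header plus one bit per slot. */
size_t bucket_size(size_t slot_count) {
    return EXAMPLE_HEADER_FIXED_BYTES + (slot_count + 7) / 8;
}

/* True when the slots and their trailing metadata fit on one page. */
int slots_fit(size_t slot_size, size_t slot_count) {
    size_t header_begin = align_forward(slot_count * slot_size, EXAMPLE_HEADER_ALIGN);
    return header_begin + bucket_size(slot_count) <= EXAMPLE_PAGE_SIZE;
}

/* Largest slot count that fits, found by binary search. Returns 0 when
 * not even a single slot of this size fits next to the metadata. */
size_t max_slot_count(size_t slot_size) {
    assert(slot_size > 0);
    size_t lower = 0;                               /* zero slots always fit */
    size_t upper = EXAMPLE_PAGE_SIZE / slot_size;   /* bound ignoring metadata */
    while (lower < upper) {
        size_t mid = lower + (upper - lower + 1) / 2;
        if (slots_fit(slot_size, mid)) {
            lower = mid;
        } else {
            upper = mid - 1;
        }
    }
    return lower;
}
```

The binary search is valid because slots_fit can only go from true to false as the slot count grows: both the slot area and the metadata size are non-decreasing.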
Anyway, as a result of reworking this allocator, not only does it work with runtime-known page size, and avoid creating too many memory mappings, it also performs significantly better than before. The motivating test case for these changes was this degenerate ast-check task, with a debug compiler:
Benchmark 1 (3 runs): master/bin/zig ast-check ../lib/compiler_rt/udivmodti4_test.zig
  measurement         mean ± σ            min …    max   outliers  delta
  wall_time          22.8s ± 184ms      22.6s …  22.9s   0 ( 0%)   0%
  peak_rss          58.6MB ± 77.5KB    58.5MB … 58.6MB   0 ( 0%)   0%
  cpu_cycles         38.1G ± 84.7M      38.0G …  38.2G   0 ( 0%)   0%
  instructions       27.7G ± 16.6K      27.7G …  27.7G   0 ( 0%)   0%
  cache_references   1.08G ± 4.40M      1.07G …  1.08G   0 ( 0%)   0%
  cache_misses       7.54M ± 1.39M      6.51M …  9.12M   0 ( 0%)   0%
  branch_misses       165M ± 454K        165M …   166M   0 ( 0%)   0%
Benchmark 2 (3 runs): branch/bin/zig ast-check ../lib/compiler_rt/udivmodti4_test.zig
  measurement         mean ± σ            min …    max   outliers  delta
  wall_time          20.5s ± 95.8ms     20.4s …  20.6s   0 ( 0%)   ⚡- 10.1% ± 1.5%
  peak_rss          54.9MB ± 303KB     54.6MB … 55.1MB   0 ( 0%)   ⚡- 6.2% ± 0.9%
  cpu_cycles         34.8G ± 85.2M      34.7G …  34.9G   0 ( 0%)   ⚡- 8.6% ± 0.5%
  instructions       25.2G ± 2.21M      25.2G …  25.2G   0 ( 0%)   ⚡- 8.8% ± 0.0%
  cache_references   1.02G ± 195M        902M …  1.24G   0 ( 0%)   - 5.8% ± 29.0%
  cache_misses       4.57M ± 934K       3.93M …  5.64M   0 ( 0%)   ⚡- 39.4% ± 35.6%
  branch_misses       142M ± 183K        142M …   142M   0 ( 0%)   ⚡- 14.1% ± 0.5%
I didn’t stop there, however. Even though I had release tasks to get back to, this left me itching to make a fast allocator - one that was designed for multi-threaded applications built in ReleaseFast mode.
It’s a tricky problem. A fast allocator needs to avoid contention by storing thread-local state; however, it does not directly learn when a thread exits, so one thread must periodically attempt to reclaim another thread’s resources. There is also the producer-consumer pattern, where one thread only allocates while another thread only frees. A naive implementation would never reclaim this memory.
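To make the problem tangible, here is a generic sketch of one way to approach it. It is explicitly not the design of the allocator described in this entry, whose internals are not shown here. The sketch assumes a small fixed array of free lists ("shards"), each guarded by a try-lock; a thread remembers the shard it used last, skips shards that are busy, and on allocation scans the other shards so that blocks freed by a different thread can still be reused. All names (Shard, shard_alloc, shard_free, current_shard, SHARD_COUNT) are hypothetical.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define SHARD_COUNT 4   /* hypothetical; the initializer below lists 4 entries */

typedef struct FreeNode { struct FreeNode *next; } FreeNode;

typedef struct {
    atomic_flag busy;     /* set while some thread is using the shard */
    FreeNode *free_list;  /* fixed-size blocks that are ready for reuse */
} Shard;

static Shard shards[SHARD_COUNT] = {
    { ATOMIC_FLAG_INIT, NULL }, { ATOMIC_FLAG_INIT, NULL },
    { ATOMIC_FLAG_INIT, NULL }, { ATOMIC_FLAG_INIT, NULL },
};

/* Each thread remembers the shard it used last; no thread-exit hook needed. */
_Thread_local unsigned current_shard = 0;

/* Returns a recycled block, or NULL when the caller has to obtain fresh
 * memory. Scanning every shard is what lets a thread that only allocates
 * pick up blocks released by a thread that only frees. */
void *shard_alloc(void) {
    for (unsigned tries = 0; tries < SHARD_COUNT; tries++) {
        unsigned i = (current_shard + tries) % SHARD_COUNT;
        Shard *s = &shards[i];
        if (atomic_flag_test_and_set_explicit(&s->busy, memory_order_acquire))
            continue;   /* contended: leave it alone and try the next shard */
        FreeNode *node = s->free_list;
        if (node != NULL) s->free_list = node->next;
        atomic_flag_clear_explicit(&s->busy, memory_order_release);
        if (node != NULL) {
            current_shard = i;
            return node;
        }
    }
    return NULL;
}

/* Puts a block (at least sizeof(FreeNode) bytes) on the first shard this
 * thread manages to lock, starting with the one it used last. */
void shard_free(void *block) {
    assert(block != NULL);
    FreeNode *node = block;
    for (;;) {
        Shard *s = &shards[current_shard];
        if (!atomic_flag_test_and_set_explicit(&s->busy, memory_order_acquire)) {
            node->next = s->free_list;
            s->free_list = node;
            atomic_flag_clear_explicit(&s->busy, memory_order_release);
            return;
        }
        current_shard = (current_shard + 1) % SHARD_COUNT;
    }
}
```

The trade-off in this sketch is that an allocation which finds nothing pays for a full scan of the shards; a real allocator would likely bound or amortize that cost.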
Inspiration struck, and 200 lines of code later I had a working implementation… after Jacob helped me find a couple logic bugs.
I created Where in the World Did Carmen’s Memory Go? and used it to test a couple of specific usage patterns. The idea is to build up a robust test suite over time, along with fuzzing and benchmarking, to make it easier to try out new Allocator ideas in Zig.
After getting good scores on those contrived tests, I turned to the real world use cases of the Zig compiler itself. Since it can be built with and without libc, it’s a great way to test the performance delta between the two.
Here’s that same degenerate case above, but with a release build of the compiler - glibc zig vs no libc zig:
Benchmark 1 (32 runs): glibc/bin/zig ast-check ../lib/compiler_rt/udivmodti4_test.zig
  measurement         mean ± σ            min …    max   outliers  delta
  wall_time          156ms ± 6.58ms     151ms …  173ms   4 (13%)   0%
  peak_rss          45.0MB ± 20.9KB    45.0MB … 45.1MB   1 ( 3%)   0%
  cpu_cycles          766M ± 10.2M       754M …   796M   0 ( 0%)   0%
  instructions       3.19G ± 12.7       3.19G …  3.19G   0 ( 0%)   0%
  cache_references   4.12M ± 498K       3.88M …  6.13M   3 ( 9%)   0%
  cache_misses        128K ± 2.42K       125K …   134K   0 ( 0%)   0%
  branch_misses      1.14M ± 215K        925K …  1.43M   0 ( 0%)   0%
Benchmark 2 (34 runs): SmpAllocator/bin/zig ast-check ../lib/compiler_rt/udivmodti4_test.zig
  measurement         mean ± σ            min …    max   outliers  delta
  wall_time          149ms ± 1.87ms     146ms …  156ms   1 ( 3%)   ⚡- 4.9% ± 1.5%
  peak_rss          39.6MB ± 141KB     38.8MB … 39.6MB   2 ( 6%)   ⚡- 12.1% ± 0.1%
  cpu_cycles          750M ± 3.77M       744M …   756M   0 ( 0%)   ⚡- 2.1% ± 0.5%
  instructions       3.05G ± 11.5       3.05G …  3.05G   0 ( 0%)   ⚡- 4.5% ± 0.0%
  cache_references   2.94M ± 99.2K      2.88M …  3.36M   4 (12%)   ⚡- 28.7% ± 4.2%
  cache_misses       48.2K ± 1.07K      45.6K …  52.1K   2 ( 6%)   ⚡- 62.4% ± 0.7%
  branch_misses       890K ± 28.8K       862K …  1.02M   2 ( 6%)   ⚡- 21.8% ± 6.5%
Outperforming glibc!
And finally here’s the entire compiler building itself:
Benchmark 1 (3 runs): glibc/bin/zig build -Dno-lib -p trash
  measurement         mean ± σ            min …    max   outliers  delta
  wall_time          12.2s ± 99.4ms     12.1s …  12.3s   0 ( 0%)   0%
  peak_rss           975MB ± 21.7MB     951MB …  993MB   0 ( 0%)   0%
  cpu_cycles         88.7G ± 68.3M      88.7G …  88.8G   0 ( 0%)   0%
  instructions        188G ± 1.40M       188G …   188G   0 ( 0%)   0%
  cache_references   5.88G ± 33.2M      5.84G …  5.90G   0 ( 0%)   0%
  cache_misses        383M ± 2.26M       381M …   385M   0 ( 0%)   0%
  branch_misses       368M ± 1.77M       366M …   369M   0 ( 0%)   0%
Benchmark 2 (3 runs): SmpAllocator/fast/bin/zig build -Dno-lib -p trash
  measurement         mean ± σ            min …    max   outliers  delta
  wall_time          12.2s ± 49.0ms     12.2s …  12.3s   0 ( 0%)   + 0.0% ± 1.5%
  peak_rss           953MB ± 3.47MB     950MB …  957MB   0 ( 0%)   - 2.2% ± 3.6%
  cpu_cycles         88.4G ± 165M       88.2G …  88.6G   0 ( 0%)   - 0.4% ± 0.3%
  instructions        181G ± 6.31M       181G …   181G   0 ( 0%)   ⚡- 3.9% ± 0.0%
  cache_references   5.48G ± 17.5M      5.46G …  5.50G   0 ( 0%)   ⚡- 6.9% ± 1.0%
  cache_misses        386M ± 1.85M       384M …   388M   0 ( 0%)   + 0.6% ± 1.2%
  branch_misses       377M ± 899K        377M …   378M   0 ( 0%)   💩+ 2.6% ± 0.9%
I feel that this is a key moment in the Zig project’s trajectory. This last piece of the puzzle marks the point at which the language and standard library have become strictly better to use than C and libc.
While other languages build on top of libc, Zig instead has conquered it!
Author: Alex Rønne Petersen
One of the major things Jacob has been working on is good debugging support for Zig. This includes an LLDB fork with enhancements for the Zig language, and is primarily intended for use with Zig’s self-hosted backends. With the self-hosted x86_64 backend becoming much more usable in the upcoming 0.14.0 release, I decided to type up a wiki page with instructions for building and using the fork.
If you’re already trying out Zig’s self-hosted backend in your workflow, please take the LLDB fork for a spin and see how it works for you.
As far as I know, Zig has a bunch of things in the works for a better development experience. Almost every day there's something being worked on - like https://github.com/ziglang/zig/pull/24124 just now. I know that Zig had some plans in the past to also work on hot code swapping. At this rate of development, I wouldn't be surprised if hot code swapping was functional within a year on x86_64.
The biggest pain point I personally have with Zig right now is the speed of `comptime` - The compiler has a lot of work to do here, and running a brainF** DSL at compile-time is pretty slow (speaking from experience - it was a really funny experiment). Will we have improvements to this section of the compiler any time soon?
Overall I'm really hyped for these new backends that Zig is introducing. Can't wait to make my own URCL (https://github.com/ModPunchtree/URCL) backend for Zig. ;)
For comptime perf improvements, I know what needs to be done - I even started working on a branch a long time ago. Unfortunately, it is going to require reworking a lot of the semantic analysis code. Something that absolutely can, should, and will be done, but is competing with other priorities.
Thank you for working so hard on Zig. Really looking forward to Zig 1.0 taking the system programming language throne.
I am not sure, but why can't C, Rust and Zig, along with others (like Ada, Odin etc.) and of course C++ (how did I forget it?), just coexist?
Not sure why but I was definitely getting some game of thrones vibes from your comment and I would love to see some competition but I don't know, Just code in whatever is productive to you while being systems programming language I guess.
But I don't know low level languages so please, take my words at 2 cents.
I am just watching the Game of Thrones series right now, so this comment sounds funnier than it should to me :D.
The fight for the Iron Throne, lots of self-proclaimed kings trying to take it... C is like King Joffrey, Rust is maybe Robb Stark?! And Zig... probably princess Daenerys with her dragons.
The industry has the resources to sustain maybe two and a half proper IDEs with debuggers, profilers etc.. So much as we might wish otherwise, language popularity matters. The likes of LSP mitigate this to a certain extent, but at the moment they only go so far.
All system programming languages mentioned by GP share the same set of debuggers and profilers, though. It's not very language specific.
That's where extensible IDEs like VSCode (and with it the Language Server Protocol and Debug Adapter Protocol) come in.
It's not perfect yet, but I can do C/C++/ObjC, Zig, Odin, C3, Nim, Rust, JS/TS, Python, etc... development and debugging all in the same IDE, and even within the same project.
For Virgil I went through three different compile-time interpreters. The first walked a tree-like IR that predated SSA. Then, after SSA, I designed a linked-list-like representation specifically for interpretation speed. After dozens of little discrepancies between this custom interpreter and compile output, I finally got rid of it and wrote an interpreter that works directly on the SSA intermediate representation. In the worst case, the SSA interpreter is only 2X slower than the custom interpreter. In the best case, it's faster, and saves a translation step. I feel it is worth it because of the maintenance burden and bugs.
Have you considered hiring people to help you with these tasks so you can work in parallel and get more done quicker?
It's a funny question because, as far as I'm aware, Zig Software Foundation is the only organization among its peers that spends the bulk of its revenue directly paying contributors for their time - something I'm quite proud of.
Oh so then you're already doing that. Well then that's fine, the tasks will get done when they get done then.
>>spends the bulk of its revenue directly paying contributors
Same with the FreeBSD Foundation (P: OS Improvements):
https://freebsdfoundation.org/wp-content/uploads/2024/03/Bud...
FWIW here's what that looks like for Zig: https://ziglang.org/news/2024-financials/
Super happy about those two :)
Other Foundations are more like the "Penguin Foundation".....
Hot code swapping will be huge for gamedev. The idea that Zig will basically support it by default with a compiler flag is wild. Try doing that, clang.
Totally agree with that - although even right now zig is excellent for gamedev, considering it's performant, uses LLVM (in release modes), can compile REALLY FAST (in debug mode), it has near-seamless C integration, and the language itself is really pleasant to use (my opinion).
Visual C++ and tools like Live++ have been doing it for years.
Maybe people should occasionally move away from their UNIX and vi ways.
>Maybe people should occasionally move away from their UNIX and vi ways.
Maybe when something better comes up, but since you never invested one single minute on improving Inferno we have to wait for another Hero ;)
Yes, at a huge cost. That only works on Microsoft platforms.
MSVC++ is a nice compiler, sure, but it's not GCC or Clang. It's very easy to have a great feature set when you purposefully cut down your features to the bare minimum. It's like a high-end restaurant. The menu is concise and small and high quality, but what if I'm allergic to shellfish?
GCC and Clang have completely different goals, and they're much more ambitious. The upside of that is that they work on a lot of different platforms. The downside is that the quality of features may be lower, or some features may be missing.
I ended up switching from Zig to C# for a tiny game project because C# already supports cross-platform hot reload by default. (It’s just `dotnet watch`.) Coupled with cross-compilation, AOT compilation and pretty good C interop, C# has been great so far.
Why more games aren’t being developed in lisp is… perhaps not beyond me, but game development missed a turn a couple times.
That is basically what they do when using Lua, Python, C#, Java, but with fewer parentheses, which apparently are too scary for some folks, moving from print(x) to (print x).
There was a famous game with Lisp scripting, Abuse, and Naughty Dog used to have Game Oriented Assembly Lisp.
I had exactly the same title in mind, remember my very young self being in shock when I learned that it was lisp. If you didn't look under the hood you'd never be able to tell, it just worked.
Is comptime slowness really an issue? I'm building a JSON-RPC library and heavily relying on comptime to be able to dispatch a JSON request to an arbitrary function. Due to strict static typing, there's no way to dynamically dispatch to a function with arbitrary parameters at runtime. The only way I found was figuring out the function type mapping at compile time using comptime. I'm sure it will blow up the code size with additional copies of the comptimed code for each arbitrary function.
Yes, last time I checked, Zig's comptime was 20x slower than interpreted Python. Parsing a non-trivial JSON file at comptime is excruciatingly slow and can take minutes.
ouch, those perf numbers do hurt
> Parsing a non-trivial JSON file at comptime is excruciatingly slow
Nevertheless, impressive that you can do so!
I would argue that it's not meaningful to do so for larger files with comptime, as there doesn't seem to be a need for parsing JSON like the target platform would (comptime emulates it) - I expect it to be independent. You're also not supposed to do I/O using comptime, and @embedFile kind of falls under that. I suppose it would be better to write a build.zig for this particular use case, which I think would then also be able to run fast native code?
Is it easy to build out a custom backend? I haven't looked at it yet but I'd like to try some experiments with that -- to be specific, I think that I can build out a backend that will consume AIR and produce a memory safety report. (it would identify if you're using undefined values, stack pointer escape, use after free, double free, alias xor mut)
URCL is sending me down a rabbithole. Haven't looked super deeply yet, but the most hilarious timeline would be that an IR built for Minecraft becomes a viable compilation target for languages.
Better spend the time at comptime than at runtime. Always a benefit
This is already such a huge achievement, yet as the devlog notes, there is plenty more to come! The idea of a compiler modifying only the parts of a binary that it needs to during compilation is simultaneously refreshing and totally wild, yet now squarely within reach of the Zig project. Exciting times ahead.
> For a larger project like the Zig compiler itself, it takes the time down from 75 seconds to 20 seconds. We’re only just getting started.
Excited to see what he can do with this. He seems like a really smart guy.
What's the package management look like? I tried to get an app with QuickJS + SDL3 working, but the mess of C++ pushed me to Rust where it all just works. Would be glad to try it out in Zig too.
Package management in Zig is more manual than Rust, involving fetching the package URL using the CLI, then importing the module in your build script. This has its upsides - you can depend on arbitrary archives, so lots of Zig packages of C libraries are just a build script with a dependency on an unmodified tarball release. But obviously it's a little trickier for beginners.
SDL3 has both a native Zig wrapper: https://github.com/Gota7/zig-sdl3
And a more basic repackaging of the C library/API: https://github.com/castholm/SDL
For QuickJS, the only option is the C API: https://github.com/allyourcodebase/quickjs-ng
Zig makes it really easy to use C packages directly like this, though Zig's types are much more strict so you'll inevitably be doing a lot of casting when interacting with the API
It is pretty easy to interface stuff like QuickJS C API. I had a POC here from last year: https://github.com/eknkc/zquickjs
Even this is pretty usable, handling value conversions and such thanks to comptime. (Take a look at the tests here: https://github.com/eknkc/zquickjs/blob/master/src/root.zig)
It's also worth pointing out that the Zig std library covers a lot more than the rust one. No need for things like rustix, rand, hashbrown, and a few others I always have to add whenever I do rust stuff.
You add hashbrown as an explicit dependency? The standard library HashMap is a re-export of hashbrown. Doesn’t it work for you?
Can’t speak for the op but there’s a number of high performance interfaces that avoid redundant computations that are only available directly from hashbrown.
huh, does it? I always add it so I can make non-deterministic hashmaps in rust. oh and you need one more one for the hashing function I think.
But I did not know hashmap re-exported hashbrown, thanks.
Yep, they’re the same since Rust 1.36 (Jul 2019) - https://blog.rust-lang.org/2019/07/04/Rust-1.36.0/
https://doc.rust-lang.org/std/?search=hashbrown
looks like there's no way to access it, outside of hashmap.
Though maybe you just need the third party hasher and you can call with_hasher.
IDK man there's a lot going on with rust.
Yes that’s exactly it. If speed is a priority and the input is trusted, change the hasher to something faster like ahash. The ahash crate makes this easy.
The dmd D compiler can compile itself (debug build):
real 0m18.444s
user 0m17.408s
sys 0m1.688s
On an ancient processor (it runs so fast I just never upgraded it):
cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 15
model : 107
model name : AMD Athlon(tm) 64 X2 Dual Core Processor 4400+
stepping : 2
cpu MHz : 2299.674
cache size : 512 KB
physical id : 0
siblings : 2
core id : 0
cpu cores : 2
apicid : 0
initial apicid : 0
fpu : yes
18s eh? we're looking at 15s in https://github.com/ziglang/zig/pull/24124
oh, and by the way that includes the package manager, so the compile time accounts for:
* HTTP
* TLS (including aegis-128l, aegis-256, aes128-gcm, aes256-gcm, chacha20poly1305)
* deflate, zstd, and xz
* git protocol
Nice to hear from you, Andrew! I assume you're using a machine newer than 15 years ago :-)
I suppose it would compile faster if I didn't have symbolic debug info turned on.
Anyhow, our users often use dmd for development because of the high speed turnaround, and gdc/ldc for deployment with their more optimized code gen.
You too! Yeah I think that was a great call. I took inspiration from D for sure when aiming for this milestone that we reached today.
Some people say you should use an old computer for development to help you write faster code. I say you should use a new computer for development, and write the fastest code you possibly can by exploiting all the new CPU instructions and optimizing for newer caching characteristics.
I'm still in the camp of using computers our users tend to have.
Also, self-compile times are strongly related to how much code there is in the compiler, not just the compile speed.
I also confess to being a bit jaded on this. I've been generating code from 8086 processors to the latest. Which instructions and combinations are faster is always flip-flopping around from chip to chip. So I leave it to the gdc/ldc compilers for the top shelf speed, and just try to make the code gen bulletproof and do a solid job.
Working on the new AArch64 has been quite fun. I'll be doing a presentation on it later in the summer. My target machine is a Raspberry Pi, which is a great machine.
Having the two code generators side by side also significantly increased the build times, because it's a lot more code being compiled.
Fair enough, and yeah I hear you on the compilation cost of all the targets. We don't have aarch64 yet but in addition to x86_64 we do have an LLVM backend, a C backend, a SPIR-V backend, WebAssembly backend, RISC-V backend, and sparc backend. All that plus the stuff I mentioned earlier in 15s on a modern laptop.
I considered a C backend at one time, but C is not expressive enough. The generated code would be ugly. For example, exception handling. D's heavy reliance on common blocks in the object code is another issue. C doesn't support nested functions (static links). And so on.
Never found a user who asked for that, either :-/
Some users want a C backend, not for maintainability reasons, but for the ability to compile on platforms that have nothing but a C compiler. The maintainability or aesthetics of the C is irrelevant, it's like another intermediate representation.
Can confirm, Zig's generated C is extremely ugly. We literally treat it as an object format [1].
The MSVC limitations are maddening, from how short string literals must be, to the complete lack of inline assembly when targeting x86_64.
[1]: https://ziglang.org/documentation/0.14.1/std/#std.Target.Obj...
I bet it would be easier to write a code gen for such platforms than to wrestle with generated C code and work around the endless problems.
Anyhow, one of the curious features of D is its ability to translate C code to D code. Curious as it was never intentionally designed, it was discovered by one of our users.
D has the ability to create a .di file from a .d file, which is analogous to writing a .h file from a .c file. When D gained the ability to compile C files, you just ask it to create a .di file, and voila! The C code is translated to D!
Maybe. I don’t use those platforms, so I don’t know from experience; I just know that’s why people asked us for it in Rust.
I somehow missed that D has that! I try to read the forums now and again, but I should keep more active tabs on how stuff is going :)
I think it would be a good idea to have some kind of "speedbump" tool that makes your software slower, but in a way where optimizing it would also optimize the faster version.
I don't know whether this is technically feasible, maybe you could run it on CPUs with good power management and force them to underclock or something.
You could use QEMU to emulate an older CPU. You would need to disable KVM with -no-kvm. There's also a throttle option I found while googling this.
15s is fast, wow.
Do you have any metrics on which parts of the whole compiler, std, package manager, etc. take the longest to compile? How much does comptime slowness affect the total build time?
[edited to fix a formatting problem, sorry!]
Well, one interesting number is what happens when you limit the compiler to this feature set:
* Compilation front-end (tokenizing/parsing, IR lowering, semantic analysis)
* Our own ("self-hosted") x86_64 code generator
* Our own ("self-hosted") ELF linker
...so, that's not including the LLVM backend and LLD linker integration, the package manager, `zig build`, etc. Building this subset of the compiler (on the branch which the 15 second figure is from) takes around 9 seconds. So, 6 seconds quicker.
This is essentially a somewhat-educated guess, so it could be EXTREMELY wrong, but of those 6s, I would imagine that around 1-2 are spent on all the other codegen backends and linkers (they aren't too complex and most of them are fairly incomplete), and probably a good 3s or so are from package management, since that pulls in HTTP, TLS, zip+tar, etc. TLS in particular does bring in some of our std.crypto code which sometimes sucks up more compile time than it really should. The remaining few seconds can be attributed to some "everything else" catch-all.
Amusingly, when I did some slightly more in-depth analysis of compiler performance some time ago, I discovered that most of the compiler's time -- at least during semantic analysis -- is spent analyzing different calls to formatted printing (since they're effectively "templated at compile time" in Zig, so the compiler needs to do a non-trivial amount of work for every different-looking call to something like `std.log.info`). That's not actually hugely unreasonable IMO, because formatted printing is a super common operation, but it's an example of an area we could improve on (both in the compiler itself, and in the standard library by simplifying and speeding up `std.fmt`). This is one example of a case where `comptime` execution is a big contributor to compile times.
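To make the "templated at compile time" point concrete, here is a minimal sketch, not taken from the compiler or from the measurements above. `formatLine` and `countOpenBraces` are hypothetical names made up for illustration; the only standard library calls assumed are `std.fmt.bufPrint` and `std.debug.print`. The idea is that a function taking a `comptime` format string and `anytype` arguments has to be analyzed again for each different-looking call.

```zig
const std = @import("std");

// Hypothetical wrapper (not a std API): forwards to std.fmt.bufPrint.
// `fmt` is comptime and `args` is anytype, so each distinct combination of
// format string and argument types is analyzed as its own instantiation.
fn formatLine(buf: []u8, comptime fmt: []const u8, args: anytype) ![]u8 {
    return std.fmt.bufPrint(buf, fmt, args);
}

// Hypothetical helper: counts '{' characters in a format string.
// It is an ordinary function; the caller decides whether it runs at comptime.
fn countOpenBraces(fmt: []const u8) usize {
    var n: usize = 0;
    for (fmt) |c| {
        if (c == '{') n += 1;
    }
    return n;
}

pub fn main() !void {
    var buf: [64]u8 = undefined;

    // Two different-looking calls: two separate instantiations to analyze.
    const a = try formatLine(&buf, "{d} items", .{@as(u32, 3)});
    std.debug.print("{s}\n", .{a});

    const b = try formatLine(&buf, "{s} has {d} items", .{ "cart", @as(u32, 3) });
    std.debug.print("{s}\n", .{b});

    // `comptime` forces the loop to run inside the compiler, so this work
    // costs compile time rather than run time.
    const n = comptime countOpenBraces("{s} has {d} items");
    std.debug.print("{d}\n", .{n});
}
```

A codebase with hundreds of distinct logging calls multiplies this per-call analysis, which fits the observation above that formatted printing shows up prominently in semantic analysis time.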
However, aside from that one caveat of `std.fmt`, I would say that `comptime` slowness isn't a huge deal for many projects. Really, it depends how much they use `comptime`. You can definitely feel the limited speed of `comptime` execution if you use it heavily (e.g. try to parse a big file at `comptime`). However, most codebases are more restrained in their use of `comptime`; it's like a spice, a bit is lovely, but you don't want to overdo it! As with any kind of metaprogramming, overuse of `comptime` can lead to horribly unreadable code, and many major Zig projects have a pretty tasteful approach to using `comptime` in the right places. So for something like the Zig compiler, the speed of `comptime` execution honestly doesn't factor in that much (aside from that `std.fmt` caveat discussed above). `comptime` is very closely tied in to general semantic analysis (things like type checking) in Zig's design, so we can't really draw any kind of clear line, but on the PR I'm taking these measurements against, the threading actually means that even if semantic analysis (i.e. `comptime` execution plus more stuff) were instantaneous, we wouldn't see a ridiculous performance boost, since semantic analysis is now running in parallel to code generation and linking, and those three phases are faiiirly balanced right now in terms of speed.
In general (note that I am biased, since I'm a major contributor to the project!), I find that the Zig compiler is honestly a fair bit faster than people give it credit for. Like, it might sound pretty awful that (even after these improvements), building a "Hello World" takes (for me) around 0.3s -- but unlike C (where libc is precompiled and just needs to be linked, so the C compiler literally has to handle only the `main` you wrote), the Zig compiler is actually freshly building standard library code to handle, for instance, debug info parsing and stack unwinding in the case of a panic (code which is actually sorta complicated!). Right now, you're essentially getting a clean build of these core standard library components every time you build your Zig project (this will be improved upon in the future with incremental compilation). We're still planning to make some huge improvements to compilation speed across the board of course -- as Andrew says, we're really only just getting started with the x86_64 backend -- but I think we've already got something pretty decently fast.
Always!
Dang, I'm far behind then, I haven't even downloaded QBE[1] yet!
> Dang
Better spelled as Dlang!
I wonder if there is a guide to do that.
When I tried compiling Zig it would take ages because it would go through different stages (with the entirety of bootstrapping from wasm).
I'm stunned that Zig can compile itself in 75 seconds (even with LLVM).
We used to have such fast compile times with Turbo Pascal, and other dialects, Modula-2, Oberon dialects, across 16 bit and early 32 bit home computers.
Then everything went south, with the languages that took over mainstream computing.
Not to disagree with you, but even C++ is going through great efforts to improve compile times through C++20 modules and C++23 standard library modules (import std;). Although no compiler fully supports both, you can get an idea of how they can improve compile times with clang and libc++:
$ # No modules
$ clang++ -std=c++23 -stdlib=libc++ a.cpp # 4.8s
$ # With modules
$ clang++ -std=c++23 -stdlib=libc++ --precompile -o std.pcm /path/to/libc++/v1/std.cppm # 4.6s but this is done once
$ clang++ -std=c++23 -stdlib=libc++ -fmodule-file=std=std.pcm b.cpp # 1.5s
a.cpp and b.cpp are equivalent, but b.cpp does `import std;` and a.cpp imports every standard C++ header file (the same thing as import std; you can find the list in libc++'s std.cppm). Notice that this is an extreme example since we're importing the whole standard library, which is actually discouraged [^1]. Instead you can get through the day with just these flags: `-stdlib=libc++ -fimplicit-modules -fimplicit-module-maps` and of course -std=c++20 or later, no extra files/commands required! But you are restricted to doing import <vector>; and such, no import std.
[^1]: non-standard headers like `bits/stdc++.h`, which do the same thing (#including the whole standard library), are what is actually discouraged, because they are a. non-standard and b. bad for compile times, but I can see `import std` solving these two and being encouraged once it's widely available!
As a big fan of C++ modules (see my GitHub), I think we are decades away from widespread adoption, unfortunately.
See the regular discussions on the C++ subreddit regarding the state of modules support across the ecosystem.
C++ will be a very useful, fast, safe, and productive language in 2070.
Am I wrong about this?
Their algorithms were simpler.
Their output was simpler.
As their complexity grew, program performance grew proportionately.
Not to mention adding language convenience features (generics, closures).
tinycc is still fast. All the current single-pass compilers are fast.
Agreed, I would say the main problem is lack of focus on developer productivity.
OK, the goalposts have moved on what -O0 is expected to deliver in machine code quality, so let's have something like -ffast-compile, or an interpreter/JIT as an alternative toolchain in the box.
Practical example from D land: compile D with dmd during development, use gdc or ldc for release.