'Newt's Bunker' in 3D purple text on a wavey blue background
Cinnamoroll peaking their head with a blinking heart next to a 'Cinnamoroll' title card. Pixel art rendition of Lucario next to a 'Lucario' title card Pixel art rendition of Eevee next to an 'Eevee' title card on a blinking pink background 'Raichu' title card surrounded two Raichus 'RAWR RAWR RAWR' blinking pink 'Born to be silly' in rainbow strobing text 'im too silly!' on a pink background next to a bouncing Kirby 'HALF-LIFE FAN' in orange text on a black background with the 'a' in 'HALF' represented with a lambda symbol. 'TERRARIA PLAYER' on a Terraria styled sky background next to the Eye of Cthulu 'Computer Nerd' on a blinking blue background surrounded by two computers with monitors displaying randomly colored noise. 'I DREAM IN VECTORS,' the text is alternating between 'I DREAM' and 'IN VECTORS.' The background is strobing between a rainbow of colors. 'I love my computer <3' in pink text on a pink background next to a blinking pink computer. 'Math Nerd' on a blinking blue and white background 'Creative' on a blinking pink and white background. A pixel art bear holding an artist's palette is on the left, and a larger pixel art artist's palette is on the right. 'Anti-social' on a blinking white, blue, and black background ':3 XD :p B) :V O_O' show up in sequence followed by 'I SUPPORT EMOTICONS'
Ness as a dog or cat-like fursona, wearing a red T-shirt, a green sweater, and flip-flops. Pink text below her says 'I like computers :3'. Ness is hugging her whole PC setup in her arms, display, keyboard, mouse, and PC included. She's surrounded by two pink hearts.

Compilers: Query, Key, Value? (draft)

Compilers are very pointer heavy, as they often use linked lists for lists of operations, trees, graphs, and if they use it, single static assignment (SSA) form is riddled with pointers (SSA is a program representation where the result of an operation is a pointer to that operation, essentially meaning that a program in SSA is a graph with ‘no’ statements). This is a huge blow to performance, as compilers like LLVM (used in Clang) and GCC spend an enormous amount of time dereferencing pointers! Arrays are used, but heavy pointer use reigns supreme in compiler implementations.

However, I believe that this might change in the future. I’ve found that some recent compiler implementations have been moving towards an alternative implementation strategy known as data oriented design (and in compilers, the term ‘data oriented compiler design’ is sometimes thrown around). The idea is that instead of utilizing pointers extensively through the compiler, many data structures are flattened into a representation that fits nicely into an array (and yes, that even includes trees and graphs). Some recent compiler implementations that have moved towards data oriented design are the Carbon and Zig language compilers. The benefit of this new paradigm is primarily performance due to increased data locality. To summarize why data locality leads to improved performance: with pointers, the processor constantly is jumping around memory, but if it can stay in the same area of memory, then it can cache that general area of memory (and cached memory accesses are significantly faster!).

Recently I’ve been writing a toy compiler in the Rust programming language, and I’ve actually been utilizing this data-oriented approach…but not for performance! You see, data-oriented design can actually lead to better memory safety if done right. In my implementation, I represent a graph as an array of nodes. Instead of pointers, I use indices into that array. Where before you would need to separately allocate each node, the array only needs to be allocated once! This makes it much easier to be certain you haven’t made a memory allocation mistake. In my case, making my data structure’s memory management easier to reason about is much more important than in a language with manual memory management like C++ …because Rust doesn’t let me make mistakes. If the compiler can’t figure out where the life of every variable begins and ends, it simply refuses to compile the program. So in the ‘safe’ subset of Rust, I really don’t have any other option than to use data-oriented design (in this case, as is very often the case, Rust’s strictness encourages you to write better and more performant code, which can be both annoying and refreshing).

Also, here’s a fun fact: ‘data-oriented design’ originated from the field of game development! A common architecture in that field is ‘Entity-Component-System’ (ECS, and no, I did not make a grammatical error because ECS is not a ‘system’), which is an architectural design philosophy I’ve found very fascinating recently. I feel like the overlap between ECS, arena/stack/pool allocation, data oriented design/compiler design, and (if you bend your mind a little bit) even relational databases is perhaps larger than most people realize, which I think is just absolutely fascinating.