Engineering notes

Occasional writing on inference, compilation, and shipping models that stay small.

Why MTEB Scores Don't Predict Performance on Your Data

Benchmark leaderboards measure generalization across tasks you'll never run. Evaluate embedding models on the only distribution that matters — yours.

Read more →

Planning Memory at Compile Time

A garbage collector is a bet that you cannot predict your allocations. For a static inference graph, you can — so why pay for the bet?

Read more →

Distroless Is Not Enough

Stripping the OS helps, but the framework is the weight. The smallest container is the one that ships a single binary and nothing else.

Read more →