Why MTEB Scores Don't Predict Performance on Your Data
Benchmark leaderboards measure generalization across tasks you'll never run. Evaluate embedding models on the only distribution that matters — yours.
Occasional writing on inference, compilation, and shipping models that stay small.
A garbage collector is a bet that you cannot predict your allocations. For a static inference graph, you can, so why pay for the bet?
Stripping the OS helps, but the framework is the weight. The smallest container is the one that ships a single binary and nothing else.