ZeptonML lowers an exported PyTorch graph to a self-contained native binary. Memory is planned at compile time; the output links nothing it does not use.
Allocation, layout, and op fusion are decided once, ahead of time, rather than rediscovered on every inference.
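To make "memory is planned at compile time" concrete, here is a minimal sketch of one common way to do it: assign every intermediate tensor a fixed offset in a single arena, reusing bytes whenever two tensors are never live at the same time. This is not ZeptonML's actual planner. The `Tensor` record, the `plan_arena` function, and the sizes and lifetimes in the example are all hypothetical, and the greedy largest-first, first-fit strategy is just one simple choice.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tensor:
    """Hypothetical record for one intermediate buffer in the graph."""
    name: str
    size: int   # bytes
    first: int  # index of the op that writes it
    last: int   # index of the last op that reads it


def plan_arena(tensors):
    """Greedy first-fit offset assignment, largest tensors first.

    Returns (offsets, arena_size). Two tensors may share bytes only if
    their [first, last] lifetimes do not overlap.
    """
    placed = []   # (offset, tensor) pairs already assigned
    offsets = {}
    for t in sorted(tensors, key=lambda t: (-t.size, t.first, t.name)):
        # Byte ranges held by tensors that are live at the same time as t.
        busy = sorted(
            (off, off + p.size)
            for off, p in placed
            if not (p.last < t.first or t.last < p.first)
        )
        offset = 0
        for start, end in busy:
            if offset + t.size <= start:
                break  # t fits in the gap before this range
            offset = max(offset, end)
        offsets[t.name] = offset
        placed.append((offset, t))
    arena_size = max((offsets[t.name] + t.size for t in tensors), default=0)
    return offsets, arena_size


# Assumed example: a four-op chain where each tensor feeds the next op.
tensors = [
    Tensor("a", 400, first=0, last=1),
    Tensor("b", 400, first=1, last=2),
    Tensor("c", 200, first=2, last=3),
    Tensor("d", 200, first=3, last=4),
]
offsets, arena_size = plan_arena(tensors)
naive_size = sum(t.size for t in tensors)
```

In this example the plan fits in a smaller arena than the sum of all buffer sizes, because `c` can reuse the bytes `a` no longer needs. Since the offsets are plain constants, a compiler could bake them into the generated code so that no allocator runs at inference time.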
Only the operators your graph actually calls are emitted. Nothing else is linked.
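The idea of emitting only the operators a graph calls can be sketched in a few lines. This is an illustration under assumptions, not ZeptonML's implementation: the registry contents, the op names, and the `select_kernels` helper are hypothetical, and the graph is reduced to a flat list of op names.

```python
def select_kernels(graph_ops, registry):
    """Return only the registry entries the graph actually uses.

    graph_ops: iterable of op names, one per node (repeats allowed).
    registry:  dict mapping op name -> kernel source (hypothetical).
    """
    used = sorted(set(graph_ops))
    missing = [op for op in used if op not in registry]
    if missing:
        raise KeyError("no kernel for: " + ", ".join(missing))
    return {op: registry[op] for op in used}


# Hypothetical registry of every kernel the compiler knows how to emit.
registry = {
    "conv2d": "void k_conv2d(void);",
    "layer_norm": "void k_layer_norm(void);",
    "matmul": "void k_matmul(void);",
    "relu": "void k_relu(void);",
    "softmax": "void k_softmax(void);",
}

# Assumed example graph: five nodes, three distinct ops.
graph_ops = ["conv2d", "relu", "conv2d", "relu", "matmul"]
emitted = select_kernels(graph_ops, registry)
```

Here `emitted` holds three kernels, and `softmax` and `layer_norm` never reach the output. Failing loudly on an unknown op, rather than silently skipping it, is a design choice made for this sketch.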