Does the vector index go cold under churn? We tried to break it
A v0.2.1 bug let the HNSW index go cold under heavy write-and-flush churn — semantic search silently stopped finding neighbours. Here's the soak test that reproduces it and proves v0.2.2 fixed it.
The worst bugs are the silent ones. A crash pages you; a cache that quietly stops finding semantic neighbours just slowly stops saving you money, and nobody notices until the bill arrives. That was the shape of a P0 we hit in v0.2.1: under heavy churn, the HNSW vector index could go 'cold' — still up, still answering, but no longer returning the neighbours it should.
So we wrote a soak test designed to trigger exactly that condition: 24 tenants, 120 operations each, three rounds. That works out to roughly 1,200 stores interleaved with flushes, driving the live vector count up past 2,200. Before each round, and again after the churn, the test asks the same similarity question and counts the neighbours it gets back.
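To make the shape of that test concrete, here is a minimal sketch in Python. It is not our harness: ToyVectorStore is a hypothetical stand-in that uses brute-force cosine search rather than HNSW, and the flush rate, vector dimension, similarity threshold, and seed are all assumptions picked for illustration. Only the tenant, operation, and round counts come from the description above.

```python
import math
import random

DIM = 32  # assumed embedding size, for illustration only


def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


class ToyVectorStore:
    """Hypothetical stand-in for the real store: brute force, no HNSW."""

    def __init__(self):
        self.vectors = {}  # (tenant, key) -> vector

    def store(self, tenant, key, vec):
        self.vectors[(tenant, key)] = vec

    def flush(self, tenant):
        # Drop every vector owned by this tenant and nothing else.
        for rid in [rid for rid in self.vectors if rid[0] == tenant]:
            del self.vectors[rid]

    def neighbours(self, query, threshold=0.95):
        return [rid for rid, vec in self.vectors.items()
                if cosine(query, vec) >= threshold]


def run_soak(tenants=24, ops_per_tenant=120, rounds=3,
             flush_rate=0.02, seed=7):
    """Churn the store and return the probe's neighbour count at each checkpoint."""
    rng = random.Random(seed)
    store = ToyVectorStore()

    # A sentinel vector that no tenant flush ever touches. The probe query
    # is the same vector, so it should always find at least this neighbour.
    probe = [rng.gauss(0, 1) for _ in range(DIM)]
    store.store("sentinel", "probe", probe)

    counts = []
    for round_no in range(rounds):
        counts.append(len(store.neighbours(probe)))  # before the round
        for op in range(ops_per_tenant):
            for tenant in range(tenants):
                if rng.random() < flush_rate:
                    store.flush(tenant)
                else:
                    vec = [rng.gauss(0, 1) for _ in range(DIM)]
                    store.store(tenant, (round_no, op), vec)
    counts.append(len(store.neighbours(probe)))  # after the churn
    return counts


if __name__ == "__main__":
    counts = run_soak()
    # A warm index finds the sentinel at every checkpoint.
    assert counts[0] >= 1 and len(set(counts)) == 1, counts
    print("neighbour counts per checkpoint:", counts)
```

The design point is the sentinel: because nothing in the churn is allowed to delete it, any drop in the neighbour count can only mean the index lost track of a live vector.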
The index stayed warm: the probe found the same neighbour before the churn and after it. The P0 is fixed in v0.2.2.
The fix ties the HNSW index's lifecycle to the store's flush-and-compaction cycle, so vectors now survive the churn that previously orphaned them. The soak test is a permanent part of the harness: the bug that hid in silence has a loud, automated witness that runs on every image.
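For readers who want a picture of what "orphaned" means here, the sketch below models the invariant in Python. It is an illustration of the idea, not the v0.2.2 patch: LifecycleCoupledStore, its fields, and its methods are all hypothetical, and the "index" is just a set of reachable ids rather than a real HNSW graph.

```python
class LifecycleCoupledStore:
    """Hypothetical model: records and index entries change together."""

    def __init__(self):
        self.records = {}     # (tenant, key) -> vector, the source of truth
        self.indexed = set()  # ids the (imaginary) vector index can reach

    def store(self, tenant, key, vec):
        rid = (tenant, key)
        self.records[rid] = vec
        self.indexed.add(rid)

    def flush(self, tenant):
        # Remove the tenant's records and exactly those index entries,
        # in the same step, so neither side can drift from the other.
        for rid in [rid for rid in self.records if rid[0] == tenant]:
            del self.records[rid]
            self.indexed.discard(rid)
        self.check_invariant()

    def orphans(self):
        """Live records the index can no longer reach."""
        return set(self.records) - self.indexed

    def check_invariant(self):
        assert not self.orphans(), f"orphaned vectors: {self.orphans()}"


if __name__ == "__main__":
    store = LifecycleCoupledStore()
    store.store("tenant-a", "q1", [0.1, 0.2])
    store.store("tenant-b", "q1", [0.3, 0.4])
    store.flush("tenant-a")
    # tenant-b's vector is still live and still reachable.
    assert ("tenant-b", "q1") in store.indexed
    assert store.orphans() == set()
```

A cold index, in these terms, is one where orphans() is non-empty: the store still holds the vector, but no search can reach it.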
The cure for a silent failure is a noisy test. We don't trust the index to stay warm; we make a robot try to freeze it on every build.