Zvec: A lightweight, fast, in-process vector database
211 points by dvrp 3 days ago | 34 comments
luoxiaojian 11 hours ago
Author here. Thanks everyone for the interest and thoughtful questions! I've noticed many of you are curious about how we achieved the performance numbers and how we compare to other solutions. We're currently working on a detailed blog post that walks through our optimization journey—expect it after the Lunar New Year. We'll also be adding more benchmark comparisons to the repo and blog soon. Stay tuned!
reply
aktuel 16 hours ago
I recently discovered https://www.cozodb.org/ which also has vector search built-in. I've just started some experiments with it, but so far I'm quite impressed. It's not in active development at the moment, but it already seems well rounded for what it is, so depending on the use case that may not matter, or may even be an advantage. Also, with today's coding agents it shouldn't be too hard to scratch your own itch if needed.
reply
cmrdporcupine 8 hours ago
cozodb is quite impressive, and I've wondered about its funding sources, if any. I've watched it for some years; the developer seems to have made a real passion project out of it, but you're right that development seems to have tapered off.
reply
OfficialTurkey 22 hours ago
I haven't been following the vector db space closely for a couple years now, but I find it strange that they didn't compare their performance to the newest generation serverless vector dbs: Pinecone Serverless, turbopuffer, Chroma (distributed, not the original single-node implementation). I understand that those are (mostly) hosted products so there's not a true apples-to-apples comparison with the same hardware, but surely the most interesting numbers are cost vs performance.
reply
cjonas 23 hours ago
How does this compare to DuckDB's vector capabilities (the vss extension)?
reply
jgalt212 20 hours ago
Yes, nothing on that or sqlite-vec (both of which seem like apples-to-apples comparisons).
reply
mceachen 20 hours ago
I maintain a fork of sqlite-vec (because there hasn't been activity on the main repo for more than a year): sqlite-vec is great for smaller dimensionality or smaller cardinality datasets, but know that it's brute-force, and query latency scales exactly linearly. You only avoid full table scans if you add filterable columns to your vec0 table and include them in your WHERE clause. There's no probabilistic lookup algorithm in sqlite-vec.
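A minimal pure-Python sketch of the linear-scan behavior described above (this is an illustration of brute-force search in general, not sqlite-vec's actual implementation):

```python
import math

def l2(a, b):
    # Euclidean distance between two equal-length vectors
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def brute_force_knn(query, rows, k=3):
    # Scan every row: cost grows linearly with table size,
    # which is the "exactly linear" latency scaling described above
    scored = sorted((l2(query, vec), rowid) for rowid, vec in rows)
    return [rowid for _, rowid in scored[:k]]

rows = [(i, [float(i), float(i)]) for i in range(1000)]
print(brute_force_knn([5.2, 5.2], rows))  # -> [5, 6, 4]
```

Every query touches all 1000 rows; doubling the table doubles the latency, which is why pre-filtering on indexed columns is the only way to cut the scan down.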
reply
skybrian 24 hours ago
Are these sort of similarity searches useful for classifying text?
reply
CuriouslyC 24 hours ago
Embeddings are good at partitioning document stores at a coarse grained level, and they can be very useful for documents where there's a lot of keyword overlap and the semantic differentiation is distributed. They're definitely not a good primary recall mechanism, and they often don't even fully pull weight for their cost in hybrid setups, so it's worth doing evals for your specific use case.
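The hybrid setup mentioned above can be sketched as a weighted blend of a keyword score and a vector score. This is a toy illustration with a crude term-overlap score standing in for BM25; the weighting scheme and function names are assumptions, not any particular library's API:

```python
import math
from collections import Counter

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def keyword_score(query_terms, doc_terms):
    # Fraction of query terms found in the document (BM25 stand-in)
    overlap = Counter(query_terms) & Counter(doc_terms)
    return sum(overlap.values()) / max(len(query_terms), 1)

def hybrid_score(q_terms, q_vec, doc_terms, doc_vec, alpha=0.5):
    # alpha blends keyword recall with semantic similarity; evals for
    # your use case should decide alpha (and whether the vector term
    # earns its cost at all, per the comment above)
    return alpha * keyword_score(q_terms, doc_terms) \
        + (1 - alpha) * cosine(q_vec, doc_vec)
```

With alpha=1.0 the vector component is ignored entirely, which makes it easy to A/B the embedding's contribution in an eval.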
reply
stephantul 16 hours ago
Yes. This is known as a kNN classifier. kNN classifiers are usually worse than other simple classifiers, but they are trivial to update and use.
See e.g., https://scikit-learn.org/stable/auto_examples/neighbors/plot...
reply
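A kNN classifier over embeddings is short enough to sketch in pure Python (a minimal majority-vote version; the example labels and vectors are made up):

```python
import math
from collections import Counter

def knn_classify(query, examples, k=3):
    # examples: list of (vector, label); vote among the k nearest
    by_dist = sorted(examples, key=lambda ex: math.dist(query, ex[0]))
    votes = Counter(label for _, label in by_dist[:k])
    return votes.most_common(1)[0][0]

# Hypothetical 2-d "embeddings" of labeled documents
train = [([0.1, 0.1], "sports"), ([0.2, 0.0], "sports"),
         ([0.9, 1.0], "finance"), ([1.0, 0.8], "finance")]
print(knn_classify([0.15, 0.05], train))  # -> "sports"
```

Updating the classifier is just appending to the list, which is the "trivial to update" property noted above.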
neilellis 23 hours ago
Yes, also for semantic indexes. I use one for person/role/org matching, so that CEO == chief executive ~= managing director. Good when you have grey data and multiple lookup data sources that use different terms.
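The term-matching idea above amounts to ranking candidate terms by cosine similarity to the query term's embedding. A toy sketch with hand-made 3-d vectors (a real system would use an embedding model; these values are fabricated to illustrate the ranking):

```python
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a))
                  * math.sqrt(sum(y * y for y in b)))

# Hypothetical toy vectors standing in for real embeddings
vecs = {
    "ceo":               [0.9, 0.8, 0.1],
    "chief executive":   [0.9, 0.8, 0.1],  # identical -> similarity 1.0
    "managing director": [0.8, 0.9, 0.2],  # close, but not identical
    "intern":            [0.1, 0.2, 0.9],  # semantically far
}

q = vecs["ceo"]
ranked = sorted(vecs, key=lambda t: -cosine(q, vecs[t]))
print(ranked)  # "chief executive" ties at 1.0; "intern" ranks last
```

The == vs ~= distinction in the comment maps to similarity 1.0 vs. merely high similarity here.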
reply
esafak 24 hours ago
You could assign the cluster based on what the k nearest neighbors are, if there is a clear majority. The quality will depend on the suitability of your embeddings.
reply
OutOfHere 24 hours ago
It depends entirely on the quality and suitability of the embedding vector you provide. Even with a long embedding vector from a recent model, my estimate is that the classification will be better than random but not very accurate. You would typically do better by asking a large model directly for a classification. The good thing is that it is often easy to create a small human-labeled dataset and estimate the confusion matrix for each approach.
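Estimating the confusion matrix from a small labeled set is a few lines; a minimal sketch (the gold/predicted labels below are made-up stand-ins for a human-labeled eval set and one approach's outputs):

```python
from collections import Counter

def confusion_matrix(y_true, y_pred):
    # Counter keyed by (true, predicted) label pairs
    return Counter(zip(y_true, y_pred))

# Hypothetical small eval set: human gold labels vs. one classifier
gold      = ["spam", "spam", "ham", "ham", "ham"]
predicted = ["spam", "ham",  "ham", "ham", "spam"]

cm = confusion_matrix(gold, predicted)
accuracy = sum(n for (t, p), n in cm.items() if t == p) / len(gold)
print(dict(cm), accuracy)  # accuracy = 3/5 = 0.6
```

Running the same eval set through each approach (embedding kNN vs. asking a large model) gives directly comparable error profiles.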
reply
dmezzetti 9 hours ago
Very interesting!
It would be great to see how it compares to Faiss / HNSWLib etc. I'd consider integrating it into txtai as an ANN backend.
reply
wittlesus 19 hours ago
[dead]
reply
yawnxyz 18 hours ago
useful for adding semantic search to tiny bits of data, e.g. collections of research papers in a folder on my computer, etc.
For web stuff, e.g. community/forums/docs/small sites which usually don't even have 1M rows of data, precomputing embeddings, storing them, and running a small vector search like this somewhere is much simpler/cheaper than running external services.
It's the operational hassle of not having to deal with a dozen+ external services, logins, APIs, even if they're free.
(I do like mixed bread for that, but I'd prefer it to be on my own lightweight server or serverless deployment)
reply
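The precompute-and-scan approach described above can be sketched minimally. The documents, URLs, and 3-d vectors here are fabricated; in practice a build step would call an embedding model once per document and persist the vectors alongside the site:

```python
import math

# Hypothetical precomputed embeddings for a small site
index = [
    {"url": "/docs/install", "vec": [0.9, 0.1, 0.0]},
    {"url": "/docs/search",  "vec": [0.1, 0.9, 0.1]},
    {"url": "/forum/intro",  "vec": [0.2, 0.2, 0.9]},
]

def search(q_vec, index, k=2):
    # Cosine similarity; at well under 1M rows a plain scan is fine
    def cos(a, b):
        dot = sum(x * y for x, y in zip(a, b))
        return dot / (math.hypot(*a) * math.hypot(*b))
    return sorted(index, key=lambda d: -cos(q_vec, d["vec"]))[:k]

hits = search([0.85, 0.2, 0.05], index)
print([d["url"] for d in hits])  # -> ['/docs/install', '/docs/search']
```

Only the query needs embedding at request time, so the serving side stays a single lightweight process with no external service in the loop.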
dev_l1x_be 15 hours ago
I think the question really is: can I turn my search problem into an in-process vector search problem, where I can scale with the number of processes?
reply
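Scaling by process count usually means sharding the vectors and merging per-shard top-k results. A minimal sketch of the merge logic (the shards run sequentially here; in a scaled deployment each shard would live in its own process and be searched in parallel):

```python
import math
import heapq

def search_shard(q, shard, k):
    # Per-shard top-k by Euclidean distance; nsmallest returns a
    # sorted list, which heapq.merge below relies on
    dists = ((math.dist(q, v), i) for i, v in shard)
    return heapq.nsmallest(k, dists)

def sharded_search(q, shards, k=3):
    # Merge the per-shard top-k lists into one global top-k
    per_shard = (search_shard(q, s, k) for s in shards)
    return heapq.nsmallest(k, heapq.merge(*per_shard))

vectors = [(i, [float(i)]) for i in range(100)]
shards = [vectors[:50], vectors[50:]]
print([i for _, i in sharded_search([42.3], shards)])  # -> [42, 43, 41]
```

Because each shard only needs to return its own top-k, the merge step stays cheap no matter how many processes you fan out to.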
I'd love to see those results independently verified, and I'd also love a good explanation of how they're getting such great performance.
Typically, the recipe is to keep the hot parts of the data structure in CPU caches (SRAM) and use a lot of SIMD. At the time of those measurements, USearch used ~100 custom kernels for different data types, similarity metrics, and hardware platforms. The upcoming release of the underlying SimSIMD micro-kernels project will push this number beyond 1000. So we should be able to squeeze a lot more performance later this year.
That said, self-reported numbers only go so far—it'd be great to see USearch in more third-party benchmarks like VectorDBBench or ANN-Benchmarks. Those would make for a much more interesting comparison!
On the technical side, USearch has some impressive work, and you're right that SIMD and cache optimization are well-established techniques (definitely part of our toolbox too). Curious about your setup though—vector search has a pretty uniform compute pattern, so while 100+ custom kernels are great for adapting to different hardware (something we're also pursuing), I suspect most of the gain usually comes from a core set of techniques, especially when you're optimizing for peak QPS on a given machine and index type. Looking forward to seeing what your upcoming release brings!
And we always welcome independent verification—if you have any questions or want to discuss the results, feel free to reach out via GitHub Issues or our Discord.
You're absolutely right that a basic HNSW implementation is relatively straightforward. But achieving this level of performance required going beyond the usual techniques.
A better comparison would be with Meta's FAISS.
This sort of behaviour is now absolutely rampant in the AI industry.