- Hybrid recall + reranker: Two searches merged, then re-scored for best matches
- Supersession: Old facts get hidden, new ones take their place
- Decay: Recent or often‑used memories get a score boost
- DLS: Each user only sees their own documents
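For anyone who wants to see how those four pieces could fit together, here is a rough sketch, not the article's implementation. It assumes the merge step is reciprocal rank fusion (the summary only says "merged"), that decay is a simple half-life on age (the "often-used" boost is left out), and that the reranker model is skipped entirely. The field names `owner`, `age_days`, and `superseded_by` are made up for illustration.

```python
def rrf_merge(rankings, k=60):
    """Fuse ranked lists of doc ids with reciprocal rank fusion.

    k=60 is a commonly used constant; treat it as a tunable assumption.
    """
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return scores


def decay_boost(score, age_days, half_life_days=30.0):
    """Halve the score for every half_life_days of age (hypothetical decay rule)."""
    return score * 0.5 ** (age_days / half_life_days)


def recall_memories(keyword_hits, vector_hits, memories, user_id):
    """Merge two result lists, hide what the user must not see, apply decay."""
    fused = rrf_merge([keyword_hits, vector_hits])
    visible = []
    for doc_id, score in fused.items():
        memory = memories[doc_id]
        if memory["owner"] != user_id:
            continue  # stand-in for document-level security (DLS)
        if memory["superseded_by"] is not None:
            continue  # supersession: a newer fact replaced this one
        visible.append((doc_id, decay_boost(score, memory["age_days"])))
    return sorted(visible, key=lambda pair: pair[1], reverse=True)


# Toy data: "b" was superseded by "d", and "c" belongs to another user.
memories = {
    "a": {"owner": "u1", "age_days": 0, "superseded_by": None},
    "b": {"owner": "u1", "age_days": 0, "superseded_by": "d"},
    "c": {"owner": "u2", "age_days": 0, "superseded_by": None},
    "d": {"owner": "u1", "age_days": 30, "superseded_by": None},
}
ranked = recall_memories(["a", "b", "c"], ["b", "d", "a"], memories, "u1")
print([doc_id for doc_id, _ in ranked])  # ['a', 'd']
```

In a real deployment the access filter would be enforced by the search backend rather than in application code; it is inlined here only to keep the sketch self-contained.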
It seems like a cool approach. Don't know if it's novel but it's much smarter than "shove markdown files into directories".
Is it, though? I mean, is there evidence that a "bunch of markdown files" is bad while a "database the model has to be instructed how to use" is good? `rg` is fast as hell. Markdown is the LLM's native tongue. It does require maintenance to keep the Markdown files current, but maybe explicit management is fine. The models can do the grunt work.
BMDF (Bunch of Markdown Files) can be checked into the git repo. They travel to any developer on the project without any setup or special auth, any agent and any model can read them with no special tools to install, and humans can easily poke around and read them, too. They can also be part of the PR review process, documenting the code and intentions.
I can't come up with good arguments for why a database or search index would be better than documentation in Markdown for any of my projects.
tl;dr https://www.elastic.co/search-labs/blog/agent-memory-elastic...
It took me a while to wrap my head around the two terms since they seem similar -- but Precision is basically "did I get mostly good results" and Recall is "did I get most of the good results", and they're subtly different. :)
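In search evaluation these two measures are conventionally called precision and recall. A tiny worked example may make the difference concrete; the document ids are invented, and the function is just the textbook set-based definition.

```python
def precision_recall(retrieved, relevant):
    """Return (precision, recall) for one query using set overlap."""
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    # Precision: what share of what I got back is good?
    precision = hits / len(retrieved) if retrieved else 0.0
    # Recall: what share of the good stuff did I get back?
    recall = hits / len(relevant) if relevant else 0.0
    return precision, recall


# Four results came back and three of them are good,
# but ten good documents exist in total.
retrieved = ["d1", "d2", "d3", "d4"]
relevant = {f"d{i}" for i in (1, 2, 3, 5, 6, 7, 8, 9, 10, 11)}
precision, recall = precision_recall(retrieved, relevant)
print(precision, recall)  # 0.75 0.3
```

So a result list can look "mostly good" (high precision) while still missing most of what was worth finding (low recall).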
Those two terms, though, will unlock as deep a rabbit-hole as you'd like on the subject.
"Good" is subjective.
I think the challenge is to teach people how ranking works more effectively, so that they can build it themselves and host it on their own.
Like the other day, someone who has worked in search explained to me why you would care about using a learning-to-rank (LTR) technique to train your own feature weights on your data. My understanding is that weighted features work better (retrieval-wise) on textual data than plain BM25 and vector-embedding indexing of text chunks of your data with minimal preprocessing. So if you have lots of conversations, you can create a ton of features (like attributes of a conversation) from them, and the ones that matter more will rank higher. And you can use regularization (like L1) to kill the unimportant ones.
[EDIT]: IIUC, I think LTR is important because you likely want different features to matter more for different parts of your documents, e.g. what matters for codebase documentation is different from your personal journal.
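To make the "learn the weights, let L1 kill the useless ones" idea concrete, here is a toy sketch. It is a pointwise setup (logistic regression on relevant / not relevant, then rank by score) trained with proximal gradient descent, which is only one simple way to do it; production LTR usually uses pairwise or listwise objectives. The feature names and all numbers are invented for illustration.

```python
import math

FEATURES = ["title_match", "body_overlap", "noise"]  # hypothetical features


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def soft_threshold(value, threshold):
    """Proximal step for the L1 penalty: shrink toward zero, clip at zero."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0


def train_l1_ranker(rows, labels, l1=0.05, lr=0.5, steps=500):
    """Fit per-feature weights with L1-regularized logistic regression."""
    weights = [0.0] * len(rows[0])
    for _ in range(steps):
        grads = [0.0] * len(weights)
        for row, label in zip(rows, labels):
            error = sigmoid(sum(w * x for w, x in zip(weights, row))) - label
            for j, x in enumerate(row):
                grads[j] += error * x / len(rows)
        # Gradient step on the loss, then shrink each weight by lr * l1.
        weights = [soft_threshold(w - lr * g, lr * l1)
                   for w, g in zip(weights, grads)]
    return weights


def rank(rows, weights):
    """Return row indices ordered from highest to lowest learned score."""
    scores = [sum(w * x for w, x in zip(weights, row)) for row in rows]
    return sorted(range(len(rows)), key=lambda i: scores[i], reverse=True)


# Centered toy features; label 1 = relevant, 0 = not relevant.
rows = [
    [1.0, 0.5, 0.03],
    [1.0, -0.5, -0.02],
    [-1.0, -0.5, 0.04],
    [-1.0, -0.5, -0.01],
]
labels = [1, 1, 0, 0]

weights = train_l1_ranker(rows, labels)
print(dict(zip(FEATURES, weights)))
print(rank(rows, weights))
```

In this toy data the `noise` weight ends at exactly zero because its gradient never exceeds the penalty, while `title_match` keeps a positive weight. Training one such model per corpus (code docs vs. a personal journal) is how different features could end up mattering for different kinds of documents.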
The point about memory is that sometimes you remember something in great detail and sometimes you only remember that the memory exists, so having a good tool loop that attempts a recall and tries permutations is good.
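A rough sketch of what such a recall loop might look like, assuming a hypothetical `search` callable that takes a query string and returns a list of hits. The variant strategy (full query, then drop one term at a time, then single terms) is just one possible choice.

```python
def recall_with_retries(search, terms, max_attempts=5):
    """Try the full query first, then progressively looser variants.

    Returns (hits, attempted_queries); hits is empty if nothing matched.
    """
    candidates = [" ".join(terms)]
    # Drop one term at a time, in case one remembered detail is wrong.
    candidates += [" ".join(t for j, t in enumerate(terms) if j != i)
                   for i in range(len(terms))]
    # Finally fall back to single terms.
    candidates += list(terms)

    attempts = []
    for query in candidates:
        if not query or query in attempts:
            continue
        if len(attempts) >= max_attempts:
            break
        attempts.append(query)
        hits = search(query)
        if hits:
            return hits, attempts
    return [], attempts


# Stand-in memory store: a note matches if it contains every query word.
NOTES = [
    "the deploy script was moved last sprint",
    "staging database password was rotated",
]


def search_notes(query):
    words = query.split()
    return [note for note in NOTES if all(w in note.split() for w in words)]


hits, attempts = recall_with_retries(search_notes, ["deploy", "script", "location"])
print(hits)
print(attempts)
```

Here the half-remembered term "location" is not in any note, so the loop only succeeds once it drops that term and retries with "deploy script".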
This seems to be coming from the “we must make Elasticsearch AI-compatible” department more than anything.
Saying “just use SQLite” completely dismisses the idea that this is a _shared_ memory across teams. The ability to easily connect to the remote service and have everything “just work” pays dividends when you have dozens or hundreds of users.
> This seems to be coming from the “we must make Elasticsearch AI-compatible” department more than anything.
I don't see the problem in that. It'd be great to have agentic capabilities embedded into Kibana and ES as long as it's not user hostile.
Maintaining Elasticsearch isn't free, but picking an underpowered db and having to port to the right one is also quite time consuming.
Also, I've run ES on an old laptop and it worked really well, so the cost can be pretty low while you're still in development.