CrabTrap: An LLM-as-a-judge HTTP proxy to secure agents in production
53 points by pedrofranceschi 9 hours ago | 8 comments
DANmode 38 minutes ago
We’re supposed to be fixing LLM security by adding a non-LLM layer to it, not adding LLM layers to stuff to make them inherently less secure.
This will be a neat concept for the types of tools that come after the present iteration of LLMs.
Unless I’m sorely mistaken.
snug 32 minutes ago
I think this can be great as an additional layer of security: have a non-LLM layer do some analysis with static rules first, and only if something seems fishy run it through the LLM judge. That way you don't have to run every request through the judge, which would be very expensive.
Edit: actually looks like it has two policy engines embedded
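The tiered design snug describes could be sketched roughly like this. Everything here is an illustration, not CrabTrap's actual API: the rule patterns, the allow-list, and the `judge_with_llm` stub are all hypothetical stand-ins for the cheap deterministic layer and the expensive LLM judge.

```python
import re

# Cheap deterministic layer: hard deny patterns and a host allow-list.
# These rules are illustrative, not taken from CrabTrap.
DENY_PATTERNS = [
    re.compile(r"(?i)ignore (all )?previous instructions"),
    re.compile(r"(?i)\bcurl .*\|\s*(ba)?sh\b"),
]
ALLOW_HOSTS = {"api.internal.example.com"}


def judge_with_llm(body: str) -> bool:
    """Placeholder for the expensive LLM-as-a-judge call.

    A real implementation would forward the request to a judge model
    and parse its verdict; here we just stub it out as 'approve'.
    """
    return True


def filter_request(host: str, body: str) -> str:
    # 1. Static rules run on every request -- fast and deterministic.
    if host not in ALLOW_HOSTS:
        return "deny"
    if any(p.search(body) for p in DENY_PATTERNS):
        return "deny"
    # 2. Only borderline-looking traffic pays the latency and token
    #    cost of the LLM judge (a crude "looks fishy" heuristic here).
    if "instructions" in body.lower():
        return "allow" if judge_with_llm(body) else "deny"
    return "allow"
```

The point of the ordering is that the deterministic layer bounds the worst case: a request the static rules deny never reaches the judge, so gaming the judge can't override a hard rule.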
windexh8er 22 minutes ago
And we don't think the judge can/will be gamed? Also... It's an LLM, it's going to add delay and additional token burn. One subjective black box protecting another subjective black box. I mean, what couldn't go wrong?
ImPostingOnHN 22 minutes ago
What happens when a prompt injection attack exploits the judge LLM and results in a higher level of attacker control than if it never existed?
reassess_blind 34 minutes ago
It looks as if this tool has traditional static rules to allow/deny requests, as well as a secondary LLM-as-a-judge layer for, I imagine, the kinds of rules that would be messy or too convoluted to implement using standard rules.
SkyPuncher 36 minutes ago
Defense in depth. Layers don't inherently make something less secure. Often, they make it more secure.
yakkomajuri 32 minutes ago
I do think this is likely to make things more secure, but it's also dangerous: it can give users a false sense of complete security when the security layer is probabilistic rather than deterministic.
EDIT: it does seem to have a deterministic layer too, and I think that's great
I think you're spot on that so far it's been all or nothing. Either you give an agent a lot of access and it's really powerful but proportionally dangerous, or you lock it down so much that it's no longer useful.
I like a lot of the ideas you show here, but I also worry that LLM-as-a-judge is fundamentally a probabilistic guardrail, and so inherently limited. How do you see this? It feels dangerous to rely on a security system based not on hard limitations but on probabilities.