Show HN: CLI tool for detecting non-exact code duplication with embedding models
18 points by rkochanowski 2 hours ago | 5 comments

rkochanowski 2 hours ago
I built Slopo to solve one specific problem: finding the similar code that other tools, AI coding agents, and humans find hardest to detect.

It finds similar-looking code using embeddings, which catches more than copy-paste clones or clones with minor changes. The trade-off is that similar code is often not a clone worth refactoring, so initial results need to be verified, but coding agents can do this quickly. Example prompts are available at https://slopo.dev
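
To illustrate the general idea of comparing code by vector similarity, here is a minimal sketch. It is not Slopo's actual implementation: `embed` is a hypothetical stand-in that counts identifier tokens instead of calling a real embedding model, and `similar_pairs` and its `threshold` parameter are names made up for this example.

```python
import math
import re
from collections import Counter


def embed(code):
    # Hypothetical stand-in for an embedding model: a bag of identifier tokens.
    # A real embedding model would return a dense vector instead.
    return Counter(re.findall(r"[A-Za-z_]\w*", code))


def cosine(a, b):
    # Cosine similarity between two sparse token-count vectors.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def similar_pairs(snippets, threshold):
    # snippets maps a name (e.g. a file path) to its source text.
    # Returns (name_a, name_b, score) for pairs at or above the threshold,
    # most similar first.
    names = sorted(snippets)
    vectors = {name: embed(snippets[name]) for name in names}
    pairs = []
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            score = cosine(vectors[a], vectors[b])
            if score >= threshold:
                pairs.append((a, b, score))
    return sorted(pairs, key=lambda p: p[2], reverse=True)
```

A token-count vector penalizes renamed variables heavily, which is exactly the weakness a learned embedding is meant to avoid. The sketch only shows the compare-and-threshold step, and why a loose threshold produces candidates that still need review.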

Additionally, similar code located far apart in the codebase is ranked higher, to surface less obvious duplication.
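
One way to picture distance-aware ranking is the sketch below. The scoring formula, the `distance_weight` parameter, and the definition of distance as directory hops are all assumptions made for illustration, not a description of how Slopo actually ranks results.

```python
from pathlib import PurePosixPath


def path_distance(a, b):
    # Directory hops between two files: steps up from a's directory to the
    # common ancestor, plus steps down to b's directory.
    parts_a = PurePosixPath(a).parent.parts
    parts_b = PurePosixPath(b).parent.parts
    common = 0
    for x, y in zip(parts_a, parts_b):
        if x != y:
            break
        common += 1
    return (len(parts_a) - common) + (len(parts_b) - common)


def rank(pairs, distance_weight=0.1):
    # pairs: iterable of (path_a, path_b, similarity).
    # Assumed scoring: boost similarity by how far apart the two files are.
    def score(pair):
        path_a, path_b, similarity = pair
        return similarity * (1 + distance_weight * path_distance(path_a, path_b))

    return sorted(pairs, key=score, reverse=True)
```

With this weighting, a 0.85 match between `src/api/users.py` and `tools/export/report.py` outranks a 0.90 match between two files in the same directory, because the first pair is four directory hops apart.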

The results differ a lot depending on the codebase. In some codebases most of the detected duplicates are false positives, but the remaining ones are strong candidates for refactoring, or even bugs. In others it reveals much more real duplication.

realxrobau 2 hours ago
If it did PHP I would love to run it over WordPress. What would it take to add that?
rkochanowski 42 minutes ago
PHP support can be added easily; I will release a new version soon.
raro11 14 minutes ago
Thank you
murats 30 minutes ago
Nice idea. I can see this being useful before refactors, especially when the duplication is semantic rather than copy-paste.