Show HN: I built a tool to un-dumb Claude Code's CLI output (Local Log Viewer)
68 points by matt1398 5 days ago | 44 comments
Hi HN,

I built this because I got tired of the Claude Code CLI hiding details from me.

Recent updates have replaced critical output with summaries like "Read 3 files" or "Edited 2 files". To see what actually happened, I was forced to use `--verbose`, which floods the terminal with unreadable JSON and system prompts.

I wanted a middle ground: *Full observability without the noise.*

`claude-devtools` is a local Electron app that tails the session logs in `~/.claude/` to reconstruct the execution trace in real-time.
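To make "tailing" concrete, here is a minimal Python sketch of reading only the newly appended, complete lines of a JSONL log. It is an illustration, not the tool's actual code (which is an Electron app). The helper name `read_new_records` and the `type` values in the demo are assumptions made for the example.

```python
import json
import os
import tempfile


def read_new_records(path, offset):
    """Return (records, new_offset) for complete JSONL lines appended since offset."""
    records = []
    with open(path, "rb") as f:
        f.seek(offset)
        while True:
            line = f.readline()
            # Stop at EOF, or at a partial line that is still being written.
            if not line or not line.endswith(b"\n"):
                break
            offset += len(line)
            text = line.strip()
            if not text:
                continue
            try:
                records.append(json.loads(text))
            except ValueError:
                # Skip malformed lines instead of aborting the whole trace.
                continue
    return records, offset


# Demo with a throwaway file; the record shapes are hypothetical.
with tempfile.TemporaryDirectory() as tmp:
    log_path = os.path.join(tmp, "session.jsonl")
    with open(log_path, "wb") as f:
        f.write(b'{"type": "user"}\n{"type": "assistant"}\n{"type": "tool_')
    first, pos = read_new_records(log_path, 0)
    with open(log_path, "ab") as f:
        f.write(b'result"}\n')
    second, pos = read_new_records(log_path, pos)
    print([r["type"] for r in first], [r["type"] for r in second])
```

Tracking a byte offset and refusing to consume a line without a trailing newline means a half-written record is picked up on the next poll rather than dropped.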

*Unlike wrappers, it solves the visibility gap in your native terminal workflow:*

1. *Real Diffs:* It shows inline diffs (red/green) the moment files are edited, instead of just a checkmark.

2. *Context Forensics:* It breaks down token usage by File vs Tool Output vs Thinking (so you know exactly why your context window is full).

3. *Agent Trees:* It visualizes sub-agent execution paths which are usually interleaved and confusing in the CLI.

It’s 100% local, and works with the logs already on your machine. No API keys required.

Repo: https://github.com/matt1398/claude-devtools (Screenshots and diff viewer demo are in the README)


miroljub 15 hours ago
Why not just use something sane by default? It's not like there are not better alternatives to Claude Code.

Opencode is great as a full replacement. Works out of the box.

Pi code agent[1] is even better: if you spend some time in it, you can customize it to your liking. It's vi and Emacs combined for agents.

[1] https://pi.dev

reply
azuanrb 15 hours ago
Based on their Terms of Service, we are not allowed to use a Claude Code subscription outside of Claude Code itself. Although it may work with tools like pi or other harnesses, doing so puts your account at risk of being banned.

Claude Code is not developer friendly.

reply
quikoa 14 hours ago
Given how Anthropic changes Claude Code and forces everyone with a subscription to use it (you can avoid it only by paying much more for the API), I think it's fair to say that Claude Code is not an asset but a liability.
reply
eurekin 15 hours ago
OpenCode works nicely; I wish its web mode were developed more. As it stands, you have to work on the same host in order to pass the full OAuth login flow (it redirects to localhost) for subscription-based providers (Claude, ChatGPT). I wish it used some BASE_URL variable I could set, so that would be used instead.
reply
kzahel 15 hours ago
I think you're not supposed to use your claude max plans etc with external harnesses. In theory your account could get banned.
reply
mentalgear 15 hours ago
Are there any good benchmarks comparing CC with OpenCode and Pi?
reply
syabro 15 hours ago
For me, nothing compares with the $200 Max CC subscription.
reply
zontyp 15 hours ago
I had the exact same problem as OP: CC showed razzmatazing, analyzing and other B.S., and I was a bit irked. Then I tried pi agent after @mitsuhiko suggested it on YT and X, and this issue went away. Now pi shows me the entire action that the model takes. I would still want the actions to be summarized, or the harness to explain a concept that I missed (evident from my context), but that's probably too far-fetched.
reply
bjt12345 15 hours ago
Sometimes it states "Gittifying...". The first time it happened to me, I was smashing Ctrl+C in a panic.
reply
kzahel 14 hours ago
When I first started using Claude I was pretty annoyed by the cute phrases. But when I built my own wrapper I started using them, because I had gotten used to them. I did add a setting, of course: [Fun Phrases Show playful status messages while waiting for responses. Disable to show only "Thinking..."]

Why Anthropic can't provide such a setting, we will never know!

reply
JimDabell 13 hours ago
It’s ridiculous, but they have recently introduced a feature to change the list of messages, so you can just set it to Working or similar now:

    "spinnerVerbs": {
      "mode": "replace",
      "verbs": ["Working"]
    }
https://github.com/anthropics/claude-code/issues/6814#issuec...
reply
miroljub 13 hours ago
Anthropic is mocking developers here :)

They bothered to let you override default nonsense messages with custom messages, but they still don't want to let you see what you are actually interested in. The actual information is kept hidden.

Luckily, new open weight models from China caught up with Anthropic, so you can use a sane harness with a much cheaper subscription and never look back.

reply
benreesman 15 hours ago
You're on the path. Keep going.

https://www.youtube.com/watch?v=9ZLgn4G3-vQ

reply
cjonas 15 hours ago
All these coding agents should support custom otel endpoints with full instrumentation (http, mcp, file system access, etc).
reply
matt1398 15 hours ago
True. They actually do support basic OTel now, but it's mostly limited to high-level metrics like token usage and session counts. Until that goes deeper, parsing the local files seems to be pretty much the only way to get real observability.
reply
khoury 15 hours ago
Does it also show usage? I think it's pretty ridiculous that we have to install 3rd party packages or implement it ourselves just to see how much gas is left in the tank, basically. Or constantly check the usage tab on the web, but still.
reply
matt1398 15 hours ago
To clarify on what the others mentioned: `/usage` and `/status` in the CLI do give you basic session token counts.

But regarding khoury's original point about the actual "gas in the tank" (billing/account balance)—no, my tool doesn't show that either.

Since `claude-devtools` strictly parses your local `~/.claude/` logs and makes zero network calls, it doesn't have access to your Anthropic account to pull your actual dollar balance.

What it does provide is high-resolution context usage. Instead of just a total session count, it breaks down tokens per-turn (e.g., how many tokens were eaten by reading a specific file vs. the tool output). It helps you manage your context window locally, but for billing, you're unfortunately still stuck checking the web dashboard.
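As a rough illustration of that kind of per-turn breakdown, here is a small Python sketch. The entry shape (`turn`, `category`, `source`, `tokens`) and the sample numbers are hypothetical and do not reflect the actual log fields or the tool's implementation.

```python
from collections import defaultdict

# Hypothetical, already-parsed entries; real log records may look different.
entries = [
    {"turn": 1, "category": "file_read", "source": "src/app.ts", "tokens": 4200},
    {"turn": 1, "category": "tool_output", "source": "npm test", "tokens": 1800},
    {"turn": 2, "category": "thinking", "source": None, "tokens": 900},
    {"turn": 2, "category": "file_read", "source": "src/db.ts", "tokens": 2600},
    {"turn": 2, "category": "tool_output", "source": "git diff", "tokens": 500},
]


def sum_tokens(items, key):
    """Total the token counts, grouped by the given entry field."""
    totals = defaultdict(int)
    for item in items:
        totals[item[key]] += item["tokens"]
    return dict(totals)


def biggest_consumers(items, n=3):
    """Return the n entries that used the most tokens, largest first."""
    return sorted(items, key=lambda item: item["tokens"], reverse=True)[:n]


print(sum_tokens(entries, "category"))
print(sum_tokens(entries, "turn"))
print([e["source"] for e in biggest_consumers(entries, 2)])
```

Grouping by `category` answers "what kind of content filled the window", while grouping by `turn` shows when it happened.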

reply
khoury 15 hours ago
Thx for the clarification :)
reply
itsmevictor 15 hours ago
Why not just /usage?
reply
virtualritz 15 hours ago
/status?
reply
eurekin 15 hours ago
Oh, wow. I was debugging the same in copilot (the only "work approved" agent) in Intellij, which showed that copilot didn't return commands output at all. I wrote a comment under relevant issue, if you're curious.

I think there are quite a few bugs lingering in those agent-cli's and observability, would help a lot with reporting. Taking yours for a spin this evening, thank you!

reply
matt1398 15 hours ago
Yeah, debugging swallowed command outputs is definitely a pain.

Thanks for giving it a spin tonight! Let me know if you run into any issues.

reply
jimmySixDOF 14 hours ago
There was also this one from last year. It hasn't been updated in a while, but the view handling is better than native verbose:

"claude-trace" Record all your interactions with Claude Code as you develop your projects. See everything Claude hides: system prompts, tool outputs, and raw API data in an intuitive web interface.

https://github.com/badlogic/lemmy/tree/main/apps/claude-trac...

reply
gregoriol 15 hours ago
I checked the repository and it has >20 config files, just for a simple tool. Something is terribly wrong in today's development.
reply
matt1398 15 hours ago
Fair point. The root directory does look noisy right now. There are two main reasons for that:

1. Cross-platform distribution: Shipping an Electron app across macOS (ARM/Intel), Linux (AppImage/deb/rpm), Windows, and maintaining a standalone Docker/Node server just requires a lot of platform-specific build configs and overrides (especially for electron-builder).

2. Agentic coding guardrails: As I built most of this project using Claude Code itself, I wanted strict boundaries when it writes code.

The ESLint, Prettier, strict TS, Knip (dead code detection), and Vitest configs act as quality gates. They are what keep the AI's output from drifting into unmaintainable spaghetti code. Without those strict constraints, agentic coding falls apart fast.

I'd rather have 20 config files enforcing quality than a clean root directory with an AI running wild. That said, I totally take your point—I should probably consolidate some of these into package.json to clean things up.

reply
embedding-shape 15 hours ago
> They are what keep the AI's output from drifting into unmaintainable spaghetti code. Without those strict constraints, agentic coding falls apart fast.

Which ones, ESLint and Prettier and so on? Those are just for "nice syntax", and don't actually help your agent with what agents actually fall over on, which is the design of your software, not the specific syntax they use.

reply
matt1398 15 hours ago
To be clear, I'm not saying they solve high-level software design.

The goal is to prevent the agent from getting derailed by basic noise. Forcing it to deal with strict TS errors, dead code (Knip), or broken formatting in the feedback loop keeps the context clean.

It’s less about architecting the app and more about giving the agent immediate stderr signals so it stays on the rails.

reply
embedding-shape 14 hours ago
> they solve high-level software design

That's not what I was getting at either, but the design is pervasive in your program, not just something that sits as a document on top, but codified in the actual program.

> The goal is to prevent the agent from getting derailed by basic noise

Ah, I see. Personally I haven't seen agents getting slower and less precise because of that, but I guess if that's the issue you're seeing, then it makes sense to try to address that.

Out of curiosity, what model/tooling are you using, only Claude Code? I've mostly been using Codex as of late, and it tends to deal with those things pretty easily, while none of the agents seems to be able to survive longer on their own without adding up too much technical debt too quickly. But maybe that's at another lifecycle than where you are getting stuck currently.

reply
osener 15 hours ago
This is some next level nitpicking. It's like criticizing XCode or Idea config of someone, instead of their product (or more popularly whether their website hijacks the back button). But at least in this case the dev config is checked in and reproducible.
reply
gregoriol 13 hours ago
It says something about the dev
reply
6LLvveMx2koXfwn 15 hours ago
Won't this break every time the log format changes?
reply
matt1398 15 hours ago
I actually had the exact same worry when I started building this.

But it turns out Claude Code's official VS Code extension is built to read these exact same local `.jsonl` files. So unless Anthropic decides to intentionally break their own first-party extension, it should remain relatively stable.

Of course, they will add new payload types (like the recent "Teams" update), but when that happens, it's pretty trivial to just add a new parser handler for it—which I've already been doing as they update the CLI.
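The "add a new parser handler" pattern can be sketched in Python as a small registry keyed on record type, with a fallback for types nobody has written a handler for yet. The type names (`user`, `tool_result`) and field names here are assumptions for illustration only, not the real log schema or the tool's code.

```python
# Registry mapping a record's "type" value to the function that parses it.
HANDLERS = {}


def handles(record_type):
    """Decorator that registers a parser for one record type."""
    def register(fn):
        HANDLERS[record_type] = fn
        return fn
    return register


@handles("user")
def parse_user(record):
    return {"kind": "user", "text": record.get("text", "")}


@handles("tool_result")
def parse_tool_result(record):
    return {"kind": "tool_result", "output": record.get("output", "")}


def parse_record(record):
    """Dispatch to a registered handler, or keep the raw record if none exists."""
    handler = HANDLERS.get(record.get("type"))
    if handler is None:
        # Unknown payload type: preserve it so the viewer can still show something.
        return {"kind": "unknown", "raw": record}
    return handler(record)


print(parse_record({"type": "user", "text": "hi"}))
print(parse_record({"type": "brand_new_type", "x": 1}))
```

With this shape, supporting a new payload type means adding one decorated function, and an unrecognized type degrades to a raw view instead of crashing the parser.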

So far, it's been surprisingly easy to maintain!

reply
kzahel 15 hours ago
Yeah this is a risk, the jsonl format is not a documented api surface.

I have a similar project that started out as just a log viewer but is now a full session manager etc (https://github.com/kzahel/yepanywhere). My approach was to occasionally run zod schema validations against all my local sessions to make sure the frontend has a pretty faithful schema. I've noticed sometimes when I run claude cli it modifies some jsonl files, it might be doing some kind of cleanup or migration, I haven't looked too deeply into it (yepanywhere detects when files change so I see those sessions as "unread, externally tracked")
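A minimal sketch of that kind of periodic schema check, written in plain Python rather than zod. The required fields (`type`, `timestamp`) are assumptions chosen for the example; the real jsonl format is undocumented and may differ.

```python
import json

# Hypothetical minimal schema: field name -> expected Python type.
REQUIRED_FIELDS = {"type": str, "timestamp": str}


def validate_record(record):
    """Return a list of problems found in one parsed record (empty if valid)."""
    if not isinstance(record, dict):
        return ["record is not a JSON object"]
    problems = []
    for field, expected in REQUIRED_FIELDS.items():
        if field not in record:
            problems.append(f"missing field: {field}")
        elif not isinstance(record[field], expected):
            problems.append(f"wrong type for {field}: expected {expected.__name__}")
    return problems


def validate_lines(lines):
    """Map 1-based line numbers to problems, for every line that fails."""
    failures = {}
    for number, line in enumerate(lines, start=1):
        if not line.strip():
            continue
        try:
            record = json.loads(line)
        except json.JSONDecodeError as exc:
            failures[number] = [f"invalid JSON: {exc.msg}"]
            continue
        problems = validate_record(record)
        if problems:
            failures[number] = problems
    return failures


sample = [
    '{"type": "user", "timestamp": "2025-01-01T00:00:00Z"}',
    '{"type": 3}',
    "not json",
]
print(validate_lines(sample))
```

Running something like this over every local session after a CLI update gives an early signal that the format has drifted before the frontend starts rendering garbage.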

reply
iamleppert 9 hours ago
I don't want a different interface than the CLI. I just want the exact same interface as the CLI, but with infinite scroll.
reply
KingMob 15 hours ago
For those who don't want to install yet another wrapper, you can just use the `--verbose` flag: https://code.claude.com/docs/en/cli-reference#cli-flags
reply
matt1398 14 hours ago
The problem with `--verbose` is that it floods the terminal, making real-time debugging a headache.

Also, this isn't a wrapper—it’s a passive viewer. I built it specifically to keep the native terminal workflow intact.

It’s especially useful when you're running multiple parallel sessions. Have you ever tried digging through raw JSON logs to retroactively debug several past sessions at once, after the sessions have already shut down? It’s nearly impossible without a proper UI. This tool is for those "post-mortem" moments, not just for watching a single stream.

reply
xyzsparetimexyz 14 hours ago
the emdashes lmfao
reply
igravious 14 hours ago
If it hasn't been said before, Anthropic should hire this person.
reply
small_model 15 hours ago
Can't you just look at the diffs? Not sure what the point is of using Claude and having to babysit every change it makes; it kind of defeats the purpose. Would you sit watching a junior dev's every keystroke?
reply
matt1398 15 hours ago
I don't sit there watching every session either—that's definitely not the point.

It's more like standard observability. You don't watch your server logs all day, but when an error spikes, you need deep tracing to find out why.

I use this when the agent gets stuck on a simple task or the context window fills up way faster than expected. The tool lets me "drill down" into the tool outputs and execution tree to see exactly where the bad loop started.

If you're running multiple parallel sessions across different terminal tabs, trying to grep through raw logs to find a specific failure is a massive productivity sink. This is for when things go sideways and you need to solve it in seconds, not for babysitting every keystroke.

reply
small_model 14 hours ago
Fair enough. I use planning mode a lot, so it will explain which files it's going to change and why before running. I've never had an issue as long as I can see that, it looks sensible, and testing works. But I am probably not the type of user you are targeting.
reply
Grimblewald 15 hours ago
Depends on the work you're doing. Cookie cutter / derivative work like I do for some hobby projects? Sure, it can near full auto it. More abstract or cutting edge stuff like in academic research environments? It needs correction at just about every step. Your workflow sounds like it deals with the former, which is fine, but that isn't everyone.
reply
UqWBcuFx6NV4r 15 hours ago
Nobody said every keystroke. That’s not like for like.
reply