Anthropic Cowork feature creates 10GB VM bundle on macOS without warning
348 points by mystcb 10 hours ago | 176 comments

felixrieseberg 8 hours ago
Hi, Felix from Anthropic here. I work on Claude Cowork and Claude Code.

Claude Cowork uses the Claude Code agent harness running inside a Linux VM (with additional sandboxing, network controls, and filesystem mounts). We run that through Apple's virtualization framework or Microsoft's Host Compute System. This buys us three things we like a lot:

(1) A computer for Claude to write software in, because so many user problems can be solved really well by first writing custom-tailored scripts against whatever task you throw at it. We'd like that computer to not be _your_ computer so that Claude is free to configure it in the moment.

(2) Hard guarantees at the boundary: Other sandboxing solutions exist, but for a few reasons, none of them are as satisfying or allow us to make similarly sound guarantees about what Claude will and won't be able to do.

(3) As a product of 1+2, more safety for non-technical users. If you're reading this, you're probably equipped to evaluate whether or not a particular script or command is safe to run, but most humans aren't, and even the ones who are often experience "approval fatigue". Not having to ask for approval is valuable.

It's a real trade-off though and I'm thankful for any feedback, including this one. We're reading all the comments and have some ideas on how to maybe make this better - for people who don't want to use Cowork at all, who don't want it inside a VM, or who just want a little bit more control. Thank you!

reply
baconner 8 hours ago
FWIW I think many of us would actually very much love to have an official (or semi official) Claude sandboxing container image base / vm base. I wonder if you all have considered making something like the cowork vm available for that?
reply
hedgehog 7 hours ago
There is this:

https://code.claude.com/docs/en/devcontainer

It does work but I found pretty quickly that I wanted to base my robot sandbox on an image tailored for the project and not the other way around.
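For reference, the inverted setup is a small change: a minimal devcontainer.json sketch that bases the sandbox on your own project image (the image name here is hypothetical) and installs Claude Code on top via its npm package:

```json
{
  "name": "robot-sandbox",
  "image": "ghcr.io/yourorg/robot-project:dev",
  "postCreateCommand": "npm install -g @anthropic-ai/claude-code"
}
```

The network-hardening pieces from Anthropic's reference devcontainer (the firewall init script) would still need to be ported over separately if you want them.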

reply
baconner 2 hours ago
Ok, I'd seen some sample sandbox scripts for this from Anthropic before, but not a full reference container. Nice, thank you for sharing.
reply
beklein 7 hours ago
Perhaps useful, I discovered: https://github.com/agent-infra/sandbox

> All-in-One Sandbox for AI Agents that combines Browser, Shell, File, MCP and VSCode Server in a single Docker container.

reply
swyx 7 hours ago
what would you use it for?
reply
baconner 2 hours ago
What the other poster here said for testing against a reference, but also as an easier to get started with base for my own coding sandbox with coding agents. Took me quite a while to build one on my own that I was semi-happy with but I'd imagine one solid enough to run cowork on safely might have some deeper thinking and review behind it.
reply
skoocda 7 hours ago
Not OP, but having the exact VM spec your agent runs on is useful for testing. I want to make sure my code works perfectly on any ephemeral environments an agent uses for tasks, because otherwise the agent might invent some sort of degenerate build and then review against that. Seen it happen many times on Codex web.
reply
beej71 8 hours ago
I think these are excellent points, but the complaint talks about significant performance and power issues.
reply
wutwutwat 7 hours ago
That's every virtual machine that's ever existed. They are slower than metal and you're running two OS stacks so you'll draw more power.
reply
bobmcnamara 3 hours ago
Hot forking would be a killer app here - far faster to clone a VM, screw it up, burn it down, and repeat than anything else
reply
binsquare 5 hours ago
Not every virtual machine, try microVMs.

I am building one now that works locally. But back in the day, I saw how extremely efficient VMs can be at AWS. microVMs power Lambda, btw.

reply
blcknight 5 hours ago
I would look at how podman for Mac manages this; it is more transparent about what's happening and why it needs a VM. It also lets you control more about how the VM is executed.
reply
quinncom 7 hours ago
I accidentally clicked the Claude Cowork button inside the Claude desktop app. I never used it. I didn't notice anything at the time, but a week later I discovered the huge VM file on my disk.

It would be really nice to ask the user, “Are you sure you want to use Cowork, it will download and install a huge VM on your disk.”

reply
divan 4 hours ago
Same. I work on an M3 Pro with a 512GB disk, and most of the time I have around 50GB free, which often goes down to 1GB quite quickly (I work with video editing and photos, and caches are aggressive there). I use apps like Pretty Clean and some of my own scripts (for brew clean, deleting Flutter builds, etc.). So every 10GB used is a big deal for me.

Also discovered that VM image eating 10GB for no reason. I have Claude Desktop installed, but almost never use it (mostly Claude Code).

reply
ephou7 6 hours ago
Jesus Christ, what kind of potatoes are you using when 10 GB of disk space is even noticeable for you?
reply
quinncom 4 hours ago
If I had been tethering to mobile hotspot at the time it would have instantly used 500 pesos of data. That’s 3x my monthly electric bill.
reply
aberoham 5 hours ago
Claude Cowork grabs local DNS resolution on macOS which conflicts with secure web gateway aka ZTNA aka SASE products such as Cloudflare Warp which do similar. The work-around is to close Cowork, let Warp grab mDNSResponder's attention first, then restart Claude Desktop, or some similar special ordering sequence. It's annoying, but you could say that about everything having to do with MITM middleboxes.
reply
radicality 7 hours ago
I tried to use it right after launch from within Claude Desktop, on a Mac VM running within UTM, and got cryptic messages about the Apple virtualization framework.

That made me realize it wants to also run an Apple virtualization VM but can't, since it's inside one already - imo the error messaging here could be better, or, considering that it already is in a VM, it could perhaps bypass the VM altogether. Because right now I still never got to try Cowork because of this error.

reply
lxgr 7 hours ago
Does UTM/Apple's framework not allow nested virtualization? If I remember correctly from x86(_64) times, this is a thing that sometimes needs to be manually enabled.
reply
thomascountz 6 hours ago
I've come across two different answers regarding Apple's Virtualization.Framework support for nested virtualization:

1. Yes, but only Linux guests

2. Yes, but only M3+

https://github.com/apple/container/issues/191

reply
PrairieFire 3 hours ago
You are correct on both counts. As of Tahoe 26.3 you can't nest a macOS guest under a macOS guest. However, you can nest 2 layers deep with any combo of layer-1 guest, so long as the machine is running Sequoia and is M3/M4/M5.
reply
exabrial 6 hours ago
Felix, is there any way you guys could fix this simple, but absolutely terribly annoying bug?

Claude mangles XML files, rewriting <name> tags as <n>

https://news.ycombinator.com/item?id=47113548

reply
ukuina 7 hours ago
Can you allow placing the VM on an external disk?

Also, please allow Cowork to work on directories outside the homedir!

reply
lxgr 7 hours ago
I suppose you could just symlink the directory it's in?
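A sketch of that, with the bundle path hypothetical (find the real one with du or Finder first, and quit the app before moving anything):

```shell
# Assumed path — the actual location of the Cowork VM bundle may differ
SRC="$HOME/Library/Application Support/Claude/claude.vmbundle"
DST="/Volumes/External/claude.vmbundle"

# Move the bundle to the external disk, then leave a symlink behind
# so the app still finds it at the original location
mv "$SRC" "$DST"
ln -s "$DST" "$SRC"
```

Whether the app tolerates the symlink is untested; keep a backup until you've seen it boot the VM once.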
reply
bachittle 7 hours ago
Do you think it would be possible in the future to maybe add developer settings to enable or disable certain features, or to switch to other sandboxing methods that are more lightweight like Apple seatbelt for example?
reply
flatline 7 hours ago
There's a lot that's not being said in (2). That warrants more extensive justification, especially with the issues presented in the parent post.
reply
Someone1234 7 hours ago
They're using the harnesses provided by the respective underlying Operating Systems to do virtualization.

I'd like to explore that topic more too, but I feel like the context of "we deferred to MacOS/Windows" is highly relevant context here. I'd even argue that should be the default position and that "extensive justification" is required to NOT do that.

reply
tyfon 6 hours ago
It would be really nice to have an option to not do this since a ton of companies deny VMs in their group policies.
reply
Terretta 6 hours ago
To a firm with such policies, allowing Cowork outside the VM should be strictly worse.

Ironically, VMs are typically blocked because the infosec team isn't sure how to look inside them and watch you, unlike containers where whatever's running is right there in the `ps` list.

They don't look inside the JVM or .exes either, but they don't think about that the same way. If they treat an app like an exe like a VM, and the VM is as bounded as an app or an exe, with what's inside staying inside, they can get over concerns. (If not, build them a VM with their sensors inside it as well, and move on.)

This conversation can take a while, and several packs of whiteboard markers.

reply
lrakster 6 hours ago
Agreed. Need to make this a choice for us.
reply
Terretta 6 hours ago
> real trade-off … thankful for any feedback

Speaking as a tiny but regulated SMB that's dabbling in skill plugins with Cowork: we strongly appreciate and support this stance. We hope you don't relax your standards, and need you not to. We strongly agree with (1), (2), and (3).

If working outside the sandbox becomes available, Cowork becomes a more interesting exfil vector. A vbox should also be able to be made non-optional — even if MDM allows users to elevate privileges.

We've noticed you're making other interesting infosec tradeoffs too. Your M365 connector aggressively avoids enumeration, which we figured was intentional as a seatbelt for keeping looky-loos in their lane.* Caring about foot-guns goes a long way in giving a sense of you being responsible. Makes it feel less irresponsible to wade in.

In the 'thankful for feedback' spirit, here's a concrete UX gap: we agree approval fatigue matters, and we appreciate your team working to minimize prompts.

But the converse is, when a user rejects a prompt — or it ends up behind a window — there's no clear way to re-trigger. Claude app can silently fail or run forever when it can't spin up the workspace, wasn't allowed to install Python, or was told it can't read M365 data.

Employees who've paid attention to their cyber training (reasonably!) click "No" and then they're stuck without diagnostics or breadcrumbs.

For a CLI example of this done well, see `m365-cli`'s `auth` and `doctor` commands. The tool supports both interactive and script modes through config (backed by a setup wizard):

https://pnp.github.io/cli-microsoft365/cmd/cli/cli-doctor/

Similarly, first party MCPs may run but be invisible to Cowork. Show it its own logs and it says "OK, yes, that works but I still can't see it, maybe just copy and paste your context for now." A doctor tool could send the user to a help page or tell them how to reinstall.

Minimal diagnostics for managed machines — running without local admin but able to be elevated if needed — would go a long way for the SMBs that want to deploy this responsibly.

Maybe a resync perms button or Settings or Help Menu item that calls cowork's own doctor cli when invoked?

---

* When given IDs, the connector can read anything the user can anyway. We're able to do everything we need, just had to ship ID signposts in our skill plugin that taps your connector. Preferred that hack over a third party MCP or CLI, thanks to the responsibility you look to be iteratively improving.

reply
rvz 6 hours ago
> (2) Hard guarantees at the boundary: Other sandboxing solutions exist, but for a few reasons, none of them satisfy as much and allow us to make similarly sound guarantees about what Claude will be able to do and not to.

This is the most interesting requirement.

So all the sandbox solutions that were recently developed all over GitHub, fell short of your expectations?

This is half surprising, since many of the people using AI to solve the sandboxing problem have claimed to have done so over several months, and the best we have is Apple containers.

What were the few reasons? Surely there has to be some strict requirement for that everyone else is missing.

But still having a 10 GB claude.vmbundle doesn't make any sense.

reply
xvector 8 hours ago
Cowork has been an insane productivity boost, it is actually amazing. Thank you!
reply
jccx70 8 hours ago
[dead]
reply
consumer451 6 hours ago
Any chance you guys could get the Claude Desktop installer fixed on Windows? It currently requires users to turn on "developer mode."

Sorry for the ask here, but unaware of other avenues of support as the tickets on the Claude Code repo keep getting closed, as it is not a CC issue.

https://github.com/anthropics/claude-code/issues/26457https:...

reply
consumer451 11 minutes ago
Non-double-pasted link. This is affecting a lot of people.

https://github.com/anthropics/claude-code/issues/26457

reply
MarleTangible 9 hours ago
It's incredible how many applications abuse disk access.

In a similar fashion, the Apple Podcasts app decided to download 120GB of podcasts for some random reason and never deleted them. It even showed up as "System Data" and made me look for external drive solutions.

reply
kace91 9 hours ago
The system data issue on macOS is awful.

I use my MacBook for a mix of dev work and music production and between docker, music libraries, update caches and the like it’s not weird for me to have to go for a fresh install once every year or two.

Once that gets filled up, it’s pretty much impossible to understand where the giant block of memory is.

reply
prmph 9 hours ago
Yep, it is an awful situation. I'm increasingly becoming frustrated with how Apple keeps disrespecting users.

I downloaded several MacOS installers, not for the MacBook I use, but intending to use them to create a partitioned USB installer (they were for macOS versions that I could clearly not even use for my current MacBook). Then, after creating the USB, since I was short of space, I deleted the installers, including from the trash.

Weirdly, I did not reclaim any space; I wondered why. After scratching my head for a while, I asked an LLM, which directed me to check the system snapshots. I had previously disabled Time Machine backups and snapshots, and yet I saw these huge system snapshots containing the files I had deleted, and the kicker was, there was no way to delete them!

Again I scratched my head for a while for a solution other than wiping the MacBook and re-installing MacOS, and then I had the idea to just restart. Lo and behold, the snapshots were gone after restarting. I was relieved, but also pretty pissed off at Apple.
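For the record, local snapshots can usually be listed and purged from the terminal without a reboot. A hedged sketch (macOS-only; the snapshot date stamp is a placeholder):

```shell
# List APFS local snapshots — deleted files they reference stay on disk
tmutil listlocalsnapshots /

# Ask the system to reclaim up to ~50 GB by thinning snapshots
# (50 * 1024^3 = 53687091200 bytes; urgency 4 is the most aggressive)
tmutil thinlocalsnapshots / 53687091200 4

# Or delete one snapshot by the date stamp shown in the listing (placeholder)
tmutil deletelocalsnapshots 2024-01-01-123456
```

Whether this would have worked with Time Machine disabled, as in the story above, I can't say; a reboot remains the blunt fallback.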

reply
ryandrake 7 hours ago
It's just as bad on Windows. Operating systems and applications have been using the user's hard drive as a trash dumping ground for decades. Temporary files, logs, caches, caches of caches, settings files, metadata files (desktop.ini, .fseventsd, .Trashes, .Spotlight-V100, .DS_Store). Developers just dump their shit all over your disk as if it belongs to them. I really think apps should have to ask permission before they can write to files, outside of direct user-initiated commands.
reply
intrasight 7 hours ago
I can't help but think back to a conversation with my girlfriend in 1984. She had just bought a PC and I had bought a Mac.

She said "Oh, you bought a toy computer. How cute!"

I've owned every architecture of Mac since then, and I still think of it is my toy computer.

reply
jmalicki 8 hours ago
Disk utility lets you delete them.
reply
prmph 5 hours ago
Nope, I tried that, was blocked by SIP.
reply
vachina 9 hours ago
Because Apple differentiates their products by storage size, and also sells iCloud subscriptions, there is zero (in fact negative) incentive to respect your storage space.
reply
threetonesun 8 hours ago
Been a while since I needed to use it there but it always amazed me that the Windows implementation of iCloud was more flexible in terms of location and ability to decide what files got synced.
reply
anonymars 7 hours ago
Ho ho, except for where it puts the photos. Those go into a subfolder of the system photos folder, and there's no configuration (yet you can configure the "shared photos" location)

And then, should you try to set up OneDrive (despite Microsoft's shenanigans, it does simplify taking care of non-tech-savvy relatives), it will refuse to sync the photos folder because 'it contains another cloud storage' and you'll genuinely wonder how or why anyone uses computers anymore

reply
dotxlem 9 hours ago
I had the same problem and had some luck cleaning things up by enabling "calculate all sizes" in Finder, which will show you the total directory size, and makes it a bit easier to look for where the big stuff is hiding. You'll also want to make sure to look through hidden directories like ~/Library; I found a bunch of Docker-related stuff in there which turned out to be where a lot of my disk space went.

You can enable "calculate all sizes" in Finder with Cmd+J. I think it only works in list view however.

reply
robin_reala 9 hours ago
I’d recommend GrandPerspective:[1] it’s really good at displaying this sort of thing, has been around for over two decades, and the developer has managed to keep it to <5MB which is perfect when you’re running very low on space.

[1] https://grandperspectiv.sourceforge.net/

reply
braingravy 8 hours ago
I use GP, would recommend as well; it generates great color codes tree maps of your storage. Once you get used to navigating it that way, you won’t go back.
reply
dewey 8 hours ago
Something like https://dev.yorhel.nl/ncdu ("brew install ncdu") is great if you are okay with the command line. It's very annoying to drill down in the Finder, especially with hidden directories.
reply
mrbombastic 8 hours ago
in a similar vein if you are looking for a nice GUI, daisydisk is great: https://daisydiskapp.com one time $10 payment
reply
prmph 8 hours ago
A ton of thanks. This "hack" allowed me to finally see some stuff that was eating up a lot of my space and showing up as "System Data". It turned out the Podman virtual machine on my MacBook had eaten up more than 100GB!
reply
vintagedave 8 hours ago
Also DaisyDisk! Beautiful app. Perfect for discovering this kind of thing.
reply
1e1a 9 hours ago
You can also just use du -hs, eg. to show the size of all subdirectories under ~/Library/Caches/ do:

  du -hs ~/Library/Caches/*
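To surface the biggest offenders first, a variant that pipes the same du output through sort's human-numeric mode:

```shell
# Largest cache directories last; sort -h orders 1K < 5M < 2G correctly
du -sh ~/Library/Caches/* 2>/dev/null | sort -h | tail -10
```

The 2>/dev/null just silences permission errors on SIP-protected paths.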
reply
zarzavat 8 hours ago
The trick is to reboot into recovery partition, disable SIP, then run OmniDiskSweeper as root (as in `sudo /Applications/OmniDiskSweeper.app/Contents/MacOS/OmniDiskSweeper`). Then you can find all kinds of caches that are otherwise hidden by SIP.
reply
prmph 5 hours ago
It shouldn't be this hard to clear unwanted data from my own computer
reply
jasomill 3 hours ago
My immediate reaction to this is that the OS has a hard time establishing intent, and in some cases it probably should be this hard to delete data that's required for the system to boot on the grounds that you'd probably want it if you understood what it was, and ideally also hard for malware to delete data it doesn't want on your computer (forensically useful logs, backup copies of files encrypted by ransomware, etc.).

But none of this applies to caches and temporary files, which could be reasonably managed for 99% of users by adding a "clear all caches" checkbox in the reboot dialog with a warning that doing this is likely to slow down the system and increase battery usage for the next few hours, or to system-managed snapshots that mostly just need better UI and documentation.

UI transparency is my only real complaint. A reasonable amount of data the system wants to make difficult to delete is fine, so long as it clearly explains what it is and why. "System Data" is only acceptable as a description for the root of what should be a well-documented hierarchy.

reply
John23832 9 hours ago
Seconding.

I should not have to hack through ~/Library files to regain space on a TB drive because macOS wanted to put 200GB of crap there in an opaque manner and not give the user ANY direct way to regain their space.

reply
piyh 8 hours ago
Even worse on ipad. My wife is an artist and 100gigs of "system data" is completely inscrutable and there's zero ways to fix it besides a full wipe.
reply
millerm 7 hours ago
I simply run GrandPerspective (GUI app, https://grandperspectiv.sourceforge.net/), or dust (terminal app, https://github.com/bootandy/dust), to give me an idea of what is going on with disk usage.
reply
rickmatt 6 hours ago
Thank you for this! I just downloaded it and identified over 50G of junk. It's just what I have been looking for to help manage my drive utilization.
reply
pdntspa 8 hours ago
Equally egregious are applications that insist on using the primary disk to cache model data/sample data/whatever
reply
zbentley 8 hours ago
What should they do instead?

Like, assuming they need the data and it's inconveniently large to fit into RAM, where/how should they store and access it if not the primary disk?

reply
mock-possum 8 hours ago
They should ask. Let users specify a scratch / cache location - preferably fast storage that’s not The OS drive
reply
drumttocs8 8 hours ago
My 256GB Mac Mini currently has 65GB of "System Data" and 40GB of "macOS"
reply
mbowcut2 8 hours ago
Gotta hit that docker system prune -a
reply
mschuster91 8 hours ago
> Once that gets filled up, it’s pretty much impossible to understand where the giant block of memory is.

Your friend is called ncdu and can be used as follows:

    sudo ncdu -x -e --exclude Volumes /System/Volumes/Data/
The exclude for Volumes is necessary because otherwise ncdu ends up in an infinite loop - "/Volumes/Macintosh\ HD/Volumes/" can be repeated ad nauseam and ncdu's -x flag doesn't catch that for whatever reason.
reply
dewey 9 hours ago
Don't run "du -h ~/Library/Messages" then. I've mentioned this many times before, and it's crazy to me that Apple is just using up 100GB on my machine, just because I enable iMessage syncing and don't want to delete old conversations.

One would think that's an extremely common use case, and it will only grow the more years iMessage exists. Just offload them to the cloud, charge me for it if you want; every other free message service that exists has no problem doing that.

reply
epistasis 7 hours ago

    sudo du -sh ~/Library/Messages
    Password:
    du: /Users/cvaske/Library/Messages: Operation not permitted
Wow, SIP is a bit more insidious than I remember. Maybe I should try it in Terminal.app rather than a third party app... I wonder if there will ever be a way to tell the OS "this action really was initiated by the user, not malware, let them try to do what they say they want to do"

Edit: investigating a bit more, apparently the lack of a sudo-equivalent, an "elevate this one process temporarily" command is intentional in the design, so that malicious scripts can't take advantage of that "this is really the user" approval path. I can't say I agree with that design decision, but at least it's an ethos.

reply
bee_rider 8 hours ago
Offloading to the cloud and charging the user seems like a bigger breach of expectations than the hard drive space.
reply
dewey 8 hours ago
If you have a choice there's nothing wrong with it. It's the same way that iCloud Photos already work. You can either disable iCloud and have everything locally in your Photos app or let it dynamically offload to iCloud (If you have enough cloud space).

I'd rather pay for cloud space that I'm already using anyway than having it take up my limited space on my laptop that I can't extend.

reply
dawnerd 8 hours ago
Same with photos. You can enable the option to offload but there’s no way to control how much is used locally. I don’t know why messages does that either. Also no easy way to remove the hundreds of thousands of photos in messages across all chats.
reply
mh- 7 hours ago
And for people like me who are content to pay for the iCloud storage in order to not delete them - there's no way to say "keep everything. but not locally, because that's silly."
reply
bensyverson 8 hours ago
Agreed, it should work like the iCloud Photos library; cache locally, but pull from the cloud when necessary.
reply
mh- 7 hours ago
Even with the way Photos work - which is desirable, I agree - I should be able to specify a limit on how much local disk it uses.

I don't know what the formula it uses is, but it's insufficient.

reply
bensyverson 5 hours ago
There is a workaround… You can create an APFS partition on your main drive, set it to a fixed size (e.g. 10GB), and then move the location of your Photos library to that drive.

Note that if your Photos library is already larger than you want it to be, you may need to make sure it's synced, delete it, and create a new library on the drive. It will then sync with iCloud. But that's a hassle, and I would back up the library before you do this.
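The workaround above can be scripted with diskutil. A hedged sketch (macOS-only; the container ID and volume name are placeholders, so check `diskutil list` first). Since APFS volumes share the container's free space, a quota gets the fixed-size effect without actually repartitioning:

```shell
# Find your APFS container ID (disk1 below is an assumption)
diskutil list

# Add a quota-limited volume (10g = 10737418240 bytes); the Photos
# library can then be moved onto it and can't grow past the cap
diskutil apfs addVolume disk1 APFS PhotosLibrary -quota 10g
```

Back up the library before pointing Photos at the new volume, as the parent comment advises.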

reply
latexr 7 hours ago
System Settings > General > Storage. Click the ⓘ next to Messages. Sort by size and delete large attachments.
reply
dewey 7 hours ago
Appreciate the suggestion, but that's similar to fixes like "Have you tried re-installing your OS? Maybe that fixes the issue."

I don't want to babysit my attachments or delete old conversations just because Apple doesn't put effort into that app. Probably my fault for still using it, but Telegram, WhatsApp and Signal all manage to do it better.

reply
AndroTux 9 hours ago
This one drives me nuts. Not just on Mac, also on iPhone/iPad. It's 2026, and 5G is the killer feature advertised everywhere. There's no reason to default to downloading gigabytes of audio files if they could be streamed with no issue whatsoever.
reply
jlokier 7 hours ago
I'm on 5G right now and it just struggled to load the HN front page due to local network congestion. At times of day when it's not congested, it reaches 60-90 Mbyte/s in the same physical location.

Spotify just gave up while trying to show me my podcasts. I can't listen to anything not already downloaded right now.

Yet at 3am I'll be able to download a 100GB LLM without difficulty onto the same device that can't stream a podcast right now.

Unfortunately I don't think 5G is the streaming panacea you have in mind. Maybe one day...

reply
melonpan7 7 hours ago
Only reason I still download from Apple Music to device is for lossless and hi-res lossless, which would otherwise use a lot of cellular data.
reply
frereubu 8 hours ago
On 5G, it depends. There are still plenty of people around the world who don't have unlimited data plans.
reply
AndroTux 2 hours ago
Then they can enable downloads in the settings. I’m not saying they should remove the feature. I’m saying setting this as a default on a non-budget device is a bad design choice.
reply
geooff_ 6 hours ago
I had the same problem, but with a bad Time Machine backup: ~300GB of my 512GB disk, just labeled with the generic "System Data". I lost a day of work over it because I couldn't do Xcode builds and had to do a deep dive into what was going on.
reply
coldtrait 9 hours ago
This seems to be a recent popular tool to handle this - https://github.com/tw93/Mole

I also prompt warp/gemini cli to identify unnecessary cache and similar data and delete them

reply
jacquesm 8 hours ago
> Apple Podcasts app decided to download 120GB

That's one way to drive sales for higher priced SSDs in Apple products. I'm pretty sure that that sort of move shows up as a real blip on Apple's books.

reply
jvidalv 7 hours ago
Surprisingly, Claude is amazing at cleaning up your MacBook. Tried it, works like a charm.
reply
chuckadams 9 hours ago
Someone actually still uses the built-in podcasts app?
reply
hidelooktropic 9 hours ago
Not sure what you have against it. Works great for me. No subscription required. And if I do want to pay for ad free shows and support creators it's easy to do so.

Use whatever you like but I don't think Podcast app users are rare by any stretch of the imagination.

reply
Angostura 9 hours ago
It's absolutely fine, from what I can tell
reply
mister_mort 9 hours ago
AFAIK the native Podcast app for iPhone is the only way to make PC-phone podcast file syncing work. This stops you downloading the same podcast file twice, once on your PC and once on your phone.
reply
dewey 9 hours ago
It probably has more active users than all third party podcast apps on all mobile platforms combined. The power of defaults.
reply
rafram 9 hours ago
It's generally a good app. People in the tech community like Overcast, but I've always found its UI completely illogical. Apple Podcasts is organized like I'd expect a podcast app to be.
reply
delaminator 9 hours ago
My WinSxS folder is 17GB
reply
blitzar 9 hours ago
The vibe coding giveth and the vibe coding taketh away, blessed be the vibe coding
reply
zhyder 8 hours ago
I guess it could warn about it, but the VM sandbox is the best part of Cowork. The sandbox itself is necessary to balance the power you get from generating code (that's hidden from the user) with the security you need for non-technical users. I'd go even further and make the user grant host filesystem access only to specific folders, and warn about anything with write access: I can think of lots of easy-to-use UIs for this.
reply
Terretta 8 hours ago
Arguably, even without LLM, you too should be dev-ing inside a VM...

https://developer.hashicorp.com/vagrant is still a thing.

The market for Cowork is normals, getting to tap into an executive assistant who can code. Pros are running their consumer "claws" on a separate Mac Mini. Normals aren't going to do that, and offices aren't going to provision two machines to everyone.

The VM is an obvious answer for this early stage of scaled-up research into collaborative computing.

reply
messh 8 hours ago
Yeah, very easy to do today. Many VPS providers help with this, check out:

https://exe.dev

https://sprites.dev

https://shellbox.dev

reply
Terretta 6 hours ago
Yes! Whether VPS or local VM, this is a thing for good reasons.

Some reasons aren't even optional. Small but regulated entities exist, and most "Team" sized businesses aren't in Google apps or "the cloud" as they think about it, but are in M365, and do pay for cyber insurance.

Cowork with skills plugins that leverage Python or bash is a remarkably enabling framework given how straightforward it is. A skill engineer can sit with an individual contributor domain expert, conversationally decompose the expert's toil into skills and subcommands, iterate a few times, and like magic the IC gets hours back a day.

Cowork is Agents-On-Rails™ for LLM apps, like Rails was to PHP for web apps.

The VM makes that anti-fragile.

For any SaaS builders reading this: by far most white collar small business work is in Microsoft Office. The scarce "Continue with Microsoft" OIDC reaches more potential SMB desks than the ubiquitous "Continue with Google" and you don't have to learn the legacy SAML dance.

Anthropic seems to understand this. It's refreshing when a firm discovers how to cater to the 25–150 seat market. There's an uncanny valley between early adopters and enterprise contracts, but the world runs on SMBs.

Sign them all up!

reply
mihaelm 7 hours ago
I prefer devcontainers for more involved project setups as they keep it lighter than introducing a VM. It’s also pretty easy to work with Docker (on your host) with the docker-outside-of-docker feature.

However, I’m also curious about using NixOS for dev environments. I think there’s untapped potential there.

reply
Terretta 7 hours ago
we love nix for dev environments, and highly recommend it. many other problems go away. don't see that as what's being solved here, though.

containers contain stuff the way an open bookcase contains books, they're just namespaces and cgroups on a file system overlay, more or less, held together by willpower not boundaries:

https://jvns.ca/blog/2016/10/10/what-even-is-a-container/

https://github.com/p8952/bocker
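to the jvns point above, you can see the plumbing a "container" is made of directly in /proc on any Linux box (a sketch; these paths are standard Linux, nothing container-specific required):

```shell
# Every process lists the namespaces it belongs to as symlinks in /proc.
# Two processes are "in the same container" largely to the extent that
# these links point at the same namespace IDs.
ls /proc/self/ns

# The concrete mount namespace this shell's filesystem view comes from:
readlink /proc/self/ns/mnt

# And the cgroup half of the story:
cat /proc/self/cgroup
```

nothing here is a hard boundary the way a VM's is; it's just the kernel tagging processes and filtering what they can see.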

as a firm required to care about infosec, we appreciate the stance in their (2). and MacOS VMs are so fast now, they might as well be containers except, you know, they work. (if not fast, that should be fixed.)

that said, yes, running local minikube and the like remain incredibly useful for mocking container envs where the whole environment is inside a machine(s) boundary. containers are _almost_ as awesome as bookcases…

reply
mihaelm 7 hours ago
I just went off on a tangent about dev environments, i.e. what to develop inside. In the case of Cowork, a VM is definitely the right choice - no doubt.
reply
sherburt3 3 hours ago
Do you wear a condom while you’re programming too for maximum protection?
reply
hirvi74 7 hours ago
I concur. I don't want to install libraries on my host machine that I won't use for anything other than development, e.g., Node.js.

On macOS, Lima has been a godsend. I have Claude Code in an image, and I just mount the directory I want the VM to have access to. It works flawlessly and has been a replacement for Vagrant for me for some time. Though, I owe a lot to Vagrant. It was a lifesaver for me back in the day.
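For anyone curious, the Lima flow is roughly this (a sketch; `claude-vm` is a hypothetical instance name, and directory mounts are usually declared in the instance's YAML config rather than on the command line):

```shell
# Start an Ubuntu-based instance (downloads the image on first run)
limactl start --name=claude-vm template://ubuntu

# Open a shell inside the VM; install Claude Code there, not on the host
limactl shell claude-vm

# Later: stop or delete the instance to reclaim disk
limactl stop claude-vm
limactl delete claude-vm
```

By default Lima mounts your home directory read-only; you grant writable access only to the directories you choose in the instance config.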

reply
informal007 9 hours ago
I believe that employees at Anthropic use CC to develop CC now.

AI really gives users the ability to develop a complete product, but the quality is decreasing. Professional developers will be in demand once these products/features become popular.

The first batch of users of a new product has to take on extra responsibility, testing the product like lab rats.

reply
rvz 8 hours ago
> AI really gives users the ability to develop a complete product, but the quality is decreasing. Professional developers will be in demand once these products/features become popular.

Looking at the number of issues, outages, and rookie mistakes the employees are making leads me to believe that most of them are below junior level.

If anyone were to re-interview everyone at Anthropic for their own roles with their own interview questions, I would guess that >75% of them would not pass.

The only teams that would pass are the Bun team and some of the other recently acquired startups.

reply
sponnath 3 hours ago
While the whole "Claude Code is just like a game engine" tweet was silly, this comment seems too derisive. I highly doubt engineers at Anthropic are lacking in talent.
reply
linkregister 8 hours ago
Claude consistently tops the leaderboard in software engineering benchmarks.
reply
rvz 7 hours ago
You realise that excuse is completely irrelevant? For the outages and the rest of the issues above, even when the service goes down you still need to know what exactly is wrong.

Using 'software engineering benchmarks' and 'leaderboards' to mask those issues in scenarios that require a rapid response doesn't make any sense. Even granting the point, I would expect fewer outages, but it is in fact the opposite: one outage occurs, and another appears right afterwards, almost the next day.

reply
bachittle 9 hours ago
Yup, it uses the Apple Virtualization framework. That makes it so I can't use Claude Cowork within my VMs - which is how I found out it was running a VM in the first place, because it triggered a nested-VM error. All it does is limit functionality, consume extra space, and cause lag. A better sandbox environment would be Apple's Seatbelt, which is what OpenAI uses, but even that isn't perfect: https://news.ycombinator.com/item?id=44283454
reply
ctmnt 7 hours ago
I don’t have an opinion on how they should handle the nested-VM problem, but I very much disagree that Seatbelt is better. Claude Code (aka `claude`) uses it, and it’s barely good for anything.

Out of curiosity, why are you running Cowork inside a VM in the first place? What does that get you that letting Cowork use its own VM wouldn’t?

reply
j16sdiz 9 hours ago
seatbelt is largely undocumented.
reply
bachittle 8 hours ago
OpenAI Codex CLI was able to use it effectively, so at least the AI knows how to use it. Still, it's deprecated and unmaintained; Apple needs to make something new soon.
reply
pluc 9 hours ago
just ask AI to document it
reply
ramoz 8 hours ago
Not sure why you're getting downvoted. This is totally reasonable.
reply
atonse 9 hours ago
I literally spent the last 30 mins with DaisyDisk cleaning up stuff in my laptop, I feel HN is reading my mind :)

I also noticed this 10GB VM from Cowork, and was also surprised at just how much space various things seem to use for no particular reason. Judging by all the cruft, most apps don't seem to have any sort of cleanup process that actually slims down their storage.

Even Xcode. The command line tools installs and keeps around SDKs for a bunch of different OS's, even though I haven't launched Xcode in months. Or it keeps a copy of the iOS simulator even though I haven't launched one in over a year.

reply
cmckn 9 hours ago
> Xcode…keeps around SDKs for a bunch of different OS's

Not a new problem, unfortunately. DevCleaner is commonly used to keep it under control: https://github.com/vashpan/xcode-dev-cleaner

reply
hulitu 9 hours ago
Are there no crond and find on macOS?
reply
creatonez 7 hours ago
As much of an inconvenience as this may be, it's exactly what "agents" should be doing. If your tool doesn't have a built-in sandbox that is intended to be used at all times, you're using something downright hazardous and WILL end up suffering data loss.
reply
quanwinn 9 hours ago
I imagined someone at Anthropic prompted "improve app performance", and this was the result.
reply
pncnmnp 7 hours ago
On a similar tangent, but on the opposite end of the spectrum, check out this month-old discussion on HN: https://news.ycombinator.com/item?id=46772003

ChatGPT's code execution container contains 56 vCPUs!! Back then, simonw mentioned:

> It appears to have 4GB of RAM and 56 (!?) CPU cores https://chatgpt.com/share/6977e1f8-0f94-8006-9973-e9fab6d244...

I'm seeing something similar on a free account too: https://chatgpt.com/share/69a5bbc8-7110-8005-8622-682d5943dc...

On my paid account, I was able to verify this. I was also able to get a CPU-bound workload running on all cores. Interestingly, it was not able to fully saturate them, though - despite trying for 20-odd minutes. I asked it to test with stress-ng, but it looks like it had no outbound connectivity to install the tool: https://chatgpt.com/share/69a5c698-28bc-8005-96b6-9c089b0cc5...

Anyways, that's a lot of compute. Not quite sure why it's necessary for a Plus account. Would love to hear some thoughts on this.

reply
tbrownaw 9 hours ago
Sure it uses a few GB just like everything else these days, but some of the comments also mention it being slow?
reply
Aurornis 9 hours ago
The GitHub issue is AI generated. In my experience triaging these in other projects, you can’t really trust anything in them without verifying. The users will make claims and then the AI will embellish to make them sound more important and accurate.
reply
dylan604 9 hours ago
> AI will embellish to make them sound more important and accurate.

Did you mean "than accurate" rather than "and accurate"? Having a more accurate issue description only sounds like a good thing to me.

reply
monsieurbanana 9 hours ago
Making them look more accurate is not the same as being more accurate, and llms are pretty good at the former.

Imagine a user has a vague idea of something that is broken; the LLM will then interpret their comment as whatever it thinks the most likely underlying problem is, without actually checking anything.

reply
kace91 9 hours ago
“Sound more important and accurate” is correct. It doesn’t imply actual accuracy; the LLM will just use figures that resemble an actual calculation, hiding that they are wild guesses.

I’ve run into this trying to use Claude to instrument and analyze some code for performance. It would make claims like “around 500MB of RAM are being used in this allocation” without evidence.

reply
seanhunter 9 hours ago
I read that as "make them sound more important and accurate than they actually are".
reply
Filligree 9 hours ago
To make them sound more accurate.
reply
cogman10 8 hours ago
Ok, so a lot of this boils down to the fact that this sort of software really wants to be running on linux. For both windows and mac, the only way to (really) do that is creating a VM.

It seems to me that the main issue here is painful disconnects between the VM and the host system. The kernel in the VM wants to manage memory and disk usage and that management ultimately means the host needs to grant the guest OS large blocks of disk and memory.

Is anyone thinking about or working on narrowing that requirement? I may want 99% of what a VM does, but I really want my host system to ultimately manage both memory and disk. I'd love it if the Linux VM had a bridge for file IO that interacted directly with the host file system, and a bridge in the memory management system that called the host's memory allocation API directly and disabled the kernel's own memory management.

containers and cgroups are basically how linux does this. But that's a pretty big surface area that I doubt any non-linux system could adopt.

reply
lxgr 8 hours ago
Given that Claude Code runs without issues on macOS, I'd guess that it's more about sandboxing shell sessions (i.e. not macOS applications or single processes, for which solutions exist).

Unfortunately, unlike Linux, macOS doesn't have a great out-of-the-box story there; even Apple's first-party OCI runtime is based on per-container Linux VMs.

reply
cogman10 7 hours ago
I think only BSD really has a good sandboxing solution besides Linux (jails).

And after looking into jails, it looks like FreeBSD also supports running Linux binaries... that's actually really impressive. [1]

[1] https://docs.freebsd.org/en/books/handbook/linuxemu/#linuxem...

reply
jjfoooo4 7 hours ago
The upgrade to the native installer gave me some issues: Claude failed to return any responses and continuously ate memory until my computer crashed! The only fix I could figure out was nuking my entire .claude dir, losing all my history etc. with it.
reply
kccqzy 7 hours ago
It’s a solved problem in the VM world too. Memory ballooning is a technique where a driver inside the VM kernel cooperates with the hypervisor to return memory back to the host by appearing to consume the memory from the VM. And disk access is even easier; just present a network filesystem to the VM.
reply
cogman10 7 hours ago
The network filesystem to the host is usually pretty slow, no? That was my impression.

As for memory ballooning, the main issue is that it (generally) only gets triggered when the host runs out of memory.

For a host that only runs VMs, this is fine. But for a typical consumer host it becomes cumbersome: you still need to give the VM a giant memory block and hope your hypervisor of choice frees it in time. It's also uncoordinated. When swapping needs to happen, if the VM were using the host for allocation, the host could decide much more efficiently what goes into swap.

And if the host were in charge of both the memory and the file system, things like a system cache could be done more efficiently on top of all that.

reply
10000truths 3 hours ago
> The network file system to host is usually pretty slow no? That was my impression.

NFS doesn't have to be slow. If you avoid traversing the TCP/IP stack, performance is fine. Linux guests can use vsock to communicate with the hypervisor directly, and macOS hosts can use the Virtualization framework to map a guest vsock to a host UNIX socket.

reply
exabrial 7 hours ago
I see this as a feature. The cost of isolation.
reply
puppymaster 9 hours ago
macbook pro m4, bought last year. worked on so many projects. never hot after closing the lid. installed electron claude. closed the lid, went to sleep, and woke up to a macbook that had been hot all night. uninstalled claude. problem went away.

i kept telling myself this, BUT NEVER ELECTRON AGAIN.

reply
lxgr 6 hours ago
To be fair, ChatGPT seems to be a native app and still somehow managed to continuously burn some 30-40% of CPU on my Mac, which ended up being attributable to a shimmer animation for two never-loading icons.
reply
DauntingPear7 8 hours ago
It’s not electron
reply
woadwarrior01 7 hours ago
The macOS Claude app is absolutely an electron app, which is what the github issue in this post is about.

If you'd like to verify for yourself: On your mac, right click on the Claude app icon and click on "Show Package Contents" and then navigate to Contents > Frameworks > Electron Framework.framework.

reply
rvz 7 hours ago
Yes it certainly is.
reply
bigyabai 6 hours ago
I don't know if Electron is the issue here, my Wintel machine has Claude Code running 24/7 and doesn't ever heat up.

Might be virtualization woes or something adjacent.

reply
hulitu 9 hours ago
> woke up to macbook that has been hot all night

this is usual reason for divorce /s

reply
brunooliv 8 hours ago
I really love Anthropic's models, but every single product/feature I've used other than the Claude Code CLI has been terrible... The CLI just "stuck" for me, and I've never needed (or, arguably, looked in depth at) any other features. This is for my professional day job.

For personal use, where I have a Pro subscription and venture into exploring all the other features/products they have... I mean, the experience outside of Claude Code and the terminal has been... bad.

reply
msp26 8 hours ago
> every single product/feature I've used other than the Claude Code CLI has been terrible

yeah they're shipping too fast and everything is buggy as shit

- fork conversation button doesn't even work anymore in vscode extension

- sometimes when I reconnect to my remote SSH in VSCode, previously loaded chats become inaccessible. The chats are still there in the .jsonl files but for some reason the CC extension becomes incapable of reading them.

reply
yuppiepuppie 8 hours ago
I tend to agree here. Today, I tried to get the Claude chat to give me a list of Jira tickets from one board (link provided) and then upload it to Notion with some additional context. It glitched out after retrying the prompt 4x. I eventually gave up and went back to the terminal.
reply
perbu 8 hours ago
Yes. This is my experience as well. The software quality is generally horrible. It surely has improved a lot over the last couple of months, but it is still pretty horrible.

It is quite normal for me to have to force-close Claude Desktop.

reply
throwaway613746 2 hours ago
[dead]
reply
bichonnages 4 hours ago
In the meantime, I deleted the virtual machine and the Claude application, and simply created a web app from the site through Safari. It works very well.
reply
_orcaman_ 5 hours ago
A better UX would be to ask the user "Would you like to use the app in a sandbox for enhanced safety?" and only then download the Ubuntu Linux image used in the VM.
reply
pama 8 hours ago
Aren't most of the people recommending random tools in the GitHub thread for this issue just attempting to exploit naive users? Why would anyone in this day and age follow advice from new accounts to download new repos or click random websites, when they're already trying to use Claude Code or Cowork?
reply
nhubbard 7 hours ago
While I generally agree with your sentiment, these tools aren't bad ones:

- Santa is a very common tool used by macOS admins to lock down binary and file access privileges for apps, usually on managed machines

- Disk Inventory X and GrandPerspective are well-known disk space usage tools for macOS (I personally use DaisyDisk but that requires a license)

- WizTree and WinDirStat are very common tools from Windows admin toolkits

The only one here I can say is potentially suspect is ClearDisk. I haven't used it before, but it does appear to be useful for specifically tracking down developer caches that eat up disk space.

reply
andresquez 9 hours ago
Way slower, but way better than chat mode. Nothing beats Claude Code CLI imo.
reply
Aurornis 9 hours ago
This GitHub issue itself is clearly AI slop. If you’ve been dealing with GitHub issues in the past months it will be obvious, but it’s confirmed at the end:

> Filed via Claude Code

I assume part of it is true, but determining which part is the hard part. I’ve lost a lot of time chasing AI-written bug reports that were actually something else wrong with the user’s computer. I’m assuming the claims of “75% faster” and other numbers are just AI junk, but at least someone could verify whether the 10GB VM exists.

reply
16bitvoid 8 hours ago
If your codebase is entirely vibe coded, I feel it's only appropriate to permit issues being vibed as well. It's hypocritical otherwise.
reply
rzzzt 5 hours ago
Use an agent to summarize and generate reproducers for each report, another to select issues to be fixed in the next iteration, a third one to implement changes, a fourth for code review...
reply
chuckadams 9 hours ago
I wouldn't think it's inappropriate for an AI agent to file an issue against another AI agent, which itself is largely written by AI.
reply
game_the0ry 9 hours ago
Yeah, that's why I do not install these tools on my personal devices anymore and instead play with them on a VPS.

Try this if you have claude code -- ls -a your home dir and see all the garbage claude creates.
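A quick way to see what those tools leave behind (a sketch; the glob covers hidden entries in your home directory, and exact paths like ~/.claude will vary by install):

```shell
# Sizes of hidden files/dirs in $HOME, largest last.
# The pattern .[!.]* matches dot-entries while skipping "." and "..".
du -sh "$HOME"/.[!.]* 2>/dev/null | sort -h | tail -n 20
```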

reply
sometimez 5 hours ago
Same thing on Windows. The VM bundle is at %AppData%\Claude\vm_bundles
reply
anotheryou 7 hours ago
Mac problems...

So crazy. On a Windows desktop, at most I complain when something is hardcoded to the system drive (looking at you, Ollama).

reply
elzbardico 7 hours ago
This is exactly the kind of issue we'll see more and more frequently with vibe-coding.
reply
kordlessagain 8 hours ago
The number of bad things this company's software does is staggering. The models are amazing; the code sucks.
reply
AlexeyBrin 8 hours ago
Their code is written by their amazing models (this is what they claim anyway).
reply
Robdel12 7 hours ago
Hey, they did admit that they vibed this in a week and released it to everyone.
reply
fooker 8 hours ago
That seems somewhat reasonable.

Storage should be cheap; complain about Apple making you pay a premium for it.

reply
jFriedensreich 9 hours ago
It's just another example, and just a detail, in the broader story: we cannot trust any model provider with any tooling or other non-model layer on our machines or our servers. No browsers, no CLIs, no apps, no whatever. There may not be alternatives to frontier models yet, but everything else we need to own as a true open-source, trustable layer that works in our interest. This is the battle we can win.
reply
prmph 8 hours ago
Why don't people form cooperatives, pitch in to buy serious hardware, colocate it in local data centers, and run good local models like GLM on it to share?
reply
jFriedensreich 8 hours ago
We are starting to! TBH it will take some time until this is feasible at larger scale but we are running a test for this model in one of my community groups.
reply
fragmede 9 hours ago
What's funny is interacting with it from Claude Code. Claude-desktop-cowork can't do anything about the VM. It creates this 10 GiB VM, but the disk image starts off with something like 6-7 GiB already full, which means anything you try to do in Cowork has to fit into the remaining couple of gigs. It's possible to fill it up, and then Claude Cowork stops working, because the disk is full. Claude Cowork isn't able to fix this problem. It can't even run basic shell commands in the VM, and Opus 4.6 is able to tell the user that, but isn't smart enough/empowered to do anything about it.

So contrary to the GitHub issue, my problem is that there's not enough space. The fix is to navigate to ~/Library/Application\ Support/Claude/vm_bundles and ask Claude Code to upsize the disk to a sparse 60 GiB file, giving Cowork much more space to work in while not immediately taking up 60 GiB.
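If you want to try the same resize, the sparse-file part looks roughly like this (a sketch; `disk.img` stands in for whatever the bundle is actually named inside vm_bundles, and the guest filesystem still has to be grown afterwards, e.g. with resize2fs, before the VM sees the space; if truncate(1) isn't available, `dd if=/dev/zero of=disk.img bs=1 count=0 seek=60g` is a common equivalent):

```shell
# Grow the image to a 60 GiB *apparent* size without allocating blocks
# up front; on APFS/ext4 the file stays sparse until it is written to.
truncate -s 60G disk.img

# Apparent size vs. blocks actually allocated on disk:
ls -lh disk.img
du -h disk.img
```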

Bigger picture, what this teaches me though, is that my knowledge is still useful in guiding the AI to be able to do things, so I'm not obsolete yet!

reply
pixl97 9 hours ago
So it's using its binary disk image as the cache/work disk too?

Yeah, that's a recipe for problems.

reply
wutwutwat 7 hours ago
Are we sure this isn't a sparse image? It will report the full size in Finder, but it won't actually consume that much space if it's sparse.
reply
daemonk 7 hours ago
Just write a Claude OS already.
reply
mixdup 9 hours ago
All code in Claude™ is written by Claude™
reply
jug 9 hours ago
Also, it apparently eats 2 GB of RAM or so to run an entire virtual machine even if you've disabled Cowork. Not sure which part of this is worse. Absolute garbage.
reply
crumpled 9 hours ago
The software seems to do more and more and communicate less and less about what it's doing. That's the crux.

Pondering... Noodling... Some other nonsense...

reply
aplomb1026 6 hours ago
[dead]
reply
bear3r 8 hours ago
[dead]
reply
TheRealPomax 8 hours ago
Labelled "high priority" a month ago; no actual activity from Anthropic despite it being their repo. I'm starting to get the feeling they're not actually very good at this?
reply