Bootc and OSTree: Modernizing Linux System Deployment
99 points by mrtedbear 21 hours ago | 38 comments

pojntfx 15 hours ago
bootc and OSTree are both very neat, but the leading edge of immutable Linux distros (GNOME OS, KDE Linux) is currently converging on a different proposal by systemd developers that's standardized by the UAPI Group (https://uapi-group.org/specifications/). It fixes quite a few of the complexities with OSTree (updates are handled by `systemd-sysupdate`/`updatectl` and are just files served via HTTP) and is quite a bit easier to extend with things like an immutable version of the Nvidia drivers or codecs thanks to system extensions handled by `systemd-sysext` (which in turn are just simple squashfs files overlaid over `/usr`) and configuration via `systemd-confext`. `mkosi`, also by systemd, is quickly becoming _the_ way to build custom images too, and is somewhat tied to these new standards.
reply
aaravchen 6 hours ago
sysexts are indeed a very interesting feature, though it really only complements some other whole-system solution since it can only affect files under a non-root folder location.

I'm struggling to see how sysupdate is really equivalent to bootc or ostree though. Sysupdate is just the sw-update tool from Yocto that's been around for 10+ years with a little more syntax sugar, which itself was just a common shared implementation of what all embedded systems had been rolling their own of for almost 20 years before that. It says it requires an A/B/.../N partitioning scheme, which is exactly what bootc/ostree/etc. are designed to avoid.

If you don't use the whole-disk update thing from sysupdate, then instead you're just talking about a transactional package manager that is still in its infancy and hasn't addressed the many gotchas and corner cases that the dozens of more mature package managers have. It's not actually "transactional" in the sense of undo, for example; it's "transactional" only in that you won't get a partial install, which hasn't been a problem with any existing package managers for almost 40 years. All the things they list that you can group together into a "transaction" are either things that are already linked via existing package manager packages, or are only useful for embedded systems.

I'm not saying sysupdate isn't useful; the upper end of embedded design is pushing into the space where systemd is standard now, so it could be useful for those devices. But it's not really equivalent at all to bootc/ostree, and doesn't really have any applicability outside initial system imaging from a live disk, or embedded devices.

reply
doug-moen 7 hours ago
GNOME OS and KDE Linux are both specialized distros that primarily exist to test GNOME and KDE. They aren't for general users, and both web sites warn you not to rely on them. And they impose limitations on your ability to install arbitrary 3rd party software, whereas Fedora Atomic Desktop lets you customize the system without such limitations. Fedora Atomic lets me install arbitrary RPMs into the base system.

"quite a bit easier to extend" sounds good to me, but the "easier" here refers to the internal system implementation details? I am an end user, not a Linux distro system architect, and I care more about the user experience. I will be interested in test driving a general purpose OS based on this technology, whenever that happens in the future. Since Red Hat is involved in the UAPI project, perhaps Fedora Atomic Desktop will migrate to this technology in the future?

reply
jcastro 7 hours ago
> is currently converging on a different proposal by systemd developers that's standardized by the UAPI Group

We're working in this space with Project Bluefin: https://github.com/projectbluefin/dakota

Both approaches are indeed competitive, but you can also leverage both to achieve the same thing. We're experimenting with a pure ddi Bluefin, a buildstream/GNOMEOS one that spits out a bootc image, as well as a Bluefin that is just a systemd-sysext on top of GNOME OS. Chef's choice!

There will be many ways to slice this problem -- my opinion is that in the end what matters will be how you design the infrastructure to make these and not the artifacts themselves.

We already have CentOS/Fedora builds alongside these, long term we'll see which ones end up being the most efficient. Buildstream is a tool which people should look at in this space too: https://buildstream.build/index.html

reply
aaravchen 7 hours ago
Ironically the first implementation of ostree required an HTTPS server to serve the ostree commits, allowing a much smaller subset of what's needed to be transferred. However that became an adoption hurdle since it required unique infrastructure. Ostree switched to using containers because zstd allows compressed chunks now rather than the old all-or-nothing image layers, and the existing widespread container image registry infrastructure could be reused without modification. And both utilize layers for their construction so there were possible benefits there (that never really materialized, but are still available to pursue).
reply
rurban 15 hours ago
Typo: (CoreOS and Fedora Silverblue) are the bleeding edge of immutable distros. Those mentioned are just users.
reply
pojntfx 14 hours ago
GNOME OS uses BuildStream and as a result has no concept of packages at all and no relationship to any distro, KDE Linux is based on Arch. There is no relationship between the two. GNOME OS used to be OSTree based but switched to systemd-sysupdate a while ago.
reply
aaravchen 7 hours ago
CoreOS is in a weird space. It's been desperately playing catch up with its sibling products for the last few years, but it also is where a lot of the Fedora/RHEL developers in this space are focused primarily.
reply
debugnik 9 hours ago
It's confusing, but there're apparently distros called "GNOME OS" and "KDE Linux" now.
reply
smashed 15 hours ago
> the bleeding edge of immutable Linux distros (GNOME OS, KDE Linux)

These are words but they don't make sense.

reply
pojntfx 14 hours ago
Corrected - I meant leading edge.

Context re:distros mentioned:

GNOME OS: https://os.gnome.org/ KDE Linux: https://kde.org/linux/

reply
7777777phil 10 hours ago
Doubling progression-free survival (17.6 vs 7.4 months) is a large effect size for recurrent prostate cancer.
reply
Scipio_Afri 10 hours ago
I’m pretty sure Linux doesn’t have a prostate, even with all the changes in the leading edge distros, and you’re commenting in the wrong post.
reply
7777777phil 7 hours ago
Pretty sure I didn’t want to post that here. But then I got rate limited and upon coming out of rate limit jail blindly pasted this comment where my page reloaded - my bad should have been here: https://news.ycombinator.com/item?id=47193047
reply
coderedart 10 hours ago
Kdelinux uses pacman for now, but the eventual goal is systemd-sysext based mkosi images.

They are also considering moving to buildstream and join gnome.

reply
n42 14 hours ago
"some of the newer ideas happening in this space are in the GNOME OS project and the KDE Linux project"
reply
hparadiz 14 hours ago
My Gentoo box is immutable. Right up until I run emerge.
reply
znpy 15 hours ago
From https://uapi-group.org/ :

> Contributing members include people from Ubuntu Core, Debian, GNOME OS, Fedora CoreOS, Endless OS, Arch Linux, SUSE, Flatcar, systemd, image-builder/osbuild, mkosi, tpm2-software, System Transparency, buildstream, BTRFS, bootc, composefs, (rpm-)ostree, Microsoft, Amazon, and Meta.

Note systemd, (rpm-)ostree and bootc.

My understanding is that uapi is another initiative but not completely separated from bootc and ostree. Maybe complementary.

reply
pojntfx 14 hours ago
Sorry, not completely separate, yes, but some of the parts of the standard (e.g. systemd-sysext for layering "packages") and closely related things like systemd-sysupdate do actually replace parts of this (esp. ostree).
reply
anglesideangle 5 hours ago
I often see bootc and/or buildstream uncritically presented as the future of the linux desktop, and find it somewhat surprising because, in my experience, they are both more complex _and_ less capable than nixos.

To elaborate, I will separate linux operating systems into three categories:

1. traditional mutable package management (debian, fedora, arch, etc) - Bad model for obvious reasons, I won't get into this.

2. immutable images (embedded deployments, android, chromeos) - A build system (yocto, nix, buildroot, buildstream) produces a disk image. System boots the disk image and handles updates using A/B root partitions with `systemd-sysupdate` or similar.

3. store-based atomic (nixos, ublue) - The system keeps a mutable store of hashed objects (/nix/store, /var/lib/containers/storage, /ostree/repo) and boots a specific system generation. Updates add new objects to the store, which must be automatically cleaned, and create a new system generation to boot into.
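
A rough Python sketch of the category 3 idea (hashed objects, generations, cleanup). This is a toy model only: the `Store` class and its methods are hypothetical and do not reflect how nix, ostree, or container storage are actually implemented.

```python
import hashlib


class Store:
    """Toy content-addressed store with bootable generations (hypothetical names)."""

    def __init__(self):
        self.objects = {}      # sha256 hex digest -> file content
        self.generations = []  # each generation maps path -> digest

    def put(self, data: bytes) -> str:
        digest = hashlib.sha256(data).hexdigest()
        self.objects[digest] = data  # identical content is stored only once
        return digest

    def commit(self, files: dict) -> None:
        """An update: add objects to the store and record a new generation."""
        self.generations.append({path: self.put(data) for path, data in files.items()})

    def gc(self, keep_last: int = 2) -> int:
        """Drop old generations, then delete objects nothing refers to anymore."""
        self.generations = self.generations[-keep_last:]
        live = {digest for gen in self.generations for digest in gen.values()}
        dead = [digest for digest in self.objects if digest not in live]
        for digest in dead:
            del self.objects[digest]
        return len(dead)


store = Store()
store.commit({"/usr/bin/foo": b"foo v1", "/usr/lib/libc": b"libc"})
store.commit({"/usr/bin/foo": b"foo v2", "/usr/lib/libc": b"libc"})
store.commit({"/usr/bin/foo": b"foo v3", "/usr/lib/libc": b"libc"})
removed = store.gc(keep_last=2)  # only "foo v1" becomes unreferenced
```

The unchanged `libc` object is shared by every generation, so an update only adds what actually changed, and rolling back is just booting an older generation that is still in the store.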

In the case of categories 2 and 3, a build system of some kind is used to produce the image or packages that are turned into the new system generation.

The bootc project, which falls into category 3, attempts to use the standardized and highly adopted OCI image format (layered filesystem changes stored in a content addressed store) as a medium for distributing linux systems. The major limitation here is that these systems are very complicated to build and extend. While it may not seem that way compared to nixos if you have prior experience writing dockerfiles, configuring your system with imperative statements that build on previous state is _really_ tedious. For example, you can't just re-use work with multiple `FROM` statements in the same layer, so you instead need to copy files between images. This is incredibly jank, look at [the docs](https://blue-build.org/how-to/minimal-setup/) for bluebuild's module system. Additionally, for a motivated user to change their system internals, they need to make the jump to hosting it with CI and pulling the images.

As jcastro mentioned elsewhere in this thread, there is work in the ublue project, which focuses on bootc images, to instead build their systems from source using buildstream, the same way GNOME OS works. The idea is that this is no longer a "distribution" and doesn't have "packages" anymore, since the entire system is built from source into an image that can be updated to atomically. While this model is simpler and way less jank than assembling your OS in a dockerfile, the model of separating packaging from distribution makes things harder for users for no benefit over the alternative.

To elaborate: Buildstream is a build system created by gnome developers that works as follows:

- Various elements are defined in yaml files with their dependencies and build steps

- Buildstream forms a tree of build elements to evaluate

- Each element is evaluated in a sandbox with access to only its dependencies and the output is placed in a content-addressed store for caching
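
The three steps above can be sketched in Python. This is a minimal sketch under stated assumptions: the element names, the `ELEMENTS` table, and the `cache_key`/`build` helpers are all made up for illustration, the dependency graph is assumed to be acyclic, and the "build" is a stand-in string rather than a real sandboxed build. It is not BuildStream's actual implementation or API.

```python
import hashlib

# Hypothetical elements: name -> (dependencies, build command).
ELEMENTS = {
    "libc": ([], "make libc"),
    "zlib": (["libc"], "make zlib"),
    "app": (["libc", "zlib"], "make app"),
}


def cache_key(name, elements, memo):
    """Key = hash of the element's build steps plus the keys of its dependencies."""
    if name not in memo:
        deps, command = elements[name]
        h = hashlib.sha256(command.encode())
        for dep in sorted(deps):
            h.update(cache_key(dep, elements, memo).encode())
        memo[name] = h.hexdigest()
    return memo[name]


def build(target, elements, cache):
    """Build dependencies first, then the target; return the names actually built."""
    built = []
    memo = {}

    def visit(name):
        for dep in elements[name][0]:
            visit(dep)
        key = cache_key(name, elements, memo)
        if key not in cache:
            cache[key] = f"artifact of {name}"  # stand-in for a sandboxed build
            built.append(name)

    visit(target)
    return built
```

With an empty `cache`, `build("app", ELEMENTS, cache)` builds all three elements; a second call builds nothing, and changing only the `zlib` command rebuilds `zlib` and `app` while `libc` is served from the cache.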

If you have experience with nix, you may recognize that this is almost the same way nix works, with the difference that buildstream dependencies are mounted to regular directories (e.g. /usr) in the build sandbox instead of directly dealt with as /nix/store objects, and that buildstream is much harder to extend. A nix build results in artifacts that live in the nix store and are symlinked to other paths in the nix store, while a buildstream build results in artifacts that are compatible with a more traditional directory structure.

The idea of GNOME OS being "distroless" and not having packages is misleading, as it does have packages. Only, the packages exist exclusively during the build process. In order for a user to modify their system, they must add a rule to the buildstream definitions their system is built from and rebuild the entire thing from source to generate a new system image, which is unnecessarily burdensome. This is because the content-addressed artifact store (buildstream's cache) that exists when building has no relationship to the content-addressed artifact store on the deployed system (ostree or oci storage). This is pointless indirection with no clear benefit. By not separating building from distribution, nixos (a project using the nix dsl to build a linux system) achieves the same benefits of this model without any of the drawbacks. Users can modify how their system works, use caches, add their own packages, and share/integrate these modifications freely without building their entire os again from source every time.

To put this power in perspective, adding a nginx server to your system on nixos amounts to adding to your nix configuration:

services.nginx.enable = true;

and then rebuilding, switching to the new system generation.

Imagine how painful it would be to do this via a dockerfile (based off a different image) or buildstream definitions...

Finally, I'd like to clarify that nixos is not perfect. There are many areas that need improvement (documentation, evaluation speed, evaluation caching, ifd, error messages, etc). However, I believe nixos is a fundamentally better and simpler model than the one being pushed by a lot of the linux desktop ecosystem at the moment. I believe a lot of the work on infrastructure like bootc and buildstream would be better focused on nix/nixos, or at least would benefit from learning from them.

reply
lproven 13 hours ago
> bootc and OSTree are both very neat

May I rephrase that?

bootc and OStree are both Cthulhoid nightmare horrors that only exist because of corporate politics, but the leading edge...

reply
kj4ips 2 hours ago
I like the idea behind ostree and bootc, but I feel that OCI (with tarlayers) is not a good fit. `repack` makes an absolute hash of things, and since the layers are logically packaged, they will have to be composed somehow, and then ostree becomes only slightly more useful than coreos's A/B usr.

OCI roughly assumes that layers will be laid out in some logical way, and that a given host will see opportunities to reuse across different instances, but with bootc, there will only ever be one instance.

OCI also assumes that individual layers are small enough that it is always worth pulling and unpacking a layer instead of some kind of authentication delta, which is great for a k8s cluster in a center, but not great for devices out on the edge, where you might want this kind of pseudo-immutable system even more.

I really want some standardized way for a manifest in OCI to say that "this content is also available in other format X here".

reply
YorickPeterse 11 hours ago
For those looking for a more extensive article about bootc, I recently wrote about using it in https://yorickpeterse.com/articles/self-hosting-my-websites-..., including a comparison to some other existing tools.
reply
samhclark 6 hours ago
Personally, I've really enjoyed using bootc for both my personal laptop and my NAS.

I really like the NAS use case because I can build the ZFS kmods for that specific version of Fedora CoreOS in CI/CD. If there's any compatibility failure, then my NAS doesn't get an update and I get the CI/CD failed email. No downtime because of some kernel incompatibility.

For the laptop though, I feel like there's a better way that I haven't found. Some way not to require CI/CD, to build the next image and switch to it all locally. I haven't gone down that path yet, but it looks kinda like that Option 2 the author described. Maybe it's really just that easy.

I've really been enjoying this space.

reply
iamcalledrob 9 hours ago
I'd love to have my system be declared in code, so I can replicate the same environment across a laptop and a desktop with minimal drift.

So same OS, users, packages, flatpaks etc. And a mostly synced home dir too.

Is NixOS the only viable way to do this? I don't like the path mangling that Nix introduces.

It seems like an immutable distro customized via a Containerfile could work too? Except rebooting/reimaging for every change sounds tedious as hell.

reply
aaravchen 6 hours ago
All the immutable system solutions out there pretty much all make your rootfs immutable, but leave your home folder and system config folders (i.e. /var and /etc) as mutable. It's pretty obvious that if you make the config folders and/or home folder immutable it starts causing most people problems, since in the vast majority of cases people just want to be able to persistently change the desktop background color or spaces vs tabs setting in their IDE without having to locate the setting in a full system config, set it, and regenerate.

This does cause some interesting tension in the immutability though. /etc in particular is really a mix of things that only a sysadmin should really be setting, and things a regular user may set indirectly. This usage has grown organically over time with the tools involved in the implementation, so it's not at all consistent which are which. The immutable system solutions recognize this by usually handling the whole /etc folder the same way package managers handle package installs that include /etc files: by doing a 3-way merge between the old provided files, the new provided files, and the current existing files to see if the existing are unchanged from the old provided and can just be directly replaced by the new provided, or if a merge conflict needs resolving. Additionally, a separate copy of /etc is maintained associated with each available bootable system version, so when you roll back you get the old /etc files you had before. Though this does introduce a system-unique variation, since you now have the new /etc being affected by the state of /etc when it was forked.
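
A minimal Python sketch of that per-file 3-way decision. Assumptions are labeled loudly: `merge_etc` is a hypothetical helper, files are modeled as plain strings in dicts, and on a conflict this sketch simply keeps the local file and flags it. Real tools differ in the details and in how they resolve conflicts.

```python
def merge_etc(old_default, new_default, current):
    """Per-file 3-way decision; each argument maps path -> content.

    A path missing from a dict means the file is absent on that side.
    """
    result, conflicts = {}, []
    for path in sorted(set(old_default) | set(new_default) | set(current)):
        old = old_default.get(path)
        new = new_default.get(path)
        cur = current.get(path)
        if cur == old:
            chosen = new   # never touched locally: take the new provided file
        elif new == old or new == cur:
            chosen = cur   # provided file unchanged (or same change): keep local
        else:
            conflicts.append(path)
            chosen = cur   # both sides changed differently: keep local, flag it
        if chosen is not None:
            result[path] = chosen
    return result, conflicts


old = {"a.conf": "1", "b.conf": "x", "c.conf": "p"}
new = {"a.conf": "2", "b.conf": "x", "c.conf": "q"}
cur = {"a.conf": "1", "b.conf": "y", "c.conf": "r"}
merged, conflicts = merge_etc(old, new, cur)
```

Here `a.conf` picks up the new default, `b.conf` keeps the local edit, and `c.conf` is reported as a conflict because both the provided file and the local file changed.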

If you want all your home folder and system config to be identical, nix or guix really are your primary way to go, but that extra lockdown of the user and system config is exactly what most people don't want for usability reasons.

I personally use nix home-manager on top of Aurora DX from Universal Blue. I have my nix home-manager config set up to manage only the things I want to be locked down in my home config, and to provide some extra tools that are easier to manage/supply via Nix than a system package manager (where I would need to do a whole system update to get the new version). My IDE for example is installed at a specific version via Nix, but I don't have Nix manage its settings, so I can separately tweak them as needed without needing a home-manager rebuild.

EDIT: typo

reply
jcastro 7 hours ago
> customized via a Containerfile could work too? Except rebooting/reimaging for every change sounds tedious as hell.

You can do this today with Aurora, Bazzite, Bluefin, and other bootc systems. The system updates by default are weekly and require a reboot but when you move most of the stuff into the userspace most of that stuff updates independently anyway.

reply
aaravchen 7 hours ago
In fact, if you want to use something like Nix on a UniversalBlue system, you have to spin your own. The "hotfix" and chattr solutions of pre-composefs don't work anymore. Anything that needs to go into a read-only location and isn't packaged as an RPM requires you to "spin your own".

Luckily UniversalBlue makes it incredibly easy: they have a template repo you can use that has all the GitHub Actions setup included to auto-build on every change, and directions for how to set it up. It took me about 10 minutes.

reply
azibi 14 hours ago
We use TorizonOS, which is also based on OSTree: https://www.torizon.io/blog/ota-best-linux-os-image-update-m....

It works quite well for our edge devices. It’s tightly integrated with Toradex hardware, but not limited to it.

It may seem a little niche, but it has strong potential for long‑term supported edge products. Any additional experiences to share?

reply
tuananh 12 hours ago
bootc is kind of perfect for edge. delivering OS update as a whole. ease of update/rollback.
reply
nicman23 10 hours ago
developers will do anything but to use a cow fs
reply
Borealid 19 hours ago
I like the idea of using the same format for kernel-included VMs as I use for containers.

Next up, backups stored as layers in the same OCI registries.

I am not, however, sure ostree is going to be the final image format. Last time I looked work was in progress to replace that.

reply
mroche 17 hours ago
It is not, the future is currently pointing to composefs:

https://github.com/bootc-dev/bootc/issues/1190

There's a GitHub org that builds bootc-ready images for non-Red Hat family distributions using this backend.

https://github.com/bootcrew

reply
lproven 13 hours ago
It is very odd to me to watch OStree-based distros starting to take off and win recruits.

The only reason Red Hat needed to invent this very complex mechanism was because RH does not officially have a COW-snapshot capable filesystem in its enterprise distro.

A filesystem with snapshots makes software installation transactional. You take a snapshot, install some software, and if it doesn't work right, you can revert to the snapshot. (With very slightly more flexible snapshots, you can limit the snapshot to just some part of the directory tree, but this is not essential; it merely permits more flexibility.)

In other words, you are a long way toward what in database language is called ACID:

https://en.wikipedia.org/wiki/ACID

Atomicity, consistency, isolation, durability. It makes your software installation transactional: an update either happens completely (A), you can check it is valid (C) and works (I), or it can be totally reverted, and the system restored to the earlier state (D).

That's a good thing. It means you can safely automate software deployment knowing that if it goes wrong you have an Undo mechanism. Databases got this 50+ years ago; in the 21st century it's making its way to FOSS OSes.

Do this in the filesystem and it's easy. SUSE's implementation is so simple, it's basically a bunch of shell scripts, and it can be turned on and off. You can run an immutable OS, reboot for updates, and if you need, disable it, go in and fix the system, and then turn it back on again.

This is because SUSE leans very heavily on Btrfs and that is the critical weakness -- Btrfs is only half finished and is not robust.

But RH removed Btrfs from RHEL and Btrfs was the only GPL COW filesystem, so core infrastructure in the distro means no COW on RH. Oracle Linux has Btrfs -- the FS was developed at Oracle, after all -- and so does Alma.

(Yes I know, Fedora put it back, but the key thing is, it uses Btrfs only for compression so that Flatpak looks less horrendously inefficient. Fedora doesn't use snapshots.)

With no COW FS, RH had to invent a way to do transactional updates without filesystem support. Result, OStree. Git, but for binaries.

And yes, everyone developing FOSS uses Git, but almost nobody understands Git:

https://xkcd.com/1597/

You know that if there's an Xkcd about it, it must be true.

Embedding something you don't understand in your OS design is a VERY BAD PLAN.

With OStree your FS is a virtual one, it's not real, it's synthesized on the fly from a local repository. The real FS is hidden and can't be hand-edited or anything. It generates the OS filesystem tree on the fly, you see. OS-tree.

Use it just for GUI apps, that's Flatpak.

Use it for the whole OS, that's OStree. It is so mind-shreddingly complicated that you can't do package management any more, you can't touch the underlying FS. So you need a whole new set of layers on top: virtual directories on top of the main virtual directory, and some bits with extra pseudo-filesystems layered on top of that to make some bits read-write.

It's like the scene in the Wasp Factory where under the skull plate it's just writhing maggots. I recoil in horror and revulsion when I see it.

So it's deeply bizarre to read blog posts praising all the cool stuff you can do with it.

reply
aaravchen 7 hours ago
That is pretty obviously how it started, and for the very reasons you describe. But there have been some other benefits that have come out of going down this alternate path as well. In particular, the remote composability and local deployment is extremely useful for "cattle" edge system deployment. Installing a package reacts to what's currently on the system when installing. Even something as simple as the order you install your packages in can affect the result. Not having to run an entirely duplicate "golden" system to generate snapshots on and then push them to the cattle from is a pretty nifty benefit.
reply
e12e 11 hours ago
> A filesystem with snapshots makes software installation transactional. You take a snapshot, install some software, and if it doesn't work right, you can revert to the snapshot. (With very slightly more flexible snapshots, you can limit the snapshot to just some part of the directory tree, but this is not essential; it merely permits more flexibility.)

Eh, you don't typically have a lock mechanism for the filesystem equivalent to that of a database.

Who's to say something like this doesn't happen:

  - snapshot fs
  - op/system adjusts firewall rules
  - "you" install updates
  - you rollback
  - firewall rules are now missing patches
Don't get me wrong, zfs is great - but it doesn't come with magical transactions.
reply
debugnik 9 hours ago
A snapshot is taken before installing updates, so you'd get two snapshots from your example. Rolling back would leave you right after adjusting firewall rules.
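
A toy Python model of the two-snapshot sequence, assuming whole-tree snapshots and a hypothetical `SnapshotFS` class (this is not the zfs, Btrfs, or snapper API):

```python
import copy


class SnapshotFS:
    """Toy filesystem: a dict of path -> content, with whole-tree snapshots."""

    def __init__(self):
        self.files = {}
        self.snapshots = []

    def snapshot(self) -> int:
        self.snapshots.append(copy.deepcopy(self.files))
        return len(self.snapshots) - 1

    def rollback(self, snapshot_id: int) -> None:
        self.files = copy.deepcopy(self.snapshots[snapshot_id])

    def install_updates(self, changes: dict) -> int:
        """Take a snapshot immediately before applying the update."""
        snap = self.snapshot()
        self.files.update(changes)
        return snap


fs = SnapshotFS()
fs.files["/usr/bin/app"] = "v1"
first = fs.snapshot()                         # the earlier snapshot
fs.files["/etc/firewall.rules"] = "patched"   # op/system adjusts firewall rules
pre_update = fs.install_updates({"/usr/bin/app": "v2"})
fs.rollback(pre_update)                       # undo only the update
```

After the rollback to `pre_update` the app is back at `v1` and the firewall change is still there; only a rollback to `first` would lose it. The sketch says nothing about changes made while the update is in progress, which is the locking concern raised in the parent comment.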
reply