AWS Adds support for nested virtualization
290 points by sitole 24 hours ago | 112 comments

boulos 22 hours ago
I feel vindicated :). We put in a lot of effort with great customers to get nested virtualization running well on GCE years ago, and I'm glad to hear AWS is coming around.

You can tell people to just do something else, there's probably a separate natural solution, etc. but sometimes you're willing to sacrifice some peak performance just to have that uniformity of operations and control.

reply
alexellisuk 16 hours ago
This is great news for folks that use microVMs - "we only use AWS" has been an issue for our stuff (slicer services/sandboxes/actuated self-hosted GitHub runners)

If anyone here can't wait (as it looks like there's very little info on this at the moment..)

I wrote up detailed instructions for Ant Group's KVM-PVM patches. Performance is OK for background servers/tasks, but takes a hit of up to 50% on complex builds like kernels or Go with the K8s client.

DIY/detailed option:

https://blog.alexellis.io/how-to-run-firecracker-without-kvm...

Fully working, pre-built host and guest kernel and rootfs:

https://docs.slicervm.com/tasks/pvm/

I'll definitely be testing this and comparing as soon as it's available. Hopefully it'll be accelerated somewhat compared to the PVM approach. There's still no sign whether those patches will ever end up merged upstream in the Linux Kernel. If you know differently, I'd appreciate a link.

Azure, OCI, DigitalOcean, GCE all support nested virt as an option and do all take a bit of a hit, but it makes for very easy testing / exploration. Bare-metal on Hetzner now has a setup fee of up to 350 EUR.. you can find some stuff with 0 setup fee, but it's usually quite old kit.

Edit: this doesn't look quite as good as the headline.. Options for instances look a bit limited. Someone found some more info here: https://x.com/nanovms/status/2022141660143165598/photo/1

reply
indigodaddy 5 hours ago
Why would we need PVM if AWS now supports nested virt?
reply
PunchyHamster 15 hours ago
> Bare-metal on Hetzner now has a setup fee of up to 350 EUR.. you can find some stuff with 0 setup fee, but it's usually quite old kit.

I don't understand what you are paying for here; nested virtualization doesn't need any extra hardware setup compared to normal virtualization

... or are you saying Hetzner wants 350 EUR for turning on the normal virtualization option in the BIOS?

reply
krab 15 hours ago
Hetzner charges a fee for setting up your bare-metal machine. Often zero for their smaller machines and for those in auction. Probably they don't want someone to order a large fleet of machines for one month and then cancel. They might not get another customer for those machines soon.
reply
PunchyHamster 17 minutes ago
...but servers have come with virtualization on by default for like... at least a decade if not more

So they literally want money to fix what they fucked up the first time

reply
mkesper 11 hours ago
reply
krab 11 hours ago
Good context. They're commenting only on why they are increasing some setup fees though, not justifying their existence. The Hetzner setup fees were in place already before the RAM price hike.
reply
alexellisuk 11 hours ago
They used to charge a fair admin fee like 30-70 EUR for most bare-metal hosts.. now it's 99 EUR for the most basic/cheapest option.. up to 350 EUR for something modest like a 16 Core Ryzen.. monthly fees haven't changed much.

https://www.hetzner.com/dedicated-rootserver/matrix-ex/ https://www.hetzner.com/dedicated-rootserver/matrix-ax/

reply
PunchyHamster 17 minutes ago
Feels weird to roll it into the setup fee vs the monthly price
reply
sidewndr46 10 hours ago
I've never used Hetzner because their terms of service didn't make any sense to me, but a 350 EUR fee for each setup? That almost seems like they don't want business. Every bare metal host I've used had a management interface I could submit a job to in order to reprovision my host at any time. Some even offer a recovery console through this. It takes 1-10 minutes but I'm assuming it was out of band management based, not human interaction.

The worst case I ever had was a hard drive that failed, and I had to wait, I think, a week for OVH to physically replace it.

reply
lelandbatey 9 hours ago
Hetzner offers uniquely cheap dedicated hosting, even beating OVH. Per their statement about the fees, they're having to do this because without the setup fees, recent hardware price increases would otherwise raise the price of acquiring new hardware so high that they would essentially never make a profit on the hardware they would have to buy for new orders. They're also saying that their overall prices are going to have to increase if the hardware prices don't change soon. Thus they are charging more for setup while keeping their monthly prices low, or at least trying to for now: https://www.hetzner.com/pressroom/statement-setup-fees-adjus...
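
To make the setup-fee vs monthly-price trade-off concrete, here is a minimal arithmetic sketch. The 350 EUR setup fee is the figure quoted in this thread; the 100 EUR monthly price and the function name are purely hypothetical placeholders, not Hetzner's actual pricing.

```python
def effective_monthly_cost(setup_fee: float, monthly_fee: float, months: int) -> float:
    """Average cost per month once a one-off setup fee is spread over the rental period."""
    if months <= 0:
        raise ValueError("months must be a positive integer")
    return monthly_fee + setup_fee / months


if __name__ == "__main__":
    # Hypothetical 100 EUR/month server with the 350 EUR setup fee from the thread.
    for months in (1, 12, 35):
        cost = effective_monthly_cost(350.0, 100.0, months)
        print(f"{months:>2} months: {cost:.2f} EUR/month")
```

The longer the machine is kept, the less the setup fee matters, which is presumably why a high setup fee mostly discourages short-term rentals.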
reply
sidewndr46 5 hours ago
That seems counter to a "pay as you go" or "pay for what you use" model. I'd rather have sky-high monthly fees, so that I don't have a sunk cost.
reply
LunaSea 9 hours ago
You'll still pay 10x less than any of the cloud platforms.
reply
anurag 23 hours ago
This is a big deal because you can now run Firecracker/other microVMs in an AWS VM instead of expensive AWS bare-metal instances.

GCP has had nested virtualization for a while.

reply
Twirrim 18 hours ago
OCI supports it with Intel. I know it works with AMD, but we don't officially support that so far as I'm aware. The performance hit on AMD is bigger than Intel, last I looked.
reply
direwolf20 16 hours ago
You can use an expensive AWS VM instead of an expensive AWS bare-metal image. Does anyone realise how expensive AWS is, even in the best case?
reply
PunchyHamster 15 hours ago
It is expensive. But the point where it stops being expensive is far above most companies' use case. If you're paying less than a developer's salary for hosting you most likely won't see all that many benefits from moving.

Renting a server from cheaper hosting providers can be massive savings but you now need to re-invent all of the AWS APIs you use or might use and it's a big CAPEX time investment. And any new feature you need, whether that's a queue, a mail gateway or a thousand other APIs, needs to be deployed and managed first before you can even start testing.

It's less work now than it was before just due to the number of tools there are to automate it but it's still more work that you could be spending on improving your product.

reply
notyourwork 14 hours ago
Agreed. Some threads make the suggestion you replied to and seemingly ignore the reality of business. Not all businesses want to insource all problems.
reply
re-thc 15 hours ago
> but you now need to re-invent all of the AWS APIs you use or might use and it's a big CAPEX time investment

Or maybe you just never needed most of these in the first place. People got into this "AWS" mentality like it is the only way to do things. Everything had to be in a queue, event driven etc.

I'd argue not using AWS means simplifying things and it'll be less expensive not just in server cost but developer time.

reply
stoneforger 13 hours ago
You don't get how this works. You buy in AWS because everyone else is, so it's expected. It diffuses risk to your stock options. This also begets a whole generation of people who can only use cloud services so now you are more hard pressed to find people with experience to run things without the cloud. You also create a bigger expenses sheet so it shows you're investing and growing, attracting more investors. "We pay 10 mil in AWS, we're that big". It's classic perverse incentives feeding into a monoculture.
reply
re-thc 10 hours ago
> You don't get how this works.

You must know more than GPT. You just "know" and assume everyone else doesn't. Maybe think about the billion other possibilities you're missing.

reply
pezgrande 11 hours ago
System admins are probably cheaper than cloud-expert DevOps.
reply
PunchyHamster 24 minutes ago
Not at scale to run your own bunch of servers competently.
reply
rirze 10 hours ago
Good system admins? No.
reply
j45 11 hours ago
If you ever used the AWS APIs to begin with.

Folks are increasingly staying cloud agnostic - meaning you install and run, yourself, the open source package that a cloud provider packages.

It’s surprising how many are ready to go today compared to 10 years ago.

reply
parhamn 23 hours ago
What's the approximate perf hit of something like this?
reply
largbae 23 hours ago
Nowadays nested just wastes the extra operating system overhead and I/O performance if your VM doesn't have paravirtualization drivers installed. CPUs all have hardware support.
reply
otterley 22 hours ago
As a practical matter, anywhere from 5-15%.
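
A rough way to translate a figure like this into wall-clock time, assuming the 5-15% means lost throughput on a CPU-bound job (that interpretation is an assumption; the comment does not say how the number was measured). The function name and the 600-second baseline are illustrative only.

```python
def slowed_runtime(baseline_seconds: float, throughput_loss: float) -> float:
    """Estimated runtime when throughput drops by `throughput_loss` (0.10 means a 10% hit)."""
    if not 0.0 <= throughput_loss < 1.0:
        raise ValueError("throughput_loss must be in [0, 1)")
    # Same amount of work at reduced speed: time scales by 1 / (1 - loss).
    return baseline_seconds / (1.0 - throughput_loss)


if __name__ == "__main__":
    baseline = 600.0  # hypothetical 10-minute build on a non-nested VM
    for loss in (0.05, 0.15):
        print(f"{loss:.0%} hit: {slowed_runtime(baseline, loss):.1f} s")
```

Real workloads will vary; I/O-heavy jobs in particular may land outside this range.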
reply
firesteelrain 19 hours ago
Azure has had nested virt available for a while too. I used to run Hyper-V in the cloud
reply
whopdrizzard 13 hours ago
Azure has recently announced "direct virtualization", which is a sort of logical nesting, in which users can sub-partition their L1 VMs into virtual L2 VMs that are technically siblings.

https://techcommunity.microsoft.com/blog/azurecompute/scalin...

(I work there)

reply
firesteelrain 10 hours ago
Cool, so that’s the new and preferred model for nested or sibling virt?
reply
whopdrizzard 9 hours ago
Eventually yes; this is supposed to remove the perf tax of nested virtualization (fewer world/context switches on vm_exits) and unlock some new use cases (passing through hardware from your VM to the sibling guest).
reply
iJohnDoe 22 hours ago
Was hoping this comment would be here. Firecracker and microVMs are a good use case. Also, being able to simply test and develop is a nice-to-have.

Nested virtualization can mean a lot of things. Not just full VMs.

reply
HumanOstrich 18 hours ago
> Firecracker and microVMs are a good use case.

Good use-case for what?

reply
sorenbs 14 hours ago
We operate a postgres service on Firecracker. You can create as many databases as you want, and we memory-snapshot them after 5 seconds of inactivity, and spin them up again in 50ms when a query arrives.
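
The snapshot-on-idle, restore-on-query pattern described here can be sketched as a small state machine. This is a toy model under stated assumptions, not the actual service's implementation: the class name, method names and injected clock are all hypothetical, and the real pause/snapshot/restore calls are reduced to state changes.

```python
import time
from typing import Callable


class IdleSnapshotter:
    """Toy state machine: snapshot a VM after `idle_after` seconds without queries."""

    def __init__(self, idle_after: float = 5.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.idle_after = idle_after
        self.clock = clock
        self.state = "running"
        self.last_activity = clock()
        self.restores = 0

    def tick(self) -> None:
        """Called periodically; snapshots the VM once it has been idle long enough."""
        idle_for = self.clock() - self.last_activity
        if self.state == "running" and idle_for >= self.idle_after:
            # A real system would pause the microVM and write its memory out here.
            self.state = "snapshotted"

    def on_query(self) -> None:
        """Called when a query arrives; restores from the snapshot if needed."""
        if self.state == "snapshotted":
            # A real system would load the memory snapshot and resume the VM here.
            self.state = "running"
            self.restores += 1
        self.last_activity = self.clock()
```

Injecting the clock keeps the idle logic testable without actually waiting five seconds.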

https://www.prisma.io/postgres

reply
adobrawy 17 hours ago
Nowadays the universal answer for "what? why?" is AI. AI agents need VMs to run generated code in a sandbox, as the code cannot be trusted.
reply
HumanOstrich 17 hours ago
I don't think everyone should assume that AI is the answer to all questions. I was asking the person I replied to, thanks.
reply
ushakov 8 hours ago
We are running Sandboxes for AI Agents using Firecracker microVMs @ E2B
reply
j45 11 hours ago
The poster you asked can reply too - Postgres and microvms are worth considering nearly every time at the start.

Beyond encapsulation it greatly increases the portability of the software between environments and different clouds.

reply
BobbyTables2 21 hours ago
Is nested VMX virtualization in the Linux kernel really that stable?

The technical details are a lot more complex than most realize.

Single level VMX virtualization is relatively straightforward even if there are a lot of details to juggle with VMCS setup and handling exits.

Nested virtualization is a whole other animal as one now also has to handle not just the levels but many things the hardware normally does, plus juggling internal state during transitions between levels.

The LKML is filled with discussions and debates where very sharp contributors are trying to make sense of how it would work.

Amazon turning the feature on is one thing. It working 100% perfectly is quite another…

reply
matheus-rr 16 hours ago
Fair concern, but this has been quietly production-stable on GCP and Azure since 2017 — that's 8+ years at cloud scale. The LKML debates you're referencing are mostly about edge cases in exotic VMX features (nested APIC virtualization, SGX passthrough), not the core nesting path that workloads like Firecracker and Kata actually exercise.

The more interesting signal is that AWS is restricting this to 8th-gen Intel instances only (c8i/m8i/r8i). They're likely leveraging specific microarchitectural improvements in those chips for VMCS shadowing — picking the hardware generation where they can guarantee their reliability bar rather than enabling it broadly and dealing with errata on older silicon. That's actually the careful engineering approach you'd want from a cloud provider.

reply
HumanOstrich 18 hours ago
It's been around for almost 15 years and stable enough for several providers to roll it out in production the past 10 years (GCP and Azure in 2017).

AWS is just late to the game because they've rolled so much of their own stack instead of adapting open source solutions and contributing back to them.

reply
otterley 10 hours ago
> AWS is just late to the game because they've rolled so much of their own stack instead of adapting open source solutions and contributing back to them.

This is emphatically not true. Contributing to KVM and the kernel (which AWS does anyway) would not have accelerated the availability.

EC2 is not just a data center with commodity equipment. They have customer demands for security and performance that far exceed what one can build with a pile of OSS, to the extent that they build their own compute and networking hardware. They even have CPU and other hardware SKUs not available to the general public.

reply
briffle 7 hours ago
As do all the other cloud providers that have had this for years, like GCP and Azure, for 9 years now.
reply
otterley 6 hours ago
Architecturally they’re all quite different.

If my sources are correct, GCP did not launch on dedicated hardware like EC2 did, which raised customer concerns about isolation guarantees. (Not sure if that’s still the case.) And Azure didn’t have hardware-assisted I/O virtualization ("Azure Boost") until just a few years ago and it's not as mature as Nitro.

Even today, Azure doesn’t support nested virtualization the way one might ordinarily expect them to. It's only supported with Hyper-V on the guest, i.e., Windows.

reply
laurencerowe 6 hours ago
Nested virtualisation with KVM works on the Linux GitHub Actions runners which I believe run on Azure.
reply
otterley 6 hours ago
GitHub says:

> While nested virtualization is technically possible while using runners, it is not officially supported. Any use of nested VMs is experimental and done at your own risk, we offer no guarantees regarding stability, performance, or compatibility.

https://docs.github.com/en/actions/concepts/runners/github-h...

reply
laurencerowe 2 hours ago
It seems to work for my https://github.com/libriscv/kvmserver tests at least.
reply
leetrout 21 hours ago
> Nested virtualization is supported only on 8th generation Intel-based instance types (c8i, m8i, r8i, and their flex variants). When nested virtualization is enabled, Virtual Secure Mode (VSM) is automatically disabled for the instance.
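
As a minimal sketch, the restriction quoted above can be turned into a quick check on an instance type string. The family list comes straight from the quote; the "-flex" naming (e.g. "m8i-flex.large") and the function name are assumptions, and other families reported elsewhere in this thread are deliberately not included.

```python
# Families named in the quoted note.
NESTED_VIRT_FAMILIES = {"c8i", "m8i", "r8i"}


def supports_nested_virt(instance_type: str) -> bool:
    """Return True if the instance type belongs to a family listed in the quoted note."""
    # "m8i-flex.large" -> "m8i-flex"; "c8i.xlarge" -> "c8i"
    family = instance_type.lower().split(".", 1)[0]
    # Assumption: flex variants are named "<family>-flex.<size>".
    if family.endswith("-flex"):
        family = family[: -len("-flex")]
    return family in NESTED_VIRT_FAMILIES


if __name__ == "__main__":
    for name in ("c8i.large", "m8i-flex.xlarge", "m7i.large"):
        print(name, supports_nested_virt(name))
```

Treat the list as a snapshot of the quote, not an authoritative source; the official documentation is the place to confirm current support.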
reply
sitole 24 hours ago
Support for nested virtualization has been added to the main SDKs. In the us-west-2 region, you can already see the "Nested Virtualization" option and use it with the new M8id, C8id, and R8id instance types.

This is really big news for micro-VM sandbox solutions like E2B, which I work on.

reply
blaz0 19 hours ago
This will make it easier to run automated tests in the Android emulator in CI using regular runners. It was a pain dealing with bare-metal instances just for that.
reply
fersarr 6 hours ago
When will AWS add a statement about being bound to professional secrecy (e.g. s203 in Germany) so we can use the LLM endpoints for sensitive industries? https://repost.aws/es/questions/QUOuFPk9TLSUuClI_wYNmVCQ/ser...
reply
ohthehugemanate 21 hours ago
I wonder if this is connected to Azure launching OpenShift Virtualization on "Boost" SKUs? There are a lot of VMware customers going to OpenShift Virt, and apparently the CPU/memory overhead on Azure maxes out around 10% under full load... but then Hyper-V has been doing a lot of work on it. No idea if Nitro includes any of the KVM-on-KVM passthrough of full KVM, to give it an edge here.
reply
firesteelrain 19 hours ago
Azure has had nested virt for a while - maybe it’s related to OpenShift but you could run OpenShift on Azure for some time. I used to run Hyper-V in Azure on certain SKUs
reply
wmf 20 hours ago
Azure? OpenShift? "I don't think about you at all." — Matt Garman probably
reply
_zoltan_ 16 hours ago
you might not but a lot of very big enterprises use openshift on azure.
reply
gerdesj 23 hours ago
Could someone explain why this might be a big deal?

I remember playing with nested virty some years ago and deciding it is a backwards step except for PoC and the like. Given I haven't personally run out of virty gear, I never needed to do a PoC.

reply
paulfurtado 23 hours ago
It is great for isolation. There are so many VM-based containerization solutions at this point, like Kata Containers, gVisor, and Firecracker. With Kata, your Kubernetes pods run in isolated VMs. It also opens the door for live migration of apps between EC2 instances, making some kinds of maintenance easier when you have persistent workloads. Even if not for security, there are so many ways a workload can break a machine such that you need to reboot or replace it (like detaching an EBS volume with a mounted XFS filesystem at the wrong moment).

The place I've probably wanted it the most though is in CI/CD systems: it's always been annoying to build and test system images in EC2 in a generic way.

It also allows for running other third party appliances unmodified in EC2.

But also, almost every other execution environment offers this: GCP, VMware, KVM, etc, so it's frustrating that EC2 has only offered it on their bare metal instance types. When EC2 was using Xen 10+ years ago, it made sense, but they've been on KVM since the inception of Nitro.

reply
iangudger 19 hours ago
One of the big benefits that gVisor offers is that it doesn't require nested virtualization (or any virtualization). They released a new version that improves performance when not using virtualization a while back: https://gvisor.dev/blog/2023/04/28/systrap-release/
reply
firesteelrain 19 hours ago
When you run nested virt, you can do multicast in Cloud between the nested VMs. You can’t do multicast across VMs inside the Cloud.

Basically you set up a small LAN with Hyper-V or something similar (I have only done it with Hyper-V)

reply
PunchyHamster 15 hours ago
It's when you want to do stuff with your own VMs and don't want to pay extra for bare metal machine, basically.

There is no real reason to use it on hardware you own; but in the case of cloud you don't always have enough to do to justify paying for an entire server

reply
leoc 19 hours ago
Hopefully it means that you can finally run a network simulator like GNS3 https://www.gns3.com/ in an AWS instance.
reply
UltraSane 23 hours ago
You can now run VMs inside a cheaper AWS instance instead of having to pay for an entire bare-metal instance. This is useful for things like network simulation where you use QEMU to emulate network hardware.
reply
dboreham 21 hours ago
If you have some workload that creates VMs, now you can run that workload on EC2 rather than having to use bare metal or some other provider that allows nested virtualization. There are many many such workloads. Just to give one example: testing a build system that spins up VMs to host CI jobs.
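
A workload like this usually wants to know up front whether the instance it landed on can actually start VMs. A minimal sketch for an x86 Linux guest, assuming the usual signals: a `vmx` (Intel) or `svm` (AMD) flag in `/proc/cpuinfo` and a readable, writable `/dev/kvm`. The function names are hypothetical, and passing this check does not guarantee any particular VMM will work.

```python
import os


def cpu_virt_flags(cpuinfo_text: str) -> set:
    """Return the hardware virtualization flags ("vmx", "svm") found in /proc/cpuinfo text."""
    found = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            found |= {"vmx", "svm"} & set(value.split())
    return found


def kvm_usable(cpuinfo_path: str = "/proc/cpuinfo", kvm_device: str = "/dev/kvm") -> bool:
    """Best-effort check that this machine exposes hardware virtualization and /dev/kvm."""
    try:
        with open(cpuinfo_path) as f:
            flags = cpu_virt_flags(f.read())
    except OSError:
        return False
    return bool(flags) and os.access(kvm_device, os.R_OK | os.W_OK)


if __name__ == "__main__":
    print("KVM usable:", kvm_usable())
```

A CI job could run this first and fall back to a slower emulation path, or fail fast, when it returns False.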
reply
blibble 23 hours ago
welcome AWS to 2018!
reply
ssl-3 23 hours ago
Yep. It's pretty boring. I've been using it at home for years and years with libvirt on very not-special consumer hardware. I guess the AWS clown is finally catching up on this one little not-new-at-all thing.
reply
otterley 22 hours ago
I was an Amazon EC2 Specialist SA in a prior role, so I know a little about this.

If EC2 were like your home server, you might be right. And an EC2 bare metal instance is the closest approximation to that. On bare metal, you've always been free to run your own VMs, and we had some customers who rolled their own nested VM implementations on it.

But EC2 is not like your home server. There are some nontrivial considerations and requirements to offer nested virtualization at cloud scale:

1. Ensuring virtualized networking (VPC) works with nested VMs as well as with the primary VM

2. Making sure the environment (VMM etc) is sufficiently hardened to meet AWS's incredibly stringent security standards so that nesting doesn't pose unintended threats or weaken EC2's isolation properties. EC2 doesn't use libvirt or an off-the-shelf KVM. See https://youtu.be/cD1mNQ9YbeA?si=hcaZaV2W_hcEIn9L&t=1095 and https://youtu.be/hqqKi3E-oG8?si=liAfollyupYicc_L&t=501

3. Ensuring performance and reliability meets customer standards

4. Building a rock-solid control plane around it all

It's not a trivial matter of flipping a bit.

reply
ssl-3 21 hours ago
There's no better way to get good information that is right, than to say something that is misguided and/or wrong.

Thanks for the well-reasoned response.

reply
QuinnyPig 22 hours ago
I always enjoy the color you add to these conversations. Thanks!
reply
sien 21 hours ago
I always enjoy the color you add to these conversations in your newsletter.

It's provided many a chuckle.

Thanks!

reply
raw_anon_1111 22 hours ago
Seriously curious, don’t Firecracker VMs already run on EC2 instances under the hood when they host Lambda and Fargate?
reply
wmf 21 hours ago
Since I don't work for AWS I'm allowed to say that at the scale of millions/billions of microVMs you're better off running them on bare metal instances to avoid the overhead of nested virtualization.
reply
otterley 20 hours ago
I used to work for AWS and I’m allowed to say the same thing. ;-)
reply
raw_anon_1111 10 hours ago
If I remember correctly, Firecracker VMs don’t have the same security guarantees as EC2 instances. I think I remember that AWS doesn’t put multiple accounts' Lambdas either on the same bare metal server or VM. I can’t remember which
reply
otterley 22 hours ago
Unfortunately I'm not at liberty to dive deep into those details. I will say that Firecracker can be used on bare metal EC2 instances, whether you're a public customer or AWS itself. :-)
reply
raw_anon_1111 21 hours ago
I guess I should have peeked at the source code when I was there…
reply
rescbr 20 hours ago
No need, at least when I was there when the day was still one, before the pandemic. And well, Firecracker is open source.

A few of the best technical presentations that I've watched were at a pre-SKO event. Nitro, Graviton and Firecracker.

Great engineering pieces, the three of them.

reply
PunchyHamster 15 hours ago
All that sounds like it would better be a contribution to KVM from the get go rather than invent stuff that eventually showed up in KVM anyway
reply
blibble 8 hours ago
it's been in kvm since the mid 10s

and in Xen (which they used to run) for at least as long

reply
PunchyHamster 25 minutes ago
I meant in general, as the linked presentation talked about many features, not just nested virt
reply
sitole 22 hours ago
Nitro is very interesting stuff
reply
tryauuum 15 hours ago
the only thing I know about nested virtualization is from the libvirt/KVM world too:

* you are right, it just works

* but there were scary notes about the stuff which might happen when you live migrate a virtual machine between hypervisors and the machine has nested virtual machines inside it. I remember the words "neither safe nor secure"
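
For the libvirt/KVM case, whether nesting is switched on for guests is controlled on the host by the `nested` parameter of the `kvm_intel` or `kvm_amd` module, readable under `/sys/module`. A minimal sketch of reading it; the function names are hypothetical, and the `sysfs_root` argument exists only to make the check testable.

```python
from pathlib import Path
from typing import Optional


def parse_nested_flag(raw: str) -> bool:
    """Interpret the parameter value; kernels report it as "Y"/"N" or "1"/"0"."""
    return raw.strip().lower() in {"y", "1"}


def nested_enabled(sysfs_root: Path = Path("/sys/module")) -> Optional[bool]:
    """Return True/False for the loaded KVM module, or None if neither module is loaded."""
    for module in ("kvm_intel", "kvm_amd"):
        param = sysfs_root / module / "parameters" / "nested"
        if param.is_file():
            return parse_nested_flag(param.read_text())
    return None


if __name__ == "__main__":
    print("nested virtualization on this host:", nested_enabled())
```

This only reports the host-side switch; the live-migration caveats mentioned above are a separate concern.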

reply
csummers 10 hours ago
This is great news, but is there any more information about this other than an AWS SDK commit? Is this generally available?
reply
ATechGuy 23 hours ago
Would love to see performance numbers with nested virtualization, particularly that of IO-bound workloads.
reply
anentropic 14 hours ago
Is this only when using the Go SDK?
reply
otterley 9 hours ago
Nah, it’ll show up in the others in their upcoming releases. Much of the code for the SDKs is autogenerated from JSON “API shape” files: https://github.com/aws/api-models-aws

Specifically, in this case: https://github.com/aws/api-models-aws/commit/8bca88a33592ca4...

reply
api 23 hours ago
What's the performance impact for nested virtualization in general? I'd think this would be adding multiple layers of MMU overhead.
reply
dwattttt 23 hours ago
From memory, the virtualisation operations themselves aren't nested. The VM instructions interact with the external virtualisation hardware, so it's more of a cooperative situation, e.g. a guest can create & manage virtualisation structures that are run alongside it.

I don't know if this applies to the specific nested virtualisation AWS are providing though.

reply
blibble 22 hours ago
depends on the workload and how they've done it

pure CPU should be essentially unaffected, if they're not emulating the MMU/page tables in software

the difference in IO ranges from barely measurable to absolutely horrible, depending on their implementation

traps/vmexits have another layer to pass through (and back)

reply
otterley 22 hours ago
As a practical matter, anywhere from 5-15%.
reply
aliljet 21 hours ago
I wonder if this will extend SEV-SNP and TDX to the child VMs?
reply
leetrout 21 hours ago
It says VSM is automatically disabled... so I would assume not.
reply
lofties 12 hours ago
Yo dawg, I heard you like virtualisation so we put virtual servers inside of your virtual servers.
reply
amelius 12 hours ago
But I'm sure their ToS doesn't allow you to run your own cloud platform inside AWS.
reply
cthalupa 10 hours ago
Why wouldn't it? Tons of their customers are providing cloud-like offerings hosted on AWS.

They're getting paid either way.

reply
dk8996 22 hours ago
Would these things be good for openclaw, agents?
reply
CuriouslyC 22 hours ago
Yeah, though honestly if I'm deploying anything I'd just build an image with nix rather than use nested virtualization.
reply
dostick 20 hours ago
Proof that we’re living in a simulation.
reply
wmf 20 hours ago
I haven't watched The Thirteenth Floor in a while. The kids today don't even know about it.
reply
la64710 10 hours ago
How can this be replicated on prem?
reply
farklenotabot 23 hours ago
Sounds expensive for legacy apps
reply
j45 11 hours ago
It also makes me wonder how many other things I might not know that people are trying to do with cloud platforms that aren’t supported by them but have a negligible performance hit for many use cases.
reply
ilaksh 22 hours ago
I wonder if providers like Hetzner and Digital Ocean etc. will get this someday also.
reply
ubanholzer 16 hours ago
DO has had nested virtualization enabled for years.
reply
bagels 23 hours ago
"* *Feature*: Launching nested virtualization. This feature allows you to run nested VMs inside virtual (non-bare metal) EC2 instances."
reply
amne 14 hours ago
obligatory: https://www.destroyallsoftware.com/talks/the-birth-and-death...

spoiler though: I'm referencing the part where gimp is running in Wine running in asm.js in a Chrome browser running in another asm.js in Firefox

reply
igtztorrero 21 hours ago
Digital Ocean has always supported nested virtualization.
reply
dangoodmanUT 23 hours ago
hell yes, finally
reply
andrewstuart 19 hours ago
Only took them 9 years. AWS so much innovation.

Remember, “customer obsession”.

But “protect revenue first”.

reply
otterley 9 hours ago
There’s no revenue protection here. You pay the same for an instance whether you’ve subdivided it into your own VMs or not.
reply