μVMs
Slowly comprehending spectrum-os and microvm.nix, by reproducing bits and pieces using NixOS, systemd, and cloud-hypervisor.
Prior Art
People
Following the right people is one of the most effective ways to maintain a bibliography, a form of "importance sampling". The following are the people whose work, at the time of writing, I already know to watch out for. Whom am I missing?
- Alyssa Ross
  - Via [spectrum-os.org](https://spectrum-os.org/bibliography.html) and Nixpkgs.
- Demi Marie
  - Via spectrum-devel and Qubes.
- Thomas Leonard
- Astro
  - Via microvm.nix.
- Jean-Philippe
- Joanna Rutkowska
- ...
Projects
- Obvious: Qubes, Spectrum, Genode, ...
- Google ChromeOS and AWS Firecracker.
- Asahi muvm: microvms using libkrun and virtio-gpu "native context".
- AppVM: apparently available in Nixpkgs under `nixos/modules/virtualisation/appvm.nix`. Based on (NixOS,) qemu and libvirt.
- ...
Timeline
The following are the questions I'd like to eventually answer about how virtualization happened:
- Was Qubes the first attempt at isolating e.g. peripheral and network devices using virtualization?
- Was Chromium OS the first and/or the main driver for paravirtualized devices?
- ...
Choices
The following are some of the current "am I holding this right?" questions:
- Memory: it is said that Linux always needs some swap in order not to be weird; what is the general scheme for allocating memory and swap to the hypervisor and the guests? Most of the day I'm using a laptop with 8 GiB of RAM, which already leads to frequent OOM kills, even when not running VMs. (See the memory sketch after this list.)
  - It seems that hotplug memory is generally preferred to ballooning.
  - It seems that we want the hotplug "banks" to not be too small, so as to avoid fragmentation.
  - Zswap or zram on the hypervisor? On the guest?
  - How much swap to allocate on the hypervisor? On the guest?
- Startup time: LightVM claims a boot time of 2.3 ms; can we ever achieve comparable numbers with NixOS and systemd? What is the first bottleneck?
- Guest-to-guest communication: in order to implement configurations similar to Whonix (e.g. Tor in a separate VM), guests need to be able to talk to each other directly, without exposing the hypervisor to their traffic. Generally, I've heard of three solutions to guest-to-guest communication:
  - NAT via the hypervisor,
  - MACVTAP,
  - and vhost-device-vsock.

  I've only ever implemented the first. The second is something of a bridge, except that the packets (frames?) never enter the hypervisor's network stack; a MACVTAP sketch follows this list. The third I only learned about recently, from Alyssa in the Spectrum Matrix chat, and I do not yet entirely understand how it fits into the bigger picture.
- Filesystems: virtio-blk appears to be the way to allocate persistent storage for the VMs that require it. In practice this means allocating a zvol or a contiguous file on the hypervisor, to be exposed to the guest as a block device (sketched after this list). One suspicion I have is that allocating a filesystem with CoW features (e.g. btrfs, or xfs with reflinks) on top of another CoW filesystem (e.g. zfs, as in xfs-on-zvol) may have non-trivial implications for fragmentation, depending on parameters like the chunk sizes. On the other hand, use cases like `/nix/store` offer serious deduplication opportunities, and I'm generally not using `snix-store` just yet.
- ...
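To make the memory question concrete, here is a minimal sketch of the direction I currently have in mind: zram swap on the hypervisor plus virtio-mem hotplug for the guest. The `zramSwap` options are stock NixOS; the cloud-hypervisor flags follow its `--memory` syntax; every size, path, and unit name below is a made-up placeholder.

```nix
{ config, lib, pkgs, ... }:
{
  # Hypervisor side: compressed swap in RAM instead of a large swap partition.
  zramSwap = {
    enable = true;
    memoryPercent = 50; # cap the zram device at half of physical RAM
  };

  # Guest side: boot with a small amount of memory and leave room to
  # hotplug more via virtio-mem. Wrapped in a plain systemd unit here,
  # precisely to avoid depending on microvm.nix for the experiment.
  systemd.services."uvm-demo" = {
    wantedBy = [ "multi-user.target" ];
    serviceConfig.ExecStart = lib.concatStringsSep " " [
      "${pkgs.cloud-hypervisor}/bin/cloud-hypervisor"
      "--cpus boot=1"
      "--memory size=512M,hotplug_method=virtio-mem,hotplug_size=4G"
      "--kernel /path/to/vmlinux"
      "--cmdline console=ttyS0"
    ];
  };
}
```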
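For the MACVTAP option, a sketch of the hypervisor-side plumbing, assuming `eno1` as the physical NIC; the interface and unit names are made up, and the `ip link` invocations are the standard iproute2 ones.

```nix
{ pkgs, ... }:
{
  # One MACVTAP endpoint per guest, hanging off the physical NIC.
  # In "bridge" mode, endpoints on the same parent link can also reach
  # each other directly, so guest-to-guest frames never have to pass
  # through the hypervisor's own network stack.
  systemd.services."macvtap-vm0" = {
    wantedBy = [ "multi-user.target" ];
    after = [ "sys-subsystem-net-devices-eno1.device" ];
    bindsTo = [ "sys-subsystem-net-devices-eno1.device" ];
    serviceConfig = {
      Type = "oneshot";
      RemainAfterExit = true;
    };
    script = ''
      ${pkgs.iproute2}/bin/ip link add link eno1 name mvtap-vm0 type macvtap mode bridge
      ${pkgs.iproute2}/bin/ip link set mvtap-vm0 up
      # The VMM is then handed the matching /dev/tap<ifindex> character device.
    '';
    preStop = ''
      ${pkgs.iproute2}/bin/ip link del mvtap-vm0
    '';
  };
}
```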
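And for the storage question, the virtio-blk path as I currently picture it: a zvol on the hypervisor handed to cloud-hypervisor via `--disk`. Pool, dataset, and size are placeholders, and the zvol itself would be created once, out of band.

```nix
{ lib, pkgs, ... }:
{
  # The backing zvol is a one-time, imperative step on the hypervisor:
  #   zfs create -V 20G -o volblocksize=16k rpool/vms/demo-root
  #
  # cloud-hypervisor then exposes it to the guest as a virtio-blk device,
  # which the guest sees as /dev/vda.
  systemd.services."uvm-demo-with-disk" = {
    wantedBy = [ "multi-user.target" ];
    serviceConfig.ExecStart = lib.concatStringsSep " " [
      "${pkgs.cloud-hypervisor}/bin/cloud-hypervisor"
      "--kernel /path/to/vmlinux"
      "--cmdline 'console=ttyS0 root=/dev/vda'"
      "--disk path=/dev/zvol/rpool/vms/demo-root"
    ];
  };
}
```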
Why Not?
spectrum-os
...is in active development and not advertised as user-ready yet. Spectrum OS appears to be a balance-shifting project, building up towards a principled solution, which must require patience... It does not, for example, reuse NixOS systemd modules, but uses s6 instead.
microvm.nix
...is inherently static. A cynical spin on `microvm.nix` would be, and I mean it with the utmost respect, that it's a glorified qemu-flags generator, written in Nix. When using `microvm.nix` you write, for example, each TAP's `hwaddr` by hand, and then rebuild the "runner script"; something like the sketch below. When using the "fully-declarative" mode you also entangle the guest's and the hypervisor's life cycles, and double the NixOS evaluation time.
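As far as I understand microvm.nix's options (treat the exact names as my recollection of its documentation rather than as gospel; the MAC address and IDs are arbitrary), the by-hand wiring looks roughly like this:

```nix
# Guest-side fragment for a microvm.nix-managed VM.
{
  microvm = {
    hypervisor = "cloud-hypervisor";
    vcpu = 1;
    mem = 512; # MiB
    interfaces = [
      {
        type = "tap";
        id = "vm-demo";            # TAP device name on the hypervisor
        mac = "02:00:00:00:00:01"; # picked by hand, per guest
      }
    ];
  };
}
```

Changing any of these values means re-evaluating and rebuilding the runner script, which is the static flavour I mean.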
`microvm.nix` ships support for a wide selection of hypervisors, but you may only care about, e.g., `cloud-hypervisor`. An instructive reference implementation and a convenient entry point, `microvm.nix` may not be a direct or complete answer to the question "what does the life cycle of a microvm-deployed service look like?".
appvm
I only noticed the option in `man configuration.nix` a few days ago, so I just never tried it. Long-term, I'd definitely prefer not to use qemu.