Create a VM based on Ubuntu 20.04 with KVM and cloud-init

Get Ready for the Disk Feast!

If one day you find yourself pondering how to make the most of a standalone server with virtualization support, consider employing Kernel-based Virtual Machine (KVM) technology to partition your server. KVM is a solution for performing hardware-assisted full virtualization tasks on supported hardware devices.

This article uses a host with multiple IPv4 addresses as an example to show how to configure KVM-based VMs using cloud-init configuration files.

Virtualization technology

Before delving into KVM specifically, it’s worth reviewing the virtualization technologies we’ve used and how they relate to the rest of this article. In computing, virtualization refers to making a single host computer appear as multiple computers, or as one entirely different computer.

This section draws inspiration from the article “Virtual Linux” (link no longer available) on IBM developerWorks.

Hardware virtualization is a technology that hides the real computer hardware to build an abstract computing platform. In early computing, the operating system was referred to as the supervisor; an operating system capable of running on other operating systems was called a hypervisor. Virtual machine software like VirtualBox has the ability to virtualize an entire hardware computer.

Similarly, the QEMU project offers a capability known as hardware emulation. This technology can even run software designed for x86 architecture on an ARM processor host. The main issue with hardware emulation is its slow speed. Since every instruction must be emulated at the hardware level, it’s not uncommon for the speed to decrease by a factor of 100. Achieving highly accurate emulation, including cycle accuracy, emulated CPU pipeline, and cache behavior, can result in speed differences of up to 1000 times.

In terms of virtualization approaches, they are typically categorized into three types based on the degree of coordination between the host operating system and the computer hardware: full virtualization, paravirtualization, and operating system-level virtualization.

Full virtualization: In full virtualization, the hypervisor mediates between the guest operating system and the underlying hardware. Certain privileged instructions must be intercepted and handled in the hypervisor, because the low-level hardware resources are not owned by any one operating system but are shared among them through the hypervisor.

Paravirtualization: Paravirtualization integrates virtualization-related code into the guest operating system itself. It requires modification of the guest operating system for the hypervisor.

Operating system-level virtualization: Operating system virtualization implements server virtualization on top of the operating system itself. It supports a single operating system and can easily isolate independent servers from each other. Docker is an example of operating system-level virtualization.

Server aggregation or grid computing is a reverse form of virtualization that aggregates multiple servers to make them appear as a single, unified server.

Kernel-based Virtual Machine (KVM)

We fully recognize the advantages of KVM technology. KVM virtualization, based on hardware support, transforms the Linux kernel into a hypervisor using kernel modules. By making full use of features provided by the operating system kernel such as task scheduling and memory management, KVM enables oversubscription of resources on limited hardware. It also supports live migration, providing the ability to transfer running virtual machines between physical hosts without interrupting services.

The KVM in the kernel exposes virtualized hardware through the /dev/kvm character device. The guest operating system interacts with the KVM module interface through a modified QEMU process that simulates PC hardware. The KVM module introduces a new execution mode into the kernel. While the regular kernel supports kernel mode and user mode, KVM introduces a guest mode. Guest mode is used to execute all non-I/O guest code, while regular user mode supports guest I/O.
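
On a host where KVM is available, you can see both halves of this design for yourself: the KVM kernel modules and the /dev/kvm character device. A quick check looks like this (the module is kvm plus kvm_intel or kvm_amd, depending on the CPU):

lsmod | grep kvm
ls -l /dev/kvm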

The KVM code is located in the kernel repository at https://git.kernel.org/pub/scm/virt/kvm/kvm.git/ and has been part of the mainline kernel since version 2.6.20.

Configure Host Operating System

KVM requires hardware virtualization extensions (Intel VT-x or AMD-V), which most modern processors support. It’s worth noting that most Virtual Private Servers (VPS) purchased from cloud service providers do not support nested virtualization, meaning these servers cannot use KVM for further virtualization.
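
A quick way to check whether the virtualization extensions are exposed to your machine at all (bare metal or VPS) is to count the vmx (Intel) or svm (AMD) flags in /proc/cpuinfo; a result of zero means KVM acceleration will not be available:

egrep -c '(vmx|svm)' /proc/cpuinfo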

It’s easy to use the kvm-ok command in Ubuntu systems to check if the host operating system supports KVM virtualization. The kvm-ok command is included in the cpu-checker package.

# kvm-ok
INFO: /dev/kvm exists
KVM acceleration can be used

Alternatively, for a host that does not support KVM virtualization:

# kvm-ok
INFO: Your CPU does not support KVM extensions
KVM acceleration can NOT be used

To fully utilize KVM technology, you need to install the following packages:

sudo apt install qemu-kvm libvirt-daemon-system libvirt-clients bridge-utils virtinst cloud-image-utils virt-manager -y

Here is what each package provides:

  • qemu-kvm: Contains the userspace components of KVM.
  • Packages starting with libvirt*: Include tools for managing the virtualization platform.
  • bridge-utils: Contains tools for configuring network connectivity between the host and virtual machines.
  • virtinst: Contains command-line tools for creating virtual machines.
  • cloud-image-utils: Contains tools for parsing cloud-init format files (will be used later!).
  • Optionally, virt-manager: Includes GUI tools for creating and managing virtual machines.
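
Before continuing, it’s worth confirming that the libvirt daemon was started by the package installation; on Ubuntu this can be checked via systemd:

sudo systemctl status libvirtd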

Add the current user to the libvirt and kvm groups so that they can manage virtual machines without root privileges. Note that the new group membership only takes effect after logging out and back in.

sudo adduser `id -un` libvirt
sudo adduser `id -un` kvm

If the installation is successful, you can use the virsh command to view the current status of running virtual machines:

$ virsh list --all
 Id Name                 State
----------------------------------

Configure Host Networking

Typically, there are three ways to configure networking:

  1. Bridged Network: This allows VMs to bind their public IPv4 and IPv6 addresses if they need to be accessible from the public internet, suitable for standalone servers.
  2. Routed Network: If bridged network is not available, consider a routed network.
  3. NAT-based Network: If the server has limited IPv4 addresses, NAT network may be the only option.

For VMs launched using QEMU, NAT network is the automatically configured option, suitable when VMs need network access but don’t need to act as servers. If you choose NAT-based network access, you can skip this section. However, if your VMs serve as network servers and need to be accessed via public IP addresses, you need to set up a bridge according to the following steps.
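
If you do stick with the default NAT setup, you can inspect the network that libvirt creates out of the box; it typically uses a virbr0 bridge with a private subnet such as 192.168.122.0/24 (details vary between distributions):

virsh net-list --all
virsh net-dumpxml default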

Since Ubuntu 18.04 LTS, Ubuntu systems have switched to using netplan as the default network configuration method. The following example assumes your host server uses netplan and has one or more static IP addresses.

We edit the YAML file in /etc/netplan and add a bridges section. A complete example file might look like this:

network:
  version: 2
  renderer: networkd
  ethernets:
    eno1:
      addresses: [192.161.60.114/29,192.161.60.115/29,192.161.60.116/29]
      gateway4: 192.161.60.113
      nameservers:
        addresses:
          - 1.1.1.1
          - 8.8.8.8
  bridges:
    br0:
      addresses: [192.161.60.117/29,192.161.60.118/29]
      gateway4: 192.161.60.113
      interfaces:
        - eno1
      parameters:
        stp: true

In this example, the server has five IP addresses: 192.161.60.114, 192.161.60.115, 192.161.60.116, 192.161.60.117, and 192.161.60.118. We set up a virtual network interface (bridge) using the IPv4 addresses 192.161.60.117 and 192.161.60.118, so all created virtual machines can use these two addresses through this single bridge. STP is enabled on the interface. Apply the changes with netplan apply. Note that if you also assign 192.161.60.117/192.161.60.118 to eno1, the host itself will answer requests to these addresses before the VMs can.
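
To be safe, you can use netplan try first, which applies the configuration with an automatic rollback if the new settings lock you out of the server; afterwards, confirm that the bridge exists and carries the expected addresses (brctl comes from the bridge-utils package installed earlier):

sudo netplan try
sudo netplan apply
ip addr show br0
brctl show br0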

Next, we configure libvirt to use the created bridge. First, we stop and delete the default NAT network configuration:

virsh net-destroy default
virsh net-undefine default

Then we create a temporary file named host-bridge.xml with the following content:

<network>
  <name>host-bridge</name>
  <forward mode="bridge"/>
  <bridge name="br0"/>
  <model type='virtio'/>
</network>

Apply the bridge configuration to libvirt:

virsh net-define host-bridge.xml
virsh net-start host-bridge
virsh net-autostart host-bridge

You can verify that these configurations are correct by using the command virsh net-list --all.
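
If everything worked, the bridge network should be listed as active and set to autostart. The output looks roughly like this (column layout varies slightly between libvirt versions):

$ virsh net-list --all
 Name          State    Autostart   Persistent
------------------------------------------------
 host-bridge   active   yes         yes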

For security and performance reasons, we recommend disabling netfilter for bridged networks on the host system. Create a new file named /etc/sysctl.d/bridge.conf and fill it with the following content:

net.bridge.bridge-nf-call-ip6tables=0
net.bridge.bridge-nf-call-iptables=0
net.bridge.bridge-nf-call-arptables=0

Create a new file named /etc/udev/rules.d/99-bridge.rules and fill it with the following content:

ACTION=="add", SUBSYSTEM=="module", KERNEL=="br_netfilter", RUN+="/sbin/sysctl -p /etc/sysctl.d/bridge.conf"

Restart the system to apply these changes.
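
A reboot is the simplest way to apply everything, but if you prefer not to restart right away, you can load the module and apply the sysctl settings by hand; the udev rule will then take care of future boots:

sudo modprobe br_netfilter
sudo sysctl -p /etc/sysctl.d/bridge.conf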

It’s important to note that if for some reason bridging is not feasible, consider using routed networking. Details can be found here.

Create VM

So far, we’ve completed all the configurations for virtualization and networking. Now, let’s use the virtual machine front-end management tool, virt-install, to create a VM. virt-install will automatically select the appropriate hypervisor program:

virt-install --name vm1 --ram=2048 --disk size=10 --vcpus 1 \
  --os-type linux --os-variant ubuntu20.04 --graphics none \
  --location 'http://archive.ubuntu.com/ubuntu/dists/focal/main/installer-amd64/' \
  --extra-args "console=tty0 console=ttyS0,115200n8" \
  --network bridge=br0,model=virtio

The default unit for the --disk size option is GiB. The --location option specifies the installation path, which can be a local ISO or CD-ROM file, or an online distribution installation directory, as shown in the example above. The --network bridge option specifies the name of the bridge network interface to use. Various images of Ubuntu distributions suitable for VMs can be found from cloud-images.ubuntu.com. If the server host has multiple IP addresses (typically not configured using DHCP), you may need to manually specify the IP address of the virtual machine itself and other network configuration options during installation.

If you want to boot the VM using a downloaded .img format image (similar to cloud service providers), you’ll need to perform some additional steps. First, most downloaded .img format images are actually in qcow2 format. We can verify this using the qemu-img command:

$ qemu-img info ubuntu-20.04-minimal-cloudimg-amd64.img
image: ubuntu-20.04-minimal-cloudimg-amd64.img
file format: qcow2
virtual size: 2.2 GiB (2361393152 bytes)
disk size: 199 MiB
cluster_size: 65536
Format specific information:
    compat: 0.10
    refcount bits: 16

The default image space is quite small. We can dynamically adjust the size of the disk image using the following command:

qemu-img resize ubuntu-20.04-minimal-cloudimg-amd64.img 25G
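
Running qemu-img info again should now report the new virtual size; the file stays small on disk because qcow2 allocates space lazily:

qemu-img info ubuntu-20.04-minimal-cloudimg-amd64.img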

Editing Cloud Config

Cloud-init is the industry standard for initializing instances of various distribution images. A cloud-init configuration file describes how a distribution image should be initialized as a VM instance. A user-friendly tutorial contributed by Justin Ellingwood is available on the DigitalOcean tutorial site. A simple cloud-init file looks like this:

#cloud-config
password: ubuntu
chpasswd: { expire: False }
ssh_pwauth: True
hostname: ubuntu

This example sets the password of the default user (ubuntu) to ubuntu and enables SSH password login. If you’re interested in the format of this configuration file, please refer to the link above.
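
If you would rather log in with an SSH key than a password, cloud-init can also inject public keys for the default user. A minimal sketch, with the placeholder key replaced by your own public key:

#cloud-config
hostname: ubuntu
ssh_pwauth: False
ssh_authorized_keys:
  - ssh-ed25519 AAAA... user@example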

The cloud-localds utility, located in the cloud-image-utils package, is used to generate the corresponding .iso initialization image from the given cloud-init formatted text file.

cloud-localds /var/lib/libvirt/images/install.iso cloud.txt

In this case, /var/lib/libvirt/images/ is the default path where libvirt saves VM disks. Therefore, our virt-install command would be:

virt-install --name vm1 --ram=2048 --vcpus 1 \
  --os-type linux --os-variant ubuntu20.04 --graphics none \
  --disk /var/lib/libvirt/images/ubuntu.img,device=disk,bus=virtio \
  --disk /var/lib/libvirt/images/install.iso,device=cdrom \
  --network bridge=br0,model=virtio \
  --import

The first time you log in to the VM through its console (TTY), you can set up the network and SSH configuration to support remote login. Use the key combination ^] to detach from the console. If the configuration is correct, you can then manage your VM via SSH and other methods.

For reference, here’s an example of configuring a static IP address using netplan in the guest operating system:

network:
    ethernets:
        enp1s0:
            addresses: [x.x.x.x/x]
            gateway4: x.x.x.x
            nameservers:
                addresses: [1.1.1.1, 8.8.8.8]
            match:
                macaddress: xx:xx:xx:xx:xx:xx
            set-name: enp1s0
    version: 2
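
After saving this file inside the guest (for example under /etc/netplan/), apply it and confirm that the VM can reach the outside world, along these lines:

sudo netplan apply
ip addr show enp1s0
ping -c 3 1.1.1.1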

It’s worth noting that, among the Ubuntu cloud images, the “minimal” image series does not support cloud-init configuration as described above. Please use the full cloud images instead.

Ongoing Maintenance

Here are some common commands used to maintain VMs. A complete list of commands can be found in virsh(1).

Command                  Description
virsh list --all         List all VMs
virsh dominfo <vm>       Retrieve basic information about a VM
virsh start <vm>         Start a VM
virsh console <vm>       Attach to the console of a VM
virsh shutdown <vm>      Shut down a VM
virsh autostart <vm>     Start a VM automatically when the host boots
virsh undefine <vm>      Remove a VM

Since I had never done anything like this before, it took me a few days to fully understand and implement virtualization, and I ran into basically every technical issue mentioned in this article. Hitting bumps along the road is painful and takes genuine interest to push through, but when you finally succeed, it’s as exciting as reaching another milestone in life.

– @DGideas