Virtualization and cloud technologies

Jakub Klinkovský

:: Czech Technical University in Prague
:: Faculty of Nuclear Sciences and Physical Engineering
:: Department of Software Engineering

Academic year 2024-2025

What is a "cloud"?

What is the cloud?


Let's try an abstract definition:

In the context of computing, the cloud is a network of resources that can be accessed over the internet and used to store, manage, and process data.

The cloud is about technologies, but in the modern sense, the term is also closely tied to various business models. Note that cloud users are often companies rather than individual people.

Origins of the word

  • The ☁️ symbol was often used in diagrams and flowcharts to depict a network or the Internet. It symbolizes a vast, amorphous, and interconnected system that can be accessed from anywhere.
  • The term cloud computing as we know it today started to gain prominence around 2006, when Amazon launched Amazon Web Services (AWS), which included the Elastic Compute Cloud (EC2). This was one of the earliest known uses of the term in a commercial context.
  • Since then, the term cloud has become widely adopted and is now a fundamental concept in the technology industry, representing a wide range of services.
  • Note that any technology usually becomes widespread only several years after its invention.

Example: what is needed to deploy a website

Let's say you have some money and want to make more money.
You have a product – code of a super cool new website, text content, multimedia, etc.
What do you need to deliver it to your users and how?

  • hardware – data processing resources and storage capacity
  • internet connectivity – public IP address and domain
  • software environment – operating system, database system, etc.
  • security – authentication system, payment gateway, etc.
  • applications – components of your website
  • content – text and multimedia that users want

Cloud service models

Comparison of on-premise and typical cloud service models (NIST 2011):

[Figure: on-premise vs. IaaS, PaaS, and SaaS]

Details of cloud service models

  • On-premise: you (or your company) have to manage everything
  • Infrastructure as a service (IaaS):
    • hardware, electricity, storage, and networking are managed by a provider
    • the customer (you or your company) manages the full software stack (including the operating system)
  • Platform as a service (PaaS):
    • the provider manages everything needed to run a specific type of application
      (i.e., the platform)
    • for example, Google App Engine supports applications written in Go, PHP, Java, Python, Node.js, .NET, and Ruby
  • Software as a service (SaaS): the customer cannot run arbitrary software, but is responsible only for the management of content and users
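
The split of responsibilities described above can be sketched as a small lookup table. This is only an illustrative simplification: the layer names and the exact boundaries are assumptions made for this example, not the wording of the NIST definition.

```python
# Layers are ordered from the bottom of the stack to the top.
# The layer names are an illustrative simplification (an assumption).
LAYERS = [
    "hardware",
    "networking",
    "operating system",
    "runtime platform",
    "application",
    "content and users",
]

# Number of layers, counted from the bottom, that the provider manages.
PROVIDER_MANAGED = {
    "on-premise": 0,
    "IaaS": 2,
    "PaaS": 4,
    "SaaS": 5,
}


def responsibilities(model: str) -> dict:
    """Split the stack into provider-managed and customer-managed layers."""
    boundary = PROVIDER_MANAGED[model]
    return {
        "provider": LAYERS[:boundary],
        "customer": LAYERS[boundary:],
    }


for model in PROVIDER_MANAGED:
    split = responsibilities(model)
    print(f"{model:>10}: customer manages {', '.join(split['customer'])}")
```

Moving from on-premise towards SaaS simply shifts the boundary upwards, so the customer is left with fewer layers to manage.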

Types of cloud computing

There are four types of cloud computing:

  1. Public cloud – infrastructure is not owned by the end user, but provided by a public provider (the largest ones are AWS, Google Cloud, IBM Cloud, and Microsoft Azure)
  2. Private cloud – a cloud environment dedicated solely to a single customer, typically running behind that customer's firewall.
  3. Hybrid cloud – a seemingly single environment created from multiple environments connected through various networks (LAN, WAN, VPN) and/or APIs.
    • E.g. private + public, private + private, public + public
  4. Multicloud – the use of multiple separate cloud environments, e.g. to improve security and/or performance.

Essential cloud technologies

Every cloud is built using two fundamental technologies:

  • Internet
  • Virtualization

Moreover, the cloud makes it possible to build more modern technologies, particularly software.

Virtualization

Virtualization is the process of simulating some real effect, condition, or object with some imitated alternative.

Virtualization in computing

In the context of computing, virtualization means creating a virtual version of common computing hardware at the same abstraction level.


Virtual machine components

  • Virtual CPU: an imitation of a physical CPU, allowing multiple virtual machines to share the same physical CPU resources.
    • The number of cores and speed of the virtual CPU can be configured.
  • Virtual RAM: an allocation of physical RAM that is dedicated to a virtual machine.
    • The size of the virtual RAM can be configured.
  • Virtual disk: a file or set of files that represents a physical disk drive to the guest operating system.
    • The size, file format and storage location in the host OS can be configured.
  • Virtual network interface: a software emulation of a physical network interface card (NIC).
    • The interface type, communication protocols and topology can be configured.
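
As a rough sketch, the configurable components listed above can be collected in a small data structure and turned into a command line. QEMU is used here only as one familiar example and is an assumption of this sketch; `VMConfig`, `qemu_args`, and the default values are hypothetical names chosen for illustration. The function only builds the argument list and does not start anything.

```python
from dataclasses import dataclass


@dataclass
class VMConfig:
    """Hypothetical description of a virtual machine's virtual hardware."""
    vcpus: int = 2                     # number of virtual CPU cores
    ram_mib: int = 2048                # virtual RAM size in MiB
    disk_path: str = "disk.qcow2"      # location of the disk image on the host
    disk_format: str = "qcow2"         # file format of the disk image
    nic_model: str = "virtio-net-pci"  # type of the virtual network interface


def qemu_args(cfg: VMConfig) -> list:
    """Build a QEMU argument list from the configuration (nothing is executed)."""
    return [
        "qemu-system-x86_64",
        "-smp", str(cfg.vcpus),
        "-m", str(cfg.ram_mib),
        "-drive", f"file={cfg.disk_path},format={cfg.disk_format}",
        "-nic", f"user,model={cfg.nic_model}",
    ]


print(" ".join(qemu_args(VMConfig(vcpus=4, ram_mib=4096))))
```

Each field corresponds to one of the virtual components: changing `vcpus` or `ram_mib` changes what the guest sees, without touching the physical hardware.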

Why virtualization?

  • Division of real resources – increased flexibility and efficiency
  • Isolation of virtual resources – increased security

Hardware virtualization

Hardware virtualization means creating a virtualized hardware environment for running multiple virtual machines (VMs) on a single physical machine.

There are two approaches, emulation and hardware-assisted virtualization:

  • Emulation: Software-only approach, where the virtualization software (hypervisor) emulates the hardware environment for each VM, translating the guest's machine instructions into instructions that the host can execute.
    • Any hardware can be emulated (e.g. old or exotic platforms), but it is slow.
  • Hardware-assisted virtualization: The hypervisor can use specialized hardware features, such as processor instructions (Intel VT-x or AMD-V), to assist the virtualization process, providing a more efficient and secure way to run VMs.
    • The virtual hardware has the same architecture as the physical hardware.
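
Whether hardware-assisted virtualization is available can be checked from the CPU feature flags. The following is a minimal sketch, assuming a Linux host on x86, where `/proc/cpuinfo` lists the flag `vmx` for Intel VT-x and `svm` for AMD-V. The function name is hypothetical, and it takes the file content as a string so that it can be tried on any text.

```python
from typing import Optional


def virtualization_extension(cpuinfo_text: str) -> Optional[str]:
    """Return "Intel VT-x", "AMD-V", or None based on the CPU feature flags."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags = set(value.split())
            if "vmx" in flags:
                return "Intel VT-x"
            if "svm" in flags:
                return "AMD-V"
    return None


# Shortened example input (real files list many more flags per core).
sample = "processor : 0\nflags : fpu vme de pse vmx sse sse2\n"
print(virtualization_extension(sample))
```

On a real Linux system, the content of `/proc/cpuinfo` would be passed to the function instead of the sample string. Note that the flag can be missing even on capable hardware if the feature is disabled in the firmware settings.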

Hardware virtualization levels

The following modes are used to categorize virtualization technologies based on their level of hardware virtualization and guest OS modification:

  • Full virtualization
  • Para-virtualization
  • Hybrid virtualization

Note: These modes are not mutually exclusive. In practice, many solutions are hybrid to achieve optimal performance and compatibility.

Full virtualization

  • The hypervisor provides a complete, virtualized hardware environment for each guest OS
  • Guest OS is not aware that it is running on a virtual machine and requires no modifications
  • Examples: VMware, VirtualBox, KVM

Para-virtualization

  • The guest OS is modified to be aware of the virtualized environment and communicates directly with the hypervisor
  • Guest OS requires modifications to run on the virtual machine
  • Examples: Xen (with modified guest kernels, e.g. Linux)

Hybrid virtualization

  • Combines elements of full virtualization and para-virtualization
  • The hypervisor provides a virtualized hardware environment, but the guest OS can also communicate with the hypervisor for improved performance
  • Examples: Xen (with unmodified Windows guests, using a special driver), KVM with para-virtualized drivers (e.g. graphics, block devices, or network)

Hardware virtualization types

Virtualization can be categorized into two main types based on the location and architecture of the hypervisor:

[Figure: Type-1 and Type-2 hypervisor architectures]

Characteristics

  • Type-1 Virtualization (Bare-Metal):
    • Hypervisor runs directly on the host machine's hardware
    • No underlying operating system
    • Direct access to hardware resources
    • Examples: VMware ESXi, Microsoft Hyper-V, KVM
  • Type-2 Virtualization (Hosted):
    • Hypervisor runs on top of an existing host operating system
    • Host OS manages hardware resources
    • Hypervisor runs as an application on top of the host OS
    • Examples: VMware Workstation, VirtualBox, Parallels Desktop

Key differences

  • Performance: Type-1 generally offers better performance due to direct access to the underlying hardware
  • Complexity: Type-2 is often easier to install and manage, as it runs on top of an existing OS
  • Security: Type-1 can provide better security, as the hypervisor has direct control over hardware resources

Note: The choice of virtualization type depends on the specific use case, such as server virtualization, desktop virtualization, or development environments.

Note: emulation vs hardware-assisted virtualization, full vs para-virtualization, and Type-1 vs Type-2 are three independent concepts.

Ask your AI: Is Xen Type-1 or Type-2?

Xen is a bit of a special case, because it can look like a Type-2 hypervisor depending on how it is installed:

  • Xen (Type-1): The Xen hypervisor boots directly on the bare metal, before any operating system, and runs all operating systems as guests (domains).
  • Xen (apparent Type-2): Xen can also be installed from the packages of an existing Linux distribution, such as Debian or Ubuntu. After a reboot, however, the hypervisor is loaded first and the Linux system runs on top of it as the privileged management domain (dom0).

So, Xen is a Type-1 hypervisor, even when it is installed and managed from a Linux system.

Ask your AI: Is KVM Type-1 or Type-2?

KVM (Kernel-based Virtual Machine) is a bit of a special case as well, because it runs as a module within the Linux kernel.

  • Arguments for KVM being Type-1:
    • Runs directly on the host machine's hardware
    • Linux kernel serves as a platform for KVM, rather than a traditional host OS
  • Arguments for KVM being Type-2:
    • Relies on the Linux kernel to manage hardware resources
    • Linux kernel acts as a layer between KVM and the hardware, providing its device drivers and interfaces to interact with the hardware

The distinction between Type-1 and Type-2 hypervisors can be blurry, and different interpretations are possible depending on the context and criteria used.
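
Since KVM runs as a module within the Linux kernel, its presence can be observed from the list of loaded modules. The sketch below assumes a Linux host where `/proc/modules` starts each line with the module name, and it looks for the modules `kvm`, `kvm_intel`, and `kvm_amd`. The function name is hypothetical, and a kernel with KVM compiled in (rather than loaded as a module) would not show up in this list.

```python
KVM_MODULES = {"kvm", "kvm_intel", "kvm_amd"}


def loaded_kvm_modules(proc_modules_text: str) -> set:
    """Return the KVM-related kernel modules found in /proc/modules content."""
    loaded = set()
    for line in proc_modules_text.splitlines():
        fields = line.split()
        # The first field of each line is the module name.
        if fields and fields[0] in KVM_MODULES:
            loaded.add(fields[0])
    return loaded


# Shortened example input in the /proc/modules format.
sample = (
    "kvm_intel 380928 0 - Live 0x0000000000000000\n"
    "kvm 1142784 1 kvm_intel, Live 0x0000000000000000\n"
    "ext4 983040 1 - Live 0x0000000000000000\n"
)
print(sorted(loaded_kvm_modules(sample)))
```

The same host kernel that reports these modules also runs ordinary processes, which is exactly why the Type-1 vs Type-2 classification of KVM is debated.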

Limitations of hardware virtualization

While virtualization offers many benefits, it also has some limitations:

  • Performance overhead due to emulation and/or context switching
  • Complexity in managing and configuring virtualized environments
    (need to manage complete operating systems)
  • Limited support for certain hardware devices or legacy systems

These limitations have led to the development of alternative technologies that are lightweight and efficient, but limited in terms of isolation and security. They are known as operating system virtualization or containerization.

Operating system virtualization

Operating system virtualization or containerization is a lightweight and efficient way to deploy and manage application-level software:

  • Containers are isolated environments that run on a single host operating system.
  • All containers share the same kernel with the host operating system, but have their own user space (application-level software).
  • Containers are portable and can be easily moved between hosts, because they bundle their own dependencies and need only a compatible kernel and container runtime on the host.

Traditionally, operating systems have a fixed combination of kernel and user space. Containers change the user space into a swappable component. This means that the entire user space portion of an operating system, including programs and custom configurations, can be independent of the host operating system.
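
The separation of kernel and user space can be made visible with a short script. This is a minimal sketch, assuming a Linux system that describes its user space in the os-release format (usually the file `/etc/os-release`); the function names are hypothetical. Run inside a container, such a script would report the distribution of the container image, while the kernel release would be the one of the host.

```python
import platform


def parse_os_release(text: str) -> dict:
    """Parse KEY=value lines in the os-release format into a dictionary."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        info[key] = value.strip().strip('"')
    return info


def describe_environment(os_release_text: str) -> str:
    """Combine the user-space distribution name with the running kernel release."""
    distro = parse_os_release(os_release_text).get("PRETTY_NAME", "unknown")
    return f"user space: {distro}, kernel: {platform.release()}"


# Example input; on a real system this would be the content of /etc/os-release.
sample = 'NAME="Alpine Linux"\nID=alpine\nPRETTY_NAME="Alpine Linux v3.19"\n'
print(describe_environment(sample))
```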

Hardware virtualization vs OS-virtualization

| | Hardware virtualization | Operating system virtualization |
|---|---|---|
| Virtualization level (resource allocation) | Allocates hardware resources (CPU, RAM, disk, etc.) to virtual machines | Allocates OS resources (processes, memory, file systems, etc.) to isolated environments |
| Guest OS | Can run multiple, independent operating systems | Runs isolated environments in a single operating system with a shared kernel and hardware |
| Performance | Higher overhead due to separate operating systems | Lower overhead (no emulation, shared operating system) |
| Security | Provides strong isolation and security between virtual machines | Provides process-level isolation, but may not be as secure as hardware virtualization |
| Examples | VMware, KVM, Xen, etc. | Docker, Linux Containers, OpenVZ, etc. |

Containerization architecture
