Operating system-level virtualization

From Wikipedia, the free encyclopedia

Operating system-level virtualization is a server virtualization method in which the kernel of an operating system allows multiple isolated user-space instances instead of just one. Such instances (often called containers, VEs, VPSs or jails) may look and feel like a real server from the point of view of their owners. On Unix systems, this technology can be thought of as an advanced implementation of the standard chroot mechanism. In addition to isolation mechanisms, the kernel often provides resource management features to limit the impact of one container's activities on the other containers.
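The chroot idea that this technology builds on can be illustrated with a small sketch. The code below is not a real chroot; it only models the path confinement that the kernel performs, under the assumptions that symbolic links are ignored and that the process's working directory is the new root. The helper name resolve_in_root and the directory /srv/jail are hypothetical.

```python
import posixpath


def resolve_in_root(new_root: str, path: str) -> str:
    """Return the host path that `path` names for a process confined to `new_root`.

    Hypothetical helper for illustration only: it ignores symbolic links and
    assumes the working directory of the confined process is the new root.
    """
    # Anchor the path at "/" as the confined process sees it. normpath drops
    # any ".." that would climb above "/", mirroring the rule that ".." in the
    # root directory refers to the root itself.
    inside = posixpath.normpath("/" + path.lstrip("/"))
    # Graft the confined view onto the host directory that serves as its root.
    return posixpath.normpath(new_root.rstrip("/") + inside)


if __name__ == "__main__":
    # "/srv/jail" is an assumed example directory, not taken from the article.
    print(resolve_in_root("/srv/jail", "/etc/passwd"))
    print(resolve_in_root("/srv/jail", "../../etc/passwd"))
```

Both calls resolve to a path under /srv/jail, which is the point of the mechanism: names used inside the instance never reach files outside its root. Containers add process, network and resource isolation on top of this file-system view.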

Uses

Operating system-level virtualization is commonly used in virtual hosting environments, where it is useful for securely allocating finite hardware resources among a large number of mutually distrusting users. It is also used, to a lesser extent, for consolidating server hardware by moving services from separate hosts into containers on a single server.

Other typical scenarios include placing several applications in separate containers for improved security, hardware independence, and added resource management features.

OS-level virtualization implementations that are capable of live migration can be used for dynamic load balancing of containers between nodes in a cluster.
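How live migration could support load balancing can be sketched with a simple greedy planner. This is an illustrative assumption, not the algorithm of any implementation listed in this article: node names, container names and load figures are hypothetical, and the function only plans moves rather than migrating anything.

```python
from typing import Dict, List, Tuple


def rebalance(nodes: Dict[str, Dict[str, float]],
              max_moves: int = 10) -> List[Tuple[str, str, str]]:
    """Plan container moves from the busiest node to the idlest one.

    `nodes` maps a node name to {container name: load}. The mapping is
    updated in place, and the planned moves are returned as
    (container, source node, destination node) tuples.
    """
    moves: List[Tuple[str, str, str]] = []
    for _ in range(max_moves):
        load = {name: sum(ctrs.values()) for name, ctrs in nodes.items()}
        src = max(load, key=load.get)
        dst = min(load, key=load.get)
        gap = load[src] - load[dst]
        # Moving a container of load s changes the gap to |gap - 2s|, so only
        # containers with 0 < s < gap narrow it.
        candidates = [(abs(gap - 2 * s), name)
                      for name, s in nodes[src].items() if 0 < s < gap]
        if not candidates:
            break  # no single move improves the balance
        _, name = min(candidates)
        nodes[dst][name] = nodes[src].pop(name)
        moves.append((name, src, dst))
    return moves


if __name__ == "__main__":
    cluster = {"a": {"c1": 4.0, "c2": 3.0, "c3": 1.0}, "b": {}}
    print(rebalance(cluster))
    print(cluster)
```

A real scheduler would also weigh migration cost, memory footprint and placement constraints; the sketch considers a single load number per container.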

Advantages and disadvantages

Overhead

This form of virtualization usually imposes little or no overhead, because programs in a virtual partition use the operating system's normal system call interface and do not need to be emulated or run in an intermediate virtual machine, as is the case with whole-system virtualizers (such as VMware and QEMU) or paravirtualizers (such as Xen and UML). It also does not require hardware assistance to perform efficiently.

Flexibility

Operating system-level virtualization is not as flexible as other virtualization approaches, since it cannot host a guest operating system different from the host's, or a different guest kernel. For example, with Linux, different distributions are fine, but other operating systems such as Windows cannot be hosted. This limitation is partially overcome in Solaris Containers by the branded zones feature, which provides the ability to run an environment within a container that emulates a Linux 2.4-based release or an older Solaris release.

Storage

Some operating-system virtualizers provide file-level copy-on-write mechanisms. (Most commonly, a standard file system is shared between partitions, and partitions that change the files automatically create their own copies.) File-level copy-on-write is easier to back up, more space-efficient and simpler to cache than the block-level copy-on-write schemes common on whole-system virtualizers. Whole-system virtualizers, however, can work with non-native file systems and can create and roll back snapshots of the entire system state.
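The file-level copy-on-write behaviour described above can be modelled in a few lines. This is a minimal in-memory sketch under stated assumptions: files are represented as byte strings in dictionaries, deletion is not handled, and the class name CowPartition and the file path are hypothetical rather than taken from any real virtualizer.

```python
from typing import Dict, List


class CowPartition:
    """One partition's view of a file tree shared with other partitions."""

    def __init__(self, shared: Dict[str, bytes]) -> None:
        self._shared = shared                 # files common to all partitions
        self._private: Dict[str, bytes] = {}  # this partition's own copies

    def read(self, path: str) -> bytes:
        # A private copy, if one exists, hides the shared file.
        if path in self._private:
            return self._private[path]
        return self._shared[path]

    def write(self, path: str, data: bytes) -> None:
        # The write lands in a private copy; the shared file stays untouched.
        self._private[path] = data

    def private_files(self) -> List[str]:
        # Only these files need per-partition storage and backup.
        return sorted(self._private)


if __name__ == "__main__":
    shared = {"/etc/motd": b"hello"}
    a, b = CowPartition(shared), CowPartition(shared)
    a.write("/etc/motd", b"changed")
    print(a.read("/etc/motd"), b.read("/etc/motd"), a.private_files())
```

Because each partition stores only the files it has changed, unmodified files exist once and can be cached once, which is the space and caching advantage the paragraph refers to.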

Example restrictions inside the container

The following actions are often prohibited:

  • Modifying the running kernel, whether by direct access or by loading kernel modules.
  • Mounting and dismounting file systems.
  • Creating device nodes.
  • Accessing raw, divert, or routing sockets.
  • Modifying kernel runtime parameters, such as most sysctl settings.
  • Changing securelevel-related file flags.
  • Accessing network resources not associated with the container.

Implementations

Mechanism | Operating system | License | Release date | File system isolation | Disk quotas | I/O rate limiting | Memory limits | CPU quotas | Network isolation | Partition checkpointing and live migration
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
chroot | most UNIX-like operating systems | Proprietary, BSD, GNU GPL, CDDL | 1982 | Yes/No[1] | No | No | No | No | No | No
FreeVPS | Linux | GNU GPL | - | Yes | Yes | No | Yes | Yes | Yes | No
iCore Virtual Accounts | Windows XP | Proprietary | 06/2008 | Yes | Yes | No | No | No | Yes | No
Linux-VServer (security context) | Linux | GNU GPL v.2 | - | Yes | Yes | Yes/No[2] | Yes | Yes | Yes | No
OpenVZ (virtualization, isolation and resource management) | Linux | GNU GPL v.2 | - | Yes | Yes | Yes[3] | Yes | Yes | Yes[4] | Yes
Proxmox Virtual Environment (includes OpenVZ for OS and KVM for full virtualization) | Linux | GNU GPL v.2 | 04/2008 | Yes | Yes | Yes | Yes | Yes | Yes | Yes
Parallels Virtuozzo Containers | Linux, Windows | Proprietary | - | Yes | Yes | Yes[5] | Yes | Yes | Yes[4] | Yes
Container/Zone | Solaris | CDDL | 01/2005 | Yes | Yes | No | Yes | Yes | Yes[4] | No[6]
FreeBSD Jail | FreeBSD | BSD | 03/2000 | Yes | No | No | No | No | Yes | No
sysjail | OpenBSD, NetBSD | BSD | - | Yes | No | No | No | No | Yes | No
WPARs | AIX | Proprietary | 10/2007 | Yes | - | - | Yes | Yes | - | Yes


Notes

  1. Root user can easily escape from chroot. Chroot was never supposed to be used as a security mechanism.
  2. Using the CFQ scheduler is said to give a separate queue per guest; however, the I/O queue is per-process, not per-guest, so containers can still generate an arbitrary amount of disk I/O.
  3. Available since kernel 2.6.18-028stable021. The implementation is based on the CFQ disk I/O scheduler, but it is a two-level scheme, so I/O priority is not per-process but per-container. See the OpenVZ wiki page "I/O priorities for VE" for details.
  4. The network is not isolated but virtualized, meaning each virtual environment can have its own IP addresses, firewall rules, routing tables and so on. Solaris Containers have isolated networks when a dedicated NIC is assigned to the container ("exclusive IP") or when using the IP Instances capability currently available in OpenSolaris. See "OpenSolaris Network Virtualization and Resource Control" and "Network Virtualization and Resource Control (Crossbow) FAQ" for details.
  5. Available since version 4.0, January 2008.
  6. Cold migration (shutdown-move-restart) is implemented.
