OpenVZ

From Wikipedia, the free encyclopedia

Developed by: Community project, supported by Parallels
Operating system: Linux
Platform: x86, x86-64, IA-64, PowerPC, SPARC
Type: OS-level virtualization
License: GNU GPL v.2
Website: http://openvz.org

OpenVZ is an operating system-level virtualization technology based on the Linux kernel and operating system. OpenVZ allows a physical server to run multiple isolated operating system instances, known as containers, Virtual Private Servers (VPSs), or Virtual Environments (VEs). It's similar to FreeBSD Jails and Solaris Zones.

Compared with virtual machines such as VMware and paravirtualization technologies such as Xen, OpenVZ is limited in that both the host and the guest OS must be Linux (although different containers can run different Linux distributions). However, OpenVZ claims a performance advantage: according to its website, the performance penalty is only 1–3% compared with a standalone server.[1] One independent performance evaluation[2] confirms this. Another shows more significant performance penalties,[3] depending on the metric used.

OpenVZ is the basis of Parallels Virtuozzo Containers, a proprietary software product provided by Parallels, Inc. OpenVZ is licensed under the GPL version 2. The OpenVZ project is supported and sponsored by Parallels.

OpenVZ is divided into a custom kernel and user-level tools.

Kernel

The OpenVZ kernel is a Linux kernel, modified to add support for OpenVZ containers. The modified kernel provides virtualization, isolation, resource management, and checkpointing.

Virtualization and isolation

Each container is a separate entity, and behaves largely as a physical server would. Each has its own:

Files: System libraries, applications, virtualized /proc and /sys, virtualized locks, etc.
Users and groups: Each container has its own root user, as well as other users and groups.
Process tree: A container sees only its own processes (starting from init). PIDs are virtualized, so that the init PID is 1, as it should be.
Network: A virtual network device, which allows a container to have its own IP addresses, as well as its own set of netfilter (iptables) and routing rules.
Devices: If needed, any container can be granted access to real devices such as network interfaces, serial ports, and disk partitions.
IPC objects: Shared memory, semaphores, messages.

Resource management

OpenVZ resource management consists of four components: two-level disk quota, fair CPU scheduler, I/O scheduler, and user beancounters. The settings for these resources can be changed while a container is running, eliminating the need to reboot.

Two-level disk quota

Each container can have its own disk quotas, measured in terms of disk blocks and inodes (roughly number of files). Within the container, it is possible to use standard tools to set UNIX per-user and per-group disk quotas.

CPU scheduler

The CPU scheduler in OpenVZ is a two-level implementation of the fair-share scheduling strategy.

On the first level, the scheduler decides which container gets the CPU time slice, based on per-container cpuunits values. On the second level, the standard Linux scheduler decides which process to run in that container, using standard Linux process priorities.

It is possible to set different cpuunits values for each container. Real CPU time will be distributed proportionally to these values.

Strict limits, such as 10% of total CPU time, are also possible.
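
The proportional distribution described above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the scheduler's actual code: it assumes every container is busy at the same time, and the container IDs and cpuunits values are made up for the example.

```python
def cpu_shares(cpuunits):
    """Map each container to its fraction of CPU time.

    cpuunits: dict of container ID -> cpuunits value (hypothetical numbers).
    Assumes all containers compete for the CPU at the same time.
    """
    total = sum(cpuunits.values())
    return {ctid: units / total for ctid, units in cpuunits.items()}


# Three hypothetical containers; 102 has twice the weight of the others.
shares = cpu_shares({101: 1000, 102: 2000, 103: 1000})
for ctid, share in sorted(shares.items()):
    print(ctid, f"{share:.0%}")
```

Only the ratio between the values matters in this sketch: doubling every cpuunits value leaves the resulting shares unchanged.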

I/O scheduler

Similar to the CPU scheduler described above, the I/O scheduler in OpenVZ is also two-level, utilizing Jens Axboe's CFQ I/O scheduler on its second level.

Each container is assigned an I/O priority, and the scheduler distributes the available I/O bandwidth according to the priorities assigned. Thus no single container can saturate an I/O channel.

User Beancounters

User Beancounters are a set of per-container counters, limits, and guarantees. About 20 parameters are meant to control all aspects of container operation and to prevent a single container from monopolizing system resources.

These resources primarily consist of memory and various in-kernel objects such as IPC shared memory segments and network buffers. Each resource is visible in /proc/user_beancounters and has five values associated with it: current usage, maximum usage (for the lifetime of a container), barrier, limit, and fail counter. The meaning of barrier and limit is parameter-dependent; in short, they can be thought of as a soft limit and a hard limit. If any resource hits the limit, its fail counter is increased. This allows the owner to detect problems by monitoring /proc/user_beancounters in the container.
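
The monitoring idea above can be sketched as a small Python parser that reports resources whose fail counter is non-zero. The sample text, its column names (held, maxheld, barrier, limit, failcnt), and the layout of one "uid:" prefix per container block are assumptions made for illustration; they are not taken from this article and should be checked against a real /proc/user_beancounters before relying on them.

```python
# Illustrative sample only: layout and numbers are assumptions, not real output.
SAMPLE = """\
Version: 2.5
       uid  resource         held  maxheld  barrier    limit  failcnt
      101:  privvmpages      5000    70000    65536    69632        3
            numfile           120      300     9312     9312        0
            numpty              1        2       16       16        0
"""


def parse_beancounters(text):
    """Return one dict per (container, resource) row found in text."""
    rows, uid = [], None
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue
        # A field such as "101:" starts a new container block.
        if fields[0].endswith(":") and fields[0][:-1].isdigit():
            uid = int(fields[0][:-1])
            fields = fields[1:]
        # Data rows are a resource name followed by five integers.
        if len(fields) != 6 or not all(f.isdigit() for f in fields[1:]):
            continue
        held, maxheld, barrier, limit, failcnt = map(int, fields[1:])
        rows.append({"uid": uid, "resource": fields[0], "held": held,
                     "maxheld": maxheld, "barrier": barrier,
                     "limit": limit, "failcnt": failcnt})
    return rows


def failed_resources(rows):
    """List (uid, resource, failcnt) for rows that hit their limit."""
    return [(r["uid"], r["resource"], r["failcnt"])
            for r in rows if r["failcnt"] > 0]


print(failed_resources(parse_beancounters(SAMPLE)))
```

On a real system the text would be read from /proc/user_beancounters instead of the embedded sample.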

Values in User Beancounters
Parameter | Meaning
lockedpages | The memory not allowed to be swapped out (locked with the mlock() system call), in pages.
shmpages | The total size of shared memory (including IPC, shared anonymous mappings, and tmpfs objects) allocated by the processes of a particular container, in pages.
privvmpages | The size of private (or potentially private) memory allocated by an application. Memory that is always shared among different applications is not included in this parameter.
numfile | The number of files opened by all processes of the container.
numflock | The number of file locks created by all processes of the container.
numpty | The number of pseudo-terminals, such as those used by an ssh session or by the screen or xterm applications.
numsiginfo | The number of siginfo structures (essentially, this parameter limits the size of the signal delivery queue).
dcachesize | The total size of dentry and inode structures locked in memory.
physpages | The total size of RAM used by the processes of the container. This is currently an accounting-only parameter. For memory pages used by several containers (mappings of shared libraries, for example), only the corresponding fraction of a page is charged to each container. The sum of the physpages usage for all containers corresponds to the total number of pages used in the system by all the accounted users.
numiptent | The number of IP packet filtering entries.

Checkpointing and live migration

A live migration and checkpointing feature was released for OpenVZ in the middle of April 2006. This makes it possible to move a container from one physical server to another without shutting it down. The process is known as checkpointing: the container is frozen and its whole state is saved to a file on disk. This file can then be transferred to another machine, where the container is unfrozen (restored); the delay is roughly a few seconds. Because state is usually preserved completely, the pause may appear to be an ordinary computational delay.

OpenVZ distinct features

Scalability

As OpenVZ employs a single-kernel model, it is as scalable as the 2.6 Linux kernel; that is, it supports up to 64 CPUs and up to 64 GB[4] of RAM (on 32-bit with PAE). A single container can scale up to the whole physical box, i.e. use all the CPUs and all the RAM.

Performance

The virtualization overhead observed in OpenVZ is limited, and is negligible in most scenarios.[2]

Density

[Figure: OpenVZ density on a 768 MB RAM box]

OpenVZ is able to host hundreds of containers on decent hardware (the main limitations are RAM and CPU).

The graph shows how the response time of a container's Apache web server depends on the number of containers. Measurements were done on a machine with 768 MB of RAM; each container was running the usual set of processes: init, syslogd, crond, sshd, and Apache. The Apache daemons were serving static pages, which were fetched by http_load, and the first response time was measured. As the number of containers grows, response time increases because of RAM shortage and excessive swapping.

In this scenario it is possible to run up to 120 such containers on 768 MB of RAM. This extrapolates linearly, so it is possible to run up to about 320 such containers on a box with 2 GB of RAM.
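
The linear extrapolation can be checked with a short calculation. The function name and defaults below are illustrative; the only inputs taken from the text are the 120 containers measured on 768 MB, and the sketch assumes RAM remains the only limiting factor.

```python
def extrapolate_containers(ram_mb, ref_containers=120, ref_ram_mb=768):
    """Scale the measured container count linearly with available RAM."""
    return ref_containers * ram_mb // ref_ram_mb


# 2 GB expressed in binary megabytes.
print(extrapolate_containers(2 * 1024))
```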

Mass-management

An administrator (i.e. root) of an OpenVZ physical server (also known as a Hardware Node or host system) can see all the running processes and files of all the containers on the system. That makes mass-management scenarios possible. By contrast, when VMware or Xen is used for server consolidation, applying a security update to 10 virtual servers requires an administrator to log in to each one and run an update procedure.

With OpenVZ, a simple shell script can update all containers at once.
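
A rough sketch of such a script, written in Python rather than shell, is shown below. It is an assumption-laden illustration: the article does not name the tools, so the use of `vzlist -H -o ctid` to list container IDs and `vzctl exec` to run a command inside each container should be checked against the installed OpenVZ utilities, and the update command itself (`yum -y update` here) depends on the distribution inside each container.

```python
import subprocess


def parse_container_ids(vzlist_output):
    """Turn whitespace-separated container IDs into a list of ints."""
    return [int(token) for token in vzlist_output.split()]


def build_update_commands(ctids, update_cmd):
    """Build one 'vzctl exec' command line per container (assumed syntax)."""
    return [["vzctl", "exec", str(ctid), update_cmd] for ctid in ctids]


def update_all(update_cmd="yum -y update"):
    """Run update_cmd in every running container; must be run as root on the host."""
    listing = subprocess.run(["vzlist", "-H", "-o", "ctid"],
                             capture_output=True, text=True, check=True)
    for cmd in build_update_commands(parse_container_ids(listing.stdout),
                                     update_cmd):
        # Keep going if one container fails, so the rest are still updated.
        subprocess.run(cmd, check=False)
```

Calling `update_all()` on the hardware node would then apply the update to each container in turn, without logging in to any of them.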

References

  1. Official OpenVZ web site, http://openvz.org/
  2. HPL-2007-59 technical report, http://www.hpl.hp.com/techreports/2007/HPL-2007-59R1.html?jumpid=reg_R1002_USEN
  3. (Ottawa) Linux Symposium Proceedings, Volume I, July 2008, http://www.linuxsymposium.org/2008/ols-2008-Proceedings-V1.pdf
  4. RAM size is specified using binary meanings for K (1024^1 instead of 1000^1), M (1024^2 instead of 1000^2), G (1024^3 instead of 1000^3), ...
