Friday, December 21, 2007

Virtualization - What's the big deal with booting an OS inside an OS?

This was a question from a friend of mine. What's the big deal about booting Linux inside Windows, or Windows inside Linux, or for that matter booting/running any operating system inside another operating system?

It's a big, big deal.. believe me.. and let me explain how.

But before I answer this question, let me explain what Virtualization is in layman's terms (for the sake of those who don't know what it is).

Virtualization is a general term which means that you emulate and present a resource while keeping the actual resource hidden in the background. In the context of this post, by virtualization I mean operating system virtualization.

OS virtualization allows you to actually boot an operating system (called the Guest OS) inside another operating system (let's call this the Host OS). While the Host OS uses the actual physical resources such as physical memory, the physical processor and physical storage (well.. this is not strictly true, because storage can itself be virtualized), the Guest OS runs like just another application inside the Host OS.

But the Guest OS is made to believe that it is actually accessing physical resources such as the processor, RAM, interrupt controller, etc., when in reality all of these are emulated by a software layer called a Virtualization Manager or Virtual Machine Monitor (VMM), better known as a Hypervisor (cool word isn't it ;) - sounds like a device's name from Star Trek!!). The Host OS doesn't have a problem with this, as it sees the Guest OS as just another application. You can have as many instances of Guest OSes running inside the Host OS as you like (limited only by the processing capacity of the host machine).
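To make the "emulated resources" idea concrete, here is a toy Python sketch (purely illustrative - a real hypervisor traps privileged instructions at the CPU level, nothing like this): the guest's reads of its "hardware" never touch real hardware, they are served from a table the VMM keeps per guest.

```python
# Toy illustration of a VMM emulating per-guest resources.
# All names and numbers here are made up for the example.

class ToyVMM:
    def __init__(self):
        self.guests = {}

    def create_guest(self, name, cpus, ram_mb):
        # Each guest gets its own set of emulated resources.
        self.guests[name] = {"cpus": cpus, "ram_mb": ram_mb}

    def guest_read(self, name, resource):
        # The guest "sees" hardware, but the value actually comes
        # from the VMM's bookkeeping table, not a physical device.
        return self.guests[name][resource]

vmm = ToyVMM()
vmm.create_guest("linux-guest", cpus=2, ram_mb=512)
vmm.create_guest("windows-guest", cpus=1, ram_mb=1024)

# Each guest believes it owns its own CPU and RAM:
print(vmm.guest_read("linux-guest", "ram_mb"))    # 512
print(vmm.guest_read("windows-guest", "ram_mb"))  # 1024
```

Note how the two guests can be given completely different "hardware" even though they share one physical machine - that indirection is the whole trick.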

Virtualization is of different types - Emulation, Para-virtualization, Native Virtualization, OS-level Virtualization and Application-level Virtualization.

You can quickly learn more about the different types from the links here and here.

Now let me answer as to what's the big deal about Virtualization.

I have used virtualization (like booting Linux from within Windows) for petty reasons, such as not having permission in the organization to use Linux boxes - virtualization was the only way out.

But virtualization is being put to far more useful application in building data centers.

Data centers are simply groups of servers which host customers' data. The servers may be used individually or in groups (also known as clusters) to form a cloud of servers which work as one (this is better known as cloud computing). Data centers are sometimes known as server farms.

In today's world, every big organization maintains a data center one way or another. Data centers are expensive to maintain and scale. Adding servers is expensive, and you may still end up with many servers which don't utilize even half their computing power.

Also, data centers have this problem wherein upgrading the servers involves a lot of effort. Let me explain. When you want to upgrade the servers, the administrators (yes, it's plural.. a lot of people have to be involved) have to go through the nightmare of ensuring that all the data and services are available/migrated onto a new set of servers before they can upgrade the existing hardware. An easy way out for the administrators is redundancy, which means that data and services are mirrored (copied) at multiple locations. But this only complicates the situation and the cost: you have the added cost of these additional mirror servers (so the cost goes even higher), and you also need a SAN (storage area network) expert whose job is to ensure that the mirroring happens without problems.

This is where virtualization technology comes to the rescue. Using virtualization, you can run multiple Guest OSes on a single server. Each Guest OS is a server.. a virtual machine. You can keep adding virtual machines up to the point where the computing power of the Host OS (the physical server) is used very efficiently. A server which was running at, say, less than half its processing power before virtualization would now run at close to 100% of its processing power, with multiple virtual machines sharing the processing power equally.
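As a back-of-the-envelope illustration (the utilization figures below are made up), consolidating several under-utilized physical servers as virtual machines on one host raises that host's utilization while freeing up hardware:

```python
# Hypothetical numbers: four physical servers, each using only a
# fraction of its capacity, consolidated as VMs on a single host
# of the same capacity.
server_utilizations = [0.20, 0.25, 0.15, 0.30]

# The total work stays the same; it now runs on one physical host.
host_utilization = sum(server_utilizations)
print(f"Host utilization after consolidation: {host_utilization:.0%}")  # 90%

freed_servers = len(server_utilizations) - 1
print(f"Physical servers freed up: {freed_servers}")  # 3
```

Three of the four machines can be switched off or repurposed, and the remaining one is finally earning its keep.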

These virtual machines are controlled and configured using the virtual machine monitor (VMM) a.k.a. Hypervisor. The VMM has the capability to change the configuration of the emulated resources which the virtual machines use. For example, you can add additional processors to each of these virtual machines: if you had configured a virtual machine to use a dual-processor setup, you can add 2 more processors with a few clicks, and now you have your virtual machine running in a quad-processor configuration. You can do this as and when you find that processing power is available - either because you have stopped or moved a virtual machine to a different physical server, or because the physical server was upgraded with more processing capacity.
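Continuing the toy model (real VMMs expose this through their management consoles or command-line tools, and do far more work behind the scenes), reconfiguring a virtual machine is essentially an update to the VMM's bookkeeping, subject to the host actually having capacity to spare. All names and limits here are hypothetical:

```python
# Toy sketch: growing a VM from a dual- to a quad-processor
# configuration, checked against the host's physical capacity.

HOST_CPUS = 8  # hypothetical physical processor count

vms = {"db-vm": {"cpus": 2}, "web-vm": {"cpus": 2}}

def add_cpus(vm_name, extra):
    # Only grant the request if the host can cover all VMs' CPUs.
    in_use = sum(vm["cpus"] for vm in vms.values())
    if in_use + extra > HOST_CPUS:
        raise RuntimeError("not enough physical processing capacity")
    vms[vm_name]["cpus"] += extra

add_cpus("db-vm", 2)         # dual-processor -> quad-processor
print(vms["db-vm"]["cpus"])  # 4
```

The capacity check is the point: you can only hand out extra virtual processors when a VM has been stopped or moved away, or the physical box has been upgraded - exactly the situations described above.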

It is equally easy to upgrade the physical servers on which these virtual machines execute. All you have to do is save the state of these virtual machines (to a file), copy the saved instances onto a new server, and then start the virtual machines there.
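A toy sketch of that save/copy/restore cycle (a real VMM serializes the guest's entire RAM, CPU registers and device state; here the "state" is just a small dict written out with Python's pickle module):

```python
import pickle

# Hypothetical VM state; a real saved state is far larger and richer.
vm_state = {"name": "web-vm", "cpus": 2, "ram_mb": 512, "uptime_s": 12345}

# 1. Save the running VM's state to a file on the old server.
with open("web-vm.state", "wb") as f:
    pickle.dump(vm_state, f)

# 2. Copy web-vm.state to the new server (scp, shared storage, ...).

# 3. Restore and resume the VM on the new server.
with open("web-vm.state", "rb") as f:
    restored = pickle.load(f)

print(restored == vm_state)  # True
```

Because the virtual machine is decoupled from any particular physical box, the hardware underneath can be swapped out without reinstalling or reconfiguring anything inside the guest.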

Virtualization software has become more complex than what I have explained above. I will keep updating this post with the following:
- The recent virtualization techniques used by different VMMs such as VMware, Xen, Hyper-V, etc.
- The recent trends and competition between different Virtualization companies
- Finally, a word on the Linux kernel's built-in support for virtualization.
- References to other web sites which give more detailed and reliable information about Virtualization.
