Open Source Meets The Cloud
Enterprises, governments, startups and system integrators are all watching cloud computing closely. Only companies with deep pockets, such as Microsoft, Google and Amazon, can afford to get into the business of providing cloud infrastructure, but contributions from other companies are critical for the cloud to become viable.
The biggest concerns in moving to the cloud are security, privacy and reliability. Most customers want to know what lies beneath a cloud platform before betting their business on it. Making the cloud transparent, open and interoperable will encourage adoption by businesses. This is where Free and Open Source Software (FOSS) acts as a catalyst. CIOs want their teams to set up and experiment with the same stack before actually moving to the cloud. Knowing that the same software powers their infrastructure in some unknown corner of the world brings a level of comfort to skeptical decision makers.
Private cloud offerings are now available from VMware, Microsoft, IBM and others. But it is still not clear whether these are the same software that powers the commercial public clouds run by those providers. For example, Microsoft states categorically that Windows Azure, its cloud OS, is not available as a retail OS that customers can deploy in their own data centers. It is also unclear whether Microsoft’s private cloud offering, based on Windows Server 2008 Hyper-V, is the same technology that powers its public cloud, the Azure platform.
Before looking at the contribution of FOSS to the cloud, it’s important to understand the typical cloud computing architecture. At the heart of the cloud is virtualization. To make the cloud elastic, Virtual Machines (VMs) must be added and removed dynamically, on demand. Creating and managing these VMs efficiently requires a special piece of software called the Hypervisor.
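To make the idea of elasticity concrete, here is a minimal Python sketch of the decision an elastic platform has to make: given the current load, how many VMs should be running? The function names, the load units and the limits are all hypothetical and are not taken from any real cloud API; a real controller would then pass the result on to the Hypervisor layer.

```python
import math

def desired_vm_count(current_load, capacity_per_vm, min_vms=1, max_vms=20):
    """Return how many VMs are needed to serve current_load.

    All parameters are hypothetical: load and capacity share the same
    arbitrary unit (for example, requests per second).
    """
    needed = math.ceil(current_load / capacity_per_vm)
    # Never drop below the minimum or grow past the maximum.
    return max(min_vms, min(max_vms, needed))

def scaling_action(running_vms, current_load, capacity_per_vm):
    """Return an (action, count) pair: add VMs, remove VMs, or hold."""
    target = desired_vm_count(current_load, capacity_per_vm)
    if target > running_vms:
        return ("add", target - running_vms)
    if target < running_vms:
        return ("remove", running_vms - target)
    return ("hold", 0)
```

With two VMs running and each VM assumed to handle 100 units of load, a load of 450 units yields a request to add three VMs; when the load later falls, the same function asks for VMs to be removed.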
These Hypervisors can be added to existing Operating Systems, and some modern server OSs come with a Hypervisor built in. Ubuntu Server and Red Hat Enterprise Linux ship with KVM, and Microsoft’s Windows Server 2008 includes Hyper-V. VMware, the pioneer in x86 virtualization, ships its own Hypervisors such as ESX 3.x; these are proprietary, not Open Source. One of the most popular Hypervisors is an Open Source implementation called Xen. Xen already ships with the server editions of SUSE Linux and Debian and with some versions of Red Hat Enterprise Linux. Because the VMs are the workhorses of the cloud, the role of the host OS is limited to booting up and running the Hypervisor.
To avoid the overhead of a full OS, Hypervisors are now shipped as standalone layers that do not need a separate OS to boot. Most of these standalone Hypervisors are wrapped in embedded Linux. They make a general-purpose host OS redundant and can even boot from a USB flash drive. It is hard to imagine this architecture without Linux and OSS at its core. Though Microsoft has made its Hypervisor, Hyper-V Server, free, you still need to invest in costly management software, Microsoft System Center Virtual Machine Manager (SCVMM), to administer, manage and monitor the VMs. The most successful commercial cloud implementation, Amazon Web Services (AWS), is powered by an OSS virtualization platform based on Xen.
After the OSS Hypervisors, the next layer in the stack is the VMs. VMs are simply virtualized instances of the typical servers that run in a data center: messaging, database, collaboration and portal, web and application servers. LAMP is undoubtedly the most popular stack, powering widely used web applications including Facebook. When a user signs up with Amazon EC2 to run server instances in the cloud, a huge collection of Amazon Machine Images (AMIs) built on Linux and FOSS is available at a nominal price. Users pay Amazon only for the computing power and storage they consume, and there is no separate license fee for the software bundled in these AMIs. For other commercial software, Amazon assumes that users hold a valid license, and users remain fully liable and accountable for those licenses.
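The pay-for-what-you-consume model comes down to simple arithmetic, sketched below in Python. The function name and both rates are placeholders chosen for illustration; they are not actual Amazon prices.

```python
def usage_cost(instance_hours, storage_gb_months, hourly_rate, storage_rate):
    """Pay-as-you-go bill: compute consumed plus storage consumed.

    The rates are placeholders, not actual Amazon prices.
    """
    compute = instance_hours * hourly_rate
    storage = storage_gb_months * storage_rate
    return round(compute + storage, 2)

# Three instances for 10 hours each, plus 50 GB stored for one month,
# at hypothetical rates of $0.10 per instance-hour and $0.15 per GB-month.
bill = usage_cost(instance_hours=3 * 10, storage_gb_months=50,
                  hourly_rate=0.10, storage_rate=0.15)
```

Nothing in the formula depends on which FOSS packages are installed inside the image, which is the point: with a FOSS stack, the bill is driven by usage alone.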
Apart from custom Line-of-Business (LOB) applications on LAMP, there are some really powerful frameworks built on OSS. A significant part of the web today runs on Open Source Content Management Systems (CMS) like WordPress, Drupal and Joomla. To exploit the on-demand availability of computing power, the Apache Software Foundation has released Hadoop, an Open Source framework that processes huge datasets by spreading the work across dozens of servers. Hadoop enabled The New York Times to convert 4TB of raw TIFF data into an indexed, searchable digital archive of 11 million PDFs in just 24 hours, at a cost of only $240! That demonstrates what open source and the cloud can achieve together.
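Hadoop is built around the MapReduce programming model, and the shape of that model can be shown in a few lines of plain Python. The sketch below does not use Hadoop or any of its APIs, and every function name in it is made up for illustration; it runs in a single process and only shows the three steps (map, shuffle, reduce) that a real cluster would spread across servers.

```python
from collections import defaultdict
from itertools import chain

def map_phase(document):
    """Map: emit a (word, 1) pair for every word in one input split."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(groups):
    """Reduce: combine the values for each key into one result."""
    return {key: sum(values) for key, values in groups.items()}

def word_count(documents):
    # On a real cluster each map_phase call could run on a different server;
    # here the calls run one after another in a single process.
    mapped = chain.from_iterable(map_phase(doc) for doc in documents)
    return reduce_phase(shuffle(mapped))

counts = word_count(["open source meets the cloud", "the cloud is open"])
```

Because each input split is mapped independently, adding servers lets more splits be processed at once, which is why this kind of job pairs so well with on-demand computing power.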
The other area gaining ground is the private cloud. A private cloud promises the benefits of cloud computing while running in enterprise data centers secured behind the firewall. The most popular private cloud implementation comes from Eucalyptus Systems. It started as a research project in the Computer Science Department at the University of California, Santa Barbara, and was later distributed with Ubuntu Server by Canonical, the company that promotes Ubuntu and other OSS.
When enterprises and governments want to combine a private cloud and a public cloud into a hybrid cloud, these OSS implementations will be really handy. Sensitive customer, patient or citizen data can reside on the private cloud, respecting privacy and complying with local regulations, while non-sensitive, compute-intensive tasks move to a public cloud to take advantage of its scale.
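The hybrid placement rule described above can be written down as a small policy function. This Python sketch is only an illustration: the Task fields and the place function are hypothetical names, and the default for light, non-sensitive work is an assumption rather than something a hybrid cloud requires.

```python
from dataclasses import dataclass

@dataclass
class Task:
    name: str
    has_sensitive_data: bool
    compute_intensive: bool

def place(task):
    """Return 'private' or 'public' for a task under a simple hybrid policy."""
    if task.has_sensitive_data:
        return "private"   # regulated data stays behind the firewall
    if task.compute_intensive:
        return "public"    # heavy, non-sensitive work goes to rented capacity
    return "private"       # assumption: default to capacity already owned

tasks = [
    Task("patient-records-report", has_sensitive_data=True, compute_intensive=True),
    Task("image-conversion", has_sensitive_data=False, compute_intensive=True),
    Task("intranet-wiki", has_sensitive_data=False, compute_intensive=False),
]
placement = {task.name: place(task) for task in tasks}
```

Checking sensitivity first is deliberate: a task that is both sensitive and compute intensive stays on the private cloud, since regulation takes priority over convenience.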