Google Cloud Platform for AWS Professionals - Part 3
In the last part of this series, we discussed the fundamentals of Google Compute Engine. In this part, let’s look at block storage on GCE.
Ephemeral vs Persistent Disks
If you were an early adopter of Amazon EC2, you will be familiar with Instance Store and Elastic Block Store. When Amazon EC2 was launched, each AMI was stored on Amazon S3 as a self-contained entity that held the configuration and the bundled data. That changed with EBS-backed AMIs, which offered a root file system on a persistent EBS volume. This brought the benefits of stopping and starting an instance and rapid scaling-up of running instances. Carrying the legacy of EC2, AWS still offers Ephemeral Storage that can be attached to an instance at boot time. Because the Ephemeral Storage comes from the same host on which the VM is running, it offers better I/O than the standard EBS volumes.
Like most platform companies, Amazon is forced to retain the original concepts for backward compatibility. To overcome the limitations of ephemeral storage, it had to launch EBS. But EBS had its own share of problems in terms of performance and I/O throughput. To offer committed I/O, AWS launched Provisioned IOPS (PIOPS) for EBS, where customers buy a set number of read and write operations per second along with storage. Google doesn’t have to deal with this legacy, so it started with Persistent Disks with enhanced I/O. GCE got rid of ephemeral storage in favor of persistent block storage. Bottom line: there is no concept of Ephemeral Storage in GCE, and Persistent Disks on GCE deliver the same performance as Ephemeral Storage.
Storage Performance
AWS customers with high disk I/O requirements have to choose from a variety of options. You can start by attaching an EBS volume with PIOPS, switch to an EBS-Optimized Amazon EC2 Instance, or leverage the storage-optimized Instance family that comes with SSD-based Ephemeral Storage. Each option has its own pros and cons in terms of performance and price. When using PIOPS-based EBS, you have to do some complex math to work out the average number of read and write operations performed by the workload. However much planning goes into it, you may not predict the IOPS accurately. Since the cost of a PIOPS EBS volume is based on the provisioned IOPS, this always remains a challenge. To get the best out of a PIOPS EBS volume, it needs to be attached to an EBS-Optimized EC2 Instance, which is slightly costlier than its regular counterpart. The storage-optimized family comes with SSD, but only for ephemeral storage, so you still have to attach EBS volumes for persistent data.
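As a rough illustration of the sizing math described above, the following Python sketch derives an IOPS figure to provision from a handful of measured samples. The sample readings, the `headroom_pct` safety margin, and the function name are all assumptions made for this example; none of them come from AWS documentation or pricing.

```python
from statistics import mean


def estimate_provisioned_iops(samples, headroom_pct=20):
    """Return (average, peak, suggested) IOPS for a list of integer samples.

    headroom_pct is a hypothetical safety margin added on top of the peak.
    """
    if not samples:
        raise ValueError("need at least one IOPS sample")
    average = mean(samples)
    peak = max(samples)
    # Integer ceiling of peak * (1 + headroom_pct / 100), avoiding float rounding.
    suggested = (peak * (100 + headroom_pct) + 99) // 100
    return average, peak, suggested


# Hypothetical per-minute IOPS readings (reads plus writes) from one workload.
readings = [400, 550, 1200, 800, 650]
avg, peak, suggested = estimate_provisioned_iops(readings)
print(f"average={avg} peak={peak} suggested={suggested}")
```

The gap between the average and the peak in even this small sample hints at why a single provisioned number is hard to get right: provisioning for the average risks throttling at the peak, while provisioning for the peak means paying for capacity that sits idle most of the time.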
On GCE, it is easier to choose the right disk I/O. Persistent Disk doesn’t differentiate between regular and PIOPS volumes and doesn’t charge separately for predictable performance. Two factors determine performance. First, the larger the disk, the better the I/O. Second, the instance type can throttle disk I/O. So, choosing the right combination of VM instance type and disk size will give you the desired performance.
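The relationship between disk size, instance type, and I/O can be sketched as a simple model: performance grows with disk size until it hits a ceiling set by the instance. The Python snippet below is only a toy version of that idea. The per-GB rate and the instance caps are made-up placeholders, not published GCE limits, and the function name is hypothetical.

```python
def effective_iops(disk_size_gb, iops_per_gb, instance_iops_cap):
    """Estimate usable IOPS: scales with disk size, capped by the instance type."""
    if disk_size_gb <= 0:
        raise ValueError("disk size must be positive")
    disk_limit = disk_size_gb * iops_per_gb
    return min(disk_limit, instance_iops_cap)


# Placeholder figures for illustration only.
IOPS_PER_GB = 2
scenarios = [
    (100, 2000),    # small disk, small instance: the disk is the limit
    (500, 2000),    # bigger disk, same instance: more I/O
    (2000, 2000),   # large disk, small instance: the instance throttles
    (2000, 10000),  # large disk, large instance: the disk is the limit again
]
for size_gb, cap in scenarios:
    print(size_gb, cap, effective_iops(size_gb, IOPS_PER_GB, cap))
```

Under these assumed numbers, growing the disk on a small instance eventually stops helping, which is why disk size and instance type have to be chosen together.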
Shared Storage
An Amazon EBS volume can be attached to only one instance at a time. This makes it hard to implement shared storage on EC2. To overcome this issue, many customers are forced to run NFS or Gluster to keep shared content in sync across instances. GCE’s Persistent Disk can be attached to multiple instances running within the same zone, provided it is attached in read-only mode. For shared content that instances only need to read, this avoids the cost of running and maintaining a shared file system.
Disk Size
The maximum size supported by EBS is 1TB. GCE Persistent Disks can be up to 10TB. Like EBS, PD’s performance is proportional to the size of the disk.
Snapshots
Amazon EBS snapshots and GCE PD snapshots have a lot in common. GCE creates differential snapshots, which allow for better performance and lower storage charges. Just as EBS snapshots are stored in Amazon S3, PD snapshots are stored in Google Cloud Storage. The major difference between the two is scope. While EBS snapshots are specific to an AWS region, PD snapshots are global. What does this mean for customers? They need not manually copy snapshots across regions. For example, to move an EBS volume from US-EAST to US-WEST, you have to create a snapshot in US-EAST, copy it to US-WEST, and then restore it there. As discussed in part 2, GCE has a global scope that is independent of regions. Because PD snapshots are global, customers can easily create PDs in any region.
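To make the difference in scope concrete, here is a small Python sketch that lists the steps needed to move a volume between regions under each model. It is a toy model of the workflow described above, not a call into either platform's API. The function names and the region strings are hypothetical.

```python
def ebs_restore_steps(src_region, dst_region):
    """Steps for a region-scoped snapshot: a copy is needed across regions."""
    steps = [f"create snapshot of the volume in {src_region}"]
    if src_region != dst_region:
        steps.append(f"copy snapshot from {src_region} to {dst_region}")
    steps.append(f"create volume from snapshot in {dst_region}")
    return steps


def pd_restore_steps(src_region, dst_region):
    """Steps for a globally scoped snapshot: no copy step, whatever the region."""
    return [
        f"create snapshot of the disk in {src_region}",
        f"create disk from snapshot in {dst_region}",
    ]


for name, plan in [
    ("EBS", ebs_restore_steps("us-east", "us-west")),
    ("PD", pd_restore_steps("us-east", "us-west")),
]:
    print(name)
    for number, step in enumerate(plan, start=1):
        print(f"  {number}. {step}")
```

The extra copy step appears only in the region-scoped model and only when the source and destination regions differ.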
In the next part of the article, we will explore some more features of Google Compute Engine. Stay tuned!