Cloud Storage Choices on Windows Azure

Many architects struggle to map a physical architecture onto a Cloud architecture. This is mainly because of the wide array of choices on the Cloud and the constraints that the Cloud introduces. Storage is one of the most critical factors influencing the availability, reliability and cost of a Cloud application. This article summarizes the Cloud storage choices offered by Windows Azure. The objective is to identify the key role played by each storage service and to highlight its common use case.

In traditional environments, storage choices typically fall into the following categories -

  • Direct Attached Storage
  • In-Memory
  • Message Queue
  • Storage Area Network
  • Network Attached Storage
  • Databases
  • Archival / Backup

However, on the Cloud, storage terminology and analogies differ from those of classic on-premises storage. For example, emulating SAN or NAS is difficult in the Cloud. At the same time, Windows Azure offers a variety of storage choices for building web-scale applications. We will now look at each of these choices to understand its scenarios and use cases.

  • Local Storage
  • Azure Drive
  • Azure Blobs
  • Azure Tables
  • Azure Queues
  • Azure Cache
  • SQL Azure
  • Custom Databases

Local Storage - Every role instance comes with default storage attached to it, and this is referred to as local storage. Since it is an integral part of the instance, it offers high performance I/O operations and can be treated like a system or boot partition of a Windows machine. Developers can use the standard File I/O API to work with the file system. Additional storage can be added to the instance by modifying the Service Definition file (*.csdef) of the Cloud application. The key thing to remember about local storage is that it is not durable across the role life cycle. Once a role is terminated, the data in local storage is lost along with the instance. Local storage is therefore typically used to hold a working copy of data that is pulled from an external durable storage engine, such as Azure Blob, for processing. You can think of it as a scratch disk.

Use Case - Best suited as intermediate scratch storage for processing data that is kept in an external durable storage engine.
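The scratch-disk pattern described above can be sketched in a few lines. This is only an illustration under stated assumptions: an ordinary directory stands in for the durable store (Azure Blob in the article), a temporary directory stands in for local storage, and the function name `process_with_scratch` and the upper-casing "processing" step are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def process_with_scratch(durable_dir: Path, name: str) -> Path:
    """Copy an input from durable storage to scratch space, process it,
    and write the result back to durable storage."""
    # The temporary directory plays the role of local storage: it is
    # discarded when the block exits, much as local storage is lost
    # together with the role instance.
    with tempfile.TemporaryDirectory() as scratch:
        local_copy = Path(scratch) / name
        shutil.copy(durable_dir / name, local_copy)

        # Placeholder processing step (hypothetical): upper-case the text.
        result = local_copy.read_text().upper()
        local_result = Path(scratch) / (name + ".out")
        local_result.write_text(result)

        # Persist the result before the scratch space disappears.
        durable_result = durable_dir / local_result.name
        shutil.copy(local_result, durable_result)
    return durable_result
```

The point of the design is that nothing of value is left only on the scratch disk: the input can always be fetched again, and the output is copied out before the scratch space goes away.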

Azure Drive - To overcome the limitations of local storage, you can rely on Azure Drive. Think of Azure Drive as a pluggable storage device on the Cloud. You can request an Azure Drive of a specific size, mount it, format it and use it like any other Windows drive. If you prefer, you can format it with NTFS and enable encryption at the file system level. All flushed data is automatically committed to the drive, which makes it a durable storage option. Since the data is persisted to the Azure Drive, which is external to the role instance, the same drive can be attached to another healthy instance to recover the data if an instance fails. Developers use the Azure Drive API only for the create, mount and initialize operations; after that, standard file I/O operations can be performed on the drive.

Use Case - Best suited as an independent, durable storage option that helps achieve high availability and failover for Cloud applications.

Azure Blob - Azure Blob storage offers a highly durable and massively scalable storage engine for developing internet-scale applications. It is used to store unstructured binary data such as images, documents, videos and other files. Azure Blob exposes a REST endpoint to upload and retrieve blobs. When combined with Azure CDN, the objects stored in Azure Blob are cached across the edge locations, bringing static data closer to the consumers. Blobs are of two types - 1) Block Blob and 2) Page Blob. Block blobs are optimized for streaming, while Page blobs are optimized for random access. Azure Drive, discussed above, is based on the Page Blob.

Use Case - Best suited as a highly durable store for static data that, when combined with CDN, needs to be close to the consumer.
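To make the REST endpoint more concrete, here is a rough sketch of what a block blob upload request looks like. It only describes the request and never sends it. The account, container and blob names are placeholders, the helper name `put_block_blob_request` is hypothetical, and the authentication and versioning headers that a real request needs are deliberately left out.

```python
from urllib.parse import quote


def put_block_blob_request(account: str, container: str, blob_name: str, data: bytes):
    """Describe (but do not send) an HTTP request that uploads a block blob."""
    # Blobs are addressed as <account>.blob.core.windows.net/<container>/<blob>.
    url = f"https://{account}.blob.core.windows.net/{container}/{quote(blob_name)}"
    headers = {
        # Tells the service to create a block blob rather than a page blob.
        "x-ms-blob-type": "BlockBlob",
        "Content-Length": str(len(data)),
        # Assumption: authentication (for example an Authorization header)
        # and date/version headers are added elsewhere; omitted in this sketch.
    }
    return "PUT", url, headers


method, url, headers = put_block_blob_request(
    "myaccount", "images", "cat 1.jpg", b"abc"
)
```

Retrieving the blob is then a plain GET against the same URL, which is also what makes it easy to put a CDN in front of the container.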

Azure Table - Azure Tables hold flexible entities and do not impose the requirement of a schema. It is a scale-out database engine that automatically partitions the data and spreads it across multiple resources. Azure Tables is the NoSQL offering from Microsoft on the Cloud. It is exposed through standard REST endpoints to perform the normal CRUD operations. Unlike many scale-out NoSQL databases, which are eventually consistent, Azure Tables offer strongly consistent reads and writes, but transactions are limited to entities within a single partition. Data that does not need relational joins or ACID transactions across partitions is a good fit for Azure Tables.

Use Case - Best suited as a scale-out database for data that is written once but read many times.
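The schema-less, partitioned model can be illustrated with a small in-memory sketch. This is not the Azure Tables API: the class `TableSketch` and its methods are hypothetical. It only shows how entities are addressed by a partition key and a row key, and how two entities in the same table can carry different properties.

```python
class TableSketch:
    """In-memory stand-in for a schema-less, partitioned table."""

    def __init__(self):
        # partition key -> {row key -> entity properties}
        self._partitions = {}

    def insert(self, partition_key, row_key, **properties):
        partition = self._partitions.setdefault(partition_key, {})
        if row_key in partition:
            raise ValueError("entity already exists")
        # No schema check: each entity keeps whatever properties it is given.
        partition[row_key] = dict(properties)

    def get(self, partition_key, row_key):
        # A point lookup needs both keys.
        return self._partitions[partition_key][row_key]

    def query_partition(self, partition_key):
        # Entities of one partition, ordered by row key.
        partition = self._partitions.get(partition_key, {})
        return [partition[key] for key in sorted(partition)]


table = TableSketch()
table.insert("customers-uk", "002", name="Alan", loyalty_points=120)
table.insert("customers-uk", "001", name="Ada", email="ada@example.com")
```

Because every lookup starts from the partition key, the choice of partition key decides how evenly the data and the load can be spread across resources.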

Azure Queues - Azure Queues bring asynchronous messaging capabilities to the Cloud. If you are familiar with MSMQ or IBM MQ, you will find the architecture of Azure Queues familiar. By leveraging Azure Queues, architects can design highly scalable systems on the Cloud. Queues are the preferred mechanism to communicate across multiple role instances within a Cloud application deployed on Windows Azure. They allow us to architect loosely coupled and autonomous services that can scale independently. Messages stored in the Queue are generally delivered in FIFO order, but this is not guaranteed. Guaranteed message delivery can be achieved by implementing the get, process and delete pattern correctly: a message that has been read stays hidden from other consumers for a timeout period and reappears on the queue if the consumer does not delete it in time. Like most of the Azure Storage services, Queues are also exposed through a REST API.

Use Case - Best suited for designing loosely coupled, autonomous and independent components for the Cloud.
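The get, process and delete pattern is easier to follow with a small simulation. The sketch below is an in-memory stand-in, not the Azure Queues API: the class `QueueSketch` is hypothetical, the clock is passed in explicitly so that the behaviour is deterministic, and deletion by message id is a simplification (the real service asks for a pop receipt when a message is deleted).

```python
import itertools


class QueueSketch:
    """In-memory queue with a visibility timeout (simplified)."""

    def __init__(self, visibility_timeout=30.0):
        self.visibility_timeout = visibility_timeout
        self._messages = {}  # message id -> [body, time at which it is visible]
        self._ids = itertools.count(1)

    def put(self, body):
        self._messages[next(self._ids)] = [body, 0.0]

    def get(self, now):
        """Return (id, body) of the first visible message and hide it, or None."""
        for msg_id, entry in self._messages.items():
            if entry[1] <= now:
                # Hide the message instead of removing it.
                entry[1] = now + self.visibility_timeout
                return msg_id, entry[0]
        return None

    def delete(self, msg_id):
        # Called only after the message has been processed successfully.
        self._messages.pop(msg_id, None)


queue = QueueSketch(visibility_timeout=30)
queue.put("resize image 42")

first = queue.get(now=0)           # worker A reads the message...
hidden = queue.get(now=10)         # ...so worker B sees nothing for now
# Worker A crashes and never calls delete().
second = queue.get(now=31)         # the message is visible again
queue.delete(second[0])            # worker B finishes and deletes it
remaining = queue.get(now=100)
```

Since a message can be handed out more than once, the processing step should be safe to repeat.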

Azure Cache - Azure Cache brings in-memory caching capabilities to Windows Azure. It is compatible with the Windows Server AppFabric Cache deployed for on-premises applications. Azure Cache comes with ASP.NET session and page output caching providers, offering an out-of-the-box experience to .NET developers. It can also be accessed through the REST API to push objects into the cache and retrieve them. By storing frequently accessed data in Azure Cache, expensive round trips to the database can be avoided. This not only decreases the cost of I/O but also increases the overall responsiveness of the application.

Use Case - Best suited for storing frequently requested data and serving it with minimal latency to enhance performance.
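Avoiding round trips to the database usually follows what is often called the cache-aside pattern. The sketch below shows the idea with a plain dictionary standing in for Azure Cache; the class `CacheAside` and the loader function are hypothetical, and a real cache would also expire entries.

```python
class CacheAside:
    """Read through an in-memory cache and fall back to the database on a miss."""

    def __init__(self, load_from_database):
        self._load = load_from_database  # the expensive call we want to avoid
        self._cache = {}
        self.database_reads = 0

    def get(self, key):
        if key in self._cache:
            return self._cache[key]      # hit: no round trip to the database
        self.database_reads += 1
        value = self._load(key)          # miss: read once from the database
        self._cache[key] = value
        return value

    def invalidate(self, key):
        # Call this when the underlying data changes.
        self._cache.pop(key, None)


def load_product(product_id):
    # Stand-in for a database query (hypothetical).
    return {"id": product_id, "name": f"Product {product_id}"}


products = CacheAside(load_product)
products.get(7)
products.get(7)   # served from the cache
```

After the two calls above, `products.database_reads` is 1, because the second read never reaches the loader.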

SQL Azure - SQL Azure is the Cloud incarnation of SQL Server, the flagship RDBMS from Microsoft. It works on the same principles as SQL Server and is based on the same protocols, such as TDS. Developers can move their applications to the Cloud while moving the database to SQL Azure. Just by changing the connection string, applications can talk to SQL Azure. Developers can use their favorite data access model, including ODBC, OLE DB or ADO.NET, to talk to SQL Azure. It also exposes a service management API to enable the logical administration of the database for tasks like creating users, roles and permissions. SQL Azure also includes Reporting Services for visualization and Business Intelligence for analysis. With its pay-as-you-go model, it makes moving data-driven applications to the Cloud extremely affordable.

Use Case - Best suited for building highly scalable and reliable Cloud applications that require a relational database at the data tier.
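The connection string change can be illustrated by building an on-premises string and a SQL Azure string side by side. Treat this as a sketch: the helper `connection_string` is hypothetical, the server, database and login names are placeholders, and the `user@server` login form and the `Encrypt` setting are assumptions about a typical SQL Azure configuration rather than requirements taken from this article.

```python
def connection_string(server, database, user, password, encrypt=False):
    """Build an ADO.NET-style connection string from its parts."""
    # Simplification: values containing ';' are not escaped.
    parts = {
        "Server": server,
        "Database": database,
        "User ID": user,
        "Password": password,
    }
    if encrypt:
        parts["Encrypt"] = "True"
    return ";".join(f"{key}={value}" for key, value in parts.items()) + ";"


# On-premises SQL Server (placeholder names).
on_prem = connection_string("db01", "Orders", "app", "secret")

# SQL Azure: only the connection details change, not the data access code.
cloud = connection_string(
    "tcp:myserver.database.windows.net,1433",
    "Orders",
    "app@myserver",
    "secret",
    encrypt=True,
)
```

The rest of the data access code stays the same, which is the point the paragraph above makes.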

Custom Databases - Though Windows Azure offers Azure Tables as a NoSQL database and SQL Azure for RDBMS requirements, many customers might want to deploy a NoSQL database like MongoDB, CouchDB or Cassandra on Windows Azure. While these are not available as managed offerings on subscription, the Windows Azure architecture lets you deploy any of them as part of your application. By wrapping the custom database in a Worker Role, the database can be made available to your application. Of course, the responsibility of managing and maintaining the database is left to the customer. But it is technically possible to run any lightweight database within a Worker Role. Going forward, when VM Roles are available, additional databases can also be deployed with ease.

Use Case - Best suited for deploying a custom database on Windows Azure when the application architecture demands it.

This was a quick summary of the various storage options provided by Windows Azure. I hope you find this article useful!

- Janakiram MSV, Chief Editor,

