A Peek into Amazon DynamoDB
Last week Amazon Web Services added another feather to its cap in the form of a brand new NoSQL service called DynamoDB. Though it looks like yet another announcement from the Amazon stable, this is huge. It has a strong impact not just on AWS but on the industry.
In the last couple of years, there has been a massive rush in the NoSQL database market. Many companies sprang up offering a variety of schema-less, scale-out databases. Major Cloud platform providers have their own offerings in the form of Amazon SimpleDB, Google BigTable and Windows Azure Table storage. Recent entrants in the PaaS segment started supporting MongoDB and the like to add NoSQL capabilities to their stacks. So, how is DynamoDB different from Amazon’s SimpleDB and other similar NoSQL databases?
DynamoDB is designed from the ground up, implementing the best practices of a Key Value database as articulated by Dr. Werner Vogels in his Dynamo white paper. SimpleDB was the first step that AWS took to support a NoSQL database on the Cloud. But many customers of SimpleDB have been complaining about a few issues:
Scalability – SimpleDB limits the size of a Domain to 10GB. Beyond this, you need to implement your own partitioning scheme and things become complex. This seriously breaks the promise of incremental scalability of the database on the Cloud.
Performance – Once a SimpleDB domain grows significantly in size, the performance of database operations starts to suffer. That’s because every write operation forces re-indexing of all the attribute indices, which in turn hurts read latency.
Consistency – SimpleDB was originally designed to support eventual consistency. This confined SimpleDB to specific use-cases and developers couldn’t make the best use of it. Though this issue was addressed later, consistency has always been a big trade-off for developers.
Pricing – SimpleDB’s pricing is more aligned with EC2 pricing because it is based on ‘Box Usage’. Box Usage is based on CPU utilization, which varies with the complexity of the query. The final cost is arrived at after calculating the actual storage and the number of ‘Box Usage’ hours consumed. This made predicting costs a complex exercise.
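To make the partitioning burden mentioned under Scalability concrete, here is a minimal sketch of the kind of client-side scheme an application might have to maintain once a single Domain is no longer enough. The domain names and the hash-modulo approach are assumptions chosen for illustration; they are not something SimpleDB prescribes.

```python
import hashlib

# Hypothetical setup: four SimpleDB domains, each subject to the 10GB limit.
DOMAINS = ["orders_0", "orders_1", "orders_2", "orders_3"]


def domain_for(item_name, domains=DOMAINS):
    """Pick the domain for an item by hashing its name (hash modulo N)."""
    digest = hashlib.md5(item_name.encode("utf-8")).hexdigest()
    return domains[int(digest, 16) % len(domains)]


def group_by_domain(item_names, domains=DOMAINS):
    """Group item names by target domain, e.g. to send one batch per domain."""
    groups = {d: [] for d in domains}
    for name in item_names:
        groups[domain_for(name, domains)].append(name)
    return groups
```

With a scheme like this, every read and write has to route through `domain_for`, queries that span items must fan out to all domains, and adding a fifth domain changes the mapping for most existing items. That is the complexity the application ends up owning.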
Amazon DynamoDB addresses these issues and also offers more value. Here is a quick summary of what it is all about:
Scalability – Since DynamoDB is written from scratch with scale in mind, it is much more scalable than SimpleDB or a proprietary NoSQL database running on EC2. Data is spread transparently across multiple Availability Zones, which makes the database truly elastic. As the data grows, it is automatically spread across more resources to deliver the required scalability.
Performance – DynamoDB is designed to deliver high throughput and low latency. According to Amazon, applications running on EC2 will experience only single-digit millisecond latencies. This is a great performance boost. Unlike SimpleDB, performance will not degrade with size. The primary factor that influences performance is the underlying storage and, in the case of DynamoDB, it is super-fast Solid State Disk (SSD).
Consistency – By default, DynamoDB is eventually consistent, since it builds on the foundations laid out in the Dynamo white paper, where eventual consistency is one of the key attributes. Developers can choose to turn on consistent reads with every GET request. Even with consistent reads, performance is not going to degrade much.
Pricing – The parameters that influence the cost of DynamoDB are very different from SimpleDB. The two factors that impact pricing are storage and throughput rates. You can choose different levels of throughput for read and write operations. The more throughput you demand, the more you should be willing to pay.
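The pricing model described above can be sketched as simple arithmetic. This is only an illustration: it assumes storage is billed per GB per month and throughput per provisioned unit per hour, and the rates in `example_rates` are made-up placeholders, not actual AWS prices.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 24 * 30  # simplifying assumption: a 30-day month


@dataclass
class Rates:
    """Placeholder rates; substitute figures from the current AWS price list."""
    per_gb_month: float
    per_read_unit_hour: float
    per_write_unit_hour: float


def estimate_monthly_cost(storage_gb, read_units, write_units, rates):
    """Estimate a monthly bill from storage plus provisioned read/write throughput."""
    storage = storage_gb * rates.per_gb_month
    reads = read_units * rates.per_read_unit_hour * HOURS_PER_MONTH
    writes = write_units * rates.per_write_unit_hour * HOURS_PER_MONTH
    return storage + reads + writes


# Made-up numbers purely to exercise the function.
example_rates = Rates(per_gb_month=1.0,
                      per_read_unit_hour=0.001,
                      per_write_unit_hour=0.005)
```

Calling `estimate_monthly_cost(10, 100, 20, example_rates)` shows how the bill is driven by three numbers you have to predict up front: how much you store, and how much read and write throughput you provision.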
Here is my personal take on Amazon DynamoDB:
- SimpleDB is still in beta and no official announcement talks about the roadmap and the future of SimpleDB. Personally, I think there are different use-cases that still target SimpleDB. Storing application configuration, logs and simple metadata in SimpleDB still makes sense. When you need consistent reads, elasticity and better performance, DynamoDB is the best.
- The API is incompatible between SimpleDB and DynamoDB. This makes it extremely difficult for customers to switch over with the same code base. It forces the extra effort of migrating data and then altering the code for DynamoDB. I wish AWS maintained API compatibility. But given the fundamental differences in the architecture, I know it is not easy.
- Pricing of DynamoDB is still not simple. It requires quite a bit of guesswork combined with storage predictability. When compared to SimpleDB, this may look simple but it is not!
- I am glad that AWS didn’t opt to provide a CPU/Memory/Storage based choice (like RDS) to provision DynamoDB. The recent addition to AWS, ElastiCache, followed the provisioning and pricing model of RDS by forcing the configuration of a machine. That doesn’t offer easy scale-out capability on the fly. By letting customers decide on the storage and I/O throughput, DynamoDB offers more control and brings true elasticity. The other great benefit of this model is that it doesn’t require a scheduled maintenance window. This is one of the smartest moves from AWS and answers the question, “why not run my own cluster of MongoDB / Cassandra on EC2?”
- Synchronous replication across Availability Zones is a great feature that offers better scalability and availability. Again, this is a huge bonus for anyone looking at running a production application talking to DynamoDB. Managing a NoSQL cluster on EC2 is not an easy task!
- It’s great to see DynamoDB show up in the AWS Management Console from day one! The same is the case with CloudWatch metrics for DynamoDB. Developers are happy to see the SDKs (including Java, .NET and Android) refreshed to support the new APIs and tools. But like any other AWS service, this has launched only in US-EAST! I hope to see this come to APAC soon!
- I am working on a simple ASP.NET application that leverages DynamoDB at the backend. Stay tuned for a post on this How-To tutorial!
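As a small illustration of the per-request consistent reads mentioned above, the sketch below builds the arguments for a GetItem call. It assumes the parameter names used by the AWS SDK for Python (boto3) rather than any of the SDKs listed above; the table name `Users` and key attribute `user_id` are hypothetical.

```python
def build_get_item_request(table, key_name, key_value, consistent=False):
    """Build keyword arguments for a DynamoDB GetItem call.

    consistent=False keeps the default eventually consistent read;
    consistent=True asks for a consistent read on this request only.
    """
    return {
        "TableName": table,
        "Key": {key_name: {"S": key_value}},  # "S" marks a string attribute
        "ConsistentRead": consistent,
    }


# Hypothetical table and key, used only for illustration.
request = build_get_item_request("Users", "user_id", "42", consistent=True)

# With boto3 installed and credentials configured, this could be sent as:
#   import boto3
#   client = boto3.client("dynamodb")
#   item = client.get_item(**request).get("Item")
```

The point of the sketch is that consistency is a flag on each read, not a property of the table, so an application can use the default for most reads and ask for consistent reads only where it matters.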
- Janakiram MSV, Chief Editor, CloudStory.in