- White Papers
Overview
Many people consider Apache Cassandra® and DynamoDB as potential datastore technologies when looking to build high-scale, high-reliability services in the cloud. Both technologies are popular and well-proven to deliver at scale. However, choosing the technology most appropriate for your use case can have a significant impact on the cost of building, maintaining, and running your application.
This whitepaper considers a real-world use case, analyzes the costs of running on Instaclustr Managed Apache Cassandra vs DynamoDB, and discusses how the features and cost models of the two technologies could impact the architecture of your solution. The use case we are considering is at the heart of Instaclustr’s monitoring system, Instametrics.
The key attributes of the Instametrics cluster
- 36 i3.2xlarge nodes co-hosting Apache Cassandra and Apache Spark. The cluster runs continuously, with no scaling up or down for peaks.
- Each metric event written is, on average, ~100 bytes of data.
- Baseline load (raw metrics received) of 3060 batch writes per second. Each batch contains ~150 rows for a total of ~460k writes/second baseload.
- Additional load when writing roll-up results: 16,200 batch writes/second. Each batch contains ~100 rows, for a total of ~1.6M writes/second from this load and a total peak of just over 2M writes per second. This peak load occurs for about 1 minute out of every 5 (20% of the time).
- The baseline read load on the cluster is about 18,000 reads per second. Each read retrieves ~15 rows for a total baseline read load on the cluster of 270k rows/sec.
- Additional load when reading data for the roll-ups is about 144,000 reads per second. These reads use Cassandra functions to aggregate data before returning, with each read using data from ~15 rows, for about 2.1M rows/sec read in total. The cluster is also at peak read load for about 20% of the time.
- The cluster currently stores around 54TB of data with a replication factor of 2.
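The throughput figures in the cluster attributes above can be cross-checked with a little arithmetic. The Python sketch below is illustrative only: the constants come from the list, while the helper names and the time-weighted average (which assumes the roll-up load runs on top of the baseline for 20% of the time) are assumptions added here, not part of the white paper.

```python
# Figures quoted in the cluster attributes list.
BASE_WRITE_BATCHES_PER_SEC = 3_060
ROWS_PER_BASE_WRITE_BATCH = 150
ROLLUP_WRITE_BATCHES_PER_SEC = 16_200
ROWS_PER_ROLLUP_WRITE_BATCH = 100
BASE_READS_PER_SEC = 18_000
ROLLUP_READS_PER_SEC = 144_000
ROWS_PER_READ = 15
PEAK_FRACTION = 0.20  # roll-ups run ~1 minute out of every 5


def rows_per_sec(ops_per_sec: int, rows_per_op: int) -> int:
    """Row-level throughput for a given operation rate (hypothetical helper)."""
    return ops_per_sec * rows_per_op


def time_weighted_average(base: float, extra: float, peak_fraction: float) -> float:
    """Average load, assuming `extra` is added to `base` for `peak_fraction` of the time."""
    return base + extra * peak_fraction


base_writes = rows_per_sec(BASE_WRITE_BATCHES_PER_SEC, ROWS_PER_BASE_WRITE_BATCH)
rollup_writes = rows_per_sec(ROLLUP_WRITE_BATCHES_PER_SEC, ROWS_PER_ROLLUP_WRITE_BATCH)
peak_writes = base_writes + rollup_writes

base_reads = rows_per_sec(BASE_READS_PER_SEC, ROWS_PER_READ)
rollup_reads = rows_per_sec(ROLLUP_READS_PER_SEC, ROWS_PER_READ)
peak_reads = base_reads + rollup_reads

avg_writes = time_weighted_average(base_writes, rollup_writes, PEAK_FRACTION)
avg_reads = time_weighted_average(base_reads, rollup_reads, PEAK_FRACTION)

if __name__ == "__main__":
    print(f"baseline writes/sec: {base_writes:,}")
    print(f"roll-up writes/sec:  {rollup_writes:,}")
    print(f"peak writes/sec:     {peak_writes:,}")
    print(f"baseline rows read/sec: {base_reads:,}")
    print(f"roll-up rows read/sec:  {rollup_reads:,}")
    print(f"peak rows read/sec:     {peak_reads:,}")
    print(f"time-weighted avg writes/sec:    {avg_writes:,.0f}")
    print(f"time-weighted avg rows read/sec: {avg_reads:,.0f}")
```

The products land close to the rounded figures quoted in the list (for example, 3,060 × 150 = 459,000, quoted as ~460k). The time-weighted averages are only a rough way to compare a continuously provisioned cluster against per-request pricing; they are not figures from the white paper.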
- Videos
InstaBlinks EP 22: The Truth About Using Kafka as a Database
Watch this episode of InstaBlinks to learn why Apache Kafka is almost never your best choice as a database, yet remains a vital part of any durable data infrastructure and a valuable data store.
InstaBlinks EP 21: Migrating Cassandra
Join Ben Bromhead and Carlos Rolo to discover numerous advantages, from cost savings to avoiding data lock-in, as they highlight reasons why you should choose open source Cassandra over proprietary versions.
How to Spin Up a Cluster: Apache Cassandra Platform Demo with Instaclustr
Discover the awesome power of open source and see how easy it is to spin up your first Apache Cassandra® cluster with Instaclustr (and how to prepare for Day 2 operations, too)!