Cassandra vs. MongoDB: Performance and Feature Comparison
For organizations considering their open source NoSQL database options, Cassandra and MongoDB are often near the top of the list. However, Cassandra and MongoDB are very different options, each with its own strengths, weaknesses, and unique use cases.
In this blog, we look at those differences, and compare Cassandra vs. MongoDB performance, benefits, and use cases.
Comparing Cassandra vs. MongoDB
On the surface, Cassandra and MongoDB can appear similar. Both are NoSQL databases, both are open source software, and neither was originally designed around the full ACID guarantees of a relational database (although MongoDB has supported multi-document ACID transactions since version 4.0). But before we get too deep into their benefits and features, let's get an overview of each of these open source NoSQL DBMSs.
What Is Cassandra?
Cassandra is an open-source NoSQL database written in Java and maintained by the Apache Software Foundation.
It offers high availability and scalability, and it is capable of handling high volumes of data, including unstructured data types. Because it does not require a fixed schema, Cassandra is able to handle things like replication much more easily than other databases.
Apache Cassandra was originally developed at Facebook to handle searching of the inbox. It was made open source in 2008 and entered the Apache Incubator in 2009.
What Is MongoDB?
MongoDB is a popular NoSQL document-oriented database developed by MongoDB Incorporated.
The word “Mongo” is derived from “humongous,” a nod to the database’s ability to store humongous amounts of JSON data. Documents can have any schema, which is unlike a relational database management system (RDBMS). This means data that would sit in related tables can be combined into a single document, in effect denormalizing the data.
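As a rough illustration of the denormalization described above, the sketch below folds rows from two hypothetical relational tables (customers and orders, both invented for this example) into one document per customer. It uses plain Python dictionaries rather than a MongoDB driver, so it shows only the shape of the data, not how MongoDB stores it.

```python
import json

# Hypothetical relational rows; table and field names are invented for illustration.
customers = [
    {"customer_id": 1, "name": "Ada"},
    {"customer_id": 2, "name": "Grace"},
]
orders = [
    {"order_id": 100, "customer_id": 1, "total": 25.0},
    {"order_id": 101, "customer_id": 1, "total": 40.0},
    {"order_id": 102, "customer_id": 2, "total": 15.5},
]


def denormalize(customers, orders):
    """Embed each customer's orders inside that customer's document."""
    docs = {
        c["customer_id"]: {"_id": c["customer_id"], "name": c["name"], "orders": []}
        for c in customers
    }
    for o in orders:
        # The foreign key disappears; the order now lives inside its parent document.
        docs[o["customer_id"]]["orders"].append(
            {"order_id": o["order_id"], "total": o["total"]}
        )
    return list(docs.values())


documents = denormalize(customers, orders)
print(json.dumps(documents[0], indent=2))
```

The trade-off is that a single read returns the customer together with all of their orders, at the price of larger documents that need rewriting when an order changes.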
Cassandra vs. MongoDB: Performance
Performance differences between Cassandra and MongoDB are difficult to quantify. You cannot take the same application and data model, test it on both platforms, and conclude that one database performs better than the other. Each one is suited to different types of data models and loads. Instead, focus on how each database addresses concerns such as consistency, availability, and partition tolerance given your application requirements.
Write Performance
MongoDB and Cassandra handle writes differently. Cassandra has multi-primary support: its architecture lets any node accept writes, so it can handle many simultaneous writes across more than one node. MongoDB is limited to one writable primary node per replica set, and its secondary servers can only be used for reads. As a result, Cassandra will generally be more write performant than MongoDB.
Read Performance
This is where MongoDB really shines, especially with consistency. By default, all reads go to the primary and secondary nodes do not serve read requests, but they can easily be configured for reads by setting a “read preference”. This means the entire replica set can accept reads, although reads from secondaries may return slightly stale data. Spreading reads across nodes is also possible with Cassandra.
MongoDB has an advantage if your data model includes nested objects that require indexes, as it has better support for secondary indexes. Cassandra, however, has only cursory support for secondary indexes, which are limited to single columns and to equality comparisons. If you are mostly querying by the primary key, Cassandra is the better choice. It really depends on how your data is modeled and how you query it.
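One common place to set a read preference is the connection string. The sketch below builds such a string with the standard library only; the host names and the replica set name rs0 are placeholders invented for this example, and the helper function is hypothetical rather than part of any driver.

```python
from urllib.parse import urlencode

# Read preference modes accepted by MongoDB.
VALID_MODES = {"primary", "primaryPreferred", "secondary", "secondaryPreferred", "nearest"}


def replica_set_uri(hosts, replica_set, read_preference="primary"):
    """Build a MongoDB connection string that carries a read preference."""
    if read_preference not in VALID_MODES:
        raise ValueError(f"unknown read preference: {read_preference}")
    options = urlencode({"replicaSet": replica_set, "readPreference": read_preference})
    return f"mongodb://{','.join(hosts)}/?{options}"


# Placeholder hosts; substitute the members of your own replica set.
hosts = ["db1.example.com:27017", "db2.example.com:27017", "db3.example.com:27017"]
uri = replica_set_uri(hosts, "rs0", read_preference="secondaryPreferred")
print(uri)
```

With secondaryPreferred, reads go to a secondary when one is available and fall back to the primary otherwise.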
Scalability
Both Cassandra and MongoDB scale seamlessly, both vertically and horizontally. The documented limits on the number of nodes in a cluster are generous, but keep in mind that there are practical limits depending on your server architecture.
Cassandra vs. MongoDB: Key Differences
As alluded to earlier, there are a number of differences between Cassandra and MongoDB, including query language / data model, architecture, aggregation, consistency, data types, and more.
Query Language / Data Model
The most striking difference between the databases is the query language and data model. Cassandra mimics the SQL of a traditional RDBMS with its own Cassandra Query Language (CQL) and a column-and-row table structure that allows different columns to be stored per row. MongoDB has a JavaScript-based command line interface and organizes its JSON-like documents into collections and databases. MongoDB provides the most flexibility for defining a fixed or variable schema for your data.
Architecture
As previously discussed, MongoDB uses a primary/secondary architecture in which one and only one node accepts writes, with the rest serving read requests. Cassandra requires each node to be an active member of a ring, and any node can pass a query to the appropriate node where the data resides. Because of this, Cassandra stays available when a node fails. In a MongoDB replica set, failover can take up to a minute while a secondary is promoted to writable primary.
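The ring idea can be sketched in a few lines. This is a deliberately simplified toy, not Cassandra's implementation: each node gets a single token (Cassandra normally assigns many virtual nodes per server), MD5 stands in for the default Murmur3 partitioner, and replicas are simply the next nodes clockwise on the ring. The class and node names are invented for the example.

```python
import bisect
import hashlib


class Ring:
    """Toy token ring: each node owns one token; a key belongs to the next token clockwise."""

    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = min(replication_factor, len(nodes))
        # Keep (token, node) pairs sorted so a binary search can find the owner.
        self.tokens = sorted((self._token(n), n) for n in nodes)

    @staticmethod
    def _token(value):
        # Stand-in hash (MD5); an assumption for this sketch, not Cassandra's partitioner.
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def replicas(self, partition_key):
        """Return the nodes holding a key: the owner plus the next nodes on the ring."""
        start = bisect.bisect_left(self.tokens, (self._token(partition_key), ""))
        return [
            self.tokens[(start + i) % len(self.tokens)][1]
            for i in range(self.replication_factor)
        ]


ring = Ring(["node-a", "node-b", "node-c", "node-d"], replication_factor=3)
replicas = ring.replicas("sensor-42")
print(replicas)

# If one replica goes down, the others still hold the data and can serve requests.
down = replicas[0]
available = [n for n in replicas if n != down]
print(available)
```

Any node can run the same lookup, which is why a client can send a request to whichever node it likes and have it forwarded to the replicas that own the key.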
Aggregation
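MongoDB includes an aggregation framework that transforms data through a pipeline of stages, each one filtering or reshaping the output of the previous stage. Cassandra's CQL offers only basic aggregate functions, so heavier aggregation typically requires an add-on such as Apache Hadoop or Spark.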
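To make the idea of staged aggregation concrete, the sketch below writes a pipeline in the shape MongoDB's aggregation framework uses ($match, then $group, then $sort) and runs it through a toy evaluator in plain Python. The evaluator is an assumption of this example and supports only the three stage forms shown; the sample documents are invented. With a real driver such as PyMongo, a pipeline list like this would typically be passed to a collection's aggregate method instead.

```python
from collections import defaultdict

# Hypothetical documents; the collection contents are invented for illustration.
orders = [
    {"status": "shipped", "region": "east", "amount": 20},
    {"status": "shipped", "region": "west", "amount": 35},
    {"status": "pending", "region": "east", "amount": 50},
    {"status": "shipped", "region": "east", "amount": 5},
]

# Pipeline written in the shape MongoDB's aggregation framework uses.
pipeline = [
    {"$match": {"status": "shipped"}},
    {"$group": {"_id": "$region", "total": {"$sum": "$amount"}}},
    {"$sort": {"total": -1}},
]


def run_pipeline(docs, pipeline):
    """Toy evaluator covering only the three stage forms used above."""
    for stage in pipeline:
        (name, spec), = stage.items()
        if name == "$match":
            # Equality-only filter on top-level fields.
            docs = [d for d in docs if all(d.get(k) == v for k, v in spec.items())]
        elif name == "$group":
            key_field = spec["_id"].lstrip("$")
            out_field, sum_field = next(
                (k, v["$sum"].lstrip("$")) for k, v in spec.items() if k != "_id"
            )
            totals = defaultdict(int)
            for d in docs:
                totals[d[key_field]] += d[sum_field]
            docs = [{"_id": k, out_field: v} for k, v in totals.items()]
        elif name == "$sort":
            (field, direction), = spec.items()
            docs = sorted(docs, key=lambda d: d[field], reverse=(direction == -1))
        else:
            raise ValueError(f"unsupported stage: {name}")
    return docs


result = run_pipeline(orders, pipeline)
print(result)  # [{'_id': 'west', 'total': 35}, {'_id': 'east', 'total': 25}]
```

Each stage receives only what the previous stage emitted, so an early $match keeps later stages from touching documents they do not need.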
Consistency
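MongoDB is strong on consistency because every write in a replica set goes through a single primary, which gives all nodes the same ordered history of changes. Cassandra instead offers tunable consistency per operation, where stronger guarantees come at the cost of performance.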
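The arithmetic behind tunable consistency is short enough to sketch. With a replication factor RF, a read that contacts R replicas is guaranteed to overlap a write acknowledged by W replicas when R + W > RF. The function names below are invented for this example; ONE and QUORUM are Cassandra consistency levels.

```python
def quorum(replication_factor):
    """Number of replicas in a majority: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def reads_see_latest_write(replication_factor, write_replicas, read_replicas):
    """True when every read set must overlap every write set (R + W > RF)."""
    return read_replicas + write_replicas > replication_factor


rf = 3
print(quorum(rf))                                          # 2
print(reads_see_latest_write(rf, quorum(rf), quorum(rf)))  # True: QUORUM writes, QUORUM reads
print(reads_see_latest_write(rf, 1, 1))                    # False: ONE writes, ONE reads
```

Waiting on two replicas instead of one is where the performance cost comes from: each operation is only as fast as the slowest replica it has to hear back from.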
Data Types
MongoDB has a rich set of data types, including String, Numeric, Boolean, Min/Max keys, Arrays, Timestamps, Object, Null, Symbol, Date, Object ID, Binary, Code, and Regular Expression. Cassandra supports built-in types (similar to MongoDB's), collections (maps, sets, and lists), and user-defined data types. The maximum document size in MongoDB is 16 megabytes, whereas Cassandra allows up to 2 GB for a single column value (although values that large are not recommended).
Security
MongoDB and Cassandra both support role-based access control as well as node-to-node and client-to-node TLS/SSL transport security. MongoDB requires an enterprise license for LDAP and Kerberos authentication, whereas for Cassandra you can download a free extension that provides LDAP and Kerberos support.
Apache Cassandra 4.0 added audit logging to track user access and activity, along with full query logging that allows live traffic to be captured and replayed.
Encryption at rest is an enterprise feature for both MongoDB and Cassandra. Without an enterprise edition, you would need to explore volume-level or operating system file encryption for this functionality.
Licensing
MongoDB has a multi-license system. All versions released after October 16, 2018 are covered by the Server Side Public License (SSPL), while earlier versions are licensed under the GNU Affero General Public License v3. Enterprise versions of MongoDB are also available under their own licensing. Cassandra is simply licensed under the well-known Apache License v2.0.
Cassandra Use Cases
Cassandra is very good at handling write-heavy workloads where data is inserted and rarely updated, such as transaction logging, time series data, inventory tracking, Internet of Things status and events, weather tracking, and much more. Cassandra is also a great option for geographically distributed data.
Cassandra favors partition tolerance and availability over things like write consistency. While Cassandra is a great database, it does not have all the features of a relational database, such as transactions or a way to lock data. Joins are also not implemented in Cassandra.
MongoDB Use Cases
MongoDB is also an excellent choice for many of the big data workloads that Cassandra handles well, including analytics and time series data. MongoDB’s built-in aggregation framework allows pulling data into a central database, providing a single view of the data. Other great use cases for MongoDB include content management systems, product data management, real-time data integration, and high-speed data logging.
Final Thoughts
Choosing the right database is a big decision for any team, and one that should always center on the needs of the application now, and what those needs will be in the future. With that in mind, Cassandra and MongoDB can both be great options for inclusion in modern, scalable data stacks.
Get Technical Support for Your Open Source Data Stack
From ActiveMQ to MongoDB, OpenLogic can support the open source data technologies that power your enterprise. Talk with an expert today to see how 24/7/365 support delivered by our enterprise architects can help your team find success.