July 1, 2021

What Is Cassandra? Features, Benefits, and FAQ

Databases
Open Source

Apache Cassandra is an increasingly popular open source option for organizations that handle large amounts of unstructured data and need high availability and scalability for that data.

In this blog, we give an overview of Apache Cassandra, with details on how it works, how it's used, and a rundown of key features and benefits.


What Is Apache Cassandra?

Apache Cassandra is a NoSQL database written in Java. It offers high availability and scalability, and it can handle high volumes of data, including unstructured data types. Because it does not require a rigid, fixed schema and treats every node as an equal, Apache Cassandra can handle things like replication much more easily than many other databases.

Originally a brainchild of the developers at Facebook, Apache Cassandra was built to power search in the Facebook inbox. It was made open source in 2008 and became an Apache project in 2009, when it entered the Apache Incubator.

 

          | Related | Structured | Read  | Write | Consistency | Availability | Partition Tolerance
Cassandra |         | X          | Heavy | Heavy | Medium      | High         | High


How Does Apache Cassandra Work?

At its core, Apache Cassandra is a peer-to-peer system whose design draws on two earlier systems: Amazon's Dynamo and Google's Bigtable. Every node in the cluster can accept reads and writes, which eliminates the need for a master node; each node is treated as an equal. When thinking of a cluster, it's easier to envision groups of data centers rather than just individual servers. The beauty of Apache Cassandra is that you can keep adding nodes to the cluster and expand your database as you need to.
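To make the masterless idea more concrete, here is a minimal sketch of a hash ring in Python. It is only an illustration: the `Ring` class, the node names, and the use of MD5 as the hash are assumptions made for this example, and real Cassandra uses its own partitioner and virtual nodes rather than one token per node. What it shows is that any node can work out where a key lives from the key's hash alone, with no master to ask.

```python
import bisect
import hashlib


def token(value: str) -> int:
    """Map a string to a ring position (MD5 is a stand-in chosen for this sketch)."""
    return int.from_bytes(hashlib.md5(value.encode("utf-8")).digest()[:8], "big")


class Ring:
    """Toy masterless ring: each node owns one token, and any node can locate data."""

    def __init__(self, nodes, replication_factor=3):
        self.replication_factor = replication_factor
        # Keep nodes sorted by token so the ring can be walked clockwise.
        self.entries = sorted((token(name), name) for name in nodes)
        self.tokens = [t for t, _ in self.entries]

    def add_node(self, name):
        """Growing the cluster just places one more token on the ring."""
        self.entries = sorted(self.entries + [(token(name), name)])
        self.tokens = [t for t, _ in self.entries]

    def replicas(self, partition_key):
        """Return the first node at or after the key's token, plus the next ones clockwise."""
        start = bisect.bisect_left(self.tokens, token(partition_key)) % len(self.entries)
        count = min(self.replication_factor, len(self.entries))
        return [self.entries[(start + i) % len(self.entries)][1] for i in range(count)]


ring = Ring(["node-a", "node-b", "node-c", "node-d"])
print(ring.replicas("user:42"))  # three distinct nodes, chosen by hash alone
ring.add_node("node-e")
print(ring.replicas("user:42"))  # placement is recomputed from the new ring
```

Because placement depends only on the hash and the list of nodes, every peer reaches the same answer independently, which is why no single coordinator is needed.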


What Is Apache Cassandra Used For?

Apache Cassandra can handle structured, semi-structured, and unstructured data, which gives it a great deal of flexibility. It's designed to be used across multiple data centers, which makes data distribution easy. Cassandra isn't a traditional database, and it is not fully ACID (Atomicity, Consistency, Isolation, and Durability) compliant in the relational sense. It does, however, offer atomic and isolated writes within a single partition, durable writes, and consistency that can be tuned per operation.

One of Apache Cassandra's biggest strengths is its ability to create an environment without a single point of failure. This decentralized approach makes it a great fit for organizations that have constantly growing or changing data needs, or that have data that can't ever go down.


Five Apache Cassandra Features and Benefits

There are a number of features that make Apache Cassandra an attractive option for enterprises, but its scalability, write speed, fault tolerance, query language, and capacity for tuning make it stand out.

1. High Scalability

Adding nodes to a Cassandra cluster is meant to be easy, and you can do it at any time as your needs grow. Instead of growing vertically (bigger servers), Apache Cassandra is meant to grow horizontally (more servers), as much as you need it to and across as many geographical sites as needed.

2. Fast Writes

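The way that Cassandra handles data allows it to write to the database quickly. A write is appended to a commit log and stored in an in-memory table, which is later flushed to disk in sorted batches, so there is no need to read or rearrange existing data first. Combined with its flexible data model, this means you can essentially just chuck your data into the database at very high speeds.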
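As a rough illustration of why this style of write is cheap, the sketch below models a log-structured write path in Python: append to a log, update an in-memory table, and flush sorted batches once memory fills up. The `ToyNode` class and its `memtable_limit` parameter are hypothetical names for this example, and the sketch leaves out a great deal that a real node does (compaction, commit log cleanup, and so on).

```python
class ToyNode:
    """Toy write path: append to a log, update memory, flush sorted batches."""

    def __init__(self, memtable_limit=3):
        self.commit_log = []   # append-only record of every write (for recovery)
        self.memtable = {}     # most recent writes, held in memory
        self.sstables = []     # immutable sorted batches, standing in for files on disk
        self.memtable_limit = memtable_limit

    def write(self, key, value):
        # No read-before-write: the write only appends and updates memory.
        self.commit_log.append((key, value))
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Persist the memtable as one sorted, immutable batch and start fresh.
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}

    def read(self, key):
        # Newest data wins: check memory first, then batches from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):
            for k, v in table:
                if k == key:
                    return v
        return None


node = ToyNode(memtable_limit=2)
node.write("a", 1)
node.write("b", 2)     # the memtable is now full, so it is flushed
node.write("a", 3)     # the newer value for "a" sits in the fresh memtable
print(node.read("a"))  # 3
```

Note the trade-off the sketch makes visible: writes stay simple and sequential, while reads may have to consult several places to find the newest value.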

3. Fault Tolerant

Because all nodes are treated as equals and data is replicated across several of them, it's not a big deal when one goes down. With enough nodes and replicas, a full-blown “lights out” scenario becomes very unlikely.

4. Cassandra Query Language

Structured Query Language (SQL) is built for relational databases, which are better suited to scaling vertically and which deal with table-based data and fixed schemas for moderate volumes of data. Cassandra has its own Cassandra Query Language (CQL), which looks much like SQL but leaves out joins. Because Cassandra is NoSQL, you can distribute data horizontally across the cluster more easily, you have the potential for massive scalability, and you are not subject to the confines of joins and fixed schemas.

5. Tunable Consistency

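Apache Cassandra lets you choose, for each read or write, how many replicas must respond before the operation counts as successful (for example ONE, QUORUM, or ALL), so you can trade consistency against latency and availability. It also allows for a great deal of performance tuning on top of your typical JVM performance tuning. One option that is often overlooked is table-level compression, which is enabled by default and can be configured when creating or altering a table.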
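Tunable consistency, the feature this section is named for, comes down to simple arithmetic: a read is guaranteed to see the latest acknowledged write when the number of replicas that acknowledge the write plus the number consulted on the read is greater than the replication factor. The short Python sketch below works through that rule for a replication factor of 3. The function names are made up for this example; ONE, QUORUM, and ALL are Cassandra consistency levels.

```python
def quorum(replication_factor: int) -> int:
    """A majority of replicas: floor(RF / 2) + 1."""
    return replication_factor // 2 + 1


def overlaps(replication_factor: int, write_acks: int, read_acks: int) -> bool:
    """True when every read set must share at least one replica with every write set."""
    return write_acks + read_acks > replication_factor


rf = 3
levels = {"ONE": 1, "QUORUM": quorum(rf), "ALL": rf}
for write_level, w in levels.items():
    for read_level, r in levels.items():
        print(f"write={write_level:<6} read={read_level:<6} overlap={overlaps(rf, w, r)}")
```

With a replication factor of 3, writing and reading at QUORUM (2 + 2 > 3) overlaps, while writing and reading at ONE (1 + 1 > 3 is false) does not, which is the faster but weaker setting.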


Apache Cassandra FAQ

In this FAQ, we answer common, high-level questions surrounding Apache Cassandra.

Is Cassandra a Relational Database?

No, Cassandra is not a relational database, as its design does not support the relational data model. To elaborate, a relational model assumes all data is represented as n-ary relations, each of which is a subset of the Cartesian product of n domains. Cassandra differs by modeling data as a partitioned key-value store, with the values represented as rows. There is no enforcement that all rows in a table carry values for the same columns, which the relational model requires.
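A small Python sketch can make the key-value picture easier to see. It models a table as partitions looked up by key, where each row stores only the columns it was given. The `upsert` and `select` helpers and the sensor data are hypothetical, invented for this illustration, and the sketch ignores everything about how real Cassandra stores data on disk.

```python
# partition key -> clustering key -> {column name: value}
table = {}


def upsert(partition_key, clustering_key, **columns):
    """Write only the columns supplied; columns that are absent are simply not stored."""
    rows = table.setdefault(partition_key, {})
    rows.setdefault(clustering_key, {}).update(columns)


def select(partition_key):
    """Return one partition's rows in clustering-key order (no joins involved)."""
    return sorted(table.get(partition_key, {}).items())


upsert("sensor-1", "2021-07-01T10:00", temperature=21.5)
upsert("sensor-1", "2021-07-01T10:05", temperature=21.7, humidity=40)
upsert("sensor-2", "2021-07-01T10:00", battery=0.93)

for clustering_key, row in select("sensor-1"):
    print(clustering_key, row)
```

The two rows for `sensor-1` hold different sets of columns, and a lookup touches a single partition by its key rather than joining tables.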

Is Cassandra a NoSQL Database?

Yes, Cassandra is a NoSQL database. It uses a non-relational data model, which is part of what lets it take in and process massive amounts of data at high speed.

Is Cassandra Free?

Yes, Cassandra is free to use and licensed under the Apache License 2.0.

Is Cassandra Popular?

Yes. Cassandra is incredibly popular and is used by thousands of companies around the world. The project website features case studies from some of the largest brands in the world. At the time of writing, it is also ranked #11 on DB-Engines and is the #1 ranked wide column database.

Is Cassandra Open Source?

Yes, Cassandra is an open source Apache project operating under the Apache License 2.0.


Final Thoughts

Apache Cassandra is meant for systems that need to store a lot of data and distribute that data as widely as possible. Companies that need to write a lot of data quickly and reliably will be successful with it. However, for businesses more interested in rock-solid integrity or preservation of data structure, a document database may be a better choice.

Get Guidance, Support, and Services for Apache Cassandra

Whether you are implementing or considering Cassandra for your stack, OpenLogic can help. Contact an expert today to see how we can make your open source database journey a success.
