Cassandra Architecture and Its Key Terms – Complete Guide

1. Objective

In our last Cassandra Tutorial, we saw Cassandra Applications. Today, we will learn about Cassandra Architecture. Before starting we should be familiar with some key terms of Cassandra Architecture.
So, let’s learn Cassandra Architecture in detail.

2. Key Terms Of Cassandra Architecture

Below, we discuss some key terms in the architecture of Cassandra:

a. Cassandra Nodes

A node is the basic unit of Cassandra. Data is stored on these units (a computer or server).

b. Cassandra Data Center

A Cassandra data center is a collection of related Cassandra nodes. More generally, a data center is a centralized place that houses the computer and networking systems serving an organization’s information technology needs.

c. Cassandra Rack

A rack is a unit that holds multiple servers stacked on top of one another. A node is a single server in a rack.

d. Cassandra Cluster

One or more data centers form a Cassandra cluster. A cluster can span multiple physical locations.
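
The relationship between these four terms can be sketched as a simple containment hierarchy. This is only an illustrative model, not Cassandra's own code; the class names and the host names (`n1`, `n2`, ...) are hypothetical.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    name: str  # hypothetical host name


@dataclass
class Rack:
    name: str
    nodes: List[Node] = field(default_factory=list)


@dataclass
class DataCenter:
    name: str
    racks: List[Rack] = field(default_factory=list)


@dataclass
class Cluster:
    name: str
    data_centers: List[DataCenter] = field(default_factory=list)

    def all_nodes(self):
        """Yield (data center, rack, node) names for every node in the cluster."""
        for dc in self.data_centers:
            for rack in dc.racks:
                for node in rack.nodes:
                    yield dc.name, rack.name, node.name


# A made-up cluster: two data centers, three racks, four nodes.
cluster = Cluster("demo", [
    DataCenter("dc1", [Rack("rack1", [Node("n1"), Node("n2")]),
                       Rack("rack2", [Node("n3")])]),
    DataCenter("dc2", [Rack("rack1", [Node("n4")])]),
])

for location in cluster.all_nodes():
    print(location)
```

Each node is thus identified by its data center and rack, which is the information the replication strategies described later rely on.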

[Figure: Cassandra Architecture - Cassandra Cluster]

e. Cassandra Commit log

Every write operation is first recorded in the commit log to ensure the durability of the data. Once the data has been flushed to an SSTable, the corresponding commit log data can be archived, deleted, or recycled. The commit log thus acts as a crash recovery mechanism.

f. MemTables

A memtable is a temporary in-memory structure that buffers writes, including updates and deletions. Data is written to the memtable after it has been written to the commit log. When a memtable is full, we flush it to disk to form an SSTable.

g. SSTables

SSTables are the immutable data files to which Cassandra periodically flushes memtables. They are written sequentially and never modified in place, so new data always goes to a new file, which keeps disk writes sequential. Cassandra maintains SSTables separately for each table.
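
To see how the commit log, memtable, and SSTables fit together, here is a minimal toy sketch of the write path on a single node. It is a simplification under stated assumptions, not Cassandra's real implementation: the class `ToyNode` and the `memtable_limit` threshold are hypothetical, and real SSTables are on-disk files rather than tuples in memory.

```python
class ToyNode:
    """Toy model of one node's write path (illustration only)."""

    def __init__(self, memtable_limit=2):
        self.commit_log = []   # append-only record of writes not yet flushed
        self.memtable = {}     # in-memory buffer: key -> latest value
        self.sstables = []     # immutable "files", oldest first
        self.memtable_limit = memtable_limit  # hypothetical flush threshold

    def write(self, key, value):
        self.commit_log.append((key, value))  # 1. record the write for durability
        self.memtable[key] = value            # 2. then update the memtable
        if len(self.memtable) >= self.memtable_limit:
            self.flush()                      # 3. flush when the memtable is full

    def flush(self):
        # Write the memtable out as a sorted, immutable SSTable.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        self.commit_log.clear()  # flushed data no longer needs to be replayed

    def read(self, key):
        # Check the memtable first, then SSTables from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):
            for k, v in sstable:
                if k == key:
                    return v
        return None


node = ToyNode(memtable_limit=2)
node.write("a", 1)
node.write("b", 2)   # memtable is now full, so it is flushed to an SSTable
node.write("a", 3)   # newer value for "a" lives in the memtable only
print(node.sstables, node.memtable, node.read("a"))
```

Note that the old value of `a` still sits in the first SSTable; nothing is overwritten in place. Cleaning up such outdated entries is the job of compaction.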

h. Data Replication

Imagine that one of the nodes in a data center goes down: the part of the data stored only on that node would be lost. To overcome this limitation, Cassandra keeps replicas of the data on several nodes. This is called replication, and it ensures fault tolerance and reliability.

3. What is Cassandra Architecture?

Cassandra is designed with hardware failure in mind, so it has contingency mechanisms to survive such failures. It has a ring-type structure, i.e. its nodes are logically arranged in a ring. There are no master or slave nodes; every node plays the same role.

It makes replicas of data on several homogeneous nodes of the cluster. The nodes exchange state information with one another every second using a gossip protocol. A sequentially written commit log on each node captures write activity to ensure data durability.

This data is then indexed and written to a memtable. Once the memtable is full, the data is written to disk as an SSTable data file.

All the data is partitioned and replicated to other nodes automatically. Using a process known as compaction, Cassandra periodically merges SSTables and removes outdated data and tombstones (markers for deleted data).
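
The idea behind compaction can be illustrated with a small sketch. This is a deliberately simplified model: it assumes SSTables are passed oldest first so that later tables win, whereas real Cassandra resolves conflicts by write timestamp and keeps tombstones for a grace period before dropping them. The names `compact` and `TOMBSTONE` are hypothetical.

```python
TOMBSTONE = object()  # hypothetical marker standing in for a deleted value


def compact(sstables):
    """Merge SSTables (given oldest first) into a single SSTable.

    Keeps only the newest value for each key and drops keys whose
    newest value is a tombstone.
    """
    merged = {}
    for sstable in sstables:          # later (newer) tables overwrite older ones
        for key, value in sstable:
            merged[key] = value
    live = [(k, v) for k, v in merged.items() if v is not TOMBSTONE]
    return tuple(sorted(live, key=lambda item: item[0]))


older = (("a", 1), ("b", 2))
newer = (("a", 3), ("b", TOMBSTONE), ("c", 4))  # "a" updated, "b" deleted
compacted = compact([older, newer])
print(compacted)
```

Here the outdated value of `a` and the deleted key `b` disappear, leaving one smaller SSTable to read from.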

A client can send a read or write request to any node in the cluster. That node, called the coordinator, acts as a proxy between the client application and the nodes that hold the required data.

a. Data Replication

As we now know, Cassandra makes replicas of data on several nodes to avoid a single point of failure. Two things are important for understanding the process correctly:

  1. Replication Factor: The replication factor is the number of copies of the data maintained on different nodes. A replication factor of 3 means that 3 copies of the data are maintained on 3 different nodes. So if 2 of those nodes go down, we still have one copy of the data safe.
  2. Replication Strategy: There are two replication strategies.

Simple strategy: This strategy is used when there is only one data center. The first replica is placed on the node that owns the data, and the remaining replicas are placed on the next nodes clockwise around the ring.

[Figure: Cassandra Architecture - Simple Strategy]
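
Replica placement under the simple strategy can be sketched as a clockwise walk around the ring. The sketch below is an illustration under assumptions, not Cassandra's code: it assumes each node owns exactly one token (no virtual nodes), and the function name, tokens, and node names are made up.

```python
import bisect


def simple_strategy_replicas(ring, key_token, replication_factor):
    """Pick replica nodes for a key on a token ring.

    ring: list of (token, node_name) pairs sorted by token.
    The first replica is the first node whose token is >= key_token
    (wrapping around to the start of the ring); the remaining replicas
    are the next nodes clockwise.
    """
    tokens = [token for token, _ in ring]
    start = bisect.bisect_left(tokens, key_token) % len(ring)
    count = min(replication_factor, len(ring))  # cannot exceed the node count
    return [ring[(start + i) % len(ring)][1] for i in range(count)]


# Hypothetical four-node ring with evenly spaced tokens.
ring = [(0, "A"), (25, "B"), (50, "C"), (75, "D")]

replicas = simple_strategy_replicas(ring, key_token=60, replication_factor=3)
print(replicas)
```

With a replication factor of 3, a key whose token is 60 lands on node D and is then copied to A and B, wrapping around the ring.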

Network topology strategy: This strategy is highly recommended because it makes it easy to expand to multiple data centers in the future.

[Figure: Cassandra Architecture - Network Topology Strategy]

Here the replicas for each data center are placed separately. Within a data center, replicas are placed clockwise around the ring on different racks of that data center. This walk continues until it reaches the first node in another rack.
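
A rough sketch of this rack-aware placement follows. It is a simplified approximation and not Cassandra's actual algorithm: it assumes one token per node, and when a data center has fewer racks than requested replicas it simply falls back to the nodes it skipped, in ring order. All names and tokens are hypothetical.

```python
import bisect


def nts_replicas(ring, key_token, rf_per_dc):
    """Pick replicas per data center, preferring distinct racks.

    ring: list of (token, node, data_center, rack) tuples sorted by token.
    rf_per_dc: dict mapping data center name -> number of replicas wanted.
    """
    tokens = [entry[0] for entry in ring]
    start = bisect.bisect_left(tokens, key_token) % len(ring)
    # One full clockwise pass over the ring, starting at the key's position.
    walk = [ring[(start + i) % len(ring)] for i in range(len(ring))]

    replicas = {}
    for dc, rf in rf_per_dc.items():
        chosen, skipped, racks_used = [], [], set()
        for _, node, node_dc, rack in walk:
            if node_dc != dc:
                continue                # each data center is handled separately
            if rack in racks_used:
                skipped.append(node)    # rack already holds a replica
            else:
                chosen.append(node)
                racks_used.add(rack)
        # Use skipped nodes only if there were not enough distinct racks.
        replicas[dc] = (chosen + skipped)[:rf]
    return replicas


# Hypothetical six-node ring spread over two data centers.
ring = [
    (0, "n1", "dc1", "r1"), (10, "n2", "dc2", "r1"), (20, "n3", "dc1", "r1"),
    (30, "n4", "dc2", "r2"), (40, "n5", "dc1", "r2"), (50, "n6", "dc2", "r1"),
]

placement = nts_replicas(ring, key_token=5, rf_per_dc={"dc1": 2, "dc2": 2})
print(placement)
```

In this example each data center ends up with two replicas on two different racks, so losing a whole rack still leaves a copy in every data center.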

So, this was all about Cassandra architecture and its key terms. We hope you liked our explanation.

4. Conclusion

Hence, we saw the Cassandra architecture. Moreover, we discussed the key terms of Cassandra architecture, such as nodes, data centers, SSTables, memtables, clusters, and the commit log. We also looked at data replication, the replication factor, and replication strategies.

Finally, we discussed the simple strategy and the network topology strategy. In the next article, we will learn about the Cassandra Data Model. Furthermore, if you have any queries, feel free to ask in the comment section.

DataFlair Team
