Cassandra Architecture and Its Key Terms – Complete Guide
1. Objective
In our last Cassandra Tutorial, we saw Cassandra Applications. Today, we will learn about Cassandra Architecture. Before starting, we should be familiar with some key terms of Cassandra Architecture.
So, let’s learn Cassandra Architecture in detail.
2. Key Terms Of Cassandra Architecture
Below, we discuss some key terms in the architecture of Cassandra:
a. Cassandra Nodes
A node is the basic unit of Cassandra. It is the computer or server in which data is stored.
b. Cassandra Data Center
A Cassandra data center is a collection of related Cassandra nodes. More generally, a data center is a centralized place that houses computer and networking systems to meet an organization’s information technology needs.
c. Cassandra Rack
A rack is a unit that holds multiple servers stacked on top of one another. A node is a single server in a rack.
d. Cassandra Cluster
A collection of one or more data centers forms a Cassandra cluster. A cluster can span multiple physical locations.
e. Cassandra Commit log
Every write operation is first recorded in the commit log to ensure the durability of the data. After the data has been flushed to an SSTable, the corresponding commit log data can be archived, deleted, or recycled. The commit log acts as a crash recovery mechanism.
f. MemTables
A memtable is a temporary in-memory location where data is written during updates or deletions. Data is written to the memtable after it has been written to the commit log. When a memtable is full, it is flushed to disk to form an SSTable.
g. SSTables
SSTables are the immutable data files to which Cassandra periodically flushes memtables. They are append-only: data is written sequentially and is never modified in place, which allows sequential storage on disk. Cassandra maintains SSTables for each table.
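The relationship between the commit log, memtables, and SSTables can be sketched with a toy model. This is only an illustration, not Cassandra's real implementation: the `ToyNode` class, its method names, and the `memtable_limit` threshold (counted in keys rather than bytes) are all assumptions made for the example.

```python
class ToyNode:
    """Toy model of one node's write path (not Cassandra's real code)."""

    def __init__(self, memtable_limit=3):
        self.commit_log = []   # append-only record of writes, kept for crash recovery
        self.memtable = {}     # in-memory buffer of recent writes
        self.sstables = []     # immutable, sorted snapshots flushed from the memtable
        self.memtable_limit = memtable_limit  # hypothetical "full" threshold, in keys

    def write(self, key, value):
        # 1. Record the write in the commit log first, for durability.
        self.commit_log.append((key, value))
        # 2. Then apply it to the memtable.
        self.memtable[key] = value
        # 3. Flush once the memtable is full.
        if len(self.memtable) >= self.memtable_limit:
            self.flush()

    def flush(self):
        # Write the memtable out as a sorted, immutable SSTable.
        self.sstables.append(tuple(sorted(self.memtable.items())))
        self.memtable = {}
        # The flushed data is now "on disk", so its commit log entries can go.
        self.commit_log.clear()

    def read(self, key):
        # Check the memtable first, then SSTables from newest to oldest.
        if key in self.memtable:
            return self.memtable[key]
        for sstable in reversed(self.sstables):
            for k, v in sstable:
                if k == key:
                    return v
        return None


node = ToyNode(memtable_limit=3)
for k, v in [("a", 1), ("b", 2), ("c", 3), ("d", 4)]:
    node.write(k, v)
```

After these four writes, the first three keys have been flushed into one SSTable, while `"d"` is still held only in the memtable and the commit log.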
h. Data Replication
Imagine that one of the nodes in a data center goes down: part of the data would be lost. To overcome this limitation, Cassandra keeps replicas of data on multiple nodes. This is called replication, and it ensures fault tolerance and reliability.
3. What is Cassandra Architecture?
Cassandra takes hardware failure into consideration and has contingency plans to cope with such failures. It has a ring-type structure, i.e. its nodes are logically distributed like a ring, so it has no master or slave nodes.
It makes replicas of data on several homogeneous nodes of the cluster. The nodes of the cluster exchange state information with one another every second. A sequentially written commit log on each node captures write activity to ensure data durability.
This data is then indexed and written to a memtable. Once the memtable is full, the data is written to disk in an SSTable data file.
All the data is partitioned and replicated to other nodes automatically. Using a process known as compaction, Cassandra periodically consolidates SSTables and removes outdated data and tombstones.
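Compaction can be pictured as a merge that keeps only the newest version of each key. The sketch below is a simplification under stated assumptions: each SSTable is modelled as a dict of key to `(timestamp, value)`, and `TOMBSTONE` and `compact` are hypothetical names. Real Cassandra also waits for a grace period (the `gc_grace_seconds` table option) before discarding tombstones, which this example ignores.

```python
TOMBSTONE = object()  # hypothetical marker meaning "this key was deleted"


def compact(sstables):
    """Merge several toy SSTables into one, keeping the newest version of each key.

    Each SSTable is a dict of key -> (timestamp, value).
    """
    latest = {}
    for sstable in sstables:
        for key, (ts, value) in sstable.items():
            # Keep the entry with the highest timestamp for each key.
            if key not in latest or ts > latest[key][0]:
                latest[key] = (ts, value)
    # Drop keys whose newest version is a tombstone; emit the rest sorted by key.
    return {k: latest[k] for k in sorted(latest) if latest[k][1] is not TOMBSTONE}


old = {"a": (1, "apple"), "b": (1, "banana")}
new = {"a": (2, "avocado"), "b": (3, TOMBSTONE), "c": (2, "cherry")}
merged = compact([old, new])
```

Here the outdated value for `"a"` is replaced by the newer one, and `"b"` disappears because its most recent entry is a tombstone.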
A client can make a read/write request to any node in the cluster. That node, called the coordinator, acts as a proxy between the client’s application and the nodes that hold the required data.
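To illustrate how a coordinator might find the node that holds the data, here is a minimal sketch of a token ring lookup. The `ToyRing` class, the `token_for` helper, and the 32-bit ring size are assumptions for illustration. MD5 is used only as a convenient stand-in hash; Cassandra's default partitioner uses Murmur3.

```python
import bisect
import hashlib


def token_for(key, ring_size=2**32):
    # Stand-in hash for illustration; not Cassandra's actual partitioner.
    digest = hashlib.md5(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % ring_size


class ToyRing:
    def __init__(self, node_tokens):
        # node_tokens: dict of node name -> token assigned to that node
        self.entries = sorted((tok, name) for name, tok in node_tokens.items())
        self.tokens = [tok for tok, _ in self.entries]

    def owner(self, token):
        # The owner is the first node whose token is >= the key's token,
        # wrapping around to the start of the ring if necessary.
        i = bisect.bisect_left(self.tokens, token) % len(self.tokens)
        return self.entries[i][1]

    def coordinate(self, contacted_node, key):
        # Any node can be contacted; it forwards the request to the owner.
        return {"coordinator": contacted_node, "owner": self.owner(token_for(key))}


ring = ToyRing({"n1": 2**30, "n2": 2**31, "n3": 3 * 2**30})
route = ring.coordinate("n3", "user:42")
```

Whichever node the client contacts becomes the coordinator for that request, while the owner is determined only by the key's token and the ring layout.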
a. Data Replication
As we now know, Cassandra makes replicas of data on several nodes to avoid a single point of failure. Two things are important for understanding the process correctly:
- Replication Factor: The replication factor is the number of copies of the data maintained on different nodes. A replication factor of 3 means that 3 copies of the data are maintained on 3 different nodes, so if 2 of those nodes go down, one copy of the data is still safe.
- Replication Strategy: There are two replication strategies.
Simple strategy: This strategy is used when there is only one data center. The first replica is placed on the node that owns the data, and the remaining replicas are placed on the next nodes clockwise around the ring.
Network topology strategy: This strategy is highly recommended because it makes it easier to expand to multiple data centers in the future.
Here, replicas are placed separately for each data center. Within a data center, replicas are placed by walking the ring clockwise until reaching the first node in another rack, so that copies land on different racks of the same data center.
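The two replication strategies can be compared with a small sketch. It assumes the ring is given as a list of nodes already sorted in token order and that `start` is the index of the node owning the data. The function names and the fallback rule for data centers with fewer racks than replicas are simplifications for illustration, not Cassandra's exact algorithm.

```python
def simple_strategy(ring, start, rf):
    """ring: list of node names in token order; start: index of the owning node."""
    rf = min(rf, len(ring))
    # Take the owner and the next rf - 1 nodes clockwise, wrapping around.
    return [ring[(start + i) % len(ring)] for i in range(rf)]


def network_topology_strategy(ring, start, rf_per_dc):
    """ring: list of (node, dc, rack) in token order; rf_per_dc: dc -> replica count."""
    replicas = []
    for dc, rf in rf_per_dc.items():
        chosen, used_racks, skipped = [], set(), []
        for i in range(len(ring)):
            node, node_dc, rack = ring[(start + i) % len(ring)]
            if node_dc != dc or len(chosen) == rf:
                continue
            if rack in used_racks:
                skipped.append(node)  # same rack as an earlier replica: fallback only
            else:
                chosen.append(node)
                used_racks.add(rack)
        # If the data center has fewer racks than replicas, reuse skipped nodes.
        chosen += skipped[: rf - len(chosen)]
        replicas += chosen
    return replicas


ring = [
    ("n1", "dc1", "r1"), ("n2", "dc1", "r1"), ("n3", "dc1", "r2"),
    ("n4", "dc2", "r1"), ("n5", "dc2", "r2"), ("n6", "dc1", "r3"),
]
simple = simple_strategy(["n1", "n2", "n3", "n4"], start=2, rf=3)
topo = network_topology_strategy(ring, start=0, rf_per_dc={"dc1": 2, "dc2": 1})
```

With the simple strategy, the replicas are `n3`, `n4`, and `n1`, in ring order. With the network topology strategy, `n2` is skipped in `dc1` because it shares rack `r1` with `n1`, so the replicas are `n1`, `n3`, and `n4`.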
So, this was all about Cassandra architecture and the key terms of Cassandra Architecture. We hope you liked our explanation.
4. Conclusion
Hence, we saw Cassandra architecture. Moreover, we discussed the different key terms of Cassandra Architecture, such as Cassandra nodes, data centers, SSTables, memtables, the Cassandra cluster, and the commit log. We also looked at data replication, the replication factor, and replication strategies.
Finally, we discussed the Simple Strategy and the Network Topology Strategy. In the next article, we will learn about the Cassandra Data Model. Furthermore, if you have any queries, feel free to ask in the comment section.