
MongoDB Replication – 2 Major Strategies of Sharding in MongoDB



After studying MongoDB Aggregation, it is time to learn MongoDB Replication and Sharding. A replica set is a group of instances that maintain the same data set, and a sharded cluster consists of 3 components.

Here, we will explore how to set up a replica set, the two sharding strategies, and how sharded and non-sharded collections work.

So, are you ready to explore MongoDB Replication and Sharding?

What is MongoDB Replication?

In MongoDB, a replica set is a group of mongod instances that maintain the same data set. It contains several data-bearing nodes and, optionally, one arbiter node.

Of the data-bearing nodes, exactly one is the primary node, while the others are secondary nodes. The primary node receives all write operations. A replica set can have only one primary capable of confirming writes with the { w: "majority" } write concern.

The secondary nodes replicate the primary's operation log (oplog) and apply the operations to their own data sets, so that their data reflects the primary's data. If the primary node becomes unavailable, the eligible secondary nodes hold an election to choose a new primary from among themselves.

Arbiters do not hold a copy of the data set. An arbiter's purpose is to maintain a quorum in the replica set by responding to heartbeat and election requests from the other members. If your replica set has an even number of members, add an arbiter so that a majority of votes can be obtained in an election for the primary node.
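
To see why an odd number of voting members helps, the majority rule can be sketched in a few lines of JavaScript. This is only an arithmetic illustration; the function names electionMajority and faultTolerance are made up for this example and are not part of MongoDB.

```javascript
// Votes needed to elect a primary: strictly more than half of the voting members.
function electionMajority(votingMembers) {
  return Math.floor(votingMembers / 2) + 1;
}

// Number of voting members that can be unavailable while a primary can still be elected.
function faultTolerance(votingMembers) {
  return votingMembers - electionMajority(votingMembers);
}

for (const n of [2, 3, 4, 5]) {
  console.log(
    `${n} voting members: majority = ${electionMajority(n)}, ` +
      `tolerates ${faultTolerance(n)} failure(s)`
  );
}
```

With 2 voting members the set tolerates no failures, while adding an arbiter as a third voter lets it tolerate one. Going from 3 to 4 voters does not improve fault tolerance, which is why an even-sized set gains from one extra vote.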


Automatic Failover

When the primary node does not communicate with the other members of the set for longer than the configured electionTimeoutMillis period (10 seconds by default), an eligible secondary node calls for an election to nominate itself as the new primary.

The cluster tries to complete the election as quickly as possible so that it can resume normal operations.

The replica set cannot process write operations until the election completes successfully.
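
The failover rule can be pictured with a small JavaScript model. It is a simplified sketch, not MongoDB's actual election code: the helper names shouldCallElection and winsElection and the timestamps are assumptions made for this illustration, and only the 10-second default comes from the text above.

```javascript
// Default election timeout in milliseconds, as stated in the text.
const ELECTION_TIMEOUT_MILLIS = 10000;

// Hypothetical helper: a secondary calls an election once it has not heard
// from the primary for longer than the timeout.
function shouldCallElection(lastPrimaryHeartbeatMs, nowMs, timeoutMs = ELECTION_TIMEOUT_MILLIS) {
  return nowMs - lastPrimaryHeartbeatMs > timeoutMs;
}

// Hypothetical helper: a candidate becomes primary only with a strict
// majority of the votes of all voting members.
function winsElection(votesReceived, votingMembers) {
  return votesReceived > votingMembers / 2;
}

const lastHeartbeat = 0; // time of the last message from the primary (ms)
console.log(shouldCallElection(lastHeartbeat, 4000));  // false: primary seen 4 s ago
console.log(shouldCallElection(lastHeartbeat, 12000)); // true: silent for 12 s
console.log(winsElection(2, 3)); // true: 2 of 3 votes is a majority
console.log(winsElection(1, 3)); // false: no majority, so still no primary for writes
```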

How to Set Up a Replica Set in MongoDB?

Here, we will learn how to convert a standalone MongoDB instance to a replica set. Shut down the running instance, then restart it with the --replSet option using the following syntax:

mongod --port "PORT" --dbpath "YOUR_DB_DATA_PATH" --replSet "REPLICA_SET_INSTANCE_NAME"

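Now, we will take an example to understand it better. The following command starts a mongod instance on port 27017 as a member of a replica set named rs0: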

mongod --port 27017 --dbpath "D:\set up\mongodb\data" --replSet rs0

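Next, connect to the instance with the mongo shell and run rs.initiate() to initiate the replica set. Then, to add members to the replica set, use the following syntax (the host and port are passed as a single string):

> rs.add("HOST_NAME:PORT")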
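
As an alternative to adding members one by one, all members can be listed in a configuration document. The sketch below builds such a document in plain JavaScript; the helper name buildReplicaSetConfig and the localhost host names are assumptions for illustration, while the _id and members fields follow the shape that rs.initiate() accepts.

```javascript
// Build a replica set configuration document from a set name and a list of
// "host:port" strings. Each member gets a unique numeric _id.
function buildReplicaSetConfig(setName, hosts) {
  return {
    _id: setName,
    members: hosts.map((host, index) => ({ _id: index, host })),
  };
}

// Hypothetical hosts: three mongod instances on one machine, all started with --replSet rs0.
const replicaSetConfig = buildReplicaSetConfig("rs0", [
  "localhost:27017",
  "localhost:27018",
  "localhost:27019",
]);

console.log(JSON.stringify(replicaSetConfig, null, 2));

// In the mongo shell connected to one of the members, the document could then be passed as:
//   rs.initiate(replicaSetConfig)
```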

What is MongoDB Sharding?

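Sharding is a method for distributing data across multiple machines. A sharded cluster consists of the following components:

  1. Shard: each shard holds a subset of the sharded data.
  2. Mongos: the query router that sits between client applications and the sharded cluster.
  3. Config servers: they store the metadata and configuration settings for the cluster.

Within a sharded cluster in MongoDB, client applications send their requests to a mongos router, which uses the metadata held by the config servers to route each operation to the appropriate shards.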

MongoDB Sharding Strategies

MongoDB offers two sharding strategies:

  1. Hashed Sharding
  2. Ranged Sharding

i. Hashed Sharding in MongoDB

Hashed sharding computes a hash of the shard key field's value. Each chunk is then assigned a range of hashed values.

Even when shard key values are close to each other, their hashed values are unlikely to fall in the same chunk. This kind of sharding facilitates an even distribution of data.
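
The idea can be sketched in JavaScript as follows. This is a toy model under stated assumptions: FNV-1a is used only as a stand-in for MongoDB's internal hash function, the hash space is split into 4 equal chunks, and the names fnv1aHash and hashedChunkFor are made up for this example.

```javascript
// FNV-1a 32-bit hash. A stand-in only; MongoDB uses its own hash function internally.
function fnv1aHash(text) {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0; // convert to an unsigned 32-bit integer
}

// Split the 32-bit hash space into equal-width ranges and return the index
// of the chunk whose range contains the hashed shard key value.
function hashedChunkFor(shardKeyValue, chunkCount) {
  const chunkWidth = 2 ** 32 / chunkCount;
  return Math.floor(fnv1aHash(String(shardKeyValue)) / chunkWidth);
}

const HASHED_CHUNK_COUNT = 4;
const documentsPerChunk = new Array(HASHED_CHUNK_COUNT).fill(0);

// Insert 1000 consecutive ids and count how many land in each chunk.
for (let studentId = 1; studentId <= 1000; studentId++) {
  documentsPerChunk[hashedChunkFor(studentId, HASHED_CHUNK_COUNT)]++;
}

console.log(documentsPerChunk); // documents per chunk, for inspecting the spread
```

The chunk is chosen from the hashed value rather than from the id itself, so the position of a document no longer follows the ordering of the shard key.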

ii. Ranged Sharding in MongoDB

Ranged sharding divides data into ranges based on the shard key values. Each chunk is then assigned a range of shard key values.

Documents whose shard key values are close to each other are likely to be in the same chunk. The efficiency of ranged sharding depends on the shard key chosen. In the worst case, a poorly chosen shard key results in an uneven distribution of data, which negates some of the benefits of sharding in MongoDB.
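
A small JavaScript sketch shows both effects. The chunk boundaries on Studentid and the name rangedChunkFor are assumptions chosen for this illustration; each chunk covers a range with an inclusive lower bound and an exclusive upper bound.

```javascript
// Hypothetical chunk ranges on Studentid: chunk i holds values in [lower, upper).
const rangedChunks = [
  { lower: -Infinity, upper: 250 },
  { lower: 250, upper: 500 },
  { lower: 500, upper: 750 },
  { lower: 750, upper: Infinity },
];

// Return the index of the chunk whose range contains the shard key value.
function rangedChunkFor(shardKeyValue) {
  return rangedChunks.findIndex(
    (chunk) => shardKeyValue >= chunk.lower && shardKeyValue < chunk.upper
  );
}

// Close key values share a chunk, which keeps range queries local.
console.log([100, 101, 102].map(rangedChunkFor)); // → [ 0, 0, 0 ]

// Worst case: ever-increasing ids all land in the last chunk.
console.log([1001, 1002, 1003, 1004].map(rangedChunkFor)); // → [ 3, 3, 3, 3 ]
```

The second call illustrates the uneven distribution: with a monotonically increasing shard key, every new document is written to the same chunk.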

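Now we will take an example to study MongoDB sharding. The host names (ExamplesD, ServerD, S1, S2) are examples, and cfgRS is a placeholder name for the config server replica set, which current MongoDB versions require.

# Create the data directory for the config server
mkdir -p /data/exampledb
# On host ExamplesD: start the config server (run rs.initiate() on it once)
mongod --configsvr --replSet cfgRS --dbpath /data/exampledb --port 27019
# On host ServerD: start the mongos router and point it at the config server
mongos --configdb cfgRS/ExamplesD:27019 --port 27017
# Connect to the mongos router with the mongo shell
mongo --host ServerD --port 27017
// Add the shards (current versions expect the form "replSetName/host:port")
sh.addShard("S1:27017")
sh.addShard("S2:27017")
// Enable sharding on the database, then shard the collection on a compound key
sh.enableSharding("Studentdb")
sh.shardCollection("Studentdb.Student", { "Studentid" : 1, "StudentName" : 1 })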

Sharded and Non-Sharded Collections

A database in MongoDB can have a mixture of sharded and unsharded collections.

MongoDB sharded collections are partitioned and distributed across the shards in the cluster.

MongoDB unsharded collections are stored on the database's primary shard.

Connecting to a Sharded Cluster:

You must connect to a mongos router to interact with any collection in the sharded cluster. This applies to both sharded and unsharded collections.

Summary

Hence, we have learned about MongoDB replication and sharding. We studied automatic failover, setting up a replica set, and the strategies for sharding in MongoDB. So, this was all about MongoDB replication and sharding. We hope you liked our explanation. If you have any queries, please post them in the comment section.
