How to change the replication factor for existing data already present in HDFS?

This topic contains 1 reply, has 1 voice, and was last updated by  dfbdteam3 1 year, 6 months ago.

Viewing 2 posts - 1 through 2 (of 2 total)
  • Author
    Posts
  • #4690

    dfbdteam3
    Moderator

    A file has already been loaded into HDFS, and its replication factor now needs to be changed. How can this be done?

    #4691

    dfbdteam3
    Moderator

    If the file was loaded into HDFS with the default replication factor of 3, which is set in hdfs-site.xml, then the replication factor of that file is 3. This means 3 copies of each of its blocks exist in HDFS.
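To see these values on a running cluster, something like the following sketch may help. It assumes Hadoop 2.x or later with the `hdfs` launcher on PATH; `default_replication` and `file_replication` are hypothetical helper names, not Hadoop commands.

```shell
# Sketch: inspect replication settings before changing anything.
# Assumes the `hdfs` launcher is on PATH; both function names are hypothetical.

# Print the default replication factor from the client's configuration.
default_replication() {
  hdfs getconf -confKey dfs.replication
}

# Print the replication factor recorded for one file (%r in -stat).
file_replication() {
  hdfs dfs -stat %r "$1"
}
```

Calling `default_replication` and then `file_replication /path/to/file` should show whether a file still carries the default value or was written with a different one.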

    Now suppose we want to change the replication factor of content in HDFS to 4. There are a few options, depending on scope:

    • We can change the dfs.replication value to 4 in hdfs-site.xml (hadoop-site.xml under $HADOOP_HOME/conf on older releases). Any new content written after the change is replicated 4 times; files already in HDFS keep their current replication factor.
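    • To change the replication factor of a specific file or directory, use the setrep command. The -w flag makes the command wait until replication completes, which can take a long time. (On recent releases, hadoop dfs is deprecated; hdfs dfs -setrep is the equivalent.)
      To set the replication of an individual file to 4:
    $HADOOP_HOME/bin/hadoop dfs -setrep -w 4 /path/to/file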
    • We can also do this on a directory, which changes the replication factor of all the files under it recursively. On recent releases setrep is recursive on directories by default, and -R is accepted only for backward compatibility.
      To change the replication of an entire directory in HDFS to 4:
    $HADOOP_HOME/bin/hadoop dfs -setrep -R -w 4 /path/to/directory

    This applies only to the directory we specify; if we pass / (root), it changes the replication factor of every file in HDFS.
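The setrep step can be wrapped in a small script that also confirms the result. This is a minimal sketch, assuming Hadoop 2.x or later with the `hdfs` launcher on PATH; `set_replication` and `has_replication` are hypothetical helper names, and `/path/to/file` is a placeholder.

```shell
# Sketch: change the replication factor of an HDFS path, then confirm it.
# Assumes `hdfs` is on PATH; both function names are hypothetical.

# Set the replication factor of a file, or of every file under a directory.
# -w waits until the target replication is reached, which can take a long time.
set_replication() {
  case "$1" in
    ''|0|*[!0-9]*) echo "replication factor must be an integer >= 1" >&2; return 2 ;;
  esac
  hdfs dfs -setrep -w "$1" "$2"
}

# Succeed only if the file's recorded replication factor equals the expected one.
has_replication() {
  [ "$(hdfs dfs -stat %r "$2")" = "$1" ]
}
```

A typical call might be `set_replication 4 /path/to/file && has_replication 4 /path/to/file`. Validating the factor before calling setrep is a design choice of this sketch, not something Hadoop requires.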

    For more details, please follow: HDFS Tutorial
