How to change the replication factor when data is already stored in HDFS


    • #5689
      DataFlair Team
      Spectator

      We can set the replication factor globally in hdfs-site.xml, but how do we specify a different replication factor for specific files that are already stored in HDFS?

    • #5692
      DataFlair Team
      Spectator

      We can change the replication factor of data already stored in HDFS with the command below, which applies a factor of 5 to every file under the root directory:

      hadoop fs -setrep -R 5 /

      The same command works on a single file, a directory, or the entire file system, depending on the path passed as its last argument:

      File:
      hadoop fs -setrep -w 3 /my/file
      Directory:
      hadoop fs -setrep -w 3 -R /my/dir
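Because -setrep on a large directory can trigger a lot of block copying, it can help to preview the exact command before running it. The sketch below is only an illustration: set_replication and the DRY_RUN variable are hypothetical names, not part of Hadoop. With DRY_RUN left at its default of 1 the function prints the command instead of executing it.

```shell
# Hypothetical helper (not part of Hadoop): wraps `hadoop fs -setrep -w`.
# DRY_RUN is an assumed variable; 1 (the default) prints the command only.
set_replication() {
    factor="$1"
    path="$2"
    # Reject empty, non-numeric, or zero replication factors.
    case "$factor" in
        ''|*[!0-9]*|0)
            echo "replication factor must be a positive integer" >&2
            return 1
            ;;
    esac
    if [ "${DRY_RUN:-1}" = "1" ]; then
        # Preview only: show what would be run.
        echo "hadoop fs -setrep -w $factor $path"
    else
        # Real run: -w waits until the target replication is reached.
        hadoop fs -setrep -w "$factor" "$path"
    fi
}
```

For example, `set_replication 3 /my/file` prints the command for review, and `DRY_RUN=0 set_replication 3 /my/file` would actually run it on a machine with a configured Hadoop client.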

    • #5693
      DataFlair Team
      Spectator

      The replication factor is a property (dfs.replication) that can be set in the HDFS configuration file, hdfs-site.xml. It sets the global default replication factor for the entire cluster and applies only to newly created files, not to existing ones.
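Since the global default only affects new files, one way to give a new file a different factor at upload time is to override dfs.replication for that single command with the generic -D option. The following is a rough sketch under assumptions: put_with_replication is a hypothetical helper name, and HADOOP_CMD is an assumed override hook so the function can be tried without a cluster.

```shell
# Hypothetical helper (not part of Hadoop): uploads a local file with a
# per-command replication factor instead of the cluster-wide default.
# HADOOP_CMD is an assumed override; it defaults to the real `hadoop` client.
put_with_replication() {
    factor="$1"
    src="$2"
    dest="$3"
    # -D must come directly after `fs` so it is parsed as a generic option.
    ${HADOOP_CMD:-hadoop} fs -D dfs.replication="$factor" -put "$src" "$dest"
}
```

Files uploaded this way keep that factor until it is changed again with -setrep.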

      To change the replication factor on a per-file basis:
      hadoop fs -setrep -w 3 /file/filename.xml

      The -setrep command changes the replication factor of files that already exist in HDFS. The -w flag waits for the replication to complete, and the -R flag applies the change recursively to all files under a directory.

      For example:
      hadoop fs -setrep -w 3 -R /directory/dir.xml
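After changing the factor, it may be worth confirming that the new value took effect. A minimal sketch, assuming the `hadoop fs -stat %r` format specifier (which prints a file's replication factor) is available in your Hadoop version: check_replication is a hypothetical helper name, and HADOOP_CMD is an assumed override hook for trying the function without a cluster.

```shell
# Hypothetical helper (not part of Hadoop): compares a file's current
# replication factor with the expected one.
# HADOOP_CMD is an assumed override; it defaults to the real `hadoop` client.
check_replication() {
    expected="$1"
    path="$2"
    # %r asks -stat to print only the replication factor of the file.
    actual="$(${HADOOP_CMD:-hadoop} fs -stat %r "$path")" || return 2
    if [ "$actual" = "$expected" ]; then
        echo "OK $path replication=$actual"
    else
        echo "MISMATCH $path expected=$expected actual=$actual"
        return 1
    fi
}
```

The function returns 0 on a match, 1 on a mismatch, and 2 if the stat call itself fails, so it can be used in scripts that loop over several paths.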
