What do you mean by metadata in HDFS?

    • #5280
DataFlair Team
Spectator

      List the files associated with metadata in HDFS?
      Explain FsImage and EditLogs in Hadoop?
      Where is metadata stored in Hadoop?

    • #5286
DataFlair Team
      Spectator

We have two types of metadata available on the Hadoop namenode.
One is “File to Block(s) mapping” metadata and the other is “Block to Datanode(s) mapping” metadata. Both are stored in memory.

File to block mapping metadata is also stored in fsimage_xxxx files for permanent storage. The checkpoint node takes care of creating a new fsimage, with the latest file to block mapping metadata, by merging the edit log files with the last fsimage file; this new fsimage is then copied to the namenode. The namenode itself never updates the fsimage file, it only keeps a copy of it. The block to datanodes mapping is reconstructed in memory whenever the namenode is started or restarted.

Both the master node and the slave node(s) contain metadata information. The master node contains the HDFS namespace state (metadata) and the transaction information (edit logs). The slave nodes contain checksum metadata (.meta files) for each and every block stored on them.

• On the master node, under the “/hdata/dfs/name/current” path, we have files prefixed with fsimage and edits. These files are stored permanently on disk (i.e., on the master node). fsimage stands for File System Image. The file name of an fsimage looks like “fsimage_xxxx”, where xxxx is a number. In general, only two or three fsimage files are kept under the “/hdata/dfs/name/current” path.
• Edit logs are represented as two different types of files. One type is edits_xxxx_oooo, where xxxx indicates the starting transaction number and oooo the ending transaction number; the other type is the edits_inprogress_xxxxxxxxx file. An edits_xxxx_oooo file contains previously generated transaction information, while the edits_inprogress_xxxxxx file contains the transactions of the current period.

When are edits_xxxx_oooo files created?
When is the edits_inprogress_xxxxxx file updated, created, or removed on the namenode?

When you set up the Hadoop cluster for the first time and start the Hadoop daemons by running start-dfs.sh and start-yarn.sh, you will see only fsimage_xxxx and edits_inprogress_xxxxxx files under the “/hdata/dfs/name/current” path. It does not yet contain any edits_xxxx_oooo files.
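For illustration, a listing of that directory on a freshly formatted single-node cluster might look roughly like the one below (the transaction numbers and the /hdata/dfs/name path are just the ones assumed in this example setup):

ls /hdata/dfs/name/current
VERSION
seen_txid
fsimage_0000000000000000000
edits_inprogress_0000000000000000001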

When the namenode is started, it is in safe mode, which means you cannot perform any write operation on HDFS; you can still read data, list files, etc. During namenode startup, whatever file system namespace is available in the latest fsimage_xxxx file is loaded into memory, and the edits_inprogress_xxxxxx file is also loaded to replay the transactions done at runtime. This replaying happens only when the latest fsimage_xxxx and the edits_inprogress_xxxxxx log file are not in sync. The namenode stays in safe mode until it receives the block reports from the datanodes. Once the namenode has received the block reports from all the datanodes, it leaves safe mode.

What kind of data is available in a block report? It contains the block ID, generation stamp, block length, etc.

fsimage_xxxx is loaded into memory once, when the namenode is restarted, booted up, or automatically restarted after a failure. The namenode does not update or make changes to the fsimage_xxxx files available on disk.

When you perform operations like file creation, modification, permission changes, etc., the edits_inprogress_xxxxxx file and the in-memory metadata are updated. The namenode does not update records in the edits_xxxx_oooo files.

The kind of metadata that is added/updated in edits_inprogress_xxxxxx and in memory is shown below.

Let’s say the namenode is up and running; that means the HDFS namespace state information is loaded into memory.

Under root, I want to create a directory named “test1”. Below is the command I fired from the client node.

      hdfs dfs -mkdir /test1

In order to create a directory, the client first has to interact with the namenode, passing the directory name and the path where the directory should be created. The namenode then performs various checks on the metadata available in memory. If the directory does not already exist, a record is added to the edits_inprogress_xxxxxx file, and memory is also updated with the information below. The transaction information is stored in the edits_inprogress_xxxxxx file, and the HDFS namespace state metadata is stored in memory (RAM; it is also stored on disk, but the namenode is not the one that updates that on-disk metadata. A separate node, the secondary namenode/checkpoint node, creates new fsimage files in order to synchronize with the edit logs. The namenode keeps an fsimage_xxxx file, but the creation of the fsimage_xxxx file is done by the secondary namenode/checkpoint node).

      Note: Only one transaction is required for creating a directory

What is a record in the edits_inprogress_xxxxxx file?
A record is added to the edits_inprogress_xxxxxx file; under the RECORD tag there are multiple tags. In the example below, OPCODE indicates that we are creating a directory. It also contains TXID, LENGTH (directory length), PATH, USERNAME, GROUPNAME, etc. Only this information is stored in the edits_inprogress_xxxxxx file.

      In edits_inprogress_xxxxxx file: Here the transaction id is 1.

      <?xml version="1.0" encoding="UTF-8"?>
      <EDITS>
        <EDITS_VERSION>-59</EDITS_VERSION>
        <RECORD>
          <OPCODE>OP_MKDIR</OPCODE>
          <DATA>
            <TXID>1</TXID>
            <LENGTH>0</LENGTH>
            <INODEID>16387</INODEID>
            <PATH>/test1</PATH>
            <TIMESTAMP>1498465936863</TIMESTAMP>
            <PERMISSION_STATUS>
              <USERNAME>hdadmin</USERNAME>
              <GROUPNAME>supergroup</GROUPNAME>
              <MODE>493</MODE>
            </PERMISSION_STATUS>
          </DATA>
        </RECORD>
      </EDITS>

      In Memory: Below information is stored when we are creating a directory with name test1.

      <inode>
      	<id>16387</id>
      	<type>DIRECTORY</type>
      	<name>test1</name>
      	<mtime>1498470314856</mtime>
      	<permission>hdadmin:supergroup:rwxr-xr-x</permission>
      	<nsquota>-1</nsquota>
      	<dsquota>-1</dsquota>
      </inode>

For each and every transaction, a separate record is added to the edits_inprogress_xxxxxx file.
Each record has a unique transaction ID, which is incremented sequentially from record to record.

      In summary,
      1) For each transaction, a record will be added in the edits_inprogress_xxxxxx file.

2) For each object (file or directory), an inode is added in memory. For a directory, the inode contains the id, type (whether it is a file or directory), name of the directory, etc.

Next, I created one more directory under root, named test2. A record is added to the edits_inprogress_xxxxxx file, and at the same time the in-memory metadata is updated with the information below.

      In edits_inprogress_xxxxxx file: It looks like below and the transaction id is 2.

      <RECORD>
          <OPCODE>OP_MKDIR</OPCODE>
          <DATA>
            <TXID>2</TXID>
            <LENGTH>0</LENGTH>
            <INODEID>16388</INODEID>
            <PATH>/test2</PATH>
            <TIMESTAMP>1498465944002</TIMESTAMP>
            <PERMISSION_STATUS>
              <USERNAME>hdadmin</USERNAME>
              <GROUPNAME>supergroup</GROUPNAME>
              <MODE>493</MODE>
            </PERMISSION_STATUS>
          </DATA>
        </RECORD>

In memory: the information below is stored when we create the directory.

      <inode>
      	<id>16388</id>
      	<type>DIRECTORY</type>
      	<name>test2</name>
      	<mtime>1498465944002</mtime>
      	<permission>hdadmin:supergroup:rwxr-xr-x</permission>
      	<nsquota>-1</nsquota>
      	<dsquota>-1</dsquota>
      </inode>

Next, I am going to perform one more transaction: copying a file called Arrays.java onto HDFS. The total file size is 287 MB. Consider 128 MB as the default block size for this cluster. That means three blocks need to be created on the slave nodes. The first block (ID: 1073741828) and the second block (ID: 1073741829) take the full 128 MB default block size, and the third block (ID: 1073741830) is approximately 31 MB. So we can conclude that the last block does not consume the complete default block size.
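As a small sketch, the copy itself is just a put from the client node, and you can list the target directory to see the file (the local file name and the /test1 target are simply the ones used in this example):

hdfs dfs -put Arrays.java /test1
hdfs dfs -ls /test1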

Copying a file is not a single transaction. Writing a file to HDFS is a costly operation, and it requires many transactions to copy a file onto HDFS. Below is the proof of why writing is so costly. In the XML below, there is a tag called “PATH” with the value “/test1/Arrays.java._COPYING_”. Why is it represented like this instead of Arrays.java? Let’s say my replication factor is three. The client only has to copy the data onto datanode1; the further replications are taken care of by the datanodes themselves (i.e., from datanode1 to datanode3, and from datanode3 to datanode5). While these three copies are being written to the three datanodes, the file is initially named “/test1/Arrays.java._COPYING_”. You can try this out by loading a big file (say a 1000 MB file) onto HDFS. While the load is happening, go to “http://localhost:50070/explorer.html#/” and open the /test1 directory. Under it, you will see a file named “Arrays.java._COPYING_”. This name is visible only while the load is in progress. Once the load is complete, the file is renamed to the actual file name, i.e., Arrays.java. Renaming the file is one more transaction. This happens whether we are working on a single-node or a multi-node Hadoop cluster. In pseudo-distributed mode (i.e., a single-node Hadoop cluster), copying this file onto HDFS requires 12 transactions.

In memory: the information below is stored when we copy the file into HDFS.

      <inode>
      	<id>16393</id>
      	<type>FILE</type>
      	<name>Arrays.java</name>
      	<replication>1</replication>
      	<mtime>1498472753475</mtime>
      	<atime>1498472737374</atime>
      	<perferredBlockSize>134217728</perferredBlockSize>
      	<permission>hdadmin:supergroup:rw-r--r--</permission>
	<blocks>
		<block><id>1073741828</id><genstamp>1004</genstamp><numBytes>134217728</numBytes></block>
		<block><id>1073741829</id><genstamp>1005</genstamp><numBytes>134217728</numBytes></block>
		<block><id>1073741830</id><genstamp>1006</genstamp><numBytes>31906022</numBytes></block>
	</blocks>
      </inode>

In summary, memory contains the following “File to Block Mapping” metadata information:
1) Type, whether it is a file or a directory
2) Name of the file
3) Replication factor
4) File creation time
5) Default block size
6) Permission: owner, group, and permission of the file
7) Blocks: block ID, generation stamp, block length.

Based on the information above, the fsimage does not contain the “Block to datanodes” (or “bit map”) mapping metadata. As I said, the block to datanodes mapping metadata is reconstructed when the namenode restarts, boots up, etc.
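If you want to see the block to datanodes mapping that the namenode holds only in memory at runtime, one way (sketched here purely as an illustration, using the path from this example) is to run fsck against the file with the -locations option:

hdfs fsck /test1/Arrays.java -files -blocks -locations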

      In edits_inprogress_xxxxxx file: Here the transaction id is 3.

      <RECORD>
      	<OPCODE>OP_ADD</OPCODE>
      	<DATA>
      	<TXID>3</TXID>
      	<LENGTH>0</LENGTH>
      	<INODEID>16393</INODEID>
      	<PATH>/test/Arrays.java._COPYING_</PATH>
      	<REPLICATION>1</REPLICATION>
      	<MTIME>1498472737374</MTIME>
      	<ATIME>1498472737374</ATIME>
      	<BLOCKSIZE>134217728</BLOCKSIZE>
      	<CLIENT_NAME>DFSClient_NONMAPREDUCE_945043468_1</CLIENT_NAME>
      	<CLIENT_MACHINE>127.0.0.1</CLIENT_MACHINE>
      	<OVERWRITE>true</OVERWRITE>
      	<PERMISSION_STATUS>
      	<USERNAME>hdadmin</USERNAME>
      	<GROUPNAME>supergroup</GROUPNAME>
      	<MODE>420</MODE>
      	</PERMISSION_STATUS>
      	<RPC_CLIENTID>f2fbdf0a-d394-4643-b564-889d93ff30ef</RPC_CLIENTID>
      	<RPC_CALLID>3</RPC_CALLID>
      	</DATA>
      </RECORD>
      <RECORD>
      	<OPCODE>OP_ALLOCATE_BLOCK_ID</OPCODE>
      	<DATA>
      	<TXID>4</TXID>
      	<BLOCK_ID>1073741828</BLOCK_ID>
      	</DATA>
      </RECORD>
      <RECORD>
      	<OPCODE>OP_SET_GENSTAMP_V2</OPCODE>
      	<DATA>
      	<TXID>5</TXID>
      	<GENSTAMPV2>1004</GENSTAMPV2>
      	</DATA>
      </RECORD>
      <RECORD>
      	<OPCODE>OP_ADD_BLOCK</OPCODE>
      	<DATA>
      	<TXID>6</TXID>
      	<PATH>/test/Arrays.java._COPYING_</PATH>
      	<BLOCK>
      	<BLOCK_ID>1073741828</BLOCK_ID>
      	<NUM_BYTES>0</NUM_BYTES>
      	<GENSTAMP>1004</GENSTAMP>
      	</BLOCK>
      	<RPC_CLIENTID/>
      	<RPC_CALLID>-2</RPC_CALLID>
      	</DATA>
      </RECORD>
      <RECORD>
      	<OPCODE>OP_ALLOCATE_BLOCK_ID</OPCODE>
      	<DATA>
      	<TXID>7</TXID>
      	<BLOCK_ID>1073741829</BLOCK_ID>
      	</DATA>
      </RECORD>
      <RECORD>
      	<OPCODE>OP_SET_GENSTAMP_V2</OPCODE>
      	<DATA>
      	<TXID>8</TXID>
      	<GENSTAMPV2>1005</GENSTAMPV2>
      	</DATA>
      </RECORD>
      <RECORD>
      	<OPCODE>OP_ADD_BLOCK</OPCODE>
      	<DATA>
      	<TXID>9</TXID>
      	<PATH>/test/Arrays.java._COPYING_</PATH>
      	<BLOCK>
      	<BLOCK_ID>1073741828</BLOCK_ID>
      	<NUM_BYTES>134217728</NUM_BYTES>
      	<GENSTAMP>1004</GENSTAMP>
      	</BLOCK>
      	<BLOCK>
      	<BLOCK_ID>1073741829</BLOCK_ID>
      	<NUM_BYTES>0</NUM_BYTES>
      	<GENSTAMP>1005</GENSTAMP>
      	</BLOCK>
      	<RPC_CLIENTID/>
      	<RPC_CALLID>-2</RPC_CALLID>
      	</DATA>
      </RECORD>
      <RECORD>
      	<OPCODE>OP_ALLOCATE_BLOCK_ID</OPCODE>
      	<DATA>
      	<TXID>10</TXID>
      	<BLOCK_ID>1073741830</BLOCK_ID>
      	</DATA>
	</RECORD>
<RECORD>
	<OPCODE>OP_SET_GENSTAMP_V2</OPCODE>
	<DATA>
	<TXID>11</TXID>
	<GENSTAMPV2>1006</GENSTAMPV2>
	</DATA>
</RECORD>
      <RECORD>
      	<OPCODE>OP_ADD_BLOCK</OPCODE>
      	<DATA>
      	<TXID>12</TXID>
      	<PATH>/test/Arrays.java._COPYING_</PATH>
      	<BLOCK>
      	<BLOCK_ID>1073741829</BLOCK_ID>
      	<NUM_BYTES>134217728</NUM_BYTES>
      	<GENSTAMP>1005</GENSTAMP>
      	</BLOCK>
      	<BLOCK>
      	<BLOCK_ID>1073741830</BLOCK_ID>
      	<NUM_BYTES>0</NUM_BYTES>
      	<GENSTAMP>1006</GENSTAMP>
      	</BLOCK>
      	<RPC_CLIENTID/>
      	<RPC_CALLID>-2</RPC_CALLID>
      	</DATA>
      </RECORD>
      <RECORD>
      	<OPCODE>OP_CLOSE</OPCODE>
      	<DATA>
      	<TXID>13</TXID>
      	<LENGTH>0</LENGTH>
      	<INODEID>0</INODEID>
      	<PATH>/test/Arrays.java._COPYING_</PATH>
      	<REPLICATION>1</REPLICATION>
      	<MTIME>1498472753475</MTIME>
      	<ATIME>1498472737374</ATIME>
      	<BLOCKSIZE>134217728</BLOCKSIZE>
      	<CLIENT_NAME/>
      	<CLIENT_MACHINE/>
      	<OVERWRITE>false</OVERWRITE>
      	<BLOCK>
      	<BLOCK_ID>1073741828</BLOCK_ID>
      	<NUM_BYTES>134217728</NUM_BYTES>
      	<GENSTAMP>1004</GENSTAMP>
      	</BLOCK>
      	<BLOCK>
      	<BLOCK_ID>1073741829</BLOCK_ID>
      	<NUM_BYTES>134217728</NUM_BYTES>
      	<GENSTAMP>1005</GENSTAMP>
      	</BLOCK>
      	<BLOCK>
      	<BLOCK_ID>1073741830</BLOCK_ID>
      	<NUM_BYTES>31906022</NUM_BYTES>
      	<GENSTAMP>1006</GENSTAMP>
      	</BLOCK>
      	<PERMISSION_STATUS>
      	<USERNAME>hdadmin</USERNAME>
      	<GROUPNAME>supergroup</GROUPNAME>
      	<MODE>420</MODE>
      	</PERMISSION_STATUS>
      	</DATA>
      </RECORD>
      <RECORD>
      	<OPCODE>OP_RENAME_OLD</OPCODE>
      	<DATA>
      	<TXID>14</TXID>
      	<LENGTH>0</LENGTH>
      	<SRC>/test/Arrays.java._COPYING_</SRC>
      	<DST>/test/Arrays.java</DST>
      	<TIMESTAMP>1498472753483</TIMESTAMP>
      	<RPC_CLIENTID>f2fbdf0a-d394-4643-b564-889d93ff30ef</RPC_CLIENTID>
      	<RPC_CALLID>10</RPC_CALLID>
      	</DATA>
      </RECORD>

So far, 14 transactions have been done and recorded in the edits_inprogress_xxxxxx file. This file is updated at runtime, i.e., while the namenode is up and running.

When is an edits_xxxx_oooo file created?
When is a new fsimage_xxxx file created?
When is the current edits_inprogress_xxxxxx file removed and a new edits_inprogress_xxxxxx file created?

Hadoop follows some naming conventions for the edits_inprogress_xxxxxx, fsimage_xxxx, and edits_xxxx_oooo files.

edits_xxxx_oooo: where xxxx is the transaction number of the first record in this file and oooo is the transaction number of the last record in this file.

fsimage_xxxx: where xxxx is the transaction number of the last check-pointed transaction.

edits_inprogress_xxxxxx: where xxxxxx is the transaction number of the first record in this file.
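Putting these conventions together, a listing of /hdata/dfs/name/current after one round of check-pointing might look roughly like this (the transaction numbers are only illustrative, chosen to match the example discussed below):

ls /hdata/dfs/name/current
VERSION
seen_txid
fsimage_0000000000000000013
edits_0000000000000000001-0000000000000000013
edits_inprogress_0000000000000000014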

What is the secondary namenode/checkpoint node?

This node is a helper to the namenode. Why do we call the secondary namenode/checkpoint node a helper node to the namenode?
What types of files are available on the secondary namenode/checkpoint node?
It contains only fsimage_xxxxxxx and edits_xxxxxxx_ooooooo files. It does not contain an edits_inprogress_xxxxx file. As I said, once the namenode is up and running, all operation/transaction information is recorded in the edits_inprogress_xxxxx file on the namenode. On the secondary namenode/checkpoint node, the edits_inprogress_xxxxx file is not available, so the SNN/checkpoint node cannot perform the operations that are done by the namenode. That is why the secondary namenode/checkpoint node is called a helper node.

Check-pointing is done in two cases: after the namenode has started, either every 1 hour or after 1 million transactions have been recorded in the edits_inprogress_xxxxxx file, check-pointing takes place. We can configure these two thresholds using dfs.namenode.checkpoint.period (we can set it to 1 or 2 hours, etc.) and dfs.namenode.checkpoint.txns (we can set the number of transactions here).
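As a sketch, these two properties go into hdfs-site.xml; the values below are just illustrative (note that dfs.namenode.checkpoint.period is given in seconds):

<property>
  <name>dfs.namenode.checkpoint.period</name>
  <value>3600</value>
</property>
<property>
  <name>dfs.namenode.checkpoint.txns</name>
  <value>1000000</value>
</property>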

Every one hour, or once one (1) million transactions have completed on the namenode, a new fsimage_xxxx and a new edits_xxxx_oooo file are created. This work is done on the secondary namenode/checkpoint node: it creates the new fsimage, and this new fsimage is copied to the namenode.

The old edits_inprogress_xxxxxx file is deleted and a new edits_inprogress_xxxxxx is created, where xxxxxx is the transaction number of the first record in this file.

Case 01:
Let’s say 1 million transactions have been recorded in the edits_inprogress_xxxxxx file; at that point the secondary namenode/checkpoint node comes into the picture. It requests the edits_inprogress_xxxxxx and the last generated fsimage_xxxx files from the namenode. These files are copied to the secondary namenode/checkpoint node, and then check-pointing starts, which merges the edits_inprogress_xxxxxx and fsimage_xxxx into a new fsimage_xxxx file.

Let’s say the checkpoint node has copied edits_inprogress_0000000000000000014 (in which 30 transactions are recorded) and fsimage_0000000000000000013. It will create a new fsimage named fsimage_0000000000000000043, a new edit log named edits_0000000000000000014-0000000000000000043, and the new in-progress file will be edits_inprogress_0000000000000000044. These files are copied to the namenode. As I said, we keep only a small number of fsimage files on the namenode. An fsimage file is very small compared with the edits_xxxx_oooo and edits_inprogress_xxxxxx files, because the fsimage_xxxx file contains only the file to block mapping information, whereas the edit log files contain the detailed transaction information.

Case 02:
Let’s say the namenode started at 10:00 AM; after one hour has passed, the secondary namenode/checkpoint node will take the edits_inprogress_xxxxxx file and the latest fsimage_xxxx file to create a new fsimage_xxxx file.

The master node also contains a few more files, such as the seen_txid and VERSION files.

seen_txid file: it contains the starting transaction number of the edits_inprogress_xxxxxx file. The content of seen_txid may or may not change when you restart/boot up the namenode or when check-pointing happens; it depends on whether check-pointing has been done or not.
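For example (the value shown is purely illustrative and, per the description above, would correspond to the number in the current edits_inprogress file name):

cat /hdata/dfs/name/current/seen_txid
14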

VERSION file: this file contains the information below. namespaceID is unique across the cluster. We know that up to Hadoop 1.x we could configure only one active namenode. In Hadoop 2.x, we can have multiple active namenodes, each serving a part of the HDFS namespace; this is nothing but HDFS federation. All active namenodes are independent of each other, and to differentiate the active namenodes in the cluster, a unique namespaceID is required for each one.
For the cluster as a whole, we maintain an ID, which is nothing but the clusterID.
To indicate whether the node is a namenode or a datanode, there is a parameter called storageType; here it indicates that the current node is a namenode, which is a master node.

      Sun Jun 25 23:43:03 PDT 2017
      namespaceID=570153348
      clusterID=CID-75fcb726-5ab1-40e1-8c7b-1871f85d8497
      cTime=0
      storageType=NAME_NODE
      blockpoolID=BP-90828125-127.0.1.1-1498414407677
      layoutVersion=-59

In summary:
1) Metadata is stored in memory as well as in the fsimage_xxxx file (for permanent storage). The in-memory copy contains both the old and the current metadata information, until the namenode is restarted.
2) Upcoming metadata is also stored in memory. After the namenode has started, whatever metadata comes in is not written by the namenode into fsimage_xxxx at that point in time; the checkpoint node takes care of making this in-memory metadata available in the fsimage_xxxx file.

Note:
1) The secondary/checkpoint node should be configured on a separate node, because it requires more memory to perform the merge that creates a new fsimage from the edits and the latest fsimage files.
2) If the in-memory metadata is lost, we can recover it by replaying the information available in the edits_inprogress_xxxxxx file.
3) If the namenode goes down, crashes, or is burnt, we cannot guarantee that the complete metadata information is available in the fsimage_xxxxxxx and edits_xxxx_oooo files on the secondary namenode/checkpoint node.
4) We can set up a new namenode and copy the edits and fsimage files from the secondary namenode/checkpoint node (see the sketch below). But the fsimage_xxxx file is not up to date, because check-pointing happens only every 1 hour or when 1 million transactions have been reached, so on the new namenode we may or may not have the complete metadata information.
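One possible way to do point 4, sketched here with the assumption that the checkpoint files from the secondary namenode have already been copied into the directory configured as dfs.namenode.checkpoint.dir on the new machine, is to start the new namenode with the -importCheckpoint option:

hdfs namenode -importCheckpoint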

Points 2, 3, and 4 above refer to Hadoop 1.x.

When the namenode goes down, the complete Hadoop cluster goes down. This is called a single point of failure (SPOF).

    • #5289
DataFlair Team
      Spectator

Fsimage and edit log files are stored in binary format, so we cannot read the metadata information directly. To view these files in a readable format, Hadoop comes with two tools. These tools work offline as well as on an up-and-running cluster.

To convert an fsimage file from binary format to a human-readable format, Hadoop provides a tool called oiv (Offline Image Viewer).

To convert an edit log file from binary format to a human-readable format, Hadoop provides a tool called oev (Offline Edits Viewer).

To convert an fsimage file to a human-readable format (say, XML format):
      hdfs oiv -i fsimage_0000000000000000053 -o fsimage_xmlformat.xml -p XML

To convert an edit log file to a human-readable format (say, XML format):
      hdfs oev -i edits_inprogress_0000000000000000054 -o edits_xmlformat.xml -p XML

-i : indicates the input file
-o : indicates the output file
-p : indicates the processor, i.e., the output format (XML in the examples above)

Once the above commands finish executing, you can open the edits_xmlformat.xml / fsimage_xmlformat.xml files and view the metadata information. By looking at the metadata available in these files, we can draw a few conclusions, for example about the block size allocated for the last block, etc.
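For instance, a couple of quick checks on the converted edit log (the file name is the one produced by the oev command above, and the patterns are just illustrative):

grep -c "<RECORD>" edits_xmlformat.xml
grep -A3 "OP_MKDIR" edits_xmlformat.xml

The first command counts how many transaction records have been logged; the second shows the directory-creation records.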

    • #5290
DataFlair Team
      Spectator

Why does the check-pointing process need to be done only by the secondary namenode/checkpoint node? Why not by the namenode?

Suppose the namenode also took care of the check-pointing process. To perform check-pointing, the namenode would have to be in safe mode, meaning it would place a lock on the edits_inprogress_xxxx file in order to create a new fsimage (namespace image). In this state (safe mode), clients cannot write files or modify existing blocks on HDFS, etc. But clients can still read data from the datanodes, because the metadata information is available in memory, so the namenode can still allow clients to read the data available on the datanodes.

      Why can’t we perform write/modification operations on HDFS when namenode is in safe mode?

As said earlier, when a client wants to write a file or modify existing blocks on HDFS, the namenode first has to write a transaction record to the edits_inprogress_xxxx file; but during check-pointing this file would already be locked. How, then, could the namenode serve the clients' write/modification requests? That is why the namenode is not made responsible for the check-pointing process; there is a secondary namenode/checkpoint node to take care of it.

Along with this, there is one more issue: memory. Performing the check-pointing process requires additional memory in order to create a new fsimage.

That is why we need a separate node for the secondary namenode/checkpoint node.

We can also perform check-pointing manually from our end.

Use the command below to enter safe mode.

      hdfs dfsadmin -safemode enter

      hdfs dfsadmin -saveNamespace

When you fire the saveNamespace command above, the check-pointing process starts. Before doing this, you need to bring the node into safe mode, and once check-pointing is done, the node should leave safe mode.

      To leave the safemode

      hdfs dfsadmin -safemode leave

With the command below, you can also check which mode the node is in.

      hdfs dfsadmin -safemode get
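
Putting the steps together, a minimal manual check-pointing sequence looks like the one below (the responses in parentheses are just what these dfsadmin commands typically print):

hdfs dfsadmin -safemode enter   (Safe mode is ON)
hdfs dfsadmin -saveNamespace    (Save namespace successful)
hdfs dfsadmin -safemode leave   (Safe mode is OFF)
hdfs dfsadmin -safemode get     (Safe mode is OFF)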
