If I create a folder, will metadata be created in Hadoop?

    • #4664
      DataFlair Team
      Spectator

      If I create a folder in HDFS, will metadata be created for that folder?
      If so, how large is the metadata created for a directory?

    • #4665
      DataFlair Team
      Spectator

      When a folder is created in HDFS, its metadata is stored only on the NameNode, not on the DataNodes. I ran the small proof of concept (POC) below to verify this.

      Small POC
      ---------

      I created a directory on HDFS and then looked at the NameNode's metadata directory on the master. It holds the files in which the new folder is recorded:

      ubuntu@ip-172-31-12-128:~/hdata/dfs/name/current$ ls
      edits_inprogress_0000000000000000001 fsimage_0000000000000000000.md5 VERSION
      fsimage_0000000000000000000 seen_txid

      After that, I looked at the DataNode's data directory on the slave machine. It holds nothing for the new folder; the directory is empty:

      ubuntu@ip-172-31-12-129:~/hdata/dfs/data/current/BP-821020587-172.31.12.128-1436504482600/current/finalized$

      This shows that when a directory is created, its metadata is stored on the NameNode only.
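
One way to see the directory record itself is to dump the edit log to XML with the offline edits viewer (for example `hdfs oev -i edits_inprogress_0000000000000000001 -o edits.xml`) and look for `OP_MKDIR` entries. The sketch below parses such a dump in Python. The sample XML is hand-written for illustration: the path `/poc_dir`, the transaction IDs, and the exact element layout are assumptions, not output from the cluster above, so adjust the element names if your Hadoop version differs.

```python
import xml.etree.ElementTree as ET
from typing import List, Tuple

# Hand-written sample shaped like an `hdfs oev` XML dump. The path, the
# transaction IDs and the element layout are assumptions for illustration.
SAMPLE_EDITS_XML = """<EDITS>
  <EDITS_VERSION>-63</EDITS_VERSION>
  <RECORD>
    <OPCODE>OP_START_LOG_SEGMENT</OPCODE>
    <DATA><TXID>1</TXID></DATA>
  </RECORD>
  <RECORD>
    <OPCODE>OP_MKDIR</OPCODE>
    <DATA>
      <TXID>2</TXID>
      <PATH>/poc_dir</PATH>
    </DATA>
  </RECORD>
</EDITS>
"""


def list_mkdir_ops(edits_xml: str) -> List[Tuple[int, str]]:
    """Return (transaction ID, path) for every OP_MKDIR record in the dump."""
    root = ET.fromstring(edits_xml)
    ops = []
    for record in root.iter("RECORD"):
        if record.findtext("OPCODE") == "OP_MKDIR":
            txid = int(record.findtext("DATA/TXID"))
            path = record.findtext("DATA/PATH")
            ops.append((txid, path))
    return ops


if __name__ == "__main__":
    for txid, path in list_mkdir_ops(SAMPLE_EDITS_XML):
        print(f"txid={txid} mkdir {path}")
```

To run it against a real dump, read the file written by `hdfs oev` and pass its contents to `list_mkdir_ops`.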

      Next, I created a file on HDFS. Now there is data under the data directory on the slave node:

      ubuntu@ip-172-31-12-129:~/hdata/dfs/data/current/BP-821020587-172.31.12.128-1436504482600/current/finalized/subdir0/subdir0$ cat blk_1073741825
      Nitin
      Raichandani
      Nitin
      Raichandani
      Nitin
      Raichandani

      I hope this helps.

      For more details, please follow: HDFS Tutorial

    • #4666
      DataFlair Team
      Spectator

      Yes, metadata is created when you create a folder in HDFS. To track changes to the file system, the NameNode maintains two types of files in its metadata directory: the fsimage and the edits file.

      1) FsImage – An fsimage file contains the complete state of the file system at a point in time. Every file system modification is assigned a unique, monotonically increasing transaction ID. An fsimage file represents the file system state after all modifications up to a specific transaction ID.

      2) Edits – An edits file is a log that lists each file system change (file creation, deletion or modification) that was made after the most recent fsimage.
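
The relationship between the two files can be sketched with a toy model in Python. This is not how the NameNode is implemented; `ToyNameNode` and its methods are hypothetical names, and real HDFS logs more operation types (including log-segment markers that also consume transaction IDs). It only illustrates the idea: the current state is the last fsimage plus a replay of the edits recorded after it.

```python
from dataclasses import dataclass, field
from typing import List, Set, Tuple


@dataclass
class ToyNameNode:
    """Toy model: a checkpoint (fsimage) plus a log of later changes (edits)."""
    fsimage_paths: Set[str] = field(default_factory=lambda: {"/"})
    fsimage_txid: int = 0  # last transaction ID folded into the fsimage
    edits: List[Tuple[int, str, str]] = field(default_factory=list)
    last_txid: int = 0

    def log(self, op: str, path: str) -> int:
        """Append one change to the edits log under the next transaction ID."""
        self.last_txid += 1
        self.edits.append((self.last_txid, op, path))
        return self.last_txid

    def current_namespace(self) -> Set[str]:
        """Rebuild the live state: start from the fsimage, replay the edits."""
        paths = set(self.fsimage_paths)
        for _txid, op, path in self.edits:
            if op == "mkdir":
                paths.add(path)
            elif op == "delete":
                paths.discard(path)
        return paths

    def checkpoint(self) -> None:
        """Fold the edits into a new fsimage and start an empty log."""
        self.fsimage_paths = self.current_namespace()
        self.fsimage_txid = self.last_txid
        self.edits = []


if __name__ == "__main__":
    nn = ToyNameNode()
    nn.log("mkdir", "/poc_dir")
    nn.log("mkdir", "/tmp")
    nn.log("delete", "/tmp")
    print(sorted(nn.current_namespace()))  # ['/', '/poc_dir']
    nn.checkpoint()
    print(nn.fsimage_txid, nn.edits)       # 3 []
```

Creating a folder therefore costs one appended edit record at first; it only becomes part of an fsimage at the next checkpoint.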

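      Together, these two files take up roughly 2 MB of storage on the NameNode. That figure is largely the baseline size of the files themselves; the record added for a single directory is small.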
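
The size of those files on disk is separate from what one directory costs the NameNode in memory. A commonly quoted rule of thumb puts each namespace object (directory, file, or block) at roughly 150 bytes of NameNode heap. The Python sketch below applies that figure; the 150-byte constant is an assumption taken from that rule of thumb, not a measurement, and `estimate_namenode_heap` is a hypothetical helper.

```python
BYTES_PER_OBJECT = 150  # rule-of-thumb assumption, not a measured value


def estimate_namenode_heap(directories: int, files: int,
                           blocks_per_file: int = 1) -> int:
    """Rough NameNode heap estimate in bytes for a given namespace size.

    Counts one object per directory, one per file, and one per block.
    """
    objects = directories + files + files * blocks_per_file
    return objects * BYTES_PER_OBJECT


if __name__ == "__main__":
    # A single empty directory: one namespace object.
    print(estimate_namenode_heap(directories=1, files=0))  # 150
    # 1,000 directories and 1,000,000 files of two blocks each.
    print(estimate_namenode_heap(1_000, 1_000_000, blocks_per_file=2))  # 450150000
```

Under this assumption a single empty directory costs about 150 bytes of heap, far less than the on-disk size of the fsimage and edits files.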
