How much metadata will be created on the NameNode?

Viewing 1 reply thread
  • Author
    • #4642

      For a 1GB file, how much metadata will the NameNode store? Assume the default block size and replication factor.

    • #4644

      The NameNode keeps the file-to-block mapping, the locations of blocks on DataNodes, the list of active DataNodes, and a bunch of other metadata in memory. Pretty much all of the information shown on the NameNode status web page is held in memory somewhere.

      The only things stored on disk are the fsimage, the edit log, and status logs. The NameNode never really uses these files on disk, except when it starts. The fsimage and edits files pretty much only exist so that the NameNode can be brought back up after it is stopped or crashes.

      1) fsimage – An fsimage file contains the complete state of the file system at a point in time. Every file system modification is assigned a unique, monotonically increasing transaction ID. An fsimage file represents the file system state after all modifications up to a specific transaction ID.

      2) edits – An edits file is a log that lists each file system change (file creation, deletion or modification) that was made after the most recent fsimage.
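
As a rough illustration of how the two files fit together at startup, here is a toy model in Python. It is not HDFS code: the record format `(txid, op, path)`, the function name `replay`, and the sample paths are all made up for this sketch. It only shows the idea that the state is rebuilt by loading a snapshot and applying every logged change with a higher transaction ID.

```python
from typing import List, Set, Tuple

# Hypothetical edit record: (transaction ID, operation, path).
Edit = Tuple[int, str, str]


def replay(fsimage_txid: int, fsimage_paths: Set[str],
           edits: List[Edit]) -> Tuple[int, Set[str]]:
    """Rebuild the namespace from a snapshot plus the edits made after it."""
    namespace = set(fsimage_paths)  # copy so the snapshot stays untouched
    last_txid = fsimage_txid
    for txid, op, path in sorted(edits):  # apply in transaction-ID order
        if txid <= last_txid:
            continue  # already reflected in the snapshot
        if op == "create":
            namespace.add(path)
        elif op == "delete":
            namespace.discard(path)
        else:
            raise ValueError(f"unknown operation: {op}")
        last_txid = txid
    return last_txid, namespace


# Snapshot taken at transaction 100, followed by two newer edits
# and one stale edit that the snapshot already covers.
txid, namespace = replay(
    100,
    {"/a.txt"},
    [(101, "create", "/file.txt"), (102, "delete", "/a.txt"),
     (99, "create", "/old.txt")],
)
print(txid, sorted(namespace))  # 102 ['/file.txt']
```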

      When a file is put into HDFS, it is split into blocks (of configurable size).
      Let’s say we have a file called “file.txt” that is 1GB (taken here as 1000MB) and our block size is 128MB. We will end up with seven 128MB blocks and one 104MB block. The NameNode keeps track of the fact that “file.txt” in HDFS maps to these eight blocks, and that each block has three replicas. DataNodes store blocks, not files, so this mapping is essential for knowing where our data is and what it is.
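
The block arithmetic above can be checked with a few lines of Python. This is only a sketch: the function name `split_into_blocks` is made up, and sizes are in whole megabytes with 1GB taken as 1000MB, as in the example.

```python
from typing import List


def split_into_blocks(file_size_mb: int, block_size_mb: int = 128) -> List[int]:
    """Return the size in MB of each block a file would be split into."""
    if file_size_mb < 0 or block_size_mb <= 0:
        raise ValueError("sizes must be non-negative, block size positive")
    full_blocks, remainder = divmod(file_size_mb, block_size_mb)
    sizes = [block_size_mb] * full_blocks
    if remainder:
        sizes.append(remainder)  # the last block only holds what is left
    return sizes


blocks = split_into_blocks(1000)
replication = 3  # replication factor used in the example
print(blocks)                     # [128, 128, 128, 128, 128, 128, 128, 104]
print(len(blocks))                # 8
print(len(blocks) * replication)  # 24
```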

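      Roughly 150 bytes of metadata are created per block. With 8 blocks and a replication factor of 3, there are 24 block replicas, so about 150 × 24 = 3600 bytes of metadata will be created. This is only a rough estimate: the commonly quoted figure of about 150 bytes applies to each file, directory, and block object in NameNode memory, so the exact number depends on how replicas are counted.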
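
The same estimate can be written as a small Python function so it is easy to try other file sizes. The function name and parameters are made up for this sketch, and the 150-byte figure is just the rough number used in this post, not something read from a cluster.

```python
BYTES_PER_BLOCK_ENTRY = 150  # rough per-block figure quoted in this post


def estimate_metadata_bytes(file_size_mb: int,
                            block_size_mb: int = 128,
                            replication: int = 3,
                            bytes_per_entry: int = BYTES_PER_BLOCK_ENTRY) -> int:
    """Estimate metadata size using the per-replica counting from this post."""
    # Ceiling division: a partly filled last block still counts as a block.
    num_blocks = (file_size_mb + block_size_mb - 1) // block_size_mb
    return num_blocks * replication * bytes_per_entry


print(estimate_metadata_bytes(1000))  # 3600
```

Passing `replication=1` gives the figure for counting each block once instead of once per replica.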

      On disk, the NameNode stores the metadata for the file system. This includes file and directory permissions, ownerships, and assigned blocks, kept in the fsimage and the edit logs. In properly configured setups, it also includes a list of DataNodes that make up the HDFS (the include file, commonly named dfs.include and set through the dfs.hosts property) and a list of DataNodes that are to be removed from it (the exclude file, commonly named dfs.exclude and set through dfs.hosts.exclude). Note that which DataNodes have which blocks is only stored in memory and not on disk.
      For more details, please follow:
