What is a “Distributed Cache” in Apache Hadoop?


    • #5433
      DataFlair Team
      Spectator

      Explain the “Distributed Cache” in the MapReduce framework.
      Why is a distributed cache needed in Hadoop?

    • #5434
      DataFlair Team
      Spectator

      Distributed Cache is a facility provided by the MapReduce framework to cache small files (kilobytes to a few megabytes in size) needed by an application. The files can be jars, text files, archives, etc.
      Once you cache a file for your job, the Hadoop framework makes it available on every data node where your map/reduce tasks are running (on the local file system, not in memory). Thus, any map or reduce task in the job can read the file locally.
      We can control the size of the distributed cache through a size property in mapred-site.xml.

      The benefit of using the distributed cache is that it minimizes network data transfer: each file is copied to a node once per job instead of being fetched by every task. It also tracks the modification timestamps of the cache files, and the files must not be changed while the job is executing.
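
      As a rough illustration of how a task might use such a file, here is a minimal plain-Java sketch. It deliberately uses no Hadoop classes: the temporary file stands in for the local copy the framework places on the node, and the class name, the tab-separated file layout, and the "id,countryCode" record format are all assumptions made for this example. In a real job the file would typically be registered with a call such as Job.addCacheFile(URI), and the loading step would sit in the mapper's setup() method.

      ```java
      import java.io.IOException;
      import java.nio.charset.StandardCharsets;
      import java.nio.file.Files;
      import java.nio.file.Path;
      import java.util.Arrays;
      import java.util.HashMap;
      import java.util.Map;

      // Hypothetical sketch: the pattern a map task commonly follows with a cached file.
      // No Hadoop API is used; localCopy stands in for the node-local copy of the file.
      class CachedLookupSketch {

          // Done once per task (in Hadoop this would typically live in Mapper.setup()):
          // read the small tab-separated lookup file from local disk into memory.
          static Map<String, String> loadLookup(Path localCopy) throws IOException {
              Map<String, String> lookup = new HashMap<>();
              for (String line : Files.readAllLines(localCopy, StandardCharsets.UTF_8)) {
                  String[] parts = line.split("\t", 2);
                  if (parts.length == 2) {
                      lookup.put(parts[0], parts[1]);
                  }
              }
              return lookup;
          }

          // Done once per record (in Hadoop this would typically live in Mapper.map()):
          // append the looked-up name to an "id,countryCode" record.
          static String enrich(Map<String, String> lookup, String record) {
              String code = record.substring(record.lastIndexOf(',') + 1);
              return record + "," + lookup.getOrDefault(code, "UNKNOWN");
          }

          public static void main(String[] args) throws IOException {
              // Assumed sample data; a real job would read the file shipped with the job.
              Path localCopy = Files.createTempFile("countries", ".tsv");
              Files.write(localCopy, Arrays.asList("US\tUnited States", "IN\tIndia"),
                      StandardCharsets.UTF_8);

              Map<String, String> lookup = loadLookup(localCopy);
              System.out.println(enrich(lookup, "7,IN")); // 7,IN,India
              System.out.println(enrich(lookup, "8,FR")); // 8,FR,UNKNOWN

              Files.delete(localCopy);
          }
      }
      ```

      Because the lookup table is read from local disk once per task, the individual records never trigger a network fetch, which is the saving described above. This only works when the file is small enough to hold in memory.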


    • #5436
      DataFlair Team
      Spectator

      DistributedCache is a mechanism supported by the MapReduce framework for sharing files across all data nodes in the Hadoop cluster so that map/reduce tasks can use them while they run. A cached file can be a simple properties file or an executable jar file.
      These files are stored locally on every data node. The distributed cache is meant for small data files.
      After the job runs successfully, the distributed cache files (which are temporary) are deleted from the slave nodes.

      By default, the cache size is 10 GB. If you need more space, configure local.cache.size in mapred-site.xml.
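
      As a sketch of what that setting might look like, the fragment below raises the limit to 20 GB. The property name is the one given above; the value is assumed to be in bytes, and the 20 GB figure is only an example. Newer Hadoop releases may use a different property name, so check the defaults reference for your version before relying on it.

      ```xml
      <!-- mapred-site.xml: illustrative fragment, not a recommended value -->
      <property>
        <name>local.cache.size</name>
        <!-- assumed to be bytes: 20 * 1024^3 = 21474836480 -->
        <value>21474836480</value>
      </property>
      ```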

