New JVM instead of a new Java thread?

  • Author
    Posts
    • #4622
      DataFlair Team
      Spectator

      Why is each new mapper or reducer task started in a separate JVM? Why does Hadoop launch a new JVM instead of a new thread when it starts a task (either a mapper or a reducer)? Why is a mapper or reducer launched as a heavyweight process rather than a lightweight thread?

    • #4623
      DataFlair Team
      Spectator

      Firstly, the MapReduce model is built for distributed processing. Threads live inside a single process and cannot extend beyond the boundary of one operating system instance, i.e. a single machine. Thus, on each machine a new process is launched instead of a thread.
      Secondly, threads share data, variables, and resources, while MapReduce tasks work on different chunks of data. These chunks are distributed over the cluster, which makes it difficult to run the tasks as threads across nodes that hold different parts of the dataset.
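      To make the point about shared state concrete, here is a minimal plain-Java sketch (not Hadoop code; the class and method names are made up for illustration). Every thread started inside one JVM sees the same heap object, which is exactly what separate task JVMs on different machines cannot do.

```java
import java.util.concurrent.atomic.AtomicInteger;

class SharedStateSketch {

    // Starts `threads` threads that all increment ONE counter object,
    // then returns the final value of that counter.
    static int incrementFromThreads(int threads, int perThread) {
        AtomicInteger counter = new AtomicInteger(); // lives on the heap shared by all threads
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < perThread; j++) {
                    counter.incrementAndGet(); // every thread updates the same object
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join(); // wait until the worker has finished
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while waiting", e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(incrementFromThreads(4, 1000)); // prints 4000
    }
}
```

      Two separate JVMs would each get their own copy of such a counter, so any sharing between them has to go through files, the network, or a similar channel.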

    • #4624
      DataFlair Team
      Spectator

      The MapReduce framework uses one mapper per input split (by default, one per block), and the data in the split is processed sequentially, line by line. The default run() method of Mapper is single-threaded. Overriding run() to create a multi-threaded mapper is hard work (but it can be done; Hadoop also ships a MultithreadedMapper class for this purpose). However, management, control, and heartbeat reporting then become more complex.
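      As a rough illustration of the difference, the sketch below contrasts a sequential record loop with a thread-pool variant in plain Java. It only mimics the shape of a mapper's run loop and does not use the Hadoop API; RunLoopSketch, runSequential, and runMultiThreaded are hypothetical names chosen for this example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

class RunLoopSketch {

    // Sequential variant: one record at a time, in input order.
    // Returns the number of records processed.
    static int runSequential(List<String> lines, Consumer<String> map) {
        int processed = 0;
        for (String line : lines) {
            map.accept(line);
            processed++;
        }
        return processed;
    }

    // Multi-threaded variant: records are handed to a fixed thread pool.
    // The map function must now be thread-safe, and failures surface
    // only when the corresponding Future is inspected.
    static int runMultiThreaded(List<String> lines, Consumer<String> map, int threads) {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<?>> pending = new ArrayList<>();
            for (String line : lines) {
                pending.add(pool.submit(() -> map.accept(line)));
            }
            int processed = 0;
            for (Future<?> f : pending) {
                f.get(); // blocks until this record is done, rethrows its failure
                processed++;
            }
            return processed;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while mapping", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("a record failed", e.getCause());
        } finally {
            pool.shutdown(); // release the worker threads
        }
    }
}
```

      Even in this toy version, the threaded variant needs a thread-safe map function plus explicit error and shutdown handling, which hints at why progress and heartbeat reporting get harder once a task is no longer a simple loop.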

    • #4625
      DataFlair Team
      Spectator

      The MapReduce framework provides fail-safe execution. To allow this, it is better to launch each task in its own JVM rather than in a thread. When a task fails, the same task can be restarted in a fresh environment of its own, and a crash in one task cannot take down other tasks. This provides better management, control, and reporting for each task. Managing threads is relatively more complex: if a thread hangs, it has to be killed, and the task would have to resume from where it left off. Launching a new JVM avoids all of this complexity.
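      The restart-in-a-fresh-JVM idea can be sketched with the standard ProcessBuilder API. This is not how Hadoop itself launches tasks; it is a simplified illustration, and ChildJvmSketch, javaCommand, and runWithRetries are hypothetical names. It assumes the parent's Java installation has a launcher at java.home/bin/java.

```java
import java.io.File;
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ChildJvmSketch {

    // Builds a command line that starts a fresh JVM from the same
    // Java installation the parent is running on (assumption: bin/java exists).
    static List<String> javaCommand(String... jvmArgs) {
        String javaBin = System.getProperty("java.home")
                + File.separator + "bin" + File.separator + "java";
        List<String> command = new ArrayList<>();
        command.add(javaBin);
        command.addAll(Arrays.asList(jvmArgs));
        return command;
    }

    // Runs the command in a child process and retries on failure.
    // Returns the 1-based attempt that succeeded, or -1 if every attempt failed.
    static int runWithRetries(List<String> command, int maxAttempts) {
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                Process child = new ProcessBuilder(command).inheritIO().start();
                if (child.waitFor() == 0) {
                    return attempt; // exit code 0: the child finished cleanly
                }
                // Non-zero exit: the child died, but the parent is unaffected
                // and simply tries again with a brand-new process.
            } catch (IOException e) {
                // The child could not be started; count it as a failed attempt.
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return -1;
            }
        }
        return -1;
    }

    public static void main(String[] args) {
        // "-version" stands in for a real task: the child JVM starts, reports, and exits.
        System.out.println(runWithRetries(javaCommand("-version"), 3));
    }
}
```

      The parent only inspects an exit code, so a child that crashes or has to be killed leaves no half-updated state behind in the parent, and each retry begins from a clean heap.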
