Why does Hadoop need classes like Text or IntWritable?


Viewing 2 reply threads
  • Author
    Posts
    • #5905
      DataFlair Team
      Spectator

      In MapReduce, why does Hadoop need classes like Text or IntWritable instead of String or Integer? Why can't we use the default Java types?

    • #5907
      DataFlair Team
      Spectator

      In MapReduce there are special-purpose datatypes for keys and values: IntWritable instead of int, LongWritable instead of long, Text instead of String.
      These keys and values need to travel across the network (from the mapper node to the reducer node), so Hadoop provides its own serializable datatypes for them. Since this is a physical movement of data, the serialization must be compact and fast.

      We can also create custom key and value types:
      A key class must implement the WritableComparable interface (and implement its abstract methods).
      A value class must implement the Writable interface (and implement its abstract methods).
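
      The custom-key idea above can be sketched as follows. This is a minimal, self-contained illustration: the two interfaces below are simplified stand-ins for Hadoop's org.apache.hadoop.io.Writable and WritableComparable (the real ones use the same write/readFields contract), and the YearTempKey class and its fields are hypothetical names chosen for the example.

      ```java
      import java.io.*;

      // Simplified stand-in for Hadoop's Writable: a type that can serialize
      // and deserialize itself using raw DataOutput/DataInput streams.
      interface Writable {
          void write(DataOutput out) throws IOException;
          void readFields(DataInput in) throws IOException;
      }

      // Simplified stand-in for Hadoop's WritableComparable: keys must also be comparable.
      interface WritableComparable<T> extends Writable, Comparable<T> {}

      // A hypothetical custom key: a (year, temperature) pair, sorted by year then temperature.
      class YearTempKey implements WritableComparable<YearTempKey> {
          int year;
          int temperature;

          YearTempKey() {}  // no-arg constructor needed so the framework can deserialize
          YearTempKey(int y, int t) { year = y; temperature = t; }

          public void write(DataOutput out) throws IOException {
              out.writeInt(year);          // emit only the raw field bytes
              out.writeInt(temperature);
          }

          public void readFields(DataInput in) throws IOException {
              year = in.readInt();         // read fields back in the same order
              temperature = in.readInt();
          }

          public int compareTo(YearTempKey other) {
              int c = Integer.compare(year, other.year);
              return c != 0 ? c : Integer.compare(temperature, other.temperature);
          }
      }

      public class CustomKeyDemo {
          public static void main(String[] args) throws IOException {
              // Round-trip: serialize the key, then read it back from the bytes.
              YearTempKey key = new YearTempKey(1990, 31);
              ByteArrayOutputStream bytes = new ByteArrayOutputStream();
              key.write(new DataOutputStream(bytes));

              YearTempKey copy = new YearTempKey();
              copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));

              System.out.println(copy.year + " " + copy.temperature);  // 1990 31
          }
      }
      ```

      The compareTo method is what lets the framework sort keys before they reach the reducer; write and readFields are what let them travel across the network.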

    • #5908
      DataFlair Team
      Spectator

      These classes exist to handle objects the Hadoop way. For example, Hadoop uses Text instead of Java's String. The Text class in Hadoop is similar to a Java String; however, Text implements interfaces like Writable and WritableComparable (which extends Comparable).

      These interfaces are all necessary for MapReduce: the Comparable side is used when the framework sorts keys before the reduce phase, and the Writable side lets Hadoop write the value to the network or local disk. Hadoop does not use Java's Serializable because Java serialization is too heavyweight for Hadoop; it stores class metadata alongside every object, whereas Writable serializes a Hadoop object in a very lightweight way.
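
      The size difference is easy to demonstrate. The sketch below is self-contained and does not require Hadoop: it compares Java's ObjectOutputStream (which writes a stream header and class descriptor with the value) against a raw DataOutputStream write of the same int, which is what a Writable such as IntWritable effectively emits.

      ```java
      import java.io.*;

      // Compares the serialized size of one int value under Java's built-in
      // serialization versus a compact Writable-style raw write.
      public class SerializationSizeDemo {
          public static void main(String[] args) throws IOException {
              // Java Serializable: stream header + class descriptor + value.
              ByteArrayOutputStream heavy = new ByteArrayOutputStream();
              try (ObjectOutputStream oos = new ObjectOutputStream(heavy)) {
                  oos.writeObject(Integer.valueOf(42));
              }

              // Writable-style: only the 4 raw bytes of the int itself.
              ByteArrayOutputStream light = new ByteArrayOutputStream();
              new DataOutputStream(light).writeInt(42);

              System.out.println("Java Serializable: " + heavy.size() + " bytes");
              System.out.println("Writable-style:    " + light.size() + " bytes");  // 4 bytes
          }
      }
      ```

      When billions of keys and values are shuffled between mappers and reducers, that per-object overhead is exactly what Hadoop's Writable types avoid.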
