Forums › Apache Hadoop › Why does Hadoop need classes like Text or IntWritable?
This topic has 2 replies, 1 voice, and was last updated 5 years, 7 months ago by DataFlair Team.
September 20, 2018 at 4:39 pm · #5905 · DataFlair Team (Spectator)
In MapReduce, why does Hadoop need classes like Text or IntWritable instead of String or Integer? Why can't we use the default Java types?
September 20, 2018 at 4:39 pm · #5907 · DataFlair Team (Spectator)
MapReduce uses special-purpose datatypes for its keys and values: IntWritable instead of int, LongWritable instead of long, Text instead of String.
These keys and values have to travel across the network (from mapper nodes to reducer nodes), so Hadoop defines datatypes that carry their own serialization. Since this is a physical movement of data, it must be efficient. We can also create custom key and value types:
Key class must implement the WritableComparable interface (and its abstract methods)
Value class must implement the Writable interface (and its abstract methods)
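As a sketch of what a custom key type looks like, the example below defines a hypothetical IntPairWritable. The Writable and WritableComparable interfaces are declared locally here (mirroring Hadoop's org.apache.hadoop.io versions) so the sketch compiles without Hadoop on the classpath; the toBytes/fromBytes helpers are only for the round-trip demo and are not part of Hadoop's contract.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;

// Minimal stand-ins for Hadoop's org.apache.hadoop.io interfaces,
// declared locally so this sketch runs without Hadoop jars.
interface Writable {
    void write(DataOutput out) throws IOException;     // serialize the fields
    void readFields(DataInput in) throws IOException;  // deserialize the fields
}

interface WritableComparable<T> extends Writable, Comparable<T> {}

// A hypothetical custom key holding two ints.
class IntPairWritable implements WritableComparable<IntPairWritable> {
    private int first;
    private int second;

    public IntPairWritable() {}  // Hadoop instantiates keys reflectively, so a no-arg constructor is required
    public IntPairWritable(int first, int second) { this.first = first; this.second = second; }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(first);   // a fixed, compact binary layout: 8 bytes total
        out.writeInt(second);
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        first = in.readInt();  // must read fields back in the same order they were written
        second = in.readInt();
    }

    @Override
    public int compareTo(IntPairWritable o) {  // used when the framework sorts keys before the reduce
        int c = Integer.compare(first, o.first);
        return c != 0 ? c : Integer.compare(second, o.second);
    }

    // Demo helpers, not part of Hadoop's contract.
    static byte[] toBytes(IntPairWritable w) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        w.write(new DataOutputStream(bos));
        return bos.toByteArray();
    }

    static IntPairWritable fromBytes(byte[] bytes) throws IOException {
        IntPairWritable w = new IntPairWritable();
        w.readFields(new DataInputStream(new ByteArrayInputStream(bytes)));
        return w;
    }
}
```

The write/readFields pair is the whole serialization contract: no class descriptors or metadata are written, only the raw field bytes, which is why the framework needs the no-arg constructor to rebuild the object on the receiving side.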
September 20, 2018 at 4:40 pm · #5908 · DataFlair Team (Spectator)
These classes exist so objects can be handled the Hadoop way. For example, Hadoop uses Text instead of Java's String. The Text class is similar to a String, but it implements the Writable and WritableComparable interfaces (WritableComparable combines Writable with Comparable).
Both capabilities are necessary for MapReduce: the Comparable part is used when the framework sorts keys before the reduce phase, and Writable lets Hadoop serialize the object to disk or across the network. Hadoop does not use Java's built-in Serializable because Java serialization is too heavyweight for this purpose; Writable serializes Hadoop objects in a much more compact way.
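To make the "too heavy" claim concrete, the sketch below compares the number of bytes produced for a single int by Java's ObjectOutputStream versus a plain DataOutputStream.writeInt, which is how the Writable encoding works for an int. The class and method names here are invented for the demo.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;

// Compares serialized sizes: Java's built-in Serializable mechanism vs.
// the compact Writable-style encoding (a raw writeInt).
class SerializationSizeDemo {
    static int javaSerializedSize(int value) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        ObjectOutputStream oos = new ObjectOutputStream(bos);
        oos.writeObject(Integer.valueOf(value));  // stream header + class descriptor + value
        oos.flush();
        return bos.size();
    }

    static int writableStyleSize(int value) throws IOException {
        ByteArrayOutputStream bos = new ByteArrayOutputStream();
        DataOutputStream dos = new DataOutputStream(bos);
        dos.writeInt(value);  // exactly 4 bytes of payload, no metadata
        dos.flush();
        return bos.size();
    }
}
```

Java serialization writes a stream header and a full class descriptor alongside the value, so even one int costs dozens of bytes; multiplied across billions of key-value pairs shuffled between mappers and reducers, that overhead is why Hadoop defines its own lightweight format.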