Difference between Int and IntWritable

    • #4703
      DataFlair Team
      Spectator

      Why can't we use Java's native int or Integer as a key or value in Hadoop?
      What is the difference between int and IntWritable?

    • #4705
      DataFlair Team
      Spectator

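      To explain this, it helps to define three interfaces first: Comparable, Writable and WritableComparable. The key and value types that ship in org.apache.hadoop.io, such as IntWritable and Text, implement them.
      Comparable is the standard Java interface (java.lang.Comparable) whose single method, compareTo, defines how two objects are ordered.
      Writable is Hadoop's serialization interface. Its two methods, write(DataOutput) and readFields(DataInput), turn an object into a compact binary form and back, which is what Hadoop uses when it writes data to disk or sends it across the network. You can implement your own Writables. Java's built-in serialization is bulky and slow for this kind of workload, which is why the Hadoop community introduced Writable.
      WritableComparable combines the two interfaces. Keys must implement it because the framework sorts them during the shuffle; values only need to be Writable.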
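
To make the three interfaces concrete, here is a minimal sketch in plain Java. It does not use the Hadoop library: SketchWritable is a hypothetical stand-in that mirrors the two methods of org.apache.hadoop.io.Writable, and SketchIntKey is an illustrative name, not a real Hadoop class. It only shows the shape that a WritableComparable-style key takes.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInput;
import java.io.DataInputStream;
import java.io.DataOutput;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical stand-in for org.apache.hadoop.io.Writable (same two methods).
interface SketchWritable {
    void write(DataOutput out) throws IOException;
    void readFields(DataInput in) throws IOException;
}

// Hypothetical key type: serializable like a Writable and sortable via Comparable.
class SketchIntKey implements SketchWritable, Comparable<SketchIntKey> {
    private int value;

    SketchIntKey() {}                      // no-arg constructor so an empty instance can be filled in
    SketchIntKey(int value) { this.value = value; }

    int get() { return value; }

    @Override
    public void write(DataOutput out) throws IOException {
        out.writeInt(value);               // always exactly 4 bytes
    }

    @Override
    public void readFields(DataInput in) throws IOException {
        value = in.readInt();              // overwrites state, so one instance can be reused
    }

    @Override
    public int compareTo(SketchIntKey other) {
        return Integer.compare(value, other.value);
    }

    // Serializes the key to bytes and reads it back into a fresh instance.
    static SketchIntKey roundTrip(SketchIntKey key) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            key.write(new DataOutputStream(bytes));
            SketchIntKey copy = new SketchIntKey();
            copy.readFields(new DataInputStream(new ByteArrayInputStream(bytes.toByteArray())));
            return copy;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

In a real job you would implement Hadoop's own WritableComparable instead of the stand-in interface; the method bodies would look much the same.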

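      int is a primitive type, so it cannot be used as a key or value type: the Mapper and Reducer classes take their key and value types as generic type parameters, and Java generics accept only reference types. Integer is the wrapper class around int. So the real question is: what is the difference between Integer and IntWritable?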

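      IntWritable is the Hadoop counterpart of Integer, optimized for serialization in the Hadoop environment: it writes its value as a plain 4-byte int. Integer does not implement Writable, so it would have to rely on default Java serialization, which adds class metadata to the output and is very costly in a Hadoop environment.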
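
The cost difference can be checked with a small experiment in plain Java. This is only a sketch: SerializedSizeSketch and its method names are made up for illustration, and rawIntSize imitates what a Writable-style write does (a single DataOutput.writeInt call) rather than calling Hadoop itself.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.ObjectOutputStream;
import java.io.UncheckedIOException;

class SerializedSizeSketch {

    // Number of bytes default Java serialization produces for a boxed Integer.
    static int javaSerializedSize(int value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(Integer.valueOf(value));
            }
            return bytes.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Number of bytes produced by writing the raw int, as a Writable-style write would.
    static int rawIntSize(int value) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                out.writeInt(value);
            }
            return bytes.size();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Java serialization: " + javaSerializedSize(7) + " bytes");
        System.out.println("Raw int write:      " + rawIntSize(7) + " bytes");
    }
}
```

The raw write is always 4 bytes, while the Java-serialized form is many times larger because it carries a stream header and a class descriptor. Multiplied across the millions of records a job may shuffle, that overhead is what makes Integer a poor fit.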
