Explain Clustering in Hive?

    • #4752
      DataFlair Team
      Spectator

      What is Clustering in Hive?

    • #4754
      DataFlair Team
      Spectator

      Bucketing, also called clustering, is the Hive technique for decomposing a table's data into smaller, more manageable parts.

      Bucketing assigns each row to a bucket by computing hash(bucketing column) mod (number of buckets); the result is the row's bucket number. The number of buckets is fixed when the bucketed table is created, using the CLUSTERED BY (column) INTO n BUCKETS clause.
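
The hash-then-mod rule can be illustrated with a minimal Python sketch. This is not Hive's own code: `bucket_for` is a hypothetical helper, and the hash here is a simple stand-in (the integer value itself, masked to be non-negative). Hive's actual hash function depends on the column type and the Hive version.

```python
def bucket_for(value: int, num_buckets: int) -> int:
    """Return the bucket index for an integer bucketing-column value.

    Assumption: the hash of an integer is the integer itself, masked
    to a non-negative 31-bit value. This is a stand-in, not Hive's hash.
    """
    hashed = value & 0x7FFFFFFF   # keep the hash non-negative
    return hashed % num_buckets   # hash mod number of buckets


# Hypothetical user_id values spread over 4 buckets.
user_ids = [101, 102, 103, 104, 205]
assignment = {uid: bucket_for(uid, 4) for uid in user_ids}
print(assignment)  # {101: 1, 102: 2, 103: 3, 104: 0, 205: 1}
```

Rows with the same bucketing-column value always land in the same bucket, which is the property that makes bucketed data useful for sampling and joins.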

      When a table is also partitioned, it is first divided into partitions, and each partition is then subdivided into buckets (also called clusters).
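
The two-level split (partition first, then bucket within each partition) can be sketched in Python as follows. The column names `country` and `user_id`, the helper names, and the stand-in hash are all assumptions made for illustration; this models the grouping only, not how Hive stores the data.

```python
from collections import defaultdict


def bucket_for(value: int, num_buckets: int) -> int:
    # Stand-in hash (assumption): the non-negative integer value itself.
    return (value & 0x7FFFFFFF) % num_buckets


def layout(rows, num_buckets):
    """Group (country, user_id) rows by partition, then by bucket."""
    tree = defaultdict(lambda: defaultdict(list))
    for country, user_id in rows:
        # Level 1: partition on country. Level 2: bucket on user_id.
        tree[country][bucket_for(user_id, num_buckets)].append(user_id)
    return tree


rows = [("US", 101), ("US", 102), ("IN", 103), ("US", 205), ("IN", 104)]
result = layout(rows, 4)
for country, buckets in sorted(result.items()):
    for bucket, ids in sorted(buckets.items()):
        print(country, bucket, ids)
```

Each partition holds its own independent set of buckets, so a row's location is determined by the partition value and the bucket number together.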
