September 20, 2018 at 12:31 pm #4865 DataFlair Team (Spectator)
How does replication work in Hive? Suppose we have a text file in HDFS, create a table in Hive, and load that text file into the newly created table. Will the table then be replicated three times across three nodes (if the replication factor is 3)?
-
September 20, 2018 at 12:31 pm #4867 DataFlair Team (Spectator)
Hive Replication
Hive replication, in the cluster-management sense, copies the Hive metastore and its data from one cluster to another and keeps the metastore and data set on the target cluster synchronized with the source according to a user-specified replication schedule. That is a separate mechanism from HDFS block replication, which is what your question is about.
The number of replicas of the table's data is determined by the replication factor set in HDFS. In your case the replication factor is 3, so there will be three copies of each block.
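As a rough illustration of that arithmetic, the sketch below counts how many block replicas HDFS would hold for a single file. The function name block_replicas is made up for this example, and the 128 MB block size is an assumption (it is a common default for dfs.blocksize, but your cluster may be configured differently).

```python
def block_replicas(file_size_bytes, block_size_bytes=128 * 1024 * 1024, replication_factor=3):
    """Return (number of blocks, total block replicas) for one HDFS file."""
    # Ceiling division: a partially filled last block still counts as a block.
    blocks = (file_size_bytes + block_size_bytes - 1) // block_size_bytes
    # HDFS keeps `replication_factor` replicas of every block.
    return blocks, blocks * replication_factor


# Example: a 300 MB text file with the assumed defaults.
blocks, replicas = block_replicas(300 * 1024 * 1024)
print(blocks, replicas)  # 3 9
```

So a 300 MB file would be split into 3 blocks, and with a replication factor of 3 the cluster would store 9 block replicas in total.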
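When you bring data into an internal (managed) Hive table, for example with a Sqoop import into Hive or by loading a file that is already on HDFS, it is placed in just one location: the table's directory on HDFS. Note that LOAD DATA INPATH moves an HDFS file into that directory rather than copying it. The Hive data is then replicated according to the same HDFS replication factor.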
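Ultimately, Hive keeps its own copy of the data, not necessarily in the same file format as the source. So if the original file is still on HDFS, in total you end up with 3 (HDFS) + 1 (Hive copy) × 3: three replicas of the original file on HDFS and three replicas of the data stored by Hive. These are not six copies of the same file.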
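The count above can be written out as a small calculation. This is only a sketch: physical_copies and its source_kept flag are hypothetical names introduced here, and the source_kept=False case assumes the file was moved into the table directory (as LOAD DATA INPATH does for HDFS paths) rather than copied.

```python
def physical_copies(replication_factor=3, source_kept=True):
    """Count replicas of the original file and of Hive's table data."""
    # Hive holds one logical copy; HDFS replicates it like any other file.
    hive_copies = 1 * replication_factor
    # The original file only counts if it still exists outside the table directory.
    source_copies = replication_factor if source_kept else 0
    return {"source": source_copies, "hive": hive_copies}


print(physical_copies())                   # {'source': 3, 'hive': 3}
print(physical_copies(source_kept=False))  # {'source': 0, 'hive': 3}
```

In the first case there are three replicas of each of two different files, not six replicas of one file. In the second case only Hive's three replicas remain.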