How to handle a stuck process and improve performance?
September 20, 2018 at 12:21 pm, #4813, DataFlair Team (Spectator)
If we are loading data into Hive and the process gets stuck or takes a very long time, how can we debug it and improve the performance?
-
September 20, 2018 at 12:21 pm, #4815, DataFlair Team (Spectator)
There are many Hive optimization techniques that can help when a process is stuck or takes a very long time. One of them is:
Usage of a Suitable File Format in Hive
ORC File Format: Choosing a file format that suits the data can improve query performance considerably, and ORC is usually a good choice for this. ORC stands for Optimized Row Columnar, and it stores data in a more optimized way than the other Hive file formats.
This format can reduce the size of the original data by up to 75%, which also speeds up data processing. ORC generally performs better than the Text, Sequence, and RC file formats. It stores rows of data in groups called stripes, along with a file footer. As a result, the ORC format improves performance in Hive processing.
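As a rough sketch of how an existing text-format table could be moved to ORC, the small Python helper below only builds the two HiveQL statements involved: a `CREATE TABLE ... STORED AS ORC` and an `INSERT OVERWRITE ... SELECT`. The helper name `orc_conversion_statements`, the table names `sales_text` and `sales_orc`, and the columns are made up for illustration. The statements would still need to be run through Hive (for example with beeline), which is not shown here.

```python
def orc_conversion_statements(source_table, target_table, columns):
    """Build HiveQL that copies an existing table into a new ORC table.

    columns is a list of (name, hive_type) pairs, e.g. [("id", "INT")].
    All names passed in are illustrative; nothing is executed here.
    """
    column_defs = ", ".join(f"{name} {hive_type}" for name, hive_type in columns)
    column_names = ", ".join(name for name, _ in columns)

    # Create the target table with ORC as its storage format.
    create_stmt = (
        f"CREATE TABLE IF NOT EXISTS {target_table} ({column_defs}) STORED AS ORC"
    )
    # Rewrite the source rows into the ORC table.
    insert_stmt = (
        f"INSERT OVERWRITE TABLE {target_table} "
        f"SELECT {column_names} FROM {source_table}"
    )
    return [create_stmt, insert_stmt]


if __name__ == "__main__":
    statements = orc_conversion_statements(
        "sales_text", "sales_orc", [("id", "INT"), ("amount", "DOUBLE")]
    )
    for stmt in statements:
        print(stmt + ";")
```

Once the data is in the ORC table, later queries can read from it instead of the text table. How much this helps depends on the data and the queries.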
There are many more Hive optimization techniques for improving performance. To learn about all of them, follow the link: 7 Best Hive Optimization Techniques – Hive Performance