Hadoop Distributed File System (HDFS) is a file system that stores very large datasets. Because it is the most important component of the Hadoop architecture, it is also the most important topic for an interview. In this blog, we provide 50+ HDFS interview questions and answers framed by our company experts, who provide training in Hadoop and other Big Data frameworks. Proper care has been taken in answering these questions so that we can give you the best questions and answers. We hope these questions will help you crack your Hadoop interview. All the best!
50+ Best HDFS Interview Questions And Answers
1) What is Hadoop?
2) What is the Hadoop Distributed File System (HDFS)?
3) What are NameNode and DataNode in HDFS?
4) How does NameNode tackle DataNode failures in HDFS?
5) What do you mean by metadata in Hadoop?
6) In which location does NameNode store its metadata, and why?
7) How much metadata will be created on NameNode in Hadoop?
8) When does NameNode enter Safemode?
9) How do you restart NameNode or all the daemons in Hadoop HDFS?
10) What are the modes in which Apache Hadoop runs?
11) On what basis does NameNode distribute blocks across the DataNodes in HDFS?
12) What is a block in HDFS, and why is the block size 64 MB?
13) Why is block size large in Hadoop?
14) What is Fault Tolerance in Hadoop HDFS?
15) Why is block size set to 128 MB in HDFS?
16) What happens if a block on Hadoop HDFS is corrupted?
17) What is the difference between NameNode and DataNode in Hadoop?
18) How is data or a file read in Hadoop HDFS?
19) How is data or a file written into Hadoop HDFS?
20) Ideally, what should the block size be in Hadoop?
21) What is Heartbeat in Hadoop?
22) How often does a DataNode send a heartbeat to NameNode in Hadoop?
23) What if the DataNode service is not running after starting Hadoop services?
24) How does HDFS help NameNode scale in Hadoop?
25) What is Secondary NameNode in Hadoop HDFS?
26) Ideally, what should the replication factor be in Hadoop?
27) How can one change the replication factor when data is already stored in HDFS?
28) Why does HDFS perform replication, although it results in data redundancy in Hadoop?
29) What is Safemode in Apache Hadoop?
30) What happens when NameNode enters Safemode in Hadoop?
31) How do you forcefully take NameNode out of Safemode in HDFS?
32) How do you create a directory when NameNode is in Safemode?
33) Why can we not create the directory /user/dataflair/inpdata001 when NameNode is in Safemode?
34) What is the difference between a MapReduce InputSplit and an HDFS block?
35) Explain the small file problem in Hadoop.
36) What is the difference between HDFS and NAS?
37) How do you create users in Hadoop HDFS?
38) What happens when NameNode goes down during a file read operation in Hadoop?
39) Explain the HDFS “write once, read many” pattern.
39) Can multiple clients write into an HDFS file concurrently in Hadoop?
40) Does HDFS allow a client to read a file that is already opened for writing in Hadoop?
41) What should the HDFS block size be to get maximum performance from a Hadoop cluster?
42) Why does HDFS store data on commodity hardware despite the higher chance of failures in Hadoop?
43) Who divides a file into blocks while storing it in HDFS in Hadoop?
44) What are active and passive NameNodes in HDFS?
45) How is indexing done in Hadoop HDFS?
46) What is rack awareness in Hadoop?
47) What is Erasure Coding in Hadoop?
48) When and how do you create a Hadoop archive?
49) What is non-DFS used in the HDFS web console?
50) How does HDFS ensure the data integrity of data blocks stored in Hadoop HDFS?
51) Why are slaves limited to 4000 in Hadoop version 1?
If you have any queries about these HDFS interview questions and answers, leave a comment in the section below.