

{"id":425,"date":"2016-06-13T13:52:45","date_gmt":"2016-06-13T13:52:45","guid":{"rendered":"http:\/\/data-flair.training\/blogs\/?p=425"},"modified":"2021-08-25T22:34:13","modified_gmt":"2021-08-25T17:04:13","slug":"hadoop-hdfs-data-read-and-write-operations","status":"publish","type":"post","link":"https:\/\/data-flair.training\/blogs\/hadoop-hdfs-data-read-and-write-operations\/","title":{"rendered":"Hadoop HDFS Data Read and Write Operations"},"content":{"rendered":"<h2>1. Objective<\/h2>\n<p><strong>HDFS<\/strong> follows the <strong><em>write once, read many<\/em><\/strong> model. So we cannot edit files already stored in HDFS, but we can append data by reopening the file. In every read or write operation, the client first interacts with the <strong><a href=\"https:\/\/data-flair.training\/blogs\/hadoop-hdfs-namenode-high-availability\/\">NameNode<\/a><\/strong>. The NameNode grants the required privileges, so the client can then read data blocks from, and write data blocks to, the respective datanodes.\u00a0In this blog, we will discuss the internals of <a href=\"http:\/\/data-flair.training\/blogs\/hadoop-introduction-tutorial-quick-guide\/\"><strong>Hadoop<\/strong> <\/a><strong>HDFS<\/strong> data read and write operations. 
We will also cover how a client reads data from and writes data to <strong><a href=\"https:\/\/data-flair.training\/blogs\/comprehensive-hdfs-guide-introduction-architecture-data-read-write-tutorial\/\">HDFS<\/a><\/strong>, and how the client interacts with the master and slave nodes during these operations.<\/p>\n<div id=\"attachment_30491\" style=\"width: 1210px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/06\/HDFS-Data-Read-and-Write-Operations.jpg\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-30491\" class=\"wp-image-30491 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/06\/HDFS-Data-Read-and-Write-Operations.jpg\" alt=\"Hadoop HDFS Data Read and Write Operations\" width=\"1200\" height=\"628\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/06\/HDFS-Data-Read-and-Write-Operations.jpg 1200w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/06\/HDFS-Data-Read-and-Write-Operations-150x79.jpg 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/06\/HDFS-Data-Read-and-Write-Operations-300x157.jpg 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/06\/HDFS-Data-Read-and-Write-Operations-768x402.jpg 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/06\/HDFS-Data-Read-and-Write-Operations-1024x536.jpg 1024w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/06\/HDFS-Data-Read-and-Write-Operations-520x272.jpg 520w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/a><p id=\"caption-attachment-30491\" class=\"wp-caption-text\">Hadoop HDFS Data Read and Write Operations<\/p><\/div>\n<p>This blog also contains videos that help you understand the internals of HDFS file read and write operations in depth.<\/p>\n<h2>2. 
Hadoop HDFS Data Read and Write Operations<\/h2>\n<p><a href=\"http:\/\/data-flair.training\/blogs\/comprehensive-hdfs-guide-introduction-architecture-data-read-write-tutorial\/\"><strong>HDFS<\/strong> \u2013 <strong><em>Hadoop Distributed File System<\/em> <\/strong><\/a>is the storage layer of <strong>Hadoop<\/strong>. It is designed to store data reliably by keeping several replicas of every block on different machines. HDFS works in <em>master-slave<\/em> fashion: <strong>NameNode<\/strong> is the master daemon, which runs on the master node, and <strong>DataNode<\/strong> is the slave daemon, which runs on each slave node.<br \/>\nBefore you start using HDFS, you should install Hadoop. I recommend these guides:<\/p>\n<ul>\n<li><strong><a href=\"http:\/\/data-flair.training\/blogs\/install-hadoop-on-single-machine\/\">Hadoop installation on a single node<\/a><\/strong><\/li>\n<li><strong><a href=\"http:\/\/data-flair.training\/blogs\/install-configure-apache-hadoop-2-7-x-on-ubuntu\/\">Hadoop installation on Multi-node cluster<\/a><\/strong><\/li>\n<\/ul>\n<p>Here, we are going to cover the HDFS data read and write operations. Let&#8217;s discuss the HDFS file write operation first, followed by the HDFS file read operation.<\/p>\n<h3>2.1. Hadoop HDFS Data Write Operation<\/h3>\n<p>To <strong><a href=\"http:\/\/data-flair.training\/blogs\/hdfs-data-write-operation\/\">write a file in HDFS<\/a><\/strong>, a client first needs to interact with the master, i.e. the <strong>namenode<\/strong>. The namenode provides the addresses of the <strong>datanodes<\/strong> (slaves) on which the client will write the data. The client then writes the data directly to the datanodes, which form a data write pipeline.<br \/>\nThe first datanode copies the block to a second datanode, which in turn copies it to the third datanode. Once the replicas of the block have been created, the datanodes send back an acknowledgment.<\/p>\n<h4>a. 
HDFS Data Write Pipeline Workflow<\/h4>\n<p><span style=\"font-size: 16px\">Now let\u2019s understand the complete end-to-end <\/span><strong style=\"font-size: 16px\"><a href=\"https:\/\/data-flair.training\/blogs\/comprehensive-hdfs-guide-introduction-architecture-data-read-write-tutorial\/\">HDFS <\/a><\/strong><span style=\"font-size: 16px\">data write pipeline. As shown in the above figure, the data write operation in HDFS is distributed: the client copies the data across the <\/span>datanodes<span style=\"font-size: 16px\">. The step-by-step explanation of the data write operation is:<\/span><\/p>\n<p><b>i)\u00a0<\/b>The HDFS client creates the file by calling <strong>create()<\/strong> on<em> DistributedFileSystem<\/em>.<br \/>\n<b>ii)\u00a0<\/b><em>DistributedFileSystem<\/em> makes an RPC call to the namenode to create a new file in the file system&#8217;s namespace.<\/p>\n<p>The namenode performs various checks to make sure that the file doesn\u2019t already exist and that the client has permission to create the file. Only when these checks pass does the namenode make a record of the new file; otherwise, file creation fails and the client receives an <em>IOException<\/em>. Also learn <strong><a href=\"https:\/\/data-flair.training\/blogs\/hadoop-hdfs-architecture\/\">Hadoop HDFS Architecture<\/a><\/strong> in detail.<\/p>\n<p><b>iii)\u00a0<\/b>The <em>DistributedFileSystem<\/em> returns an <em>FSDataOutputStream<\/em> for the client to start writing data to. As the client writes data, <em>DFSOutputStream<\/em> (which the <em>FSDataOutputStream<\/em> wraps) splits it into packets, which it writes to an internal queue, called the data queue. 
The data queue is consumed by the <em>DataStreamer<\/em>, which is responsible for asking the namenode to allocate new <a href=\"http:\/\/data-flair.training\/blogs\/data-blocks-hdfs-hadoop-distributed-file-system\/\"><strong>blocks<\/strong><\/a> by picking a list of suitable datanodes to store the replicas.<\/p>\n<p><b>iv)\u00a0<\/b>The list of datanodes forms a pipeline, and here we\u2019ll assume the replication level is three, so there are three nodes in the pipeline. The <em>DataStreamer<\/em> streams the packets to the first datanode in the pipeline, which stores each packet and forwards it to the second datanode in the pipeline. Similarly, the second datanode stores the packet and forwards it to the third (and last) datanode in the pipeline. Learn <a href=\"https:\/\/data-flair.training\/blogs\/data-blocks-hdfs-hadoop-distributed-file-system\/\"><strong>HDFS Data blocks in detail<\/strong>.<\/a><\/p>\n<p><b>v)\u00a0<\/b><em>DFSOutputStream<\/em> also maintains an internal queue of packets that are waiting to be acknowledged by datanodes, called the <em>ack queue<\/em>. A packet is removed from the ack queue only when it has been acknowledged by all the datanodes in the pipeline, that is, once the required replicas have been created (3 by default). All the blocks of the file are stored and replicated on the datanodes in the same way.<\/p>\n<p><b>vi)\u00a0<\/b>When the client has finished writing data, it calls <strong>close()<\/strong> on the stream.<\/p>\n<p><b>vii)\u00a0<\/b>This action flushes all the remaining packets to the datanode pipeline and waits for acknowledgments before contacting the namenode to signal that the file is complete. 
The namenode already knows which blocks the file is made up of, so it only has to wait for blocks to be minimally replicated before returning successfully.<\/p>\n<p><em>We can summarize the HDFS data write operation with the following diagram:<\/em><\/p>\n<div id=\"attachment_38\" style=\"width: 970px\" class=\"wp-caption alignnone\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/05\/Data-Write-Mechanism-in-HDFS.gif\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-38\" class=\"wp-image-38 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/05\/Data-Write-Mechanism-in-HDFS.gif\" alt=\"Data Write Mechanism in HDFS Tutorial\" width=\"960\" height=\"541\" \/><\/a><p id=\"caption-attachment-38\" class=\"wp-caption-text\">Data Write Mechanism in HDFS Tutorial<\/p><\/div>\n<h4>b. 
How to Write a file in HDFS &#8211; Java Program<\/h4>\n<p>A sample program to write a file to HDFS in Java is as follows (to interact with HDFS and perform various operations,<strong> <a href=\"http:\/\/data-flair.training\/blogs\/top-hdfs-commands-tutorial\/\">follow this HDFS command part &#8211; 1<\/a><\/strong>):<\/p>\n<p>[php]import java.io.BufferedInputStream;<br \/>\nimport java.io.FileInputStream;<br \/>\nimport java.io.IOException;<br \/>\nimport java.io.InputStream;<br \/>\nimport org.apache.hadoop.conf.Configuration;<br \/>\nimport org.apache.hadoop.fs.FSDataOutputStream;<br \/>\nimport org.apache.hadoop.fs.FileSystem;<br \/>\nimport org.apache.hadoop.fs.Path;<br \/>\npublic class HdfsWriteExample {<br \/>\npublic static void main(String[] args) throws IOException {<br \/>\nString source = args[0]; \/\/ local file to copy (assumed to be the first argument)<br \/>\nConfiguration conf = new Configuration(); \/\/ picks up the cluster settings on the classpath<br \/>\nFileSystem fileSystem = FileSystem.get(conf);<br \/>\n\/\/ Check if the file already exists<br \/>\nPath path = new Path(&quot;\/path\/to\/file.ext&quot;);<br \/>\nif (fileSystem.exists(path)) {<br \/>\nSystem.out.println(&quot;File &quot; + path + &quot; already exists&quot;);<br \/>\nreturn;<br \/>\n}<br \/>\n\/\/ Create a new file and write data to it.<br \/>\nFSDataOutputStream out = fileSystem.create(path);<br \/>\nInputStream in = new BufferedInputStream(new FileInputStream(source));<br \/>\nbyte[] b = new byte[1024];<br \/>\nint numBytes = 0;<br \/>\nwhile ((numBytes = in.read(b)) &gt; 0) {<br \/>\nout.write(b, 0, numBytes);<br \/>\n}<br \/>\n\/\/ Close all the streams and the file system handle<br \/>\nin.close();<br \/>\nout.close();<br \/>\nfileSystem.close();<br \/>\n}<br \/>\n}<br \/>\n[\/php]<\/p>\n<p class=\"entry-title \"><strong><a href=\"https:\/\/data-flair.training\/blogs\/most-used-hdfs-commands-tutorial-examples\/\">Hadoop HDFS Commands with Examples and Usage \u2013 Part II<\/a><\/strong><\/p>\n<h3>2.2. Hadoop HDFS Data Read Operation<\/h3>\n<p>To <strong><a href=\"http:\/\/data-flair.training\/blogs\/hdfs-data-read-operation\/\">read a file from HDFS<\/a><\/strong>, a client needs to interact with the namenode (master), as the namenode is the centerpiece of the <strong><a href=\"http:\/\/data-flair.training\/blogs\/install-configure-apache-hadoop-2-7-x-on-ubuntu\/\">Hadoop cluster<\/a><\/strong> (it stores all the metadata, i.e. data about the data). The namenode checks for the required privileges; if the client has sufficient privileges, the namenode provides the addresses of the slaves where the file is stored. 
The client then interacts directly with the respective datanodes to read the data blocks.<\/p>\n<h4>a. HDFS File Read Workflow<\/h4>\n<p>Now let\u2019s understand the complete end-to-end HDFS data read operation. As shown in the figure below, the data read operation in HDFS is distributed: the client reads the data directly from the datanodes that hold the blocks. The step-by-step explanation of the data read cycle is:<\/p>\n<div id=\"attachment_33\" style=\"width: 970px\" class=\"wp-caption aligncenter\"><a href=\"http:\/\/data-flair.training\/blogs\/wp-content\/uploads\/Data-Read-Mechanism-in-HDFS.gif\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-33\" class=\"wp-image-33 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/05\/Data-Read-Mechanism-in-HDFS.gif\" alt=\"HDFS data read and write operations\" width=\"960\" height=\"541\" \/><\/a><p id=\"caption-attachment-33\" class=\"wp-caption-text\">HDFS data read and write operations<\/p><\/div>\n<p><strong>i)<\/strong> The client opens the file it wishes to read by calling <strong>open()<\/strong> on the<em> FileSystem<\/em> object, which for HDFS is an instance of <em>DistributedFileSystem<\/em>. See <strong><a href=\"https:\/\/data-flair.training\/blogs\/hdfs-data-read-operation\/\">Data Read Operation<\/a><\/strong> in HDFS.<br \/>\n<strong>ii)<\/strong> <em>DistributedFileSystem<\/em> calls the namenode using RPC to determine the locations of the first few blocks in the file. For each block, the namenode returns the addresses of the datanodes that have a copy of that<strong><a href=\"https:\/\/data-flair.training\/blogs\/data-blocks-hdfs-hadoop-distributed-file-system\/\"> block<\/a><\/strong>, and the datanodes are sorted according to their proximity to the client.<br \/>\n<strong>iii)<\/strong><em> DistributedFileSystem<\/em> returns an <em>FSDataInputStream<\/em> to the client for it to read data from. 
<em>FSDataInputStream<\/em> in turn wraps the <em>DFSInputStream<\/em>, which manages the datanode and namenode I\/O. The client calls <strong>read()<\/strong> on the stream. <em>DFSInputStream<\/em>, which has stored the datanode addresses, then connects to the closest datanode for the first block in the file.<br \/>\n<strong>iv)<\/strong> Data is streamed from the datanode back to the client, which calls <strong>read()<\/strong> repeatedly on the stream. When the end of the block is reached, <em>DFSInputStream<\/em> closes the connection to the datanode and then finds the best datanode for the next block. Also learn about<strong><a href=\"https:\/\/data-flair.training\/blogs\/hdfs-data-write-operation\/\"> Data write operation <\/a><\/strong>in HDFS.<br \/>\n<b>v)\u00a0<\/b>If the <em>DFSInputStream<\/em> encounters an error while communicating with a datanode, it will try the next closest one for that block. It will also remember datanodes that have failed so that it doesn\u2019t needlessly retry them for later blocks. The <em>DFSInputStream<\/em> also verifies checksums for the data transferred to it from the datanode. If it finds a corrupt block, it reports this to the namenode before the<em> DFSInputStream<\/em> attempts to read a replica of the block from another datanode.<br \/>\n<b>vi)\u00a0<\/b>When the client has finished reading the data, it calls <strong>close()<\/strong> on the stream.<br \/>\n<em>The diagram above summarizes the HDFS data read operation.<\/em><\/p>\n<h4>b. 
How to Read a file from HDFS &#8211; Java Program<\/h4>\n<p>A sample program to read a file from HDFS is as follows (to perform HDFS read and write operations,\u00a0<a href=\"https:\/\/data-flair.training\/blogs\/hadoop-hdfs-commands-tutorial\/\"><strong>follow this HDFS command part &#8211; 3<\/strong><\/a>):<br \/>\n[php]import java.io.IOException;<br \/>\nimport org.apache.hadoop.conf.Configuration;<br \/>\nimport org.apache.hadoop.fs.FSDataInputStream;<br \/>\nimport org.apache.hadoop.fs.FileSystem;<br \/>\nimport org.apache.hadoop.fs.Path;<br \/>\npublic class HdfsReadExample {<br \/>\npublic static void main(String[] args) throws IOException {<br \/>\nConfiguration conf = new Configuration(); \/\/ picks up the cluster settings on the classpath<br \/>\nFileSystem fileSystem = FileSystem.get(conf);<br \/>\nPath path = new Path(&quot;\/path\/to\/file.ext&quot;);<br \/>\nif (!fileSystem.exists(path)) {<br \/>\nSystem.out.println(&quot;File does not exist&quot;);<br \/>\nreturn;<br \/>\n}<br \/>\nFSDataInputStream in = fileSystem.open(path);<br \/>\nbyte[] b = new byte[1024];<br \/>\nint numBytes = 0;<br \/>\nwhile ((numBytes = in.read(b)) &gt; 0) {<br \/>\n\/\/ Code to manipulate the data which is read; here it is echoed to standard output<br \/>\nSystem.out.write(b, 0, numBytes);<br \/>\n}<br \/>\nSystem.out.flush();<br \/>\nin.close();<br \/>\nfileSystem.close();<br \/>\n}<br \/>\n}[\/php]<\/p>\n<h2>3. Fault Tolerance in HDFS<\/h2>\n<p>Now that we have discussed HDFS data read and write operations in detail, what happens when one of the machines in the pipeline, i.e. one running a datanode process, fails? Hadoop has inbuilt functionality to handle this scenario (<strong><a href=\"http:\/\/data-flair.training\/blogs\/learn-hadoop-hdfs-fault-tolerance\/\">HDFS is fault tolerant<\/a><\/strong>). When a datanode fails while data is being written to it, the following actions are taken, which are transparent to the client writing the data.<\/p>\n<ul>\n<li>First, the pipeline is closed, and any packets in the ack queue are added to the front of the data queue so that datanodes that are downstream from the failed node will not miss any packets.<\/li>\n<li>The current block on the good datanodes is given a new identity, which is communicated to the namenode so that the partial block on the failed datanode will be deleted if the failed datanode recovers later on. 
Also read<strong><a href=\"https:\/\/data-flair.training\/blogs\/hadoop-hdfs-namenode-high-availability\/\"> Namenode high availability<\/a><\/strong> in HDFS.<\/li>\n<li>The failed datanode is removed from the pipeline, and the remainder of the block\u2019s data is written to the two good datanodes in the pipeline.<\/li>\n<li>The namenode notices that the block is under-replicated, and it arranges for a further replica to be created on another node. Then it treats the subsequent blocks as normal.<\/li>\n<\/ul>\n<p>It\u2019s possible, but unlikely, that multiple datanodes fail while the client writes a block. As long as <em>dfs.replication.min<\/em> replicas (<strong>which defaults to<\/strong> <strong>1<\/strong>; the property is named <em>dfs.namenode.replication.min<\/em> in Hadoop 2 and later) are written, the write will succeed, and the block will be asynchronously replicated across the cluster until it reaches its target replication factor (<em>dfs.replication<\/em>,<strong> which defaults to<\/strong> <strong>3<\/strong>).<br \/>\nThe data read operation is also fault tolerant: if a datanode fails during a read, the client switches to another datanode that holds a replica of the block.<\/p>\n<p class=\"entry-title \"><strong>Learn: <a href=\"https:\/\/data-flair.training\/blogs\/hadoop-hdfs-namenode-high-availability\/\">NameNode High Availability in Hadoop HDFS<\/a><\/strong><\/p>\n<h2>4. Conclusion<\/h2>\n<p>In conclusion, this design allows HDFS to scale to a large number of concurrent clients, because the data traffic is spread across all the datanodes in the cluster. 
It also provides <a href=\"http:\/\/data-flair.training\/blogs\/hadoop-hdfs-namenode-high-availability\/\"><strong>High Availability<\/strong><\/a>, <a href=\"http:\/\/data-flair.training\/blogs\/rack-awareness-hadoop-hdfs\/\"><strong>Rack Awareness<\/strong><\/a>, <a href=\"http:\/\/data-flair.training\/blogs\/what-is-erasure-coding-introduction-hadoop-hdfs\/\"><strong>Erasure coding<\/strong><\/a>, etc., which together strengthen Hadoop.<br \/>\nIf you like this post or have any queries about\u00a0HDFS data read and write operations, please leave a comment. We will be happy to answer them.<br \/>\n<b>See also:<\/b><\/p>\n<ul>\n<li><strong><a href=\"http:\/\/data-flair.training\/blogs\/features-hadoop-hdfs-overview-beginners\/\">Features of HDFS\u00a0<\/a><\/strong><\/li>\n<li><strong><a href=\"http:\/\/data-flair.training\/blogs\/hadoop-hdfs-commands-tutorial\/\">HDFS Commands Tutorial<\/a><\/strong><\/li>\n<\/ul>\n<p><strong>References:<\/strong><br \/>\n<strong><a href=\"http:\/\/www.oreilly.com\/\" target=\"_blank\" rel=\"noopener noreferrer\">Hadoop: The Definitive Guide<\/a><\/strong><br \/>\n<strong> <a href=\"http:\/\/hadoop.apache.org\" target=\"_blank\" rel=\"noopener noreferrer\">hadoop.apache.org<\/a><\/strong><\/p>\n","protected":false},"excerpt":{"rendered":"<p>1. Objective HDFS follows the write once, read many model. So we cannot edit files already stored in HDFS, but we can append data by reopening the file. In Read-Write operation client first, interact with&#46;&#46;&#46;<\/p>\n","protected":false},"author":7,"featured_media":30491,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[25],"tags":[782,1971,5201,5204,5592,5599,11356],"class_list":["post-425","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-hdfs","tag-apache-hadoop","tag-big-data-training","tag-hadoop-admin","tag-hadoop-administration","tag-hdfs-read-operation","tag-hdfs-write-operation","tag-read-and-write-operations-in-hdfs"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.8 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Hadoop HDFS Data Read and Write Operations - DataFlair<\/title>\n<meta name=\"description\" content=\"HDFS data read and write operations cover HDFS file read operation video,HDFS file write operation video,HDFS file read &amp; write process,HDFS fault Tolerance\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/data-flair.training\/blogs\/hadoop-hdfs-data-read-and-write-operations\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Hadoop HDFS Data Read and Write Operations - DataFlair\" \/>\n<meta property=\"og:description\" content=\"HDFS data read and write operations cover HDFS file read 
operation video,HDFS file write operation video,HDFS file read &amp; write process,HDFS fault Tolerance\" \/>\n<meta property=\"og:url\" content=\"https:\/\/data-flair.training\/blogs\/hadoop-hdfs-data-read-and-write-operations\/\" \/>\n<meta property=\"og:site_name\" content=\"DataFlair\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/DataFlairWS\/\" \/>\n<meta property=\"article:published_time\" content=\"2016-06-13T13:52:45+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2021-08-25T17:04:13+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/06\/HDFS-Data-Read-and-Write-Operations.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"628\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"DataFlair Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:site\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"DataFlair Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"9 minutes\" \/>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Hadoop HDFS Data Read and Write Operations - DataFlair","description":"HDFS data read and write operations cover HDFS file read operation video,HDFS file write operation video,HDFS file read & write process,HDFS fault Tolerance","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/data-flair.training\/blogs\/hadoop-hdfs-data-read-and-write-operations\/","og_locale":"en_US","og_type":"article","og_title":"Hadoop HDFS Data Read and Write Operations - DataFlair","og_description":"HDFS data read and write operations cover HDFS file read operation video,HDFS file write operation video,HDFS file read & write process,HDFS fault Tolerance","og_url":"https:\/\/data-flair.training\/blogs\/hadoop-hdfs-data-read-and-write-operations\/","og_site_name":"DataFlair","article_publisher":"https:\/\/www.facebook.com\/DataFlairWS\/","article_published_time":"2016-06-13T13:52:45+00:00","article_modified_time":"2021-08-25T17:04:13+00:00","og_image":[{"width":1200,"height":628,"url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/06\/HDFS-Data-Read-and-Write-Operations.jpg","type":"image\/jpeg"}],"author":"DataFlair Team","twitter_card":"summary_large_image","twitter_creator":"@DataFlairWS","twitter_site":"@DataFlairWS","twitter_misc":{"Written by":"DataFlair Team","Est. 
reading time":"9 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/data-flair.training\/blogs\/hadoop-hdfs-data-read-and-write-operations\/#article","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/hadoop-hdfs-data-read-and-write-operations\/"},"author":{"name":"DataFlair Team","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/beb0cab24b7aa54423a3b50e669a9dcd"},"headline":"Hadoop HDFS Data Read and Write Operations","datePublished":"2016-06-13T13:52:45+00:00","dateModified":"2021-08-25T17:04:13+00:00","mainEntityOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/hadoop-hdfs-data-read-and-write-operations\/"},"wordCount":1809,"commentCount":11,"publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/hadoop-hdfs-data-read-and-write-operations\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/06\/HDFS-Data-Read-and-Write-Operations.jpg","keywords":["apache hadoop","big data training","hadoop admin","Hadoop Administration","HDFS Read Operation","HDFS Write Operation","Read and Write Operations in HDFS"],"articleSection":["HDFS Tutorials"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/data-flair.training\/blogs\/hadoop-hdfs-data-read-and-write-operations\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/data-flair.training\/blogs\/hadoop-hdfs-data-read-and-write-operations\/","url":"https:\/\/data-flair.training\/blogs\/hadoop-hdfs-data-read-and-write-operations\/","name":"Hadoop HDFS Data Read and Write Operations - 
DataFlair","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/#website"},"primaryImageOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/hadoop-hdfs-data-read-and-write-operations\/#primaryimage"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/hadoop-hdfs-data-read-and-write-operations\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/06\/HDFS-Data-Read-and-Write-Operations.jpg","datePublished":"2016-06-13T13:52:45+00:00","dateModified":"2021-08-25T17:04:13+00:00","description":"HDFS data read and write operations cover HDFS file read operation video,HDFS file write operation video,HDFS file read & write process,HDFS fault Tolerance","breadcrumb":{"@id":"https:\/\/data-flair.training\/blogs\/hadoop-hdfs-data-read-and-write-operations\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/data-flair.training\/blogs\/hadoop-hdfs-data-read-and-write-operations\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/hadoop-hdfs-data-read-and-write-operations\/#primaryimage","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/06\/HDFS-Data-Read-and-Write-Operations.jpg","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/06\/HDFS-Data-Read-and-Write-Operations.jpg","width":1200,"height":628,"caption":"Hadoop HDFS Data Read and Write Operations"},{"@type":"BreadcrumbList","@id":"https:\/\/data-flair.training\/blogs\/hadoop-hdfs-data-read-and-write-operations\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Blog Home","item":"https:\/\/data-flair.training\/blogs\/"},{"@type":"ListItem","position":2,"name":"HDFS Tutorials","item":"https:\/\/data-flair.training\/blogs\/category\/hdfs\/"},{"@type":"ListItem","position":3,"name":"Hadoop HDFS Data Read and Write 
Operations"}]},{"@type":"WebSite","@id":"https:\/\/data-flair.training\/blogs\/#website","url":"https:\/\/data-flair.training\/blogs\/","name":"DataFlair","description":"Learn Today. Lead Tomorrow.","publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/data-flair.training\/blogs\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/data-flair.training\/blogs\/#organization","name":"DataFlair","url":"https:\/\/data-flair.training\/blogs\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","width":106,"height":48,"caption":"DataFlair"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DataFlairWS\/","https:\/\/x.com\/DataFlairWS","https:\/\/www.linkedin.com\/company\/dataflair-web-services-pvt-ltd\/","https:\/\/www.youtube.com\/user\/DataFlairWS"]},{"@type":"Person","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/beb0cab24b7aa54423a3b50e669a9dcd","name":"DataFlair Team","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/c322416204232f4dd97ef3901b0a499a5d34d7ba7fe333f4bfe53a907873d293?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/c322416204232f4dd97ef3901b0a499a5d34d7ba7fe333f4bfe53a907873d293?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/c322416204232f4dd97ef3901b0a499a5d34d7ba7fe333f4bfe53a907873d293?s=96&d=mm&r=g","caption":"DataFlair 
Team"},"description":"DataFlair Team specializes in creating clear, actionable content on programming, Java, Python, C++, DSA, AI, ML, data Science, Android, Flutter, MERN, Web Development, and technology. Backed by industry expertise, we make learning easy and career-oriented for beginners and pros alike.","url":"https:\/\/data-flair.training\/blogs\/author\/dfteam3\/"}]}},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/425","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/users\/7"}],"replies":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/comments?post=425"}],"version-history":[{"count":8,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/425\/revisions"}],"predecessor-version":[{"id":41960,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/425\/revisions\/41960"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media\/30491"}],"wp:attachment":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media?parent=425"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/categories?post=425"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/tags?post=425"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}