MapReduce Tutorials

Top 60 Hadoop MapReduce Interview Questions and Answers

Hadoop MapReduce Interview Questions and Answers: Objective. Hadoop MapReduce is a software framework for easily writing applications that process large amounts of data in parallel on large clusters of commodity hardware. Because it deals with data processing, it is likely to come up in Hadoop interviews. In this section, we cover 60 MapReduce interview questions and answers framed by our company experts. These Hadoop MapReduce interview questions and answers are […]

How Hadoop MapReduce Works – MapReduce Tutorial

1. Objective. MapReduce is the core component of Hadoop that processes huge amounts of data in parallel by dividing the work into a set of independent tasks. In MapReduce, data flows step by step from the Mapper to the Reducer. In this tutorial, we cover how Hadoop MapReduce works internally. This blog on Hadoop MapReduce data flow provides the complete MapReduce data flow chart in Hadoop. The tutorial covers the various phases of MapReduce job execution, such as Input Files, […]

Hadoop Counters | The Most Complete Guide to MapReduce Counters

1. Hadoop Counters: Objective. In this MapReduce Hadoop Counters tutorial, we provide a detailed description of MapReduce Counters in Hadoop. The tutorial covers an introduction to Hadoop MapReduce counters and the types of Hadoop Counters, such as built-in counters and user-defined counters. We will also discuss the FileInputFormat and FileOutputFormat counters of Hadoop MapReduce. 2. What is Hadoop MapReduce? Before we start with Hadoop Counters, let us first look at an overview of Hadoop MapReduce. MapReduce is […]

Learn the Concept of Key-Value Pair in Hadoop MapReduce

1. Objective. In this MapReduce tutorial, we learn the concept of a key-value pair in Hadoop. The key-value pair is the record entity that a MapReduce job receives for execution. By default, the job uses TextInputFormat, whose RecordReader converts the input data into key-value pairs. Here we learn what a key-value pair in MapReduce is, how key-value pairs are generated in Hadoop using InputSplit and RecordReader, and on what basis key-value pairs are generated in Hadoop MapReduce. We […]
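
With TextInputFormat, each record's key is the byte offset at which a line starts and the value is the text of that line. The following is a minimal Python sketch of that idea, not Hadoop code; the function name `text_input_records` is hypothetical and the input is assumed to be UTF-8 text held in memory.

```python
def text_input_records(data: bytes):
    """Yield (byte_offset, line_text) pairs, one per line of input.

    Hypothetical stand-in for a line-oriented record reader:
    the key is the byte offset of the line start, the value is
    the line without its terminator.
    """
    offset = 0
    for raw_line in data.splitlines(keepends=True):
        # Strip the line terminator from the value, but count it in the offset.
        yield offset, raw_line.rstrip(b"\r\n").decode("utf-8")
        offset += len(raw_line)


if __name__ == "__main__":
    sample = b"hello world\nfoo bar\nbaz"
    for key, value in text_input_records(sample):
        print(key, value)
```

Each pair produced this way is what a mapper would receive as one input record.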

Hadoop Output Format – Types of Output Format in MapReduce

1. Hadoop Output Format – Objective. The Hadoop Output Format checks the output specification of the job. It determines how the RecordWriter implementation is used to write output to output files. In this blog, we look at what Hadoop Output Format is, what Hadoop RecordWriter is, and how RecordWriter is used in Hadoop. In this Hadoop Reducer Output Format guide, we will also discuss the various types of Output Format in Hadoop, such as TextOutputFormat, SequenceFileOutputFormat, MapFileOutputFormat, SequenceFileAsBinaryOutputFormat, DBOutputFormat, LazyOutputFormat, and MultipleOutputs. Wherever you face […]

Map Only Job in Hadoop MapReduce with example

1. Objective. In Hadoop, a map-only job is one in which the mapper does all the work, the reducer does none, and the mapper’s output is the final output. In this tutorial on map-only jobs in Hadoop MapReduce, we learn about the MapReduce process, the need for a map-only job in Hadoop, and how to set the number of reducers to 0 for a Hadoop map-only job. We will also learn the advantages of a map-only job in Hadoop […]

Shuffling and Sorting in Hadoop MapReduce

1. Objective. In Hadoop, the process by which intermediate output from the mappers is transferred to the reducers is called shuffling. Each reducer receives one or more keys and their associated values, depending on the number of reducers. The intermediate key-value pairs generated by the mapper are sorted automatically by key. In this blog, we discuss shuffling and sorting in Hadoop MapReduce in detail. Here we learn what sorting in Hadoop is, what shuffling in Hadoop is, what the purpose of shuffling and sorting […]
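
The grouping and ordering that shuffle and sort perform can be illustrated with a small in-memory word count. This is only a sketch in plain Python, assuming a single reducer and data that fits in memory; the helper names (`map_words`, `shuffle_and_sort`, `reduce_counts`) are hypothetical and do not correspond to Hadoop APIs.

```python
from collections import defaultdict


def map_words(line):
    """Mapper stand-in: emit a (word, 1) pair for every word in the line."""
    return [(word, 1) for word in line.split()]


def shuffle_and_sort(mapper_outputs):
    """Group values by key across all mapper outputs, then sort by key."""
    grouped = defaultdict(list)
    for pairs in mapper_outputs:
        for key, value in pairs:
            grouped[key].append(value)
    # Sorting the (key, values) items orders them by key.
    return sorted(grouped.items())


def reduce_counts(key, values):
    """Reducer stand-in: sum the values collected for one key."""
    return key, sum(values)


if __name__ == "__main__":
    lines = ["b a", "a c a"]
    mapper_outputs = [map_words(line) for line in lines]
    for key, values in shuffle_and_sort(mapper_outputs):
        print(reduce_counts(key, values))
```

The reducer only ever sees each key once, together with every value the mappers emitted for it, which is the guarantee shuffling and sorting provide.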

Hadoop Partitioner – Internals of MapReduce Partitioner

1. Hadoop Partitioner / MapReduce Partitioner. In this MapReduce tutorial, our objective is to discuss what the Hadoop Partitioner is. The Partitioner in MapReduce controls the partitioning of the keys of the intermediate mapper output. A hash function applied to the key (or a subset of the key) derives the partition. The total number of partitions is the same as the number of reduce tasks. Here we will also learn why the Hadoop partitioner is needed, what the default Hadoop partitioner is, how many […]
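
Hash partitioning can be sketched as: hash the key, clear the sign bit, and take the remainder modulo the number of reduce tasks. The Python below is an illustration under assumptions, not Hadoop's implementation: it uses a Java-String-style hash as a stand-in (Hadoop key types such as Text define their own hash codes), and the function names are hypothetical.

```python
def java_string_hash(s):
    """Stand-in hash: mimics Java's String.hashCode() as a signed 32-bit int.

    Assumes characters in the Basic Multilingual Plane.
    """
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    # Convert the unsigned 32-bit value to a signed one.
    return h - 0x100000000 if h >= 0x80000000 else h


def get_partition(key, num_reduce_tasks):
    """Map a key to a partition index in [0, num_reduce_tasks)."""
    # Masking with 0x7FFFFFFF drops the sign bit so the result is non-negative.
    return (java_string_hash(key) & 0x7FFFFFFF) % num_reduce_tasks


if __name__ == "__main__":
    for key in ["hello", "hadoop", "mapreduce"]:
        print(key, get_partition(key, 3))
```

Because the partition depends only on the key and the number of reduce tasks, every record with the same key reaches the same reducer.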

InputSplit in Hadoop MapReduce – Hadoop MapReduce Tutorial

1. Objective. In this Hadoop MapReduce tutorial, we provide a detailed description of InputSplit in Hadoop. In this blog, we try to answer what Hadoop InputSplit is, why InputSplit is needed in MapReduce, how Hadoop performs InputSplit, and how to change the split size in Hadoop. We will also learn the difference between InputSplit and blocks in HDFS. 2. What is InputSplit in Hadoop? InputSplit in Hadoop MapReduce is the logical representation of data. It describes […]
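
To see how a split size setting relates to the HDFS block size, here is a rough Python sketch. It assumes the commonly described rule for file-based input, where the split size is max(minimum size, min(maximum size, block size)) and a final remainder of up to 10% over the split size is folded into the last split. The function names and the example sizes are hypothetical, and the code only plans (offset, length) ranges; it reads no data.

```python
SPLIT_SLOP = 1.1  # assumed tolerance: last split may be up to 10% larger


def compute_split_size(block_size, min_size, max_size):
    """Pick the split size from the block size and the min/max settings."""
    return max(min_size, min(max_size, block_size))


def plan_splits(file_length, split_size):
    """Return a list of (offset, length) logical splits for one file."""
    splits = []
    offset = 0
    remaining = file_length
    while remaining / split_size > SPLIT_SLOP:
        splits.append((offset, split_size))
        offset += split_size
        remaining -= split_size
    if remaining > 0:
        # Whatever is left becomes the final, possibly smaller, split.
        splits.append((offset, remaining))
    return splits


if __name__ == "__main__":
    MB = 1024 * 1024
    size = compute_split_size(block_size=128 * MB, min_size=1, max_size=2**63 - 1)
    for split in plan_splits(300 * MB, size):
        print(split)
```

Raising the minimum size above the block size yields fewer, larger splits, while lowering the maximum size below the block size yields more, smaller ones.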

Hadoop RecordReader – How RecordReader Works in Hadoop?

1. Hadoop RecordReader Tutorial – Objective. In this Hadoop RecordReader tutorial, we discuss an important concept of Hadoop MapReduce: the RecordReader. The MapReduce RecordReader in Hadoop takes the byte-oriented view of the input provided by the InputSplit and presents it as a record-oriented view for the Mapper. It uses the data within the boundaries created by the InputSplit and creates key-value pairs. This blog will answer what RecordReader in Hadoop is, how Hadoop RecordReader works, and the types of Hadoop […]