September 20, 2018 at 4:55 pm #5999 DataFlair Team (Spectator)
Explain map-side joins in Hadoop.
What are map-side joins in MapReduce? Please discuss in detail.
September 20, 2018 at 4:55 pm #6001 DataFlair Team (Spectator)
A join is an operation that combines two or more datasets based on a common column or set of columns.
In MapReduce, a join performed by the mapper is called a map-side join, and a join performed by the reducer is called a reduce-side join.
A map-side join between large inputs works by performing the join before the data reaches the map function.
A map-side join is generally more efficient than a reduce-side join, because the datasets do not have to pass through the shuffle and sort phase in order to be joined.
Now let us understand this with the help of an example.
Suppose we have two datasets:

DS-1 (Employees Working on Projects)
ProjectID  EmpID
101        E-1
101        E-2
102        E-3
102        E-4

DS-2 (Project Details)
ProjectID  ProjectName
101        P1
102        P2

Now let us assume we want to combine these datasets on the basis of ProjectID and see the project details and employee details together, like this:

ProjectID  ProjectName  EmpID
101        P1           E-1
101        P1           E-2
102        P2           E-3
102        P2           E-4

In a map-side join, the map operation produces the following output:

Map task 1:
101  P1  E-1
101  P1  E-2

Map task 2:
102  P2  E-3
102  P2  E-4

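What happens to a single partition can be sketched in a few lines of Python. This is only an illustration, assuming both inputs for the partition are already sorted by ProjectID; the function name `merge_join` and the tuple layout are made up for this example and are not part of the Hadoop API.

```python
def merge_join(left, right):
    """Join two lists of (key, value) pairs that are both sorted by key.

    Hypothetical helper for illustration; not Hadoop API code.
    """
    joined = []
    i = j = 0
    while i < len(left) and j < len(right):
        lkey, rkey = left[i][0], right[j][0]
        if lkey < rkey:
            i += 1          # no match for this left key, move on
        elif lkey > rkey:
            j += 1          # no match for this right key, move on
        else:
            # Find the run of right-hand records that share this key.
            j_end = j
            while j_end < len(right) and right[j_end][0] == lkey:
                j_end += 1
            # Pair every left record with this key with every record in the run.
            while i < len(left) and left[i][0] == lkey:
                for _, rvalue in right[j:j_end]:
                    joined.append((lkey, left[i][1], rvalue))
                i += 1
            j = j_end
    return joined


# One (DS-2 slice, DS-1 slice) pair per partition, each sorted by ProjectID.
partitions = [
    ([(101, "P1")], [(101, "E-1"), (101, "E-2")]),
    ([(102, "P2")], [(102, "E-3"), (102, "E-4")]),
]

for projects, employees in partitions:
    for row in merge_join(projects, employees):
        print(*row)
# 101 P1 E-1
# 101 P1 E-2
# 102 P2 E-3
# 102 P2 E-4
```

Because both inputs are sorted by the same key, a single forward pass over each is enough, and nothing has to be held in memory beyond the records of the current key.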
To produce the output shown above, the input to each map task must be in the following form:

Input to map task 1:
101  P1
101  E-1
101  E-2

Input to map task 2:
102  P2
102  E-3
102  E-4

From this we can infer the strict requirements for map-side joins:
- All input datasets must be sorted by the same key, and that key must be the one the join is performed on. In the example above it is ProjectID.
- Each input dataset must be divided into the same number of partitions.
- All records with the same key must be in the same partition.
The joined map output can then be used as input to the reducer, if the job has a reduce phase.
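The three requirements can be illustrated with a small Python sketch. It assumes a simple modulo partitioner on ProjectID, chosen only for illustration; `partition_and_sort` and `meets_requirements` are hypothetical helpers, not Hadoop functions.

```python
def partition_and_sort(records, num_partitions):
    """Split (key, value) records into partitions and sort each one by key.

    Uses key % num_partitions as the partitioner, an assumption made for
    this example only.
    """
    partitions = [[] for _ in range(num_partitions)]
    for key, value in records:
        partitions[key % num_partitions].append((key, value))
    return [sorted(part, key=lambda rec: rec[0]) for part in partitions]


def meets_requirements(parts_a, parts_b):
    """Check the three map-side join requirements for two partitioned datasets."""
    # Requirement 2: same number of partitions.
    if len(parts_a) != len(parts_b):
        return False
    owner = {}  # key -> index of the partition that holds it
    for parts in (parts_a, parts_b):
        for index, part in enumerate(parts):
            keys = [key for key, _ in part]
            # Requirement 1: every partition is sorted by the join key.
            if keys != sorted(keys):
                return False
            # Requirement 3: a key lives in one partition index only,
            # and that index is the same in both datasets.
            for key in keys:
                if owner.setdefault(key, index) != index:
                    return False
    return True


ds1 = [(102, "E-4"), (101, "E-1"), (102, "E-3"), (101, "E-2")]  # unsorted DS-1
ds2 = [(102, "P2"), (101, "P1")]                                # unsorted DS-2

print(meets_requirements(partition_and_sort(ds1, 2), partition_and_sort(ds2, 2)))  # True
print(meets_requirements(partition_and_sort(ds1, 2), partition_and_sort(ds2, 3)))  # False
```

The second call fails because the two datasets were split into different numbers of partitions, so partition i of one dataset no longer lines up with partition i of the other.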