What are the different types of OutputFormat in MapReduce?

This topic has 1 reply, 1 voice, and was last updated 5 years, 7 months ago by DataFlair Team.

Author

Posts
- September 20, 2018 at 5:21 pm #6204
  
  DataFlair Team
  Spectator
  
  What are the most common OutputFormats in Hadoop?
  How many types of OutputFormat are there in Hadoop?
- September 20, 2018 at 5:21 pm #6206
  
  DataFlair Team
  Spectator
  
  Hadoop's RecordWriter takes the output data from the Reducer and writes it to output files. How these output key-value pairs are written to the output files is determined by the OutputFormat.
  OutputFormat is the counterpart of InputFormat. The OutputFormat classes provided by Hadoop are used to write files to the local disk or to the Hadoop file system (HDFS). An OutputFormat defines the output requirements of a MapReduce job. The basic requirements are:
  1. It checks that the output directory does not already exist.
  2. The OutputFormat provides the RecordWriter implementation used to write the job's output files.
  3. Output files are stored in a Hadoop file system.
  • Use FileOutputFormat.setOutputPath() to set the output directory.
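
The first requirement can be illustrated with a small sketch. This is plain Java that only mimics the rule; it does not use Hadoop's classes, and the names OutputDirCheck and checkOutputDir are hypothetical.

```java
import java.io.File;

class OutputDirCheck {
    /**
     * Throws if the output directory already exists. This mirrors the rule
     * described above; it is not Hadoop's actual implementation.
     */
    static void checkOutputDir(File outputDir) {
        if (outputDir.exists()) {
            throw new IllegalStateException(
                "Output directory " + outputDir + " already exists");
        }
    }

    public static void main(String[] args) {
        File tmp = new File(System.getProperty("java.io.tmpdir"));
        File fresh = new File(tmp, "job-output-" + System.nanoTime());

        // Passes: nothing exists at this path yet.
        checkOutputDir(fresh);

        // Fails: the temp directory itself already exists.
        try {
            checkOutputDir(tmp);
        } catch (IllegalStateException e) {
            System.out.println("Job rejected: " + e.getMessage());
        }
    }
}
```

Failing early like this means a job cannot silently overwrite the results of a previous run.
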
  Types:
  • TextOutputFormat
  • SequenceFileOutputFormat
  • MapFileOutputFormat
  • MultipleOutputs
  • LazyOutputFormat
  • DBOutputFormat
  • SequenceFileAsBinaryOutputFormat
  
  TextOutputFormat is the default OutputFormat. It writes key-value pairs on individual lines of text files.
  The key and value on each line are separated by a tab character.
  • The separator can be changed with the mapreduce.output.textoutputformat.separator property.
  • KeyValueTextInputFormat can be used to read these output text files back, since it breaks lines into key-value pairs based on a configurable separator.
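
As a rough illustration of the line layout described above, here is a plain-Java sketch that writes a key-value pair as one separated line and splits it back. It does not call Hadoop; TextLineSketch, toLine and fromLine are hypothetical names, and treating a line without a separator as a key with an empty value is an assumption of this sketch.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

class TextLineSketch {
    /** Formats one record the way a text output line is laid out: key, separator, value. */
    static String toLine(String key, String value, String separator) {
        return key + separator + value;
    }

    /**
     * Splits a line at the first occurrence of the separator.
     * If the separator is absent, the whole line becomes the key and the value is empty.
     */
    static Map.Entry<String, String> fromLine(String line, String separator) {
        int pos = line.indexOf(separator);
        if (pos < 0) {
            return new SimpleEntry<>(line, "");
        }
        String key = line.substring(0, pos);
        String value = line.substring(pos + separator.length());
        return new SimpleEntry<>(key, value);
    }

    public static void main(String[] args) {
        String line = toLine("hadoop", "3", "\t");
        Map.Entry<String, String> kv = fromLine(line, "\t");
        System.out.println(kv.getKey() + " -> " + kv.getValue());
    }
}
```
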
  SequenceFileOutputFormat is an OutputFormat that writes sequence files as its output. It is a good intermediate format for use between MapReduce jobs.
  • It rapidly serializes arbitrary data types to the file, and the corresponding SequenceFileInputFormat deserializes the file into the same data types and presents the data to the next mapper.
  
  MapFileOutputFormat
  • MapFileOutputFormat is another form of FileOutputFormat, which is used to write output as MapFiles. The keys in a MapFile must be added in order, so we need to ensure that the reducer emits keys in sorted order.
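
The ordering constraint can be sketched in plain Java as a writer that rejects a key smaller than the previous one. This is only an illustration of the idea, not the Hadoop MapFile API; SortedKeyWriter and its methods are hypothetical names, and allowing equal keys is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;

class SortedKeyWriter {
    private final List<String> entries = new ArrayList<>();
    private String lastKey = null;

    /** Appends a record, refusing any key that sorts before the previous key. */
    void append(String key, String value) {
        if (lastKey != null && key.compareTo(lastKey) < 0) {
            throw new IllegalArgumentException(
                "key out of order: " + key + " after " + lastKey);
        }
        entries.add(key + "=" + value);
        lastKey = key;
    }

    /** Number of records accepted so far. */
    int size() {
        return entries.size();
    }

    public static void main(String[] args) {
        SortedKeyWriter writer = new SortedKeyWriter();
        writer.append("apple", "1");
        writer.append("banana", "2");
        try {
            writer.append("avocado", "3"); // sorts before "banana", so it is rejected
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
        System.out.println("records written: " + writer.size());
    }
}
```
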
  
  MultipleOutputs allows writing data to files whose names are derived from the output keys and values,
  or in fact from an arbitrary string.
  
  LazyOutputFormat
  • Sometimes FileOutputFormat will create output files even if they are empty.
  • LazyOutputFormat is a wrapper OutputFormat which ensures that the output file is created only when the first record is emitted for a given partition.
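
The lazy-creation idea can be sketched in plain Java: the file is opened only when the first record arrives, so a writer that receives no records leaves no empty file behind. This does not use Hadoop; LazyFileWriter is a hypothetical class written only for illustration.

```java
import java.io.File;
import java.io.FileWriter;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.io.Writer;

class LazyFileWriter implements AutoCloseable {
    private final File target;
    private Writer writer; // stays null until the first record is written

    LazyFileWriter(File target) {
        this.target = target;
    }

    /** Opens the file on the first call, then appends the record as one line. */
    void write(String record) {
        try {
            if (writer == null) {
                writer = new FileWriter(target);
            }
            writer.write(record);
            writer.write(System.lineSeparator());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Closes the underlying file only if it was ever opened. */
    @Override
    public void close() {
        try {
            if (writer != null) {
                writer.close();
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        File dir = new File(System.getProperty("java.io.tmpdir"));
        File empty = new File(dir, "part-empty-" + System.nanoTime());
        File used = new File(dir, "part-used-" + System.nanoTime());

        try (LazyFileWriter a = new LazyFileWriter(empty);
             LazyFileWriter b = new LazyFileWriter(used)) {
            b.write("key\tvalue"); // only this writer ever receives a record
        }

        System.out.println("empty partition file exists: " + empty.exists());
        System.out.println("used partition file exists: " + used.exists());
        used.delete();
    }
}
```
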
  
  DBOutputFormat is an OutputFormat for writing to relational databases. It sends the reduce output to a SQL table.
  • It accepts key-value pairs, where the key has a type extending DBWritable.
  • The returned RecordWriter writes only the key to the database, using a batch SQL query.
  
  Follow the link for more detail: OutputFormat in Hadoop