What happens if you load the same data into Hive a second time?

This topic has 2 replies, 1 voice, and was last updated 5 years, 7 months ago by DataFlair Team.

Author

Posts
- September 20, 2018 at 2:45 pm #5279
  
  DataFlair Team
  Spectator
  
  If you load data from an RDBMS with Sqoop and, for some reason, run the same load a second time, what happens?
- September 20, 2018 at 2:45 pm #5281
  
  DataFlair Team
  Spectator
  
  Running the same command again fails with the error “Output directory hdfs://localhost:9000/sqoop2 already exists”. The import's output is written by a MapReduce job, and a MapReduce job requires an output directory that does not yet exist, so every run of the command needs a new target directory.
- September 20, 2018 at 2:45 pm #5282
  
  DataFlair Team
  Spectator
  
  If the same load command is run again for any reason, it fails with an error stating that the directory already exists, because by default an import goes to a new target location. If the destination directory already exists in HDFS, Sqoop refuses to import rather than overwrite that directory’s contents. If you use the --append argument, Sqoop imports the data into a temporary directory and then renames the files into the normal target directory in a way that does not conflict with the filenames already in that directory.
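To make the two behaviours in the replies concrete, here is a minimal sketch in Python that models them on the local filesystem. It is a toy illustration, not Sqoop's code: the function name `import_rows`, the single `part-m-NNNNN` file per run, and the way the next free file number is chosen are all assumptions made for this example.

```python
import os
import shutil
import tempfile


def import_rows(rows, target_dir, append=False):
    """Toy model of an import that writes one part file into target_dir.

    Hypothetical helper for illustration only; it does not call Sqoop or HDFS.
    """
    if os.path.exists(target_dir) and not append:
        # Models the failure from the replies: an existing output
        # directory is never overwritten.
        raise FileExistsError(f"Output directory {target_dir} already exists")

    # Write the new data into a temporary directory first.
    tmp_dir = tempfile.mkdtemp(prefix="import_tmp_")
    try:
        with open(os.path.join(tmp_dir, "part-m-00000"), "w") as f:
            f.writelines(row + "\n" for row in rows)

        os.makedirs(target_dir, exist_ok=True)
        index = 0
        for name in sorted(os.listdir(tmp_dir)):
            # Assumed naming scheme: pick the next file number that is
            # not already used in the target directory.
            dest = os.path.join(target_dir, f"part-m-{index:05d}")
            while os.path.exists(dest):
                index += 1
                dest = os.path.join(target_dir, f"part-m-{index:05d}")
            shutil.move(os.path.join(tmp_dir, name), dest)
    finally:
        # Remove the temporary directory whether or not the move succeeded.
        shutil.rmtree(tmp_dir, ignore_errors=True)

    return sorted(os.listdir(target_dir))
```

Calling `import_rows` twice with the same `target_dir` raises `FileExistsError` on the second call, while passing `append=True` on the second call leaves the first file untouched and adds a new one beside it.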