What happens if you load the same data into Hive a second time?

    • #5279
      DataFlair Team
      Spectator

      If you imported data from an RDBMS with Sqoop and then, for some reason, ran the same import a second time, what would happen?

    • #5281
      DataFlair Team
      Spectator

      Because the same command is run again, it fails with the error “Output directory hdfs://localhost:9000/sqoop2 already exists”. The import's output is generated by a MapReduce job, and a MapReduce job needs a new output directory every time it runs, so each run of the command needs a new directory.
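
One way to give every run its own output directory is to derive the directory name from a run identifier. The sketch below only builds the argument list for a `sqoop import` call; it does not run Sqoop. The JDBC URL, table name, base directory, and the helper name `build_import_command` are hypothetical and chosen for illustration only.

```python
def build_import_command(jdbc_url, table, base_dir, run_id):
    """Build a `sqoop import` argument list with a per-run target directory.

    All argument values are supplied by the caller; nothing here is
    taken from the original thread.
    """
    # A distinct run_id yields a distinct directory, so a repeated run
    # does not collide with the output of an earlier run.
    target_dir = f"{base_dir.rstrip('/')}/{table}_{run_id}"
    return [
        "sqoop", "import",
        "--connect", jdbc_url,
        "--table", table,
        "--target-dir", target_dir,
    ]


# Hypothetical values, for illustration only.
first = build_import_command(
    "jdbc:mysql://localhost/shop", "orders", "/user/hadoop/imports", "run1")
second = build_import_command(
    "jdbc:mysql://localhost/shop", "orders", "/user/hadoop/imports", "run2")
print(first[-1])
print(second[-1])
```

The resulting list could be handed to `subprocess.run` on a machine where Sqoop is installed; that step is left out here so the sketch runs anywhere.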

    • #5282
      DataFlair Team
      Spectator

      If the same load command is run again for some reason, it fails with a message stating that the directory already exists, because by default an import goes to a new target location. If the destination directory already exists in HDFS, Sqoop refuses to import and will not overwrite that directory's contents. If you use the --append argument, Sqoop imports the data to a temporary directory and then renames the files into the normal target directory in a way that does not conflict with existing filenames in that directory.
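
The two behaviors described above can be imitated on a local filesystem. This is a toy simulation, not Sqoop's actual code: a local directory stands in for HDFS, the function name `simulated_import` is hypothetical, and the `part-m-NNNNN` file naming is used only to make the rename step visible.

```python
import os
import shutil
import tempfile


def simulated_import(target_dir, rows, append=False):
    """Toy stand-in for an import job (assumption: local dirs, not HDFS)."""
    if os.path.exists(target_dir) and not append:
        # Default mode: refuse to touch an existing target directory.
        raise FileExistsError(
            f"Output directory {target_dir} already exists")

    # Write the new output to a temporary staging directory first.
    staging = tempfile.mkdtemp(prefix="import_staging_")
    try:
        with open(os.path.join(staging, "part-m-00000"), "w") as f:
            f.write("\n".join(rows) + "\n")

        os.makedirs(target_dir, exist_ok=True)
        for name in sorted(os.listdir(staging)):
            # Find the first part number not already used in the target,
            # so the moved file never replaces an existing one.
            index = 0
            while os.path.exists(
                    os.path.join(target_dir, f"part-m-{index:05d}")):
                index += 1
            shutil.move(
                os.path.join(staging, name),
                os.path.join(target_dir, f"part-m-{index:05d}"))
    finally:
        shutil.rmtree(staging, ignore_errors=True)
    return sorted(os.listdir(target_dir))


base = tempfile.mkdtemp()
target = os.path.join(base, "sqoop2")
print(simulated_import(target, ["1,alice"]))
try:
    simulated_import(target, ["1,alice"])  # same call again, no append
except FileExistsError as err:
    print("second run failed:", err)
print(simulated_import(target, ["2,bob"], append=True))
```

In this sketch the second plain run raises an error and leaves the first file untouched, while the append run adds a second file next to it.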
