Real Time Problem on Hadoop Cluster

This topic has 1 reply, 1 voice, and was last updated 5 years, 7 months ago by DataFlair Team.

Author

Posts
- September 20, 2018 at 4:53 pm #5985
  
  DataFlair Team
  Spectator
  
  What are the real-time problems faced during project development on a Hadoop cluster?
- September 20, 2018 at 4:53 pm #5988
  
  DataFlair Team
  Spectator
  
  Analyzing terabytes of data can be cumbersome because most of it is unstructured, and not all data fits into rows and columns. Because the data arrives in multiple formats and changes over time, processing it is slow. HDFS also follows a write-once, read-many model: once data has been written into HDFS, users can’t easily insert, delete, or modify individual pieces of data stored in the file system.
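To make the "no in-place edits" point concrete, here is a minimal sketch of the usual workaround: rewrite the whole file and swap the new copy in. It uses a local file as a stand-in for a file in HDFS, and the function name `rewrite_records` and the `transform` callback are hypothetical names chosen for illustration, not part of any Hadoop API.

```python
import os
import tempfile


def rewrite_records(path, transform):
    """Rewrite a whole file, applying `transform` to every record (line).

    `transform` returns the new record, or None to drop the record.
    A local file stands in for an HDFS file here (assumption for the sketch).
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Write the new version next to the original instead of editing in place.
    fd, tmp_path = tempfile.mkstemp(dir=directory)
    with os.fdopen(fd, "w") as out, open(path) as src:
        for line in src:
            updated = transform(line.rstrip("\n"))
            if updated is not None:
                out.write(updated + "\n")
    # Swap the rewritten copy in place of the old file.
    os.replace(tmp_path, path)
```

Changing or deleting a single record this way costs a full pass over the file, which is why small, frequent updates are awkward on a write-once store.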
  
  Impala and HAWQ provide interfaces that let end users write queries in SQL. Hive offers a similar interface and, in its classic configuration, translates queries into MapReduce jobs for execution on a Hadoop cluster, whereas Impala and HAWQ run queries on their own parallel execution engines. Translating a query into MapReduce jobs is typically slower than running a SQL query directly against a DBMS.
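To illustrate what translating a SQL query into MapReduce involves, here is a minimal in-memory sketch of how a query like `SELECT dept, COUNT(*) FROM employees GROUP BY dept` could be split into map, shuffle, and reduce steps. The table `employees`, the column `dept`, and all function names are assumptions made for this example; this is not the plan any real SQL-on-Hadoop engine generates.

```python
from collections import defaultdict


def map_phase(rows):
    # Mapper: emit (group key, 1) for every input row.
    for row in rows:
        yield row["dept"], 1


def shuffle(pairs):
    # Shuffle: collect all values that share the same key.
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    # Reducer: COUNT(*) is the sum of the 1s emitted for each key.
    return {key: sum(values) for key, values in groups.items()}


def count_by_dept(rows):
    return reduce_phase(shuffle(map_phase(rows)))
```

On a real cluster each of these phases is a distributed step with job startup, disk, and network costs, which helps explain why a simple aggregate can take longer than the same query on a DBMS.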