Difference between RDBMS and Hadoop MapReduce


    • #5795
      DataFlair Team
      Spectator

      What is the difference between an RDBMS and Hadoop MapReduce?
      How does Hadoop MapReduce compare with an RDBMS?

    • #5796
      DataFlair Team
      Spectator

      Hi Animesh,
      I think your question would be better framed as “What is the difference between RDBMS and Hadoop HDFS?”.

      If you want the answer to the former question, here it is:

      1. In simple terms, an RDBMS is a relational database management system. It stores data in tables made up of rows and columns, and Structured Query Language (SQL) is used to retrieve the data stored in those tables.
      2. Hadoop MapReduce, on the other hand, is the Apache Hadoop implementation of the MapReduce programming model, which was introduced by Google. Many organizations use it to process huge amounts of data in parallel across a cluster, according to their business requirements.
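
To make the contrast a bit more concrete, here is a minimal sketch in Python that computes the same totals in both styles. It is only an illustration: the `SALES` data, the `sales` table and all function names are made up for this example, SQLite stands in for an RDBMS, and the map, shuffle and reduce steps run in a single process, whereas Hadoop MapReduce would distribute them across a cluster.

```python
import sqlite3
from collections import defaultdict

# Hypothetical sample data: (region, amount) sales records.
SALES = [("north", 10), ("south", 5), ("north", 7), ("east", 3), ("south", 1)]


def total_by_region_sql(rows):
    """RDBMS style: load rows into a table and let a SQL query do the grouping."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE sales (region TEXT NOT NULL, amount INTEGER NOT NULL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    cursor = conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region")
    result = dict(cursor)  # each result row is a (region, total) pair
    conn.close()
    return result


def map_phase(record):
    """Map: emit a (key, value) pair for one input record."""
    region, amount = record
    yield region, amount


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(key, values):
    """Reduce: collapse the values for one key into a single result."""
    return key, sum(values)


def total_by_region_mapreduce(rows):
    """MapReduce style: explicit map, shuffle and reduce steps (single process here)."""
    mapped = (pair for record in rows for pair in map_phase(record))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(total_by_region_sql(SALES))
    print(total_by_region_mapreduce(SALES))
```

Both functions return the same totals per region. In the SQL version you describe the result and the database decides how to compute it, while in the MapReduce version you write the map and reduce functions yourself and the framework takes care of running them in parallel.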

      If the latter question is the one you are interested in, here are the differences between RDBMS and Hadoop HDFS:

      1. Both RDBMS and Hadoop systems offer similar functions, such as collecting, storing, processing, retrieving, extracting and manipulating data. However, they differ in how they process data. An RDBMS focuses on structured data, whereas Hadoop specializes in semi-structured and unstructured data.
      2. RDBMS technology is proven, consistent, mature and well supported by the world's leading companies. It works best when the data is well defined, with data types, relationships among the data, constraints, etc. Hence, it is more appropriate for real-time OLTP processing. Hadoop, on the other hand, was developed more recently and has come into demand because of big data and unstructured data in different formats.
      3. An RDBMS is mostly used for OLTP processing, whereas Hadoop is used for analytical and big data processing.
      4. Storage maintenance on an RDBMS typically requires downtime. In standalone database systems, adding processing power such as more CPU or physical memory in a non-virtualized environment requires downtime for RDBMSs such as DB2, Oracle, and SQL Server. Hadoop systems, however, are made up of individual independent nodes that can be added on an as-needed basis.
      5. In RDBMS systems, a database cluster uses the same data files, kept in shared storage. In Hadoop, by contrast, data can be stored independently on each processing node.
      6. The performance of an RDBMS can degrade, even in a proven environment. However, Hadoop enables hot tuning by adding extra nodes, which are then self-managed.
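
Points 1 and 2 above can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not how either system is implemented: SQLite stands in for an RDBMS, a plain list of strings stands in for raw files stored in Hadoop, and the `orders` table, the sample data and the function names are all hypothetical. The first function shows a predefined schema rejecting bad rows at write time. The second shows raw records in mixed formats being stored as-is and only interpreted when they are read.

```python
import json
import sqlite3

# Hypothetical rows for a table with a predefined schema: (id, customer, amount).
ORDERS = [
    (1, "alice", 20.0),  # valid
    (2, None, 5.0),      # violates NOT NULL on customer
    (3, "bob", -1.0),    # violates CHECK (amount >= 0)
]

# Hypothetical raw lines as they might land in a file: nothing is validated on write.
RAW_LINES = [
    '{"customer": "alice", "amount": 20.0}',
    "customer=bob amount=7",   # a different format, stored anyway
    '{"customer": "carol"}',   # valid JSON, but the amount field is missing
    "not a record at all",
]


def insert_orders(rows):
    """Structured data: the schema is enforced when rows are written.

    Returns a tuple (accepted, rejected).
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE orders ("
        " id INTEGER PRIMARY KEY,"
        " customer TEXT NOT NULL,"
        " amount REAL NOT NULL CHECK (amount >= 0))"
    )
    accepted = rejected = 0
    for row in rows:
        try:
            conn.execute(
                "INSERT INTO orders (id, customer, amount) VALUES (?, ?, ?)", row
            )
            accepted += 1
        except sqlite3.IntegrityError:
            # The database refuses rows that break the declared constraints.
            rejected += 1
    conn.close()
    return accepted, rejected


def total_amount_from_raw(lines):
    """Semi-structured data: structure is applied only when the data is read.

    Returns a tuple (total, skipped), where skipped counts lines that could
    not be interpreted as a JSON record with a numeric amount.
    """
    total = 0.0
    skipped = 0
    for line in lines:
        try:
            record = json.loads(line)
            total += float(record["amount"])
        except (ValueError, KeyError, TypeError):
            skipped += 1
    return total, skipped


if __name__ == "__main__":
    print(insert_orders(ORDERS))
    print(total_amount_from_raw(RAW_LINES))
```

The trade-off is that the table only ever contains rows that satisfy its constraints, which suits OLTP workloads, while the raw store accepts anything and leaves it to each processing job to decide what to do with records it cannot parse.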