Is Hadoop a database?

This topic has 2 replies, 1 voice, and was last updated 5 years, 7 months ago by DataFlair Team.

- September 20, 2018 at 12:11 pm #4765
  
  DataFlair Team
  Spectator
  
  Is Hadoop just another database like Oracle or DB2?
- September 20, 2018 at 12:11 pm #4766
  
  DataFlair Team
  Spectator
  
  A database, in the traditional sense, stores structured data. Hadoop is not a database; it is a framework for distributed storage and processing that can hold data in any format. The Hadoop ecosystem provides tools for:
  
  Data Storage
  Data Access
  Data Serialization
  Data Intelligence
  Data Integration
  Management, Monitoring, Orchestration
  Interaction, Visualization, Execution, Development
  
  Now, let's compare Hadoop's storage layer (HDFS) with a native (traditional) database.
  
  Hadoop’s HDFS:
  Hadoop stores very large amounts of structured, semi-structured and unstructured data in HDFS as flat files spread across a cluster. HDFS stores data reliably: each file is broken into blocks, the blocks are distributed across the nodes of the cluster, and each block is then replicated, meaning copies of it are created on different machines. If a machine goes down or crashes, the data can still be read from the other machines holding the replicas. By default, three copies of each block are kept on different machines, which makes HDFS highly fault tolerant. We use MapReduce to process the data in HDFS, but it does not return results quickly, because it runs as a batch job that scans the files rather than looking up individual records at random.
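
  The block splitting and replication described above can be sketched in a few lines of Python. This is only an illustration, not how HDFS is implemented: the 128 MB block size is a common default that can be changed, the node names are made up, and the round-robin placement is an assumption used for simplicity (real HDFS placement is rack-aware).

  ```python
  import math

  BLOCK_SIZE_MB = 128   # common HDFS default block size (configurable)
  REPLICATION = 3       # default replication factor


  def place_blocks(file_size_mb, nodes,
                   block_size_mb=BLOCK_SIZE_MB, replication=REPLICATION):
      """Split a file into blocks and assign each block to `replication` nodes.

      Placement is plain round-robin, an assumption made for illustration.
      Returns a dict mapping block index to the list of nodes holding a copy.
      """
      if replication > len(nodes):
          raise ValueError("need at least as many nodes as replicas")
      num_blocks = math.ceil(file_size_mb / block_size_mb)
      return {
          block: [nodes[(block + r) % len(nodes)] for r in range(replication)]
          for block in range(num_blocks)
      }


  def readable_after_failure(placement, failed_node):
      """True if every block still has at least one copy on a live node."""
      return all(
          any(node != failed_node for node in replicas)
          for replicas in placement.values()
      )


  # Hypothetical four-node cluster storing a 300 MB file.
  nodes = ["node1", "node2", "node3", "node4"]
  placement = place_blocks(300, nodes)
  still_readable = readable_after_failure(placement, "node2")
  ```

  With these assumed numbers the file is split into three blocks, and losing any single node still leaves two copies of every block available.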
  
  Native Database:
  A native (traditional) database is designed for structured data, and the data must be processed to fit the table schema before it can be loaded. As the volume of data grows, this becomes an inefficient way to handle it. Data is stored in tables and accessed using SQL.
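
  As a small illustration of the table-and-SQL model, here is a sketch using Python's built-in `sqlite3` module as a stand-in for a traditional database such as Oracle or DB2. The `employees` table and its rows are made up for the example.

  ```python
  import sqlite3

  # In-memory database standing in for a traditional relational database.
  conn = sqlite3.connect(":memory:")

  # The schema is declared up front; every row must fit it.
  conn.execute(
      "CREATE TABLE employees ("
      "id INTEGER PRIMARY KEY, name TEXT NOT NULL, dept TEXT NOT NULL)"
  )
  conn.executemany(
      "INSERT INTO employees (name, dept) VALUES (?, ?)",
      [("alice", "sales"), ("bob", "engineering")],
  )

  # A single row can be updated in place and looked up directly.
  conn.execute(
      "UPDATE employees SET dept = ? WHERE name = ?", ("marketing", "alice")
  )
  dept = conn.execute(
      "SELECT dept FROM employees WHERE name = ?", ("alice",)
  ).fetchone()[0]

  # Data that does not fit the schema is rejected.
  try:
      conn.execute(
          "INSERT INTO employees (name, dept) VALUES (?, ?)", ("carol", None)
      )
      rejected = False
  except sqlite3.IntegrityError:
      rejected = True
  ```

  The update and the single-row lookup are the kind of operation that plain files in HDFS are not designed for.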
- September 20, 2018 at 12:11 pm #4768
  
  DataFlair Team
  Spectator
  
  Hadoop is not a database or a relational store. It is mainly used for processing huge amounts of data across distributed servers. It stores files in HDFS (Hadoop Distributed File System), but that does not make it a relational database. Relational databases store information in tables defined by a specific schema. Hadoop can store unstructured, semi-structured and structured data, while traditional databases are built around structured data. Files in HDFS are write-once: data can be appended, but existing data cannot be updated or modified in place, as it can in a traditional DB.
  
  Components such as Hive work on top of HDFS and let users query the data stored there with an SQL-like language called HiveQL. Internally, Hive translates each query into MapReduce jobs to get the results (newer versions can also use other execution engines such as Tez).
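
  To give a rough idea of what "uses MapReduce internally" means, the sketch below runs the map, shuffle and reduce steps by hand in plain Python for a query like `SELECT dept, COUNT(*) FROM employees GROUP BY dept`. It is a conceptual illustration only, not Hive's actual execution plan, and the table and rows are hypothetical.

  ```python
  from collections import defaultdict

  # Hypothetical rows of an "employees" table, stored as flat text lines.
  rows = [
      "alice,sales",
      "bob,engineering",
      "carol,sales",
      "dave,engineering",
      "erin,sales",
  ]


  def map_phase(line):
      """Emit one (dept, 1) pair per input line."""
      _name, dept = line.split(",")
      yield dept, 1


  def shuffle(pairs):
      """Group all emitted values by key, as the framework does between phases."""
      groups = defaultdict(list)
      for key, value in pairs:
          groups[key].append(value)
      return groups


  def reduce_phase(key, values):
      """Sum the counts for one key."""
      return key, sum(values)


  mapped = [pair for line in rows for pair in map_phase(line)]
  result = dict(reduce_phase(k, vs) for k, vs in shuffle(mapped).items())
  print(result)
  ```

  On a real cluster the map and reduce steps run in parallel on many machines, each working on the blocks stored near it, which is why the whole data set is scanned even for a simple query.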