Is Hadoop a database?
September 20, 2018 at 12:11 pm #4765 DataFlair Team (Spectator)
Is Hadoop just another database like Oracle or DB2?
September 20, 2018 at 12:11 pm #4766 DataFlair Team (Spectator)
A database, in the traditional sense, stores structured data. Hadoop is not a database; it is a framework for distributed storage and processing that can hold data in any format, so it covers more ground than a database does. The Hadoop ecosystem provides components for:
Data Storage
Data Access
Data Serialization
Data Intelligence
Data Integration
Management, Monitoring – Orchestration
Interaction, Visualization, Execution, Development
Now, let's compare Hadoop's storage layer (HDFS) with a native database.
Hadoop’s HDFS:
Hadoop stores very large amounts of structured, semi-structured and unstructured data in HDFS as flat files spread across a cluster. HDFS stores data reliably: each file is broken into blocks, the blocks are distributed across the nodes of the cluster, and each block is replicated, meaning copies of it are kept on different machines. If a machine goes down or crashes, the data can still be read from the other machines. By default each block has 3 replicas, which makes HDFS highly fault tolerant. MapReduce is used to process the data in HDFS, but it does not return results quickly: it is a batch model that scans whole files, and HDFS is built for large sequential reads rather than low-latency random lookups of individual records.
Native Database:
A native database handles only structured data, and the data has to be processed to fit a predefined schema before it is loaded. As the volume of data grows, handling it this way becomes inefficient. Data is stored in tables and accessed using SQL.
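To make the HDFS side of the comparison concrete, here is a minimal Python sketch of block splitting and replication. It is a toy model, not HDFS code: the function names, the tiny block size, the node names and the round-robin placement are all assumptions made for illustration (real HDFS uses much larger blocks and a rack-aware placement policy).

```python
from typing import Dict, List

BLOCK_SIZE = 256   # toy value in bytes; real HDFS blocks are far larger
REPLICATION = 3    # the default replication factor mentioned above


def split_into_blocks(data: bytes, block_size: int) -> List[bytes]:
    """Cut a file's bytes into fixed-size blocks (the last one may be shorter)."""
    return [data[i:i + block_size] for i in range(0, len(data), block_size)]


def place_replicas(num_blocks: int, nodes: List[str],
                   replication: int = REPLICATION) -> Dict[int, List[str]]:
    """Assign each block to `replication` distinct nodes.

    Round-robin placement is a simplification chosen for this sketch.
    """
    if replication > len(nodes):
        raise ValueError("need at least as many nodes as replicas")
    return {
        block: [nodes[(block + r) % len(nodes)] for r in range(replication)]
        for block in range(num_blocks)
    }


def readable_after_failure(placement: Dict[int, List[str]], failed: str) -> bool:
    """A file stays readable if every block has a replica on a surviving node."""
    return all(any(node != failed for node in replicas)
               for replicas in placement.values())


data = b"x" * 1000                                  # hypothetical file contents
blocks = split_into_blocks(data, BLOCK_SIZE)
nodes = ["node1", "node2", "node3", "node4"]        # hypothetical cluster
placement = place_replicas(len(blocks), nodes)

for block_id, replicas in placement.items():
    print(block_id, replicas)
print("readable if node2 fails:", readable_after_failure(placement, "node2"))
```

Because every block lives on three different machines, losing any single node still leaves two copies of each block available.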
September 20, 2018 at 12:11 pm #4768 DataFlair Team (Spectator)
Hadoop is not a database or a relational store. It is mainly used for processing huge amounts of data across distributed servers. It stores files in HDFS (Hadoop Distributed File System), but that does not make it a relational database. Relational databases store information in tables defined by a specific schema. Hadoop can store unstructured, semi-structured and structured data, while traditional databases store only structured data. Files in HDFS are write-once: data can be appended, but existing data cannot be updated or modified in place, as it can in a traditional DB.
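The update difference can be illustrated with a short Python sketch. It uses the standard-library sqlite3 module as a stand-in for a traditional database, and a plain list of lines as a stand-in for a flat file in a write-once store. The table name, columns, sample rows and the `rewrite_with_change` helper are hypothetical, chosen only for this example.

```python
import sqlite3

# Traditional database: a row can be changed in place with UPDATE.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, city TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [(1, "Pune"), (2, "Indore")])
conn.execute("UPDATE users SET city = ? WHERE id = ?", ("Delhi", 2))
rows = conn.execute("SELECT id, city FROM users ORDER BY id").fetchall()
print(rows)


# Write-once file: changing one record means producing a whole new file.
def rewrite_with_change(lines, record_id, new_city):
    """Return the contents of a new file; the original lines are not modified."""
    out = []
    for line in lines:
        rid, _city = line.split(",")
        out.append(f"{rid},{new_city}" if int(rid) == record_id else line)
    return out


old_file = ["1,Pune", "2,Indore"]
new_file = rewrite_with_change(old_file, 2, "Delhi")
print(old_file, new_file)
```

The database touches one row, while the flat-file approach rewrites the full dataset to change a single record.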
Components such as Hive work on top of HDFS and let users query the data stored there with an SQL-like language called HiveQL. Internally, Hive translates each query into MapReduce jobs to produce the results.
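As a rough illustration of how an SQL-like query maps onto MapReduce, the Python sketch below evaluates the equivalent of `SELECT city, COUNT(*) FROM users GROUP BY city` in three phases. This is a conceptual model only, not Hive's actual implementation; the `users` table, its rows and the function names are assumptions for the example.

```python
from collections import defaultdict


def map_phase(rows):
    """Map: emit a (key, 1) pair for every input record."""
    for row in rows:
        yield row["city"], 1


def shuffle(pairs):
    """Shuffle: group all emitted values by key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: collapse each key's values into a single count."""
    return {key: sum(values) for key, values in groups.items()}


# Hypothetical records standing in for a file stored in HDFS.
rows = [
    {"id": 1, "city": "Pune"},
    {"id": 2, "city": "Indore"},
    {"id": 3, "city": "Pune"},
]
counts = reduce_phase(shuffle(map_phase(rows)))
print(counts)
```

Every record passes through the map step even for a small aggregate, which is one reason such queries behave like batch jobs rather than quick lookups.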