Forums › Apache Hadoop › What is the need of Flume?
This topic has 1 reply, 1 voice, and was last updated 5 years, 7 months ago by DataFlair Team.
September 20, 2018 at 12:06 pm · #4740 · DataFlair Team (Spectator)
Why was Flume needed?
September 20, 2018 at 12:06 pm · #4743 · DataFlair Team (Spectator)
Why Flume?
Let’s say a company runs many services across multiple servers, and those services produce large volumes of data (logs) that need to be analyzed together. To process those logs, we need a reliable, scalable, extensible, and manageable distributed data-collection service, one that can move flows of unstructured data (logs) from where they are generated to the location where they will be processed (say, HDFS). In simple words, Apache Flume is an open-source data-collection service for moving data from a source to a destination.
In other words, Apache Flume is a distributed, highly available service for efficiently collecting, aggregating, and moving large amounts of streaming data (logs) into the Hadoop Distributed File System (HDFS). It has a simple, flexible architecture based on streaming data flows. It is robust and fault-tolerant, with tunable reliability mechanisms and support for failover and recovery. In addition, it permits data collection in both batch and streaming modes.
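To make the source → channel → sink flow concrete, here is a minimal sketch of a Flume agent configuration that tails a log file and writes the events into HDFS. The agent name (`agent1`), the log path, and the NameNode address are hypothetical placeholders; adjust them for your own cluster.

```properties
# Name the components of this agent (agent name "agent1" is an assumption)
agent1.sources  = tailSrc
agent1.channels = memCh
agent1.sinks    = hdfsSink

# Source: tail a hypothetical application log via the exec source
agent1.sources.tailSrc.type     = exec
agent1.sources.tailSrc.command  = tail -F /var/log/app/app.log
agent1.sources.tailSrc.channels = memCh

# Channel: buffer events in memory between source and sink
agent1.channels.memCh.type     = memory
agent1.channels.memCh.capacity = 10000

# Sink: write the events into HDFS, bucketed by date
agent1.sinks.hdfsSink.type          = hdfs
agent1.sinks.hdfsSink.hdfs.path     = hdfs://namenode:8020/flume/logs/%Y-%m-%d
agent1.sinks.hdfsSink.hdfs.fileType = DataStream
agent1.sinks.hdfsSink.channel       = memCh
```

Saved as, say, `agent1.conf`, this can be started with `flume-ng agent --conf conf --conf-file agent1.conf --name agent1`. For production use, a durable channel (e.g. the file channel) is usually preferred over the memory channel, since in-memory events are lost if the agent crashes.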
To learn more about Apache Flume, follow the link: Apache Flume Tutorial – Flume Introduction, Features & Architecture