What is the need for Flume?

  • Author
    Posts
    • #4740
      DataFlair Team
      Spectator

      Why was Flume needed?

    • #4743
      DataFlair Team
      Spectator

      Why Flume?

      Consider a company that runs many services across multiple servers, each producing large volumes of data (logs) that must be analyzed together. To process those logs, we need a reliable, scalable, extensible, and manageable distributed data collection service that can move unstructured data (logs) from one location to another where it will be processed (say, HDFS). In simple words, Apache Flume is an open source data collection service for moving data from a source to a destination.

      In other words, Apache Flume is an available service for systematically collecting, aggregating, and moving large amounts of streaming data (logs) into the Hadoop Distributed File System (HDFS). It has a simple and flexible architecture based on streaming data flows. It is robust and fault-tolerant, with tunable reliability mechanisms and support for failover and recovery. In addition, it permits data collection in both batch and streaming mode.
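
      To make the idea of reliably moving logs from a source to a destination more concrete, here is a minimal Python sketch. It is a toy model, not Flume's actual API: the names Channel, peek, commit, and drain are hypothetical and chosen only for illustration. It assumes a bounded in-memory buffer sits between the log producer and the destination, and that events are removed from the buffer only after the destination accepts a batch, so a failed delivery loses nothing.

```python
from collections import deque
from itertools import islice
from typing import Callable, Deque, List


class Channel:
    """Hypothetical bounded buffer between a log source and a destination."""

    def __init__(self, capacity: int) -> None:
        self.capacity = capacity
        self._events: Deque[str] = deque()

    def put(self, event: str) -> bool:
        """Buffer one event; return False if the channel is full."""
        if len(self._events) >= self.capacity:
            return False
        self._events.append(event)
        return True

    def peek(self, n: int) -> List[str]:
        """Return up to n of the oldest events without removing them."""
        return list(islice(self._events, n))

    def commit(self, n: int) -> None:
        """Remove the n oldest events after a successful delivery."""
        for _ in range(n):
            self._events.popleft()

    def __len__(self) -> int:
        return len(self._events)


def drain(channel: Channel,
          deliver: Callable[[List[str]], None],
          batch_size: int) -> int:
    """Deliver buffered events in batches; return how many were delivered.

    If the destination raises OSError, the current batch stays in the
    channel so a later call can retry it.
    """
    delivered = 0
    while True:
        batch = channel.peek(batch_size)
        if not batch:
            break
        try:
            deliver(batch)
        except OSError:
            break  # destination unavailable: keep events for a retry
        channel.commit(len(batch))
        delivered += len(batch)
    return delivered


if __name__ == "__main__":
    channel = Channel(capacity=100)
    for i in range(5):
        channel.put(f"log line {i}")

    stored: List[List[str]] = []  # stands in for the destination, e.g. HDFS
    count = drain(channel, stored.append, batch_size=2)
    print(count, "events delivered in", len(stored), "batches")
```

      Setting batch_size to 1 makes the sketch behave like event-by-event streaming, while a larger value groups events into batches. The capacity and batch size play the role of the tunable knobs mentioned above.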

      To learn more about Apache Flume, see: Apache Flume Tutorial – Flume Introduction, Features & Architecture
