What is the difference between the memory channel and the file channel in Flume?

  • Author
    Posts
    • #4675
      DataFlair Team
      Spectator

      What is the difference between the memory channel and the file channel in Flume?
      Which is better for collecting web server logs?

    • #4676
      DataFlair Team
      Spectator

      If the Flume agent goes down, all the flows hosted on that agent are aborted. Once the agent is restarted, the flows resume.

      File channel
      A flow that uses the file channel resumes processing events where it left off, because the channel persists events to local disk. If the agent can't be restarted on the same hardware, the stored channel data can be migrated to another machine and a new Flume agent set up there to resume processing the saved events.
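
      As a rough illustration, a file channel could be declared in the agent's properties file as below. This is a minimal sketch, not a complete agent definition: the agent name a1, the channel name c1, and both directory paths are made up for the example, and sources and sinks are left out.

      ```properties
      # Hypothetical agent "a1" with a single channel "c1"
      a1.channels = c1

      # File channel: events are written to local disk and survive an agent restart
      a1.channels.c1.type = file

      # Example paths only; point these at directories on your own machine
      a1.channels.c1.checkpointDir = /var/flume/checkpoint
      a1.channels.c1.dataDirs = /var/flume/data
      ```

      The checkpoint and data directories hold the saved events, so these are the directories that would need to be moved if the agent is brought up on different hardware.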

      Memory channel
      With the memory channel this is not possible. Events are held only in memory, so they are not persisted and are available only while the agent process is running. Events that were in the channel when the agent went down cannot be recovered, but the memory channel is considerably faster than the other channels.
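
      For comparison, a memory channel declaration might look like the sketch below. The agent name a1, the channel name c1, and the two capacity values are assumptions chosen for illustration, not recommended settings.

      ```properties
      # Hypothetical agent "a1" with a single channel "c1"
      a1.channels = c1

      # Memory channel: events live in an in-memory queue and are lost if the agent stops
      a1.channels.c1.type = memory

      # Illustrative sizes: maximum events held in the channel,
      # and maximum events per transaction
      a1.channels.c1.capacity = 10000
      a1.channels.c1.transactionCapacity = 1000
      ```

      No directories are configured here because nothing is written to disk, which is where both the speed and the lack of recovery come from.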

      For web server logs, the memory channel is usually the better choice, because logging is continuous for as long as the web server is running. With the file channel, every event is written to local disk until a sink delivers it, so if the sink falls behind the log stream, disk usage keeps growing and can eventually become a bottleneck for the machine where Flume is running. Since server logs are generated continuously, persisting each event is generally not needed, provided that losing the events held in the channel during a crash is acceptable.
