What is the difference between memory channel and file channel in Flume?

This topic has 1 reply, 1 voice, and was last updated 5 years, 7 months ago by DataFlair Team.

Author

Posts
- September 20, 2018 at 11:39 am #4675
  
  DataFlair Team
  Spectator
  
  What is the difference between the memory channel and the file channel in Flume?
  Which is better for collecting web server logs?
- September 20, 2018 at 11:39 am #4676
  
  DataFlair Team
  Spectator
  
  If a Flume agent goes down, all flows hosted on that agent are aborted. Once the agent is restarted, the flows resume. What happens to the events that were sitting in the channel at that moment depends on the channel type.
  
  File channel
  A flow that uses the file channel resumes processing events where it left off, because the channel persists events to local disk. If the agent can't be restarted on the same hardware, the channel's on-disk data (its checkpoint and data directories) can be migrated to another machine, and a new Flume agent set up there can resume processing the saved events.
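  
  As a rough illustration, a file channel might be declared like this in the agent's properties file. The agent name `a1`, the channel name `c1`, and both directory paths are assumptions made for this sketch, not values from the original post; the capacity numbers are only examples.

  ```properties
  # Hypothetical agent "a1" with one channel "c1" (names are illustrative)
  a1.channels = c1

  # File channel: events are written to disk and survive an agent restart
  a1.channels.c1.type = file

  # Where the channel keeps its checkpoint and its event data.
  # These paths are assumptions; these are the directories to copy
  # if the agent has to be moved to other hardware.
  a1.channels.c1.checkpointDir = /var/flume/checkpoint
  a1.channels.c1.dataDirs = /var/flume/data

  # Maximum number of events held in the channel, and per transaction
  a1.channels.c1.capacity = 1000000
  a1.channels.c1.transactionCapacity = 10000
  ```

  Keeping `checkpointDir` and `dataDirs` on a dedicated disk is a common choice, since every event passing through the channel costs disk I/O.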
  
  Memory channel
  With the memory channel, this is not possible. Events are held only in memory and are never persisted, so they are available only while the agent process is running. If the agent dies, the events that were in the channel cannot be recovered. In exchange, the memory channel is considerably faster than the file channel because it avoids disk I/O.
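  
  For comparison, a memory channel declaration could look roughly like the following. Again, the names `a1` and `c1` and the capacity values are assumptions chosen for the sketch.

  ```properties
  # Hypothetical agent "a1" with one channel "c1" (names are illustrative)
  a1.channels = c1

  # Memory channel: events live in an in-memory queue only,
  # so anything still queued is lost if the agent process stops
  a1.channels.c1.type = memory

  # Maximum number of events held in the queue (example value)
  a1.channels.c1.capacity = 10000

  # Maximum number of events per source put / sink take transaction
  a1.channels.c1.transactionCapacity = 1000
  ```

  The `capacity` setting bounds how many events can be lost at most on a crash, and also how much heap the channel can consume, so it is worth sizing against the agent's JVM memory.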
  
  For web server logs, the memory channel is the better choice, because logging is continuous for as long as the web server is running. With the file channel, every event is written to local disk; disk usage can keep growing whenever the sink falls behind, and the extra disk I/O can eventually become a bottleneck for the machine where Flume is running. Because server logs are generated continuously, we usually don't need to persist individual events for this use case.
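  
  Putting the recommendation together, a minimal single-agent setup for web server logs might be sketched as below. The component names (`a1`, `r1`, `c1`, `k1`), the log path, and the choice of an exec source with a logger sink are all assumptions for illustration; a real deployment would typically write to HDFS or another store instead of the logger sink.

  ```properties
  # Hypothetical agent "a1": one source, one channel, one sink
  a1.sources = r1
  a1.channels = c1
  a1.sinks = k1

  # Source: follow the web server access log (path is an assumption)
  a1.sources.r1.type = exec
  a1.sources.r1.command = tail -F /var/log/httpd/access_log
  a1.sources.r1.channels = c1

  # Channel: in-memory buffer between source and sink
  a1.channels.c1.type = memory
  a1.channels.c1.capacity = 10000
  a1.channels.c1.transactionCapacity = 1000

  # Sink: print events to the agent log (stand-in for a real destination)
  a1.sinks.k1.type = logger
  a1.sinks.k1.channel = c1
  ```

  Assuming the file is saved as `weblog.conf`, the agent could be started with `flume-ng agent --conf conf --conf-file weblog.conf --name a1`. Switching to the file channel later only requires changing the `a1.channels.c1.*` lines; the source and sink wiring stays the same.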