Flume Event Serializers – Apache Flume

1. Objective

To convert a Flume event into another format for output, Flume uses an event serializer. This article introduces Flume Event Serializers, describes the serializer types that ship with Flume, and gives a configuration example for each.

2. Introduction to Flume Event Serializers

An event serializer converts a Flume event into another format for output. It is similar in function to the Layout class in log4j. The default is the text serializer, which outputs only the Flume event body. Another serializer, header_and_text, outputs both the headers and the body. Finally, the avro_event serializer creates an Avro representation of the event.
Both the file_roll sink and the HDFS sink support the EventSerializer interface. The EventSerializers that ship with Flume are described below.
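
To make the difference between the two text-based serializers concrete, here is a small Python sketch (not Flume code) that mimics what each one might write for a single event. The function names and the sample event are hypothetical, and the header formatting, modeled on Java's Map.toString(), is an assumption about header_and_text output rather than a documented guarantee.

```python
def serialize_text(body: bytes) -> bytes:
    """Mimic the `text` serializer: write the body only, ignore headers."""
    return body + b"\n"


def serialize_header_and_text(headers: dict, body: bytes) -> bytes:
    """Mimic `header_and_text`: headers first, then a space, then the body.

    Assumption: headers are rendered like Java's Map.toString(), i.e. {k=v, k2=v2}.
    """
    rendered = "{" + ", ".join(f"{k}={v}" for k, v in headers.items()) + "}"
    return rendered.encode("utf-8") + b" " + body + b"\n"


# Hypothetical sample event, for illustration only
headers = {"host": "web01", "timestamp": "1700000000000"}
body = b"GET /index.html 200"

print(serialize_text(body))
print(serialize_header_and_text(headers, body))
```

The first call drops the headers entirely, while the second keeps them in front of the body on the same line.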

3. Types of Flume Event Serializers

a. Body Text Serializer

Alias: text.
The Body Text Serializer writes the body of the event to an output stream without any transformation or modification. The event headers are ignored. Its configuration options are:

Property Name | Default | Description
appendNewline | true | Whether a newline will be appended to each event at write time. The default of true assumes that events do not contain newlines, for legacy reasons.

Example for an agent named a1:
a1.sinks = k1
a1.sinks.k1.type = file_roll
a1.sinks.k1.channel = c1
a1.sinks.k1.sink.directory = /var/log/flume
a1.sinks.k1.sink.serializer = text
a1.sinks.k1.sink.serializer.appendNewline = false
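
The effect of appendNewline can be sketched in a few lines of Python. This is an illustration of the behavior described in the table above, not Flume's implementation; the function name write_events and the sample bodies are hypothetical.

```python
def write_events(bodies, append_newline=True):
    """Concatenate event bodies the way a body-only serializer would write them."""
    out = bytearray()
    for body in bodies:
        out += body                # the body is written unchanged
        if append_newline:
            out += b"\n"           # one separator per event
    return bytes(out)


# Hypothetical event bodies, for illustration only
events = [b"first event", b"second event"]

print(write_events(events, append_newline=True))
print(write_events(events, append_newline=False))
```

With appendNewline set to false, consecutive bodies run together in the output file, so that setting suits events whose bodies already end with their own delimiter.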

b. “Flume Event” Avro Event Serializer

Alias: avro_event.
The “Flume Event” Avro Event Serializer serializes Flume events into an Avro container file. The schema used is the same schema used for Flume events in the Avro RPC mechanism.
This serializer inherits from the AbstractAvroEventSerializer class.
Its configuration options are:

Property Name | Default | Description
syncIntervalBytes | 2048000 | Avro sync interval, in approximate bytes.
compressionCodec | null | Avro compression codec. For supported codecs, see Avro’s CodecFactory docs.

Example for an agent named a1:
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = /flume/events/%y-%m-%d/%H%M/%S
a1.sinks.k1.serializer = avro_event
a1.sinks.k1.serializer.compressionCodec = snappy
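
As a rough picture of what an Avro representation of a Flume event looks like, the Python sketch below writes out a record schema with a headers map and a body field, and shapes one event to match it. The schema is an assumption based on the event structure described in this article (headers plus body) and should be checked against your Flume version; the helper to_avro_record and the sample event are hypothetical, and no Avro library is used.

```python
import json

# Assumed shape of the "Flume Event" Avro record: a string-to-string map of
# headers plus the raw body bytes. Verify against the Flume source you run.
FLUME_EVENT_SCHEMA = {
    "type": "record",
    "name": "Event",
    "fields": [
        {"name": "headers", "type": {"type": "map", "values": "string"}},
        {"name": "body", "type": "bytes"},
    ],
}


def to_avro_record(headers: dict, body: bytes) -> dict:
    """Shape a Flume-style event as a dict whose keys match the schema fields."""
    return {
        "headers": {str(k): str(v) for k, v in headers.items()},
        "body": body,
    }


# Hypothetical sample event, for illustration only
record = to_avro_record({"host": "web01"}, b"GET /index.html 200")

print(json.dumps(FLUME_EVENT_SCHEMA, indent=2))
print(record)
```

Unlike the text serializer, this representation keeps the headers alongside the body, so a reader of the container file can recover both.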

c. Avro Event Serializer

Alias: none.
This serializer has no alias, so it must be specified using the fully-qualified class name.
Like the “Flume Event” Avro Event Serializer, it serializes Flume events into an Avro container file, but the record schema is configurable. The record schema can be specified in two ways: as a Flume configuration property or in an event header.
To pass the record schema as part of the Flume configuration, use the schemaURL property listed below.
To pass the record schema in an event header, specify either the event header flume.avro.schema.literal, containing a JSON-format representation of the schema, or flume.avro.schema.url, with a URL where the schema may be found (hdfs:/… URIs are supported).
This serializer also inherits from the AbstractAvroEventSerializer class.
Its configuration options are:

Property Name | Default | Description
syncIntervalBytes | 2048000 | Avro sync interval, in approximate bytes.
compressionCodec | null | Avro compression codec. For supported codecs, see Avro’s CodecFactory docs.
schemaURL | null | Avro schema URL. Schemas specified in the header override this option.

Example for an agent named a1:
a1.sinks.k1.type = hdfs
a1.sinks.k1.channel = c1
a1.sinks.k1.hdfs.path = /flume/events/%y-%m-%d/%H%M/%S
a1.sinks.k1.serializer = org.apache.flume.sink.hdfs.AvroEventSerializer$Builder
a1.sinks.k1.serializer.compressionCodec = snappy
a1.sinks.k1.serializer.schemaURL = hdfs://namenode/path/to/schema.avsc
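
The rule that a schema given in an event header overrides the schemaURL property can be sketched as a small lookup function. This Python sketch is illustrative only: the function resolve_schema_source is hypothetical, the order in which the two headers are checked against each other is an assumption (the text above only states that headers win over the configuration property), and raising ValueError stands in for whatever error Flume reports when no schema is found.

```python
URL_HEADER = "flume.avro.schema.url"
LITERAL_HEADER = "flume.avro.schema.literal"


def resolve_schema_source(headers: dict, schema_url_property=None):
    """Return (source, value) describing where the record schema comes from."""
    # Assumption: the URL header is checked before the literal header.
    if headers.get(URL_HEADER):
        return ("header-url", headers[URL_HEADER])
    if headers.get(LITERAL_HEADER):
        return ("header-literal", headers[LITERAL_HEADER])
    # No schema header on the event: fall back to the configured schemaURL.
    if schema_url_property:
        return ("config-url", schema_url_property)
    raise ValueError("no schema found in event headers or configuration")


# Hypothetical inputs, for illustration only
configured = "hdfs://namenode/path/to/schema.avsc"
literal = '{"type": "record", "name": "Line", "fields": []}'

print(resolve_schema_source({}, configured))
print(resolve_schema_source({LITERAL_HEADER: literal}, configured))
```

In the second call the header wins even though schemaURL is configured, which matches the note in the options table.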

4. Conclusion

We have covered Flume Event Serializers and the types that ship with Flume: the Body Text Serializer, the “Flume Event” Avro Event Serializer, and the Avro Event Serializer. If you have any questions, feel free to ask in the comment section.