

{"id":9158,"date":"2018-02-23T10:27:01","date_gmt":"2018-02-23T10:27:01","guid":{"rendered":"https:\/\/data-flair.training\/blogs\/?p=9158"},"modified":"2021-05-09T13:08:26","modified_gmt":"2021-05-09T07:38:26","slug":"flume-interceptors","status":"publish","type":"post","link":"https:\/\/data-flair.training\/blogs\/flume-interceptors\/","title":{"rendered":"Apache Flume Interceptors | Types of Interceptors in Flume"},"content":{"rendered":"<p><span style=\"font-weight: 400\">In this Apache Flume tutorial, we discuss Apache Flume interceptors. Interceptors are the Flume components that can modify or drop events in-flight. In this blog, we will cover the whole concept of <strong>Apache Flume<\/strong> interceptors. <\/span><\/p>\n<p><span style=\"font-weight: 400\">We will look at the types of interceptors in Flume: Timestamp Interceptor, Host Interceptor, Static Interceptor, Remove Header Interceptor, UUID Interceptor, Morphline Interceptor, Search and Replace Interceptor, Regex Filtering Interceptor, and Regex Extractor Interceptor. <\/span><\/p>\n<p><span style=\"font-weight: 400\">We will also walk through example configurations that show how to add an interceptor in Flume.<\/span><\/p>\n<h2><span style=\"font-weight: 400\">What are Flume Interceptors<\/span><\/h2>\n<p><span style=\"font-weight: 400\">Apache Flume can modify or drop events in-flight, and it does so with the help of interceptors. <\/span><\/p>\n<p><span style=\"font-weight: 400\">An interceptor is a class that implements the org.apache.flume.interceptor.Interceptor interface. It can modify or even drop events based on any criteria chosen by the developer of the interceptor.<\/span><\/p>\n<p>In addition, Apache Flume supports chaining of interceptors. 
Chaining is made possible by specifying the list of interceptor builder class names in the configuration. In the source configuration, interceptors are specified as a whitespace-separated list.<\/p>\n<p>The order in which the interceptors are specified is the order in which they are invoked. Interceptors are named components. Let\u2019s see an example of how to create Flume interceptors through configuration:<\/p>\n<p><span style=\"font-weight: 400\">For example,<\/span><br \/>\n<b>a1.sources = r1<\/b><br \/>\n<b>a1.sinks = k1<\/b><br \/>\n<b>a1.channels = c1<\/b><br \/>\n<b>a1.sources.r1.interceptors = i1 i2<\/b><br \/>\n<b>a1.sources.r1.interceptors.i1.type = org.apache.flume.interceptor.HostInterceptor$Builder<\/b><br \/>\n<b>a1.sources.r1.interceptors.i1.preserveExisting = false<\/b><br \/>\n<b>a1.sources.r1.interceptors.i1.hostHeader = hostname<\/b><br \/>\n<b>a1.sources.r1.interceptors.i2.type = org.apache.flume.interceptor.TimestampInterceptor$Builder<\/b><br \/>\n<b>a1.sinks.k1.filePrefix = FlumeData.%{hostname}.%Y-%m-%d<\/b><br \/>\n<b>a1.sinks.k1.channel = c1<\/b><\/p>\n<p><span style=\"font-weight: 400\">Note that the interceptor builders are passed to the type config parameter. In this example, events pass through the host interceptor (i1) first and then through the timestamp interceptor (i2).<\/span><\/p>\n<h2>Types of Interceptors in Flume<\/h2>\n<p>There are 9 types of Flume interceptors. Let&#8217;s discuss them one by one:<\/p>\n<div id=\"attachment_9167\" style=\"width: 1210px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Types-of-Flume-Interceptors-01.jpg\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-9167\" class=\"wp-image-9167 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Types-of-Flume-Interceptors-01.jpg\" alt=\"Flume Interceptors\" width=\"1200\" height=\"628\" 
srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Types-of-Flume-Interceptors-01.jpg 1200w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Types-of-Flume-Interceptors-01-150x79.jpg 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Types-of-Flume-Interceptors-01-300x157.jpg 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Types-of-Flume-Interceptors-01-768x402.jpg 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Types-of-Flume-Interceptors-01-1024x536.jpg 1024w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/a><p id=\"caption-attachment-9167\" class=\"wp-caption-text\">Types of Flume Interceptors<\/p><\/div>\n<h3><span style=\"font-weight: 400\">i. Timestamp: Apache Flume Interceptors<\/span><\/h3>\n<p><span style=\"font-weight: 400\">The Timestamp interceptor inserts into the event headers the time, in milliseconds, at which it processes the event. It inserts a header with the key timestamp (or the key specified by the header property) whose value is the relevant timestamp.<\/span><\/p>\n<p><span style=\"font-weight: 400\"> If a timestamp header is already present in the event, this interceptor can be configured to preserve the existing value. 
The table below lists the properties of the Timestamp interceptor.<\/span><br \/>\nTable.1 &#8211; Apache Flume Interceptor<\/p>\n<table>\n<tbody>\n<tr>\n<td><strong>Property Name<\/strong><\/td>\n<td><strong>Default<\/strong><\/td>\n<td><strong>Description<\/strong><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">type<\/span><\/td>\n<td><\/td>\n<td><span style=\"font-weight: 400\">The component type name has to be timestamp or the FQCN<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">header<\/span><\/td>\n<td><span style=\"font-weight: 400\">timestamp<\/span><\/td>\n<td><span style=\"font-weight: 400\">The name of the header in which to place the generated timestamp.<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">preserveExisting<\/span><\/td>\n<td><span style=\"font-weight: 400\">false<\/span><\/td>\n<td><span style=\"font-weight: 400\">If the timestamp already exists, should it be preserved &#8211; true or false<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>So, let\u2019s see an example for agent a1:<br \/>\n<b>a1.sources = r1<\/b><br \/>\n<b>a1.channels = c1<\/b><br \/>\n<b>a1.sources.r1.channels = c1<\/b><br \/>\n<b>a1.sources.r1.type = seq<\/b><br \/>\n<b>a1.sources.r1.interceptors = i1<\/b><br \/>\n<b>a1.sources.r1.interceptors.i1.type = timestamp<\/b><\/p>\n<h3><span style=\"font-weight: 400\">ii. Host<\/span><span style=\"font-weight: 400\">: Apache Flume Interceptors<\/span><\/h3>\n<p><span style=\"font-weight: 400\">The Host interceptor inserts the hostname or IP address of the host that this agent is running on. Depending on the configuration, it inserts a header with the key host, or a configured key, whose value is the hostname or IP address of the host. 
<\/span><\/p>\n<p><span style=\"font-weight: 400\">The table below lists the properties of the Host interceptor.<\/span><br \/>\nTable.2 &#8211; Apache Flume Interceptor<\/p>\n<table>\n<tbody>\n<tr>\n<td><strong>Property Name<\/strong><\/td>\n<td><strong>Default<\/strong><\/td>\n<td><strong>Description<\/strong><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">type<\/span><\/td>\n<td><\/td>\n<td><span style=\"font-weight: 400\">The component type name has to be host<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">preserveExisting<\/span><\/td>\n<td><span style=\"font-weight: 400\">false<\/span><\/td>\n<td><span style=\"font-weight: 400\">If the host header already exists, should it be preserved &#8211; true or false<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">useIP<\/span><\/td>\n<td><span style=\"font-weight: 400\">true<\/span><\/td>\n<td><span style=\"font-weight: 400\">Use the IP Address if true, else use hostname.<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">hostHeader<\/span><\/td>\n<td><span style=\"font-weight: 400\">host<\/span><\/td>\n<td><span style=\"font-weight: 400\">The header key to be used.<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><span style=\"font-weight: 400\">Here is an example for agent a1:<\/span><br \/>\n<b>a1.sources = r1<\/b><br \/>\n<b>a1.channels = c1<\/b><br \/>\n<b>a1.sources.r1.interceptors = i1<\/b><br \/>\n<b>a1.sources.r1.interceptors.i1.type = host<\/b><\/p>\n<h3><span style=\"font-weight: 400\">iii. 
Static<\/span><span style=\"font-weight: 400\">: Apache Flume Interceptors<\/span><\/h3>\n<p><span style=\"font-weight: 400\">The Static interceptor appends a static header with a static value to all events.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400\">The table below lists the properties of the Static interceptor.<\/span><br \/>\nTable.3 &#8211; Apache Flume Interceptor<\/p>\n<table>\n<tbody>\n<tr>\n<td><strong>Property Name<\/strong><\/td>\n<td><strong>Default<\/strong><\/td>\n<td><strong>Description<\/strong><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">type<\/span><\/td>\n<td><\/td>\n<td><span style=\"font-weight: 400\">The component type name has to be static<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">preserveExisting<\/span><\/td>\n<td><span style=\"font-weight: 400\">true<\/span><\/td>\n<td><span style=\"font-weight: 400\">If configured header already exists, should it be preserved &#8211; true or false<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">key<\/span><\/td>\n<td><span style=\"font-weight: 400\">key<\/span><\/td>\n<td><span style=\"font-weight: 400\">Name of the header that should be created<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">value<\/span><\/td>\n<td><span style=\"font-weight: 400\">value<\/span><\/td>\n<td><span style=\"font-weight: 400\">Static value that should be created<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>So, let\u2019s see an example for agent a1:<br \/>\n<b>a1.sources = r1<\/b><br \/>\n<b>a1.channels = c1<\/b><br \/>\n<b>a1.sources.r1.channels = c1<\/b><br \/>\n<b>a1.sources.r1.type = seq<\/b><br \/>\n<b>a1.sources.r1.interceptors = i1<\/b><br \/>\n<b>a1.sources.r1.interceptors.i1.type = static<\/b><br \/>\n<b>a1.sources.r1.interceptors.i1.key = datacenter<\/b><br \/>\n<b>a1.sources.r1.interceptors.i1.value = NEW_YORK<\/b><br 
\/>\n<b><\/b><\/p>\n<p><b>Note: <\/b><span style=\"font-weight: 400\">The current implementation does not allow specifying multiple headers at one time. Instead, you can chain multiple static interceptors, each defining one static header.<\/span><\/p>\n<h3><span style=\"font-weight: 400\">iv. Remove Header<\/span><span style=\"font-weight: 400\">: Apache Flume Interceptors<\/span><\/h3>\n<p><span style=\"font-weight: 400\">This interceptor manipulates <strong>Flume event<\/strong> headers by removing one or many headers. It can remove a statically defined header, headers based on a regular expression, or headers in a list. <\/span><\/p>\n<p><span style=\"font-weight: 400\">If none of these is defined, or if no header matches the criteria, the Flume events are not modified.\u00a0<\/span><span style=\"font-weight: 400\">The table below lists the properties of the Remove Header interceptor.<\/span><\/p>\n<p>Table.4 &#8211; Apache Flume Interceptor<\/p>\n<table>\n<tbody>\n<tr>\n<td><strong>Property Name<\/strong><\/td>\n<td><strong>Default<\/strong><\/td>\n<td><strong>Description<\/strong><\/td>\n<\/tr>\n<tr>\n<td>type<\/td>\n<td><\/td>\n<td>The component type name has to be remove_header<\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">withName<\/span><\/td>\n<td><\/td>\n<td><span style=\"font-weight: 400\">Name of the header to remove<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">fromList<\/span><\/td>\n<td><\/td>\n<td><span style=\"font-weight: 400\">List of headers to remove, separated with the separator specified by fromListSeparator<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">fromListSeparator<\/span><\/td>\n<td><span style=\"font-weight: 400\">\\s*,\\s*<\/span><\/td>\n<td><span style=\"font-weight: 400\">Regular expression used to separate multiple header names in the list specified by fromList. 
Default is a comma surrounded by any number of whitespace characters<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">matching<\/span><\/td>\n<td><\/td>\n<td><span style=\"font-weight: 400\">All the headers whose names match this regular expression are removed<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><b>Note:<\/b><span style=\"font-weight: 400\"> If only one header needs to be removed, specifying it by name provides performance benefits over the other two methods.<\/span><\/p>\n<h3><span style=\"font-weight: 400\">v. UUID<\/span><span style=\"font-weight: 400\">:\u00a0Apache Flume Interceptors<\/span><\/h3>\n<p><span style=\"font-weight: 400\">The UUID interceptor sets a universally unique identifier on all events that are intercepted. An example UUID is b5755073-77a9-43c1-8fad-b7a586fc1b97, which represents a 128-bit value.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Consider using UUIDInterceptor to automatically assign a UUID to an event if no application-level unique key for the event is available. It can be important to assign UUIDs to events as soon as they enter the Flume network; that is, in the first <strong>Flume Source<\/strong> of the flow. <\/span><\/p>\n<p><span style=\"font-weight: 400\">This enables subsequent deduplication of events in the face of replication and redelivery in a Flume network that is designed for high availability and high performance. If an application-level key is available, however, it is preferable to an auto-generated UUID. 
<\/span><\/p>\n<p><span style=\"font-weight: 400\">An application-level key enables subsequent updates and deletes of the event in data stores using that well-known key.\u00a0<\/span><span style=\"font-weight: 400\">The table below lists the properties of the UUID interceptor.<\/span><\/p>\n<p>Table.5 &#8211; Apache Flume Interceptor<\/p>\n<table>\n<tbody>\n<tr>\n<td><strong>Property Name<\/strong><\/td>\n<td><strong>Default<\/strong><\/td>\n<td><strong>Description<\/strong><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">type<\/span><\/td>\n<td><\/td>\n<td><span style=\"font-weight: 400\">The component type name has to be <\/span><b>org.apache.flume.sink.solr.morphline.UUIDInterceptor$Builder<\/b><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">headerName<\/span><\/td>\n<td><span style=\"font-weight: 400\">id<\/span><\/td>\n<td><span style=\"font-weight: 400\">The name of the Flume header to modify<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">preserveExisting<\/span><\/td>\n<td><span style=\"font-weight: 400\">true<\/span><\/td>\n<td><span style=\"font-weight: 400\">If the UUID header already exists, should it be preserved &#8211; true or false<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">prefix<\/span><\/td>\n<td><\/td>\n<td><span style=\"font-weight: 400\">The prefix string constant to prepend to each generated UUID<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h3><span style=\"font-weight: 400\">vi. Morphline<\/span><span style=\"font-weight: 400\">: Apache Flume Interceptors<\/span><\/h3>\n<p><span style=\"font-weight: 400\">The Morphline interceptor filters events through a morphline configuration file. 
The morphline configuration file defines a chain of transformation commands that pipe records from one command to another.<\/span><\/p>\n<p><span style=\"font-weight: 400\">For example, the morphline can ignore certain events, or alter or insert certain event headers, via regular-expression-based pattern matching. It can also auto-detect and set a MIME type via Apache Tika on events that are intercepted.<\/span><\/p>\n<p><span style=\"font-weight: 400\"> This kind of packet sniffing can be used for content-based dynamic routing in a Flume topology. MorphlineInterceptor can also help implement dynamic routing to multiple Apache Solr collections, for example for multi-tenancy.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400\">The table below lists the properties of the Morphline interceptor.<\/span><br \/>\nTable.6 &#8211; Apache Flume Interceptor<\/p>\n<table>\n<tbody>\n<tr>\n<td><strong>Property Name<\/strong><\/td>\n<td><strong>Default<\/strong><\/td>\n<td><strong>Description<\/strong><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">type<\/span><\/td>\n<td><\/td>\n<td><span style=\"font-weight: 400\">The component type name has to be org.apache.flume.sink.solr.morphline.MorphlineInterceptor$Builder<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">morphlineFile<\/span><\/td>\n<td><\/td>\n<td><span style=\"font-weight: 400\">The relative or absolute path on the local file system to the morphline configuration file. 
Example: \/etc\/flume-ng\/conf\/morphline.conf<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">morphlineId<\/span><\/td>\n<td><span style=\"font-weight: 400\">null<\/span><\/td>\n<td><span style=\"font-weight: 400\">Optional name used to identify a morphline if there are multiple morphlines in a morphline config file<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p>Sample flume.conf file:<br \/>\n<b><\/b><\/p>\n<p><b>a1.sources.avroSrc.interceptors = morphlineinterceptor<\/b><br \/>\n<b>a1.sources.avroSrc.interceptors.morphlineinterceptor.type = org.apache.flume.sink.solr.morphline.MorphlineInterceptor$Builder<\/b><br \/>\n<b>a1.sources.avroSrc.interceptors.morphlineinterceptor.morphlineFile = \/etc\/flume-ng\/conf\/morphline.conf<\/b><br \/>\n<b>a1.sources.avroSrc.interceptors.morphlineinterceptor.morphlineId = morphline1<\/b><br \/>\n<b><\/b><\/p>\n<p><b>Note:<\/b><span style=\"font-weight: 400\"> Currently, there is a restriction that the morphline of an interceptor must not generate more than one output record for each input event. <\/span><\/p>\n<p><span style=\"font-weight: 400\">This interceptor is not intended for heavy-duty ETL processing. If you need that, consider moving ETL processing from the Flume Source to a <strong>Flume Sink<\/strong>, such as a MorphlineSolrSink.<\/span><\/p>\n<h3><span style=\"font-weight: 400\">vii. Search and Replace<\/span><span style=\"font-weight: 400\">: Apache Flume Interceptors<\/span><\/h3>\n<p><span style=\"font-weight: 400\">This interceptor provides simple string-based search-and-replace functionality based on <strong>Java regular expressions<\/strong>. Backtracking and group capture are also available. <\/span><\/p>\n<p><span style=\"font-weight: 400\">It uses the same rules as the Java Matcher.replaceAll() method. 
The table below lists the properties of the Search and Replace interceptor.<\/span><br \/>\nTable.7 &#8211; Apache Flume Interceptor<\/p>\n<table>\n<tbody>\n<tr>\n<td><strong>Property Name<\/strong><\/td>\n<td><strong>Default<\/strong><\/td>\n<td><strong>Description<\/strong><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">type<\/span><\/td>\n<td><\/td>\n<td><span style=\"font-weight: 400\">The component type name has to be search_replace<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">searchPattern<\/span><\/td>\n<td><\/td>\n<td><span style=\"font-weight: 400\">The pattern to search for and replace.<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">replaceString<\/span><\/td>\n<td><\/td>\n<td><span style=\"font-weight: 400\">The replacement string.<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">charset<\/span><\/td>\n<td><span style=\"font-weight: 400\">UTF-8<\/span><\/td>\n<td><span style=\"font-weight: 400\">The charset of the event body. 
Assumed by default to be UTF-8.<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<p><span style=\"font-weight: 400\">So, let\u2019s look at an example configuration:<\/span><br \/>\n<b><\/b><\/p>\n<p><b>a1.sources.avroSrc.interceptors = search-replace<\/b><br \/>\n<b>a1.sources.avroSrc.interceptors.search-replace.type = search_replace<\/b><br \/>\n<b># Remove leading alphanumeric characters in an event body.<\/b><br \/>\n<b>a1.sources.avroSrc.interceptors.search-replace.searchPattern = ^[A-Za-z0-9_]+<\/b><br \/>\n<b>a1.sources.avroSrc.interceptors.search-replace.replaceString =<\/b><br \/>\n<b><\/b><\/p>\n<p><b>Another example:<\/b><br \/>\n<b><\/b><\/p>\n<p><b>a1.sources.avroSrc.interceptors = search-replace<\/b><br \/>\n<b>a1.sources.avroSrc.interceptors.search-replace.type = search_replace<\/b><br \/>\n<b># Use grouping operators to reorder and munge words on a line.<\/b><br \/>\n<b>a1.sources.avroSrc.interceptors.search-replace.searchPattern = The quick brown ([a-z]+) jumped over the lazy ([a-z]+)<\/b><br \/>\n<b>a1.sources.avroSrc.interceptors.search-replace.replaceString = The hungry $2 ate the careless $1<\/b><\/p>\n<h3><span style=\"font-weight: 400\">viii. Regex Filtering<\/span><span style=\"font-weight: 400\">: Apache Flume Interceptors<\/span><\/h3>\n<p><span style=\"font-weight: 400\">The Regex Filtering interceptor filters events selectively by interpreting the event body as text and matching the text against a configured regular expression. <\/span><\/p>\n<p><span style=\"font-weight: 400\">The supplied regular expression can be used to either include events or exclude events. 
The following table lists the properties of the\u00a0Regex Filtering interceptor.<\/span><br \/>\nTable.8 &#8211; Apache Flume Interceptor<\/p>\n<table>\n<tbody>\n<tr>\n<td><strong>Property Name<\/strong><\/td>\n<td><strong>Default<\/strong><\/td>\n<td><strong>Description<\/strong><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">type<\/span><\/td>\n<td><\/td>\n<td><span style=\"font-weight: 400\">The component type name has to be regex_filter<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">regex<\/span><\/td>\n<td><span style=\"font-weight: 400\">\".*\"<\/span><\/td>\n<td><span style=\"font-weight: 400\">Regular expression for matching against events<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">excludeEvents<\/span><\/td>\n<td><span style=\"font-weight: 400\">false<\/span><\/td>\n<td><span style=\"font-weight: 400\">If true, regex determines events to exclude, otherwise, regex determines events to include.<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h3><span style=\"font-weight: 400\">ix. Flume Regex Extractor<\/span><span style=\"font-weight: 400\">: Apache Flume Interceptors<\/span><\/h3>\n<p><span style=\"font-weight: 400\">The Regex Extractor interceptor extracts regex match groups using a specified regular expression and appends the match groups as headers on the event. <\/span><\/p>\n<p><span style=\"font-weight: 400\">It also supports pluggable serializers for formatting the match groups before adding them as event headers. 
The table below lists the properties of the\u00a0<\/span><span style=\"font-weight: 400\">Regex Extractor interceptor.<\/span><br \/>\nTable.9 &#8211; Apache Flume Interceptor<\/p>\n<table>\n<tbody>\n<tr>\n<td><strong>Property Name<\/strong><\/td>\n<td><strong>Default<\/strong><\/td>\n<td><strong>Description<\/strong><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">type<\/span><\/td>\n<td><\/td>\n<td><span style=\"font-weight: 400\">The component type name has to be regex_extractor<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">regex<\/span><\/td>\n<td><\/td>\n<td><span style=\"font-weight: 400\">Regular expression for matching against events<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">serializers<\/span><\/td>\n<td><\/td>\n<td><span style=\"font-weight: 400\">Space-separated list of serializers for mapping matches to header names and serializing their values. Flume provides built-in support for the following serializers: org.apache.flume.interceptor.RegexExtractorInterceptorPassThroughSerializer and org.apache.flume.interceptor.RegexExtractorInterceptorMillisSerializer<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">serializers.&lt;s1&gt;.type<\/span><\/td>\n<td><span style=\"font-weight: 400\">default<\/span><\/td>\n<td><span style=\"font-weight: 400\">Must be default (org.apache.flume.interceptor.RegexExtractorInterceptorPassThroughSerializer), org.apache.flume.interceptor.RegexExtractorInterceptorMillisSerializer, or the FQCN of a custom class that implements org.apache.flume.interceptor.RegexExtractorInterceptorSerializer<\/span><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">serializers.&lt;s1&gt;.name<\/span><\/td>\n<td><\/td>\n<td><\/td>\n<\/tr>\n<tr>\n<td><span style=\"font-weight: 400\">serializers.*<\/span><\/td>\n<td><\/td>\n<td><span style=\"font-weight: 400\">Serializer-specific properties<\/span><\/td>\n<\/tr>\n<\/tbody>\n<\/table>\n<h2><span style=\"font-weight: 400\">Apache 
Flume Interceptors &#8211; <\/span>Conclusion<\/h2>\n<p><span style=\"font-weight: 400\">In this Apache Flume tutorial, we have studied the whole concept of Flume interceptors. We have also seen the types of Flume interceptors: Timestamp Interceptor, Host Interceptor, Static Interceptor, Remove Header Interceptor, UUID Interceptor, Morphline Interceptor, Search and Replace Interceptor, Regex Filtering Interceptor, and Regex Extractor Interceptor. <\/span><\/p>\n<p><span style=\"font-weight: 400\">Moreover, we have seen example configurations for Apache Flume interceptors. Furthermore, if you have any doubts, please ask in the comment section.<\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this Apache Flume Tutorial, we talk about Apache Flume interceptors. Interceptors in Flume are those who have the capability to modify\/drop events in-flight. So, in this blog, we will learn the whole concept&#46;&#46;&#46;<\/p>\n","protected":false},"author":6,"featured_media":9227,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[21],"tags":[4825,5825,5826,6872,8870,11466,11467,11511,12668,13781,13782,14737,14738,15318,15715],"class_list":["post-9158","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-flume","tag-flume-interceptors","tag-host-flume-interceptors","tag-host-interceptor","tag-interceptors-in-apache-flume","tag-morphline-interceptor","tag-regex-extractor-interceptor","tag-regex-filtering-interceptor","tag-remove-header-interceptor","tag-search-and-replace-interceptor","tag-static-interceptor","tag-static-interceptors","tag-timestamp-flume-interceptors","tag-timestamp-interceptors","tag-uuid-interceptor","tag-what-is-flume-interceptors"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.8 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Apache Flume 
Interceptors | Types of Interceptors in Flume - DataFlair<\/title>\n<meta name=\"description\" content=\"Apache Flume interceptors-Types of Interceptors in flume,Timestamp interceptors,Static interceptors,UUID,Morphline, Regex Flume interceptors,Flume Kafka\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/data-flair.training\/blogs\/flume-interceptors\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Apache Flume Interceptors | Types of Interceptors in Flume - DataFlair\" \/>\n<meta property=\"og:description\" content=\"Apache Flume interceptors-Types of Interceptors in flume,Timestamp interceptors,Static interceptors,UUID,Morphline, Regex Flume interceptors,Flume Kafka\" \/>\n<meta property=\"og:url\" content=\"https:\/\/data-flair.training\/blogs\/flume-interceptors\/\" \/>\n<meta property=\"og:site_name\" content=\"DataFlair\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/DataFlairWS\/\" \/>\n<meta property=\"article:published_time\" content=\"2018-02-23T10:27:01+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2021-05-09T07:38:26+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Flume-Interceptors-01-1.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"628\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"DataFlair Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:site\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" 
content=\"DataFlair Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"9 minutes\" \/>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Apache Flume Interceptors | Types of Interceptors in Flume - DataFlair","description":"Apache Flume interceptors-Types of Interceptors in flume,Timestamp interceptors,Static interceptors,UUID,Morphline, Regex Flume interceptors,Flume Kafka","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/data-flair.training\/blogs\/flume-interceptors\/","og_locale":"en_US","og_type":"article","og_title":"Apache Flume Interceptors | Types of Interceptors in Flume - DataFlair","og_description":"Apache Flume interceptors-Types of Interceptors in flume,Timestamp interceptors,Static interceptors,UUID,Morphline, Regex Flume interceptors,Flume Kafka","og_url":"https:\/\/data-flair.training\/blogs\/flume-interceptors\/","og_site_name":"DataFlair","article_publisher":"https:\/\/www.facebook.com\/DataFlairWS\/","article_published_time":"2018-02-23T10:27:01+00:00","article_modified_time":"2021-05-09T07:38:26+00:00","og_image":[{"width":1200,"height":628,"url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Flume-Interceptors-01-1.jpg","type":"image\/jpeg"}],"author":"DataFlair Team","twitter_card":"summary_large_image","twitter_creator":"@DataFlairWS","twitter_site":"@DataFlairWS","twitter_misc":{"Written by":"DataFlair Team","Est. 
reading time":"9 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/data-flair.training\/blogs\/flume-interceptors\/#article","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/flume-interceptors\/"},"author":{"name":"DataFlair Team","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/2c58ecb4f73a39f0ef993f1ddfcd7b89"},"headline":"Apache Flume Interceptors | Types of Interceptors in Flume","datePublished":"2018-02-23T10:27:01+00:00","dateModified":"2021-05-09T07:38:26+00:00","mainEntityOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/flume-interceptors\/"},"wordCount":2118,"commentCount":0,"publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/flume-interceptors\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Flume-Interceptors-01-1.jpg","keywords":["Flume Interceptors","Host Flume Interceptors","Host Interceptor","Interceptors in apache Flume","Morphline Interceptor","Regex Extractor Interceptor","Regex Filtering Interceptor","Remove Header Interceptor","Search and Replace Interceptor","Static Interceptor","Static Interceptors","Timestamp Flume Interceptors","Timestamp Interceptors","UUID Interceptor","What is Flume Interceptors"],"articleSection":["Flume Tutorials"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/data-flair.training\/blogs\/flume-interceptors\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/data-flair.training\/blogs\/flume-interceptors\/","url":"https:\/\/data-flair.training\/blogs\/flume-interceptors\/","name":"Apache Flume Interceptors | Types of Interceptors in Flume - 
DataFlair","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/#website"},"primaryImageOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/flume-interceptors\/#primaryimage"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/flume-interceptors\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Flume-Interceptors-01-1.jpg","datePublished":"2018-02-23T10:27:01+00:00","dateModified":"2021-05-09T07:38:26+00:00","description":"Apache Flume interceptors-Types of Interceptors in flume,Timestamp interceptors,Static interceptors,UUID,Morphline, Regex Flume interceptors,Flume Kafka","breadcrumb":{"@id":"https:\/\/data-flair.training\/blogs\/flume-interceptors\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/data-flair.training\/blogs\/flume-interceptors\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/flume-interceptors\/#primaryimage","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Flume-Interceptors-01-1.jpg","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Flume-Interceptors-01-1.jpg","width":1200,"height":628,"caption":"What is Apache Flume interceptors"},{"@type":"BreadcrumbList","@id":"https:\/\/data-flair.training\/blogs\/flume-interceptors\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Blog Home","item":"https:\/\/data-flair.training\/blogs\/"},{"@type":"ListItem","position":2,"name":"actions","item":"https:\/\/data-flair.training\/blogs\/tag\/actions\/"},{"@type":"ListItem","position":3,"name":"Apache Flume Interceptors | Types of Interceptors in Flume"}]},{"@type":"WebSite","@id":"https:\/\/data-flair.training\/blogs\/#website","url":"https:\/\/data-flair.training\/blogs\/","name":"DataFlair","description":"Learn Today. 
Lead Tomorrow.","publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/data-flair.training\/blogs\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/data-flair.training\/blogs\/#organization","name":"DataFlair","url":"https:\/\/data-flair.training\/blogs\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","width":106,"height":48,"caption":"DataFlair"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DataFlairWS\/","https:\/\/x.com\/DataFlairWS","https:\/\/www.linkedin.com\/company\/dataflair-web-services-pvt-ltd\/","https:\/\/www.youtube.com\/user\/DataFlairWS"]},{"@type":"Person","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/2c58ecb4f73a39f0ef993f1ddfcd7b89","name":"DataFlair Team","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/1ce4a0e3e542444fc73bbebf83e89e8b73e2d95ccb1fcee64da9945f078b97c5?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/1ce4a0e3e542444fc73bbebf83e89e8b73e2d95ccb1fcee64da9945f078b97c5?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/1ce4a0e3e542444fc73bbebf83e89e8b73e2d95ccb1fcee64da9945f078b97c5?s=96&d=mm&r=g","caption":"DataFlair Team"},"description":"The DataFlair Team provides industry-driven content on programming, Java, Python, C++, DSA, AI, ML, data Science, Android, Flutter, MERN, Web Development, and technology. 
Our expert educators focus on delivering value-packed, easy-to-follow resources for tech enthusiasts and professionals.","url":"https:\/\/data-flair.training\/blogs\/author\/dfteam2\/"}]}},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/9158","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/users\/6"}],"replies":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/comments?post=9158"}],"version-history":[{"count":3,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/9158\/revisions"}],"predecessor-version":[{"id":92669,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/9158\/revisions\/92669"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media\/9227"}],"wp:attachment":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media?parent=9158"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/categories?post=9158"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/tags?post=9158"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}