

{"id":15008,"date":"2018-05-27T06:05:55","date_gmt":"2018-05-27T00:35:55","guid":{"rendered":"https:\/\/data-flair.training\/blogs\/?p=15008"},"modified":"2022-01-27T21:08:17","modified_gmt":"2022-01-27T15:38:17","slug":"kafka-connect","status":"publish","type":"post","link":"https:\/\/data-flair.training\/blogs\/kafka-connect\/","title":{"rendered":"Apache Kafka Connect &#8211; A Complete Guide"},"content":{"rendered":"<p>Today, we are going to discuss Apache Kafka Connect. This article covers the types of Kafka connectors, and the features and limitations of Kafka Connect.<\/p>\n<p>Moreover, we will learn the need for Kafka Connect and its configuration. Along with this, we will discuss its different modes and its REST API.<\/p>\n<p><span style=\"font-weight: 400\">To import data from external systems into <strong>Apache Kafka<\/strong> topics, and to export data from Kafka topics into external systems, the Apache Kafka project provides another component: Kafka Connect. <\/span><\/p>\n<p><span style=\"font-weight: 400\">However, there is much more to learn about Kafka Connect.<\/span><\/p>\n<p>So, let&#8217;s start with Kafka Connect.<\/p>\n<h3><span style=\"font-weight: 400\">What is Kafka Connect?<\/span><\/h3>\n<p><span style=\"font-weight: 400\">We use Apache Kafka Connect for streaming data between Apache Kafka and other systems, scalably and reliably. Moreover, Connect makes it simple to quickly define Kafka connectors that move large collections of data into and out of Kafka.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Kafka Connect can ingest entire databases or collect metrics from application servers into Kafka topics. 
It can make available data with low latency for Stream processing.<\/span><\/p>\n<div id=\"attachment_15010\" style=\"width: 1210px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Kafka-Connect.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-15010\" class=\"wp-image-15010 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Kafka-Connect.png\" alt=\"Kafka Connect\" width=\"1200\" height=\"628\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Kafka-Connect.png 1200w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Kafka-Connect-150x79.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Kafka-Connect-300x157.png 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Kafka-Connect-768x402.png 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Kafka-Connect-1024x536.png 1024w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/a><p id=\"caption-attachment-15010\" class=\"wp-caption-text\">Working &#8211; Apache Kafka Connect<\/p><\/div>\n<h3>Kafka Connect Features<\/h3>\n<p><span style=\"font-weight: 400\">There are following features of Kafka Connect:<\/span><\/p>\n<div id=\"attachment_15600\" style=\"width: 1210px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Kafka-Connect-Features-01.jpg\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-15600\" class=\"wp-image-15600 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Kafka-Connect-Features-01.jpg\" alt=\"Kafka Connect\" width=\"1200\" height=\"628\" 
srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Kafka-Connect-Features-01.jpg 1200w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Kafka-Connect-Features-01-150x79.jpg 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Kafka-Connect-Features-01-300x157.jpg 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Kafka-Connect-Features-01-768x402.jpg 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Kafka-Connect-Features-01-1024x536.jpg 1024w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/a><p id=\"caption-attachment-15600\" class=\"wp-caption-text\">Kafka Connect &#8211; Features<\/p><\/div>\n<p><strong>a.\u00a0A common framework for Kafka connectors<\/strong><\/p>\n<p><span style=\"font-weight: 400\">Kafka Connect standardizes the integration of other data systems with Kafka. It also simplifies connector development, deployment, and management.<\/span><\/p>\n<p><strong>b.\u00a0Distributed and standalone modes<\/strong><\/p>\n<p><span style=\"font-weight: 400\">Kafka Connect can scale up to a large, centrally managed service supporting an entire organization, or scale down to development, testing, and small production deployments.<\/span><\/p>\n<p><strong>c.\u00a0REST interface<\/strong><\/p>\n<p><span style=\"font-weight: 400\">We can submit connectors to our Kafka Connect cluster and manage them through an easy-to-use REST API.<\/span><\/p>\n<p><strong>d.\u00a0Automatic offset management<\/strong><\/p>\n<p><span style=\"font-weight: 400\">With just a little information from connectors, Kafka Connect can manage the offset commit process automatically. 
Hence, connector developers do not need to worry about this error-prone part of connector development.<\/span><\/p>\n<p><strong>e.\u00a0Distributed and scalable by default<\/strong><\/p>\n<p><span style=\"font-weight: 400\">Kafka Connect builds upon Kafka&#8217;s existing group management protocol. To scale up a Kafka Connect cluster, we can add more workers.<\/span><\/p>\n<p><strong>f.\u00a0Streaming\/batch integration<\/strong><\/p>\n<p><span style=\"font-weight: 400\">Kafka Connect is an ideal solution for bridging streaming and batch data systems.<\/span><\/p>\n<h3><span style=\"font-weight: 400\">Why Kafka Connect?<\/span><\/h3>\n<p><span style=\"font-weight: 400\">Many tools, such as <\/span><strong>Flume<\/strong>, are capable of writing to Kafka, reading from Kafka, or importing and exporting data. So the question arises: why do we need Kafka Connect? Here are its primary advantages:<\/p>\n<div id=\"attachment_15598\" style=\"width: 1210px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Why-Kafka-Connect-01-1.jpg\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-15598\" class=\"wp-image-15598 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Why-Kafka-Connect-01-1.jpg\" alt=\"Kafka Connect\" width=\"1200\" height=\"628\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Why-Kafka-Connect-01-1.jpg 1200w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Why-Kafka-Connect-01-1-150x79.jpg 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Why-Kafka-Connect-01-1-300x157.jpg 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Why-Kafka-Connect-01-1-768x402.jpg 768w, 
https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Why-Kafka-Connect-01-1-1024x536.jpg 1024w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/a><p id=\"caption-attachment-15598\" class=\"wp-caption-text\">Why Kafka Connect- Need for Kafka<\/p><\/div>\n<h4>a.\u00a0Auto-recovery After Failure<\/h4>\n<p><span style=\"font-weight: 400\">A \u201csource\u201d connector can attach arbitrary \u201csource location\u201d information to each record it passes to Kafka Connect. <\/span><\/p>\n<p><span style=\"font-weight: 400\">After a failure, Kafka Connect automatically provides this information back to the connector, so it can resume where it left off. Auto-recovery for \u201csink\u201d connectors is even easier.<\/span><\/p>\n<h4>b.\u00a0Auto-failover<\/h4>\n<p><span style=\"font-weight: 400\">Auto-failover is possible because the Kafka Connect nodes form a Kafka Connect cluster. If one node fails, the work it was doing is redistributed to the other nodes.<\/span><\/p>\n<h4>c.\u00a0Simple Parallelism<\/h4>\n<p><span style=\"font-weight: 400\">A connector can define data import or export tasks that execute in parallel.<\/span><\/p>\n<h3><span style=\"font-weight: 400\">Kafka Connect Concepts<\/span><\/h3>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">A <strong>Kafka Connect<\/strong>\u00a0<\/span><b>worker<\/b><span style=\"font-weight: 400\"> is an operating-system process (<strong>Java<\/strong>-based) that executes connectors and their associated tasks in child threads.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">A <\/span><b>connector<\/b><span style=\"font-weight: 400\"> is an object that defines parameters for one or more tasks, which do the actual work of importing or exporting data.<\/span><\/li>\n<li style=\"font-weight: 400\"><span 
style=\"font-weight: 400\">A <\/span><b>source<\/b> <b>connector<\/b><span style=\"font-weight: 400\"> generates tasks that read from some arbitrary input and write to Kafka.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">A <\/span><b>sink<\/b><span style=\"font-weight: 400\"> connector generates tasks that read from Kafka and write to some arbitrary output.<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Kafka Connect is not\u00a0intended for significant data transformation. That said, recent versions of Kafka Connect allow the configuration parameters for a connector to define basic data transformations. <\/span><\/p>\n<p><span style=\"font-weight: 400\">For \u201csource\u201d connectors, this function\u00a0assumes that the tasks transform their input into AVRO or JSON format; the transformation is applied just before writing the record to a Kafka topic. <\/span><\/p>\n<p><span style=\"font-weight: 400\">For \u201csink\u201d connectors, this function assumes that data on the input Kafka topic is already in AVRO or JSON format. <\/span><\/p>\n<h3><span style=\"font-weight: 400\">Dependencies of Kafka Connect<\/span><\/h3>\n<p><span style=\"font-weight: 400\">Kafka Connect nodes require a connection to a Kafka message-broker cluster, whether run in standalone or distributed mode.<\/span><\/p>\n<p><span style=\"font-weight: 400\">For distributed mode, there are no other dependencies. Even the connector configuration settings are stored in a Kafka message topic, so Kafka Connect nodes are completely stateless. 
This makes Kafka Connect nodes well suited to running in containers.<\/span><\/p>\n<p><span style=\"font-weight: 400\">For standalone mode, however, a small amount of local disk storage is needed to store the \u201ccurrent location\u201d and the connector configuration.<\/span><\/p>\n<h3><span style=\"font-weight: 400\">Distributed Mode<\/span><\/h3>\n<p><span style=\"font-weight: 400\">A Kafka Connect worker instance (i.e. a Java process) is started with a Kafka broker address, the names of several Kafka topics for \u201cinternal use\u201d, and a \u201cgroup id\u201d parameter.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400\">Each worker instance coordinates with the other worker instances belonging to the same group-id via the \u201cinternal use\u201d Kafka topics. Everything is done via the Kafka message broker; no other external coordination mechanism is needed (no Zookeeper, etc.).<\/span><\/p>\n<p><span style=\"font-weight: 400\">The workers negotiate between themselves (via the topics) on how to distribute the set of connectors and tasks across the available set of workers. <\/span><\/p>\n<p><span style=\"font-weight: 400\">If a worker process dies, the cluster is rebalanced to distribute the work fairly over the remaining workers. <\/span><span style=\"font-weight: 400\">If a new worker starts, a rebalance ensures it takes over some work from the existing workers.<\/span><\/p>\n<h3><span style=\"font-weight: 400\">Standalone Mode<\/span><\/h3>\n<p><span style=\"font-weight: 400\">Standalone mode is simply distributed mode with a single worker instance that uses no internal topics within the Kafka message <strong>broker<\/strong>. This process runs all specified connectors, and their generated tasks, itself (as threads).<\/span><\/p>\n<p><span style=\"font-weight: 400\">Because standalone mode stores current source offsets in a local file, it does not use Kafka Connect \u201cinternal topics\u201d for storage. 
In standalone mode,\u00a0information about the connectors to execute is provided as a command line option.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Running a connector in this mode can be valid for production systems; it is how\u00a0most ETL-style workloads have traditionally been executed.<\/span><\/p>\n<p>Failover is then\u00a0managed\u00a0in the traditional way &#8211; e.g. by scripts starting an alternate instance.<\/p>\n<p><strong>a.\u00a0Launching a Worker<\/strong><br \/>\n<span style=\"font-weight: 400\">A worker instance is simply a Java process, usually launched via a provided shell script. The worker instance then loads from its CLASSPATH whichever custom connectors are specified by the connector configuration.<\/span><\/p>\n<p><span style=\"font-weight: 400\">For standalone mode, the configuration is provided on the command line;\u00a0for distributed mode, it is read from a Kafka topic.<\/span><\/p>\n<p><span style=\"font-weight: 400\">There is also a standard Docker container image for launching a Kafka Connect worker. Any number of instances of this image can be launched, and they will automatically federate together as long as they are configured with the same Kafka message broker cluster and group-id.<\/span><\/p>\n<h3><span style=\"font-weight: 400\">REST API<\/span><\/h3>\n<p><span style=\"font-weight: 400\">Each worker instance starts an embedded web server, through\u00a0which it exposes a REST API for status queries and configuration. <\/span><\/p>\n<p><span style=\"font-weight: 400\">For workers in distributed mode, configuration uploaded via this REST API is saved in internal Kafka message broker topics. 
For workers in standalone mode, however, the configuration REST APIs are not relevant.<\/span><\/p>\n<p><span style=\"font-weight: 400\">The Confluent Control Center provides much of its Kafka-connect-management UI by wrapping the worker REST API.<\/span><\/p>\n<p><span style=\"font-weight: 400\">Kafka Connect daemons could potentially be monitored by a tool such as Nagios making REST calls to periodically obtain system status.<\/span><\/p>\n<h3><span style=\"font-weight: 400\">Kafka Connector Types<\/span><\/h3>\n<p><span style=\"font-weight: 400\">A connector is created by implementing a specific Java interface. A set of existing connectors is available, and we can also write custom ones.<\/span><\/p>\n<p><span style=\"font-weight: 400\">The Kafka Connect worker simply expects the implementation of any connector and task classes it executes to be present in its classpath. This code is loaded directly into the application, without the benefit of child classloaders, an OSGi framework, or similar.\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400\">Several connectors are available in the \u201cConfluent Open Source Edition\u201d download package:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">JDBC<\/span><\/li>\n<li style=\"font-weight: 400\"><strong>HDFS<\/strong><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">S3<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">Elasticsearch<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">These connectors cannot be downloaded individually, but because they are open source we can extract them from\u00a0Confluent Open Source and copy them into a standard Kafka installation.\u00a0<\/span><\/p>\n<h3><span style=\"font-weight: 400\">Configuring Kafka Connect<\/span><\/h3>\n<p><span style=\"font-weight: 400\">Generally, with a command line option pointing to a 
config-file containing options for the worker instance, each worker instance starts. These options include, for example, the Kafka message broker details and the group-id.<\/span><\/p>\n<p><span style=\"font-weight: 400\">In standalone mode, a worker is also given a command line option pointing to a config-file defining the connectors to be executed. In distributed mode, each worker instead retrieves connector\/task configuration from a Kafka topic (specified in the worker config file).\u00a0<\/span><\/p>\n<p><span style=\"font-weight: 400\">In standalone mode, a worker process also provides a REST API for status-checks etc.<\/span><br \/>\n<span style=\"font-weight: 400\">The REST API can also be used to pause and resume connectors.<\/span><\/p>\n<p><span style=\"font-weight: 400\">It is important to note that the configuration options \u201ckey.converter\u201d and \u201cvalue.converter\u201d are not connector-specific; they are worker-specific.<\/span><\/p>\n<h3><span style=\"font-weight: 400\">Connections from Kafka Connect Workers to Kafka Brokers<\/span><\/h3>\n<p><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/image.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-49590\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/image.png\" alt=\"Connections from Kafka Connect Workers to Kafka Brokers\" width=\"793\" height=\"460\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/image.png 793w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/image-150x87.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/image-300x174.png 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/image-768x445.png 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/image-520x302.png 520w\" sizes=\"auto, 
(max-width: 793px) 100vw, 793px\" \/><\/a><\/p>\n<p><span style=\"font-weight: 400\">In distributed mode, each worker establishes a connection to the Kafka message broker cluster for administrative purposes. The settings for this connection are the \u201ctop level\u201d settings in the worker configuration file.<\/span><\/p>\n<p><span style=\"font-weight: 400\">In addition, a separate connection (set of sockets) to the Kafka message broker cluster is established for each connector. <\/span><\/p>\n<p><span style=\"font-weight: 400\">Many of the settings are inherited from the \u201ctop level\u201d Kafka settings, but they can be overridden with the config prefix \u201cconsumer.\u201d (used by sinks) or \u201cproducer.\u201d (used by sources) in order to use different Kafka message broker network settings for connections carrying production data vs connections carrying admin messages.\u00a0<\/span><\/p>\n<h3><span style=\"font-weight: 400\">The Standard JDBC Source Connector<\/span><\/h3>\n<p><span style=\"font-weight: 400\">The connector hub site lists a JDBC source connector, and this connector is part of the Confluent Open Source download. 
<\/span><\/p>\n<p><span style=\"font-weight: 400\">Note that it cannot be downloaded separately, so users who have installed the \u201cpure\u201d Kafka bundle from Apache instead of the Confluent bundle must extract this connector from the Confluent bundle and copy it over.<\/span><\/p>\n<p><span style=\"font-weight: 400\">It has various configuration options:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">A database to scan, specified as a JDBC URL.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">A poll interval.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">A regular expression specifying which tables to watch; each table gets a separate Kafka topic.\u00a0<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">An <strong>SQL<\/strong> column which has an \u201cincrementing id\u201d, in which case the connector can detect new records (select where id &gt; last-known-id).<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">An SQL column with an updated-timestamp, in which case the connector can detect new\/modified records (select where timestamp &gt; last-known-timestamp).<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Strangely, although the connector is apparently designed with the ability to copy multiple tables, the \u201cincrementing id\u201d and \u201ctimestamp\u201d column-names are global &#8211; i.e. when multiple tables are being copied, they must all follow the same naming convention for these columns.<\/span><\/p>\n<h3><span style=\"font-weight: 400\">Kafka Connect Security<\/span><\/h3>\n<p><span style=\"font-weight: 400\">Kafka Connect (v0.10.1.0) works fine with Kerberos-secured Kafka message brokers. 
It also works fine with SSL-encrypted connections to these brokers.<\/span><\/p>\n<p><span style=\"font-weight: 400\">However, in this version it is not possible to protect the REST API which Kafka Connect nodes expose via either Kerberos or SSL, though there is a feature request for this. <\/span><\/p>\n<p><span style=\"font-weight: 400\">Hence, when configuring a secure cluster, it is essential to configure an external proxy (e.g. Apache HTTP) to act as a secure gateway to the REST services.<\/span><\/p>\n<h3><span style=\"font-weight: 400\">Limitations of Kafka Connect<\/span><\/h3>\n<p><span style=\"font-weight: 400\">Kafka Connect also has some limitations:<\/span><\/p>\n<ul>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">At the current time, the selection of connectors is very limited.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">The separation of commercial and open-source features is very poor. <\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">It lacks configuration tools.<\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">The approach to deploying custom connectors (plugins) is poor\/primitive. <\/span><\/li>\n<li style=\"font-weight: 400\"><span style=\"font-weight: 400\">It is very much Java\/Scala centric.<\/span><\/li>\n<\/ul>\n<p><span style=\"font-weight: 400\">Hence, it currently feels more like a \u201cbag of tools\u201d than a packaged solution &#8211; at least without purchasing commercial tools.<\/span><\/p>\n<p>So, this was all about Apache Kafka Connect. Hope you like our explanation.<\/p>\n<h3><span style=\"font-weight: 400\">Summary<\/span><\/h3>\n<p>Hence, we have seen the whole concept of <strong>Kafka<\/strong> Connect. Also, we have learned the benefits of Kafka Connect. 
However, if any doubt occurs, feel free to ask in the comment section.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Today, we are going to discuss Apache Kafka Connect. This Kafka Connect article carries information about types of Kafka Connector, features and limitations of Kafka Connect. Moreover, we will learn the need for Kafka&#46;&#46;&#46;<\/p>\n","protected":false},"author":5,"featured_media":15593,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[9],"tags":[832,2881,4609,7866,9027,16159],"class_list":["post-15008","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-kafka","tag-apache-kafka-connect","tag-configuring-kafka-connect","tag-features-of-kafka-connect","tag-kafka-connect-limitations","tag-need-for-kafka-connect","tag-why-kafka-connect"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Apache Kafka Connect - A Complete Guide - DataFlair<\/title>\n<meta name=\"description\" content=\"Kafka Connect,Features-limitations &amp; need of Kafka Connect,Rest API,Configuring Kafka Connect,JDBC,standalone mode,distributed mode,kafka connect connectors\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/data-flair.training\/blogs\/kafka-connect\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Apache Kafka Connect - A Complete Guide - DataFlair\" \/>\n<meta property=\"og:description\" content=\"Kafka Connect,Features-limitations &amp; need of Kafka Connect,Rest API,Configuring Kafka Connect,JDBC,standalone mode,distributed mode,kafka connect connectors\" \/>\n<meta property=\"og:url\" 
content=\"https:\/\/data-flair.training\/blogs\/kafka-connect\/\" \/>\n<meta property=\"og:site_name\" content=\"DataFlair\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/DataFlairWS\/\" \/>\n<meta property=\"article:published_time\" content=\"2018-05-27T00:35:55+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2022-01-27T15:38:17+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Apache-kafka-Connect-01-1.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"628\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"DataFlair Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:site\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"DataFlair Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"10 minutes\" \/>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Apache Kafka Connect - A Complete Guide - DataFlair","description":"Kafka Connect,Features-limitations & need of Kafka Connect,Rest API,Configuring Kafka Connect,JDBC,standalone mode,distributed mode,kafka connect connectors","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/data-flair.training\/blogs\/kafka-connect\/","og_locale":"en_US","og_type":"article","og_title":"Apache Kafka Connect - A Complete Guide - DataFlair","og_description":"Kafka Connect,Features-limitations & need of Kafka Connect,Rest API,Configuring Kafka Connect,JDBC,standalone mode,distributed mode,kafka connect connectors","og_url":"https:\/\/data-flair.training\/blogs\/kafka-connect\/","og_site_name":"DataFlair","article_publisher":"https:\/\/www.facebook.com\/DataFlairWS\/","article_published_time":"2018-05-27T00:35:55+00:00","article_modified_time":"2022-01-27T15:38:17+00:00","og_image":[{"width":1200,"height":628,"url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Apache-kafka-Connect-01-1.jpg","type":"image\/jpeg"}],"author":"DataFlair Team","twitter_card":"summary_large_image","twitter_creator":"@DataFlairWS","twitter_site":"@DataFlairWS","twitter_misc":{"Written by":"DataFlair Team","Est. 
reading time":"10 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/data-flair.training\/blogs\/kafka-connect\/#article","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/kafka-connect\/"},"author":{"name":"DataFlair Team","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/7f83c342f5d1632d6f7b4b0b0f447823"},"headline":"Apache Kafka Connect &#8211; A Complete Guide","datePublished":"2018-05-27T00:35:55+00:00","dateModified":"2022-01-27T15:38:17+00:00","mainEntityOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/kafka-connect\/"},"wordCount":1997,"commentCount":2,"publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/kafka-connect\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Apache-kafka-Connect-01-1.jpg","keywords":["Apache Kafka Connect","Configuring kafka Connect","features of kafka connect","Kafka Connect limitations","Need for Kafka Connect","Why Kafka Connect"],"articleSection":["Apache Kafka Tutorials"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/data-flair.training\/blogs\/kafka-connect\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/data-flair.training\/blogs\/kafka-connect\/","url":"https:\/\/data-flair.training\/blogs\/kafka-connect\/","name":"Apache Kafka Connect - A Complete Guide - DataFlair","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/#website"},"primaryImageOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/kafka-connect\/#primaryimage"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/kafka-connect\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Apache-kafka-Connect-01-1.jpg","datePublished":"2018-05-27T00:35:55+00:00","dateModified":"2022-01-27T15:38:17+00:00","description":"Kafka 
Connect,Features-limitations & need of Kafka Connect,Rest API,Configuring Kafka Connect,JDBC,standalone mode,distributed mode,kafka connect connectors","breadcrumb":{"@id":"https:\/\/data-flair.training\/blogs\/kafka-connect\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/data-flair.training\/blogs\/kafka-connect\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/kafka-connect\/#primaryimage","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Apache-kafka-Connect-01-1.jpg","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/05\/Apache-kafka-Connect-01-1.jpg","width":1200,"height":628,"caption":"Apache Kafka Connect"},{"@type":"BreadcrumbList","@id":"https:\/\/data-flair.training\/blogs\/kafka-connect\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Blog Home","item":"https:\/\/data-flair.training\/blogs\/"},{"@type":"ListItem","position":2,"name":"Apache Kafka Tutorials","item":"https:\/\/data-flair.training\/blogs\/category\/kafka\/"},{"@type":"ListItem","position":3,"name":"Apache Kafka Connect &#8211; A Complete Guide"}]},{"@type":"WebSite","@id":"https:\/\/data-flair.training\/blogs\/#website","url":"https:\/\/data-flair.training\/blogs\/","name":"DataFlair","description":"Learn Today. 
Lead Tomorrow.","publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/data-flair.training\/blogs\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/data-flair.training\/blogs\/#organization","name":"DataFlair","url":"https:\/\/data-flair.training\/blogs\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","width":106,"height":48,"caption":"DataFlair"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DataFlairWS\/","https:\/\/x.com\/DataFlairWS","https:\/\/www.linkedin.com\/company\/dataflair-web-services-pvt-ltd\/","https:\/\/www.youtube.com\/user\/DataFlairWS"]},{"@type":"Person","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/7f83c342f5d1632d6f7b4b0b0f447823","name":"DataFlair Team","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/4cf3a74600d131330b8c481d519afd1574093ed89f6d3396a95393ad223eb7cd?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/4cf3a74600d131330b8c481d519afd1574093ed89f6d3396a95393ad223eb7cd?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/4cf3a74600d131330b8c481d519afd1574093ed89f6d3396a95393ad223eb7cd?s=96&d=mm&r=g","caption":"DataFlair Team"},"description":"DataFlair Team creates expert-level guides on programming, Java, Python, C++, DSA, AI, ML, data Science, Android, Flutter, MERN, Web Development, and technology. 
Our goal is to empower learners with easy-to-understand content. Explore our resources for career growth and practical learning.","url":"https:\/\/data-flair.training\/blogs\/author\/dfteam1\/"}]}},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/15008","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/comments?post=15008"}],"version-history":[{"count":2,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/15008\/revisions"}],"predecessor-version":[{"id":107328,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/15008\/revisions\/107328"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media\/15593"}],"wp:attachment":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media?parent=15008"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/categories?post=15008"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/tags?post=15008"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}