

{"id":55450,"date":"2019-05-01T17:35:00","date_gmt":"2019-05-01T12:05:00","guid":{"rendered":"https:\/\/data-flair.training\/blogs\/?p=55450"},"modified":"2025-08-03T21:20:23","modified_gmt":"2025-08-03T15:50:23","slug":"k-means-clustering-tutorial","status":"publish","type":"post","link":"https:\/\/data-flair.training\/blogs\/k-means-clustering-tutorial\/","title":{"rendered":"Data Science K-means Clustering &#8211; In-depth Tutorial with Example"},"content":{"rendered":"<p>One of the <a href=\"https:\/\/data-flair.training\/blogs\/machine-learning-algorithms\/\"><strong>most popular Machine Learning algorithms<\/strong><\/a> is K-means clustering. It is an unsupervised learning algorithm, meaning that it is used for unlabeled datasets. 
Imagine that you have several points spread over an n-dimensional space.<\/p>\n<p>To categorize these points on the basis of their similarity, you can use the K-means clustering algorithm. In this article, we will go through this algorithm in detail. Then, we will discuss the basic Python libraries that can be used to implement it.<\/p>\n<p>The K-means clustering algorithm is an unsupervised technique that groups data points according to their similarity. The patterns it finds in the data are expressed as k clusters.<\/p>\n<p>Each cluster is a set of data points aggregated on the basis of their similarity. Let&#8217;s start this K-means clustering tutorial with a brief introduction to clustering.<\/p>\n<h3>What is Clustering?<\/h3>\n<p>Imagine that you have a mix of chocolates and liquorice candies and are asked to separate the two. Intuitively, you can separate them based on their appearance.<\/p>\n<p>The process of segregating objects into groups based on their characteristics is called clustering. Within a cluster, the features of each object are similar to those of the other objects in the same group.<\/p>\n<p>Clustering is used in various fields such as image recognition, pattern analysis, medical informatics, genomics, and data compression. It belongs to the unsupervised learning branch of machine learning.<\/p>\n<p>This is because the data points are not labelled and there is no explicit mapping from inputs to outputs. Clustering therefore relies only on the patterns present in the data itself.<\/p>\n<h3>What is K-means Clustering?<\/h3>\n<p>According to the formal <em>definition of K-means clustering<\/em> &#8211; <em>K-means clustering is an iterative algorithm that partitions a group of data containing n values into k subgroups. Each of the n values belongs to the cluster with the nearest mean.<\/em><\/p>\n<p>This means that given a group of objects, we partition that group into several sub-groups. 
These sub-groups are formed on the basis of similarity, measured by the distance of each data point from the centroid (the mean) of its sub-group.<\/p>\n<p>K-means clustering is one of the most popular unsupervised learning algorithms. It is easy to understand and implement.<\/p>\n<p>The objective of K-means clustering is to minimize the sum of squared <a href=\"https:\/\/en.wikipedia.org\/wiki\/Euclidean_distance\">Euclidean distances<\/a> between each point and the centroid of its cluster. This quantity is known as the <strong>intra-cluster variance<\/strong> and is expressed by the following squared error function &#8211;<\/p>\n<p><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Squared-Error-Function-01-1.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-55517 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Squared-Error-Function-01-1.jpg\" alt=\"Squared Error Function\" width=\"642\" height=\"336\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Squared-Error-Function-01-1.jpg 642w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Squared-Error-Function-01-1-150x79.jpg 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Squared-Error-Function-01-1-300x157.jpg 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Squared-Error-Function-01-1-520x272.jpg 520w\" sizes=\"auto, (max-width: 642px) 100vw, 642px\" \/><\/a>Here J is the objective function, that is, the total squared error. K is the number of clusters and n is the number of cases. C is the centroid of a cluster and j is the index of that cluster.<\/p>\n<p>X is the given data point whose Euclidean distance to the centroid is measured. Let us have a look at the algorithm for K-means clustering &#8211;<\/p>\n<p>1. 
First, we randomly <em>initialize and select<\/em> k points. These k points are the initial means (centroids).<\/p>\n<p>2. We use the <em>Euclidean distance to assign each data point<\/em> to the cluster whose centre is closest to it.<\/p>\n<p>3. Then we <em>calculate the mean of all the points<\/em> in each cluster, which gives the new centroid of that cluster.<\/p>\n<p>4. We iteratively <em>repeat steps 2 and 3<\/em> until the cluster assignments stop changing.<\/p>\n<p>K-Means is a non-hierarchical clustering method.<\/p>\n<h3>K-Means in Action<\/h3>\n<p>In this section, we will apply K-means to random data using Python libraries.<\/p>\n<ul>\n<li>First, we import the <a href=\"https:\/\/data-flair.training\/blogs\/python-libraries\/\"><strong>essential<\/strong> <strong>Python Libraries<\/strong><\/a> required for implementing our k-means algorithm &#8211;<\/li>\n<\/ul>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">import numpy as np\r\nimport pandas as pd\r\nimport matplotlib.pyplot as plt\r\nfrom sklearn.cluster import KMeans<\/pre>\n<ul>\n<li>We then randomly generate 200 data points split into two clusters of 100 data points each.<\/li>\n<\/ul>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\"># 200 points with both coordinates in (-2, 0]\r\nx = -2 * np.random.rand(200, 2)\r\n# replace the last 100 points with points whose coordinates lie in [1, 3)\r\nx0 = 1 + 2 * np.random.rand(100, 2)\r\nx[100:200, :] = x0<\/pre>\n<ul>\n<li>We proceed to plot our generated random values and obtain the following graph.<\/li>\n<\/ul>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">plt.scatter(x[:, 0], x[:, 1], s=25, color='r')\r\nplt.grid()<\/pre>\n<p>&nbsp;<\/p>\n<p><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Importing-Python-Libraries.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-55737\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Importing-Python-Libraries.png\" alt=\"k-means in action\" width=\"1366\" height=\"768\" 
srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Importing-Python-Libraries.png 1366w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Importing-Python-Libraries-150x84.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Importing-Python-Libraries-300x169.png 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Importing-Python-Libraries-768x432.png 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Importing-Python-Libraries-1024x576.png 1024w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Importing-Python-Libraries-520x292.png 520w\" sizes=\"auto, (max-width: 1366px) 100vw, 1366px\" \/><\/a><\/p>\n<p><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Cluster-Partitioning.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-55738\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Cluster-Partitioning.png\" alt=\"K-means clustering \" width=\"374\" height=\"252\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Cluster-Partitioning.png 374w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Cluster-Partitioning-150x101.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Cluster-Partitioning-300x202.png 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Cluster-Partitioning-272x182.png 272w\" sizes=\"auto, (max-width: 374px) 100vw, 374px\" \/><\/a><\/p>\n<p>From the above graph, we observe that the 200 data points form two clusters of 100 data points each.<\/p>\n<ul>\n<li>After plotting our two clusters, we proceed to run the k-means algorithm to establish the 
centroids for our clusters. We set k, the number of clusters, to 2, since the data was generated as two groups.<\/li>\n<\/ul>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">Kmean = KMeans(n_clusters=2)\r\nKmean.fit(x)<\/pre>\n<ul>\n<li>After this, we find the location of the centroids of our two clusters by evaluating the following line of code &#8211;<\/li>\n<\/ul>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">Kmean.cluster_centers_<\/pre>\n<ul>\n<li>We then proceed to visualize the centroids of our two clusters:<\/li>\n<\/ul>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\"># plot the data points, then mark each fitted centroid in green\r\nplt.scatter(x[:, 0], x[:, 1], s=25, color='r')\r\nplt.scatter(Kmean.cluster_centers_[:, 0], Kmean.cluster_centers_[:, 1], s=100, color='green')\r\nplt.show()<\/pre>\n<p>We obtain the following output &#8211;<\/p>\n<p><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Establishing-Centroids.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-55739\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Establishing-Centroids.png\" alt=\"Establishing Centroids of Clusters\" width=\"1366\" height=\"619\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Establishing-Centroids.png 1366w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Establishing-Centroids-150x68.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Establishing-Centroids-300x136.png 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Establishing-Centroids-768x348.png 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Establishing-Centroids-1024x464.png 1024w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Establishing-Centroids-520x236.png 520w\" sizes=\"auto, (max-width: 1366px) 100vw, 1366px\" \/><\/a><\/p>\n<p>Now, we 
obtain the following graph &#8211;<\/p>\n<p><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Centroids-for-two-clusters.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-55740\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Centroids-for-two-clusters.png\" alt=\"Centroids for two clusters\" width=\"374\" height=\"252\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Centroids-for-two-clusters.png 374w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Centroids-for-two-clusters-150x101.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Centroids-for-two-clusters-300x202.png 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Centroids-for-two-clusters-272x182.png 272w\" sizes=\"auto, (max-width: 374px) 100vw, 374px\" \/><\/a><\/p>\n<p>&nbsp;<\/p>\n<ul>\n<li>In the above visualization, we obtain the centroids for our two clusters. Now, we will test our model. 
In the testing phase, we first display the cluster label assigned to each data point. The two labels (0 and 1) represent the two clusters.<\/li>\n<\/ul>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">Kmean.labels_<\/pre>\n<p>From the output of this line, we can observe that 100 values belong to label 0 and 100 values belong to label 1.<\/p>\n<ul>\n<li>Now we predict the cluster for a given data point located at position (4,5) in our 2-dimensional space.<\/li>\n<\/ul>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">sample_test = np.array([4.0, 5.0])\r\n# predict() expects a 2-D array of shape (n_samples, n_features)\r\nsecond_test = sample_test.reshape(1, -1)\r\nKmean.predict(second_test)<\/pre>\n<p>We obtain the following output &#8211;<\/p>\n<p><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Predicting-Cluster.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-55741\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Predicting-Cluster.png\" alt=\"K-means clustering \" width=\"1366\" height=\"560\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Predicting-Cluster.png 1366w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Predicting-Cluster-150x61.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Predicting-Cluster-300x123.png 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Predicting-Cluster-768x315.png 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Predicting-Cluster-1024x420.png 1024w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/Predicting-Cluster-520x213.png 520w\" sizes=\"auto, (max-width: 1366px) 100vw, 1366px\" \/><\/a><\/p>\n<p>From the above code, we can conclude that K-means clustering is easy to understand and easy to implement. 
We will now take a look at some of the practical applications of K-means clustering.<\/p>\n<p><strong>You must take a look at<\/strong> <a href=\"https:\/\/data-flair.training\/blogs\/python-for-data-science\/\"><strong>why Python is a must for Data Scientists<\/strong><\/a><\/p>\n<h3>Applications of K-Means Clustering Algorithm<\/h3>\n<p>1. The K-means algorithm is used in the business sector to identify segments of purchases made by users. It is also used to cluster activity on websites and applications.<\/p>\n<p>2. It is used as a lossy image compression technique. K-means clusters the pixels of an image into a small number of representative colors, which reduces the overall size of the image.<\/p>\n<p>3. It is also used in document clustering, which groups related documents together.<\/p>\n<p>4. K-means is used in the field of insurance and fraud detection. Using historical data, practices and claims can be flagged as potentially fraudulent based on their closeness to clusters that indicate patterns of fraud.<\/p>\n<p>5. It is also used to classify sounds based on similar patterns and to isolate deformities in speech.<\/p>\n<p>6. K-means clustering is used for Call Detail Record (CDR) analysis. It provides in-depth insight into customer requirements based on call traffic at different times of the day and the demographics of the area.<\/p>\n<h3>Summary<\/h3>\n<p>K-means clustering is an unsupervised learning algorithm used in data science to group data points into distinct clusters. It works by placing \u2018k\u2019 centroids and assigning each data point to the nearest one. The centroids are then updated by calculating the mean of all points in the cluster. This process continues until the clusters stop changing. The result is a set of groups where each group shares similar characteristics.<\/p>\n<p>So, in this K-means clustering tutorial, we went through the basics of the algorithm. We covered its definition and the steps it follows. 
We also went through the code implementation using Python libraries. Finally, we looked at real-life applications of K-means clustering.<\/p>\n<p>For a data scientist, knowledge of this clustering algorithm is essential. Since it teaches you to deal with unlabeled data, it is a must-have skill for any budding data scientist.<\/p>\n<p><span id=\":vw.co\" class=\"tL8wMe EMoHub\" dir=\"ltr\">K-means is a Machine Learning algorithm that forms part of a much larger pool of data operations known as Data Science. This is the right time to explore <a href=\"https:\/\/data-flair.training\/blogs\/what-is-data-science\/\"><strong>everything about<\/strong> <strong>Data Science<\/strong><\/a>. <\/span><\/p>\n<p>Hope the tutorial was helpful. If there is anything we missed out, do let us know through comments.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>One of the most popular Machine Learning algorithms is K-means clustering. It is an unsupervised learning algorithm, meaning that it is used for unlabeled datasets. Imagine that you have several points spread over an&#46;&#46;&#46;<\/p>\n","protected":false},"author":7,"featured_media":55513,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[19],"tags":[19647,19649,19648,19650],"class_list":["post-55450","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-science","tag-k-means-clustering","tag-k-means-clustering-example","tag-k-means-clustering-tutorial","tag-what-is-clustering"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Data Science K-means Clustering - In-depth Tutorial with Example - DataFlair<\/title>\n<meta name=\"description\" content=\"Learn what is K-means Clustering with simple explanation. 
Here you will find the example of k-means clustering using random data\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/data-flair.training\/blogs\/k-means-clustering-tutorial\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Data Science K-means Clustering - In-depth Tutorial with Example - DataFlair\" \/>\n<meta property=\"og:description\" content=\"Learn what is K-means Clustering with simple explanation. Here you will find the example of k-means clustering using random data\" \/>\n<meta property=\"og:url\" content=\"https:\/\/data-flair.training\/blogs\/k-means-clustering-tutorial\/\" \/>\n<meta property=\"og:site_name\" content=\"DataFlair\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/DataFlairWS\/\" \/>\n<meta property=\"article:published_time\" content=\"2019-05-01T12:05:00+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-08-03T15:50:23+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/K-means-Clustering-01-1.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"802\" \/>\n\t<meta property=\"og:image:height\" content=\"420\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"DataFlair Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:site\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"DataFlair Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"7 minutes\" \/>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Data Science K-means Clustering - In-depth Tutorial with Example - DataFlair","description":"Learn what is K-means Clustering with simple explanation. Here you will find the example of k-means clustering using random data","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/data-flair.training\/blogs\/k-means-clustering-tutorial\/","og_locale":"en_US","og_type":"article","og_title":"Data Science K-means Clustering - In-depth Tutorial with Example - DataFlair","og_description":"Learn what is K-means Clustering with simple explanation. Here you will find the example of k-means clustering using random data","og_url":"https:\/\/data-flair.training\/blogs\/k-means-clustering-tutorial\/","og_site_name":"DataFlair","article_publisher":"https:\/\/www.facebook.com\/DataFlairWS\/","article_published_time":"2019-05-01T12:05:00+00:00","article_modified_time":"2025-08-03T15:50:23+00:00","og_image":[{"width":802,"height":420,"url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/K-means-Clustering-01-1.jpg","type":"image\/jpeg"}],"author":"DataFlair Team","twitter_card":"summary_large_image","twitter_creator":"@DataFlairWS","twitter_site":"@DataFlairWS","twitter_misc":{"Written by":"DataFlair Team","Est. 
reading time":"7 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/data-flair.training\/blogs\/k-means-clustering-tutorial\/#article","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/k-means-clustering-tutorial\/"},"author":{"name":"DataFlair Team","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/beb0cab24b7aa54423a3b50e669a9dcd"},"headline":"Data Science K-means Clustering &#8211; In-depth Tutorial with Example","datePublished":"2019-05-01T12:05:00+00:00","dateModified":"2025-08-03T15:50:23+00:00","mainEntityOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/k-means-clustering-tutorial\/"},"wordCount":1200,"commentCount":7,"publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/k-means-clustering-tutorial\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/K-means-Clustering-01-1.jpg","keywords":["K-means Clustering","K-means Clustering Example","K-means Clustering Tutorial","What is clustering"],"articleSection":["Data Science Tutorials"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/data-flair.training\/blogs\/k-means-clustering-tutorial\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/data-flair.training\/blogs\/k-means-clustering-tutorial\/","url":"https:\/\/data-flair.training\/blogs\/k-means-clustering-tutorial\/","name":"Data Science K-means Clustering - In-depth Tutorial with Example - 
DataFlair","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/#website"},"primaryImageOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/k-means-clustering-tutorial\/#primaryimage"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/k-means-clustering-tutorial\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/K-means-Clustering-01-1.jpg","datePublished":"2019-05-01T12:05:00+00:00","dateModified":"2025-08-03T15:50:23+00:00","description":"Learn what is K-means Clustering with simple explanation. Here you will find the example of k-means clustering using random data","breadcrumb":{"@id":"https:\/\/data-flair.training\/blogs\/k-means-clustering-tutorial\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/data-flair.training\/blogs\/k-means-clustering-tutorial\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/k-means-clustering-tutorial\/#primaryimage","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/K-means-Clustering-01-1.jpg","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/05\/K-means-Clustering-01-1.jpg","width":802,"height":420,"caption":"K-means Clustering-01"},{"@type":"BreadcrumbList","@id":"https:\/\/data-flair.training\/blogs\/k-means-clustering-tutorial\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Blog Home","item":"https:\/\/data-flair.training\/blogs\/"},{"@type":"ListItem","position":2,"name":"Data Science Tutorials","item":"https:\/\/data-flair.training\/blogs\/category\/data-science\/"},{"@type":"ListItem","position":3,"name":"Data Science K-means Clustering &#8211; In-depth Tutorial with Example"}]},{"@type":"WebSite","@id":"https:\/\/data-flair.training\/blogs\/#website","url":"https:\/\/data-flair.training\/blogs\/","name":"DataFlair","description":"Learn Today. 
Lead Tomorrow.","publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/data-flair.training\/blogs\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/data-flair.training\/blogs\/#organization","name":"DataFlair","url":"https:\/\/data-flair.training\/blogs\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","width":106,"height":48,"caption":"DataFlair"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DataFlairWS\/","https:\/\/x.com\/DataFlairWS","https:\/\/www.linkedin.com\/company\/dataflair-web-services-pvt-ltd\/","https:\/\/www.youtube.com\/user\/DataFlairWS"]},{"@type":"Person","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/beb0cab24b7aa54423a3b50e669a9dcd","name":"DataFlair Team","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/c322416204232f4dd97ef3901b0a499a5d34d7ba7fe333f4bfe53a907873d293?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/c322416204232f4dd97ef3901b0a499a5d34d7ba7fe333f4bfe53a907873d293?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/c322416204232f4dd97ef3901b0a499a5d34d7ba7fe333f4bfe53a907873d293?s=96&d=mm&r=g","caption":"DataFlair Team"},"description":"DataFlair Team specializes in creating clear, actionable content on programming, Java, Python, C++, DSA, AI, ML, data Science, Android, Flutter, MERN, Web Development, and technology. 
Backed by industry expertise, we make learning easy and career-oriented for beginners and pros alike.","url":"https:\/\/data-flair.training\/blogs\/author\/dfteam3\/"}]}},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/55450","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/users\/7"}],"replies":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/comments?post=55450"}],"version-history":[{"count":9,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/55450\/revisions"}],"predecessor-version":[{"id":146521,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/55450\/revisions\/146521"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media\/55513"}],"wp:attachment":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media?parent=55450"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/categories?post=55450"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/tags?post=55450"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}