Clustering in Data Mining – Algorithms of Cluster Analysis in Data Mining
1. Data Mining Clustering – Objective
In this blog, we will study Cluster Analysis in Data Mining. First, we will introduce clustering in data mining and its requirements. Then, we will discuss the applications and algorithms of Cluster Analysis in Data Mining. Finally, we will cover Data Mining Clustering Methods and approaches to Cluster Analysis.
So, let’s start exploring Clustering in Data Mining.
2. Introduction to Cluster Analysis
a. What is Clustering in Data Mining?
- Generally, clustering is the process of grouping abstract objects into classes of similar objects.
- We treat a cluster of data objects as one group.
- While doing cluster analysis, we first partition the set of data into groups based on data similarity and then assign labels to the groups.
- The main advantage of clustering over classification is that it is adaptable to changes and helps single out useful features that distinguish different groups.
b. What is Cluster Analysis in Data Mining?
3. Applications of Data Mining Cluster Analysis
- Data Clustering analysis is used in many applications, such as market research, pattern recognition, data analysis, and image processing.
- Data Clustering can also help marketers discover distinct groups in their customer base and characterize those groups based on purchasing patterns.
- In the field of biology, it can be used to derive plant and animal taxonomies, categorize genes with similar functionalities, and gain insight into structures inherent to populations.
- Clustering in Data Mining helps identify areas of similar land use in an earth observation database. It also helps identify groups of houses in a city according to house type, value, and geographic location.
- Clustering in Data Mining also helps in classifying documents on the web for information discovery.
- Also, we use Data clustering in outlier detection applications, such as detection of credit card fraud.
- As a data mining function, cluster analysis serves as a tool to gain insight into the distribution of data and to observe the characteristics of each cluster.
4. Requirements of Clustering in Data Mining
b. Ability to deal with different kinds of attributes
c. Discovery of clusters with arbitrary shape
d. High dimensionality
e. Ability to deal with noisy data
5. Data Mining Clustering Methods
a. Partitioning Clustering Method
- Each group contains at least one object.
- Each object must belong to exactly one group.
- Given a number of partitions (say k), the partitioning method first creates an initial partitioning.
- It then uses the iterative relocation technique to improve the partitioning by moving objects from one group to another.
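The steps above can be sketched in a few lines of Python. This is only a minimal illustration of iterative relocation in the style of k-means, not a specific algorithm named in this blog. The function name `partition`, the choice of the first k points as starting centers, and the use of Euclidean distance are all assumptions made for the example. Note that this sketch does not enforce the rule that every group stays non-empty.

```python
import math

def partition(points, k, max_iter=100):
    """Split `points` into k groups by iterative relocation (k-means style sketch)."""
    if not 1 <= k <= len(points):
        raise ValueError("k must be between 1 and the number of points")
    # Initial partitioning: take the first k points as starting centers (arbitrary choice).
    centers = [tuple(p) for p in points[:k]]
    labels = [None] * len(points)
    for _ in range(max_iter):
        # Relocation step: move each object to the group with the nearest center.
        new_labels = [
            min(range(k), key=lambda j: math.dist(p, centers[j]))
            for p in points
        ]
        if new_labels == labels:
            break  # no object changed group, so the partitioning is stable
        labels = new_labels
        # Update step: recompute each center as the mean of its group.
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:  # keep the old center if a group became empty
                centers[j] = tuple(sum(c) / len(members) for c in zip(*members))
    return labels, centers

# Hypothetical sample data: two visibly separate groups of 2-D points.
sample = [(1.0, 1.0), (8.0, 8.0), (1.2, 0.8), (8.2, 7.9), (0.9, 1.1)]
labels, centers = partition(sample, k=2)
print(labels)
```

Each object ends up with exactly one label, which matches the "exactly one group" rule above. The result depends on the starting centers, so a real implementation would usually try several initial partitionings.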
b. Hierarchical Clustering Methods
- Agglomerative Approach
- Divisive Approach
i. Agglomerative Approach
ii. Divisive Approach
- Perform careful analysis of object linkages at each hierarchical partitioning.
- Integrate hierarchical agglomeration by first using a hierarchical agglomerative algorithm to group objects into micro-clusters, and then performing macro-clustering on the micro-clusters.
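The second point relies on a hierarchical agglomerative algorithm. As a rough sketch of the bottom-up idea, the code below starts with every object in its own cluster and repeatedly merges the two closest clusters. The function name `agglomerative`, the stopping rule (a target number of clusters), and the single-linkage distance are assumptions chosen for illustration. It does not implement the micro-cluster and macro-clustering scheme itself.

```python
import math

def agglomerative(points, num_clusters):
    """Bottom-up sketch: start with singletons, repeatedly merge the closest pair."""
    if not 1 <= num_clusters <= len(points):
        raise ValueError("num_clusters must be between 1 and the number of points")
    clusters = [[p] for p in points]  # every object starts as its own cluster

    def linkage(a, b):
        # Single linkage (an assumption): distance between the closest members.
        return min(math.dist(p, q) for p in a for q in b)

    while len(clusters) > num_clusters:
        # Find the pair of clusters with the smallest linkage distance.
        i, j = min(
            ((i, j) for i in range(len(clusters)) for j in range(i + 1, len(clusters))),
            key=lambda ij: linkage(clusters[ij[0]], clusters[ij[1]]),
        )
        clusters[i].extend(clusters[j])  # merge cluster j into cluster i
        del clusters[j]                  # safe because j > i
    return clusters

# Hypothetical sample data.
sample = [(0, 0), (0, 1), (5, 5), (5, 6), (10, 0)]
for cluster in agglomerative(sample, num_clusters=3):
    print(cluster)
```

This naive version rescans all cluster pairs on every merge, so it is only suitable for small inputs. Recording the order of merges instead of stopping at a fixed count would give the full hierarchy.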
c. Density-Based Clustering Method
d. Grid-Based Clustering Method
- The major advantage of this method is a fast processing time.
- Processing time depends only on the number of cells in each dimension of the quantized space, not on the number of data objects.
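To see why the cost depends on cells rather than objects, here is a small sketch of the grid idea for 2-D points. The blog does not name a specific grid-based algorithm, so this is a generic illustration: the function name `grid_clusters` and the parameters `cell_size` and `min_points` are assumptions. Points are quantized into cells once, and all later work is done on the cells.

```python
from collections import Counter, deque

def grid_clusters(points, cell_size, min_points=2):
    """Group neighbouring dense grid cells into clusters (illustrative sketch)."""
    # Quantize: map each 2-D point to the integer index of its grid cell.
    counts = Counter((int(x // cell_size), int(y // cell_size)) for x, y in points)
    # Keep only cells that hold at least `min_points` objects.
    dense = {cell for cell, n in counts.items() if n >= min_points}
    # From here on, the work is per cell, not per point.
    clusters, seen = [], set()
    for start in sorted(dense):
        if start in seen:
            continue
        seen.add(start)
        queue, group = deque([start]), []
        while queue:
            cx, cy = queue.popleft()
            group.append((cx, cy))
            # Visit the four side-adjacent cells.
            for neighbour in ((cx + 1, cy), (cx - 1, cy), (cx, cy + 1), (cx, cy - 1)):
                if neighbour in dense and neighbour not in seen:
                    seen.add(neighbour)
                    queue.append(neighbour)
        clusters.append(sorted(group))
    return clusters

# Hypothetical sample data on a grid with cells of size 1.0.
sample = [(0.1, 0.2), (0.4, 0.9), (1.2, 0.3), (1.7, 0.8), (5.5, 5.5), (5.1, 5.9), (9.0, 0.0)]
print(grid_clusters(sample, cell_size=1.0))
```

The result is a list of clusters, each given as a list of cell indices. The lone point at (9.0, 0.0) falls in a sparse cell and is left out, which is one simple way such a grid can treat noise.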
e. Model-Based Clustering Methods
f. Constraint-Based Clustering Method
6. What is Not Cluster Analysis?
- Supervised classification – Uses class label information
- Simple segmentation – Dividing students into different registration groups by last name
- Results of a query – Groupings are the result of an external specification
- Graph partitioning – Some mutual relevance and synergy, but areas are not identical
So, this was all about Clustering in Data Mining. We hope you liked our explanation.