

{"id":7149,"date":"2018-02-08T13:11:03","date_gmt":"2018-02-08T13:11:03","guid":{"rendered":"https:\/\/data-flair.training\/blogs\/?p=7149"},"modified":"2025-04-06T21:36:42","modified_gmt":"2025-04-06T16:06:42","slug":"data-mining-tutorial","status":"publish","type":"post","link":"https:\/\/data-flair.training\/blogs\/data-mining-tutorial\/","title":{"rendered":"Data Mining Tutorial &#8211; Introduction to Data Mining (Complete Guide)"},"content":{"rendered":"<p>In this Data Mining Tutorial, we will study what data mining is, along with its scope, foundations, techniques, and terminology. We will also look at the data mining architecture with a diagram.<\/p>\n<p>Further, we will study knowledge discovery, data mining applications, and the pros and cons of data mining.<\/p>\n<p>So, let&#8217;s start the Data Mining Tutorial.<\/p>\n<h3>What is Data Mining?<\/h3>\n<p><strong>Data Mining<\/strong> is a set of methods applied to large and complex databases to filter out randomness and discover hidden patterns. These <em>data mining methods<\/em> are almost always computationally intensive.<\/p>\n<p>We use <strong>data mining tools<\/strong>, methodologies, and theories to reveal patterns in data. Many driving forces are at work, which is why data mining has become such an important area of study.<\/p>\n<h3>Data Mining History<\/h3>\n<p>In the 1960s, statisticians used the terms \u201cData Fishing\u201d or \u201cData Dredging\u201d to refer to what they considered the bad practice of analyzing data without a prior hypothesis. The term \u201cData Mining\u201d appeared around 1990 in the database community.<\/p>\n<h3>Data Mining Foundation<\/h3>\n<p><strong>Data mining techniques<\/strong> are the result of a long process of research and product development. 
This evolution began when business data was first stored on computers.<\/p>\n<p>It has since produced technologies that allow users to navigate through their data in real time. Data mining is ready for use in the business community because it is supported by three technologies that are now mature:<\/p>\n<ul>\n<li>Massive data collection<\/li>\n<li>Powerful multiprocessor computers<\/li>\n<li>Data mining algorithms<\/li>\n<\/ul>\n<h3>Why Data Mining?<\/h3>\n<p>Data mining has a wide range of applications, which makes it a young and promising field. It has attracted a great deal of attention in the information industry and in society.<\/p>\n<p>This is due to the wide availability of huge amounts of data and the imminent need to turn such data into useful information and knowledge. That information and knowledge can be used for applications such as market analysis. Data mining is also known as knowledge discovery from data.<\/p>\n<h3>Type of Data Gathered<\/h3>\n<p>In this part of the Data Mining Tutorial, we will discuss the types of data gathered in data mining:<\/p>\n<div id=\"attachment_34246\" style=\"width: 1210px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Type-of-Data-Gathered-in-Data-Mining-01.jpg\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-34246\" class=\"size-full wp-image-34246\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Type-of-Data-Gathered-in-Data-Mining-01.jpg\" alt=\"Data Mining Tutorial - Type of Data Gathered\" width=\"1200\" height=\"628\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Type-of-Data-Gathered-in-Data-Mining-01.jpg 1200w, 
https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Type-of-Data-Gathered-in-Data-Mining-01-150x79.jpg 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Type-of-Data-Gathered-in-Data-Mining-01-300x157.jpg 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Type-of-Data-Gathered-in-Data-Mining-01-768x402.jpg 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Type-of-Data-Gathered-in-Data-Mining-01-1024x536.jpg 1024w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Type-of-Data-Gathered-in-Data-Mining-01-520x272.jpg 520w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/a><p id=\"caption-attachment-34246\" class=\"wp-caption-text\">Data Mining Tutorial &#8211; Type of Data Gathered<\/p><\/div>\n<h4>a. Business Transactions<\/h4>\n<p>In the business world, transactions are often &#8220;memorized&#8221; for perpetuity. Many transactions are time related and can be inter-business deals such as purchases, exchanges, banking, and stock trades.<\/p>\n<h4>b. Scientific Data<\/h4>\n<p>Everywhere, our society is amassing colossal amounts of scientific data that need to be analyzed. Unfortunately, we can capture and store new data faster than we can analyze the data already accumulated.<\/p>\n<h4>c. Medical and Personal Data<\/h4>\n<p>From government records to customer files and personal needs, large amounts of information about individuals and groups are continually gathered.<\/p>\n<p>When correlated with other data, this information can shed light on customer behaviour.<\/p>\n<h4>d. Surveillance Video and Pictures<\/h4>\n<p>With the collapse of video camera prices, video cameras are becoming ubiquitous. Videotapes from surveillance cameras are usually recycled, so their content is lost. 
However, it has become a trend to store the tapes and even digitize them for future use and analysis.<\/p>\n<h4>e. Games<\/h4>\n<p>Our society collects a huge amount of data and statistics about games, players, and athletes. Commentators and journalists use this information for reporting.<\/p>\n<h4>f. Digital Media<\/h4>\n<p>There are many causes of the explosion in digital media repositories, such as cheap scanners, desktop video cameras, and digital cameras. Associations such as the NHL and the NBA have already started converting their huge game collections into digital form.<\/p>\n<h4>g. CAD and Software Engineering Data<\/h4>\n<p>There are many CAD systems that architects use to design buildings. These systems generate a huge amount of data.<\/p>\n<p>Moreover, software engineering is a source of considerable similar data, with code and objects that need powerful tools for management and maintenance.<\/p>\n<h4>h. Virtual Worlds<\/h4>\n<p>Nowadays many applications use three-dimensional virtual spaces. These spaces and the objects they contain are described with special languages such as VRML. Ideally, virtual spaces are defined in such a way that they can share objects and places. A remarkable number of virtual reality objects and spaces are already available.<\/p>\n<h4>i. Text reports and memos (e-mail messages)<\/h4>\n<p>In many companies, communications are based on reports and memos in textual form, often exchanged by e-mail. These messages are regularly stored in digital form for future use and reference. 
Over time, this creates formidable digital libraries.<\/p>\n<h3>Uses of Data Mining<\/h3>\n<p>The main uses of data mining are discussed below, one by one:<\/p>\n<p><strong>a. Automated Prediction of Trends and Behaviours<\/strong><\/p>\n<p>Data mining automates the process of finding predictive information in large databases. Questions that once required extensive hands-on analysis can now be answered directly from the data.<\/p>\n<p>Targeted marketing is a typical example of a predictive problem. Data mining uses data on past promotional mailings to identify the targets most likely to maximize return on investment in future mailings.<\/p>\n<p>Other predictive problems include forecasting bankruptcy and other forms of default, and identifying segments of a population likely to respond similarly to given events.<\/p>\n<p><strong>b. Automated Discovery of Previously Unknown Patterns<\/strong><\/p>\n<p>Data mining tools sweep through databases and identify previously hidden patterns in one step. An example of pattern discovery is the analysis of retail sales data to identify seemingly unrelated products that are often purchased together.<\/p>\n<p>Other pattern discovery problems include detecting fraudulent credit card transactions. 
They also include identifying anomalous data that could represent data entry keying errors.<\/p>\n<h3>Data Mining Techniques<\/h3>\n<p>In this section of the Data Mining Tutorial, we will explore the techniques used in data mining:<\/p>\n<div id=\"attachment_34248\" style=\"width: 1210px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Data-Mining-Techniques-01-2.jpg\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-34248\" class=\"size-full wp-image-34248\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Data-Mining-Techniques-01-2.jpg\" alt=\"Data Mining Tutorial - Data Mining Techniques\" width=\"1200\" height=\"628\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Data-Mining-Techniques-01-2.jpg 1200w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Data-Mining-Techniques-01-2-150x79.jpg 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Data-Mining-Techniques-01-2-300x157.jpg 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Data-Mining-Techniques-01-2-768x402.jpg 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Data-Mining-Techniques-01-2-1024x536.jpg 1024w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Data-Mining-Techniques-01-2-520x272.jpg 520w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/a><p id=\"caption-attachment-34248\" class=\"wp-caption-text\">Data Mining Tutorial &#8211; Data Mining Techniques<\/p><\/div>\n<h4>a. Artificial Neural Networks<\/h4>\n<p>Artificial neural networks are non-linear predictive models that learn through training and resemble biological neural networks in structure.<\/p>\n<h4>b. 
Decision Trees<\/h4>\n<p>Decision trees are tree-shaped structures that represent sets of decisions. These decisions generate rules for the classification of a dataset.<\/p>\n<p>Specific decision tree methods include Classification and Regression Trees (CART) and Chi-Square Automatic Interaction Detection (CHAID).<\/p>\n<h4>c. Genetic Algorithms<\/h4>\n<p>Genetic algorithms are optimization techniques that use processes such as genetic combination, mutation, and natural selection in a design based on the concepts of evolution.<\/p>\n<h4>d. Nearest Neighbor Method<\/h4>\n<p>A technique that classifies each record in a dataset based on a combination of the classes of the k record(s) most similar to it in a historical dataset (where k \u2265 1). It is sometimes called the k-nearest neighbour technique.<\/p>\n<h4>e. Rule Induction<\/h4>\n<p>The extraction of useful if-then rules from data based on statistical significance.<\/p>\n<h3>Data Mining Terminologies<\/h3>\n<p>In this Data Mining Tutorial, we will learn some basic and important terms used in Data Mining:<\/p>\n<h4>a. Notation<\/h4>\n<p><strong>Input X<\/strong>: X is often multidimensional.<\/p>\n<p>Each dimension of X is denoted by X<sub>j<\/sub> and is referred to as a feature variable or, simply, a variable.<\/p>\n<p><strong>Output Y<\/strong>: called the response or dependent variable.<\/p>\n<p>A response is available only when learning is supervised.<\/p>\n<h4>b. Nature of Data Sets<\/h4>\n<p><strong>i. Quantitative: <\/strong>Measurements or counts, recorded as numerical values, e.g. height, temperature, number of red M&amp;M\u2019s in a bag.<br \/>\n<strong>ii. Qualitative:<\/strong> Groups or categories, which may be ordinal or nominal.<br \/>\n<strong>iii. Ordinal:<\/strong> Possesses a natural ordering, e.g. shirt sizes (S, M, L, XL).<\/p>\n<p><strong>iv. 
Nominal:<\/strong> Just the names of the categories, with no natural ordering, e.g. marital status, gender, color of M&amp;M\u2019s in a bag.<\/p>\n<h3>Data Mining Architecture<\/h3>\n<p>To apply these advanced techniques in the best way, they must be fully integrated with a data warehouse and with business analysis tools. Data mining tools that operate outside the warehouse need extra steps for extracting and importing the data.<\/p>\n<p>Furthermore, when new insights need operational implementation, integration with the warehouse simplifies applying them. The resulting analytic data warehouse can be used to improve business processes, particularly in areas such as promotional campaign management.<\/p>\n<div id=\"attachment_7155\" style=\"width: 438px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/image-14-1.jpg\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-7155\" class=\"wp-image-7155 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/image-14-1.jpg\" alt=\"Data Mining Tutorial -\u00a0Data Mining Architecture\" width=\"428\" height=\"178\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/image-14-1.jpg 428w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/image-14-1-150x62.jpg 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/image-14-1-300x125.jpg 300w\" sizes=\"auto, (max-width: 428px) 100vw, 428px\" \/><\/a><p id=\"caption-attachment-7155\" class=\"wp-caption-text\">Data Mining Tutorial &#8211;\u00a0Data Mining Architecture<\/p><\/div>\n<p>The ideal starting point is a data warehouse containing a combination of internal data tracking all customer contact, coupled with external market data about competitor activity. 
Background information on potential customers also provides an excellent basis for prospecting.<\/p>\n<p>An OLAP (On-Line Analytical Processing) server enables a more sophisticated end-user business model to be applied when navigating the data warehouse. Its multidimensional structures allow users to analyze the data as they want to view their business, for example by summarizing by product line or region.<\/p>\n<p>Further, the Data Mining Server must be integrated with the data warehouse and the OLAP server to embed ROI-focused business analysis directly into this infrastructure. Integration with the data warehouse also enables operational decisions to be implemented and tracked.<\/p>\n<p>As the warehouse grows with new decisions and results, the organization can mine the best practices and apply them to future decisions.<\/p>\n<p>The results of data mining enhance the metadata in the OLAP server by providing a dynamic metadata layer that represents a distilled view of the data. Reporting, visualization, and other analysis tools can then be applied to plan future actions and confirm the impact of those plans.<\/p>\n<h3>Data Mining Process<\/h3>\n<p>Data Mining is also popularly known as <em><strong>Knowledge Discovery in Databases (KDD)<\/strong><\/em>. 
It refers to the nontrivial extraction of implicit information from data in databases.<\/p>\n<div id=\"attachment_7152\" style=\"width: 443px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/image5.jpg\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-7152\" class=\"wp-image-7152 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/image5.jpg\" alt=\"Data Mining Tutorial -\u00a0Data Mining Process\" width=\"433\" height=\"337\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/image5.jpg 433w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/image5-150x117.jpg 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/image5-300x233.jpg 300w\" sizes=\"auto, (max-width: 433px) 100vw, 433px\" \/><\/a><p id=\"caption-attachment-7152\" class=\"wp-caption-text\">Data Mining Tutorial &#8211;\u00a0Data Mining Process<\/p><\/div>\n<p>The data mining process comprises a few steps that lead from raw data collections to some form of new knowledge. The iterative process consists of the following steps:<\/p>\n<h4>a. Data Cleaning<\/h4>\n<p>This is the first step of data mining and is very important. In this phase, noisy and irrelevant data are removed from the collection.<\/p>\n<h4>b. Data Integration<\/h4>\n<p>In this phase, multiple data sources are combined in one place. This enhances the accuracy and speed of the mining process. It is performed using migration tools such as Oracle Data Service Integrator and Microsoft SQL.<\/p>\n<h4>c. 
Data Selection<\/h4>\n<p>In this phase, the data relevant to the analysis is decided on and retrieved from the data collection.<\/p>\n<div id=\"attachment_34245\" style=\"width: 1210px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Data-Mining-Process-01-1.jpg\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-34245\" class=\"size-full wp-image-34245\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Data-Mining-Process-01-1.jpg\" alt=\"Data Mining Tutorial - Data Mining Process\" width=\"1200\" height=\"628\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Data-Mining-Process-01-1.jpg 1200w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Data-Mining-Process-01-1-150x79.jpg 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Data-Mining-Process-01-1-300x157.jpg 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Data-Mining-Process-01-1-768x402.jpg 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Data-Mining-Process-01-1-1024x536.jpg 1024w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Data-Mining-Process-01-1-520x272.jpg 520w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/a><p id=\"caption-attachment-34245\" class=\"wp-caption-text\">Data Mining Tutorial &#8211; Data Mining Process<\/p><\/div>\n<h4>d. Data Transformation<\/h4>\n<p>Also known as data consolidation, this is the phase in which the selected data is transformed into forms appropriate for the mining procedure.<\/p>\n<h4>e. 
Data Mining<\/h4>\n<p>In this step, clever techniques are applied to extract potentially useful patterns.<\/p>\n<h4>f. Pattern Evaluation<\/h4>\n<p>In this step, interesting patterns representing knowledge are identified based on given measures.<\/p>\n<h4>g. Knowledge Representation<\/h4>\n<p>This is the final phase, in which the discovered knowledge is presented to the user. This essential step uses visualization techniques to help users understand and interpret the data mining results.<\/p>\n<h3>Categories of Data Mining Systems<\/h3>\n<p>There are many data mining systems available, and in this Data Mining Tutorial we will study four major classifications. Some systems are specialized and dedicated to a given data source. Further, data mining systems can be categorized according to various criteria.<\/p>\n<p><strong>a. Classification according to the type of data source mined<\/strong><\/p>\n<p>This classification categorizes data mining systems according to the type of data handled, such as spatial data, multimedia data, time-series data, text data, World Wide Web, etc.<\/p>\n<p><strong>b. Classification according to the data model drawn on<\/strong><\/p>\n<p>This classification is based on the data model involved, such as a relational database, object-oriented database, data warehouse, transactional database, etc.<\/p>\n<p><strong>c. Classification according to the kind of knowledge discovered<\/strong><\/p>\n<p>This classification is based on the kind of knowledge discovered, such as characterization, discrimination, association, classification, clustering, etc.<\/p>\n<p><strong>d. 
Classification according to mining techniques used<\/strong><\/p>\n<p>Data mining systems employ and provide different techniques. This classification categorizes systems according to the data analysis approach used, such as machine learning, neural networks, genetic algorithms, etc.<\/p>\n<h3>Data Mining Issues<\/h3>\n<p>In this part of the Data Mining Tutorial, we will discuss some major issues faced in data mining.<\/p>\n<h4>a. Mining Methodology Issues<\/h4>\n<p>These issues relate to the data mining approaches applied and their limitations. Topics such as the versatility of the mining approaches can dictate mining methodology choices.<\/p>\n<h4>b. Performance Issues<\/h4>\n<p>Many artificial intelligence and statistical methods exist for data analysis. However, these methods were often not designed for the very large datasets that data mining deals with today, where terabyte sizes are common.<\/p>\n<p>This raises the issues of scalability and efficiency of data mining methods when processing considerably large data. Linear algorithms are usually the norm. In the same theme, sampling can be used for mining instead of the whole dataset.<\/p>\n<p>However, issues like completeness and choice of samples may arise. Other topics in the issue of performance are incremental updating and parallel programming. Parallelism can help solve the size problem if the dataset can be subdivided and the results can be merged later.<\/p>\n<p>Incremental updating is important for merging results from parallel mining. 
It is also important for updating results when new data becomes available, without having to re-analyze the complete dataset.<\/p>\n<h4>c. Data Source Issues<\/h4>\n<p>There are many issues related to the data sources. Some are practical, such as the diversity of data types, while others are philosophical, like the data glut problem.<\/p>\n<p>We certainly have an excess of data, since we already have more data than we can handle and we are still collecting data at an even higher rate. The spread of database management systems has helped increase the gathering of information, and the advent of data mining is certainly encouraging more data harvesting. The current practice is to collect as much data as possible now and process it, or try to process it, later.<\/p>\n<p>Regarding the practical issues related to data sources, there is the subject of heterogeneous databases and the focus on diverse complex data types. We store different types of data in a variety of repositories. It is difficult to expect a data mining system to achieve good mining results on all kinds of data and sources.<\/p>\n<p>Different kinds of data and sources may require distinct algorithms and methodologies. Currently, there is a focus on relational databases and data warehouses.<\/p>\n<p>A versatile data mining tool, for all sorts of data, may not be realistic. Moreover, the diversity of data sources, at structural and semantic levels, poses important challenges. 
These challenges face not only the database community but also the data mining community.<\/p>\n<h3>Data Mining Applications<\/h3>\n<ul>\n<li>Weather forecasting.<\/li>\n<li>E-commerce.<\/li>\n<li>Self-driving cars.<\/li>\n<li>Hazards of new medicine.<\/li>\n<li>Space research.<\/li>\n<li>Fraud detection.<\/li>\n<li>Stock trade analysis.<\/li>\n<li>Business forecasting.<\/li>\n<li>Social networks.<\/li>\n<li>Customer likelihood.<\/li>\n<\/ul>\n<p><strong>More applications include:<\/strong><\/p>\n<ul>\n<li>A credit card company can leverage its vast warehouse of customer transaction data to identify customers most likely to be interested in a new credit product. Using a small test mailing, the attributes of customers with an affinity for the product can be identified. Recent projects have indicated more than a 20-fold decrease in costs for targeted mailing campaigns over conventional approaches.<\/li>\n<li>A diversified transportation company can apply data mining to identify the best prospects for its services. Applying the resulting segmentation to a general business database, such as those provided by Dun &amp; Bradstreet, can yield a prioritized list of prospects by region.<\/li>\n<li>A large consumer packaged goods company can apply data mining to improve its sales process to retailers. Data from consumer panels and competitor activity can be applied to understand the reasons for brand and store switching. Through this analysis, the manufacturer can select promotional strategies that best reach its target customer segments.<\/li>\n<\/ul>\n<h3>Areas where Data Mining had Good and Bad Effects<\/h3>\n<p><strong>a. 
Good Effects<\/strong><\/p>\n<ul>\n<li>Predict future trends, customer purchase habits<\/li>\n<li>Help with decision making<\/li>\n<li>Improve company revenue and lower costs<\/li>\n<li>Market basket analysis<\/li>\n<li>Fraud detection<\/li>\n<\/ul>\n<p><strong>b. Bad Effects<\/strong><\/p>\n<ul>\n<li>User privacy\/security<\/li>\n<li>Amount of data is overwhelming<\/li>\n<li>Great cost at the implementation stage<\/li>\n<li>Possible misuse of information<\/li>\n<li>Possible inaccuracy of data<\/li>\n<\/ul>\n<h3>Data Mining Advantages and Disadvantages<\/h3>\n<h4>a. Data Mining Advantages<\/h4>\n<ul>\n<li>Banks and financial institutions use data mining to find probable defaulters, based on past transactions, user behavior, and data patterns.<\/li>\n<li>It helps advertisers push the right advertisements to internet surfers on web pages, based on machine learning algorithms. In this way, data mining benefits both prospective buyers and sellers of various products.<\/li>\n<li>Retail malls and grocery stores use data mining to arrange and keep the most sellable items in the most prominent positions. This has become possible thanks to inputs obtained from data mining software. In this way, data mining helps increase revenue.<\/li>\n<li>Data mining offers a variety of methods that are cost-effective compared to other applications.<\/li>\n<li>Data mining is used in many areas, such as bio-informatics, medicine, genetics, etc.<\/li>\n<li>Law enforcement agencies use data mining to identify criminal suspects.<\/li>\n<\/ul>\n<h4>b. Data Mining Disadvantages<\/h4>\n<ul>\n<li>Security: Users spend a great deal of time online for various purposes. 
Many of the services they use do not have security systems in place to protect them.<\/li>\n<li>Some data mining analytics software is difficult to operate and requires the user to have knowledge-based training.<\/li>\n<li>Data mining techniques are not 100% accurate, which may cause serious consequences in certain conditions.<\/li>\n<\/ul>\n<p>So, this was all about the Data Mining Tutorial. We hope you liked our explanation.<\/p>\n<h3>Conclusion<\/h3>\n<p>In this tutorial, we introduced data mining and studied its core concepts, along with its applications, advantages, and disadvantages. If you have any query regarding this Data Mining tutorial, feel free to ask in the comment section.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In this Data Mining Tutorial, we will study\u00a0what is Data Mining. Also, will study data mining scope, foundation, data mining techniques and terminologies in Data Mining. 
As we study this, will learn data mining&#46;&#46;&#46;<\/p>\n","protected":false},"author":6,"featured_media":34214,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[18],"tags":[3337,3339,3349,3351,3361,16565,7015],"class_list":["post-7149","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-mining","tag-data-mining-advantages","tag-data-mining-applications","tag-data-mining-disadvantages","tag-data-mining-features","tag-data-mining-limitations","tag-data-mining-tutorial","tag-introduction-to-data-mining"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Data Mining Tutorial - Introduction to Data Mining (Complete Guide) - DataFlair<\/title>\n<meta name=\"description\" content=\"Data Mining Tutorial -Introduction to Data Mining,What is data mining,applications of data mining,advantages &amp; limitations of data mining,data Mining Issues\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/data-flair.training\/blogs\/data-mining-tutorial\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Data Mining Tutorial - Introduction to Data Mining (Complete Guide) - DataFlair\" \/>\n<meta property=\"og:description\" content=\"Data Mining Tutorial -Introduction to Data Mining,What is data mining,applications of data mining,advantages &amp; limitations of data mining,data Mining Issues\" \/>\n<meta property=\"og:url\" content=\"https:\/\/data-flair.training\/blogs\/data-mining-tutorial\/\" \/>\n<meta property=\"og:site_name\" content=\"DataFlair\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/DataFlairWS\/\" \/>\n<meta 
property=\"article:published_time\" content=\"2018-02-08T13:11:03+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2025-04-06T16:06:42+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Introduction-to-Data-Mining-01-1.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"628\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"DataFlair Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:site\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"DataFlair Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"15 minutes\" \/>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Data Mining Tutorial - Introduction to Data Mining (Complete Guide) - DataFlair","description":"Data Mining Tutorial - Introduction to Data Mining, what is data mining, applications of data mining, advantages & limitations of data mining, data mining issues","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/data-flair.training\/blogs\/data-mining-tutorial\/","og_locale":"en_US","og_type":"article","og_title":"Data Mining Tutorial - Introduction to Data Mining (Complete Guide) - DataFlair","og_description":"Data Mining Tutorial - Introduction to Data Mining, what is data mining, applications of data mining, advantages & limitations of data mining, data mining issues","og_url":"https:\/\/data-flair.training\/blogs\/data-mining-tutorial\/","og_site_name":"DataFlair","article_publisher":"https:\/\/www.facebook.com\/DataFlairWS\/","article_published_time":"2018-02-08T13:11:03+00:00","article_modified_time":"2025-04-06T16:06:42+00:00","og_image":[{"width":1200,"height":628,"url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Introduction-to-Data-Mining-01-1.jpg","type":"image\/jpeg"}],"author":"DataFlair Team","twitter_card":"summary_large_image","twitter_creator":"@DataFlairWS","twitter_site":"@DataFlairWS","twitter_misc":{"Written by":"DataFlair Team","Est. 
reading time":"15 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/data-flair.training\/blogs\/data-mining-tutorial\/#article","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/data-mining-tutorial\/"},"author":{"name":"DataFlair Team","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/2c58ecb4f73a39f0ef993f1ddfcd7b89"},"headline":"Data Mining Tutorial &#8211; Introduction to Data Mining (Complete Guide)","datePublished":"2018-02-08T13:11:03+00:00","dateModified":"2025-04-06T16:06:42+00:00","mainEntityOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/data-mining-tutorial\/"},"wordCount":3032,"commentCount":7,"publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/data-mining-tutorial\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Introduction-to-Data-Mining-01-1.jpg","keywords":["data mining advantages","data mining applications","data mining disadvantages","data mining features","data mining limitations","Data Mining Tutorial","Introduction to data mining"],"articleSection":["Data Mining Tutorials"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/data-flair.training\/blogs\/data-mining-tutorial\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/data-flair.training\/blogs\/data-mining-tutorial\/","url":"https:\/\/data-flair.training\/blogs\/data-mining-tutorial\/","name":"Data Mining Tutorial - Introduction to Data Mining (Complete Guide) - 
DataFlair","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/#website"},"primaryImageOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/data-mining-tutorial\/#primaryimage"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/data-mining-tutorial\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Introduction-to-Data-Mining-01-1.jpg","datePublished":"2018-02-08T13:11:03+00:00","dateModified":"2025-04-06T16:06:42+00:00","description":"Data Mining Tutorial - Introduction to Data Mining, what is data mining, applications of data mining, advantages & limitations of data mining, data mining issues","breadcrumb":{"@id":"https:\/\/data-flair.training\/blogs\/data-mining-tutorial\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/data-flair.training\/blogs\/data-mining-tutorial\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/data-mining-tutorial\/#primaryimage","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Introduction-to-Data-Mining-01-1.jpg","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Introduction-to-Data-Mining-01-1.jpg","width":1200,"height":628,"caption":"Data Mining Tutorial"},{"@type":"BreadcrumbList","@id":"https:\/\/data-flair.training\/blogs\/data-mining-tutorial\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Blog Home","item":"https:\/\/data-flair.training\/blogs\/"},{"@type":"ListItem","position":2,"name":"Data Mining Tutorials","item":"https:\/\/data-flair.training\/blogs\/category\/data-mining\/"},{"@type":"ListItem","position":3,"name":"Data Mining Tutorial &#8211; Introduction to Data Mining (Complete Guide)"}]},{"@type":"WebSite","@id":"https:\/\/data-flair.training\/blogs\/#website","url":"https:\/\/data-flair.training\/blogs\/","name":"DataFlair","description":"Learn Today. 
Lead Tomorrow.","publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/data-flair.training\/blogs\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/data-flair.training\/blogs\/#organization","name":"DataFlair","url":"https:\/\/data-flair.training\/blogs\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","width":106,"height":48,"caption":"DataFlair"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DataFlairWS\/","https:\/\/x.com\/DataFlairWS","https:\/\/www.linkedin.com\/company\/dataflair-web-services-pvt-ltd\/","https:\/\/www.youtube.com\/user\/DataFlairWS"]},{"@type":"Person","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/2c58ecb4f73a39f0ef993f1ddfcd7b89","name":"DataFlair Team","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/1ce4a0e3e542444fc73bbebf83e89e8b73e2d95ccb1fcee64da9945f078b97c5?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/1ce4a0e3e542444fc73bbebf83e89e8b73e2d95ccb1fcee64da9945f078b97c5?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/1ce4a0e3e542444fc73bbebf83e89e8b73e2d95ccb1fcee64da9945f078b97c5?s=96&d=mm&r=g","caption":"DataFlair Team"},"description":"The DataFlair Team provides industry-driven content on programming, Java, Python, C++, DSA, AI, ML, Data Science, Android, Flutter, MERN, Web Development, and technology. 
Our expert educators focus on delivering value-packed, easy-to-follow resources for tech enthusiasts and professionals.","url":"https:\/\/data-flair.training\/blogs\/author\/dfteam2\/"}]}},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/7149","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/users\/6"}],"replies":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/comments?post=7149"}],"version-history":[{"count":13,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/7149\/revisions"}],"predecessor-version":[{"id":144762,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/7149\/revisions\/144762"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media\/34214"}],"wp:attachment":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media?parent=7149"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/categories?post=7149"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/tags?post=7149"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}