

{"id":9489,"date":"2018-02-27T13:28:05","date_gmt":"2018-02-27T13:28:05","guid":{"rendered":"https:\/\/data-flair.training\/blogs\/?p=9489"},"modified":"2021-05-28T13:01:53","modified_gmt":"2021-05-28T07:31:53","slug":"text-mining","status":"publish","type":"post","link":"https:\/\/data-flair.training\/blogs\/text-mining\/","title":{"rendered":"What is Text Mining in Data Mining &#8211; Process &amp; Applications"},"content":{"rendered":"<div>\n<div class=\"\">\n<h2 class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><span class=\"complexword\">1. Text Mining &#8211; Objective<\/span><\/h2>\n<div class=\"\">\n<p>In this Text Mining tutorial, we will learn what text mining is, the text mining process, and its areas, approaches, issues, and applications, along with the advantages and disadvantages of text mining.<\/p>\n<\/div>\n<div id=\"attachment_9533\" style=\"width: 1210px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Text-Mining-01.jpg\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-9533\" class=\"wp-image-9533 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Text-Mining-01.jpg\" alt=\"Text Mining in Data Mining - Concepts, Process &amp; Applications\" width=\"1200\" height=\"628\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Text-Mining-01.jpg 1200w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Text-Mining-01-150x79.jpg 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Text-Mining-01-300x157.jpg 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Text-Mining-01-768x402.jpg 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Text-Mining-01-1024x536.jpg 1024w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" 
\/><\/a><p id=\"caption-attachment-9533\" class=\"wp-caption-text\">Text Mining in Data Mining &#8211; Concepts, Process &amp; Applications<\/p><\/div>\n<\/div>\n<div class=\"\">\n<h2 class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">2. What is Text Mining?<\/h2>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><em>Text Mining is also known as Text Data Mining<\/em>. Its purpose is to process unstructured information and extract meaningful numeric indices from the text, thus making the information contained in the text accessible to various algorithms. Information can be extracted to derive summaries of the content of the documents. Hence, you can analyze the words and clusters of words used in documents. In the most general terms, text mining will &#8220;turn text into numbers&#8221; that can then be used in other analyses, such as predictive data mining projects or the application of unsupervised learning methods.<\/div>\n<\/div>\n<\/div>\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><\/div>\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><a href=\"https:\/\/data-flair.training\/blogs\/data-mining\/\"><strong>Read more about Data Mining in detail<\/strong><\/a><\/div>\n<div>\n<div class=\"\">\n<h2 class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">3. 
Areas of Text Mining in Data Mining<\/h2>\n<p>The following are the areas of text mining in data mining:<\/p>\n<div id=\"attachment_9534\" style=\"width: 1210px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Areas-of-Text-Mining-01.jpg\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-9534\" class=\"wp-image-9534 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Areas-of-Text-Mining-01.jpg\" alt=\"Areas of Text Mining\" width=\"1200\" height=\"628\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Areas-of-Text-Mining-01.jpg 1200w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Areas-of-Text-Mining-01-150x79.jpg 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Areas-of-Text-Mining-01-300x157.jpg 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Areas-of-Text-Mining-01-768x402.jpg 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Areas-of-Text-Mining-01-1024x536.jpg 1024w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/a><p id=\"caption-attachment-9534\" class=\"wp-caption-text\">Areas of Text Mining in Data Mining<\/p><\/div>\n<\/div>\n<div class=\"\">\n<h3 class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">a. Information Retrieval (IR)<\/h3>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">Information retrieval is regarded as an extension of document retrieval in which the documents that are returned are processed to condense them. Thus, document retrieval is followed by a text summarization stage. 
This stage focuses on the query posed by the user. IR systems help to narrow down the set of documents that are relevant to a particular problem. Because text mining involves applying very complex algorithms to large document collections, IR can speed up the analysis significantly by reducing the number of documents.<\/div>\n<\/div>\n<div class=\"\">\n<h3 class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">b. Data Mining (DM)<\/h3>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><a href=\"https:\/\/data-flair.training\/blogs\/data-mining-applications\/\"><strong>Data mining<\/strong> <\/a>can be loosely described as looking for patterns in data. It can be characterized more precisely as the extraction of hidden information from data. Data mining tools can predict behaviours and future trends, which allows businesses to make positive, knowledge-based decisions. Data mining tools can also answer business questions that have traditionally been too time-consuming to resolve. They search databases for hidden and unknown patterns.<\/div>\n<\/div>\n<\/div>\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><\/div>\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><strong><a href=\"https:\/\/data-flair.training\/blogs\/data-mining-tools-techniques\/\">Follow this link to know about Data Mining Tools<\/a><\/strong><\/div>\n<div>\n<div class=\"\">\n<h3 class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">c. Natural Language Processing (NLP)<\/h3>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">NLP is one of the oldest and most challenging problems. It is the study of human language so that computers can understand natural languages as humans do. 
NLP research pursues the vague question of how we understand the meaning of a sentence or a document: what are the indications we use to understand who did what to whom? The role of NLP in text mining is to supply input to the system in the information extraction phase.<\/div>\n<\/div>\n<div class=\"\">\n<h3 class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">d. Information Extraction (IE)<\/h3>\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">Information Extraction is the task of automatically extracting structured information from unstructured documents. In most cases, this activity involves processing human language texts by means of NLP.<\/div>\n<\/div>\n<div class=\"\">\n<h2 class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">4. Text Mining Process<\/h2>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">The text mining process involves a series of activities that are performed to mine the information. 
These activities are:<\/div>\n<\/div>\n<\/div>\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><\/div>\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">\n<div id=\"attachment_9535\" style=\"width: 1210px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Process-of-Text-Mining.jpg\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-9535\" class=\"wp-image-9535 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Process-of-Text-Mining.jpg\" alt=\"Process of Text Mining\" width=\"1200\" height=\"628\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Process-of-Text-Mining.jpg 1200w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Process-of-Text-Mining-150x79.jpg 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Process-of-Text-Mining-300x157.jpg 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Process-of-Text-Mining-768x402.jpg 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Process-of-Text-Mining-1024x536.jpg 1024w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/a><p id=\"caption-attachment-9535\" class=\"wp-caption-text\">The process of Text Mining<\/p><\/div>\n<\/div>\n<div>\n<div class=\"\">\n<h3 class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">a. 
Text Pre-processing<\/h3>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">It involves a series of steps, as shown below:<\/div>\n<\/div>\n<div class=\"\">\n<ul>\n<li class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><strong>Text Cleanup<\/strong><\/li>\n<\/ul>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">Text cleanup means removing any unnecessary or unwanted information, such as removing ads from web pages or normalizing text converted from binary formats.<\/div>\n<\/div>\n<div class=\"\">\n<ul>\n<li class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><strong>Tokenization<\/strong><\/li>\n<\/ul>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">In its simplest form, tokenizing is achieved by splitting the text on white space.<\/div>\n<\/div>\n<div class=\"\">\n<ul>\n<li class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><strong>Part of Speech Tagging<\/strong><\/li>\n<\/ul>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">Part-of-Speech (POS) tagging means assigning a word class to each token. Its input is the tokenized text. Taggers have to cope with unknown words (the out-of-vocabulary, or OOV, problem) and ambiguous word-tag mappings.<\/div>\n<\/div>\n<div class=\"\">\n<h3 class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">b. Text Transformation (Attribute Generation)<\/h3>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">A text document is represented by the words it contains and their occurrences. 
The two main approaches to document representation are:<\/div>\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><\/div>\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">i. Bag of words<\/div>\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">ii. Vector Space<\/div>\n<\/div>\n<div class=\"\">\n<h3 class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">c. Feature Selection (Attribute Selection)<\/h3>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">Feature selection is also known as variable selection. It is the process of selecting a subset of important features for use in model creation. Redundant features are those that provide no extra information. Irrelevant features provide no useful or relevant information in any context.<\/div>\n<\/div>\n<div class=\"\">\n<h3 class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">d. Data Mining<\/h3>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">At this point, the text mining process merges with the traditional data mining process. Classic data mining techniques are applied to the structured database that resulted from the previous stages.<\/div>\n<\/div>\n<div class=\"\">\n<h3 class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><span class=\"complexword\">e. Evaluate<\/span><\/h3>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">Evaluate the result. After evaluation, the result may be discarded.<\/div>\n<\/div>\n<div class=\"\">\n<h3 class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">f. 
Applications<\/h3>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">Text mining is applied in a variety of areas. Some of the most common areas are:<\/div>\n<\/div>\n<\/div>\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><\/div>\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><strong><a href=\"https:\/\/data-flair.training\/blogs\/data-mining-process\/\">Read more about Data Mining Process in detail<\/a><\/strong><\/div>\n<div>\n<div class=\"\">\n<ul>\n<li class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><strong>Web Mining<\/strong><\/li>\n<\/ul>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">These days, the web contains a treasure of information about subjects such as persons, companies, organizations, and products that may be of wide interest. Web mining is an application of data mining techniques to discover hidden and unknown patterns on the Web. Web mining is the activity of identifying terms implied in a large document collection, say C, which can be denoted by a mapping, i.e. C \u2192p [10].<\/div>\n<\/div>\n<div class=\"\">\n<ul>\n<li class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><strong>Medical<\/strong><\/li>\n<\/ul>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">Users exchange information with others about subjects of interest. People want to understand specific diseases and to be informed about new therapies. These expert forums also serve as medical seismographs. Users also send e-mails, e-consultations, and requests for medical advice. 
Such messages sent via the internet have been analyzed using quantitative or qualitative methods.<\/div>\n<\/div>\n<div class=\"\">\n<ul>\n<li class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><strong>Resume Filtering<\/strong><\/li>\n<\/ul>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">Big enterprises and headhunters receive thousands of resumes from job applicants every day. Extracting information from resumes with high precision and recall is not easy. Automatically extracting this information can be the first step in filtering resumes. Hence, automating the process of resume selection is an important task.<\/div>\n<\/div>\n<\/div>\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><\/div>\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><a href=\"https:\/\/data-flair.training\/blogs\/advantages-of-data-mining\/\"><strong>Let&#8217;s look at Data Mining Advantages in detail<\/strong><\/a><\/div>\n<div>\n<div class=\"\">\n<h2 class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">5. Approaches to Text Mining in Data Mining<\/h2>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">One approach is to use well-tested methods and to understand the results of text mining. Once a data matrix has been computed from the input documents and the words found in those documents, various well-known analytic techniques can be used for further processing of those data, including methods for clustering.<\/div>\n<\/div>\n<div class=\"\"><\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">The other approach is &#8220;black-box&#8221; text mining and extraction of concepts. 
There are text mining applications that offer &#8220;black-box&#8221; methods to extract &#8220;deep meaning&#8221; from documents with little human effort. These text mining applications rely on proprietary algorithms.<\/div>\n<\/div>\n<\/div>\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><\/div>\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><a href=\"https:\/\/data-flair.training\/blogs\/disadvantages-of-data-mining\/\"><strong>Let&#8217;s read about Data Mining Disadvantages\u00a0<\/strong><\/a><\/div>\n<div>\n<div class=\"\">\n<h2 class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">6. Numericizing Text<\/h2>\n<p>The following are issues and considerations for numericizing text.<\/p>\n<\/div>\n<div class=\"\">\n<h4 class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><strong>i. Large numbers of large documents<\/strong><\/h4>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">Examples of scenarios using large numbers of small documents were given earlier. However, if your intent is to extract &#8220;concepts&#8221; from only a few documents that are very large, then analyses are less powerful because the &#8220;number of cases&#8221; (documents) is very small while the &#8220;number of variables&#8221; (extracted words) is very large.<\/div>\n<\/div>\n<div class=\"\">\n<h4 class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><strong>ii. Excluding certain characters, short words, numbers, etc<\/strong><\/h4>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">Excluding numbers and certain characters can be done easily before the indexing of the input documents starts. You may also want to exclude &#8220;rare words&#8221;. 
Rare words are defined as those that occur in only a small percentage of the processed documents.<\/div>\n<\/div>\n<div class=\"\">\n<h4 class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><strong>iii. Include lists, exclude lists (stop-words)<\/strong><\/h4>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">An include list is useful when you want to search for particular words and classify the input documents based on the frequencies with which those words occur. &#8220;Stop-words&#8221;, i.e., terms that are to be excluded from the indexing, can also be defined. Typically, a default list of English stop words includes &#8220;the&#8221;, &#8220;a&#8221;, &#8220;of&#8221;, and &#8220;since&#8221;, that is, words that are used very frequently in the respective language but communicate very little unique information about the contents of the document.<\/div>\n<\/div>\n<div class=\"\">\n<h4 class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><strong>iv. Synonyms and phrases<\/strong><\/h4>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">Synonyms, such as &#8220;sick&#8221; or &#8220;ill&#8221;, or words that are used in particular phrases where they denote a unique meaning, can be combined for indexing.<\/div>\n<\/div>\n<div class=\"\"><\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><strong>For example-<\/strong><\/div>\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><\/div>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">&#8220;Microsoft Windows&#8221; might be such a phrase. 
It is a specific reference to the computer operating system and has nothing to do with the common use of the term &#8220;windows&#8221; as it might be used, for example, in descriptions of home improvement projects.<\/div>\n<\/div>\n<div class=\"\">\n<h4 class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><strong>v. Stemming algorithms<\/strong><\/h4>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">An important pre-processing step before the indexing of input documents begins is the stemming of words. The term &#8220;stemming&#8221; refers to the reduction of words to their roots so that, for example, different grammatical forms of a word are treated as the same word.<\/div>\n<\/div>\n<div class=\"\">\n<h4 class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><strong>vi. Support for different languages<\/strong><\/h4>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">Stemming, synonyms, and the letters that are permitted in words are all highly language-dependent. Therefore, support for different languages is important.<\/div>\n<\/div>\n<\/div>\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><\/div>\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><a href=\"https:\/\/data-flair.training\/blogs\/data-mining-architecture\/\"><strong>Let&#8217;s discuss Data Mining Architecture in brief<\/strong><\/a><\/div>\n<div>\n<div class=\"\">\n<h2 class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">7. 
Incorporating Text Mining Results<\/h2>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">\n<p class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">After significant words have been extracted from a set of input documents, and after singular value decomposition has been applied to extract salient semantic dimensions, the next and most important step is typically to use the extracted information in a data mining project.<\/p>\n<\/div>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><strong>a. Graphics (visual data mining methods)<\/strong><\/div>\n<\/div>\n<div class=\"\"><\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">Depending on the purpose of the analyses, in some instances the extraction of semantic dimensions alone can be a useful outcome if it clarifies the underlying structure.<\/div>\n<\/div>\n<div class=\"\">\n<p class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><strong>b. Clustering and factoring<\/strong><\/p>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">You can use cluster analysis methods to identify groups of documents, that is, groups of similar input texts. This type of analysis is also useful in the context of market research studies.<\/div>\n<\/div>\n<div class=\"\"><\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">An example is a market research study of new car owners. You can also use Factor Analysis and Principal Components and Classification Analysis.<\/div>\n<\/div>\n<div class=\"\">\n<p class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><strong>c. 
Predictive data mining<\/strong><\/p>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">Another possibility is to use the raw word counts as predictor variables in predictive data mining projects.<\/div>\n<\/div>\n<\/div>\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><\/div>\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><a href=\"https:\/\/data-flair.training\/blogs\/data-mining-techniques\/\"><strong>To know about Data Mining Techniques, follow this link to explore<\/strong><\/a><\/div>\n<div>\n<div class=\"\">\n<h2 class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">8.\u00a0Text Mining Applications<\/h2>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">Unstructured text is very common and may represent the majority of information available to a particular research project.<\/div>\n<\/div>\n<\/div>\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><\/div>\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">\n<div id=\"attachment_9536\" style=\"width: 1210px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Text-Mining-Applications.jpg\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-9536\" class=\"wp-image-9536 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Text-Mining-Applications.jpg\" alt=\"Text Mining Application\" width=\"1200\" height=\"628\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Text-Mining-Applications.jpg 1200w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Text-Mining-Applications-150x79.jpg 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Text-Mining-Applications-300x157.jpg 300w, 
https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Text-Mining-Applications-768x402.jpg 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Text-Mining-Applications-1024x536.jpg 1024w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/a><p id=\"caption-attachment-9536\" class=\"wp-caption-text\">Text Mining Application<\/p><\/div>\n<\/div>\n<div>\n<div class=\"\">\n<h4 class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><strong>a. Analyzing open-ended survey responses<\/strong><\/h4>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">In survey research, it is not uncommon to include various open-ended questions pertaining to the topic under investigation. The idea is to permit respondents to express their &#8220;views&#8221; and opinions without constraining them to particular dimensions or a particular response format.<\/div>\n<\/div>\n<div class=\"\">\n<h4 class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><strong>b. Automatic processing of messages, emails, etc<\/strong><\/h4>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">Another common application is to aid in the automatic classification of texts.<\/div>\n<\/div>\n<div class=\"\"><\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><strong>For example-<\/strong><\/div>\n<\/div>\n<div class=\"\"><\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">It is possible to automatically &#8220;filter&#8221; out most undesirable &#8220;junk email&#8221; based on certain terms or words that are not likely to appear in legitimate messages. 
Such terms instead identify undesirable electronic mail. In this manner, such messages can be discarded automatically. Such automatic systems for classifying electronic messages can also be useful in applications where messages need to be routed to the most appropriate department. At the same time, the emails are screened for inappropriate or obscene messages, which are automatically returned to the sender with a request to remove the offending words or content.<\/div>\n<\/div>\n<div class=\"\">\n<h4 class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><strong>c. Analyzing warranty or insurance claims, diagnostic interviews, etc<\/strong><\/h4>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">In some business domains, the majority of information is collected in open-ended form.<\/div>\n<\/div>\n<div class=\"\"><\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><strong>For example-<\/strong><\/div>\n<\/div>\n<div class=\"\"><\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">Warranty claims or initial medical interviews can be summarized in brief narratives. Increasingly, those notes are collected electronically, so those types of narratives are readily available as input. This information can then be usefully exploited. 
Open-ended descriptions by patients of their own symptoms might likewise yield useful clues for the actual medical diagnosis.<\/div>\n<\/div>\n<div class=\"\">\n<h4 class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><strong>d. Investigating competitors by crawling their websites<\/strong><\/h4>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">Another type of application is to process the contents of Web pages in a particular domain.<\/div>\n<\/div>\n<div class=\"\"><\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><strong>For example-<\/strong><\/div>\n<\/div>\n<div class=\"\">\n<div><\/div>\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">You could go to a Web page and begin &#8220;crawling&#8221; the links you find there to process all Web pages that are referenced. In this manner, you could derive a list of terms and documents available at that site and hence determine the most important terms and features that are described.<\/div>\n<\/div>\n<\/div>\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><\/div>\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><a href=\"https:\/\/data-flair.training\/blogs\/data-mining-and-knowledge-discovery\/\"><strong>Know about Data Mining And Knowledge Discovery Database<\/strong><\/a><\/div>\n<div>\n<div class=\"\">\n<h2 class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">9.\u00a0Advantages &amp; Disadvantages of Text Mining<\/h2>\n<p>The following are the pros and cons of Text Mining in Data Mining:<\/p>\n<\/div>\n<div class=\"\">\n<h3 class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">a. 
Advantages\u00a0of Text Mining<\/h3>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">Web mining <span class=\"adverb\">essentially<\/span> has many advantages that make this technology attractive to corporations and government agencies. This technology has enabled e-commerce to do personalized marketing, which <span class=\"adverb\">eventually<\/span> results in higher trade volumes. Government agencies are using this technology to classify threats. Its predictive capability can benefit society by identifying criminal activities. Companies can establish better customer relationships by giving customers exactly what they need. Companies can also understand the needs of the customer better and react to those needs faster.<\/div>\n<\/div>\n<div class=\"\">\n<h3 class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">b. Disadvantages of Text Mining<\/h3>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">Web mining technology itself doesn\u2019t create issues, although it might cause concerns when used on data of a personal nature.<\/div>\n<\/div>\n<div class=\"\"><\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">The most criticized ethical issue involving web mining is the invasion of privacy. Privacy <span class=\"passivevoice\">is considered<\/span> lost when information <span class=\"complexword\">concerning<\/span> an individual <span class=\"passivevoice\">is obtained<\/span>. The obtained data will be <span class=\"passivevoice\">analyzed<\/span> and clustered to form profiles. The data will be made anonymous before clustering, so that no individual can be\u00a0<span class=\"passivevoice\">linked<\/span>\u00a0<span class=\"adverb\">directly<\/span> to a profile. 
But usually, the group profiles <span class=\"passivevoice\">are used<\/span> as if they were personal profiles. Thus, these applications de-individualize users by judging them by their mouse clicks.<\/div>\n<\/div>\n<div class=\"\"><\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">De-individualization can be <span class=\"passivevoice\">defined<\/span>\u00a0as a tendency to judge and treat people on the basis of group characteristics.<\/div>\n<\/div>\n<div class=\"\"><\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">Another important concern is that companies collecting data for a specific purpose might use the data for a <span class=\"adverb\">totally<\/span> different purpose, which <span class=\"adverb\">essentially<\/span> violates the user\u2019s interests. The growing trend of selling personal data as a commodity encourages website owners to trade personal data obtained from their sites. This trend has increased the amount of data <span class=\"passivevoice\">being captured<\/span> and traded, increasing the likelihood of one\u2019s privacy <span class=\"passivevoice\">being invaded<\/span>.<\/div>\n<\/div>\n<\/div>\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\"><\/div>\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\" style=\"text-align: left\">Let&#8217;s revise <a href=\"https:\/\/data-flair.training\/blogs\/classification-algorithms\/\"><strong>Data Mining Classification algorithms<\/strong><\/a> &amp; <a href=\"https:\/\/data-flair.training\/blogs\/cluster-analysis-data-mining\/\"><strong>Cluster analysis<\/strong><\/a><\/div>\n<div><\/div>\n<div>So, this was all about Text Mining in Data Mining. We hope you like our explanation.<\/div>\n<div>\n<div class=\"\">\n<h2 class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">10. 
Conclusion<\/h2>\n<\/div>\n<div class=\"\">\n<div class=\"public-DraftStyleDefault-block public-DraftStyleDefault-ltr\">In this tutorial, we have studied what Text Mining is. We have also learned its process and approaches, along with the applications and the pros and cons of Text Mining. I hope this blog will help you to understand Text Mining. Furthermore, if you have any queries, feel free to ask in the comment section.<\/div>\n<\/div>\n<\/div>\n<div class=\"\"><\/div>\n<div class=\"\">See Also-<\/div>\n<div class=\"\"><\/div>\n<div class=\"\"><a href=\"https:\/\/data-flair.training\/blogs\/data-mining-terminologies\/\"><strong>Terminologies used in Data Mining\u00a0\u00a0<\/strong><\/a><\/div>\n<div class=\"\"><\/div>\n<div class=\"\"><strong><a href=\"https:\/\/data-flair.training\/blogs\/interview-question-for-data-mining\/\">Mostly asked Interview Questions for Data Mining<\/a><\/strong><\/div>\n<div>\n<div class=\"\"><\/div>\n<div class=\"\"><strong><a href=\"https:\/\/en.wikipedia.org\/wiki\/Text_mining\">For reference<\/a><\/strong><\/div>\n<\/div>\n<p><span hidden class=\"__iawmlf-post-loop-links\" data-iawmlf-links=\"[{&quot;id&quot;:2019,&quot;href&quot;:&quot;https:\\\/\\\/en.wikipedia.org\\\/wiki\\\/Text_mining&quot;,&quot;archived_href&quot;:&quot;http:\\\/\\\/web-wp.archive.org\\\/web\\\/20250916092028\\\/https:\\\/\\\/en.wikipedia.org\\\/wiki\\\/Text_mining&quot;,&quot;redirect_href&quot;:&quot;&quot;,&quot;checks&quot;:[{&quot;date&quot;:&quot;2025-12-10 21:00:34&quot;,&quot;http_code&quot;:200},{&quot;date&quot;:&quot;2025-12-25 18:19:07&quot;,&quot;http_code&quot;:200},{&quot;date&quot;:&quot;2025-12-30 21:19:34&quot;,&quot;http_code&quot;:200},{&quot;date&quot;:&quot;2026-01-04 23:18:53&quot;,&quot;http_code&quot;:200},{&quot;date&quot;:&quot;2026-01-12 06:54:40&quot;,&quot;http_code&quot;:200},{&quot;date&quot;:&quot;2026-01-23 07:40:23&quot;,&quot;http_code&quot;:200},{&quot;date&quot;:&quot;2026-01-27 
04:24:42&quot;,&quot;http_code&quot;:200},{&quot;date&quot;:&quot;2026-01-30 08:20:38&quot;,&quot;http_code&quot;:200},{&quot;date&quot;:&quot;2026-02-07 03:55:45&quot;,&quot;http_code&quot;:200},{&quot;date&quot;:&quot;2026-02-15 20:35:43&quot;,&quot;http_code&quot;:200},{&quot;date&quot;:&quot;2026-02-27 13:46:46&quot;,&quot;http_code&quot;:200},{&quot;date&quot;:&quot;2026-03-04 07:15:10&quot;,&quot;http_code&quot;:200},{&quot;date&quot;:&quot;2026-03-11 13:12:22&quot;,&quot;http_code&quot;:200},{&quot;date&quot;:&quot;2026-03-16 03:28:30&quot;,&quot;http_code&quot;:200},{&quot;date&quot;:&quot;2026-03-21 14:48:09&quot;,&quot;http_code&quot;:200},{&quot;date&quot;:&quot;2026-03-26 03:45:31&quot;,&quot;http_code&quot;:200},{&quot;date&quot;:&quot;2026-03-30 23:26:47&quot;,&quot;http_code&quot;:429},{&quot;date&quot;:&quot;2026-04-10 14:29:59&quot;,&quot;http_code&quot;:200},{&quot;date&quot;:&quot;2026-04-14 13:08:18&quot;,&quot;http_code&quot;:200},{&quot;date&quot;:&quot;2026-04-21 06:07:50&quot;,&quot;http_code&quot;:200},{&quot;date&quot;:&quot;2026-04-25 04:08:10&quot;,&quot;http_code&quot;:200},{&quot;date&quot;:&quot;2026-05-04 21:02:58&quot;,&quot;http_code&quot;:200},{&quot;date&quot;:&quot;2026-05-07 21:53:39&quot;,&quot;http_code&quot;:200},{&quot;date&quot;:&quot;2026-05-11 20:14:14&quot;,&quot;http_code&quot;:200},{&quot;date&quot;:&quot;2026-05-22 04:47:40&quot;,&quot;http_code&quot;:429},{&quot;date&quot;:&quot;2026-05-26 18:41:34&quot;,&quot;http_code&quot;:200},{&quot;date&quot;:&quot;2026-06-16 11:57:57&quot;,&quot;http_code&quot;:200},{&quot;date&quot;:&quot;2026-06-22 19:37:18&quot;,&quot;http_code&quot;:200}],&quot;broken&quot;:false,&quot;last_checked&quot;:{&quot;date&quot;:&quot;2026-06-22 19:37:18&quot;,&quot;http_code&quot;:200},&quot;process&quot;:&quot;done&quot;}]\"><\/span><\/p>\n","protected":false},"excerpt":{"rendered":"<p>1. 
Text Mining &#8211; Objective Through this Text Mining Tutorial, we will learn what is Text Mining, a process of Text Mining,\u00a0Text Mining Applications, approaches, issues, areas, and Advantages and Disadvantages of Text Mining.&#46;&#46;&#46;<\/p>\n","protected":false},"author":6,"featured_media":9533,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[18],"tags":[6702,6703,7157,9010,10150,14650,14654,14655,14656,14657,14759,15165,16017],"class_list":["post-9489","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-data-mining","tag-information-extraction-ie","tag-information-retrieval-ir","tag-introduction-to-text-mining","tag-natural-language-processing-nlp","tag-process-and-applications","tag-text-cleanup","tag-text-mining","tag-text-mining-applications","tag-text-mining-process","tag-text-pre-processing","tag-tokenization","tag-unstructred-data","tag-what-is-text-mining"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.8 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>What is Text Mining in Data Mining - Process &amp; Applications - DataFlair<\/title>\n<meta name=\"description\" content=\"Text Mining Process,areas, Approaches, Text Mining application, Numericizing Text, Advantages &amp; Disadvantages of text mining in data mining,text data mining\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/data-flair.training\/blogs\/text-mining\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"What is Text Mining in Data Mining - Process &amp; Applications - DataFlair\" \/>\n<meta property=\"og:description\" content=\"Text Mining Process,areas, Approaches, Text Mining application, Numericizing 
Text, Advantages &amp; Disadvantages of text mining in data mining,text data mining\" \/>\n<meta property=\"og:url\" content=\"https:\/\/data-flair.training\/blogs\/text-mining\/\" \/>\n<meta property=\"og:site_name\" content=\"DataFlair\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/DataFlairWS\/\" \/>\n<meta property=\"article:published_time\" content=\"2018-02-27T13:28:05+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2021-05-28T07:31:53+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Text-Mining-01.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"628\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"DataFlair Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:site\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"DataFlair Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"11 minutes\" \/>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"What is Text Mining in Data Mining - Process &amp; Applications - DataFlair","description":"Text Mining Process,areas, Approaches, Text Mining application, Numericizing Text, Advantages & Disadvantages of text mining in data mining,text data mining","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/data-flair.training\/blogs\/text-mining\/","og_locale":"en_US","og_type":"article","og_title":"What is Text Mining in Data Mining - Process &amp; Applications - DataFlair","og_description":"Text Mining Process,areas, Approaches, Text Mining application, Numericizing Text, Advantages & Disadvantages of text mining in data mining,text data mining","og_url":"https:\/\/data-flair.training\/blogs\/text-mining\/","og_site_name":"DataFlair","article_publisher":"https:\/\/www.facebook.com\/DataFlairWS\/","article_published_time":"2018-02-27T13:28:05+00:00","article_modified_time":"2021-05-28T07:31:53+00:00","og_image":[{"width":1200,"height":628,"url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Text-Mining-01.jpg","type":"image\/jpeg"}],"author":"DataFlair Team","twitter_card":"summary_large_image","twitter_creator":"@DataFlairWS","twitter_site":"@DataFlairWS","twitter_misc":{"Written by":"DataFlair Team","Est. 
reading time":"11 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/data-flair.training\/blogs\/text-mining\/#article","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/text-mining\/"},"author":{"name":"DataFlair Team","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/2c58ecb4f73a39f0ef993f1ddfcd7b89"},"headline":"What is Text Mining in Data Mining &#8211; Process &amp; Applications","datePublished":"2018-02-27T13:28:05+00:00","dateModified":"2021-05-28T07:31:53+00:00","mainEntityOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/text-mining\/"},"wordCount":2276,"commentCount":3,"publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/text-mining\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Text-Mining-01.jpg","keywords":["Information Extraction (IE)","Information Retrieval (IR)","Introduction to Text Mining","Natural Language Processing (NLP)","process and applications","Text Cleanup","Text mining","Text Mining Applications","Text Mining Process","Text Pre-processing","Tokenization","unstructred data","what is text mining"],"articleSection":["Data Mining Tutorials"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/data-flair.training\/blogs\/text-mining\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/data-flair.training\/blogs\/text-mining\/","url":"https:\/\/data-flair.training\/blogs\/text-mining\/","name":"What is Text Mining in Data Mining - Process &amp; Applications - 
DataFlair","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/#website"},"primaryImageOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/text-mining\/#primaryimage"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/text-mining\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Text-Mining-01.jpg","datePublished":"2018-02-27T13:28:05+00:00","dateModified":"2021-05-28T07:31:53+00:00","description":"Text Mining Process,areas, Approaches, Text Mining application, Numericizing Text, Advantages & Disadvantages of text mining in data mining,text data mining","breadcrumb":{"@id":"https:\/\/data-flair.training\/blogs\/text-mining\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/data-flair.training\/blogs\/text-mining\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/text-mining\/#primaryimage","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Text-Mining-01.jpg","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/02\/Text-Mining-01.jpg","width":1200,"height":628,"caption":"Introduction to Text Mining"},{"@type":"BreadcrumbList","@id":"https:\/\/data-flair.training\/blogs\/text-mining\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Blog Home","item":"https:\/\/data-flair.training\/blogs\/"},{"@type":"ListItem","position":2,"name":"Data Mining Tutorials","item":"https:\/\/data-flair.training\/blogs\/category\/data-mining\/"},{"@type":"ListItem","position":3,"name":"What is Text Mining in Data Mining &#8211; Process &amp; Applications"}]},{"@type":"WebSite","@id":"https:\/\/data-flair.training\/blogs\/#website","url":"https:\/\/data-flair.training\/blogs\/","name":"DataFlair","description":"Learn Today. 
Lead Tomorrow.","publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/data-flair.training\/blogs\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/data-flair.training\/blogs\/#organization","name":"DataFlair","url":"https:\/\/data-flair.training\/blogs\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","width":106,"height":48,"caption":"DataFlair"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DataFlairWS\/","https:\/\/x.com\/DataFlairWS","https:\/\/www.linkedin.com\/company\/dataflair-web-services-pvt-ltd\/","https:\/\/www.youtube.com\/user\/DataFlairWS"]},{"@type":"Person","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/2c58ecb4f73a39f0ef993f1ddfcd7b89","name":"DataFlair Team","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/1ce4a0e3e542444fc73bbebf83e89e8b73e2d95ccb1fcee64da9945f078b97c5?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/1ce4a0e3e542444fc73bbebf83e89e8b73e2d95ccb1fcee64da9945f078b97c5?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/1ce4a0e3e542444fc73bbebf83e89e8b73e2d95ccb1fcee64da9945f078b97c5?s=96&d=mm&r=g","caption":"DataFlair Team"},"description":"The DataFlair Team provides industry-driven content on programming, Java, Python, C++, DSA, AI, ML, data Science, Android, Flutter, MERN, Web Development, and technology. 
Our expert educators focus on delivering value-packed, easy-to-follow resources for tech enthusiasts and professionals.","url":"https:\/\/data-flair.training\/blogs\/author\/dfteam2\/"}]}},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/9489","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/users\/6"}],"replies":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/comments?post=9489"}],"version-history":[{"count":5,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/9489\/revisions"}],"predecessor-version":[{"id":35431,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/9489\/revisions\/35431"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media\/9533"}],"wp:attachment":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media?parent=9489"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/categories?post=9489"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/tags?post=9489"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}