{"id":118013,"date":"2024-11-04T18:00:12","date_gmt":"2024-11-04T12:30:12","guid":{"rendered":"https:\/\/data-flair.training\/blogs\/?p=118013"},"modified":"2026-06-01T14:34:49","modified_gmt":"2026-06-01T09:04:49","slug":"sms-spam-detection-using-machine-learning","status":"publish","type":"post","link":"https:\/\/data-flair.training\/blogs\/sms-spam-detection-using-machine-learning\/","title":{"rendered":"Machine Learning Project &#8211; SMS Spam Detection"},"content":{"rendered":"<p>In the era of digital communication, text messages have become a fundamental part of our lives, offering various opportunities for interaction. To address the disruption and security risks caused by spam messages, our &#8220;SMS Spam Detection using Machine Learning&#8221; project uses machine learning to separate spam from legitimate messages.<\/p>\n<p>By combining the Naive Bayes classifier with TF-IDF vectorization, we build a spam detection model that distinguishes legitimate messages (ham) from unwanted messages (spam). The project involves visualizing the label distribution, analyzing word statistics, and evaluating the model&#8217;s performance using accuracy, precision, recall, and F1-score.<\/p>\n<p>Through this effort, we aim to provide a useful tool for filtering unwanted messages and shielding users from SMS spam.<\/p>\n<h3>About the SMS Spam Dataset<\/h3>\n<p>The SMS Spam Collection is a <a href=\"https:\/\/www.kaggle.com\/datasets\/uciml\/sms-spam-collection-dataset\">dataset<\/a> of 5,574 English SMS messages, each labeled as either &#8220;ham&#8221; (legitimate) or &#8220;spam,&#8221; gathered for SMS spam research. 
The CSV file contains one message per row, with two columns: &#8220;v1&#8221; holding the label (ham or spam) and &#8220;v2&#8221; holding the raw text.<\/p>\n<p>The corpus was compiled from several sources on the Internet. It includes 425 SMS spam messages extracted from the Grumbletext website, which involved the challenging task of identifying spam claims on web pages. It also incorporates a subset of 3,375 randomly selected ham messages from the NUS SMS Corpus and 450 SMS ham messages from Caroline Tag&#8217;s PhD thesis.<\/p>\n<p>Finally, the corpus includes the SMS Spam Corpus v.0.1 Big, which comprises 1,002 ham messages and 322 spam messages and has been used in academic research on SMS spam detection. Together, these sources make the dataset a useful basis for studying and developing SMS spam detection models.<\/p>\n<h3>Tools and Libraries Used<\/h3>\n<p><strong>1. NumPy (import numpy as np):<\/strong> NumPy is a library for numerical computing in Python. It supports large, multi-dimensional arrays and matrices and provides a wide range of mathematical functions that operate on them efficiently.<\/p>\n<p><strong>2. pandas (import pandas as pd):<\/strong> pandas is a popular library for data manipulation and analysis in Python. It provides data structures such as DataFrame that make it easy to handle and analyze structured data.<\/p>\n<p><strong>3. Matplotlib (import matplotlib.pyplot as plt):<\/strong> Matplotlib is a widely used library for creating visualizations and plots in Python. It offers a variety of plotting functions and customization options to visualize data effectively.<\/p>\n<p><strong>4. seaborn (import seaborn as sns):<\/strong> seaborn is a data visualization library built on Matplotlib. It provides a high-level interface for creating attractive and informative statistical graphics.<\/p>\n<p><strong>5. scikit-learn (from sklearn&#8230;):<\/strong> scikit-learn is a comprehensive machine learning library in Python. 
It offers various algorithms and tools for model selection, data preprocessing, and evaluation.<\/p>\n<h3>Download Machine Learning SMS Spam Detection Project<\/h3>\n<p>Please download the Machine Learning SMS Spam Detection Project source code from the following link: <a href=\"https:\/\/drive.google.com\/file\/d\/1KHlSHvQh0RViIFSLsO53fMTLilW-auAr\/view?usp=drive_link\"><strong>Machine Learning SMS Spam Detection Project Code.<\/strong><\/a><\/p>\n<h3>Steps to Detect Spam in SMS Using Machine Learning<\/h3>\n<h4>Step 1: Reading and Preprocessing the Data<\/h4>\n<p>We start by reading the SMS Spam data from a CSV file using pandas&#8217; `read_csv` function with `latin-1` encoding. Then, we drop the unnecessary columns (&#8216;Unnamed: 2&#8217;, &#8216;Unnamed: 3&#8217;, &#8216;Unnamed: 4&#8217;) using the `drop` method. The &#8216;v1&#8217; and &#8216;v2&#8217; columns are renamed to &#8216;label&#8217; and &#8216;text&#8217;, respectively, so that their purpose is clear. Finally, we create a new column &#8216;label_enc&#8217; that maps the &#8216;ham&#8217; and &#8216;spam&#8217; labels to the numerical values 0 and 1.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">import numpy as np\r\nimport pandas as pd\r\n# Reading the data (adjust the path to wherever spam.csv is saved)\r\ndata = pd.read_csv(\"spam.csv\", encoding='latin-1')\r\n# Dropping the unused extra columns\r\ndata = data.drop(['Unnamed: 2', 'Unnamed: 3', 'Unnamed: 4'], axis=1)\r\n# Renaming the remaining columns to descriptive names\r\ndata = data.rename(columns={'v1': 'label', 'v2': 'text'})\r\n# Encoding the labels numerically: ham as 0, spam as 1\r\ndata['label_enc'] = data['label'].map({'ham': 0, 'spam': 1})<\/pre>\n<h4>Step 2: Visualizing Label Distribution<\/h4>\n<p>To see how balanced the ham and spam classes are in the dataset, we plot the label counts using seaborn&#8217;s `countplot` and matplotlib.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">import matplotlib.pyplot as plt\r\nimport seaborn as sns\r\n# Visualizing label 
distribution\r\nsns.countplot(x=data['label'])\r\n\r\nplt.xlabel('Label')\r\nplt.ylabel('Count')\r\nplt.title('Distribution of Labels')\r\nplt.show()<\/pre>\n<p><strong>Output<\/strong><\/p>\n<p><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2024\/11\/label-distribution.webp\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-143549 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2024\/11\/label-distribution.webp\" alt=\"label distribution\" width=\"1920\" height=\"850\" \/><\/a><\/p>\n<h4>Step 3: Calculating Average Number of Tokens and Total Unique Words<\/h4>\n<p>We calculate the average number of tokens (words) in all sentences and the total number of unique words in the corpus.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\"># Average number of tokens in all sentences\r\navg_words_len = round(sum([len(text.split()) for text in data['text']]) \/ len(data['text']))\r\nprint(\"Average number of words per sentence:\", avg_words_len)\r\n# Total number of unique words in corpus\r\nunique_words = set()\r\nfor text in data['text']:\r\n    for word in text.split():\r\n        unique_words.add(word)\r\ntotal_words_length = len(unique_words)\r\nprint(\"Total number of unique words in corpus:\", total_words_length)<\/pre>\n<p><strong>Output:<\/strong><\/p>\n<p><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2024\/11\/average-and-total-tokens.webp\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-143550 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2024\/11\/average-and-total-tokens.webp\" alt=\"average and total tokens\" width=\"1920\" height=\"855\" \/><\/a><\/p>\n<h4>Step 4: Splitting Data into Training and Testing Sets<\/h4>\n<p>We split the data into training and testing sets using scikit-learn&#8217;s `train_test_split` function. 
The data is split in an 80-20 ratio, with 80% used for training and 20% for testing. Setting `random_state=42` makes the split reproducible.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">from sklearn.model_selection import train_test_split\r\n# Splitting data into training and testing sets\r\nX, y = np.asarray(data['text']), np.asarray(data['label_enc'])\r\nX_train, X_test, y_train, y_test = train_test_split(\r\n    X, y, test_size=0.2, random_state=42)\r\n\r\nprint(\"Training set shape:\", X_train.shape, y_train.shape)\r\nprint(\"Testing set shape:\", X_test.shape, y_test.shape)<\/pre>\n<h4>Step 5: Creating TF-IDF Vectors<\/h4>\n<p>We create TF-IDF vectors for the training and testing text data using scikit-learn&#8217;s `TfidfVectorizer`. TF-IDF (Term Frequency-Inverse Document Frequency) represents each message as a numerical vector in which a word&#8217;s weight grows with how often it appears in that message and shrinks with how many messages in the corpus contain it. The vectorizer is fitted on the training set only and then applied to both sets, so no information from the test set leaks into the features.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">from sklearn.feature_extraction.text import TfidfVectorizer\r\n# Creating TF-IDF vectors\r\ntv = TfidfVectorizer().fit(X_train)\r\nX_train_tv, X_test_tv = tv.transform(X_train), tv.transform(X_test)<\/pre>\n<h4>Step 6: Training a Baseline Model (Naive Bayes)<\/h4>\n<p>We train a baseline model using the Naive Bayes classifier, implemented by scikit-learn&#8217;s `MultinomialNB`.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">from sklearn.naive_bayes import MultinomialNB\r\n# Training a baseline model (Naive Bayes)\r\nmodel = MultinomialNB()\r\nmodel.fit(X_train_tv, y_train)<\/pre>\n<h4>Step 7: Evaluating the Baseline Model<\/h4>\n<p>We evaluate the trained model on the test set using accuracy, precision, recall, and F1-score, computed with scikit-learn&#8217;s `classification_report` and `accuracy_score`.<br \/>\n<strong>The accuracy on the test set is 96%.<\/strong><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">from sklearn.metrics import classification_report, accuracy_score\r\n# 
Evaluating the baseline model\r\ny_pred = model.predict(X_test_tv)\r\naccuracy = accuracy_score(y_test, y_pred)\r\nprint(\"Accuracy of the model:\", accuracy)\r\nprint(classification_report(y_test, y_pred))<\/pre>\n<p><strong>Output:<\/strong><\/p>\n<p><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2024\/11\/classification-report.webp\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-143551 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2024\/11\/classification-report.webp\" alt=\"classification report\" width=\"1920\" height=\"844\" \/><\/a><\/p>\n<h4>Step 8: Plotting the Confusion Matrix<\/h4>\n<p>To visually analyze the model&#8217;s performance, we plot the confusion matrix as a heatmap using seaborn&#8217;s `heatmap` function. Rows correspond to the actual labels and columns to the predicted labels (0 for ham, 1 for spam).<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">from sklearn.metrics import confusion_matrix\r\n# Plotting the confusion matrix\r\ny_pred = model.predict(X_test_tv)\r\ncm = confusion_matrix(y_test, y_pred)\r\nplt.figure(figsize=(8, 6))\r\nsns.heatmap(cm, annot=True, fmt=\"d\", cmap=\"Blues\")\r\nplt.xlabel(\"Predicted Label\")\r\nplt.ylabel(\"Actual Label\")\r\nplt.title(\"Confusion Matrix\")\r\nplt.show()<\/pre>\n<p><strong>Output:<\/strong><\/p>\n<p><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2024\/11\/confusion-matrix.webp\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter wp-image-143552 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2024\/11\/confusion-matrix.webp\" alt=\"confusion matrix\" width=\"1920\" height=\"853\" \/><\/a><\/p>\n<h3>Summary<\/h3>\n<p>In conclusion, our &#8220;SMS Spam Detection Using Machine Learning&#8221; project implements a spam detection model that combines the Naive Bayes classifier with TF-IDF vectorization. 
We preprocess the data, visualize the label distribution, and calculate word statistics to understand the corpus. The model reaches 96% accuracy on the test set, and we also report its precision, recall, and F1-score. A confusion matrix heatmap shows how the model&#8217;s predictions for ham and spam messages compare with the actual labels.<\/p>\n<p>This project helps filter unwanted spam messages, improving the efficiency and security of digital communication. By applying machine learning techniques, we contribute to a safer SMS experience and a more trustworthy messaging environment.<\/p>\n<p>You can check out more such machine learning projects on DataFlair.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>In the era of digital communication, text messages have become a fundamental part of our lives, offering various opportunities for interaction. 
To address the disruptive and security risks of spam messages, our &#8220;SMS Spam&#46;&#46;&#46;<\/p>\n","protected":false},"author":86671,"featured_media":132918,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[36],"tags":[21459,20614,30279,30281,30280,22490,30284,30283,30282],"class_list":["post-118013","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-machine-learning","tag-machine-learning-project-ideas","tag-machine-learning-projects","tag-machine-learning-projects-for-practice","tag-machine-learning-sms-spam-detection","tag-machine-learning-sms-spam-detection-project","tag-ml-projects","tag-sms-spam-detection","tag-sms-spam-detection-project","tag-sms-spam-detection-using-machine-learning"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.8 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Machine Learning Project - SMS Spam Detection - DataFlair<\/title>\n<meta name=\"description\" content=\"Machine Learning SMS Spam Detection project implements an efficient spam detection model and TF-IDF vectorization for high accuracy.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/data-flair.training\/blogs\/sms-spam-detection-using-machine-learning\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Machine Learning Project - SMS Spam Detection - DataFlair\" \/>\n<meta property=\"og:description\" content=\"Machine Learning SMS Spam Detection project implements an efficient spam detection model and TF-IDF vectorization for high accuracy.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/data-flair.training\/blogs\/sms-spam-detection-using-machine-learning\/\" \/>\n<meta 
property=\"og:site_name\" content=\"DataFlair\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/DataFlairWS\/\" \/>\n<meta property=\"article:published_time\" content=\"2024-11-04T12:30:12+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2026-06-01T09:04:49+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2023\/12\/machine-learning-sms-spam-detection-project.webp\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"628\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/webp\" \/>\n<meta name=\"author\" content=\"TechVidvan Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:site\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"TechVidvan Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"5 minutes\" \/>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"Machine Learning Project - SMS Spam Detection - DataFlair","description":"Machine Learning SMS Spam Detection project implements an efficient spam detection model and TF-IDF vectorization for high accuracy.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/data-flair.training\/blogs\/sms-spam-detection-using-machine-learning\/","og_locale":"en_US","og_type":"article","og_title":"Machine Learning Project - SMS Spam Detection - DataFlair","og_description":"Machine Learning SMS Spam Detection project implements an efficient spam detection model and TF-IDF vectorization for high accuracy.","og_url":"https:\/\/data-flair.training\/blogs\/sms-spam-detection-using-machine-learning\/","og_site_name":"DataFlair","article_publisher":"https:\/\/www.facebook.com\/DataFlairWS\/","article_published_time":"2024-11-04T12:30:12+00:00","article_modified_time":"2026-06-01T09:04:49+00:00","og_image":[{"width":1200,"height":628,"url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2023\/12\/machine-learning-sms-spam-detection-project.webp","type":"image\/webp"}],"author":"TechVidvan Team","twitter_card":"summary_large_image","twitter_creator":"@DataFlairWS","twitter_site":"@DataFlairWS","twitter_misc":{"Written by":"TechVidvan Team","Est. 
reading time":"5 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/data-flair.training\/blogs\/sms-spam-detection-using-machine-learning\/#article","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/sms-spam-detection-using-machine-learning\/"},"author":{"name":"TechVidvan Team","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/0e594f928e31fc96628ac40f6ae74f49"},"headline":"Machine Learning Project &#8211; SMS Spam Detection","datePublished":"2024-11-04T12:30:12+00:00","dateModified":"2026-06-01T09:04:49+00:00","mainEntityOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/sms-spam-detection-using-machine-learning\/"},"wordCount":871,"commentCount":0,"publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/sms-spam-detection-using-machine-learning\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2023\/12\/machine-learning-sms-spam-detection-project.webp","keywords":["Machine Learning Project Ideas","machine learning projects","machine learning projects for practice","machine learning sms spam detection","machine learning sms spam detection project","ml projects","sms spam detection","sms spam detection project","sms spam detection using machine learning"],"articleSection":["Machine Learning Tutorials"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/data-flair.training\/blogs\/sms-spam-detection-using-machine-learning\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/data-flair.training\/blogs\/sms-spam-detection-using-machine-learning\/","url":"https:\/\/data-flair.training\/blogs\/sms-spam-detection-using-machine-learning\/","name":"Machine Learning Project - SMS Spam Detection - 
DataFlair","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/#website"},"primaryImageOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/sms-spam-detection-using-machine-learning\/#primaryimage"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/sms-spam-detection-using-machine-learning\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2023\/12\/machine-learning-sms-spam-detection-project.webp","datePublished":"2024-11-04T12:30:12+00:00","dateModified":"2026-06-01T09:04:49+00:00","description":"Machine Learning SMS Spam Detection project implements an efficient spam detection model and TF-IDF vectorization for high accuracy.","breadcrumb":{"@id":"https:\/\/data-flair.training\/blogs\/sms-spam-detection-using-machine-learning\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/data-flair.training\/blogs\/sms-spam-detection-using-machine-learning\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/sms-spam-detection-using-machine-learning\/#primaryimage","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2023\/12\/machine-learning-sms-spam-detection-project.webp","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2023\/12\/machine-learning-sms-spam-detection-project.webp","width":1200,"height":628,"caption":"machine learning sms spam detection project"},{"@type":"BreadcrumbList","@id":"https:\/\/data-flair.training\/blogs\/sms-spam-detection-using-machine-learning\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Blog Home","item":"https:\/\/data-flair.training\/blogs\/"},{"@type":"ListItem","position":2,"name":"Machine Learning Tutorials","item":"https:\/\/data-flair.training\/blogs\/category\/machine-learning\/"},{"@type":"ListItem","position":3,"name":"Machine Learning Project &#8211; SMS Spam 
Detection"}]},{"@type":"WebSite","@id":"https:\/\/data-flair.training\/blogs\/#website","url":"https:\/\/data-flair.training\/blogs\/","name":"DataFlair","description":"Learn Today. Lead Tomorrow.","publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/data-flair.training\/blogs\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/data-flair.training\/blogs\/#organization","name":"DataFlair","url":"https:\/\/data-flair.training\/blogs\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","width":106,"height":48,"caption":"DataFlair"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DataFlairWS\/","https:\/\/x.com\/DataFlairWS","https:\/\/www.linkedin.com\/company\/dataflair-web-services-pvt-ltd\/","https:\/\/www.youtube.com\/user\/DataFlairWS"]},{"@type":"Person","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/0e594f928e31fc96628ac40f6ae74f49","name":"TechVidvan Team","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/c89190da3d4010c71ba476b618ab10fdc2335c82cdfa0ad5002d98d0f2473444?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/c89190da3d4010c71ba476b618ab10fdc2335c82cdfa0ad5002d98d0f2473444?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/c89190da3d4010c71ba476b618ab10fdc2335c82cdfa0ad5002d98d0f2473444?s=96&d=mm&r=g","caption":"TechVidvan 
Team"},"description":"TechVidvan Team provides high-quality content &amp; courses on AI, ML, Data Science, Data Engineering, Data Analytics, programming, Python, DSA, Android, Flutter, full stack web dev, MERN, and many latest technology.","url":"https:\/\/data-flair.training\/blogs\/author\/test001\/"}]}},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/118013","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/users\/86671"}],"replies":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/comments?post=118013"}],"version-history":[{"count":11,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/118013\/revisions"}],"predecessor-version":[{"id":148727,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/118013\/revisions\/148727"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media\/132918"}],"wp:attachment":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media?parent=118013"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/categories?post=118013"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/tags?post=118013"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}