

{"id":95670,"date":"2021-05-21T09:30:02","date_gmt":"2021-05-21T04:00:02","guid":{"rendered":"https:\/\/data-flair.training\/blogs\/?p=95670"},"modified":"2026-06-01T14:37:42","modified_gmt":"2026-06-01T09:07:42","slug":"credit-card-fraud-detection-python-machine-learning","status":"publish","type":"post","link":"https:\/\/data-flair.training\/blogs\/credit-card-fraud-detection-python-machine-learning\/","title":{"rendered":"Credit Card Fraud Detection with Python &amp; Machine Learning"},"content":{"rendered":"<p>For any bank or financial organization, credit card fraud detection is of utmost importance. We have to spot potential fraud so that customers are not billed for goods that they haven&#8217;t purchased. The aim is, therefore, to create a classifier that indicates whether a requested transaction is fraudulent.<\/p>\n<h3>About Credit Card Fraud Detection<\/h3>\n<p>In this machine learning project, we solve the problem of detecting fraudulent credit card transactions using NumPy, scikit-learn, and a few other Python libraries. We approach the problem by creating a binary classifier and experimenting with various machine learning techniques to see which fits better.<\/p>\n<h3>Credit Card Fraud Dataset<\/h3>\n<p>The dataset consists of 31 columns. Due to confidentiality issues, 28 of the features are the result of a PCA transformation. &#8220;Time&#8221; and &#8220;Amount&#8221; are the only features that were not transformed with PCA, and &#8220;Class&#8221; is the label.<\/p>\n<p>There are a total of 284,807 transactions with only 492 of them being fraud. 
So, the label distribution is heavily imbalanced.<\/p>\n<p>Please download the dataset for the credit card fraud detection project: <a href=\"https:\/\/www.kaggle.com\/mlg-ulb\/creditcardfraud\" target=\"_blank\" rel=\"noopener\"><strong>Anonymized Credit Card Transactions for Fraud Detection<\/strong><\/a><\/p>\n<h3>Tools and Libraries used<\/h3>\n<p>We use the following libraries and frameworks in this credit card fraud detection project:<\/p>\n<ul>\n<li>Python &#8211; 3.x<\/li>\n<li>Pandas<\/li>\n<li>NumPy &#8211; 1.19.2<\/li>\n<li>Scikit-learn &#8211; 0.24.1<\/li>\n<li>Matplotlib &#8211; 3.3.4<\/li>\n<li>Imblearn &#8211; 0.8.0<\/li>\n<li>Collections, Itertools<\/li>\n<\/ul>\n<h3>Credit Card Fraud Project Code<\/h3>\n<p>Please download the source code of the credit card fraud detection project (which is explained below): <a href=\"https:\/\/drive.google.com\/file\/d\/1xuLGmha2w8AG8frat2IvHYZlQoyisuAZ\/view?usp=drive_link\"><strong>Credit Card Fraud Detection Machine Learning Code<\/strong><\/a><\/p>\n<h3>Steps to Develop Credit Card Fraud Classifier in Machine Learning<\/h3>\n<p>Our approach to building the classifier is outlined in the following steps:<\/p>\n<ol>\n<li>Perform Exploratory Data Analysis (EDA) on our dataset<\/li>\n<li>Apply different Machine Learning algorithms to our dataset<\/li>\n<li>Train and Evaluate our models on the dataset and pick the best one.<\/li>\n<\/ol>\n<h4>Step 1: Perform Exploratory Data Analysis (EDA)<\/h4>\n<p>Let&#8217;s import the necessary modules, load our dataset, and perform EDA on it. 
Here is a peek at our dataset:<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">import pandas as pd\r\nfrom collections import Counter\r\nimport itertools\r\n\r\n# Load the csv file\r\n\r\ndataframe = pd.read_csv(\".\/Desktop\/DataFlair\/credit_card_fraud_detection\/creditcard.csv\")\r\ndataframe.head()\r\n<\/pre>\n<p><strong>Output:<\/strong><\/p>\n<p><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/data-import.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-95711\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/data-import.png\" alt=\"data import\" width=\"1408\" height=\"733\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/data-import.png 1408w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/data-import-300x156.png 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/data-import-1024x533.png 1024w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/data-import-150x78.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/data-import-768x400.png 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/data-import-720x375.png 720w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/data-import-520x271.png 520w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/data-import-320x167.png 320w\" sizes=\"auto, (max-width: 1408px) 100vw, 1408px\" \/><\/a><\/p>\n<p>Now, check for null values in the credit card dataset. Luckily, there aren&#8217;t any null or NaN values in our dataset.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">dataframe.isnull().values.any()\r\n<\/pre>\n<p>The feature we are most interested in is the &#8220;Amount&#8221;. 
Here is the summary of the feature.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">dataframe[\"Amount\"].describe()\r\n<\/pre>\n<p><strong>Output:<\/strong><\/p>\n<p><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/amount-describe.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-95712\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/amount-describe.png\" alt=\"amount describe\" width=\"1408\" height=\"251\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/amount-describe.png 1408w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/amount-describe-300x53.png 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/amount-describe-1024x183.png 1024w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/amount-describe-150x27.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/amount-describe-768x137.png 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/amount-describe-720x128.png 720w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/amount-describe-520x93.png 520w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/amount-describe-320x57.png 320w\" sizes=\"auto, (max-width: 1408px) 100vw, 1408px\" \/><\/a><\/p>\n<p>Now, let&#8217;s check the number of occurrences of each class label and plot the information using matplotlib.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">non_fraud = len(dataframe[dataframe.Class == 0])\r\nfraud = len(dataframe[dataframe.Class == 1])\r\nfraud_percent = (fraud \/ (fraud + non_fraud)) * 100\r\n\r\nprint(\"Number of Genuine transactions: \", non_fraud)\r\nprint(\"Number of Fraud transactions: \", fraud)\r\nprint(\"Percentage of 
Fraud transactions: {:.4f}\".format(fraud_percent))\r\n<\/pre>\n<p>Let&#8217;s plot the above information using matplotlib.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">import matplotlib.pyplot as plt\r\n\r\nlabels = [\"Genuine\", \"Fraud\"]\r\n# Count the transactions per class (0 = genuine, 1 = fraud)\r\ncount_classes = dataframe[\"Class\"].value_counts(sort=True)\r\ncount_classes.plot(kind=\"bar\", rot=0)\r\nplt.title(\"Visualization of Labels\")\r\nplt.ylabel(\"Count\")\r\nplt.xticks(range(2), labels)\r\nplt.show()\r\n<\/pre>\n<p><strong>Output:<\/strong><\/p>\n<p><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/class-imbalance-plot.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-95713\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/class-imbalance-plot.png\" alt=\"class imbalance plot\" width=\"1408\" height=\"749\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/class-imbalance-plot.png 1408w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/class-imbalance-plot-300x160.png 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/class-imbalance-plot-1024x545.png 1024w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/class-imbalance-plot-150x80.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/class-imbalance-plot-768x409.png 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/class-imbalance-plot-720x383.png 720w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/class-imbalance-plot-520x277.png 520w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/class-imbalance-plot-320x170.png 320w\" sizes=\"auto, (max-width: 1408px) 100vw, 1408px\" \/><\/a><\/p>\n<p>We can observe that the genuine transactions are over 
99%! With such a skewed distribution, accuracy alone is a misleading measure of model quality.<\/p>\n<p>Let&#8217;s standardize the &#8220;Amount&#8221; feature with StandardScaler to rescale its range of values. We drop the original &#8220;Amount&#8221; column and add a new column with the scaled values. We also drop the &#8220;Time&#8221; column, as it is not used as a feature here.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">import numpy as np\r\nfrom sklearn.preprocessing import StandardScaler\r\n\r\nscaler = StandardScaler()\r\ndataframe[\"NormalizedAmount\"] = scaler.fit_transform(dataframe[\"Amount\"].values.reshape(-1, 1))\r\ndataframe.drop([\"Amount\", \"Time\"], inplace= True, axis= 1)\r\n\r\nY = dataframe[\"Class\"]\r\nX = dataframe.drop([\"Class\"], axis= 1)\r\n<\/pre>\n<p>Now, it&#8217;s time to split the data into training and test sets in a 70-30 ratio using train_test_split().<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">from sklearn.model_selection import train_test_split\r\n\r\n(train_X, test_X, train_Y, test_Y) = train_test_split(X, Y, test_size= 0.3, random_state= 42)\r\n\r\nprint(\"Shape of train_X: \", train_X.shape)\r\nprint(\"Shape of test_X: \", test_X.shape)\r\n<\/pre>\n<p><strong>Output:<\/strong><\/p>\n<p><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/initial-data-split.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-95714\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/initial-data-split.png\" alt=\"initial data split\" width=\"1408\" height=\"193\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/initial-data-split.png 1408w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/initial-data-split-300x41.png 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/initial-data-split-1024x140.png 1024w, 
https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/initial-data-split-150x21.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/initial-data-split-768x105.png 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/initial-data-split-720x99.png 720w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/initial-data-split-520x71.png 520w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/initial-data-split-320x44.png 320w\" sizes=\"auto, (max-width: 1408px) 100vw, 1408px\" \/><\/a><\/p>\n<h4>Step 2: Apply Machine Learning Algorithms to Credit Card Dataset<\/h4>\n<p>Let&#8217;s train different models on our dataset and observe which algorithm works better for our problem. This is a binary classification problem, as each transaction must be assigned one of two class labels. We can apply a variety of algorithms to this problem, such as Random Forest, Decision Tree, and Support Vector Machine.<\/p>\n<p>In this machine learning project, we build Random Forest and Decision Tree classifiers and see which one works best. We then address the &#8220;class imbalance&#8221; problem using the best-performing model.<\/p>\n<p>But before we go into the code, let&#8217;s understand what random forests and decision trees are.<\/p>\n<p>The Decision Tree algorithm is a supervised machine learning algorithm used for classification and regression tasks. 
The algorithm&#8217;s aim is to build a model that predicts the value of the target variable by learning simple if-then-else decision rules inferred from the training data.<\/p>\n<p><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/10\/Decision-Trees-Example.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-70700\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/10\/Decision-Trees-Example.png\" alt=\"Decision Trees Example - Machine Learning Classification Algorithms\" width=\"476\" height=\"391\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/10\/Decision-Trees-Example.png 476w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/10\/Decision-Trees-Example-150x123.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2019\/10\/Decision-Trees-Example-300x246.png 300w\" sizes=\"auto, (max-width: 476px) 100vw, 476px\" \/><\/a><\/p>\n<p>Random forest (one of the most popular algorithms) is a supervised machine learning algorithm. It creates a &#8220;forest&#8221; out of an ensemble of &#8220;decision trees&#8221;, which are normally trained using the &#8220;bagging&#8221; technique. 
The bagging method&#8217;s basic principle is that combining different learning models improves the outcome.<\/p>\n<p>To get a more precise and reliable forecast, random forest creates several decision trees and merges them.<\/p>\n<p><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/random-forest-algorithm.jpg\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-95720\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/random-forest-algorithm.jpg\" alt=\"random forest algorithm\" width=\"1200\" height=\"800\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/random-forest-algorithm.jpg 1200w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/random-forest-algorithm-300x200.jpg 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/random-forest-algorithm-1024x683.jpg 1024w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/random-forest-algorithm-150x100.jpg 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/random-forest-algorithm-768x512.jpg 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/random-forest-algorithm-720x480.jpg 720w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/random-forest-algorithm-520x347.jpg 520w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/random-forest-algorithm-320x213.jpg 320w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/random-forest-algorithm-272x182.jpg 272w\" sizes=\"auto, (max-width: 1200px) 100vw, 1200px\" \/><\/a><\/p>\n<p>Let&#8217;s build the Random Forest and Decision Tree Classifiers. 
They are available in scikit-learn as RandomForestClassifier and DecisionTreeClassifier, respectively.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">from sklearn.ensemble import RandomForestClassifier\r\nfrom sklearn.tree import DecisionTreeClassifier\r\n\r\n# Decision Tree\r\ndecision_tree = DecisionTreeClassifier()\r\n\r\n# Random Forest\r\nrandom_forest = RandomForestClassifier(n_estimators= 100)\r\n<\/pre>\n<h4>Step 3: Train and Evaluate our Models on the Dataset<\/h4>\n<p>Now, let&#8217;s train and evaluate the newly created models on the dataset and pick the best one.<\/p>\n<p>Train the decision tree and random forest models on the dataset using the fit() function. Record the predictions made by the models using the predict() function and evaluate them.<\/p>\n<p>Let&#8217;s print the accuracy scores of each of our credit card fraud classifiers.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">decision_tree.fit(train_X, train_Y)\r\npredictions_dt = decision_tree.predict(test_X)\r\ndecision_tree_score = decision_tree.score(test_X, test_Y) * 100\r\n\r\nrandom_forest.fit(train_X, train_Y)\r\npredictions_rf = random_forest.predict(test_X)\r\nrandom_forest_score = random_forest.score(test_X, test_Y) * 100\r\n\r\nprint(\"Random Forest Score: \", random_forest_score)\r\nprint(\"Decision Tree Score: \", decision_tree_score)\r\n<\/pre>\n<p><strong>Output:<\/strong><\/p>\n<p><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/initial-classification-models.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-95721\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/initial-classification-models.png\" alt=\"initial classification models\" width=\"1408\" height=\"484\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/initial-classification-models.png 1408w, 
https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/initial-classification-models-300x103.png 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/initial-classification-models-1024x352.png 1024w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/initial-classification-models-150x52.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/initial-classification-models-768x264.png 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/initial-classification-models-720x248.png 720w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/initial-classification-models-520x179.png 520w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/initial-classification-models-320x110.png 320w\" sizes=\"auto, (max-width: 1408px) 100vw, 1408px\" \/><\/a><\/p>\n<p>The Random Forest classifier has a slight edge over the Decision Tree classifier.<\/p>\n<p>Let&#8217;s create a function to print the metrics: accuracy, precision, recall, and F1-score.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">from sklearn.metrics import accuracy_score, precision_score, confusion_matrix, recall_score, f1_score, ConfusionMatrixDisplay\r\n\r\n\r\ndef metrics(actuals, predictions):\r\n    print(\"Accuracy: {:.5f}\".format(accuracy_score(actuals, predictions)))\r\n    print(\"Precision: {:.5f}\".format(precision_score(actuals, predictions)))\r\n    print(\"Recall: {:.5f}\".format(recall_score(actuals, predictions)))\r\n    print(\"F1-score: {:.5f}\".format(f1_score(actuals, predictions)))\r\n<\/pre>\n
<p>Let&#8217;s visualize the confusion matrix and the evaluation metrics of our Decision Tree model.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">confusion_matrix_dt = confusion_matrix(test_Y, predictions_dt.round())\r\nprint(\"Confusion Matrix - Decision Tree\")\r\nprint(confusion_matrix_dt)\r\n# plot_confusion_matrix() is not defined in this article, so we plot with scikit-learn's ConfusionMatrixDisplay\r\ndisp = ConfusionMatrixDisplay(confusion_matrix_dt, display_labels=[0, 1]).plot()\r\ndisp.ax_.set_title(\"Confusion Matrix - Decision Tree\")\r\nplt.show()\r\n<\/pre>\n<p><strong>Output:<\/strong><\/p>\n<p><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/confusion-matrix-decision-tree.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-95722\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/confusion-matrix-decision-tree.png\" alt=\"confusion matrix decision tree\" width=\"1408\" height=\"668\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/confusion-matrix-decision-tree.png 1408w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/confusion-matrix-decision-tree-300x142.png 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/confusion-matrix-decision-tree-1024x486.png 1024w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/confusion-matrix-decision-tree-150x71.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/confusion-matrix-decision-tree-768x364.png 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/confusion-matrix-decision-tree-720x342.png 720w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/confusion-matrix-decision-tree-520x247.png 520w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/confusion-matrix-decision-tree-320x152.png 320w\" sizes=\"auto, (max-width: 1408px) 100vw, 1408px\" \/><\/a><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">print(\"Evaluation of Decision Tree Model\")\r\nprint()\r\nmetrics(test_Y, predictions_dt.round())\r\n<\/pre>\n<p><strong>Output:<\/strong><\/p>\n
<p><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/decision-tree-metrics.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-95723\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/decision-tree-metrics.png\" alt=\"decision tree metrics\" width=\"1408\" height=\"225\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/decision-tree-metrics.png 1408w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/decision-tree-metrics-300x48.png 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/decision-tree-metrics-1024x164.png 1024w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/decision-tree-metrics-150x24.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/decision-tree-metrics-768x123.png 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/decision-tree-metrics-720x115.png 720w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/decision-tree-metrics-520x83.png 520w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/decision-tree-metrics-320x51.png 320w\" sizes=\"auto, (max-width: 1408px) 100vw, 1408px\" \/><\/a><\/p>\n<p>Let&#8217;s visualize the confusion matrix and the evaluation metrics of our Random Forest model.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">confusion_matrix_rf = confusion_matrix(test_Y, predictions_rf.round())\r\nprint(\"Confusion Matrix - Random Forest\")\r\nprint(confusion_matrix_rf)\r\n# Plot with scikit-learn's ConfusionMatrixDisplay, as for the Decision Tree\r\ndisp = ConfusionMatrixDisplay(confusion_matrix_rf, display_labels=[0, 1]).plot()\r\ndisp.ax_.set_title(\"Confusion Matrix - Random Forest\")\r\nplt.show()\r\n<\/pre>\n<p><strong>Output:<\/strong><\/p>\n<p><a 
href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/confusion-matrix-random-forest-1.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-95724\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/confusion-matrix-random-forest-1.png\" alt=\"confusion matrix random forest 1\" width=\"1408\" height=\"659\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/confusion-matrix-random-forest-1.png 1408w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/confusion-matrix-random-forest-1-300x140.png 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/confusion-matrix-random-forest-1-1024x479.png 1024w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/confusion-matrix-random-forest-1-150x70.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/confusion-matrix-random-forest-1-768x359.png 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/confusion-matrix-random-forest-1-720x337.png 720w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/confusion-matrix-random-forest-1-520x243.png 520w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/confusion-matrix-random-forest-1-320x150.png 320w\" sizes=\"auto, (max-width: 1408px) 100vw, 1408px\" \/><\/a><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">print(\"Evaluation of Random Forest Model\")\r\nprint()\r\nmetrics(test_Y, predictions_rf.round())\r\n<\/pre>\n<p><strong>Output:<\/strong><\/p>\n<p><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/random-forest-metrics-1.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-95725\" 
src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/random-forest-metrics-1.png\" alt=\"random forest metrics 1\" width=\"1408\" height=\"246\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/random-forest-metrics-1.png 1408w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/random-forest-metrics-1-300x52.png 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/random-forest-metrics-1-1024x179.png 1024w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/random-forest-metrics-1-150x26.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/random-forest-metrics-1-768x134.png 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/random-forest-metrics-1-720x126.png 720w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/random-forest-metrics-1-520x91.png 520w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/random-forest-metrics-1-320x56.png 320w\" sizes=\"auto, (max-width: 1408px) 100vw, 1408px\" \/><\/a><\/p>\n<h4>Address the Class-Imbalance Issue<\/h4>\n<p>The Random Forest model works better than the Decision Tree. However, our dataset suffers from a serious class imbalance: genuine (non-fraud) transactions make up more than 99% of the data, while fraudulent transactions constitute only 0.17%.<\/p>\n<p>With such a distribution, a model trained without accounting for the imbalance favors the genuine class (as there is far more data about it) and can reach a high accuracy while still misclassifying fraudulent transactions.<\/p>\n<p>The class imbalance problem can be addressed with various techniques. Oversampling is one of them.<\/p>\n<p>Oversampling the minority class is one approach to handling imbalanced datasets. 
The simplest approach is to duplicate examples in the minority class, although these duplicates contribute no new information to the model.<\/p>\n<p>Instead, new examples can be synthesized from the existing ones. The Synthetic Minority Oversampling Technique, or SMOTE for short, is a data augmentation method that generates such synthetic examples for the minority class.<\/p>\n<p>SMOTE is available in the imblearn package. Let&#8217;s import it and resample our data.<\/p>\n<p>In the code below, we resample our data and split it using train_test_split() with a 70-30 split.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">from collections import Counter\r\nfrom imblearn.over_sampling import SMOTE\r\n\r\nX_resampled, Y_resampled = SMOTE().fit_resample(X, Y)\r\nprint(\"Resampled shape of X: \", X_resampled.shape)\r\nprint(\"Resampled shape of Y: \", Y_resampled.shape)\r\n\r\nvalue_counts = Counter(Y_resampled)\r\nprint(value_counts)\r\n\r\n(train_X, test_X, train_Y, test_Y) = train_test_split(X_resampled, Y_resampled, test_size= 0.3, random_state= 42)\r\n<\/pre>\n<p><strong>Output:<\/strong><\/p>\n<p><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/smote-split.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-95726\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/smote-split.png\" alt=\"smote split\" width=\"1408\" height=\"348\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/smote-split.png 1408w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/smote-split-300x74.png 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/smote-split-1024x253.png 1024w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/smote-split-150x37.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/smote-split-768x190.png 768w, 
https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/smote-split-720x178.png 720w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/smote-split-520x129.png 520w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/smote-split-320x79.png 320w\" sizes=\"auto, (max-width: 1408px) 100vw, 1408px\" \/><\/a><\/p>\n<p>As the Random Forest algorithm performed better than the Decision Tree algorithm, we will apply the Random Forest algorithm to our resampled data.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">rf_resampled = RandomForestClassifier(n_estimators = 100)\r\nrf_resampled.fit(train_X, train_Y)\r\n\r\npredictions_resampled = rf_resampled.predict(test_X)\r\nrandom_forest_score_resampled = rf_resampled.score(test_X, test_Y) * 100\r\n<\/pre>\n<p>Let&#8217;s visualize the predictions of our model and plot the confusion matrix.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">cm_resampled = confusion_matrix(test_Y, predictions_resampled.round())\r\nprint(\"Confusion Matrix - Random Forest\")\r\nprint(cm_resampled)\r\nplot_confusion_matrix(cm_resampled, classes=[0, 1], title= \"Confusion Matrix - Random Forest After Oversampling\")\r\n<\/pre>\n<p><strong>Output:<\/strong><\/p>\n<p><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/confusion-matrix-random-forest-2.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-95727\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/confusion-matrix-random-forest-2.png\" alt=\"confusion matrix random forest 2\" width=\"1408\" height=\"664\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/confusion-matrix-random-forest-2.png 1408w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/confusion-matrix-random-forest-2-300x141.png 300w, 
https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/confusion-matrix-random-forest-2-1024x483.png 1024w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/confusion-matrix-random-forest-2-150x71.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/confusion-matrix-random-forest-2-768x362.png 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/confusion-matrix-random-forest-2-520x245.png 520w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/confusion-matrix-random-forest-2-720x340.png 720w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/confusion-matrix-random-forest-2-320x151.png 320w\" sizes=\"auto, (max-width: 1408px) 100vw, 1408px\" \/><\/a><\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"generic\">print(\"Evaluation of Random Forest Model\")\r\nprint()\r\n\r\nmetrics(test_Y, predictions_resampled.round())<\/pre>\n<p><strong>Output:<\/strong><\/p>\n<p><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/random-forest-metrics-2.png\"><img loading=\"lazy\" decoding=\"async\" class=\"aligncenter size-full wp-image-95728\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/random-forest-metrics-2.png\" alt=\"random forest metrics 2\" width=\"1408\" height=\"242\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/random-forest-metrics-2.png 1408w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/random-forest-metrics-2-300x52.png 300w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/random-forest-metrics-2-1024x176.png 1024w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/random-forest-metrics-2-150x26.png 150w, 
https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/random-forest-metrics-2-768x132.png 768w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/random-forest-metrics-2-720x124.png 720w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/random-forest-metrics-2-520x89.png 520w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/random-forest-metrics-2-320x55.png 320w\" sizes=\"auto, (max-width: 1408px) 100vw, 1408px\" \/><\/a><\/p>\n<p>The model trained on the oversampled data clearly performs much better than our previous Random Forest classifier trained without oversampling.<\/p>\n<h3>Summary<\/h3>\n<p>Credit card fraud happens when someone uses your card without permission. This project helps stop such fraud by checking transaction data and finding odd patterns. Using Python and machine learning, we can build a system that reads card data and tells whether a transaction is genuine or fraudulent. It is a binary classification problem using a real-world dataset.<\/p>\n<p>In this Python machine learning project, we built a binary classifier using the Random Forest algorithm to detect fraudulent credit card transactions. Through this project, we understood and applied techniques to address class imbalance and achieved an accuracy of more than 99%.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>For any bank or financial organization, credit card fraud detection is of utmost importance. We have to spot potential fraud so that consumers are not billed for goods that they haven&#8217;t purchased. 
The aim&#46;&#46;&#46;<\/p>\n","protected":false},"author":7,"featured_media":95729,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[36],"tags":[24391,20622,24380,24381,20697],"class_list":["post-95670","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-machine-learning","tag-credit-card-fraud-classification","tag-credit-card-fraud-detection","tag-credit-card-fraud-project","tag-credit-card-fraud-python-project","tag-machine-learning-project"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.4 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>Credit Card Fraud Detection with Python &amp; Machine Learning - DataFlair<\/title>\n<meta name=\"description\" content=\"Credit Card Fraud Detection with Python &amp; Machine Learning - Create a binary classifier using Decision Tree and Random Forest algorithms.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/data-flair.training\/blogs\/credit-card-fraud-detection-python-machine-learning\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"Credit Card Fraud Detection with Python &amp; Machine Learning - DataFlair\" \/>\n<meta property=\"og:description\" content=\"Credit Card Fraud Detection with Python &amp; Machine Learning - Create a binary classifier using Decision Tree and Random Forest algorithms.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/data-flair.training\/blogs\/credit-card-fraud-detection-python-machine-learning\/\" \/>\n<meta property=\"og:site_name\" content=\"DataFlair\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/DataFlairWS\/\" \/>\n<meta property=\"article:published_time\" 
content=\"2021-05-21T04:00:02+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2026-06-01T09:07:42+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/machine-learning-project-credit-card-fraud-detection.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"628\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"DataFlair Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:site\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"DataFlair Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"9 minutes\" \/>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"Credit Card Fraud Detection with Python &amp; Machine Learning - DataFlair","description":"Credit Card Fraud Detection with Python & Machine Learning - Create a binary classifier using Decision Tree and Random Forest algorithms.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/data-flair.training\/blogs\/credit-card-fraud-detection-python-machine-learning\/","og_locale":"en_US","og_type":"article","og_title":"Credit Card Fraud Detection with Python &amp; Machine Learning - DataFlair","og_description":"Credit Card Fraud Detection with Python & Machine Learning - Create a binary classifier using Decision Tree and Random Forest 
algorithms.","og_url":"https:\/\/data-flair.training\/blogs\/credit-card-fraud-detection-python-machine-learning\/","og_site_name":"DataFlair","article_publisher":"https:\/\/www.facebook.com\/DataFlairWS\/","article_published_time":"2021-05-21T04:00:02+00:00","article_modified_time":"2026-06-01T09:07:42+00:00","og_image":[{"width":1200,"height":628,"url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/machine-learning-project-credit-card-fraud-detection.jpg","type":"image\/jpeg"}],"author":"DataFlair Team","twitter_card":"summary_large_image","twitter_creator":"@DataFlairWS","twitter_site":"@DataFlairWS","twitter_misc":{"Written by":"DataFlair Team","Est. reading time":"9 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/data-flair.training\/blogs\/credit-card-fraud-detection-python-machine-learning\/#article","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/credit-card-fraud-detection-python-machine-learning\/"},"author":{"name":"DataFlair Team","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/beb0cab24b7aa54423a3b50e669a9dcd"},"headline":"Credit Card Fraud Detection with Python &amp; Machine Learning","datePublished":"2021-05-21T04:00:02+00:00","dateModified":"2026-06-01T09:07:42+00:00","mainEntityOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/credit-card-fraud-detection-python-machine-learning\/"},"wordCount":1204,"commentCount":5,"publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/credit-card-fraud-detection-python-machine-learning\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/machine-learning-project-credit-card-fraud-detection.jpg","keywords":["credit card fraud classification","Credit card fraud detection","credit card fraud project","credit card fraud python project","machine learning 
project"],"articleSection":["Machine Learning Tutorials"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/data-flair.training\/blogs\/credit-card-fraud-detection-python-machine-learning\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/data-flair.training\/blogs\/credit-card-fraud-detection-python-machine-learning\/","url":"https:\/\/data-flair.training\/blogs\/credit-card-fraud-detection-python-machine-learning\/","name":"Credit Card Fraud Detection with Python &amp; Machine Learning - DataFlair","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/#website"},"primaryImageOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/credit-card-fraud-detection-python-machine-learning\/#primaryimage"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/credit-card-fraud-detection-python-machine-learning\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/machine-learning-project-credit-card-fraud-detection.jpg","datePublished":"2021-05-21T04:00:02+00:00","dateModified":"2026-06-01T09:07:42+00:00","description":"Credit Card Fraud Detection with Python & Machine Learning - Create a binary classifier using Decision Tree and Random Forest 
algorithms.","breadcrumb":{"@id":"https:\/\/data-flair.training\/blogs\/credit-card-fraud-detection-python-machine-learning\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/data-flair.training\/blogs\/credit-card-fraud-detection-python-machine-learning\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/credit-card-fraud-detection-python-machine-learning\/#primaryimage","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/machine-learning-project-credit-card-fraud-detection.jpg","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2021\/05\/machine-learning-project-credit-card-fraud-detection.jpg","width":1200,"height":628,"caption":"machine learning project credit card fraud detection"},{"@type":"BreadcrumbList","@id":"https:\/\/data-flair.training\/blogs\/credit-card-fraud-detection-python-machine-learning\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Blog Home","item":"https:\/\/data-flair.training\/blogs\/"},{"@type":"ListItem","position":2,"name":"Machine Learning Tutorials","item":"https:\/\/data-flair.training\/blogs\/category\/machine-learning\/"},{"@type":"ListItem","position":3,"name":"Credit Card Fraud Detection with Python &amp; Machine Learning"}]},{"@type":"WebSite","@id":"https:\/\/data-flair.training\/blogs\/#website","url":"https:\/\/data-flair.training\/blogs\/","name":"DataFlair","description":"Learn Today. 
Lead Tomorrow.","publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/data-flair.training\/blogs\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/data-flair.training\/blogs\/#organization","name":"DataFlair","url":"https:\/\/data-flair.training\/blogs\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","width":106,"height":48,"caption":"DataFlair"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DataFlairWS\/","https:\/\/x.com\/DataFlairWS","https:\/\/www.linkedin.com\/company\/dataflair-web-services-pvt-ltd\/","https:\/\/www.youtube.com\/user\/DataFlairWS"]},{"@type":"Person","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/beb0cab24b7aa54423a3b50e669a9dcd","name":"DataFlair Team","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/c322416204232f4dd97ef3901b0a499a5d34d7ba7fe333f4bfe53a907873d293?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/c322416204232f4dd97ef3901b0a499a5d34d7ba7fe333f4bfe53a907873d293?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/c322416204232f4dd97ef3901b0a499a5d34d7ba7fe333f4bfe53a907873d293?s=96&d=mm&r=g","caption":"DataFlair Team"},"description":"DataFlair Team specializes in creating clear, actionable content on programming, Java, Python, C++, DSA, AI, ML, data Science, Android, Flutter, MERN, Web Development, and technology. 
Backed by industry expertise, we make learning easy and career-oriented for beginners and pros alike.","url":"https:\/\/data-flair.training\/blogs\/author\/dfteam3\/"}]}},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/95670","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/users\/7"}],"replies":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/comments?post=95670"}],"version-history":[{"count":10,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/95670\/revisions"}],"predecessor-version":[{"id":148735,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/95670\/revisions\/148735"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media\/95729"}],"wp:attachment":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media?parent=95670"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/categories?post=95670"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/tags?post=95670"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}