{"id":24036,"date":"2018-08-09T03:30:52","date_gmt":"2018-08-09T03:30:52","guid":{"rendered":"https:\/\/data-flair.training\/blogs\/?p=24036"},"modified":"2026-04-27T17:45:42","modified_gmt":"2026-04-27T12:15:42","slug":"machine-learning-algorithms-in-python","status":"publish","type":"post","link":"https:\/\/data-flair.training\/blogs\/machine-learning-algorithms-in-python\/","title":{"rendered":"8 Machine Learning Algorithms in Python &#8211; You Must Learn"},"content":{"rendered":"<p>Previously, we discussed the <a href=\"https:\/\/data-flair.training\/blogs\/python-machine-learning-techniques\/\"><strong>techniques\u00a0of machine learning with Python<\/strong><\/a>. Going deeper, today we will learn and implement 8 top Machine Learning algorithms in Python.<\/p>\n<p>Think of these 8 algorithms as a toolkit for your data. Just as a carpenter knows exactly when to use a hammer or a saw, you\u2019re about to learn how to teach your Python scripts to categorize and forecast.<\/p>\n<p>Let\u2019s begin the journey of Machine Learning Algorithms in Python Programming.<\/p>\n<h3>Machine Learning Algorithms in Python<\/h3>\n<p><strong>The following are the Algorithms of Python Machine Learning:<\/strong><\/p>\n<h4>a. Linear Regression in ML<\/h4>\n<p><strong><a href=\"https:\/\/data-flair.training\/blogs\/python-linear-regression-chi-square-test\/\">Linear regression<\/a><\/strong> is one of the supervised Machine Learning algorithms in Python that observes continuous features and predicts an outcome. Depending on whether it runs on a single variable or on many features, we can call it simple linear regression or multiple linear regression.<\/p>\n<p>This is one of the most popular Python ML algorithms and is often under-appreciated. It assigns optimal weights to the variables to create a line a*x+b that predicts the output. 
We often use linear regression to estimate real values, such as the number of calls or the cost of a house, based on continuous variables. The regression line\u00a0is the best-fit line Y=a*X+b, which denotes the relationship between the independent and dependent variables.<\/p>\n<p><strong><a href=\"https:\/\/data-flair.training\/blogs\/python-machine-learning-environment-setup\/\">Do you know about Python Machine Learning Environment Setup<\/a><\/strong><\/p>\n<p>Let\u2019s plot this for the diabetes dataset.<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; import matplotlib.pyplot as plt\r\n&gt;&gt;&gt; import numpy as np\r\n&gt;&gt;&gt; from sklearn import datasets,linear_model\r\n&gt;&gt;&gt; from sklearn.metrics import mean_squared_error,r2_score\r\n&gt;&gt;&gt; diabetes=datasets.load_diabetes()\r\n&gt;&gt;&gt; diabetes_X=diabetes.data[:,np.newaxis,2]\r\n&gt;&gt;&gt; diabetes_X_train=diabetes_X[:-30] #splitting data into training and test sets\r\n&gt;&gt;&gt; diabetes_X_test=diabetes_X[-30:]\r\n&gt;&gt;&gt; diabetes_y_train=diabetes.target[:-30] #splitting targets into training and test sets\r\n&gt;&gt;&gt; diabetes_y_test=diabetes.target[-30:]\r\n&gt;&gt;&gt; regr=linear_model.LinearRegression() #Linear regression object\r\n&gt;&gt;&gt; regr.fit(diabetes_X_train,diabetes_y_train) #Use training sets to train the model<\/pre>\n<p>LinearRegression(copy_X=True, fit_intercept=True, n_jobs=1, normalize=False)<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; diabetes_y_pred=regr.predict(diabetes_X_test) #Make predictions\r\n&gt;&gt;&gt; regr.coef_<\/pre>\n<p>array([941.43097333])<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; mean_squared_error(diabetes_y_test,diabetes_y_pred)<\/pre>\n<p>3035.0601152912695<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; r2_score(diabetes_y_test,diabetes_y_pred) #R^2 (coefficient of determination)<\/pre>\n<p>0.410920728135835<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; plt.scatter(diabetes_X_test,diabetes_y_test,color 
='lavender')<\/pre>\n<p>&lt;matplotlib.collections.PathCollection object at 0x0584FF70&gt;<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; plt.plot(diabetes_X_test,diabetes_y_pred,color='pink',linewidth=3)<\/pre>\n<p>[&lt;matplotlib.lines.Line2D object at 0x0584FF30&gt;]<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; plt.xticks(())<\/pre>\n<p>([], &lt;a list of 0 Text xticklabel objects&gt;)<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; plt.yticks(())<\/pre>\n<p>([], &lt;a list of 0 Text yticklabel objects&gt;)<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; plt.show()<\/pre>\n<div id=\"attachment_24042\" style=\"width: 561px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/linear-regression-1.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-24042\" class=\"wp-image-24042 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/linear-regression-1.png\" alt=\"Machine Learning Algorithms in Python - You Must LEARN\" width=\"551\" height=\"423\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/linear-regression-1.png 551w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/linear-regression-1-150x115.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/linear-regression-1-300x230.png 300w\" sizes=\"auto, (max-width: 551px) 100vw, 551px\" \/><\/a><p id=\"caption-attachment-24042\" class=\"wp-caption-text\">Machine Learning Algorithms in Python &#8211; Linear Regression<\/p><\/div>\n<p><strong><a href=\"https:\/\/data-flair.training\/blogs\/python-ml-data-preprocessing\/\">Python Machine Learning &#8211; Data Preprocessing, Analysis &amp; Visualization<\/a><\/strong><\/p>\n<h4 class=\"western\">b. 
Logistic Regression in ML<\/h4>\n<p>Logistic regression is a supervised classification algorithm that estimates discrete values like 0\/1, yes\/no, and true\/false from a given set of independent variables. We use a logistic function to predict the probability of an event, and this gives us an output between 0 and 1.<\/p>\n<p>Although the name says \u2018regression\u2019, this is actually a classification algorithm. Logistic regression fits the data to a logistic (sigmoid) function and is also called <i>logit regression<\/i>. Let\u2019s plot this.<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; import numpy as np\r\n&gt;&gt;&gt; import matplotlib.pyplot as plt\r\n&gt;&gt;&gt; from sklearn import linear_model\r\n&gt;&gt;&gt; xmin,xmax=-7,7 #Test set; straight line with Gaussian noise\r\n&gt;&gt;&gt; n_samples=77\r\n&gt;&gt;&gt; np.random.seed(0)\r\n&gt;&gt;&gt; x=np.random.normal(size=n_samples)\r\n&gt;&gt;&gt; y=(x&gt;0).astype(float)\r\n&gt;&gt;&gt; x[x&gt;0]*=3\r\n&gt;&gt;&gt; x+=.4*np.random.normal(size=n_samples)\r\n&gt;&gt;&gt; x=x[:,np.newaxis]\r\n&gt;&gt;&gt; clf=linear_model.LogisticRegression(C=1e4) #Classifier\r\n&gt;&gt;&gt; clf.fit(x,y)\r\n&gt;&gt;&gt; plt.figure(1,figsize=(3,4))\r\n&lt;Figure size 300x400 with 0 Axes&gt;\r\n&gt;&gt;&gt; plt.clf()\r\n&gt;&gt;&gt; plt.scatter(x.ravel(),y,color='lavender',zorder=17)<\/pre>\n<p>&lt;matplotlib.collections.PathCollection object at 0x057B0E10&gt;<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; x_test=np.linspace(-7,7,277)\r\n&gt;&gt;&gt; def model(x): #Logistic (sigmoid) function\r\n         return 1\/(1+np.exp(-x))\r\n&gt;&gt;&gt; loss=model(x_test*clf.coef_+clf.intercept_).ravel()\r\n&gt;&gt;&gt; plt.plot(x_test,loss,color='pink',linewidth=2.5)<\/pre>\n<p>[&lt;matplotlib.lines.Line2D object at 0x057BA090&gt;]<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; ols=linear_model.LinearRegression()\r\n&gt;&gt;&gt; ols.fit(x,y)<\/pre>\n<p>LinearRegression(copy_X=True, 
fit_intercept=True, n_jobs=1, normalize=False)<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; plt.plot(x_test,ols.coef_*x_test+ols.intercept_,linewidth=1)<\/pre>\n<p>[&lt;matplotlib.lines.Line2D object at 0x057BA0B0&gt;]<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; plt.axhline(.4,color='.4')<\/pre>\n<p>&lt;matplotlib.lines.Line2D object at 0x05860E70&gt;<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; plt.ylabel('y')<\/pre>\n<p>Text(0,0.5,&#8217;y&#8217;)<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; plt.xlabel('x')<\/pre>\n<p>Text(0.5,0,&#8217;x&#8217;)<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; plt.xticks(range(-7,7))\r\n&gt;&gt;&gt; plt.yticks([0,0.4,1])\r\n&gt;&gt;&gt; plt.ylim(-.25,1.25)<\/pre>\n<p>(-0.25, 1.25)<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; plt.xlim(-4,10)<\/pre>\n<p>(-4, 10)<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; plt.legend(('Logistic Regression','Linear Regression'),loc='lower right',fontsize='small')<\/pre>\n<p>&lt;matplotlib.legend.Legend object at 0x057C89F0&gt;<br \/>\n<strong><a href=\"https:\/\/data-flair.training\/blogs\/train-test-set-in-python-ml\/\" target=\"_blank\" rel=\"noopener\">Do you know how to split the train and Test Set in Python Machine Learning<\/a><\/strong><\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; plt.show()<\/pre>\n<div id=\"attachment_24043\" style=\"width: 297px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/logistic-regression.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-24043\" class=\"wp-image-24043 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/logistic-regression.png\" alt=\"Machine Learning Algorithms in Python \" width=\"287\" height=\"364\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/logistic-regression.png 287w, 
https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/logistic-regression-118x150.png 118w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/logistic-regression-237x300.png 237w\" sizes=\"auto, (max-width: 287px) 100vw, 287px\" \/><\/a><p id=\"caption-attachment-24043\" class=\"wp-caption-text\">Machine Learning Algorithms in Python &#8211; Logistic Regression<\/p><\/div>\n<h4 class=\"western\">c. Decision Tree in ML<\/h4>\n<p>A decision tree is a supervised Machine Learning algorithm in Python used for both classification and regression, although mostly for classification. The model takes an instance and traverses the tree, testing a feature against a conditional statement at each node. Whether it descends to the left child branch or the right depends on the result. Usually, more important features are closer to the root.<\/p>\n<p>Decision Tree, a machine learning algorithm in Python, can work on both categorical and continuous dependent variables. Here, we split a population into two or more homogeneous sets. 
Let\u2019s see the code for this:<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; import pandas as pd\r\n&gt;&gt;&gt; from sklearn.model_selection import train_test_split\r\n&gt;&gt;&gt; from sklearn.tree import DecisionTreeClassifier\r\n&gt;&gt;&gt; from sklearn.metrics import accuracy_score\r\n&gt;&gt;&gt; from sklearn.metrics import classification_report\r\n&gt;&gt;&gt; from sklearn.metrics import confusion_matrix\r\n&gt;&gt;&gt; def importdata(): #Importing data\r\n          balance_data=pd.read_csv('https:\/\/archive.ics.uci.edu\/ml\/machine-learning-'+\r\n              'databases\/balance-scale\/balance-scale.data',\r\n              sep= ',', header = None)\r\n          print(len(balance_data))\r\n          print(balance_data.shape)\r\n          print(balance_data.head())\r\n          return balance_data\r\n&gt;&gt;&gt; def splitdataset(balance_data): #Splitting data\r\n          x=balance_data.values[:,1:5]\r\n          y=balance_data.values[:,0]\r\n          x_train,x_test,y_train,y_test=train_test_split(\r\n              x,y,test_size=0.3,random_state=100)\r\n          return x,y,x_train,x_test,y_train,y_test\r\n&gt;&gt;&gt; def train_using_gini(x_train,x_test,y_train): #Training with the Gini index\r\n          clf_gini = DecisionTreeClassifier(criterion = \"gini\",\r\n          random_state = 100,max_depth=3, min_samples_leaf=5)\r\n          clf_gini.fit(x_train,y_train)\r\n          return clf_gini\r\n&gt;&gt;&gt; def train_using_entropy(x_train,x_test,y_train): #Training with entropy\r\n          clf_entropy=DecisionTreeClassifier(\r\n          criterion = \"entropy\", random_state = 100,\r\n          max_depth = 3, min_samples_leaf = 5)\r\n          clf_entropy.fit(x_train,y_train)\r\n          return clf_entropy\r\n&gt;&gt;&gt; def prediction(x_test,clf_object): #Making predictions\r\n          y_pred=clf_object.predict(x_test)\r\n          print(f\"Predicted values: {y_pred}\")\r\n          return y_pred\r\n&gt;&gt;&gt; def cal_accuracy(y_test,y_pred): #Calculating accuracy\r\n          print(confusion_matrix(y_test,y_pred))\r\n          
print(accuracy_score(y_test,y_pred)*100)\r\n          print(classification_report(y_test,y_pred))\r\n&gt;&gt;&gt; data=importdata()<\/pre>\n<p>625<br \/>\n(625, 5)<br \/>\n0 1 2 3 4<br \/>\n0 B 1 1 1 1<br \/>\n1 R 1 1 1 2<br \/>\n2 R 1 1 1 3<br \/>\n3 R 1 1 1 4<br \/>\n4 R 1 1 1 5<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; x,y,x_train,x_test,y_train,y_test=splitdataset(data)\r\n&gt;&gt;&gt; clf_gini=train_using_gini(x_train,x_test,y_train)\r\n&gt;&gt;&gt; clf_entropy=train_using_entropy(x_train,x_test,y_train)\r\n&gt;&gt;&gt; y_pred_gini=prediction(x_test,clf_gini)<\/pre>\n<div id=\"attachment_24044\" style=\"width: 740px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/1-18.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-24044\" class=\"wp-image-24044 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/1-18.png\" alt=\"Machine Learning Algorithms in Python \" width=\"730\" height=\"193\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/1-18.png 730w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/1-18-150x40.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/1-18-300x79.png 300w\" sizes=\"auto, (max-width: 730px) 100vw, 730px\" \/><\/a><p id=\"caption-attachment-24044\" class=\"wp-caption-text\">Machine Learning Algorithms in Python &#8211; Decision Tree<\/p><\/div>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; cal_accuracy(y_test,y_pred_gini)<\/pre>\n<p>[[ 0 6 7]<br \/>\n[ 0 67 18]<br \/>\n[ 0 19 71]]<br \/>\n73.40425531914893<\/p>\n<div id=\"attachment_24045\" style=\"width: 466px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/2-16.png\"><img loading=\"lazy\" decoding=\"async\" 
aria-describedby=\"caption-attachment-24045\" class=\"wp-image-24045 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/2-16.png\" alt=\"Machine Learning Algorithms in Python \" width=\"456\" height=\"123\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/2-16.png 456w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/2-16-150x40.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/2-16-300x81.png 300w\" sizes=\"auto, (max-width: 456px) 100vw, 456px\" \/><\/a><p id=\"caption-attachment-24045\" class=\"wp-caption-text\">Machine Learning Algorithms in Python &#8211; Decision Tree<\/p><\/div>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; y_pred_entropy=prediction(x_test,clf_entropy)<\/pre>\n<div id=\"attachment_24046\" style=\"width: 745px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/3-14.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-24046\" class=\"wp-image-24046 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/3-14.png\" alt=\"Machine Learning Algorithms in Python \" width=\"735\" height=\"196\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/3-14.png 735w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/3-14-150x40.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/3-14-300x80.png 300w\" sizes=\"auto, (max-width: 735px) 100vw, 735px\" \/><\/a><p id=\"caption-attachment-24046\" class=\"wp-caption-text\">Machine Learning Algorithms in Python &#8211; Decision Tree<\/p><\/div>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; cal_accuracy(y_test,y_pred_entropy)<\/pre>\n<p>[[ 0 6 7]<br \/>\n[ 0 63 22]<br \/>\n[ 0 20 70]]<br \/>\n70.74468085106383<\/p>\n<div 
id=\"attachment_24047\" style=\"width: 447px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/4-13.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-24047\" class=\"wp-image-24047 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/4-13.png\" alt=\"Machine Learning Algorithms in Python \" width=\"437\" height=\"117\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/4-13.png 437w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/4-13-150x40.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/4-13-300x80.png 300w\" sizes=\"auto, (max-width: 437px) 100vw, 437px\" \/><\/a><p id=\"caption-attachment-24047\" class=\"wp-caption-text\">Machine Learning Algorithms in Python &#8211; Decision Tree<\/p><\/div>\n<p><strong><a href=\"https:\/\/data-flair.training\/blogs\/python-machine-learning-techniques\/\">Let&#8217;s explore 4 Machine Learning Techniques with Python<\/a> <\/strong><\/p>\n<h4 class=\"western\">d. Support Vector Machines (SVM) in ML<\/h4>\n<p>A support vector machine (SVM) is a supervised classification algorithm that finds a line (more generally, a hyperplane) dividing the different categories of your data. Among all the lines that separate the categories, it picks the one with the largest margin, that is, the largest distance to the closest points of each category (the support vectors).<\/p>\n<p>This ensures that the closest point in each group lies as far from the boundary as possible. The boundary is usually linear, but with a kernel it can also be nonlinear.<\/p>\n<p>In this Python Machine Learning tutorial, we plot each data item as a point in an n-dimensional space. 
We have n features, and the value of each feature is the value of one coordinate.<\/p>\n<p>First, let\u2019s plot a dataset.<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; from sklearn.datasets import make_blobs\r\n&gt;&gt;&gt; x,y=make_blobs(n_samples=500,centers=2,\r\n           random_state=0,cluster_std=0.40)\r\n&gt;&gt;&gt; import matplotlib.pyplot as plt\r\n&gt;&gt;&gt; plt.scatter(x[:,0],x[:,1],c=y,s=50,cmap='plasma')<\/pre>\n<p>&lt;matplotlib.collections.PathCollection object at 0x04E1BBF0&gt;<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; plt.show()<\/pre>\n<div id=\"attachment_24048\" style=\"width: 566px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/svm1-1.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-24048\" class=\"wp-image-24048 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/svm1-1.png\" alt=\"Machine Learning Algorithms in Python \" width=\"556\" height=\"435\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/svm1-1.png 556w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/svm1-1-150x117.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/svm1-1-300x235.png 300w\" sizes=\"auto, (max-width: 556px) 100vw, 556px\" \/><\/a><p id=\"caption-attachment-24048\" class=\"wp-caption-text\">Machine Learning Algorithms in Python &#8211; SVM<\/p><\/div>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; import numpy as np\r\n&gt;&gt;&gt; xfit=np.linspace(-1,3.5)\r\n&gt;&gt;&gt; plt.scatter(x[:, 0], x[:, 1], c=y, s=50, cmap='plasma')<\/pre>\n<p>&lt;matplotlib.collections.PathCollection object at 0x07318C90&gt;<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; for m, b, d in [(1, 0.65, 0.33), (0.5, 1.6, 0.55), (-0.2, 2.9, 0.2)]:\r\n        yfit = m * xfit + b\r\n        plt.plot(xfit, 
yfit, '-k')\r\n        plt.fill_between(xfit, yfit - d, yfit + d, edgecolor='none',\r\n        color='#AFFEDC', alpha=0.4)<\/pre>\n<p>[&lt;matplotlib.lines.Line2D object at 0x07318FF0&gt;]<br \/>\n&lt;matplotlib.collections.PolyCollection object at 0x073242D0&gt;<br \/>\n[&lt;matplotlib.lines.Line2D object at 0x07318B70&gt;]<br \/>\n&lt;matplotlib.collections.PolyCollection object at 0x073246F0&gt;<br \/>\n[&lt;matplotlib.lines.Line2D object at 0x07324370&gt;]<br \/>\n&lt;matplotlib.collections.PolyCollection object at 0x07324B30&gt;<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; plt.xlim(-1,3.5)<\/pre>\n<p>(-1, 3.5)<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; plt.show()<\/pre>\n<div id=\"attachment_24049\" style=\"width: 569px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/svm2.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-24049\" class=\"wp-image-24049 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/svm2.png\" alt=\"Machine Learning Algorithms in Python \" width=\"559\" height=\"431\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/svm2.png 559w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/svm2-150x116.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/svm2-300x231.png 300w\" sizes=\"auto, (max-width: 559px) 100vw, 559px\" \/><\/a><p id=\"caption-attachment-24049\" class=\"wp-caption-text\">Machine Learning Algorithms in Python &#8211; Support Vector Machine<\/p><\/div>\n<p><strong><a href=\"https:\/\/data-flair.training\/blogs\/python-pyqt5-tutorial\/\">Follow this link to know about Python PyQt5 Tutorial<\/a><\/strong><\/p>\n<h4>e. Naive Bayes in ML<\/h4>\n<p>Naive Bayes is a classification method that is based on Bayes\u2019 theorem. This assumes independence between predictors. 
A Naive Bayes classifier assumes that the presence of a feature in a class is unrelated to the presence of any other feature. Consider a fruit: we may call it an apple if it is round, red, and about 2.5 inches in diameter. A Naive Bayes classifier treats each of these characteristics as contributing independently to the probability that the fruit is an apple, even if the features actually depend on each other.<\/p>\n<p>A Naive Bayes model is easy to build, even for very large data sets. Despite its simplicity, it can perform surprisingly well against more sophisticated classification methods. Let\u2019s build this.<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; from sklearn.naive_bayes import GaussianNB\r\n&gt;&gt;&gt; from sklearn.naive_bayes import MultinomialNB\r\n&gt;&gt;&gt; from sklearn import datasets\r\n&gt;&gt;&gt; from sklearn.metrics import confusion_matrix\r\n&gt;&gt;&gt; from sklearn.model_selection import train_test_split\r\n&gt;&gt;&gt; iris=datasets.load_iris()\r\n&gt;&gt;&gt; x=iris.data\r\n&gt;&gt;&gt; y=iris.target\r\n&gt;&gt;&gt; x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=0.3,random_state=0)\r\n&gt;&gt;&gt; gnb=GaussianNB()\r\n&gt;&gt;&gt; mnb=MultinomialNB()\r\n&gt;&gt;&gt; y_pred_gnb=gnb.fit(x_train,y_train).predict(x_test)\r\n&gt;&gt;&gt; cnf_matrix_gnb = confusion_matrix(y_test, y_pred_gnb)\r\n&gt;&gt;&gt; cnf_matrix_gnb<\/pre>\n<p>array([[16, 0, 0],<br \/>\n[ 0, 18, 0],<br \/>\n[ 0, 0, 11]], dtype=int64)<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; y_pred_mnb = mnb.fit(x_train, y_train).predict(x_test)\r\n&gt;&gt;&gt; cnf_matrix_mnb = confusion_matrix(y_test, y_pred_mnb)\r\n&gt;&gt;&gt; cnf_matrix_mnb<\/pre>\n<p>array([[16, 0, 0],<br \/>\n[ 0, 0, 18],<br \/>\n[ 0, 0, 11]], dtype=int64)<\/p>\n<h4 class=\"western\">f. kNN (k-Nearest Neighbors) in ML<\/h4>\n<p>This is a Python machine learning algorithm for classification and regression, mostly for classification. 
It is a supervised learning algorithm that stores the labeled training points and uses a distance function, typically Euclidean distance, to compare a new point with them.<\/p>\n<p>It classifies a new case by a majority vote of its k nearest neighbors: the class it assigns is the one most common among those k neighbors.<\/p>\n<p><strong>i. Training and testing on the entire dataset<\/strong><\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; from sklearn.datasets import load_iris\r\n&gt;&gt;&gt; iris=load_iris()\r\n&gt;&gt;&gt; x=iris.data\r\n&gt;&gt;&gt; y=iris.target\r\n&gt;&gt;&gt; from sklearn.linear_model import LogisticRegression\r\n&gt;&gt;&gt; logreg=LogisticRegression()\r\n&gt;&gt;&gt; logreg.fit(x,y)<\/pre>\n<p>LogisticRegression(C=1.0, class_weight=None, dual=False, fit_intercept=True,<br \/>\nintercept_scaling=1, max_iter=100, multi_class=&#8217;ovr&#8217;, n_jobs=1,<br \/>\npenalty=&#8217;l2&#8242;, random_state=None, solver=&#8217;liblinear&#8217;, tol=0.0001,<br \/>\nverbose=0, warm_start=False)<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; logreg.predict(x)<\/pre>\n<p>array([0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,<br \/>\n0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0,<br \/>\n0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1,<br \/>\n2, 1, 1, 1, 2, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 1, 1,<br \/>\n1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2,<br \/>\n2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 1, 2, 2,<br \/>\n2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2, 2])<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; y_pred=logreg.predict(x)\r\n&gt;&gt;&gt; len(y_pred)<\/pre>\n<p>150<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; from sklearn import metrics\r\n&gt;&gt;&gt; 
metrics.accuracy_score(y,y_pred)<\/pre>\n<p>0.96<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; from sklearn.neighbors import KNeighborsClassifier\r\n&gt;&gt;&gt; knn=KNeighborsClassifier(n_neighbors=5)\r\n&gt;&gt;&gt; knn.fit(x,y)<\/pre>\n<p>KNeighborsClassifier(algorithm=&#8217;auto&#8217;, leaf_size=30, metric=&#8217;minkowski&#8217;,<br \/>\nmetric_params=None, n_jobs=1, n_neighbors=5, p=2,<br \/>\nweights=&#8217;uniform&#8217;)<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; y_pred=knn.predict(x)\r\n&gt;&gt;&gt; metrics.accuracy_score(y,y_pred)<\/pre>\n<p>0.9666666666666667<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; knn=KNeighborsClassifier(n_neighbors=1)\r\n&gt;&gt;&gt; knn.fit(x,y)<\/pre>\n<p>KNeighborsClassifier(algorithm=&#8217;auto&#8217;, leaf_size=30, metric=&#8217;minkowski&#8217;,<br \/>\nmetric_params=None, n_jobs=1, n_neighbors=1, p=2,<br \/>\nweights=&#8217;uniform&#8217;)<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; y_pred=knn.predict(x)\r\n&gt;&gt;&gt; metrics.accuracy_score(y,y_pred)<\/pre>\n<p>1.0<\/p>\n<p><strong>ii. 
Splitting into train\/test<\/strong><\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; x.shape<\/pre>\n<p>(150, 4)<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; y.shape<\/pre>\n<p>(150,)<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; from sklearn.model_selection import train_test_split\r\n&gt;&gt;&gt; x_train,x_test,y_train,y_test=train_test_split(x,y,test_size=0.4,random_state=4)\r\n&gt;&gt;&gt; x_train.shape<\/pre>\n<p>(90, 4)<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; x_test.shape<\/pre>\n<p>(60, 4)<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; y_train.shape<\/pre>\n<p>(90,)<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; y_test.shape<\/pre>\n<p>(60,)<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; logreg=LogisticRegression()\r\n&gt;&gt;&gt; logreg.fit(x_train,y_train)\r\n&gt;&gt;&gt; y_pred=logreg.predict(x_test)\r\n&gt;&gt;&gt; metrics.accuracy_score(y_test,y_pred)<\/pre>\n<p>0.9666666666666667<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; knn=KNeighborsClassifier(n_neighbors=5)\r\n&gt;&gt;&gt; knn.fit(x_train,y_train)<\/pre>\n<p>KNeighborsClassifier(algorithm=&#8217;auto&#8217;, leaf_size=30, metric=&#8217;minkowski&#8217;,<br \/>\nmetric_params=None, n_jobs=1, n_neighbors=5, p=2,<br \/>\nweights=&#8217;uniform&#8217;)<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; y_pred=knn.predict(x_test)\r\n&gt;&gt;&gt; metrics.accuracy_score(y_test,y_pred)<\/pre>\n<p>0.9666666666666667<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; k_range=range(1,26)\r\n&gt;&gt;&gt; scores=[]\r\n&gt;&gt;&gt; for k in k_range:\r\n         knn = KNeighborsClassifier(n_neighbors=k)\r\n         knn.fit(x_train, y_train)\r\n         y_pred = knn.predict(x_test)\r\n         scores.append(metrics.accuracy_score(y_test, y_pred))\r\n&gt;&gt;&gt; 
scores<\/pre>\n<p>[0.95, 0.95, 0.9666666666666667, 0.9666666666666667, 0.9666666666666667, 0.9833333333333333, 0.9833333333333333, 0.9833333333333333, 0.9833333333333333, 0.9833333333333333, 0.9833333333333333, 0.9833333333333333, 0.9833333333333333, 0.9833333333333333, 0.9833333333333333, 0.9833333333333333, 0.9833333333333333, 0.9666666666666667, 0.9833333333333333, 0.9666666666666667, 0.9666666666666667, 0.9666666666666667, 0.9666666666666667, 0.95, 0.95]<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; import matplotlib.pyplot as plt\r\n&gt;&gt;&gt; plt.plot(k_range,scores)<\/pre>\n<p>[&lt;matplotlib.lines.Line2D object at 0x05FDECD0&gt;]<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; plt.xlabel('k for kNN')<\/pre>\n<p>Text(0.5,0,&#8217;k for kNN&#8217;)<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; plt.ylabel('Testing Accuracy')<\/pre>\n<p>Text(0,0.5,&#8217;Testing Accuracy&#8217;)<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; plt.show()<\/pre>\n<div id=\"attachment_24050\" style=\"width: 604px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/knn.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-24050\" class=\"wp-image-24050 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/knn.png\" alt=\"Machine Learning Algorithms in Python \" width=\"594\" height=\"439\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/knn.png 594w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/knn-150x111.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/knn-300x222.png 300w\" sizes=\"auto, (max-width: 594px) 100vw, 594px\" \/><\/a><p id=\"caption-attachment-24050\" class=\"wp-caption-text\">Machine Learning Algorithms in Python &#8211;\u00a0k-Nearest Neighbors<\/p><\/div>\n<p><strong><a 
href=\"https:\/\/data-flair.training\/blogs\/python-statistics\/\">Read about Python Statistics &#8211; p-Value, Correlation, T-test, KS Test<\/a><\/strong><\/p>\n<h4>g. k-Means in ML<\/h4>\n<p>k-Means is an unsupervised algorithm that solves the problem of clustering. It partitions the data into k clusters. The data points inside a cluster are similar to one another and dissimilar to the points in other clusters.<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; import numpy as np\r\n&gt;&gt;&gt; import matplotlib.pyplot as plt\r\n&gt;&gt;&gt; from matplotlib import style\r\n&gt;&gt;&gt; style.use('ggplot')\r\n&gt;&gt;&gt; from sklearn.cluster import KMeans\r\n&gt;&gt;&gt; x=[1,5,1.5,8,1,9]\r\n&gt;&gt;&gt; y=[2,8,1.8,8,0.6,11]\r\n&gt;&gt;&gt; plt.scatter(x,y)<\/pre>\n<p>&lt;matplotlib.collections.PathCollection object at 0x0642AF30&gt;<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; x=np.array([[1,2],[5,8],[1.5,1.8],[8,8],[1,0.6],[9,11]])\r\n&gt;&gt;&gt; kmeans=KMeans(n_clusters=2)\r\n&gt;&gt;&gt; kmeans.fit(x)<\/pre>\n<p>KMeans(algorithm=&#8217;auto&#8217;, copy_x=True, init=&#8217;k-means++&#8217;, max_iter=300,<br \/>\nn_clusters=2, n_init=10, n_jobs=1, precompute_distances=&#8217;auto&#8217;,<br \/>\nrandom_state=None, tol=0.0001, verbose=0)<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; centroids=kmeans.cluster_centers_\r\n&gt;&gt;&gt; labels=kmeans.labels_\r\n&gt;&gt;&gt; centroids<\/pre>\n<p>array([[1.16666667, 1.46666667],<br \/>\n[7.33333333, 9. ]])<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; labels<\/pre>\n<p>array([0, 1, 0, 1, 0, 1])<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; colors=['g.','r.','c.','y.']\r\n&gt;&gt;&gt; for i in range(len(x)):\r\n         print(x[i],labels[i])\r\n         plt.plot(x[i][0],x[i][1],colors[labels[i]],markersize=10)<\/pre>\n<p>[1. 2.] 0<br \/>\n[&lt;matplotlib.lines.Line2D object at 0x0642AE10&gt;]<br \/>\n[5. 8.] 
1<br \/>\n[&lt;matplotlib.lines.Line2D object at 0x06438930&gt;]<br \/>\n[1.5 1.8] 0<br \/>\n[&lt;matplotlib.lines.Line2D object at 0x06438BF0&gt;]<br \/>\n[8. 8.] 1<br \/>\n[&lt;matplotlib.lines.Line2D object at 0x06438EB0&gt;]<br \/>\n[1. 0.6] 0<br \/>\n[&lt;matplotlib.lines.Line2D object at 0x06438FB0&gt;]<br \/>\n[ 9. 11.] 1<br \/>\n[&lt;matplotlib.lines.Line2D object at 0x043B1410&gt;]<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; plt.scatter(centroids[:,0],centroids[:,1],marker='x',s=150,linewidths=5,zorder=10)<\/pre>\n<p>&lt;matplotlib.collections.PathCollection object at 0x043B14D0&gt;<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; plt.show()<\/pre>\n<div id=\"attachment_24051\" style=\"width: 565px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/kmeans.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-24051\" class=\"wp-image-24051 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/kmeans.png\" alt=\"Machine Learning Algorithms in Python - You Must LEARN\" width=\"555\" height=\"426\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/kmeans.png 555w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/kmeans-150x115.png 150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/kmeans-300x230.png 300w\" sizes=\"auto, (max-width: 555px) 100vw, 555px\" \/><\/a><p id=\"caption-attachment-24051\" class=\"wp-caption-text\">Machine Learning Algorithms in Python &#8211; K-Means<\/p><\/div>\n<p><strong><a href=\"https:\/\/data-flair.training\/blogs\/python-descriptive-statistics\/\">Have a Look at Python Descriptive Statistics \u2013 Measuring Central Tendency<\/a><\/strong><\/p>\n<h4 class=\"western\">h. Random Forest in ML<\/h4>\n<p>A random forest is an ensemble of decision trees. 
To classify a new object based on its attributes, each tree provides a classification, that is, it votes for a class. The class with the most votes across the forest wins.<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; import numpy as np\r\n&gt;&gt;&gt; import pylab as pl\r\n&gt;&gt;&gt; x=np.random.uniform(1,100,1000) #1000 random points between 1 and 100\r\n&gt;&gt;&gt; y=np.log(x)+np.random.normal(0,.3,1000) #log(x) plus Gaussian noise\r\n&gt;&gt;&gt; pl.scatter(x,y,s=1,label='log(x) with noise')<\/pre>\n<p>&lt;matplotlib.collections.PathCollection object at 0x0434EC50&gt;<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; pl.plot(np.arange(1,100),np.log(np.arange(1,100)),c='b',label='log(x) true function')<\/pre>\n<p>[&lt;matplotlib.lines.Line2D object at 0x0434EB30&gt;]<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; pl.xlabel('x')<\/pre>\n<p>Text(0.5,0,&#8217;x&#8217;)<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; pl.ylabel('f(x)=log(x)')<\/pre>\n<p>Text(0,0.5,&#8217;f(x)=log(x)&#8217;)<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; pl.legend(loc='best')<\/pre>\n<p>&lt;matplotlib.legend.Legend object at 0x04386450&gt;<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; pl.title('A basic log function')<\/pre>\n<p>Text(0.5,1,&#8217;A basic log function&#8217;)<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; pl.show()<\/pre>\n<div id=\"attachment_24052\" style=\"width: 579px\" class=\"wp-caption aligncenter\"><a href=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/forest1.png\"><img loading=\"lazy\" decoding=\"async\" aria-describedby=\"caption-attachment-24052\" class=\"wp-image-24052 size-full\" src=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/forest1.png\" alt=\"Machine Learning Algorithms in Python \" width=\"569\" height=\"447\" srcset=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/forest1.png 569w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/forest1-150x118.png 
150w, https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/forest1-300x236.png 300w\" sizes=\"auto, (max-width: 569px) 100vw, 569px\" \/><\/a><p id=\"caption-attachment-24052\" class=\"wp-caption-text\">Machine Learning Algorithms in Python &#8211; Random Forest<\/p><\/div>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; from sklearn.datasets import load_iris\r\n&gt;&gt;&gt; from sklearn.ensemble import RandomForestClassifier\r\n&gt;&gt;&gt; import pandas as pd\r\n&gt;&gt;&gt; import numpy as np\r\n&gt;&gt;&gt; iris=load_iris()\r\n&gt;&gt;&gt; df=pd.DataFrame(iris.data,columns=iris.feature_names)\r\n&gt;&gt;&gt; df['is_train']=np.random.uniform(0,1,len(df))&lt;=.75\r\n&gt;&gt;&gt; df['species']=pd.Categorical.from_codes(iris.target,iris.target_names)\r\n&gt;&gt;&gt; df.head()<\/pre>\n<p>sepal length (cm) sepal width (cm) &#8230; is_train species<br \/>\n0\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a05.1\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 3.5 &#8230;\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 True setosa<br \/>\n1\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a04.9\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 3.0 &#8230;\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 True setosa<br \/>\n2\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a04.7\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 3.2 &#8230;\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 True setosa<br \/>\n3\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a04.6\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 3.1 &#8230;\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 True setosa<br \/>\n4\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a05.0\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 3.6 &#8230;\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0False setosa<br \/>\n[5 rows x 6 columns]<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; train,test=df[df['is_train']==True],df[df['is_train']==False]\r\n&gt;&gt;&gt; features=df.columns[:4]\r\n&gt;&gt;&gt; clf=RandomForestClassifier(n_jobs=2)\r\n&gt;&gt;&gt; y,_=pd.factorize(train['species'])\r\n&gt;&gt;&gt; 
clf.fit(train[features],y)<\/pre>\n<p>RandomForestClassifier(bootstrap=True, class_weight=None, criterion=&#8217;gini&#8217;,<br \/>\nmax_depth=None, max_features=&#8217;auto&#8217;, max_leaf_nodes=None,<br \/>\nmin_impurity_decrease=0.0, min_impurity_split=None,<br \/>\nmin_samples_leaf=1, min_samples_split=2,<br \/>\nmin_weight_fraction_leaf=0.0, n_estimators=10, n_jobs=2,<br \/>\noob_score=False, random_state=None, verbose=0,<br \/>\nwarm_start=False)<\/p>\n<pre class=\"EnlighterJSRAW\">&gt;&gt;&gt; preds=iris.target_names[clf.predict(test[features])]\r\n&gt;&gt;&gt; pd.crosstab(test['species'],preds,rownames=['actual'],colnames=['preds'])<\/pre>\n<p><strong>preds\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 setosa\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 versicolor\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0virginica<\/strong><br \/>\nactual<br \/>\nsetosa\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 12\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 0\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a00<br \/>\nversicolor\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a00\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 17\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a02<br \/>\nvirginica\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a00\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a01\u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 \u00a0 15<\/p>\n<p>Other important techniques in AI come from deep learning, which includes convolutional neural networks (CNNs) and recurrent neural networks (RNNs).<\/p>\n<p>These networks power many modern applications, including image and speech recognition, natural language processing, and others. 
Still, the conventional machine learning methods discussed here remain foundational; deep learning builds on them and extends what contemporary machine learning and AI systems can do.<\/p>\n<p>So, this was all about the Machine Learning Algorithms in Python tutorial. We hope you liked our explanation.<\/p>\n<h3 class=\"western\">Conclusion<\/h3>\n<p>Python has many ready-to-use machine learning algorithms. These are like recipes for solving problems. Some popular ones are Linear Regression, Decision Tree, Random Forest, Support Vector Machine (SVM), K-Nearest Neighbors (KNN), and Naive Bayes. Each algorithm works in a different way and fits different types of problems.<\/p>\n<p>For example, Linear Regression is good for predicting numbers, like sales or prices. Decision Trees are easy to understand and great for yes\/no problems. Random Forest combines many decision trees to reduce errors.<\/p>\n<p>SVM is strong when there is a clear gap between classes. KNN is simple and works by comparing a new data point with the labeled examples closest to it. Naive Bayes is often used for spam detection or sentiment analysis.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Previously, we discussed the techniques\u00a0of machine learning with Python. Going deeper, today, we will learn and implement 8 top Machine Learning Algorithms in Python. 
Think of these 8 algorithms as the only toolkit for&#46;&#46;&#46;<\/p>\n","protected":false},"author":5,"featured_media":24053,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[36,46],"tags":[3626,7839,8041,8287,8388,8994,10670,11304,13987],"class_list":["post-24036","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-machine-learning","category-python","tag-decision-tree","tag-k-means","tag-knn-k-nearest-neighbors","tag-linear-regression","tag-logistic-regression","tag-naive-bayes","tag-python-machine-learning-algorithm","tag-random-forest","tag-support-vector-machines-svm"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v28.0 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>8 Machine Learning Algorithms in Python - You Must Learn - DataFlair<\/title>\n<meta name=\"description\" content=\"Machine Learning Algorithms in Python - Linear regression,Logistic Regression,Decision Tree, Support Vector Machines,Naive Bayes, kNN,k-Means, Random Forest\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/data-flair.training\/blogs\/machine-learning-algorithms-in-python\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"8 Machine Learning Algorithms in Python - You Must Learn - DataFlair\" \/>\n<meta property=\"og:description\" content=\"Machine Learning Algorithms in Python - Linear regression,Logistic Regression,Decision Tree, Support Vector Machines,Naive Bayes, kNN,k-Means, Random Forest\" \/>\n<meta property=\"og:url\" content=\"https:\/\/data-flair.training\/blogs\/machine-learning-algorithms-in-python\/\" \/>\n<meta property=\"og:site_name\" content=\"DataFlair\" \/>\n<meta 
property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/DataFlairWS\/\" \/>\n<meta property=\"article:published_time\" content=\"2018-08-09T03:30:52+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2026-04-27T12:15:42+00:00\" \/>\n<meta property=\"og:image\" content=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/8-Machine-Learning-Algorithms-to-Learn-01-1.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"628\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"DataFlair Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:site\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"DataFlair Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"10 minutes\" \/>\n<!-- \/ Yoast SEO plugin. 
-->","yoast_head_json":{"title":"8 Machine Learning Algorithms in Python - You Must Learn - DataFlair","description":"Machine Learning Algorithms in Python - Linear regression,Logistic Regression,Decision Tree, Support Vector Machines,Naive Bayes, kNN,k-Means, Random Forest","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/data-flair.training\/blogs\/machine-learning-algorithms-in-python\/","og_locale":"en_US","og_type":"article","og_title":"8 Machine Learning Algorithms in Python - You Must Learn - DataFlair","og_description":"Machine Learning Algorithms in Python - Linear regression,Logistic Regression,Decision Tree, Support Vector Machines,Naive Bayes, kNN,k-Means, Random Forest","og_url":"https:\/\/data-flair.training\/blogs\/machine-learning-algorithms-in-python\/","og_site_name":"DataFlair","article_publisher":"https:\/\/www.facebook.com\/DataFlairWS\/","article_published_time":"2018-08-09T03:30:52+00:00","article_modified_time":"2026-04-27T12:15:42+00:00","og_image":[{"width":1200,"height":628,"url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/8-Machine-Learning-Algorithms-to-Learn-01-1.jpg","type":"image\/jpeg"}],"author":"DataFlair Team","twitter_card":"summary_large_image","twitter_creator":"@DataFlairWS","twitter_site":"@DataFlairWS","twitter_misc":{"Written by":"DataFlair Team","Est. 
reading time":"10 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/data-flair.training\/blogs\/machine-learning-algorithms-in-python\/#article","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/machine-learning-algorithms-in-python\/"},"author":{"name":"DataFlair Team","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/7f83c342f5d1632d6f7b4b0b0f447823"},"headline":"8 Machine Learning Algorithms in Python &#8211; You Must Learn","datePublished":"2018-08-09T03:30:52+00:00","dateModified":"2026-04-27T12:15:42+00:00","mainEntityOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/machine-learning-algorithms-in-python\/"},"wordCount":1832,"commentCount":3,"publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/machine-learning-algorithms-in-python\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/8-Machine-Learning-Algorithms-to-Learn-01-1.jpg","keywords":["Decision Tree","k-Means","kNN (k-Nearest Neighbors)","Linear Regression","Logistic Regression","Naive Bayes","Python Machine learning algorithm","Random Forest","Support Vector Machines (SVM)"],"articleSection":["Machine Learning Tutorials","Python Tutorials"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/data-flair.training\/blogs\/machine-learning-algorithms-in-python\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/data-flair.training\/blogs\/machine-learning-algorithms-in-python\/","url":"https:\/\/data-flair.training\/blogs\/machine-learning-algorithms-in-python\/","name":"8 Machine Learning Algorithms in Python - You Must Learn - 
DataFlair","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/#website"},"primaryImageOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/machine-learning-algorithms-in-python\/#primaryimage"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/machine-learning-algorithms-in-python\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/8-Machine-Learning-Algorithms-to-Learn-01-1.jpg","datePublished":"2018-08-09T03:30:52+00:00","dateModified":"2026-04-27T12:15:42+00:00","description":"Machine Learning Algorithms in Python - Linear regression,Logistic Regression,Decision Tree, Support Vector Machines,Naive Bayes, kNN,k-Means, Random Forest","breadcrumb":{"@id":"https:\/\/data-flair.training\/blogs\/machine-learning-algorithms-in-python\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/data-flair.training\/blogs\/machine-learning-algorithms-in-python\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/machine-learning-algorithms-in-python\/#primaryimage","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/8-Machine-Learning-Algorithms-to-Learn-01-1.jpg","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2018\/08\/8-Machine-Learning-Algorithms-to-Learn-01-1.jpg","width":1200,"height":628,"caption":"8 Python Machine Learning Algorithms - You Must LEARN"},{"@type":"BreadcrumbList","@id":"https:\/\/data-flair.training\/blogs\/machine-learning-algorithms-in-python\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Blog Home","item":"https:\/\/data-flair.training\/blogs\/"},{"@type":"ListItem","position":2,"name":"Machine Learning Tutorials","item":"https:\/\/data-flair.training\/blogs\/category\/machine-learning\/"},{"@type":"ListItem","position":3,"name":"8 Machine Learning Algorithms in Python &#8211; You Must 
Learn"}]},{"@type":"WebSite","@id":"https:\/\/data-flair.training\/blogs\/#website","url":"https:\/\/data-flair.training\/blogs\/","name":"DataFlair","description":"Learn Today. Lead Tomorrow.","publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/data-flair.training\/blogs\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/data-flair.training\/blogs\/#organization","name":"DataFlair","url":"https:\/\/data-flair.training\/blogs\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","width":106,"height":48,"caption":"DataFlair"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DataFlairWS\/","https:\/\/x.com\/DataFlairWS","https:\/\/www.linkedin.com\/company\/dataflair-web-services-pvt-ltd\/","https:\/\/www.youtube.com\/user\/DataFlairWS"]},{"@type":"Person","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/7f83c342f5d1632d6f7b4b0b0f447823","name":"DataFlair Team","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/4cf3a74600d131330b8c481d519afd1574093ed89f6d3396a95393ad223eb7cd?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/4cf3a74600d131330b8c481d519afd1574093ed89f6d3396a95393ad223eb7cd?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/4cf3a74600d131330b8c481d519afd1574093ed89f6d3396a95393ad223eb7cd?s=96&d=mm&r=g","caption":"DataFlair Team"},"description":"DataFlair 
Team creates expert-level guides on programming, Java, Python, C++, DSA, AI, ML, data Science, Android, Flutter, MERN, Web Development, and technology. Our goal is to empower learners with easy-to-understand content. Explore our resources for career growth and practical learning.","url":"https:\/\/data-flair.training\/blogs\/author\/dfteam1\/"}]}},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/24036","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/users\/5"}],"replies":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/comments?post=24036"}],"version-history":[{"count":10,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/24036\/revisions"}],"predecessor-version":[{"id":147961,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/24036\/revisions\/147961"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media\/24053"}],"wp:attachment":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media?parent=24036"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/categories?post=24036"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/tags?post=24036"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}