

{"id":78563,"date":"2020-09-12T09:00:44","date_gmt":"2020-09-12T03:30:44","guid":{"rendered":"https:\/\/data-flair.training\/blogs\/?p=78563"},"modified":"2021-05-09T13:13:23","modified_gmt":"2021-05-09T07:43:23","slug":"scipy-statistical-functions","status":"publish","type":"post","link":"https:\/\/data-flair.training\/blogs\/scipy-statistical-functions\/","title":{"rendered":"SciPy Stats &#8211; Statistical Functions in SciPy"},"content":{"rendered":"<p>The SciPy library includes a sub-package for statistical functions, scipy.stats. It is mainly used for probability distributions and statistical operations, and it offers a wide range of probability functions. The statistical functionality keeps expanding because the library is open-source.<\/p>\n<p>We have functions for both continuous and discrete variables and can work with different types of distributions, such as the binomial, uniform, and normal distributions. We can also perform the T-test and determine the T-score. Let us learn more about SciPy Stats.<\/p>\n<h2>SciPy Stats<\/h2>\n<p>It consists of a large number of probability distributions and statistical functions. We can display all the available functions using the info(stats) command. We can also display a list of random variables from the docstring of the package.<\/p>\n<p>SciPy Stats consists of the following three classes:<\/p>\n<h3>1. rv_continuous<\/h3>\n<p>It is a generic base class from which we can construct specific distribution sub-classes and instances for continuous random variables.<\/p>\n<h3>2. rv_discrete<\/h3>\n<p>It is a generic base class from which we can construct specific distribution sub-classes and instances for discrete random variables.<\/p>\n<h3>3. rv_histogram<\/h3>\n<p>It generates a distribution from a given histogram. It inherits from the rv_continuous class.<\/p>\n<p>There are ready-made distribution functions in SciPy that we can import and use directly. 
These functions inherit the properties of one of the classes available in the package. We generally use <strong>rv_continuous<\/strong> and <strong>rv_discrete<\/strong> to implement the two kinds of distributions.<\/p>\n<h2>Normal Continuous Random Distribution in SciPy<\/h2>\n<p>In this type of probability distribution, the variable can take any real value. Hence, it is known as a continuous random variable.<\/p>\n<p>Here we import norm, which is an instance of a subclass of rv_continuous. It provides the methods and details needed to work with the normal distribution.<\/p>\n<p>We use norm to calculate the CDF (cumulative distribution function) on an array.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">from scipy.stats import norm\r\nimport numpy as np\r\n# CDF of the standard normal distribution at each value of a\r\na = np.array([2, -1, 4, 1, 3, 0])\r\nprint(norm.cdf(a))\r\n<\/pre>\n<p><strong>Output<\/strong><\/p>\n<div class=\"code-output\">[0.97724987 0.15865525 0.99996833 0.84134475 0.9986501 0.5 ]<\/div>\n<p>The Percent Point Function (PPF) is the inverse of the CDF, so applying it to the CDF values above recovers the original array. We can also use it to find the median of the distribution: norm.ppf(0.5) returns 0, the median of the standard normal distribution.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">from scipy.stats import norm\r\nimport numpy as np\r\n# PPF maps each probability back to the corresponding value\r\na = np.array([0.97724987, 0.15865525, 0.99996833, 0.84134475, 0.9986501, 0.5])\r\nprint(norm.ppf(a))\r\n<\/pre>\n<p><strong>Output<\/strong><\/p>\n<div class=\"code-output\">[ 2.00000004 -1.00000002 4.00000928 1.00000002 2.99999956 0. ]<\/div>\n<h2>Uniform Distribution in SciPy<\/h2>\n<p>Similarly, we can work with a uniform distribution. We need to import the uniform function and then compute the CDF of the array.<\/p>\n<p>We can adjust the distribution with the scale and loc keywords. 
For the uniform distribution, loc is the lower bound and scale is the width of the interval, so the distribution covers [loc, loc + scale].<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">from scipy.stats import uniform\r\nimport numpy as np\r\n# loc=5 and scale=3 give a uniform distribution on [5, 8]\r\na = np.array([9, 8, 7, 3, 2])\r\nprint(uniform.cdf(a, loc=5, scale=3))\r\n<\/pre>\n<p><strong>Output<\/strong><\/p>\n<div class=\"code-output\">[1. 1. 0.66666667 0. 0. ]<\/div>\n<h2>Binomial Distribution in SciPy<\/h2>\n<p>We can work with a binomial distribution by importing binom, an instance of an rv_discrete subclass. It provides the methods and details of that class. Here n is the number of trials and p is the probability of success, which must lie between 0 and 1. Every value in the array below is at least n, so the CDF is 1 everywhere.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">from scipy.stats import binom\r\nimport numpy as np\r\n# n trials, each with success probability p (0 &lt;= p &lt;= 1)\r\na = np.array([9, 8, 7, 3, 2])\r\nprint(binom.cdf(a, n=2, p=0.5))\r\n<\/pre>\n<p><strong>Output<\/strong><\/p>\n<div class=\"code-output\">[1. 1. 1. 1. 1.]<\/div>\n<h2>SciPy Descriptive Statistics<\/h2>\n<p>We use descriptive statistical functions to summarize data. These functions compute values such as the minimum, maximum, and mean of the input NumPy arrays. Some of the functions in stats are:<\/p>\n<ul>\n<li><strong>describe()-<\/strong> it returns descriptive stats of the arrays<\/li>\n<li><strong>gmean()-<\/strong> it returns the geometric mean along a specific axis of an array<\/li>\n<li><strong>hmean()-<\/strong> it returns the harmonic mean along a specific axis of an array<\/li>\n<li><strong>sem()-<\/strong> it returns the standard error of the mean<\/li>\n<li><strong>kurtosis()-<\/strong> it returns the kurtosis value of an array<\/li>\n<li><strong>mode()-<\/strong> it returns the mode of an array<\/li>\n<li><strong>skew()-<\/strong> it returns the skewness of an array<\/li>\n<li><strong>zscore()-<\/strong> it returns the z-score relative to the mean and standard deviation values.<\/li>\n<\/ul>\n<h2>T-Test in SciPy<\/h2>\n<p>We perform the T-test to evaluate whether the mean (average) values of two data sets, or of one data set and a reference value, differ. 
A T-test statistic that is large in magnitude, together with a small p-value, indicates a significant difference.<\/p>\n<h2>T-score<\/h2>\n<p>The T-score is a relative measure: it is the ratio of the difference between the means to the variation within the data. The smaller its magnitude, the more similar the data sets are, and vice versa.<\/p>\n<p>The two data sets for comparison can be of any type. The two arrays can even follow dissimilar distribution patterns.<\/p>\n<p>The example below draws a 10 x 5 array of normal samples with mean 2 and uses ttest_1samp to test each column against a hypothesized mean of 2.0. Because the samples are random, the output differs on every run.<\/p>\n<pre class=\"EnlighterJSRAW\" data-enlighter-language=\"null\">from scipy import stats\r\na = stats.norm.rvs(loc=2, scale=1, size=(10, 5))\r\nprint(stats.ttest_1samp(a, 2.0))\r\n<\/pre>\n<p><strong>Output<\/strong><\/p>\n<div class=\"code-output\">Ttest_1sampResult(statistic=array([-0.82238541, 0.86996127, -0.62452709, -0.40478003, 1.41334689]), pvalue=array([0.43210532, 0.40692533, 0.54778478, 0.69509088, 0.19119386]))<\/div>\n<h2>Summary<\/h2>\n<p>The stats module is a very important feature of SciPy. It is useful for working with probability distributions. SciPy Stats can generate discrete or continuous random numbers. It also includes many other functions that compute descriptive statistical values.<\/p>\n<p>We can work with both continuous and discrete random variables. We have functions for working with various types of distributions. Also, we can perform the T-test on the data to compare mean values. We have descriptive statistics for in-depth operations.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>The SciPy library consists of a package for statistical functions. The scipy.stats is the SciPy sub-package. It is mainly used for probabilistic distributions and statistical operations. 
There is a wide range of probability functions.&#46;&#46;&#46;<\/p>\n","protected":false},"author":6,"featured_media":82085,"comment_status":"open","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[22401],"tags":[23245,23247,23246],"class_list":["post-78563","post","type-post","status-publish","format-standard","has-post-thumbnail","hentry","category-numpy","tag-scipy-stats","tag-statistical-functions-in-scipy","tag-statistics-in-scipy"],"yoast_head":"<!-- This site is optimized with the Yoast SEO plugin v27.8 - https:\/\/yoast.com\/product\/yoast-seo-wordpress\/ -->\n<title>SciPy Stats - Statistical Functions in SciPy - DataFlair<\/title>\n<meta name=\"description\" content=\"Learn about SciPy Stats - This module contains a large number of probability distributions as well as a growing library of statistical functions.\" \/>\n<meta name=\"robots\" content=\"index, follow, max-snippet:-1, max-image-preview:large, max-video-preview:-1\" \/>\n<link rel=\"canonical\" href=\"https:\/\/data-flair.training\/blogs\/scipy-statistical-functions\/\" \/>\n<meta property=\"og:locale\" content=\"en_US\" \/>\n<meta property=\"og:type\" content=\"article\" \/>\n<meta property=\"og:title\" content=\"SciPy Stats - Statistical Functions in SciPy - DataFlair\" \/>\n<meta property=\"og:description\" content=\"Learn about SciPy Stats - This module contains a large number of probability distributions as well as a growing library of statistical functions.\" \/>\n<meta property=\"og:url\" content=\"https:\/\/data-flair.training\/blogs\/scipy-statistical-functions\/\" \/>\n<meta property=\"og:site_name\" content=\"DataFlair\" \/>\n<meta property=\"article:publisher\" content=\"https:\/\/www.facebook.com\/DataFlairWS\/\" \/>\n<meta property=\"article:published_time\" content=\"2020-09-12T03:30:44+00:00\" \/>\n<meta property=\"article:modified_time\" content=\"2021-05-09T07:43:23+00:00\" \/>\n<meta property=\"og:image\" 
content=\"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2020\/09\/SciPy-Statstics.jpg\" \/>\n\t<meta property=\"og:image:width\" content=\"1200\" \/>\n\t<meta property=\"og:image:height\" content=\"628\" \/>\n\t<meta property=\"og:image:type\" content=\"image\/jpeg\" \/>\n<meta name=\"author\" content=\"DataFlair Team\" \/>\n<meta name=\"twitter:card\" content=\"summary_large_image\" \/>\n<meta name=\"twitter:creator\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:site\" content=\"@DataFlairWS\" \/>\n<meta name=\"twitter:label1\" content=\"Written by\" \/>\n\t<meta name=\"twitter:data1\" content=\"DataFlair Team\" \/>\n\t<meta name=\"twitter:label2\" content=\"Est. reading time\" \/>\n\t<meta name=\"twitter:data2\" content=\"4 minutes\" \/>\n<!-- \/ Yoast SEO plugin. -->","yoast_head_json":{"title":"SciPy Stats - Statistical Functions in SciPy - DataFlair","description":"Learn about SciPy Stats - This module contains a large number of probability distributions as well as a growing library of statistical functions.","robots":{"index":"index","follow":"follow","max-snippet":"max-snippet:-1","max-image-preview":"max-image-preview:large","max-video-preview":"max-video-preview:-1"},"canonical":"https:\/\/data-flair.training\/blogs\/scipy-statistical-functions\/","og_locale":"en_US","og_type":"article","og_title":"SciPy Stats - Statistical Functions in SciPy - DataFlair","og_description":"Learn about SciPy Stats - This module contains a large number of probability distributions as well as a growing library of statistical 
functions.","og_url":"https:\/\/data-flair.training\/blogs\/scipy-statistical-functions\/","og_site_name":"DataFlair","article_publisher":"https:\/\/www.facebook.com\/DataFlairWS\/","article_published_time":"2020-09-12T03:30:44+00:00","article_modified_time":"2021-05-09T07:43:23+00:00","og_image":[{"width":1200,"height":628,"url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2020\/09\/SciPy-Statstics.jpg","type":"image\/jpeg"}],"author":"DataFlair Team","twitter_card":"summary_large_image","twitter_creator":"@DataFlairWS","twitter_site":"@DataFlairWS","twitter_misc":{"Written by":"DataFlair Team","Est. reading time":"4 minutes"},"schema":{"@context":"https:\/\/schema.org","@graph":[{"@type":"Article","@id":"https:\/\/data-flair.training\/blogs\/scipy-statistical-functions\/#article","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/scipy-statistical-functions\/"},"author":{"name":"DataFlair Team","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/2c58ecb4f73a39f0ef993f1ddfcd7b89"},"headline":"SciPy Stats &#8211; Statistical Functions in SciPy","datePublished":"2020-09-12T03:30:44+00:00","dateModified":"2021-05-09T07:43:23+00:00","mainEntityOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/scipy-statistical-functions\/"},"wordCount":725,"commentCount":0,"publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/scipy-statistical-functions\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2020\/09\/SciPy-Statstics.jpg","keywords":["scipy stats","Statistical Functions in SciPy","Statistics in SciPy"],"articleSection":["NumPy 
Tutorials"],"inLanguage":"en-US","potentialAction":[{"@type":"CommentAction","name":"Comment","target":["https:\/\/data-flair.training\/blogs\/scipy-statistical-functions\/#respond"]}]},{"@type":"WebPage","@id":"https:\/\/data-flair.training\/blogs\/scipy-statistical-functions\/","url":"https:\/\/data-flair.training\/blogs\/scipy-statistical-functions\/","name":"SciPy Stats - Statistical Functions in SciPy - DataFlair","isPartOf":{"@id":"https:\/\/data-flair.training\/blogs\/#website"},"primaryImageOfPage":{"@id":"https:\/\/data-flair.training\/blogs\/scipy-statistical-functions\/#primaryimage"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/scipy-statistical-functions\/#primaryimage"},"thumbnailUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2020\/09\/SciPy-Statstics.jpg","datePublished":"2020-09-12T03:30:44+00:00","dateModified":"2021-05-09T07:43:23+00:00","description":"Learn about SciPy Stats - This module contains a large number of probability distributions as well as a growing library of statistical functions.","breadcrumb":{"@id":"https:\/\/data-flair.training\/blogs\/scipy-statistical-functions\/#breadcrumb"},"inLanguage":"en-US","potentialAction":[{"@type":"ReadAction","target":["https:\/\/data-flair.training\/blogs\/scipy-statistical-functions\/"]}]},{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/scipy-statistical-functions\/#primaryimage","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2020\/09\/SciPy-Statstics.jpg","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2020\/09\/SciPy-Statstics.jpg","width":1200,"height":628,"caption":"SciPy Stats"},{"@type":"BreadcrumbList","@id":"https:\/\/data-flair.training\/blogs\/scipy-statistical-functions\/#breadcrumb","itemListElement":[{"@type":"ListItem","position":1,"name":"Blog Home","item":"https:\/\/data-flair.training\/blogs\/"},{"@type":"ListItem","position":2,"name":"NumPy 
Tutorials","item":"https:\/\/data-flair.training\/blogs\/category\/numpy\/"},{"@type":"ListItem","position":3,"name":"SciPy Stats &#8211; Statistical Functions in SciPy"}]},{"@type":"WebSite","@id":"https:\/\/data-flair.training\/blogs\/#website","url":"https:\/\/data-flair.training\/blogs\/","name":"DataFlair","description":"Learn Today. Lead Tomorrow.","publisher":{"@id":"https:\/\/data-flair.training\/blogs\/#organization"},"potentialAction":[{"@type":"SearchAction","target":{"@type":"EntryPoint","urlTemplate":"https:\/\/data-flair.training\/blogs\/?s={search_term_string}"},"query-input":{"@type":"PropertyValueSpecification","valueRequired":true,"valueName":"search_term_string"}}],"inLanguage":"en-US"},{"@type":"Organization","@id":"https:\/\/data-flair.training\/blogs\/#organization","name":"DataFlair","url":"https:\/\/data-flair.training\/blogs\/","logo":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/","url":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","contentUrl":"https:\/\/data-flair.training\/blogs\/wp-content\/uploads\/sites\/2\/2016\/07\/Data-Flair.png","width":106,"height":48,"caption":"DataFlair"},"image":{"@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/logo\/image\/"},"sameAs":["https:\/\/www.facebook.com\/DataFlairWS\/","https:\/\/x.com\/DataFlairWS","https:\/\/www.linkedin.com\/company\/dataflair-web-services-pvt-ltd\/","https:\/\/www.youtube.com\/user\/DataFlairWS"]},{"@type":"Person","@id":"https:\/\/data-flair.training\/blogs\/#\/schema\/person\/2c58ecb4f73a39f0ef993f1ddfcd7b89","name":"DataFlair 
Team","image":{"@type":"ImageObject","inLanguage":"en-US","@id":"https:\/\/secure.gravatar.com\/avatar\/1ce4a0e3e542444fc73bbebf83e89e8b73e2d95ccb1fcee64da9945f078b97c5?s=96&d=mm&r=g","url":"https:\/\/secure.gravatar.com\/avatar\/1ce4a0e3e542444fc73bbebf83e89e8b73e2d95ccb1fcee64da9945f078b97c5?s=96&d=mm&r=g","contentUrl":"https:\/\/secure.gravatar.com\/avatar\/1ce4a0e3e542444fc73bbebf83e89e8b73e2d95ccb1fcee64da9945f078b97c5?s=96&d=mm&r=g","caption":"DataFlair Team"},"description":"The DataFlair Team provides industry-driven content on programming, Java, Python, C++, DSA, AI, ML, data Science, Android, Flutter, MERN, Web Development, and technology. Our expert educators focus on delivering value-packed, easy-to-follow resources for tech enthusiasts and professionals.","url":"https:\/\/data-flair.training\/blogs\/author\/dfteam2\/"}]}},"amp_enabled":true,"_links":{"self":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/78563","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/users\/6"}],"replies":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/comments?post=78563"}],"version-history":[{"count":5,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/78563\/revisions"}],"predecessor-version":[{"id":93194,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/posts\/78563\/revisions\/93194"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media\/82085"}],"wp:attachment":[{"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/media?parent=78563"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/categories?post=78563"
},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/data-flair.training\/blogs\/wp-json\/wp\/v2\/tags?post=78563"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}