T-Tests in R – Uses and Types | Welch T-Test & One Sample T-Test
In this tutorial, we will learn what t-tests in R are, how to perform them, and what they are used for. We will also look at the various types of t-test in R, such as the one-sample t-test and Welch’s t-test.
So, let’s start the T-tests in R tutorial.
2. What are T-tests in R Programming?
The t-test is one of the most common tests in statistics. We use it to determine whether the means of two groups are equal to each other. The assumption for the test is that both groups are sampled from normal distributions with equal variances. The null hypothesis is that the two means are equal, and the alternative is that they are not. Under the null hypothesis, we can calculate a t-statistic that follows a t-distribution with n1 + n2 - 2 degrees of freedom. Welch’s t-test is a modification of the t-test that adjusts the number of degrees of freedom when the variances are thought not to be equal to each other.
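To make the description above concrete, here is a minimal sketch that computes the equal-variance t-statistic by hand and compares it with t.test(). The data are simulated, and the variable names (g1, g2, sp2, t_manual) as well as the group sizes, means, and standard deviations are assumptions chosen only for illustration.

```r
# Simulated data; sizes, means, and SDs are arbitrary assumptions.
set.seed(1)
g1 <- rnorm(20, mean = 10, sd = 2)
g2 <- rnorm(25, mean = 12, sd = 2)

n1 <- length(g1)
n2 <- length(g2)

# Pooled variance: weighted average of the two sample variances
sp2 <- ((n1 - 1) * var(g1) + (n2 - 1) * var(g2)) / (n1 + n2 - 2)

# t-statistic and degrees of freedom under the equal-variance assumption
t_manual  <- (mean(g1) - mean(g2)) / sqrt(sp2 * (1 / n1 + 1 / n2))
df_manual <- n1 + n2 - 2

# Two-sided p-value from the t-distribution
p_manual <- 2 * pt(-abs(t_manual), df = df_manual)

# Compare with the built-in test (both comparisons should return TRUE)
res <- t.test(g1, g2, var.equal = TRUE)
all.equal(t_manual, unname(res$statistic))
all.equal(p_manual, res$p.value)
```

The hand calculation and t.test() should agree because var.equal = TRUE requests the pooled-variance form of the test.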
We use t.test() which provides a variety of t-tests.
# independent 2-group t-test
t.test(y~x) # where y is numeric and x is a binary factor
# independent 2-group t-test
t.test(y1,y2) # where y1 and y2 are numeric
# paired t-test
t.test(y1,y2,paired=TRUE) # where y1 & y2 are numeric
# one sample t-test
t.test(y,mu=3) # Ho: mu=3
3. How to Perform T-tests in R?
We can use the var.equal = TRUE option to specify equal variances and a pooled variance estimate.
We can use the alternative = "less" or alternative = "greater" option to specify a one-tailed test.
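As a rough illustration of these options, the sketch below runs the same comparison as a two-sided test and as both one-tailed tests. The vectors a and b are hypothetical simulated groups, not data from this tutorial.

```r
set.seed(42)
a <- rnorm(30, mean = 5, sd = 1)  # hypothetical group a
b <- rnorm(30, mean = 6, sd = 1)  # hypothetical group b

# Default alternative: the two means differ
two_sided <- t.test(a, b, var.equal = TRUE)
# One-tailed: mean of a is less than mean of b
lower <- t.test(a, b, var.equal = TRUE, alternative = "less")
# One-tailed: mean of a is greater than mean of b
upper <- t.test(a, b, var.equal = TRUE, alternative = "greater")

# The two one-sided p-values sum to 1, and the two-sided
# p-value is twice the smaller of the two.
c(two_sided = two_sided$p.value,
  less      = lower$p.value,
  greater   = upper$p.value)
```

The direction of a one-tailed test should be chosen before looking at the data, since it changes the p-value.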
a. One-Sample T-Tests in R
In R, we use the syntax t.test(y, mu = 0) to conduct one-sample tests, where
y: is the name of our variable of interest and
mu: is set equal to the mean specified by the null hypothesis.
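If we wanted to test whether the volume of a shipment of lumber differed from the usual value (μ0 = 39000 cubic feet), we would run the two-sided test below. To test specifically whether the volume was less than usual, we would add alternative = "less" to the call.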
set.seed(0)
treeVolume <- c(rnorm(75, mean = 36500, sd = 2000))
t.test(treeVolume, mu = 39000) # Ho: mu = 39000

One Sample t-test
data: treeVolume
t = -12.2883, df = 74, p-value < 2.2e-16
alternative hypothesis: true mean is not equal to 39000
95 percent confidence interval:
36033.60 36861.38
sample estimates:
mean of x
36447.49
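The printed output is convenient to read, but the individual numbers can also be pulled out of the object that t.test() returns. The sketch below repeats the lumber example and stores the result in a variable named result, which is a name chosen here only for illustration.

```r
set.seed(0)
treeVolume <- rnorm(75, mean = 36500, sd = 2000)

# Store the test result instead of only printing it
result <- t.test(treeVolume, mu = 39000)

result$statistic  # t value
result$parameter  # degrees of freedom
result$p.value    # p-value as a plain number
result$conf.int   # confidence interval for the mean
result$estimate   # sample mean

# Example of using a component in code: compare against a 5% level
result$p.value < 0.05
```

Working with the components this way is useful when the test is one step in a longer script or report.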
b. Paired sample T-Tests in R
We need two vectors of data of equal length, y1 and y2, to conduct a paired-samples test. Then we run the test using the syntax t.test(y1, y2, paired=TRUE).
For instance, let’s say that we work at a large health clinic and test a new drug, Procardia, which is meant to reduce hypertension. We find 1000 individuals with a high systolic blood pressure (x̄ = 145 mmHg, SD = 9 mmHg), we give them Procardia for a month, and then measure their blood pressure again. We find that the mean systolic blood pressure has decreased to 138 mmHg with a standard deviation of 8 mmHg.
Here, we would conduct a t-test using:
set.seed(2820)
preTreat <- c(rnorm(1000, mean = 145, sd = 9))
postTreat <- c(rnorm(1000, mean = 138, sd = 8))
t.test(preTreat, postTreat, paired = TRUE)

Paired t-test
data: preTreat and postTreat
t = 19.7514, df = 999, p-value < 2.2e-16
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
6.703959 8.183011
sample estimates:
mean of the differences
7.443485
Again, we see that there is a statistically significant difference in means (t = 19.7514, p-value < 2.2e-16).
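A paired t-test is equivalent to a one-sample t-test on the within-pair differences, tested against a mean of zero. The sketch below checks this on the same simulated blood pressure data. The names paired_res and diff_res are illustrative, and note that the simulated before and after values are drawn independently, which is a simplification compared with real paired measurements.

```r
set.seed(2820)
preTreat  <- rnorm(1000, mean = 145, sd = 9)
postTreat <- rnorm(1000, mean = 138, sd = 8)

# Paired test on the two vectors
paired_res <- t.test(preTreat, postTreat, paired = TRUE)

# One-sample test on the differences, H0: mean difference = 0
diff_res <- t.test(preTreat - postTreat, mu = 0)

# Both comparisons should return TRUE
all.equal(unname(paired_res$statistic), unname(diff_res$statistic))
all.equal(paired_res$p.value, diff_res$p.value)
```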
c. Independent Samples
The independent-samples test can take one of three forms, depending on the structure of your data and the equality of their variances. The general form of the test is t.test(y1, y2, paired=FALSE). By default, R assumes that the variances of y1 and y2 are unequal, thus defaulting to Welch’s test. To toggle this, we use the flag var.equal=TRUE.
In the three examples shown here, we’ll test the hypothesis that Clevelanders and New Yorkers spend different amounts monthly eating out.
Independent-samples t-test where y1 and y2 are numeric:
set.seed(0)
ClevelandSpending <- rnorm(50, mean = 250, sd = 75)
NYSpending <- rnorm(50, mean = 300, sd = 80)
t.test(ClevelandSpending, NYSpending, var.equal = TRUE)

Two Sample t-test
data: ClevelandSpending and NYSpending
t = -3.6361, df = 98, p-value = 0.0004433
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-77.1608 -22.6745
sample estimates:
mean of x mean of y
251.7948 301.7125
Where y1 is numeric and y2 is binary:
spending <- c(ClevelandSpending, NYSpending)
city <- c(rep("Cleveland", 50), rep("New York", 50))
t.test(spending ~ city, var.equal = TRUE)

Two Sample t-test
data: spending by city
t = -3.6361, df = 98, p-value = 0.0004433
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-77.1608 -22.6745
sample estimates:
mean in group Cleveland mean in group New York
251.7948 301.7125
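In practice the values and the grouping variable often live in one data frame. The formula form of t.test() accepts a data argument for this case. In the sketch below, the data frame name spendingData and its column names are assumptions made for illustration.

```r
set.seed(0)
# One row per person: the amount spent and the city they live in
spendingData <- data.frame(
  spending = c(rnorm(50, mean = 250, sd = 75),
               rnorm(50, mean = 300, sd = 80)),
  city     = factor(rep(c("Cleveland", "New York"), each = 50))
)

# Formula interface: numeric response ~ two-level grouping factor
res_formula <- t.test(spending ~ city, data = spendingData, var.equal = TRUE)

res_formula$estimate  # group means, labelled by factor level
```

The grouping variable must have exactly two levels for this form of the test.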
With equal variances not assumed:
t.test(ClevelandSpending, NYSpending, var.equal = FALSE)

Welch Two Sample t-test
data: ClevelandSpending and NYSpending
t = -3.6361, df = 97.999, p-value = 0.0004433
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
-77.1608 -22.6745
sample estimates:
mean of x mean of y
251.7948 301.7125
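The fractional degrees of freedom in the Welch output come from the Welch-Satterthwaite approximation. The following sketch reproduces the statistic and the degrees of freedom by hand on the same simulated data. The helper names se1, se2, welch_t, and welch_df are illustrative.

```r
set.seed(0)
ClevelandSpending <- rnorm(50, mean = 250, sd = 75)
NYSpending        <- rnorm(50, mean = 300, sd = 80)

n1 <- length(ClevelandSpending)
n2 <- length(NYSpending)

# Squared standard error of each sample mean
se1 <- var(ClevelandSpending) / n1
se2 <- var(NYSpending) / n2

# Welch statistic: no pooled variance is used
welch_t <- (mean(ClevelandSpending) - mean(NYSpending)) / sqrt(se1 + se2)

# Welch-Satterthwaite approximation for the degrees of freedom
welch_df <- (se1 + se2)^2 / (se1^2 / (n1 - 1) + se2^2 / (n2 - 1))

# var.equal = FALSE is the default, so this is Welch's test
res <- t.test(ClevelandSpending, NYSpending)
all.equal(welch_t,  unname(res$statistic))
all.equal(welch_df, unname(res$parameter))
```

When the two samples have the same size and nearly equal variances, as here, the approximate degrees of freedom land very close to n1 + n2 - 2, which is why the Welch and pooled results look almost identical.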
In each case, we see that the results really don’t differ substantially: our simulated data show that in any case, New Yorkers spend more each month at restaurants than Clevelanders do. However, should you want to test for equality of variances in your data prior to running an independent-samples t-test, R offers an easy way to do so with the var.test() function:
var.test(ClevelandSpending, NYSpending)

F test to compare two variances
data: ClevelandSpending and NYSpending
F = 1.0047, num df = 49, denom df = 49, p-value = 0.9869
alternative hypothesis: true ratio of variances is not equal to 1
95 percent confidence interval:
0.5701676 1.7705463
sample estimates:
ratio of variances
1.004743
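The F statistic reported by var.test() is the ratio of the two sample variances. A short sketch on the same simulated data, with vt, f_manual, and p_manual as illustrative names:

```r
set.seed(0)
ClevelandSpending <- rnorm(50, mean = 250, sd = 75)
NYSpending        <- rnorm(50, mean = 300, sd = 80)

vt <- var.test(ClevelandSpending, NYSpending)

# F statistic by hand: ratio of the sample variances
f_manual <- var(ClevelandSpending) / var(NYSpending)

# Two-sided p-value from the F distribution with (n1 - 1, n2 - 1) df
p_lower  <- pf(f_manual, df1 = 49, df2 = 49)
p_manual <- 2 * min(p_lower, 1 - p_lower)

# Both comparisons should return TRUE
all.equal(f_manual, unname(vt$statistic))
all.equal(p_manual, vt$p.value)
```

A ratio close to 1 with a large p-value gives no evidence against equal variances, which is consistent with the pooled and Welch tests agreeing on these data.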
4. Uses of T-Tests in R
a. Why do we use a t-test in R?
A t-test is a statistical examination of two population means. The two-sample t-test is commonly used with small sample sizes to test the difference between the sample means when the variances of the two normal distributions are not known.
b. What is Welch’s t-test used for?
In statistics, Welch’s t-test is a two-sample location test. We use it to test the hypothesis that two populations have equal means. Welch’s t-test is an adaptation of Student’s t-test that is more reliable when the two samples have unequal variances and unequal sample sizes.
c. Why do we use a one-sample t-test?
We use it to test whether the mean of a single sample differs from a specified value.
d. Why do we use the t-test for research?
The t-test is one type of inferential statistic. We use it to determine whether there is a difference between the means of two groups. As with other parametric inferential statistics, we assume the dependent variable fits a normal distribution.
So, this was all about T-tests in R. We hope you liked our explanation.
5. Conclusion – T-Test in R
In this tutorial, we have learned about t-tests in R and how to perform the different types: the independent-samples t-test, the paired-sample t-test, and the one-sample t-test. We have also studied the uses of t-tests in R. If you have any queries regarding t-tests in R, ask in the comments.