## 1. Objective

In this tutorial, we will learn what t-tests are and why we use them in R. We will also learn how to perform t-tests in R, look at their uses, and cover the main types of t-test in R, such as the one-sample t-test and Welch’s t-test.

## 2. Introduction

The t-test is one of the most common tests in statistics. We use it to determine whether the means of two groups are equal to each other. The classical (Student’s) version of the test assumes that both groups are sampled from normal distributions with equal variances. The null hypothesis is that the two means are equal, and the alternative is that they are not. Under the null hypothesis, we can calculate a t-statistic that follows a t-distribution with n1 + n2 - 2 degrees of freedom, where n1 and n2 are the two sample sizes. Welch’s t-test is a modification of the t-test that adjusts the number of degrees of freedom when the variances are not thought to be equal to each other.
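To make the pooled-variance statistic concrete, here is a minimal sketch that computes it by hand and compares it with t.test(). The data and the names group1, group2, pooled_var, t_manual, df_manual and p_manual are made up for illustration and are not part of the examples in this tutorial.

```r
# Simulated data, for illustration only
set.seed(1)
group1 <- rnorm(20, mean = 10, sd = 2)
group2 <- rnorm(25, mean = 11, sd = 2)

n1 <- length(group1)
n2 <- length(group2)

# Pooled variance: the two sample variances weighted by their degrees of freedom
pooled_var <- ((n1 - 1) * var(group1) + (n2 - 1) * var(group2)) / (n1 + n2 - 2)

# t-statistic and its degrees of freedom under the equal-variance assumption
t_manual <- (mean(group1) - mean(group2)) / sqrt(pooled_var * (1 / n1 + 1 / n2))
df_manual <- n1 + n2 - 2

# Two-sided p-value from the t-distribution
p_manual <- 2 * pt(-abs(t_manual), df = df_manual)

# Reference result from t.test() with equal variances assumed
result <- t.test(group1, group2, var.equal = TRUE)

# Both comparisons should report TRUE
all.equal(t_manual, unname(result$statistic))
all.equal(p_manual, result$p.value)
```

If the manual values agree with t.test(), the function is doing nothing more than the textbook calculation under the equal-variance assumption.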

In R, we use the t.test() function, which provides a variety of t-tests:

**# independent 2-group t-test**

t.test(y~x) # where y is numeric and x is a binary factor

**# independent 2-group t-test**

t.test(y1,y2) # where y1 and y2 are numeric

**# paired t-test**

t.test(y1,y2,paired=TRUE) # where y1 & y2 are numeric

**# one sample t-test**

t.test(y,mu=3) # Ho: mu=3

## 3. How to perform t-tests in R

By default, t.test() does not assume equal variances. We can use the var.equal = TRUE option to specify equal variances and a pooled variance estimate.

You can also use the alternative="less" or alternative="greater" option to specify a one-tailed test.
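As a quick sketch of how the alternative option changes the result, the code below runs the same one-sample test three ways. The vector y and the null value mu = 3 are hypothetical choices made for this illustration.

```r
# Simulated data, for illustration only
set.seed(42)
y <- rnorm(30, mean = 2.5, sd = 1)

# Default: two-sided test of H0: mu = 3
two_sided <- t.test(y, mu = 3)

# One-tailed tests: H1: mu < 3 and H1: mu > 3
lower_tail <- t.test(y, mu = 3, alternative = "less")
upper_tail <- t.test(y, mu = 3, alternative = "greater")

# The t-statistic is the same in all three calls; only the p-value
# and the confidence interval change
c(two_sided = two_sided$p.value,
  less = lower_tail$p.value,
  greater = upper_tail$p.value)

# A one-tailed test reports a one-sided confidence interval
lower_tail$conf.int
```

The two one-tailed p-values sum to 1, and the two-sided p-value is twice the smaller of them, so the direction of the alternative should be chosen before looking at the data.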

**a. One-Sample T-Tests in R**

In R, we use the syntax **t.test(y, mu = 0)** to conduct one-sample tests, where

y: is the name of our variable of interest and

mu: is set equal to the mean specified by the null hypothesis.

**For Example**:

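If we wanted to test whether the mean volume of a shipment of lumber differed from the usual value (μ0 = 39000 cubic feet), we would run the two-sided test below. To test specifically whether the volume was less than usual, we would add alternative = "less".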

set.seed(0)

treeVolume <- c(rnorm(75, mean = 36500, sd = 2000))

t.test(treeVolume, mu = 39000) # Ho: mu = 39000

**One Sample t-test**

**data**: treeVolume

t = -12.2883, df = 74, p-value < 2.2e-16

alternative hypothesis: true mean is not equal to 39000

95 percent confidence interval:

36033.60 36861.38

**sample estimates**:

mean of x

36447.49
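The printed summary is convenient to read, but the values can also be pulled out of the object that t.test() returns. The sketch below repeats the lumber example, stores the result in a variable (res, a name chosen here for illustration), and recomputes the t-statistic by hand.

```r
set.seed(0)
treeVolume <- rnorm(75, mean = 36500, sd = 2000)

# Store the result instead of only printing it
res <- t.test(treeVolume, mu = 39000)

res$statistic  # t-statistic
res$parameter  # degrees of freedom (n - 1)
res$p.value    # p-value
res$conf.int   # confidence interval for the mean
res$estimate   # sample mean

# The same t-statistic computed by hand: (sample mean - mu) / standard error
t_manual <- (mean(treeVolume) - 39000) / (sd(treeVolume) / sqrt(length(treeVolume)))
all.equal(t_manual, unname(res$statistic))
```

Accessing the components this way is useful when the results need to be reused in later calculations or reports.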

**b. Paired sample T-Tests in R**

We need two vectors of data of equal length, y1 and y2, to conduct a paired-samples test, where each pair of values comes from the same subject. We then run the test using the syntax **t.test(y1, y2, paired=TRUE)**.

For instance, let’s say that we work at a large health clinic and we’re testing a new drug, Procardia, that’s meant to reduce hypertension. We find 1000 individuals with a high systolic blood pressure (mean = 145 mmHg, SD = 9 mmHg), we give them Procardia for a month, and then measure their blood pressure again. We find that the mean systolic blood pressure has decreased to 138 mmHg with a standard deviation of 8 mmHg.

**Here, we would conduct a t-test using**:

set.seed(2820)

preTreat <- c(rnorm(1000, mean = 145, sd = 9))

postTreat <- c(rnorm(1000, mean = 138, sd = 8))

t.test(preTreat, postTreat, paired = TRUE)

Paired t-test

**data**: preTreat and postTreat

t = 19.7514, df = 999, p-value < 2.2e-16

alternative hypothesis: true difference in means is not equal to 0

**95 percent confidence interval**:

6.703959 8.183011

**sample estimates**:

mean of the differences

7.443485

Again, we see that there is a statistically significant difference in means (t = 19.7514, p-value < 2.2e-16).
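A paired t-test is equivalent to a one-sample t-test on the within-pair differences, which the sketch below checks on the same simulated data. The names paired_res and diff_res are chosen here for illustration. Note that preTreat and postTreat are generated as independent draws, so this simulation does not reproduce the within-subject correlation that real before-and-after measurements would usually show.

```r
set.seed(2820)
preTreat <- rnorm(1000, mean = 145, sd = 9)
postTreat <- rnorm(1000, mean = 138, sd = 8)

# Paired test on the two vectors
paired_res <- t.test(preTreat, postTreat, paired = TRUE)

# One-sample test on the differences, H0: mean difference = 0
diff_res <- t.test(preTreat - postTreat, mu = 0)

# The statistic, degrees of freedom and p-value should match
all.equal(unname(paired_res$statistic), unname(diff_res$statistic))
all.equal(unname(paired_res$parameter), unname(diff_res$parameter))
all.equal(paired_res$p.value, diff_res$p.value)
```

This is why the degrees of freedom are n - 1 = 999: the test works on 1000 differences, not on 2000 separate observations.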

**c. Independent Samples**

The independent-samples test can take one of three forms, depending on the structure of your data and the equality of their variances. The general form of the test is t.test(y1, y2, paired=FALSE). By default, R assumes that the variances of y1 and y2 are unequal, thus defaulting to Welch’s test. To assume equal variances instead and use the pooled estimate, we set the flag var.equal=TRUE.

In the three examples shown here, we’ll test the hypothesis that Clevelanders and New Yorkers spend different amounts monthly eating out.

**Independent-samples t-test where y1 and y2 are numeric**:

set.seed(0)

ClevelandSpending <- rnorm(50, mean = 250, sd = 75)

NYSpending <- rnorm(50, mean = 300, sd = 80)

t.test(ClevelandSpending, NYSpending, var.equal = TRUE)

Two Sample t-test

**data**: ClevelandSpending and NYSpending

t = -3.6361, df = 98, p-value = 0.0004433

**alternative hypothesis**: true difference in means is not equal to 0

95 percent confidence interval:

-77.1608 -22.6745

**sample estimates**:

mean of x mean of y

251.7948 301.7125

**Where the response is numeric and the grouping variable is binary**:

spending <- c(ClevelandSpending, NYSpending)

city <- c(rep("Cleveland", 50), rep("New York", 50))

t.test(spending ~ city, var.equal = TRUE)

Two Sample t-test

**data**: spending by city

t = -3.6361, df = 98, p-value = 0.0004433

**alternative hypothesis**: true difference in means is not equal to 0

**95 percent confidence interval**:

-77.1608 -22.6745

**sample estimates**:

mean in group Cleveland mean in group New York

251.7948 301.7125

**With equal variances not assumed**:

t.test(ClevelandSpending, NYSpending, var.equal = FALSE)

Welch Two Sample t-test

**data**: ClevelandSpending and NYSpending

t = -3.6361, df = 97.999, p-value = 0.0004433

**alternative hypothesis**: true difference in means is not equal to 0

95 percent confidence interval:

-77.1608 -22.6745

**sample estimates**:

mean of x mean of y

251.7948 301.7125
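The Welch degrees of freedom are usually not a whole number because they come from the Welch-Satterthwaite approximation. The sketch below computes them by hand on a deliberately unbalanced example. The data and the names sample_a, sample_b, se2_a, se2_b, welch_df and welch_t are assumptions made for this illustration.

```r
# Simulated data with unequal sizes and unequal spreads, for illustration only
set.seed(123)
sample_a <- rnorm(15, mean = 50, sd = 5)
sample_b <- rnorm(40, mean = 55, sd = 15)

# Squared standard error contributed by each sample
se2_a <- var(sample_a) / length(sample_a)
se2_b <- var(sample_b) / length(sample_b)

# Welch-Satterthwaite approximation to the degrees of freedom
welch_df <- (se2_a + se2_b)^2 /
  (se2_a^2 / (length(sample_a) - 1) + se2_b^2 / (length(sample_b) - 1))

# Welch t-statistic: no pooling of the variances
welch_t <- (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)

# t.test() uses Welch's test by default (var.equal = FALSE)
welch_res <- t.test(sample_a, sample_b)

all.equal(welch_df, unname(welch_res$parameter))
all.equal(welch_t, unname(welch_res$statistic))
```

The approximated degrees of freedom never exceed n1 + n2 - 2, and they get close to that value when the two samples have similar sizes and variances, as in the Cleveland and New York example.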

In each case, we see that the results really don’t differ substantially: our simulated data show that in any case, New Yorkers spend more each month at restaurants than Clevelanders do. However, should you want to test for equality of variances in your data prior to running an independent-samples t-test, R offers an easy way to do so with the var.test() function:

var.test(ClevelandSpending, NYSpending)

F test to compare two variances

**data**: ClevelandSpending and NYSpending

F = 1.0047, num df = 49, denom df = 49, p-value = 0.9869

alternative hypothesis: true ratio of variances is not equal to 1

**95 percent confidence interval**:

0.5701676 1.7705463

**sample estimates**:

ratio of variances

1.004743
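The F statistic reported by var.test() is simply the ratio of the two sample variances. A minimal sketch that checks this by hand, using the same simulated spending data (the names f_res, f_manual, p_lower and p_manual are chosen here for illustration):

```r
set.seed(0)
ClevelandSpending <- rnorm(50, mean = 250, sd = 75)
NYSpending <- rnorm(50, mean = 300, sd = 80)

f_res <- var.test(ClevelandSpending, NYSpending)

# F statistic: ratio of the sample variances
f_manual <- var(ClevelandSpending) / var(NYSpending)

# Two-sided p-value from the F distribution with (n1 - 1, n2 - 1) degrees of freedom
p_lower <- pf(f_manual,
              df1 = length(ClevelandSpending) - 1,
              df2 = length(NYSpending) - 1)
p_manual <- 2 * min(p_lower, 1 - p_lower)

all.equal(f_manual, unname(f_res$statistic))
all.equal(p_manual, f_res$p.value)
```

A large p-value here means the data give no evidence against equal variances. It does not prove that the variances are equal.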

## 4. Uses of T-Tests

**a. What is t-test in R used for?**

A t-test uses a statistical examination to compare the means of two populations. The two-sample t-test is commonly used with small sample sizes, to test the difference between the samples when the variances of the two normal distributions are not known.

**b. What is Welch’s t-test used for?**

Welch’s t-test is a two-sample location test. We use it to test the hypothesis that two populations have equal means. It is an adaptation of Student’s t-test that is more reliable when the two samples have unequal variances and unequal sample sizes.

**c. What is a one sample t-test used for?**

We use it to test whether the mean of a single sample differs from a specified value.

**d. Why do we use the t-test for research?**

The t-test is one type of inferential statistic. We use it to determine whether there is a difference between the means of two groups. As with other parametric tests, we assume the dependent variable fits a normal distribution.
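One way to examine the normality assumption before running a t-test is the Shapiro-Wilk test, available in base R as shapiro.test(). This is a sketch of one possible check, not a required step. The two samples and the names normal_check and skewed_check are made up for illustration.

```r
# Simulated data, for illustration only
set.seed(7)
normal_sample <- rnorm(40, mean = 100, sd = 15)
skewed_sample <- rexp(40, rate = 1 / 100)

# H0 of the Shapiro-Wilk test: the sample comes from a normal distribution
normal_check <- shapiro.test(normal_sample)
skewed_check <- shapiro.test(skewed_sample)

# A small p-value suggests a departure from normality
c(normal = normal_check$p.value, skewed = skewed_check$p.value)
```

With large samples the t-test is fairly robust to moderate departures from normality, so a check like this matters most when the samples are small.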

## 5. Conclusion

We have learned about the t-test in depth. Along with this, we have learned how to perform different t-tests in R and studied the uses of the t-test.