R Data Reshaping – Reshape Function and Reshape Package


1. Objective

In this tutorial, we discuss what data reshaping in R is and how to reshape data in R. We also cover data frame concepts, because data reshaping in R depends entirely on data frames. Along the way, we look at the data frame operations that will help you learn the data reshaping concepts.


2. Introduction to R Data Reshaping

R data reshaping is about changing the way data is organized into rows and columns. Most data processing in R takes its input as a data frame. Extracting data from the rows and columns of a data frame is easy, but there are situations when we need the data frame in a format different from the one in which we received it. R has many functions to split, merge, and change rows to columns in a data frame.

3. Why Reshape R Package?

The data obtained from an experiment or study is generally not in the format that analytic functions expect. Typically, the data from a study has one or more columns that identify a row, followed by a number of columns that hold the measured values. The columns that identify the row can be thought of as the composite key of a database table.
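The layout described above (identifier columns followed by measured columns) can be converted from wide to long form with base R's reshape() function. The following is only a small sketch with made-up data: the data frame wide and the column names id, score.1, and score.2 are hypothetical and do not come from this tutorial's files.

```r
# Hypothetical wide data: one identifier column, two measured columns
wide <- data.frame(
  id      = c(1, 2),
  score.1 = c(5, 7),
  score.2 = c(6, 8)
)

# Stack the two measured columns into a single "score" column,
# recording which original column each value came from in "time"
long <- reshape(
  wide,
  direction = "long",
  varying   = c("score.1", "score.2"),
  v.names   = "score",
  timevar   = "time",
  times     = 1:2,
  idvar     = "id"
)

print(long)
```

Each row of the long result is identified by the combination of id and time, which plays the role of the composite key mentioned above.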

To understand this concept, knowledge of matrices is also necessary. To learn about matrices, you can follow the link below:

R Matrix and R Matrix Function

4. Joining Columns and Rows in a Data Frame

We can combine vectors, matrices, or data frames column-wise using the cbind() function. We can also combine two data frames row-wise using the rbind() function.

a. cbind()

The cbind() function combines vectors, matrices, or data frames by columns.

cbind(x1,x2,…)

x1,x2: vector, matrix, data frames

data1.csv:

Subtype   Gender   Expression

B         M        -0.54

A         F        -0.8

data2.csv:

Age   City

32    Indore

25    Mumbai

Read in the data from the file:

> x <- read.csv("data1.csv", header = TRUE, sep = ",")

> x2 <- read.csv("data2.csv", header = TRUE, sep = ",")

> x3 <- cbind(x, x2)

> x3

  Subtype Gender Expression Age   City

1       B      M      -0.54  32 Indore

2       A      F      -0.80  25 Mumbai

The two data sets must have the same number of rows.
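As a self-contained sketch of the cbind() step, the example below builds the two data sets in memory instead of reading them from files, so it runs without data1.csv and data2.csv. The values mirror the tables above; x_long is a hypothetical extra data frame added only to show what happens when the row counts do not match.

```r
# In-memory stand-ins for data1.csv and data2.csv
x <- data.frame(
  Subtype    = c("B", "A"),
  Gender     = c("M", "F"),
  Expression = c(-0.54, -0.8)
)
x2 <- data.frame(
  Age  = c(32, 25),
  City = c("Indore", "Mumbai")
)

# Combine by columns: the result keeps 2 rows and has 3 + 2 = 5 columns
x3 <- cbind(x, x2)
print(dim(x3))

# Hypothetical data frame with 3 rows: cbind() cannot line it up with x
x_long <- data.frame(Age = c(32, 25, 40))
mismatch <- tryCatch(
  cbind(x, x_long),
  error = function(e) conditionMessage(e)
)
print(mismatch)  # the error message raised by cbind()
```

Wrapping the failing call in tryCatch() keeps the script running and lets you inspect the error message instead of stopping at it.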

b. rbind()

The rbind() function combines vectors, matrices, or data frames by rows.

rbind(x1,x2,…)

x1,x2: vector, matrix, data frames

data1.csv:

Subtype   Gender   Expression

B         M        -0.54

A         F        -0.8

data2.csv:

Subtype   Gender   Expression

D         F        3.22

D         M        1.02

Read in the data from the file:

> x <- read.csv("data1.csv", header = TRUE, sep = ",")

> x2 <- read.csv("data2.csv", header = TRUE, sep = ",")

> x3 <- rbind(x, x2)

> x3

  Subtype Gender Expression

1       B      M      -0.54

2       A      F      -0.80

3       D      F       3.22

4       D      M       1.02

The two data sets must have the same columns. For data frames, rbind() matches columns by name and stops with an error if the names differ.
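The rbind() step can be sketched the same way with in-memory data frames in place of the CSV files. The values mirror the tables above; the data frame bad, with its renamed column Type, is a hypothetical example added only to illustrate the column requirement.

```r
# In-memory stand-ins for the two files, with identical column names
x <- data.frame(
  Subtype    = c("B", "A"),
  Gender     = c("M", "F"),
  Expression = c(-0.54, -0.8)
)
x2 <- data.frame(
  Subtype    = c("D", "D"),
  Gender     = c("F", "M"),
  Expression = c(3.22, 1.02)
)

# Combine by rows: 2 + 2 = 4 rows, same 3 columns
x3 <- rbind(x, x2)
print(x3)

# Hypothetical data frame whose first column is named differently
bad <- data.frame(Type = "D", Gender = "F", Expression = 3.22)
name_error <- tryCatch(
  rbind(x, bad),
  error = function(e) conditionMessage(e)
)
print(name_error)  # the error message raised by rbind()
```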

It’s also necessary to understand the concept of a vector function.

5. Merging Data Frames in R

The merge() function is used to merge two data frames. By default, the merge happens on the columns whose names the two data frames have in common.

Adding Columns

To merge two data frames (datasets) horizontally, use the merge() function. In most cases, we join two data frames by one or more common key variables (i.e., an inner join).

# merge two data frames by ID

total <- merge(dataframeA, dataframeB, by = "ID")

# merge two data frames by ID and Country

total <- merge(dataframeA, dataframeB, by = c("ID", "Country"))
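To make the two merge() calls concrete, here is a minimal runnable sketch. The contents of dataframeA and dataframeB are invented for illustration (the Score and Age columns and all values are assumptions), and the final call with all = TRUE is included only to contrast the default inner join with a full outer join.

```r
# Hypothetical data frames sharing the key columns ID and Country
dataframeA <- data.frame(
  ID      = c(1, 2, 3),
  Country = c("IN", "IN", "US"),
  Score   = c(10, 20, 30)
)
dataframeB <- data.frame(
  ID      = c(2, 3, 4),
  Country = c("IN", "US", "US"),
  Age     = c(25, 32, 41)
)

# Inner join on ID only: the non-key Country column appears twice,
# suffixed as Country.x and Country.y
total_id <- merge(dataframeA, dataframeB, by = "ID")
print(total_id)

# Inner join on ID and Country: both keys appear once in the result
total_both <- merge(dataframeA, dataframeB, by = c("ID", "Country"))
print(total_both)

# Full outer join on ID: unmatched rows are kept and filled with NA
total_all <- merge(dataframeA, dataframeB, by = "ID", all = TRUE)
print(total_all)
```

Only IDs 2 and 3 occur in both data frames, so the inner joins return two rows, while the outer join returns one row for each of the four distinct IDs.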

6. Conclusion

We have studied how to reshape data in R. We have also seen why the data frame matters here: data reshaping in R operates on data frames, so joining columns and rows with cbind() and rbind() and merging with merge() all build on it.
