RStudio Tutorial – How to Import & Transform Data?

1. RStudio Tutorial

RStudio is one of the most popular IDEs for working with the R language. In this tutorial, we will first look at what RStudio is and then walk through data import, data transformation, basic data analysis and plotting. We will also provide several screenshots to make each step clear.

Let us now start our RStudio tutorial.

R Studio | The Best RStudio Tutorial

2. What is RStudio?

RStudio is an open source integrated development environment (IDE) for R programming that supports statistical modeling and graphics. Its desktop GUI was historically built on the Qt framework. There are two versions of RStudio: RStudio Desktop and RStudio Server. RStudio Desktop runs as an application on the local machine, whereas RStudio Server is accessed through a web browser. Before we dive into the tutorial, let us first learn about its installation –

Read more about RStudio Installation Process

3. Basic Data Analysis Through R/RStudio

In this section, we will perform a basic data analysis using RStudio. We will cover the following steps:
  • Downloading/importing data in R
  • Transforming the data and other miscellaneous data operations
  • Computing summary statistics on the data
  • Creating graphical plots of the data

4. How to Import Data in RStudio?

For our analysis in RStudio, we will make use of the ACS (American Community Survey) dataset. We can import this data with the following command, typed in the console window.
> ACS_data <- read.csv(url("http://stat511.cwick.co.nz/homeworks/acs_or.csv"))
After RStudio executes this command, the contents of the CSV file are loaded into the ACS_data object as a data frame.
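
A quick way to check what read.csv() returned is to inspect the resulting object. The following is a minimal sketch that runs offline: it writes a tiny CSV to a temporary file instead of downloading the ACS data. The column names mirror the ones used in this tutorial, but the values and the name demo_data are invented for illustration.

```r
# Write a tiny CSV to a temporary file so the example runs without a network.
# Column names follow the tutorial; the values are made up.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("age_husband,age_wife,bedrooms",
             "35,33,3",
             "52,55,4",
             "41,38,2"), csv_path)

# read.csv() parses the file and returns a data frame.
demo_data <- read.csv(csv_path)

class(demo_data)    # "data.frame"
dim(demo_data)      # number of rows, then number of columns
str(demo_data)      # column names and their types
head(demo_data, 2)  # first two rows
```

The same inspection calls (class, dim, str, head) can be applied to ACS_data once it has been imported.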

5. How to Transform Data in RStudio?

After you have imported the data into a variable in RStudio, you can apply various transformations to manipulate it. Some techniques for accessing the data are as follows:
In order to access the column age_husband, we use –
> ACS_data$age_husband   #Author DataFlair

In order to access a single value by its row and column index (here, row 1 and column 2) –

> ACS_data[1,2]    # value at row 1, column 2
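
The different ways of indexing a data frame can be compared side by side. This is a small sketch on a made-up data frame (demo_data and its values are assumptions for illustration, not part of the ACS data):

```r
# Small, invented data frame with the same column names as the tutorial.
demo_data <- data.frame(
  age_husband = c(35, 52, 41),
  age_wife    = c(33, 55, 38)
)

demo_data$age_husband     # whole column as a numeric vector
demo_data[["age_wife"]]   # same idea, column selected by name
demo_data[1, 2]           # single value: row 1, column 2
demo_data[1, ]            # first row, returned as a data frame
demo_data[, 2]            # second column, returned as a vector
```

Leaving the row position empty selects all rows, and leaving the column position empty selects all columns.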

If you want to retrieve the rows for which age_husband is greater than age_wife, execute the following command –

> greater <- subset(ACS_data , age_husband > age_wife) #Author DataFlair
> head(greater)
In the above code, we use the function ‘subset’. Its first argument is the data frame to filter, and its second argument is the condition that each row must satisfy. Finally, head() displays the first 6 rows for which age_husband is greater than age_wife.
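
The same filtering can also be written with logical indexing. The sketch below compares both forms on an invented four-row data frame (demo_data, older_subset and older_indexed are hypothetical names used only here):

```r
# Invented data: two rows where the husband is older, one where the wife is
# older, and one tie.
demo_data <- data.frame(
  age_husband = c(35, 52, 41, 29),
  age_wife    = c(33, 55, 38, 29)
)

# Filtering with subset(): column names can be used directly in the condition.
older_subset <- subset(demo_data, age_husband > age_wife)

# Equivalent filtering with a logical vector in the row position.
older_indexed <- demo_data[demo_data$age_husband > demo_data$age_wife, ]

nrow(older_subset)                      # 2 rows: (35, 33) and (41, 38)
all.equal(older_subset, older_indexed)  # compares the two results
```

One difference to keep in mind: subset() drops rows where the condition evaluates to NA, whereas logical indexing returns such rows filled with NA values.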

We can perform various statistical operations as follows:

> mean(ACS_data$age_husband)    # mean of the column
> median(ACS_data$age_husband)  # median of the column
> quantile(ACS_data$age_wife)   # quantiles (0%, 25%, 50%, 75%, 100% by default)
> var(ACS_data$age_wife)        # variance
> sd(ACS_data$age_wife)         # standard deviation
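
These functions return NA as soon as a column contains a missing value. Whether the ACS columns contain missing values is not stated here, so the following is only a sketch on an invented vector (ages is a hypothetical name) showing how the na.rm argument handles that case:

```r
# Invented vector with one missing value.
ages <- c(30, 40, 50, NA)

mean(ages)                    # NA, because one value is missing
mean(ages, na.rm = TRUE)      # 40
median(ages, na.rm = TRUE)    # 40
var(ages, na.rm = TRUE)       # 100 (sample variance, divides by n - 1)
sd(ages, na.rm = TRUE)        # 10
quantile(ages, na.rm = TRUE)  # quantile() stops with an error on NA unless na.rm = TRUE
```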

For retrieving summary of the dataset, we use the summary() function –

> summary(ACS_data)   #Author DataFlair

6. Plotting Data in RStudio

R provides rich graphics features, and RStudio displays the resulting plots in its Plots pane. We can plot our data with the column ‘age_husband’ on the x-axis and the column ‘age_wife’ on the y-axis.
We can plot the scatterplot as follows:
> sub <- ACS_data[1:100, ] #Author DataFlair
> plot(x = sub$age_husband , y = sub$age_wife, type = 'p')

We first create a subset called ‘sub’ that contains only the first 100 rows, and then plot age_husband against age_wife for these rows. The argument type = 'p' draws the observations as points.

In order to draw a histogram, we make use of the following command –
> hist(ACS_data$number_children)
We can plot a barplot as follows:
> get_table <- table(ACS_data$bedrooms)
> barplot(get_table, main="Bedrooms Distribution", xlab="Bedroom Count")  #Author DataFlair
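
Plots can also be written to a file instead of the Plots pane. The sketch below draws a histogram and a barplot of an invented bedrooms vector into a temporary PDF; the data, the file path and the variable names are assumptions for illustration only.

```r
# Invented bedroom counts for eight households.
bedrooms <- c(2, 3, 3, 4, 2, 3, 5, 3)

# table() counts how often each distinct value occurs; barplot() draws one bar
# per distinct value, while hist() bins the raw values.
bedroom_counts <- table(bedrooms)

plot_path <- tempfile(fileext = ".pdf")
pdf(plot_path)   # open a PDF graphics device; each plot goes on its own page
hist(bedrooms, main = "Histogram of bedrooms", xlab = "Bedroom count")
barplot(bedroom_counts, main = "Bedrooms Distribution",
        xlab = "Bedroom Count", ylab = "Count")
dev.off()        # close the device so the file is written

file.exists(plot_path)
```

For a discrete variable such as a bedroom count, the barplot of a table is usually easier to read than a histogram, because each bar corresponds to exactly one value.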

7. Conclusion – RStudio Tutorial

In this tutorial, we learnt the basics of RStudio: how to import data, transform it, perform analysis on it and, finally, visualize it. We hope that this blog gave you a good understanding of RStudio. If you have any queries, please leave them in the comment section.

See also- R Hadoop Integration
