RStudio Tutorial – Importing and Transforming Data 2018

1. RStudio Tutorial

In this RStudio tutorial, we look at what RStudio is and then walk through a basic data analysis in R using RStudio: importing data, transforming it, and plotting it. Each step is explained in order and accompanied by screenshots.

So, let's begin the RStudio tutorial.

R Studio | The Best RStudio Tutorial of 2018

2. What is R Studio?

RStudio is an integrated development environment (IDE) for the R programming language. It does not include R itself, so R has to be installed first and RStudio second. The installation steps are covered in a separate guide.
Read more about RStudio Installation Process

3. Basic Data Analysis Through R/R Studio

In this RStudio tutorial, we’ll build a basic data analysis in R using RStudio and use RStudio's features to create some visual representations of the data. We will perform the following steps to achieve this goal.
  • Downloading/importing data in R
  • Transforming Data/Running queries on data
  • Basic data analysis using statistical averages
  • Plotting data distribution

4. Importing Data in R Studio

For this RStudio tutorial, we will use the sample census dataset ACS. There are two ways to import this data into R.
One way is to import the data by executing the following command in the console window of RStudio.
acs <- read.csv(url("http://stat511.cwick.co.nz/homeworks/acs_or.csv"))
Once you press Enter to execute this command, the dataset is downloaded from the internet, read as a CSV file, and assigned to the variable acs.

RStudio Tutorial – Download Dataset
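If the download is not possible (for example, when working offline), the same read.csv call also works on a local file. The following is a minimal sketch of that idea, not the tutorial's actual data: it writes a tiny made-up table to a temporary file and reads it back. The values are invented, the names sample_rows, csv_path, and acs_demo are hypothetical, and only the column names follow the ones used in this tutorial.

```r
# Made-up sample rows; only the column names mirror the tutorial's dataset.
sample_rows <- data.frame(
  age_husband     = c(45, 32, 58, 29),
  age_wife        = c(43, 35, 55, 29),
  number_children = c(2, 0, 3, 1),
  bedrooms        = c(3, 2, 4, 2)
)

# Write the rows to a temporary CSV file, standing in for a downloaded file.
csv_path <- tempfile(fileext = ".csv")
write.csv(sample_rows, csv_path, row.names = FALSE)

# Read the file back the same way you would read any local CSV file.
acs_demo <- read.csv(csv_path)

# Print the imported data frame to confirm the import worked.
print(acs_demo)
```

With a real download, you would pass the path of the saved file to read.csv instead of csv_path.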

The second way to import the dataset into RStudio is to first download it to your local computer and then use the Import Dataset feature of RStudio. To do this, follow the steps below.
a. Click the Import Dataset button in the top-right pane, under the Environment tab. Select the file you want to import and click Open. The Import Dataset dialog will appear as shown below.

RStudio Tutorial- Import Dataset

b. Set the separator, name, and other parameters as needed, then click the Import button. The dataset will be imported into RStudio and assigned to the variable name you chose.
Any dataset can be viewed by executing the following line:
View(acs)
where acs is the variable the dataset is assigned to.
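View opens an interactive viewer. If you prefer to inspect the data directly in the console, base R offers a few functions for that. The sketch below uses a small made-up data frame (acs_demo is a hypothetical stand-in with invented values) so that it runs on its own; with the real data you would pass acs instead.

```r
# Made-up stand-in for the acs data frame (values are invented).
acs_demo <- data.frame(
  age_husband = c(45, 32, 58, 29),
  age_wife    = c(43, 35, 55, 29)
)

head(acs_demo, n = 2)   # first two rows
dim(acs_demo)           # number of rows, then number of columns
names(acs_demo)         # column names
str(acs_demo)           # column types with a preview of the values
```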

5. Transforming Data by RStudio

Once you have imported the data into RStudio, you can use R's various transformation features to manipulate it. Let’s learn a few basic data access techniques.
To access a particular column, for example age_husband in our case:
acs$age_husband
To access a single value by row and column index, here row 1 of column 3:
acs[1,3]
To run queries on the data, you can use R's subset function.
Let’s say we want those rows from the dataset in which age_husband is greater than age_wife. For this, we’ll run the following command in the console:
a <- subset(acs, age_husband > age_wife)
The first parameter of the subset function is the data frame you want to query, and the second parameter is the boolean condition that is checked for each row to decide whether it is included. So the above statement returns the rows in which age_husband is greater than age_wife and assigns them to a.
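To make the row filtering more concrete, here is a small sketch that compares subset with plain logical indexing, which is another common way to select rows in base R. The data frame acs_demo and its values are made up for illustration, and the result names are hypothetical.

```r
# Made-up stand-in for the acs data frame (values are invented).
acs_demo <- data.frame(
  age_husband = c(45, 32, 58, 29),
  age_wife    = c(43, 35, 55, 29)
)

# subset() evaluates the condition using the data frame's column names.
older_husband <- subset(acs_demo, age_husband > age_wife)

# Base R indexing: a logical vector selects the rows,
# and the empty slot after the comma keeps all columns.
older_husband_idx <- acs_demo[acs_demo$age_husband > acs_demo$age_wife, ]

# Compare the two results and count the selected rows.
all.equal(older_husband, older_husband_idx)
nrow(older_husband)
```

One difference to keep in mind: when the condition evaluates to NA for a row, subset drops that row, while logical indexing returns a row filled with NA values.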
Getting statistical averages from data
The following functions can be used to calculate averages and other summary statistics of the dataset:
  • Mean of any column: mean(acs$age_husband)
  • Median: median(acs$age_husband)
  • Quantiles: quantile(acs$age_wife)
  • Variance: var(acs$age_wife)
  • Standard deviation: sd(acs$age_wife)

Rstudio Tutorial – Statistical Actions

You can also get a statistical summary by running summary on either a single column or the complete dataset:
summary(acs)
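If a column contains missing values (NA), functions such as mean return NA unless you ask them to skip the missing entries. Whether the ACS columns used here contain missing values is not stated in this tutorial, so the sketch below uses a made-up vector (age_wife_demo is hypothetical) to show the na.rm argument.

```r
# Made-up ages with one missing value (NA).
age_wife_demo <- c(43, 35, NA, 29)

mean(age_wife_demo)                    # NA, because one value is missing
mean(age_wife_demo, na.rm = TRUE)      # mean of the three observed values
median(age_wife_demo, na.rm = TRUE)    # median of the observed values
sd(age_wife_demo, na.rm = TRUE)        # standard deviation of the observed values
quantile(age_wife_demo, na.rm = TRUE)  # quantiles of the observed values
```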

6. Plotting Data by RStudio

A popular feature of RStudio is its built-in data visualizer for R.
Any dataset imported into R can be visualized using plot and several other R functions. For example:
To create a scatter plot of a dataset, you can run the following command in the console:
plot(x = a$age_husband, y = a$age_wife, type = 'p')
Here a is the subset of the original dataset created in the previous section, and type 'p' sets the plot type to points. You can also draw lines by changing type to 'l', or choose one of the other plot types.

Rstudio Tutorial – Plotting Data
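A scatter plot is usually easier to read with a title and axis labels. The following sketch adds them and writes the plot to a PDF file so that it also runs outside RStudio. The data frame acs_demo, its values, the label texts, and plot_path are all made up for illustration.

```r
# Made-up stand-in for the acs data frame (values are invented).
acs_demo <- data.frame(
  age_husband = c(45, 32, 58, 29),
  age_wife    = c(43, 35, 55, 29)
)

# Open a PDF graphics device so the plot is written to a file.
plot_path <- tempfile(fileext = ".pdf")
pdf(plot_path)

plot(
  x = acs_demo$age_husband,
  y = acs_demo$age_wife,
  type = "p",
  main = "Husband age vs. wife age",
  xlab = "Age of husband",
  ylab = "Age of wife"
)

# Dashed reference line where both ages are equal.
abline(a = 0, b = 1, lty = 2)

# Close the device to finish writing the file.
dev.off()
```

Inside RStudio you can leave out the pdf and dev.off calls; the plot then appears in the Plots pane.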

For data distribution plots, there are several tools and packages available that you can use to draw many kinds of distributions. For example:
To draw a histogram of a column, you can run the command
hist(acs$number_children)

RStudio Tutorial-To draw a Histogram of a dataset
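The hist function can also return the bin counts without drawing anything, which helps to check what a histogram is actually showing. The sketch below uses a made-up vector of child counts (children_demo is hypothetical) and explicit breaks, chosen here as an assumption so that each whole number gets its own bin.

```r
# Made-up number of children per household (values are invented).
children_demo <- c(0, 0, 1, 2, 2, 2, 3, 4)

# Breaks at -0.5, 0.5, ..., 4.5 put each whole number in its own bin.
# plot = FALSE returns the histogram object instead of drawing it.
h <- hist(children_demo, breaks = seq(-0.5, 4.5, by = 1), plot = FALSE)

h$mids     # bin midpoints
h$counts   # number of observations in each bin
```

Calling hist with the same breaks argument and without plot = FALSE draws the corresponding chart.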

Similarly, for bar plots, run the following set of commands:
counts <- table(acs$bedrooms)
barplot(counts, main="Bedrooms Distribution", xlab="Number of Bedrooms")

RStudio Tutorial – To draw Bar Plot
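The table function used above counts how often each distinct value occurs, and these counts are what barplot draws. As a small sketch with a made-up vector (bedrooms_demo is hypothetical), you can inspect the counts and convert them to proportions before plotting.

```r
# Made-up bedroom counts per household (values are invented).
bedrooms_demo <- c(2, 3, 3, 4, 2, 3)

# Frequency of each distinct value.
counts <- table(bedrooms_demo)
counts

# The same frequencies expressed as proportions that sum to 1.
prop.table(counts)
```

Passing prop.table(counts) to barplot instead of counts would show shares rather than absolute numbers.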

So, that was all for this RStudio tutorial. We hope you liked our explanation.

7. Conclusion – R Studio Tutorial

In this RStudio tutorial, we introduced RStudio and worked through a basic data analysis in R: importing data into RStudio, transforming it, and plotting it. Each step was shown with screenshots to make it easier to follow. If you have any questions about RStudio, feel free to ask in the comment section.

See also: R Hadoop Integration
