RStudio Tutorial – How to Import & Transform Data?
1. RStudio Tutorial
RStudio is one of the most popular IDEs for working with the R language. Before we begin working with RStudio, we will look at what it is and how to use it for basic data analysis, data import and data transformation. We will also provide you with several screenshots that will give you a lucid understanding of RStudio.
Let us now start our RStudio Tutorial.
2. What is RStudio?
RStudio is an open source integrated development environment (IDE) that supports statistical modeling and graphics in R. It uses the Qt framework for its GUI features. There are two versions of RStudio – RStudio Desktop and RStudio Server. RStudio Desktop runs as an application on the local desktop, whereas RStudio Server is accessed through a web browser. Before we dive into this RStudio tutorial, let us first learn about its installation –
3. Basic Data Analysis Through R/RStudio
- Downloading/importing data in R
- Data Transformation and other miscellaneous data operations.
- Performing statistical modeling on the data
- Creating graphical plots of data.
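The four steps above can be sketched end to end as follows. This is only an illustration under stated assumptions: it uses R's built-in mtcars dataset as a stand-in for imported data, and the names cars, wt_kg and fit are made up for this example.

```r
# Sketch of the four steps using R's built-in mtcars dataset
# (a stand-in for imported data, not the dataset used in this tutorial).

# 1. "Import": mtcars ships with base R, so no download is needed here.
cars <- mtcars

# 2. Transform: keep two columns and add the weight in kilograms
#    (wt is recorded in units of 1000 lbs).
cars <- cars[, c("mpg", "wt")]
cars$wt_kg <- cars$wt * 1000 * 0.45359237

# 3. Model: fit a simple linear regression of mpg on weight.
fit <- lm(mpg ~ wt_kg, data = cars)
print(coef(fit))

# 4. Plot: scatter plot with the fitted regression line drawn on top.
plot(cars$wt_kg, cars$mpg, xlab = "Weight (kg)", ylab = "Miles per gallon")
abline(fit)
```

In RStudio, the plot appears in the Plots pane, while the printed coefficients appear in the Console.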
4. How to Import Data in RStudio?
> ACS_data <- read.csv(url("http://stat511.cwick.co.nz/homeworks/acs_or.csv"))
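After an import, it is usually worth checking what was actually read. The sketch below does this on a tiny inline CSV so that it runs without a network connection. The three rows are made up for illustration, and only the column names mirror the ones used later in this tutorial. The same inspection calls should work on ACS_data.

```r
# Hypothetical three-row sample; the values are invented for illustration.
csv_text <- "age_husband,age_wife,bedrooms
52,50,3
34,36,2
61,58,4"

# read.csv() accepts literal text through its `text` argument,
# so no file or URL is needed for this example.
sample_data <- read.csv(text = csv_text)

print(dim(sample_data))    # number of rows and columns
str(sample_data)           # column names and types
print(head(sample_data))   # first rows (all three here)
```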
5. How to Transform Data in RStudio?
> ACS_data$age_husband #Author DataFlair
The `$` operator above returns a whole column as a vector. To access a single value by its row and column index (here, row 1 and column 2) –
> ACS_data[1,2] #Author DataFlair
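The different ways of indexing a data frame return different kinds of objects, which the following sketch compares. The data frame df and its values are made up for illustration and are not taken from ACS_data.

```r
# Made-up data frame with the two age columns used in this tutorial.
df <- data.frame(age_husband = c(52, 34, 61), age_wife = c(50, 36, 58))

col_vec  <- df$age_husband                   # whole column as a numeric vector
col_vec2 <- df[["age_husband"]]              # same result, column given as a string
one_val  <- df[1, 2]                         # row 1, column 2: a single value
one_row  <- df[1, ]                          # row 1, all columns: a one-row data frame
one_col  <- df[, "age_wife", drop = FALSE]   # one column, kept as a data frame

print(one_val)
print(class(one_row))
print(class(one_col))
```

Without drop = FALSE, selecting a single column with square brackets returns a plain vector instead of a data frame.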
To retrieve the rows in which age_husband is greater than age_wife, we execute the following command –
> greater <- subset(ACS_data , age_husband > age_wife) #Author DataFlair
> head(greater)
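The same filter can also be written with logical indexing. The two approaches differ when the condition evaluates to NA, as the sketch below shows. The data frame df is made up for illustration and includes one missing value on purpose.

```r
# Made-up data with one missing wife's age in the last row.
df <- data.frame(age_husband = c(52, 34, 61, 45),
                 age_wife    = c(50, 36, 58, NA))

# subset() silently drops rows where the condition is NA.
by_subset <- subset(df, age_husband > age_wife)

# Plain logical indexing keeps an all-NA row for each NA in the condition.
by_index <- df[df$age_husband > df$age_wife, ]

# Wrapping the condition in which() drops the NA cases again.
by_which <- df[which(df$age_husband > df$age_wife), ]

print(nrow(by_subset))
print(nrow(by_index))
print(nrow(by_which))
```

For interactive work, subset() is usually the more readable choice. The indexing form makes the handling of missing values explicit.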
We can perform various statistical operations as follows:
> mean(ACS_data$age_husband) #For calculating the mean of a column
> median(ACS_data$age_husband) #For calculating the median of a column
> quantile(ACS_data$age_wife) #For calculating the quantiles
> var(ACS_data$age_wife) #For measuring the variance
> sd(ACS_data$age_wife) #For calculating the standard deviation
> #DataFlair
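If a column contains missing values, these functions return NA (or, in the case of quantile(), stop with an error) unless na.rm = TRUE is passed. The sketch below uses a made-up vector called ages to show this. Whether the ACS columns contain missing values is not stated here, so treat this as a precaution rather than a required step.

```r
# Made-up ages with one missing value.
ages <- c(50, 36, 58, NA, 41)

print(mean(ages))                  # NA, because one value is missing
print(mean(ages, na.rm = TRUE))    # mean of the four observed values
print(median(ages, na.rm = TRUE))
print(quantile(ages, na.rm = TRUE))
print(var(ages, na.rm = TRUE))
print(sd(ages, na.rm = TRUE))
```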
To retrieve a summary of the dataset, we use the summary() function –
> summary(ACS_data) #Author DataFlair
> sub <- ACS_data[1:100, ] #Author DataFlair
> plot(x = sub$age_husband , y = sub$age_wife, type = 'p')
We first created a subset called ‘sub’ that contains only the first 100 rows. We then plotted age_wife against age_husband for these 100 rows as a scatter plot (type = 'p' draws points).
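A scatter plot is easier to read with a title, axis labels and a reference line. The following sketch adds these using a small made-up data frame called couples. The labels and the dashed y = x line are illustrative choices, not part of the original command.

```r
# Made-up ages for six couples.
couples <- data.frame(
  age_husband = c(52, 34, 61, 45, 29, 70),
  age_wife    = c(50, 36, 58, 44, 27, 66)
)

# Take the first four rows, the same idea as ACS_data[1:100, ].
first_rows <- couples[1:4, ]

plot(x = first_rows$age_husband, y = first_rows$age_wife, type = "p",
     main = "Age of Husband vs. Age of Wife",
     xlab = "Husband's age (years)", ylab = "Wife's age (years)")

# Dashed y = x line: points below it are couples with an older husband.
abline(a = 0, b = 1, lty = 2)
```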
> get_table <- table(ACS_data$bedrooms)
> barplot(get_table, main="Bedrooms Distribution", xlab="Bedroom Count") #Author DataFlair
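The table() call counts how often each value occurs, and barplot() draws one bar per count. As a minimal sketch, the example below does the same on a made-up vector called bedrooms and plots proportions instead of raw counts. The use of prop.table() is an addition for illustration.

```r
# Made-up bedroom counts for eight rows.
bedrooms <- c(3, 2, 4, 3, 3, 2, 5, 3)

counts <- table(bedrooms)      # frequency of each distinct value
print(counts)

shares <- prop.table(counts)   # convert the counts to proportions
barplot(shares, main = "Bedrooms Distribution",
        xlab = "Bedroom Count", ylab = "Proportion of rows")
```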
7. Conclusion – RStudio Tutorial
In this tutorial, we learnt the basics of RStudio: how to import data, transform it, perform analysis on it and, finally, visualize it. We hope that this blog has given you a good understanding of RStudio. If there are any queries, please leave them in the comment section.