Input-Output Features in R Programming


1. Objective

In this R tutorial, we will learn about the various input-output features of the R language and how to use them in R programming. We will learn how to access the keyboard and monitor, read and write files, and work with connections, and we will take a brief look at TCP/IP, protocols, and sockets. We will also learn about the different functions used for these tasks.

2. Introduction to Input-Output Features in R

Let’s now discuss the different input-output features in R programming.

2.1. Accessing the keyboard and monitor

In R, several functions can be used to request input from the user, including readline() and scan(), while print() and cat() display output on the monitor. For reading a single line of input, I find readline() to be the most convenient.

a) Reading from the keyboard

To read data from the keyboard we use scan() and readline(). The print() function, covered last, displays values on the screen rather than reading them.

i) scan() –

Read Data Values: Read data into a vector or list from the console or file.

Keywords: File, connection

For Example:

> z <- scan()
1: 12 5
3: 2
4:
Read 3 items
> z
[1] 12 5 2
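
ii) readline()

Read a Line from the Terminal: readline() reads a single line typed at the console and returns it as a character string. It should not be confused with readLines(), which reads some or all text lines from a connection.

Keywords: keyboard, console

We can use readline() to input a line from the keyboard in the form of a string: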

For Example:

> w <- readline()
xyz vw u
> w
[1] "xyz vw u"
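
Both functions wait for someone to type, so they are awkward to try in a script. The sketch below assumes the text argument of scan(), which reads from a character string instead of the keyboard, and uses a fixed string to stand in for what readline() would return. The variable names are made up for illustration.

```r
# scan() normally waits for keyboard input; the text argument feeds it a string instead
z <- scan(text = "12 5 2", quiet = TRUE)
z          # a numeric vector holding 12, 5 and 2

# what = "" makes scan() return strings rather than numbers
words <- scan(text = "xyz vw u", what = "", quiet = TRUE)
words      # three separate strings, unlike readline(), which keeps the line whole

# readline() always returns one string, so convert it before doing arithmetic
w <- "42"              # stands in for the value readline() would return
as.numeric(w) + 1
```

In an interactive session, readline(prompt = "Enter a number: ") would take the place of the fixed string.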

iii) print()

Print Values: print() prints its argument and returns it invisibly. It is a generic function, which means that new printing methods can easily be added for new classes.

Keywords: print

Printing to the screen: In interactive mode, you can print the value of a variable or expression by just typing it. In batch mode, use the print() function, e.g.

print(x)

The argument may be any R object. For simple messages it is often a little better to use cat() instead of print(), because print() can print only one expression at a time and its result is numbered, which may be a nuisance. Here is an example:

> print ("xyz")
[1] "xyz"
> cat("xyz \n")
xyz

cat() prints its arguments separated by spaces, for instance:

> g <- 62
> cat(g,"xyz","gs\n")
62 xyz gs
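
The separator that cat() uses can be changed, and the difference between the two functions is easier to see when the output is captured. This is a small sketch using only base R; capture.output() is used here purely to make the printed text inspectable.

```r
g <- 62

# cat() separates its arguments with a space by default; sep changes that
cat(g, "xyz", "gs\n")
cat(g, "xyz", "gs\n", sep = "")

# print() returns its argument invisibly, so the value can still be assigned
y <- print("xyz")

# capture.output() collects what would have appeared on the screen
printed <- capture.output(print("xyz"))   # includes the [1] index and the quotes
catted  <- capture.output(cat("xyz\n"))   # just the text itself
```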

2.2. Reading and Writing files

There are several ways to read and write files in R programming:

a) Reading a data frame or matrix from a file

We usually use the read.table() function to read data. The default value of header is FALSE, so when the file has no header line we need not specify it. In R versions before 4.0.0, character strings were converted to factors by default. To turn this "feature" off, include the argument as.is=TRUE (or stringsAsFactors=FALSE) in your call to read.table(); from R 4.0.0 onward, strings are left as character vectors by default.

When you have a spreadsheet export file, i.e. a .csv file in which the fields are separated by commas instead of spaces, use read.csv() in place of read.table(). To read Excel spreadsheet files directly, one option is read.xls() from the gdata package.

When you read in a matrix using read.table(), the resulting object is a data frame, even when all the entries are numeric. A follow-up call to as.matrix() converts it to a matrix. Alternatively, the values can be read with scan() and reshaped with matrix().

For example:

Suppose the file x contains the following matrix:

1 0 1
1 1 1
1 1 0
1 1 0
0 0 1

We can read it into a matrix like this:

> x <- matrix(scan("x"),nrow=5,byrow=TRUE)
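
The two routes to a matrix can be compared side by side. The sketch below writes the same 5 x 3 example to a temporary file (a stand-in for the file x, which is assumed to exist above) and reads it both ways.

```r
# Write the example matrix to a temporary file
path <- tempfile()
writeLines(c("1 0 1", "1 1 1", "1 1 0", "1 1 0", "0 0 1"), path)

# read.table() returns a data frame, even though every entry is numeric
df <- read.table(path)
class(df)

# as.matrix() converts it; clearing dimnames drops the generated column names
m1 <- as.matrix(df)
dimnames(m1) <- NULL

# scan() reads the values as one long vector, which matrix() reshapes row by row
m2 <- matrix(scan(path, quiet = TRUE), nrow = 5, byrow = TRUE)

unlink(path)   # remove the temporary file
```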

b) Reading a single File One Line at a Time

We can use readLines() for this, but we need to create a connection first by calling file(). Once the connection is open, each call to readLines() continues from where the previous one stopped.

For Example:

> c <- file("z","r")
> readLines(c,n=1)
[1] "1 3"
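
To read the whole file this way, the call can be repeated until nothing is left. The following is a minimal sketch; the file contents are invented for illustration and a temporary file stands in for z.

```r
# Create a small file to read
path <- tempfile()
writeLines(c("1 3", "1 4", "2 6"), path)

con <- file(path, "r")             # open the connection once, in read mode
lines_read <- character(0)
repeat {
  line <- readLines(con, n = 1)    # continues where the previous call stopped
  if (length(line) == 0) break     # a zero-length result signals end of file
  lines_read <- c(lines_read, line)
}
close(con)                         # release the connection when finished
unlink(path)

lines_read
```

Closing the connection matters because R keeps only a limited number of connections open at once.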

c) Writing a Table to a File

In R, the write.table() function does this job. It works much like read.table(), except that it writes a data frame instead of reading one. When writing a matrix to a file, we usually do not want row or column names, so we switch them off:

write.table(xc,"xcnew",row.names=F,col.names=F)
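
A quick round trip shows what ends up in the file. In this sketch the matrix xc and the temporary file are illustrative stand-ins for the objects named above.

```r
xc <- matrix(c(1, 0, 1,
               1, 1, 0), nrow = 2, byrow = TRUE)
path <- tempfile()

# Without row.names = FALSE and col.names = FALSE the file would also
# contain generated row and column labels
write.table(xc, path, row.names = FALSE, col.names = FALSE)
file_lines <- readLines(path)      # the raw text that was written

# Reading the file back with scan() recovers the matrix
xc2 <- matrix(scan(path, quiet = TRUE), nrow = 2, byrow = TRUE)
unlink(path)
```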

2.3. Introduction to Connections

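A connection is R's generalization of a file: the same reading and writing functions work on ordinary files and on "generalized files" such as compressed files, URLs, and pipes. R provides functions to create, open, and close connections, for example file(), url(), gzfile(), pipe(), open(), and close().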

Keywords: file, connection
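
As a rough illustration of "generalized files", the sketch below uses two base R connection types: gzfile() for a compressed file and textConnection() for text held in memory. The file name and contents are invented for the example.

```r
# A gzfile() connection writes and reads gzip-compressed text transparently
path <- tempfile(fileext = ".gz")
con <- gzfile(path, "w")
writeLines(c("first line", "second line"), con)
close(con)

con <- gzfile(path, "r")
from_gz <- readLines(con)
close(con)
unlink(path)

# A textConnection() lets a character vector be read as if it were a file
tc <- textConnection(c("a,b", "1,2", "3,4"))
df <- read.csv(tc)
close(tc)
```

The reading code (readLines(), read.csv()) is the same as for a plain file; only the function that creates the connection changes.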

Extended Example: Reading PUMS sample files

PUMS (Public Use Microdata Sample) files, published by the U.S. Census Bureau, contain records for a sample of housing units with information on the characteristics of each unit.

a) Why use PUMS?

PUMS files give access to inexpensive data for research projects, which makes them especially useful for students. Social scientists often use PUMS for regression analysis and modeling applications.

b) How can I access PUMS?

PUMS files are worked with using statistical software, such as R.

2.4. Writing to a file

We use write.csv() to write a data frame to a CSV file. By default, write.csv() includes row names.

# A sample data frame
data <- read.table(header=TRUE, text='
subject sex size
1 M 7
2 F NA
3 F 9
4 M 11
')
# Write to a file, suppress row names
write.csv(data, "data.csv", row.names=FALSE)
# Same, except that instead of "NA", output blank cells
write.csv(data, "data.csv", row.names=FALSE, na="")
# Use tabs, suppress row names and column names
write.table(data, "data.csv", sep="\t", row.names=FALSE, col.names=FALSE)
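
To see the effect of na="" it helps to look at the raw file and then read it back. This sketch rebuilds the same sample data frame with data.frame() and writes to a temporary file rather than data.csv.

```r
data <- data.frame(subject = 1:4,
                   sex = c("M", "F", "F", "M"),
                   size = c(7, NA, 9, 11))
path <- tempfile(fileext = ".csv")

write.csv(data, path, row.names = FALSE, na = "")
csv_lines <- readLines(path)   # header plus four rows; the NA becomes an empty field

# An empty field in a numeric column is read back as NA
back <- read.csv(path)
unlink(path)
```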

2.5. File and Directory Information

This section shows how to list the files in a directory and how to merge all of them into a single data frame.

a) Set the directory

setwd("target_dir/")

b) Getting a list of files in a directory

file_list <- list.files()

If we want to list the files in a different directory, we pass its path to list.files().

For example, if we want the files in the folder C:/foo/, we can use the following code:

file_list <- list.files("C:/foo/")
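
list.files() can also filter by name and return full paths, which pairs well with file.exists() and file.info(). A small sketch, using a throwaway directory and made-up file names:

```r
# Build a temporary directory with three empty files
dir_path <- file.path(tempdir(), "io_demo")
dir.create(dir_path, showWarnings = FALSE)
file.create(file.path(dir_path, c("a.txt", "b.txt", "notes.md")))

# pattern is a regular expression; full.names = TRUE returns usable paths
txt_files <- list.files(dir_path, pattern = "\\.txt$", full.names = TRUE)
basename(txt_files)

file.exists(txt_files)        # one logical value per file
file.info(txt_files)$size     # size in bytes

unlink(dir_path, recursive = TRUE)   # clean up
```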

c) Merging the files into a single data frame

The final step is to iterate through the list of files in the current working directory and put them together to form a data frame. When the script encounters the first file in file_list, it creates the main data frame to merge everything into (called dataset here). This is done using the !exists conditional:

  • If dataset already exists, the file is read into a temporary data frame called temp_dataset, which is appended to dataset with rbind(). The temporary data frame is removed with rm(temp_dataset) once we are done with it.
  • If dataset doesn't exist (!exists is true), then we have to create it.

Here’s the remainder of the code:

for (file in file_list){
  if (!exists("dataset")){
    # if the merged dataset doesn't exist, create it from the first file
    dataset <- read.table(file, header=TRUE, sep="\t")
  } else {
    # otherwise append to it; else keeps the first file from being read twice
    temp_dataset <- read.table(file, header=TRUE, sep="\t")
    dataset <- rbind(dataset, temp_dataset)
    rm(temp_dataset)
  }
}

The full code:

setwd("target_dir/")
file_list <- list.files()
for (file in file_list){
  if (!exists("dataset")){
    # if the merged dataset doesn't exist, create it from the first file
    dataset <- read.table(file, header=TRUE, sep="\t")
  } else {
    # otherwise append to it; else keeps the first file from being read twice
    temp_dataset <- read.table(file, header=TRUE, sep="\t")
    dataset <- rbind(dataset, temp_dataset)
    rm(temp_dataset)
  }
}
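
The same merge can be written without a loop or an exists() check. The version below is an alternative sketch, not the code above: it reads every file into a list with lapply() and stacks the pieces with do.call(rbind, ...). The directory and the two sample files are created only for the demonstration.

```r
# Create two small tab-separated files in a temporary directory
dir_path <- file.path(tempdir(), "merge_demo")
dir.create(dir_path, showWarnings = FALSE)
write.table(data.frame(id = 1:2, value = c(10, 20)),
            file.path(dir_path, "part1.txt"), sep = "\t", row.names = FALSE)
write.table(data.frame(id = 3:4, value = c(30, 40)),
            file.path(dir_path, "part2.txt"), sep = "\t", row.names = FALSE)

# full.names = TRUE avoids having to call setwd()
file_list <- list.files(dir_path, full.names = TRUE)

# Read every file into a list of data frames, then stack them in one step
pieces  <- lapply(file_list, read.table, header = TRUE, sep = "\t")
dataset <- do.call(rbind, pieces)

unlink(dir_path, recursive = TRUE)   # clean up
```

Because dataset is built in a single assignment, rerunning the script does not append to a leftover dataset from an earlier run.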

2.6. What is TCP/IP in R?

TCP/IP is a set of protocols and the primary technology of the internet. Whenever we browse the web, send email, chat, or play games online, TCP/IP is working underneath.

a) What is the protocol?

A protocol is a set of rules and procedures that computers use to communicate. It specifies what format to use, what the data mean, and when the data should be sent. When two computers exchange data, they can understand each other only if both follow the format and rules of the same protocol.
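
The idea of an agreed format can be shown with a toy example. The "protocol" below is invented for this sketch and is not part of TCP/IP: a message is the number of characters in the payload, a colon, and then the payload. Both function names are hypothetical.

```r
# Sender side: prefix the payload with its length
encode_message <- function(payload) {
  paste0(nchar(payload), ":", payload)
}

# Receiver side: check that the message follows the agreed rule
decode_message <- function(message) {
  pos <- regexpr(":", message, fixed = TRUE)     # position of the first colon
  if (pos < 1) stop("malformed message: no length prefix")
  declared <- suppressWarnings(as.integer(substr(message, 1, pos - 1)))
  payload  <- substr(message, pos + 1, nchar(message))
  if (is.na(declared) || nchar(payload) != declared) {
    stop("malformed message: length does not match payload")
  }
  payload
}

msg <- encode_message("hello")
decode_message(msg)
```

A message that breaks the rule, such as "5:hi", is rejected by the receiver, which is the point of having both sides follow the same format.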

b) How does TCP/IP work?

TCP/IP protocols map to a four-layer conceptual model known as the DARPA model. The four layers of the DARPA model are Application, Transport, Internet, and Network Interface.

2.7. TCP/IP applications, services, and protocols

  • Bootstrap protocol – Bootstrap Protocol (BOOTP) provides a dynamic method for associating workstations with servers. It also provides a dynamic method for assigning workstation Internet Protocol (IP) addresses and initial program load (IPL) sources.
  • Domain name system – The Domain Name System (DNS) is used to manage host names and their associated Internet Protocol (IP) addresses.
  • Email – This covers planning for, configuring, using, managing, and troubleshooting e-mail on your system.
  • Open Shortest Path First – IBM i support includes the Open Shortest Path First (OSPF) protocol. OSPF is a link-state, hierarchical Interior Gateway Protocol (IGP) for network routing.
  • RouteD – The Route Daemon (RouteD) provides support for the Routing Information Protocol (RIP) on the IBM i platform.
  • Simple network time protocol – Simple Network Time Protocol (SNTP) is a time-maintenance application that we can use to synchronize hardware in a network.

2.8. TCP/IP variable SMC-R storage allocations

  • Each additional RMB that is allocated for a particular SMC-R link group can accommodate 4 – 32 more TCP connections, depending on the RCVBFRSIZE value of the TCP connections.
  • More staging buffers are allocated as the volume of application data being sent increases.
  • RMBs, staging buffers, and RDMA receive and send elements are all eligible to be deallocated if the volume of application traffic decreases.

2.9. What are Sockets in R

Sockets provide two networked machines with a bidirectional communication channel. Servers are accessed via socket addresses, a combination of the server’s IP address and a port number. We use the port as a connection point on the server, like USB or Firewire ports, with each port serving a specific purpose.

For example:

Web pages are served on port 80 (HTTP), emails are sent via port 25 (SMTP).

Usage

make.socket(host = "localhost", port, fail = TRUE, server = FALSE)

Arguments

host – name of the remote host

port – port to connect to or listen on

fail – whether a failure to connect is an error

server – whether to create a server socket
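
Base R also offers socketConnection(), which exposes a socket through the same connection interface used for files. The sketch below is a hypothetical client helper built on it; send_line and socket_address are made-up names, and nothing is contacted until send_line() is actually called with a host and port where a server is listening.

```r
# Hypothetical helper: send one line to a TCP server and read one line back
send_line <- function(host, port, text, timeout = 5) {
  con <- socketConnection(host = host, port = port, server = FALSE,
                          blocking = TRUE, open = "r+", timeout = timeout)
  on.exit(close(con))        # close the socket even if an error occurs
  writeLines(text, con)      # send the request
  readLines(con, n = 1)      # wait for a single line of reply
}

# A socket address is the host combined with the port number
socket_address <- function(host, port) paste0(host, ":", port)
socket_address("localhost", 80)
```

Because the socket is an ordinary connection, writeLines() and readLines() work on it exactly as they do on a file.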

3. Conclusion

In this tutorial, we have studied the different input-output features in R programming. Along with this, we have studied the functions that request input from the user and display results on the screen. We have seen several ways to read and write files, and how connections generalize files. TCP/IP is a set of protocols that sockets build on, which gives another way of exchanging data.

If you have any query related to input-output features in R, feel free to share it with us.
