R String Manipulation Functions and Usage

1. R String Manipulation – Objective

In this blog on R String Manipulation, we are going to cover the R string manipulation functions. We look at 8 string manipulation functions in R: grep(), nchar(), paste(), sprintf(), substr(), strsplit(), regex(), and gregexpr(). All of these functions will be discussed in this R tutorial along with their usage.

So, let’s start the R String Manipulation tutorial.

2. What is String Manipulation in R?

String manipulation in R means inspecting, searching, combining, splitting, and modifying character data. The same idea appears in other settings too: generic programming in an OpenCL program was restricted to using a string manipulation mechanism, where the program was constructed as a string at runtime and then passed to the OpenCL driver front end, which would finally compile and build the kernel at runtime. Command groups that call kernels can also be templated, allowing for a complex composition of functors and types.
R provides many functions for string manipulation, including:

  • grep()
  • nchar()
  • paste()
  • sprintf()
  • substr()
  • strsplit()
  • regex()
  • gregexpr()

3. R String Manipulation Functions

Now we will discuss the above-mentioned R string manipulation functions with their usage.

i. grep()

grep() is used for pattern matching and replacement. grep, grepl, regexpr, gregexpr and regexec search for matches to the argument pattern within each element of a character vector. sub and gsub replace the first match and all matches, respectively.
Keywords

Utilities, character

Usage

grep(pattern, x, ignore.case = FALSE, perl = FALSE, value = FALSE,
     fixed = FALSE, useBytes = FALSE, invert = FALSE)
grepl(pattern, x, ignore.case = FALSE, perl = FALSE,
      fixed = FALSE, useBytes = FALSE)
sub(pattern, replacement, x, ignore.case = FALSE, perl = FALSE,
    fixed = FALSE, useBytes = FALSE)
gsub(pattern, replacement, x, ignore.case = FALSE, perl = FALSE,
     fixed = FALSE, useBytes = FALSE)
regexpr(pattern, text, ignore.case = FALSE, perl = FALSE,
        fixed = FALSE, useBytes = FALSE)
gregexpr(pattern, text, ignore.case = FALSE, perl = FALSE,
         fixed = FALSE, useBytes = FALSE)
regexec(pattern, text, ignore.case = FALSE, perl = FALSE,
        fixed = FALSE, useBytes = FALSE)

Arguments

  • pattern – Character string containing a regular expression to be matched in the given character vector.
  • x, text – A character vector, or an object which can be coerced by as.character to a character vector.
  • ignore.case – If FALSE, the pattern matching is case sensitive; if TRUE, case is ignored during matching.
  • perl – Should Perl-compatible regexps be used?
  • value – If FALSE, grep returns a vector containing the indices of the matches. If TRUE, it returns a vector containing the matching elements themselves.
  • fixed – If TRUE, the pattern is a string to be matched as is. It overrides all conflicting arguments.
  • useBytes – If TRUE, the matching is done byte-by-byte rather than character-by-character.
  • invert – If TRUE, grep returns indices or values for elements that do not match.
  • replacement – A replacement for the matched pattern in sub and gsub.
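
As a minimal sketch of how these arguments behave, the following base R example runs grep, grepl, sub and gsub on a small vector. The vector fruits and its contents are assumptions made up for illustration; they do not come from the text above.

```r
# Illustrative data (hypothetical)
fruits <- c("apple", "Banana", "cherry", "pineapple")

# grep() returns the indices of the matching elements by default
idx <- grep("apple", fruits)                    # 1 4

# value = TRUE returns the matching elements themselves
hits <- grep("apple", fruits, value = TRUE)     # "apple" "pineapple"

# invert = TRUE returns the elements that do NOT match
non_hits <- grep("apple", fruits, value = TRUE, invert = TRUE)

# grepl() returns one logical value per element
has_b <- grepl("b", fruits, ignore.case = TRUE) # FALSE TRUE FALSE FALSE

# sub() replaces only the first match, gsub() replaces all matches
first_only <- sub("p", "P", "apple")            # "aPple"
all_p      <- gsub("p", "P", "apple")           # "aPPle"
```

Note that grep() with value = FALSE gives positions, which is handy for subsetting, while grepl() gives a logical vector of the same length as the input.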

ii. nchar()

It counts the number of characters. nchar takes a character vector as an argument and returns a vector whose elements contain the sizes of the corresponding elements of x. nzchar is the fastest way to find out whether the elements of a character vector are non-empty strings.
Keywords
character
Usage

nchar(x, type = "chars", allowNA = FALSE, keepNA = NA)
nzchar(x, keepNA = FALSE)

Arguments

  • x – A character vector, or a vector to be coerced to a character vector. Giving a factor is an error.
  • type – Character string: partial matching to one of c("bytes", "chars", "width").
  • allowNA – Should NA be returned for invalid multibyte strings or "bytes"-encoded strings, rather than throwing an error?
  • keepNA – Should NA be returned when x is NA? The default for nchar(), NA, means to use keepNA = TRUE unless type is "width". It used to be hard-coded to FALSE in R versions ≤ 3.2.0.
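
The short sketch below shows nchar and nzchar on a vector that contains an empty string and a missing value. The input vector x is a made-up example.

```r
# Illustrative input (hypothetical): a word, an empty string, a missing value
x <- c("apple", "", NA)

# Default behaviour: NA stays NA for type = "chars"
n_chars <- nchar(x)                  # 5 0 NA

# keepNA = FALSE counts the two printing characters of "NA"
n_printed <- nchar(x, keepNA = FALSE) # 5 0 2

# nzchar() tests for non-empty strings; NA counts as non-empty by default
non_empty <- nzchar(x)               # TRUE FALSE TRUE
```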

iii. paste()

Concatenates strings: the arguments are converted to character vectors and then joined.
Keywords
character
Usage

paste(..., sep = " ", collapse = NULL)
paste0(..., collapse = NULL)

Arguments

  • ... – One or more R objects, to be converted to character vectors.
  • sep – A character string to separate the terms. Not NA_character_.
  • collapse – An optional character string to separate the results. Not NA_character_.
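
To make the difference between sep and collapse concrete, here is a small sketch with made-up inputs. sep goes between the terms of each result, while collapse joins the results into a single string.

```r
# sep separates the terms (default is a single space)
sentence <- paste("R", "string", "tools")        # "R string tools"

# Arguments are recycled element-wise
files <- paste("file", 1:3, sep = "_")           # "file_1" "file_2" "file_3"

# paste0() is paste() with sep = ""
ids <- paste0("x", 1:3)                          # "x1" "x2" "x3"

# collapse joins the resulting vector into one string
joined <- paste(c("a", "b", "c"), collapse = "-") # "a-b-c"
id_list <- paste0("id", 1:2, collapse = ", ")     # "id1, id2"
```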

iv. sprintf()

Use C-style String Formatting Commands.
Keywords
print, character
Usage

sprintf(fmt, ...)
gettextf(fmt, ..., domain = NULL)

Arguments

  • fmt – A character vector of format strings, each of up to 8192 bytes.
  • ... – Values to be passed into fmt.
  • domain – See gettext.
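
A few common format specifications are sketched below. The values being formatted are arbitrary examples chosen for illustration.

```r
# %s inserts a string, %d an integer
msg <- sprintf("%s has %d items", "cart", 3L)   # "cart has 3 items"

# %.2f prints a number with two decimal places
rounded <- sprintf("%.2f", pi)                  # "3.14"

# %03d pads an integer with zeros to width 3
padded <- sprintf("%03d", 7L)                   # "007"

# %% prints a literal percent sign
pct <- sprintf("%d%%", 50L)                     # "50%"

# sprintf() is vectorized over its arguments
labels <- sprintf("item-%d", 1:3)               # "item-1" "item-2" "item-3"
```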

v. substr()

Substrings of a character vector. Extract or replace substrings in a character vector.
Keywords
character
Usage

substr(x, start, stop)
substring(text, first, last = 1000000L)
substr(x, start, stop) <- value
substring(text, first, last = 1000000L) <- value

Arguments

  • x, text – a character vector.
  • start, first – integer. The first element to be extracted or replaced.
  • stop, last – integer. The last element to be extracted or replaced.
  • value – a character vector, recycled if necessary.
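
The following sketch shows both extraction and replacement. The string "abcdef" is an arbitrary example.

```r
x <- "abcdef"

# Extract characters 2 through 4
middle <- substr(x, 2, 4)                 # "bcd"

# substring() defaults to extracting up to the end of the string
tail_part <- substring(x, 4)              # "def"

# substring() is vectorized over first and last
windows <- substring(x, 1:3, 3:5)         # "abc" "bcd" "cde"

# The replacement form overwrites characters in place
substr(x, 1, 2) <- "ZZ"                   # x is now "ZZcdef"
```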

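vi. strsplit()

Splits the elements of a character vector into substrings according to the matches of split within them.
Keywords
character
Usage

strsplit(x, split, fixed = FALSE, perl = FALSE, useBytes = FALSE)

Arguments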

  • x – It is a character vector, each element of which is to be split.
  • split – It is a character vector containing regular expression(s) to use for splitting.
  • fixed – If it is TRUE then it will match split exactly.
  • perl – Should Perl-compatible regexps be used?
  • useBytes – If TRUE, the matching is done byte-by-byte rather than character-by-character, and inputs with marked encodings are not converted.
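
A brief sketch of typical strsplit calls follows; all input strings are made up for illustration. The result is always a list with one element per input string.

```r
# Split on a literal comma; take the first (and only) list element
parts <- strsplit("a,b,c", ",")[[1]]                  # "a" "b" "c"

# "." is a regex metacharacter, so use fixed = TRUE to split on a literal dot
dotted <- strsplit("a.b.c", ".", fixed = TRUE)[[1]]   # "a" "b" "c"

# split is a regular expression by default: one or more spaces
words <- strsplit("one  two three", " +")[[1]]        # "one" "two" "three"

# An empty split string breaks the input into single characters
chars <- strsplit("abc", "")[[1]]                     # "a" "b" "c"

# One list element is returned per element of x
multi <- strsplit(c("a-b", "c-d-e"), "-")
```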

vii. regex()

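Creates a regex object, so that regular expressions can be built in a human-readable way. Note that regex() is not a function in base R (there, ?regex is only a help page on regular-expression syntax); it is provided by an add-on package.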
Usage

regex(...)
perl_regex(...)
## S3 method for class 'perl_regex':
format(x, ...)

Arguments

  • ... – Passed to paste0.
  • x – A regex.

viii. gregexpr()

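An extension of the base function gregexpr enabling retrieval of the matching substrings. The extract argument shown in the usage belongs to this extension; the base gregexpr() does not have it.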
Keywords
Gregexpr
Usage

gregexpr(pattern, text, ignore.case = FALSE, perl = FALSE,
         fixed = FALSE, useBytes = FALSE, extract = FALSE)

Arguments

  • pattern – Character string containing a regular expression to be matched in the given character vector.
  • text – A character vector, or an object which can be coerced by as.character to a character vector.
  • ignore.case – If FALSE, the pattern matching is case sensitive; if TRUE, case is ignored during matching.
  • perl – Should Perl-compatible regexps be used?
  • fixed – If TRUE, the pattern is a string to be matched as is. It overrides all conflicting arguments.
  • useBytes – If TRUE, the matching is done byte-by-byte rather than character-by-character.
  • extract – A logical indicating whether the matching substrings should be extracted and returned.
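
With base R alone, a similar result can be sketched by combining gregexpr() with regmatches(). This is an alternative to the extended function described above, not its implementation, and the input string is a made-up example.

```r
# Illustrative input (hypothetical)
x <- "a1b22c333"

# gregexpr() returns, per input string, the start positions of all matches
m <- gregexpr("[0-9]+", x)
starts  <- as.integer(m[[1]])               # 2 4 7
lengths <- attr(m[[1]], "match.length")     # 1 2 3

# regmatches() uses that match data to pull out the substrings themselves
digits <- regmatches(x, m)[[1]]             # "1" "22" "333"
```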

These are the functions used in R string manipulation.

4. Regular Expressions

A regular expression is a pattern that describes a set of strings. We use two types of regular expressions in R: extended regular expressions (the default) and Perl-like regular expressions, selected with perl = TRUE.
Regular expression syntax
A regular expression specifies the characters to seek out, with information about repeats and location within the string. This is done with the help of metacharacters that have a specific meaning: $ * + . ? [ ] ^ { } | ( ) \
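
The sketch below illustrates a few of these metacharacters with grepl(). The vector words is invented for the example. Note that a backslash has to be doubled inside an R string literal.

```r
# Illustrative data (hypothetical)
words <- c("cat", "cart", "ct", "a.b", "axb")

# ^ and $ anchor the match; ? means "zero or one" of the preceding item
anchored <- grepl("^ca?r?t$", words)      # TRUE TRUE TRUE FALSE FALSE

# An unescaped . matches any single character
any_char <- grepl("a.b", words)           # FALSE FALSE FALSE TRUE TRUE

# Escaping with \\ makes the dot literal
literal_dot <- grepl("a\\.b", words)      # FALSE FALSE FALSE TRUE FALSE

# perl = TRUE enables Perl-like features such as lookahead
lookahead <- grepl("c(?=a)", words, perl = TRUE)  # TRUE TRUE FALSE FALSE FALSE
```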

5. Use of String Utilities in the edtdbg Debugging Tool

The internal code of the edtdbg debugging tool makes heavy use of string utilities. A typical example of such usage is the dbgsendeditcmd() function:

# send command to editor
# (vimserver is a variable defined outside this function)
dbgsendeditcmd <- function(cmd) {
  syscmd <- paste("vim --remote-send ", cmd, " --servername ", vimserver, sep = "")
  system(syscmd)
}

The main point is that edtdbg sends remote commands to the Vim text editor. For instance, if we are running Vim with a server name of 168 and we want the cursor in Vim to move to line 12, we could type this into a terminal (shell) window:
vim --remote-send 12G --servername 168
The effect would be the same as if you had typed 12G directly in the Vim window.
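
To see what string paste() produces for this case without actually launching Vim, the command construction can be sketched on its own. The helper build_vim_cmd is a hypothetical name introduced here for illustration; it is not part of edtdbg, and it only builds the string rather than calling system().

```r
# Hypothetical helper: builds the shell command as a string, does not run it
build_vim_cmd <- function(cmd, server) {
  paste("vim --remote-send ", cmd, " --servername ", server, sep = "")
}

# Move the cursor to line 12 in the Vim instance with server name 168
syscmd <- build_vim_cmd("12G", 168)
print(syscmd)   # "vim --remote-send 12G --servername 168"
```

Separating the string construction from the system() call like this makes it easy to inspect the command before it is sent.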
So, this was all on R string Manipulation.

6. Conclusion

We are all aware of what string manipulation refers to. In this tutorial on R string manipulation, we studied the string functions and their usage. Along with strings, it is also necessary to learn how to express patterns over them, so we covered regular expressions as well. If you still have any doubt regarding R string manipulation, ask in the comment tab.