Regular Expression in Linux


In this article, you will learn about regular expressions in Linux-based operating systems. We will go through what regex is, why we use it, the different types of regular expressions in Linux, and examples of each in detail. So sit down, grab a snack, and read right till the end!

What is Linux Regex?

First of all, regex is the abbreviation for “Regular Expressions”. However, they are not as regular as they sound! Regular expressions are patterns, built from ordinary and special characters, that help us search data and match complex pieces of text in Linux-based operating systems.

Regular expressions in Linux are most commonly used with commands like grep, sed, ed, awk, and vi. We will focus on using regular expressions with the grep command in this article, with a few honorable mentions of other commands as well!

Regex is a really powerful notation for describing sequences of characters, and it is used heavily on the command line. Regex is also called regexp.

Types of Regular Expressions in Linux

In this article, let us divide regular expressions into the following 3 types while trying to understand regex. We will also look at how to use each expression in the terminal. The 3 types of regex are:

1. Basic regular expressions

2. Interval regular expressions

3. Extended regular expressions

Let us now take a closer look at each of these types:

1. Basic regular expressions

Before we look at practical examples, let us go through the list of the basic regular expressions:

a. .

This basic regular expression matches any single character.

b. ^

This basic regular expression matches the start of the string (with grep, the start of each line).

c. $

This basic regular expression matches the end of the string (with grep, the end of each line).

d. *

This basic regular expression matches the preceding character zero or more times.

e. \

This basic regular expression escapes the special character that follows it, so that the character is matched literally.

f. ()

This basic regular expression groups regular expressions. With grep, it needs the “-E” option.

g. ?

This basic regular expression matches the preceding character zero or one time, which makes that character optional. With grep, it needs the “-E” option.

Let us now look at an example for each of the basic regexes:

To try out regular expressions we need a text file. For the sake of an example, let us consider the file “fruits.txt”, which contains a very long list of fruits!

a. Using ‘dot’ to match string

Using the dot (.) expression, we can find a string even if we don’t know the full string. We put the dot expression in place of the character we don’t know.

[Screenshot: using dot to match a string]

In the above example, even though we specified the dot expression in place of the “l” in the text “Apple”, the command gave us the lines that contain the text “Apple”.
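Since the screenshot is not reproduced here, the following is a minimal sketch of the same idea. The file contents are hypothetical sample data, not the article's actual “fruits.txt”.

```shell
# Hypothetical sample data standing in for the article's fruits.txt.
fruits=$(mktemp)
printf '%s\n' 'Apple' 'Banana' 'Pineapple' 'Custard Apple' > "$fruits"

# "." matches exactly one character, so "App.e" matches "Apple".
# "Pineapple" is skipped because grep is case-sensitive by default.
matches=$(grep 'App.e' "$fruits")
printf '%s\n' "$matches"

rm -f "$fruits"
```

With this sample data, the lines “Apple” and “Custard Apple” are printed.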

b. Using ‘caret’ to match the beginning of the string

We can use the caret (^) expression to search for lines that begin with the specified text.

[Screenshot: using caret to match the beginning of the string]

The above command gave us all the lines beginning with the letter “B”.
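As a rough sketch of what such a command might look like (the sample lines below are assumptions, not the article's real file):

```shell
# Hypothetical sample data standing in for the article's fruits.txt.
fruits=$(mktemp)
printf '%s\n' 'Apple' 'Banana' 'Blueberry' 'Cherry' 'Red Banana' > "$fruits"

# "^B" only matches a "B" at the very start of a line,
# so "Red Banana" is not printed even though it contains a "B".
matches=$(grep '^B' "$fruits")
printf '%s\n' "$matches"

rm -f "$fruits"
```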

c. Using ‘dollar’ to match the end of the string

Just like we have the expression “^” to match the beginning of a string, we also have the expression “$” to match the end of the string.

[Screenshot: using dollar to match the end of the string]

The command in the example above prints all the lines that end with the letter “e”.
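A small sketch of this kind of search, again on assumed sample data:

```shell
# Hypothetical sample data standing in for the article's fruits.txt.
fruits=$(mktemp)
printf '%s\n' 'Apple' 'Banana' 'Grape' 'Grapes' 'Orange' > "$fruits"

# "e$" only matches an "e" that is the last character of the line,
# so "Grapes" is skipped even though it contains an "e".
matches=$(grep 'e$' "$fruits")
printf '%s\n' "$matches"

rm -f "$fruits"
```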

d. Using ‘asterisk’ to find the repetition of a letter

Use the asterisk (*) expression to match a repeated letter in a word. The letter before the asterisk may appear any number of times, from zero upwards!

[Screenshot: using asterisk to find the repetition of a letter]

The command in the example prints the lines that contain the text “Aple” with any number of repetitions of the letter “p”, meaning that it will even print out strings like “Appple”, “Appppppple”, “Appppppppppple” and so on if they exist. Because zero repetitions are also allowed, a string like “Ale” would match too.
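The behaviour can be sketched as follows. The pattern 'Ap*le' is an assumption about what the screenshot showed, and the sample lines are made up to cover each case.

```shell
# Hypothetical sample data; the odd spellings are deliberate test cases.
fruits=$(mktemp)
printf '%s\n' 'Ale' 'Aple' 'Apple' 'Appple' 'Banana' > "$fruits"

# "p*" means "zero or more p", so every spelling from "Ale" to "Appple" matches.
matches=$(grep 'Ap*le' "$fruits")
printf '%s\n' "$matches"

rm -f "$fruits"
```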

e. Using ‘backslash’ to match a special symbol

If we want to search for characters that have a special meaning in regular expressions, like the dot (.), asterisk (*), or dollar sign ($), we use the ‘backslash’ expression. We specify the special character we want to search for right after the backslash, and it is then matched literally.

[Screenshot: using backslash to match a special symbol]

The command shown in the above screenshot displays all the lines that have a space in them.
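The screenshot's command matches a space. As a complementary sketch, here is the more common case of escaping a dot so that it stops meaning “any character”. The two sample lines are hypothetical.

```shell
# Hypothetical sample data: one real file name and one look-alike.
names=$(mktemp)
printf '%s\n' 'fruits.txt' 'fruitsXtxt' > "$names"

# Unescaped, "." matches any character, so both lines are counted.
unescaped=$(grep -c 'fruits.txt' "$names")

# Escaped, "\." matches only a literal dot.
escaped=$(grep 'fruits\.txt' "$names")

printf 'unescaped pattern matched %s lines\n' "$unescaped"
printf 'escaped pattern matched: %s\n' "$escaped"

rm -f "$names"
```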

f. Using ‘parentheses’ to match a group of regexp

If we want to search for a piece of text as a single group, we put the word we want to search for inside parentheses. It must be noted that while using parentheses with the grep command, we must use the option “-E”, which enables extended regular expressions.

[Screenshot: using parentheses to match a group of regexp]

In the above screenshot, the command prints out all the lines with the text “fruit” in them.
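A minimal sketch of grouping with “-E”, on assumed sample data. The second pattern is an extra illustration (not from the article) of why grouping is useful: a repetition can apply to the whole group instead of a single character.

```shell
# Hypothetical sample data standing in for the article's fruits.txt.
fruits=$(mktemp)
printf '%s\n' 'Star fruit' 'Dragon fruit' 'Banana' 'Bana' 'Apple' > "$fruits"

# A plain group: matches any line containing "fruit".
with_fruit=$(grep -E '(fruit)' "$fruits")
printf '%s\n' "$with_fruit"

# A repeated group: "(na){2}" requires "na" twice, so only "Banana" matches.
repeated=$(grep -E '^Ba(na){2}$' "$fruits")
printf '%s\n' "$repeated"

rm -f "$fruits"
```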

g. Using ‘?’ to make a character optional

The “?” expression makes the preceding character optional, so we can print the lines that match the pattern either with or without that character. With grep, it is used together with the “-E” option.

[Screenshot: using question mark to print all the matching characters]

The command shown in the above screenshot prints all the lines that start with either “C” or “Ch”, because the “?” makes the “h” optional. However, if we run the exact same command without the “?” expression, we get only the lines that start with “Ch”, as shown:

[Screenshot: testing output without using the question mark regex]
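The two runs can be sketched like this. The patterns and sample lines are assumptions, since the exact command in the screenshots is not reproduced here.

```shell
# Hypothetical sample data standing in for the article's fruits.txt.
fruits=$(mktemp)
printf '%s\n' 'Cherry' 'Coconut' 'Cranberry' 'Apple' > "$fruits"

# With "?": the "h" is optional, so every line starting with "C" matches.
with_optional=$(grep -E '^Ch?' "$fruits")
printf 'With ?:\n%s\n' "$with_optional"

# Without "?": the "h" is required, so only "Cherry" matches.
without_optional=$(grep -E '^Ch' "$fruits")
printf 'Without ?:\n%s\n' "$without_optional"

rm -f "$fruits"
```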

2. Interval regular expressions

These expressions specify how many times the preceding character must be repeated for a line to match. They are more sophisticated, yet simple. With grep, they are used together with the “-E” option. Let us look at them:

a. {n}

This interval regular expression matches the preceding character when it appears exactly “n” times.

b. {n,m}

This interval regular expression matches the preceding character when it appears at least “n” times but not more than “m” times, meaning it matches between “n” and “m” repetitions of the character.

c. {n,}

This interval regular expression matches the preceding character when it appears “n” times or more.

Let us now look at an example for each of the 3 interval regular expressions along with the grep command:

a. Using the “{n}” expression

In the command shown below, we used the expression {n} to search for words that have 2 consecutive occurrences of the character “p”.

[Screenshot: using the {n} expression]

b. Using the “{n,m}” expression

In the command shown below, we used the expression {n,m} to search for words that have at least 1 occurrence of “p” and at most 2 occurrences of “p”.

[Screenshot: using the {n,m} expression]

c. Using the “{n,}” expression

In the command shown below, we used the expression {n,} to search for words that have the character “p” at least twice in a row.

[Screenshot: using the {n,} expression]
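The three interval forms can be compared side by side in a short sketch. The sample spellings are hypothetical, and the patterns are wrapped in “A…le” (an assumption, not the article's command) so that the surrounding letters pin down exactly how many “p” characters are allowed.

```shell
# Hypothetical sample data; the odd spellings are deliberate test cases.
fruits=$(mktemp)
printf '%s\n' 'Aple' 'Apple' 'Appple' 'Banana' > "$fruits"

exactly_two=$(grep -E 'Ap{2}le' "$fruits")     # exactly 2 "p"
one_to_two=$(grep -E 'Ap{1,2}le' "$fruits")    # 1 or 2 "p"
two_or_more=$(grep -E 'Ap{2,}le' "$fruits")    # 2 or more "p"

# Without surrounding context, "p{2}" matches any line that contains "pp",
# so "Appple" matches as well.
loose=$(grep -E 'p{2}' "$fruits")

printf '{2}:\n%s\n{1,2}:\n%s\n{2,}:\n%s\n' "$exactly_two" "$one_to_two" "$two_or_more"
printf 'loose p{2}:\n%s\n' "$loose"

rm -f "$fruits"
```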

3. Extended regular expressions

These expressions help us find text where one character is repeated, or optionally present, before another piece of text. They are written here with a backslash, which is how GNU grep accepts them without the “-E” option; with “-E”, we write “+” and “?” without the backslash. The following are the extended regular expressions:

a. \+

This extended regular expression matches one or more occurrences of the previous character.

b. \?

This extended regular expression matches zero or one occurrence of the previous character.

Let us look at an example for each of the 2 extended regular expressions:

a. Using “\+”

The command in the screenshot below prints all the lines in which the character “t” is preceded by one or more occurrences of the character “a”.

[Screenshot: using “\+”]

b. Using “\?”

The command in the screenshot below prints all the lines in which the character “t” is preceded by the character “a”, and also the lines in which the character “t” appears on its own.

[Screenshot: using “\?”]
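Here is a minimal sketch of both expressions, assuming GNU grep (the backslash forms are a GNU extension to basic regular expressions) and some made-up sample lines.

```shell
# Hypothetical sample data standing in for the article's fruits.txt.
fruits=$(mktemp)
printf '%s\n' 'Date' 'Tomato' 'Beetroot' 'Plum' > "$fruits"

# "a\+t": one or more "a" followed by "t".
one_or_more=$(grep 'a\+t' "$fruits")

# "a\?t": an optional "a" followed by "t", so any line with a "t" matches.
zero_or_one=$(grep 'a\?t' "$fruits")

# The same first search written for extended mode.
extended=$(grep -E 'a+t' "$fruits")

printf 'a\\+t:\n%s\n' "$one_or_more"
printf 'a\\?t:\n%s\n' "$zero_or_one"

rm -f "$fruits"
```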

Brace expansion in Linux

Here is a bonus example that looks like a regular expression but is actually a feature of the shell: brace expansion, written with {}. Using brace expansion we can specify a list or a range of items to perform operations on. Here are some examples:

[Screenshot: examples of brace expansions]

One real-life use of brace expansion is downloading a continuous range of pages using the wget command:

[Screenshot: using brace expansion with the wget command]

We can use brace expansion with many other commands as well.
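A few expansions can be sketched as follows. Brace expansion is done by Bash rather than by plain POSIX sh, so the sketch calls bash explicitly. The file names and the example.com URLs are hypothetical, and nothing is actually downloaded.

```shell
# A numeric range expands to one word per number.
list=$(bash -c 'echo file{1..3}.txt')
printf '%s\n' "$list"

# Two brace lists side by side expand to every combination.
combo=$(bash -c 'echo {a,b}{1,2}')
printf '%s\n' "$combo"

# The same idea builds a list of URLs that could be handed to wget.
urls=$(bash -c 'echo https://example.com/page{1..3}.html')
printf '%s\n' "$urls"
```

The expansion happens before the command runs, so wget (or any other command) simply receives the finished list of arguments.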

Table of metacharacters

Even though we have used most of the metacharacters, let us look at everything in one table to get a better picture:

| No. | Expression | Description |
|-----|------------|-------------|
| 1 | . | This metacharacter matches any single character. |
| 2 | ^ | This metacharacter matches the start of the string; inside a bracket expression, it represents the characters not in the list. |
| 3 | $ | This metacharacter matches the end of the string. |
| 4 | * | This metacharacter matches the preceding character zero or more times. |
| 5 | \ | This metacharacter escapes the special character that follows it, so that it is matched literally. |
| 6 | () | This metacharacter groups regular expressions. |
| 7 | ? | This metacharacter matches the preceding character zero or one time. |
| 8 | + | This metacharacter matches the preceding character one or more times. |
| 9 | {N} | Preceding character is matched exactly N times. |
| 10 | {N,} | Preceding character is matched N times or more. |
| 11 | {N,M} | Preceding character is matched at least N times, but not more than M times. |
| 12 | - | This metacharacter represents a range inside a bracket expression, such as [a-z]. |
| 13 | \b | This metacharacter matches the empty string at the edge of a word. |
| 14 | \B | This metacharacter matches the empty string if it is not at the edge of a word. |
| 15 | \< | This metacharacter matches the empty string at the beginning of a word. |
| 16 | \> | This metacharacter matches the empty string at the end of a word. |
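The word-edge metacharacters at the end of the table do not appear in the earlier examples, so here is a small sketch of how they behave, assuming GNU grep and some made-up sample lines.

```shell
# Hypothetical sample data.
words=$(mktemp)
printf '%s\n' 'apple pie' 'pineapple' 'apples' > "$words"

# \< and \> together: "apple" must be a whole word.
whole_word=$(grep '\<apple\>' "$words")

# \b before the text: "apple" must start a word.
word_start=$(grep '\bapple' "$words")

# \> after the text: "apple" must end a word.
word_end=$(grep 'apple\>' "$words")

# \B before the text: "apple" must NOT start a word.
inside_word=$(grep '\Bapple' "$words")

printf 'whole word: %s\n' "$whole_word"
printf 'inside a word: %s\n' "$inside_word"

rm -f "$words"
```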

Shell scripting using Regular Expression in Linux

We can also use regular expressions in shell scripting, here are some examples:

1. Using “^” in shell scripting

Here is a shell program to print words starting with the letter “B”.

[Screenshot: using caret in shell scripting]

[Screenshot: output of shell program using caret]
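The script in the screenshot is not reproduced here, so the following is only a sketch of how such a program might be written. The function name print_b_words and the sample file are hypothetical.

```shell
#!/bin/sh
# print_b_words FILE: print every line of FILE that starts with "B".
print_b_words() {
    while IFS= read -r word; do
        # grep -q prints nothing; its exit status tells us whether "^B" matched.
        if printf '%s\n' "$word" | grep -q '^B'; then
            printf '%s\n' "$word"
        fi
    done < "$1"
}

# Hypothetical sample data standing in for the article's fruits.txt.
fruits=$(mktemp)
printf '%s\n' 'Apple' 'Banana' 'Blueberry' 'Cherry' > "$fruits"

result=$(print_b_words "$fruits")
printf '%s\n' "$result"

rm -f "$fruits"
```

The loop is there to show how a regex test fits into an if statement; for a plain filter, running grep '^B' on the file directly does the same job.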

2. Using “*” in shell scripting

Here is a shell program to print words having occurrences of “ap” in them.

[Screenshot: using asterisk in shell program]

[Screenshot: output of the shell program using asterisk]
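As a sketch under assumptions (the function name print_ap_words, the word list, and the exact pattern are not from the article), such a program could look like this:

```shell
#!/bin/sh
# print_ap_words WORD...: print each argument that contains "ap".
print_ap_words() {
    for word in "$@"; do
        # "app*" is "ap" followed by zero or more extra "p" characters.
        # A bare "ap*" would also match "Banana", because "p*" may match nothing.
        if printf '%s\n' "$word" | grep -q 'app*'; then
            printf '%s\n' "$word"
        fi
    done
}

result=$(print_ap_words 'Grape' 'Pineapple' 'Banana' 'Cherry')
printf '%s\n' "$result"
```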

3. Using “?” in shell scripting

Here is a shell program to print words having occurrences of “ch” in them.

[Screenshot: using question mark in shell program]

[Screenshot: output of shell program using question mark]
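One possible shape for such a program is sketched below. The pattern 'ch?' and the sample words are assumptions. Note that because the “h” is optional, words that contain only a “c” are printed as well.

```shell
#!/bin/sh
# Hypothetical sample data standing in for the article's fruits.txt.
fruits=$(mktemp)
printf '%s\n' 'cherry' 'coconut' 'peach' 'apple' > "$fruits"

# grep exits with status 0 when at least one line matches,
# so it can be used directly as the condition of an if statement.
if matches=$(grep -E 'ch?' "$fruits"); then
    printf 'Matching words:\n%s\n' "$matches"
else
    echo 'No matching words found'
fi

rm -f "$fruits"
```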

Summary

As you have seen, regular expressions are a compact set of operators that make life so much easier, as they improve the efficiency of your workflow. You have now learned what regular expressions are, why they are used, and the types of regular expressions, where we covered the basic, interval, and extended types along with examples.
