Gawk Command in Linux


In this article, you will learn about the gawk command in Linux-based operating systems. We will look at the gawk command, why it is used, how to install it, its syntax, and the options used along with it. We will also look at examples of the gawk command by pairing it with various options to understand its working. So pay attention, take notes, and read to the end for the best benefits.

What is the gawk command in Linux?

Gawk is the GNU implementation of AWK, a pattern scanning and text processing language, and it is available as a command-line utility on Linux-based operating systems. Gawk programs need no compiling, because they are interpreted, and they can use variables, numeric functions, string functions, logical operators, and more.

Just like the awk command, gawk also enables programmers to write tiny and effective programs in the form of statements that define text patterns to be searched for in a text document and the action to be taken when a match is found within a line.

The gawk command may seem simple, but it is capable of many things. For example, it can scan files line by line, split each input line into fields, transform data files, and produce formatted output. It can also handle arithmetic operations, string operations, conditional statements, loops, and much more.

How does Linux gawk work?

The primary purpose of the gawk command is to make text manipulation and information retrieval an easy job in Linux distributions. The command works by scanning a set of input lines and then searching for the lines that match the pattern specified by the user.

The gawk command accepts input data, which is transformed and sent to the standard output. For each pattern gawk recognises, the user can describe an action on each line. Gawk can easily process complex log files and give a readable output.

What is the syntax of the Linux gawk command?

The syntax of the gawk command might look slightly intimidating at first, but once we understand the fields in its syntax, it becomes a cakewalk. The syntax of the gawk command is as shown below.

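gawk <options> -f <program file> <input file>

Let us see the fields present in gawk command syntax.

1. <options>

This field takes in a range of options that specify how the gawk command must function, format, and print the output. You can write each option in either the short POSIX style (for example, -f) or the long GNU style (for example, --file).

2. <program file>

It takes in the name of the file where you have written the program. To do so, you must use the option “-f”. You can also write your program in one line, enclosed in single quotes, without using a program file.

3. <input file>

It takes in the name of the file that gawk should process. If you do not specify an input file, gawk reads from the standard input.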

Options used with the Linux gawk command

The gawk command accepts a fair number of options. Let us take a brief look at the most useful ones.

1. -f

As we discussed above, this option reads the AWK program source from the file you specified instead of from the first command line argument. You can also write this option as “--file”.

2. -F

This option sets the input field separator to the value you specify, which is the same value held by the built-in variable FS. This option can also be written as “--field-separator”.

3. -v

Before the execution of the program begins, this option assigns the value you specified to the variable you chose. This option can also be written as “--assign”.

4. -b

This option treats all input data as single-byte characters. You can also write this option as “--characters-as-bytes”.

5. -c

This option runs the gawk command in compatibility mode, where gawk behaves like Brian Kernighan's version of awk and the GNU-specific extensions are not recognized. This option can also be written as “--traditional”.

6. -d

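This option prints a sorted list of global variables, their types, and final values to the file you specified. If you do not specify a file, gawk writes the list to a file named awkvars.out in the current directory. This option can also be written as “--dump-variables”.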

7. -C

This option prints the short version of the GNU copyright information message and exits. This option can also be written as “--copyright”.

8. -e

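This option takes the AWK program text as its argument. It allows the easy intermixing of library functions (loaded with -f) with source code entered on the command line, and it is intended mainly for AWK programs used in shell scripts. You can also write this option as “--source”.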

9. -g

This option scans and parses the AWK program, and generates a GNU .pot (Portable Object Template) format file on standard output with entries for all localizable strings in the program. The program itself is not executed. This option can also be written as “--gen-pot”.

10. -L

This option provides warnings about constructs that are dubious or non-portable to other AWK implementations. This option can also be written as “--lint”.

11. -n

This option recognizes octal and hexadecimal values in input data. We can also write this as “--non-decimal-data”.

12. --help

This option displays the help menu of the gawk command, which is a short summary of the available options. This option can also be written as “-h”.

13. -O

This option enables optimisations upon the internal representation of the program. In recent gawk releases, these optimisations are already enabled by default. This option can also be written as “--optimize”.

14. -r

This option enables the use of interval expressions in regular expression matching. You can also write this option as “--re-interval”.

15. -N

This option forces the gawk command to use the locale’s decimal point character when parsing input data. You can also write this option as “--use-lc-numeric”.

16. -V

This option displays the version of the gawk command you are using. This option can also be written as “--version”.

Default behaviour of the Linux gawk command

Let us consider a text file that contains five names along with phone numbers, with one name and one number on each line.

Now, if we use the gawk command shown below, it will print the file’s contents:

gawk '{print}' <filename>.txt


Printing lines that match a specific pattern

Like the grep command, we can also use the gawk command to print out the lines that match the specific pattern. To do so, execute the gawk command by using the syntax shown below:

gawk '/<pattern>/ {print}' <filename>


Printing only a specific column of the file

To print only a specific column of a file, use the command shown below:

gawk '{print $<column number>}' <filename>.txt


Displaying line numbers

If you want to display the line number at the start of each line, you can use the gawk command as shown below:

gawk '{print NR, $0}' <filename>.txt


Finding the length of the longest line present in the file

If you want to find out the length of the longest line that is present in a file, you can use the command shown below:

gawk '{ if (length($0) > max) max = length($0) } END { print max }' <filename>.txt


Counting the number of lines in the file

To count the number of lines in a file, use the following command:

gawk 'END { print NR }' <filename>.txt


Printing lines with more than a specific number of characters

If you want to print the lines that contain more than the specified number of characters, you can make use of the command shown below:

gawk 'length($0) > <number of characters>' <filename>.txt


Built-in variables of the Linux gawk command

The gawk command also has several built-in variables used for different purposes. Let us take a brief look at them, along with some examples.

1. NR: It holds the number of input records read so far, which by default is the current line number.

Example:

gawk '{print NR "-" $1 }' mobile.txt


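2. NF: It holds the number of fields in the current input record.

3. FS: It contains the input field separator, which divides each input record into fields. By default, fields are separated by whitespace.

4. RS: It stores the input record separator, which divides the input into records. By default, each line is one record.

Example:

gawk 'BEGIN{FS=":"; RS="-"} {print $1, $6, $7}' /etc/passwd

Here, FS=":" splits each record into fields at every colon, and RS="-" makes gawk end a record at every hyphen instead of at every newline, so the records no longer correspond to the lines of the file.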

5. ORS: It stores the output record separator, which separates the output lines when Awk prints them.

6. OFS: It stores the output field separator, which separates the fields when Awk prints them.

Example:

gawk 'BEGIN{FS=":"; OFS="-"} {print $1, $6, $7}' /etc/passwd


Summary

As you have seen, the gawk command is a simple yet powerful Linux utility used for pattern scanning and text processing. It allows programmers to write tiny and effective programs as statements that define the text patterns to search for and the actions to take when a match is found.
