csplit command in Linux with examples


In this article, we will learn about the csplit command in Linux-based operating systems. We will look at what the csplit command is, why it is used, its syntax, and the options used along with it. In the end, we will also look at some practical examples of the csplit command, pairing it with various options to understand how it works.

What is the Linux csplit command?

csplit is a command-line utility in Linux-based operating systems that splits a file into as many parts as the user requires. Technically speaking, it copies the specified file and separates the copy into segments, so the input file itself remains unaltered.

Upon splitting the file, the csplit command writes the segments to files named xx00, xx01, xx02, . . . , xx97, xx98, xx99, depending on the number of arguments you specified. With the default two-digit suffix, the numbering runs from 00 to 99; the -n option described later changes the number of digits.

It might sound a little intimidating at first, but once you understand the working of the csplit command, it becomes a cakewalk. Let us understand this with the help of an example. Consider the command shown below:

csplit <filename> 11 72 98

The csplit command shown above creates four files:
1. xx00 – This file contains lines 1 to 10
2. xx01 – This file contains lines 11 to 71
3. xx02 – This file contains lines 72 to 97
4. xx03 – This file contains lines 98 to the last line

See, it is as simple as that! The xx00 file contains the lines from the beginning of the original file up to, but not including, the line number given as the first argument.

Similarly, the xx01 file starts at the line given by the first argument and stops at the line just before the one given by the second argument.
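To see these boundaries for yourself, here is a small sketch. It assumes GNU csplit and uses a made-up 100-line file called numbers.txt (not part of the original example), in which line N simply contains the number N:

```shell
# Work in a scratch directory so no existing files are touched.
cd "$(mktemp -d)"

# Build a 100-line sample file: line N contains the number N.
seq 1 100 > numbers.txt

# Split before lines 11, 72 and 98.
# csplit prints the size in bytes of each piece it writes.
csplit numbers.txt 11 72 98

# Count the lines in each piece:
# xx00 has 10 lines, xx01 has 61, xx02 has 26 and xx03 has 3.
wc -l xx00 xx01 xx02 xx03

# Each later piece begins with the line number given as its argument.
head -n 1 xx01   # 11
head -n 1 xx02   # 72
head -n 1 xx03   # 98
```

Because every line holds its own number, the first line of each piece tells you exactly where the split happened.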

What is the syntax of the Linux csplit command?

The syntax of the csplit command is straightforward once we understand the fields present in it. The syntax of the csplit command is shown below:

csplit <OPTIONS> <FILE> <PATTERN>

Let us look at the fields present in the syntax of the csplit command.

1. <OPTIONS>

This field takes in a range of options that specify how the csplit command must function, format, and print the output. You can also specify multiple options in this field.

2. <FILE>

This field takes the name of the file you want to split. As we discussed earlier, this file remains unaltered; the csplit command writes the pieces to new files. If the file you want to split is not in your current working directory, you can specify its full path instead of going to that location.

3. <PATTERN>

This field takes the arguments that describe where you want to split the input file. In this article, each argument is a line number at which a new piece begins, so the number of split files is one more than the number of arguments you specify. (csplit also accepts regular expressions as patterns, written as /REGEXP/.)
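The three fields can be put together as in the sketch below. It assumes GNU csplit; the directory variables and the file name input.txt are made up for illustration. The input file lives in a different directory and is referenced by its full path, while the pieces are written to the current directory:

```shell
# Two scratch directories: one holds the input, the other receives the pieces.
src_dir="$(mktemp -d)"
out_dir="$(mktemp -d)"

# A 12-line sample input: line N contains the number N.
seq 1 12 > "$src_dir/input.txt"

# Run csplit from the output directory.
cd "$out_dir"

# OPTIONS: -s (quiet mode)
# FILE:    full path to the input file
# PATTERN: two line numbers, 5 and 9
csplit -s "$src_dir/input.txt" 5 9

# Two arguments give three pieces: lines 1-4, 5-8 and 9-12.
ls
```

The pieces land in the directory you run csplit from, not next to the input file.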

Options used with the Linux csplit command

Unlike most commands in Linux, the csplit command comes with very few options. Let us take a detailed look at each.

1. -b

This option uses the sprintf format you specify for the numeric suffix instead of the default %02d. This option can also be written as “--suffix-format”.

2. -f

This option uses the prefix you specify in the output file names instead of the default “xx”. This means the files are no longer named xx00, xx01, etc. You can also write this option as “--prefix”.

3. -k

With this option, csplit does not remove the output files when an error occurs. This option can also be written as “--keep-files”.

4. -n

This option uses the number of digits you specify in the output file names instead of the default 2. This option can also be written as “--digits”.

5. -s

This option enables quiet mode, where the sizes of the output files are not printed. Error messages are still shown. This option can also be written as “-q”, “--quiet”, or “--silent”.

6. -z

This option suppresses empty output files, so zero-length pieces are not created. You can also write this option as “--elide-empty-files”.

7. --help

This option displays the help menu of the csplit command, as shown below:

help menu of the csplit command

8. --version

This option displays the version of the csplit command you are using in your system.

version of the csplit command
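Of the options above, “-b” is the one that has no dedicated example later in this article, so here is a minimal sketch of it. It assumes GNU csplit and a made-up 10-line file named file; the format string '%03d.txt' is just one possible choice:

```shell
# Work in a scratch directory with a 10-line sample file.
cd "$(mktemp -d)"
seq 1 10 > file

# -b takes a printf-style format for the numeric suffix.
# Here: three zero-padded digits followed by a ".txt" extension.
csplit -s -b '%03d.txt' file 4 8

# The pieces are now named xx000.txt, xx001.txt and xx002.txt.
ls xx*
```

This is handy when the pieces need a file extension, which the “-n” option alone cannot add.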

Now that we have covered the theory and fundamentals of the csplit command, let us look at some examples of the csplit command in the terminal of Ubuntu 20.04.

Before we look into some examples of using the csplit command, let us consider the input file shown below:

sample text file to test the csplit command

Splitting a file into 2 parts

If you want to split a file into two parts, use the syntax shown below:

csplit <filename> <parameter>

For example, the command “csplit file 3” will split the file into two parts as shown in the screenshot below:

 

output of splitting a file

The first file (xx00) will contain lines 1 and 2, whereas the second file (xx01) will contain lines 3 to 10, as shown in the screenshot below:

splitting the file into two parts
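If you want to reproduce this without the screenshots, the sketch below uses a stand-in input file. The original sample file is not available here, so we assume a 10-line file named file in which line N contains the number N, and GNU csplit:

```shell
# Work in a scratch directory with a 10-line stand-in for the sample file.
cd "$(mktemp -d)"
seq 1 10 > file

# Split before line 3. csplit prints the size in bytes of each piece:
# 4
# 17
csplit file 3

# xx00 holds lines 1 and 2.
cat xx00

# xx01 holds lines 3 to 10.
cat xx01
```

The two sizes add up to the size of the input file, which is a quick way to confirm that nothing was lost in the split.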

 

Naming the split files

We saw that by default, the csplit command names the split files xx00, xx01, and so on. If you want to replace the xx with a prefix of your choice, all you have to do is execute the csplit command by pairing it with the option “-f” followed by the prefix as shown below:

csplit -f <prefix> <filename> <parameter>

naming the split files
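As a concrete sketch of the same idea, assuming GNU csplit, a made-up 10-line file named file, and the arbitrary prefix part_:

```shell
# Work in a scratch directory with a 10-line sample file.
cd "$(mktemp -d)"
seq 1 10 > file

# Use "part_" instead of the default "xx" prefix.
csplit -f part_ file 3

# The directory now contains: file, part_00 and part_01.
ls
```

Only the prefix changes; the two-digit numbering stays the same.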

Keeping output files on errors

If csplit encounters an error while splitting a file, it removes all the output files it has created. Here is an example. In the screenshot below, the csplit command produces an error because the file has no line 15. Due to this error, there are no output files:

keeping output files on errors

However, we can still keep the files even if there is an error by using the option “-k”.
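The difference can be sketched as follows, assuming GNU csplit and a made-up 10-line file named file. Both runs ask for a split at line 15, which does not exist:

```shell
# Work in a scratch directory with a 10-line sample file.
cd "$(mktemp -d)"
seq 1 10 > file

# Without -k: csplit reports that line 15 is out of range, exits with a
# non-zero status, and deletes the pieces it had already written.
csplit file 3 15 || echo "csplit failed with status $?"
ls   # only "file" is left

# With -k: the same error occurs, but the pieces written so far are kept.
csplit -k file 3 15 || echo "csplit failed with status $?"
ls   # file, xx00 and xx01
```

With “-k”, xx00 holds lines 1 and 2 and xx01 holds the rest of the file, since csplit had already copied those lines before it ran out of input.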

Changing the number of digits in the filename

We saw that in the default naming of the output files, the csplit command uses two digits (00, 01, 02, and so on). However, if you want the number of digits to be changed, you can use the option “-n” followed by the number of digits.

For example, if you specify “-n 3,” the output files will be named xx000, xx001, xx002, and so on. If you specify the number of digits to be one (“-n 1”), the output files will be named xx0, xx1, xx2, and so on.

changing the number of digits in the filename
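Both cases can be tried with the short sketch below, assuming GNU csplit and a made-up 10-line file named file:

```shell
# Work in a scratch directory with a 10-line sample file.
cd "$(mktemp -d)"
seq 1 10 > file

# Three digits: the pieces are named xx000 and xx001.
csplit -s -n 3 file 3
ls xx*

# Remove them before the second run.
rm xx*

# One digit: the pieces are named xx0 and xx1.
csplit -s -n 1 file 3
ls xx*
```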

Not printing the file sizes

If you observe the previous output, you will notice that when you split the input file, the csplit command immediately displays the sizes of the output files. If you don’t want this printed, you can enable quiet mode by using the option “-q” (equivalent to the “-s” option listed earlier) as shown:

csplit -q <file> <parameters>

not printing the file sizes
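Here is a small sketch comparing the default and quiet behaviour, assuming GNU csplit and a made-up 10-line file named file:

```shell
# Work in a scratch directory with a 10-line sample file.
cd "$(mktemp -d)"
seq 1 10 > file

# Default: csplit prints the size in bytes of each piece.
csplit file 3

# Quiet mode: the same pieces are written, but nothing is printed.
csplit -q file 3

# In GNU csplit, -s and --quiet are other spellings of the same option.
csplit -s file 3
csplit --quiet file 3
```

Quiet mode only hides the size report; error messages still appear.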

Deleting empty files

Sometimes, when we use the csplit command to separate data in the input file, we can get empty output files. We can avoid this by using the option “-z.”

deleting empty files
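One common way to end up with an empty file is to split on a regular expression that matches the very first line. The sketch below goes slightly beyond the line-number arguments used so far in this article: it assumes GNU csplit, its /REGEXP/ pattern and its '{*}' repeat count, and a made-up file named book.txt:

```shell
# Work in a scratch directory with a small three-chapter sample file.
cd "$(mktemp -d)"
printf '%s\n' 'Chapter 1' 'alpha' 'Chapter 2' 'beta' 'Chapter 3' 'gamma' > book.txt

# Without -z: nothing precedes the first "Chapter" line, so xx00 is empty
# and the chapters end up in xx01, xx02 and xx03.
csplit -s book.txt '/^Chapter/' '{*}'
wc -c xx*

# Remove the pieces before the second run.
rm xx*

# With -z: the empty piece is not created, and the numbering still
# starts at xx00, so the chapters end up in xx00, xx01 and xx02.
csplit -s -z book.txt '/^Chapter/' '{*}'
wc -c xx*
```

With “-z”, each output file corresponds to exactly one chapter, which is usually what you want.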

Summary

As you have seen, the csplit command is a simple tool that splits a file and copies the pieces to other files. You have now learned what the csplit command is, why it is used, the syntax of the csplit command, and the options used along with it. You have also learned how to use the options of the csplit command, as we have seen live examples of it in the terminal of Ubuntu 20.04 LTS.
