grep

# cat file.txt
This is a file with some text in it.
THIS IS A FILE WITH SOME TEXT IN IT.

Let’s try some grep flags and examine what they print to STDOUT.


Example 1) Find me all lines with the word “This” in them:

# grep "This" file.txt
This is a file with some text in it.

grep is case sensitive when the -i flag is omitted. Because of that, only the line with “This” is printed, and the line with “THIS” is skipped.


Example 2) Find me all lines with the word “this” in them (uppercase, lowercase, I don’t care):

# grep -i "This" file.txt
This is a file with some text in it.
THIS IS A FILE WITH SOME TEXT IN IT.

Different output this time. Now both lines are matched, as the -i flag tells grep to ignore case. This will match “THIS”, “this”, “tHiS” and every variation in between.


Example 3) Find me all lines that do NOT contain “This”, then all lines that do NOT contain “THIS”, and lastly all lines that do not contain “this” in any combination of upper and lower case.

# grep -v "This" file.txt
THIS IS A FILE WITH SOME TEXT IN IT.

# grep -v "THIS" file.txt
This is a file with some text in it.

# grep -vi "This" file.txt
#

# grep -vi "THIS" file.txt
#

The -v flag inverts the match.
The first command hides the line with “This” and shows every other line (lines with “THIS” or “this” would still be printed).
The second command hides the line with “THIS” and shows every other line (lines with “This” or “thiS” would still be printed).
Why do the third and fourth commands have no output? Because we added the -i flag, so the pattern matches every variation of “this”, and -v then drops all of those lines. Basically it says: don’t show me lines that have any variation of the word “this” (uppercase, lowercase, whatever). Both lines in our file contain one, so nothing is printed.
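
If you want to replay Examples 1 to 3 yourself, here is a minimal sketch. It assumes a POSIX shell with mktemp available; the temporary directory and the variable names (tmpdir, exact, nocase, inverted, none, status) are my own and not part of the examples above. It also shows that grep exits with status 1 when it selects no lines, which is why the last two commands print nothing.

```shell
#!/bin/sh
# Recreate the two-line sample file in a throwaway directory (path is arbitrary).
tmpdir=$(mktemp -d)
printf '%s\n' \
  'This is a file with some text in it.' \
  'THIS IS A FILE WITH SOME TEXT IN IT.' > "$tmpdir/file.txt"

# Case-sensitive match: only the mixed-case line is selected.
exact=$(grep "This" "$tmpdir/file.txt")

# Case-insensitive match: both lines are selected.
nocase=$(grep -i "This" "$tmpdir/file.txt")

# Inverted match: every line that does NOT contain "This".
inverted=$(grep -v "This" "$tmpdir/file.txt")

# Inverted and case-insensitive: no line is left, so grep exits with status 1.
status=0
none=$(grep -vi "This" "$tmpdir/file.txt") || status=$?

printf 'grep "This":     %s\n' "$exact"
printf 'grep -v "This":  %s\n' "$inverted"
printf 'grep -vi "This": [%s] (exit status %s)\n' "$none" "$status"

rm -rf "$tmpdir"
```

Checking the exit status is handy in scripts: you can test whether anything matched without looking at the output at all.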


Example 4) Okay, let’s change our text a bit and see what else we can do with grep. Consider the following text in the file.

# cat example.txt
set root=(md/md1)
set prefix=(md/md1)/grub
insmod linux
linux /vmlinuz-4.19.154a_x64 root=/dev/md0 ro
initrd /initramfs-4.19.154a_x64.img
boot

We want to search for all lines with the word “root” in them, so we grep them like this:

# grep "root" example.txt
set root=(md/md1)
linux /vmlinuz-4.19.154a_x64 root=/dev/md0 ro

Clearly, we used grep to pull all lines that have the word “root” in them.


Example 5) I need to find all lines with the word “root”. Also, I want to print two more lines after each “root” match. We can try this:

# grep "root" -A 2 example.txt
set root=(md/md1)
set prefix=(md/md1)/grub
insmod linux
linux /vmlinuz-4.19.154a_x64 root=/dev/md0 ro
initrd /initramfs-4.19.154a_x64.img
boot

Let’s take a closer look at the output here. grep found a line with “root” in it and printed two more lines after it (-A for after, 2 for two lines). Then it found the other line with “root” in it and printed the two lines following that one. Together that happens to cover the whole file.
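
To see where one context block ends and the next begins, it helps to shrink the window. The sketch below is only an illustration under my own assumptions (a temporary copy of example.txt, variable names after_one and after_two). With -A 1 the two blocks no longer touch, and GNU grep typically separates such non-adjacent blocks with a line containing "--"; other grep implementations may behave differently.

```shell
#!/bin/sh
# Recreate example.txt in a throwaway directory (path is arbitrary).
tmpdir=$(mktemp -d)
printf '%s\n' \
  'set root=(md/md1)' \
  'set prefix=(md/md1)/grub' \
  'insmod linux' \
  'linux /vmlinuz-4.19.154a_x64 root=/dev/md0 ro' \
  'initrd /initramfs-4.19.154a_x64.img' \
  'boot' > "$tmpdir/example.txt"

# -A 1: each matching line plus ONE line after it.
# "insmod linux" and "boot" fall outside both windows and are not printed.
after_one=$(grep -A 1 "root" "$tmpdir/example.txt")

# -A 2: each matching line plus TWO lines after it.
# The two windows are adjacent, so all six lines are printed.
after_two=$(grep -A 2 "root" "$tmpdir/example.txt")

printf '=== -A 1 ===\n%s\n' "$after_one"
printf '=== -A 2 ===\n%s\n' "$after_two"

rm -rf "$tmpdir"
```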


Example 6) Find the word “boot” and print two lines before it, and two lines after it.

# grep "boot" -B 2 -A 2 example.txt
linux /vmlinuz-4.19.154a_x64 root=/dev/md0 ro
initrd /initramfs-4.19.154a_x64.img
boot

We want grep to find the line with the word “boot”, and it did. We also want it to print two lines before it (-B for before) and two lines after it. Since “boot” is found on the last line, there isn’t anything after it, hence -A 2 has no effect here.


Example 7) Find lines with the word “prefix” and print two lines above and two lines below each match. We can try this:

# grep "prefix" -C 2 example.txt
set root=(md/md1)
set prefix=(md/md1)/grub
insmod linux
linux /vmlinuz-4.19.154a_x64 root=/dev/md0 ro

We see it found the line with the word “prefix” and tried to print two lines above it and two lines below it (-C for context). However, since the line with “prefix” is the second in our text, there is only one line above it, hence this result.
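
A quick way to convince yourself that -C 2 is just shorthand for -B 2 -A 2 is to run both and compare. This is a small sketch under my own assumptions (temporary copy of example.txt, variable names both_flags and context), not something taken from a particular manual.

```shell
#!/bin/sh
# Recreate example.txt in a throwaway directory (path is arbitrary).
tmpdir=$(mktemp -d)
printf '%s\n' \
  'set root=(md/md1)' \
  'set prefix=(md/md1)/grub' \
  'insmod linux' \
  'linux /vmlinuz-4.19.154a_x64 root=/dev/md0 ro' \
  'initrd /initramfs-4.19.154a_x64.img' \
  'boot' > "$tmpdir/example.txt"

# Two lines before and two lines after, spelled out with separate flags.
both_flags=$(grep -B 2 -A 2 "prefix" "$tmpdir/example.txt")

# The same request expressed with the single context flag.
context=$(grep -C 2 "prefix" "$tmpdir/example.txt")

if [ "$both_flags" = "$context" ]; then
  echo "-C 2 and -B 2 -A 2 print the same lines"
else
  echo "outputs differ"
fi

rm -rf "$tmpdir"
```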


Examples with regular expressions and grep

Let’s say we have a file that contains this:

# cat elon.txt
Elon Musk Jan 30 2021
SpaceX is owned by Elon

Example 8) Find all lines that start with the word “Elon”:

# grep "Elon" elon.txt
Elon Musk Jan 30 2021
SpaceX is owned by Elon

# grep "^Elon" elon.txt
Elon Musk Jan 30 2021

In the first command, we simply pulled out all lines that have the word “Elon” in them. What about the other command? Well, the second one uses a regex (regular expression) to narrow the lines down even more.
The caret sign ^ is a regex anchor that means “match only if the line begins with this”, hence only the line that begins with “Elon” will match and get printed. The line “SpaceX is owned by Elon” contains the word “Elon”, but “Elon” is not at the beginning of that line.


Example 9) Find all lines that end with the word “Elon”:

# grep "Elon" elon.txt
Elon Musk Jan 30 2021
SpaceX is owned by Elon

# grep "Elon$" elon.txt
SpaceX is owned by Elon

With the $ sign, we told grep to match only if “Elon” is at the end of the line and to ignore all other lines, even if they have “Elon” somewhere in the middle or at the start.
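
Here is a small sketch of both anchors side by side, under my own assumptions (temporary copies of the files, variable names starts, ends and trailing, and an extra file trailing.txt that is purely hypothetical). The last part shows a common gotcha: a single space after “Elon” is enough to stop the $ anchor from matching, because then the line ends with a space and not with “Elon”.

```shell
#!/bin/sh
# Recreate elon.txt in a throwaway directory (path is arbitrary).
tmpdir=$(mktemp -d)
printf '%s\n' \
  'Elon Musk Jan 30 2021' \
  'SpaceX is owned by Elon' > "$tmpdir/elon.txt"

# ^ anchors the pattern to the start of the line.
starts=$(grep '^Elon' "$tmpdir/elon.txt")

# $ anchors the pattern to the end of the line.
# Single quotes keep the shell from touching the $ sign.
ends=$(grep 'Elon$' "$tmpdir/elon.txt")

# Hypothetical file with a trailing space after "Elon":
# the line no longer ends with "Elon", so the count is 0.
printf 'SpaceX is owned by Elon \n' > "$tmpdir/trailing.txt"
trailing=$(grep -c 'Elon$' "$tmpdir/trailing.txt") || true

printf 'starts with Elon: %s\n' "$starts"
printf 'ends with Elon:   %s\n' "$ends"
printf 'matches with a trailing space: %s\n' "$trailing"

rm -rf "$tmpdir"
```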


Example 10) Find out how many lines have the word “Elon” in them:

# cat elon.txt
Elon Musk Jan 30 2021
SpaceX is owned by Elon

# grep -c "Elon" elon.txt
2

The -c (--count) flag counts all lines that have “Elon” anywhere in them and prints the number to STDOUT.


Example 11) Find out how many lines do not have the word “Elon” in them:

# grep -c -v "Elon" elon.txt
0

Combining -c and -v counts the lines in which “Elon” does not appear and prints the result to STDOUT.
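
One way to sanity-check the two counts: every line either contains “Elon” or it doesn’t, so the -c result and the -c -v result should add up to the total number of lines in the file. The sketch below assumes a temporary copy of elon.txt; the variable names (with, without, total, sum) are mine.

```shell
#!/bin/sh
# Recreate elon.txt in a throwaway directory (path is arbitrary).
tmpdir=$(mktemp -d)
printf '%s\n' \
  'Elon Musk Jan 30 2021' \
  'SpaceX is owned by Elon' > "$tmpdir/elon.txt"

# Lines that contain "Elon".
with=$(grep -c "Elon" "$tmpdir/elon.txt")

# Lines that do NOT contain "Elon". grep prints 0 here and exits with
# status 1 because nothing was selected, so the status is ignored.
without=$(grep -c -v "Elon" "$tmpdir/elon.txt") || true

# Total number of lines; the arithmetic expansion strips any padding
# that some wc implementations add.
total=$(( $(wc -l < "$tmpdir/elon.txt") ))
sum=$(( with + without ))

printf 'with: %s, without: %s, total lines: %s\n' "$with" "$without" "$total"

rm -rf "$tmpdir"
```

Keep in mind that -c counts matching lines, not matches: a line with “Elon” in it twice still counts once.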

