Thursday, January 18, 2018

The Uniq Command Tutorial With Examples For Beginners

https://www.ostechnix.com/uniq-command-tutorial-examples-beginners

If you’re working mostly on command line and dealing with a lot of text files every day, you should be aware of Uniq
command. This command helps you to find repeated/duplicate lines from a
file easily. It is not just for finding duplicates, but also we can use
uniq command to remove the duplicates, display the number of
occurrences of the duplicate lines, display only the repeated lines and
display only the unique lines etc. Since the uniq command is part of GNU
coreutils package, it comes preinstalled in most Linux distributions.
Let us not bother with installation and see some practical examples.

Please note that the ‘uniq’ command will not detect repeated lines
unless they are adjacent. So, you might need to sort them first or
combine the sort command with uniq to get the results. Allow me to show
you some examples.
First, let us create a file with some duplicate lines.

vi ostechnix.txt

welcome to ostechnix
welcome to ostechnix
Linus is the creator of Linux.
Linux is secure by default
Linus is the creator of Linux.
Top 500 super computers are powered by Linux

As you see in the above file, we have few repeated lines (the first, second, third, and fifth lines are duplicates).

1. Remove consecutive duplicate lines in a file using Uniq command

If you use ‘uniq’ command without any arguments, it will remove all
consecutive duplicate lines and display only the unique lines.

uniq ostechnix.txt

Sample output would be:
As you can see, uniq command removed all consecutive duplicate lines
in the given file. You might also have noticed that the above output
still has the duplicates in second and fourth lines. It is because the
uniq command will only omit the repeated lines only if they are
adjacent. We can, of course, remove that non-consecutive duplicates
too. Look at the second example below.

2. Remove all duplicate lines

sort ostechnix.txt | uniq

Sample output would be:
See? There are no duplicates or repeated lines. In other words, the
above command will display each line once from file ostechnix.txt. We
used the sort command in conjunction with uniq, because, as I already
mentioned, uniq will not find the duplicate/repeated lines unless they
are adjacent.

3. Display only unique lines from a file

To display only the unique lines from a file, the command would be:

sort ostechnix.txt | uniq -u

Sample output:

Linux is secure by default
Top 500 super computers are powered by Linux

As you can see, we have only two unique lines in the given file.

4. Display only duplicate lines

Similarly, we can also display duplicates lines from a file like below.

sort ostechnix.txt | uniq -d

Sample output:

Linus is the creator of Linux.
welcome to ostechnix

These two are the repeated/duplicated lines in ostechnix.txt file. Please note that -d (small d) will only print duplicate lines, one for each group. To print all duplicate lines, use -D (capital d) like below.

sort ostechnix.txt | uniq -D

See the difference between both flags in the below screenshot.

5. Display number of occurrences of each line in a file

For some reason, you might want to check how many times a line is repeated in the given file. To do so, use -c flag like below.

sort ostechnix.txt | uniq -c

Sample output:

2 Linus is the creator of Linux.
1 Linux is secure by default
1 Top 500 super computers are powered by Linux
2 welcome to ostechnix

We can also display number of occurrences of each line along with that line, sorted by the most frequent like below.

sort ostechnix.txt | uniq -c | sort -nr

Sample output:

2 welcome to ostechnix
2 Linus is the creator of Linux.
1 Top 500 super computers are powered by Linux
1 Linux is secure by default

6. Limit the comparison to ‘N’ characters

We can limit the comparison to a particular number of characters of lines in a file using -w
flag. For example, let us limit the comparison to first 4 characters of
lines in a file and display the repeated lines as shown below.

uniq -d -w 4 ostechnix.txt

7. Avoid the comparison with the first ‘N’ characters

Like limit comparison to N characters of lines in a file, we can also avoid comparing the first N characters using -s flag.
The following command will avoid the comparison with the first 4 characters of lines in a file:

uniq -d -s 4 ostechnix.txt

To avoid comparing the first N fields instead of characters, use ‘-f’ flag in the above command.
For more details, refer the help section;

And, that’s all for today! I hope you now get a basic idea about uniq
command and its purpose. If you find our guides useful, please share
them on your social, professional networks and support OSTechNix. More
good stuffs to come. Stay tuned!
Cheers!