
Unix sort & uniq

It was around 2 AM and I was working like a caveman, but it's hard to escape bedtime 😦

Suddenly I found I had set up a wrong cron job in the cloud, and it had generated duplicate results. I had to make a report from the cron output, and every line had to be unique. The file was around 1.2 GB.

It was a JSON file with several thousand lines, many of them redundant. I had to remove the redundant lines and produce a file in which every line is unique.

I started writing a Python script to do that. I was halfway through a script that takes a file and creates another file containing only the unique lines from the input, but as I was too tired, I thought I should search for a Unix command that could do the job. And I found exactly what I needed 🙂

sort filename.txt | uniq

Or

cat filename.txt | sort -u

If the input file contains:

Line 1
Line 2
Line 2
Line 3
Line 1
Line 3

The command outputs:

Line 1
Line 2
Line 3

And I just redirected the output of the command into a new file, as shown below:

sort filename.txt | uniq > result.txt
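A small sketch of the same idea without the pipe: `sort -u` deduplicates in one step, and `-o` writes straight to the output file (the file names here are just illustrative):

```shell
# Recreate the example input from above (illustrative data)
printf 'Line 1\nLine 2\nLine 2\nLine 3\nLine 1\nLine 3\n' > filename.txt

# -u keeps only unique lines; -o names the output file directly
sort -u -o result.txt filename.txt

cat result.txt
```

This avoids spawning a second process, which can matter on a file as large as 1.2 GB.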

Explanation of the command:

The ‘sort’ command sorts all lines, alphabetically by default, and ‘uniq’ eliminates or counts duplicate lines in a pre-sorted file. ‘uniq’ only compares adjacent lines, which is why the input must be sorted first.
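Since ‘uniq’ can also count, here is a quick sketch with `uniq -c`, which prefixes each line with the number of times it occurred (the sample file is made up for the example):

```shell
# Sample file where every line appears twice (illustrative data)
printf 'Line 1\nLine 2\nLine 2\nLine 3\nLine 1\nLine 3\n' > sample.txt

# sort groups identical lines together; uniq -c counts each group
sort sample.txt | uniq -c
# Each of the three lines appears twice, so every count is 2
```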

You can also use sort and uniq in other situations; for details, check the following links: