How to Use Awk to Filter Text or Strings Using Pattern Specific Actions

In the third part of the Awk command series, we shall take a look at filtering text or strings based on specific patterns that a user can define.

Sometimes, when filtering text, you want to flag certain lines of an input file or of a string, based on a given condition or a specific pattern that can be matched. Doing this with Awk is very easy; it is one of Awk's most useful features.

Let us take a look at an example. Say you have a shopping list of food items that you want to buy, called food_prices.list. It contains the following food items and their prices.
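A minimal sketch of what such a file could look like is shown here. Mangoes, pineapples, and the four-field layout come from this article; the column names, the other items, the quantities, and all prices are made up for illustration.

```shell
# Create an illustrative food_prices.list.
# The quoted 'EOF' keeps the shell from expanding the $ signs in the prices.
# Column names, quantities, and prices are assumptions, not the original data.
cat > food_prices.list <<'EOF'
No Item_Name  Quantity Price
1  Mangoes    10       $2.45
2  Apples     20       $1.50
3  Bananas    5        $0.90
4  Pineapples 10       $3.46
5  Oranges    10       $0.78
6  Tomatoes   5        $0.55
7  Onions     5        $0.45
EOF

# Display the file.
cat food_prices.list
```

Every line has four whitespace-separated fields, which Awk exposes as $1 to $4.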

From the output above, you can see that there is a (*) sign at the end of the lines for two food items, mangoes and pineapples. If you check their prices, they are above $2.

In this example, we have used two patterns:

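the first: / *\$[2-9]\.[0-9][0-9] */ gets the lines where the food item price is $2.00 or more (up to $9.99) and

the second: / *\$[0-1]\.[0-9][0-9] */ looks for lines where the food item price is less than $2.

In both patterns, " *" (a space followed by *) matches zero or more spaces, \$ matches a literal dollar sign, and \. matches a literal dot.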

Here is what happens: there are four fields in the file. When the first pattern matches a line with a food item priced above $2, its action prints all four fields followed by a (*) sign at the end of the line as a flag.

The action for the second pattern simply prints the other lines, those with a food price below $2, as they appear in the input file, food_prices.list.
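The two pattern-action pairs described above can be sketched as follows. This is a reconstruction from the description, not necessarily the exact command used for the article. The sample file is illustrative (made-up prices, chosen so that mangoes and pineapples cost more than $2), and the output file name flagged.txt is an assumption.

```shell
# Illustrative input; items other than mangoes and pineapples, and all
# prices, are assumptions made for this sketch.
cat > food_prices.list <<'EOF'
No Item_Name  Quantity Price
1  Mangoes    10       $2.45
2  Apples     20       $1.50
3  Bananas    5        $0.90
4  Pineapples 10       $3.46
5  Oranges    10       $0.78
6  Tomatoes   5        $0.55
7  Onions     5        $0.45
EOF

# Rule 1: price from $2.00 to $9.99 -> print the four fields plus a "*" flag.
# Rule 2: price from $0.00 to $1.99 -> print the line unchanged.
awk '
/ *\$[2-9]\.[0-9][0-9] */ { print $1, $2, $3, $4, "*" }
/ *\$[0-1]\.[0-9][0-9] */ { print }
' food_prices.list > flagged.txt

cat flagged.txt
```

Awk tests every rule against every input line, so a line is printed by whichever rule its price matches. The header line contains no dollar sign, matches neither pattern, and is therefore not printed.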

This way you can use pattern-specific actions to flag food items that are priced above $2. There is a problem with the output, though: the lines that carry the (*) sign are not formatted like the rest of the lines, which makes the output harder to read.
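One possible way to fix the alignment is to have both rules print through printf with the same format string. The sketch below assumes an illustrative food_prices.list (made-up prices) and a column width of 10 characters, which happens to fit this sample; the output file name aligned.txt is also an assumption.

```shell
# Illustrative input; items other than mangoes and pineapples, and all
# prices, are assumptions made for this sketch.
cat > food_prices.list <<'EOF'
No Item_Name  Quantity Price
1  Mangoes    10       $2.45
2  Apples     20       $1.50
3  Bananas    5        $0.90
4  Pineapples 10       $3.46
5  Oranges    10       $0.78
6  Tomatoes   5        $0.55
7  Onions     5        $0.45
EOF

# Both rules use the same printf format, so every column lines up.
# In rule 1, $4 "*" concatenates the price and the flag into one string.
awk '
/ *\$[2-9]\.[0-9][0-9] */ { printf "%-10s %-10s %-10s %-10s\n", $1, $2, $3, $4 "*" }
/ *\$[0-1]\.[0-9][0-9] */ { printf "%-10s %-10s %-10s %-10s\n", $1, $2, $3, $4 }
' food_prices.list > aligned.txt

cat aligned.txt
```

Here %-10s prints a string left-justified in a field at least 10 characters wide, and unlike print, printf does not add a newline unless the format string contains \n.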

Aaron Kili is a Linux and F.O.S.S enthusiast, an upcoming Linux SysAdmin, web developer, and currently a content creator for TecMint who loves working with computers and strongly believes in sharing knowledge.


You don’t say why there’s a space and then a *, given that in a previous post you said that . means any character and * should mean 0 or however many of the preceding character.

Then there’s a ; after print, which again you don’t explain – might be meaningless after all, but when you explain to inexperienced users, you shouldn’t leave out so many things. Normally the ; is not necessary, but I suppose you’re writing it for consistency. You don’t explain what %-10s is and so on, and so forth.

I’ve been following tecmint for quite a long time and I like it, but these types of posts seem to work only as solutions to problems users had thought of beforehand. They’re not really tutorials.

In other contexts being so pragmatic should work (such as setting up a web server or a mail server, where you simply want it to work), but here people who want to learn need much more detail. In my opinion, the article should have been double in size.

Moreover, the gif image is really hard to follow. When you try to concentrate on how awk filters the text, you need to see the output permanently, so as to compare it to the original and understand how awk syntax works. It’s quite frustrating, to be honest.

At first glance, Gurpreet Singh’s actually seems simpler, as his syntax is more self-explanatory in a way than yours.