Thursday, October 18, 2012

Regular Expressions in Linux Explained with Examples

http://www.linuxnix.com/2011/07/regular-expressions-linux-i.html

Regular expressions (regexps) are one of the more advanced concepts we
need in order to write efficient shell scripts and to administer systems
effectively. For easier understanding, regular expressions can be
divided into 3 types.

1) Basic Regular expressions
2) Interval Regular expressions (Use option -E for grep and -r for sed)
3) Extended Regular expressions (Use option -E for grep and -r for sed)

Some FAQs before starting Regular expressions

What is a regular expression?
A regular expression is a pattern that is matched against a given string.

Which commands support regular expressions?
vi, tr (character ranges and classes only), grep, sed, awk, perl, python etc.

Basic Regular Expressions

Basic regular expressions:
This set contains the most basic regular expressions, which do not
require any options to use. These regular expressions were developed a
long time ago.

^ – Caret/power symbol, matches the beginning of a line.
$ – Matches the end of a line.
* – 0 or more occurrences of the previous character.
. – Matches any single character.
[] – Range of characters.
[^char] – Negation of a character set.
\< \> – Whole-word matching.
\ – Escape character.
Let's start with some examples, so that we can understand regexps better.

^ Regular Expression

Example 1: Find all the regular files in a given directory
ls -l | grep ^-
As you may know, the first character of each line in ls -l output is -
for regular files and d for directories. Let us see what ^- means. The ^
symbol matches the start of a line, so ^- selects every line that starts
with -, which is every regular file in Linux/Unix.
If we want to find all the directories in a folder, use grep ^d with ls -l as shown below
ls -l | grep ^d
How about character files and block files?
ls -l | grep ^c
ls -l | grep ^b
We can even find the lines which are commented out using the ^ operator, as in the example below
grep '^#' filename
How about finding lines in a file which start with 'abc'?
grep '^abc' filename
Many more examples are possible with the ^ anchor.
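
To try the ^ anchor without touching real files, you can feed grep a few sample lines. The sketch below assumes a made-up config snippet (the lines and the variable names are only illustrative); it also uses grep's -v (invert) and -c (count) options, which are not covered above.

```shell
# Sample input: two comment lines and two settings (made-up data).
sample='# main settings
port=8080
# debug flag
debug=off'

# Keep only the lines that begin with '#'.
comments=$(printf '%s\n' "$sample" | grep '^#')
printf '%s\n' "$comments"

# Count the lines that do NOT begin with '#' (-v inverts, -c counts).
active=$(printf '%s\n' "$sample" | grep -vc '^#')
printf 'active lines: %s\n' "$active"
```

The first command prints the two comment lines, and the second reports 2 active lines.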

$ Regular Expression

Example 2: Match all the files whose names end with sh
ls -l | grep sh$
As $ indicates the end of the line, the above command lists all the files whose names end with sh.
How about finding lines in a file which end with dead?
grep 'dead$' filename
How about finding empty lines in a file?
grep '^$' filename
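
The two $ patterns can be checked against a small made-up text, as in this sketch (the sample lines and variable names are assumptions for illustration):

```shell
# Made-up sample text with one empty line in the middle.
sample='the battery is dead
dead end ahead

line is dead'

# Lines that end with "dead" ("dead end ahead" does not qualify).
ends=$(printf '%s\n' "$sample" | grep 'dead$')
printf '%s\n' "$ends"

# Number of completely empty lines: nothing between start (^) and end ($).
blank=$(printf '%s\n' "$sample" | grep -c '^$')
printf 'blank lines: %s\n' "$blank"
```

Note that '^$' only matches lines with no characters at all; a line holding just spaces is not counted.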

* Regular Expression

Example 3: Match all files which have a word such as twt, twet, tweet etc. in the file name.
ls -l | grep 'twe*t'
How about searching for the word apple where it has been misspelled in a
given file as ale, aple, appple, apppple, apppppple etc.? To find all of
these patterns
grep 'ap*le' filename
Readers should observe that the above pattern matches even the word ale,
as * means 0 or more occurrences of the previous character.
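
A small sketch of the "0 or more" behaviour, using a made-up word list. The second pattern, 'app*le', is one possible way to require at least one p while staying within basic regular expressions.

```shell
# Made-up word list: several spellings plus one unrelated word.
words='ale
aple
apple
appple
able'

# 'ap*le' allows zero p's, so "ale" is matched as well.
loose=$(printf '%s\n' "$words" | grep 'ap*le')
printf '%s\n' "$loose"

# 'app*le' is one literal p followed by zero or more p's,
# so at least one p is required and "ale" drops out.
strict=$(printf '%s\n' "$words" | grep 'app*le')
printf '%s\n' "$strict"
```

In both cases "able" is left out, because the b breaks the a, p..., le sequence.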

. Regular Expression

Example 4: Filter the file names which contain any single character between t and t.
ls -l | grep 't.t'
Here . matches any single character. It can match tat, t3t, t.t, t&t etc., that is, any single character between the two t letters.
How about finding the file names which contain an a followed later by an x?
ls -l | grep 'a.*x'
The above .* indicates any number of characters.
Note: In the combination .*, the . stands for any character and the * repeats it 0 or more times.
Suppose you have files such as
awx
awex
aweex
awasdfx
a35dfetrx
etc. The pattern finds all of these files/folders, whose names start with a and end with x. Be aware that the pattern is not anchored, so it also matches any other line in which an a is followed somewhere by an x. To require that the name itself starts with a and ends with x, anchor the pattern on plain ls output: ls | grep '^a.*x$'
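
The following sketch runs the dot patterns over a made-up list of names. The anchored pattern '^a.*x$' and the escaped form 't\.t' are additions for illustration, showing how to pin the match to the whole name and how to match a literal dot.

```shell
# Made-up file names, one per line.
names='tat
t3t
t.t
tt
awx
a35dfetrx
xa'

# Exactly one character (any character) between two t's; "tt" has none.
one=$(printf '%s\n' "$names" | grep 't.t')
printf '%s\n' "$one"

# Whole name starts with a and ends with x, anything in between.
anchored=$(printf '%s\n' "$names" | grep '^a.*x$')
printf '%s\n' "$anchored"

# Escaping the dot matches a literal '.' only.
literal=$(printf '%s\n' "$names" | grep 't\.t')
printf '%s\n' "$literal"
```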

[] Square braces/Brackets Regular Expression

Example 5: Find all the files which contain a digit between a and x in the file name
ls -l | grep 'a[0-9]x'
This will find all the files such as
a0xsdf
asda1xsdfas
..
..
asdfdsara9xsdf
etc.
So wherever it finds an a, a single digit and an x in sequence, it matches.
Some examples of the range operator:
[a-z] – Matches any single char between a and z.
[A-Z] – Matches any single char between A and Z.
[0-9] – Matches any single char between 0 and 9.
[a-zA-Z0-9] – Matches any single character that is either a to z, A to Z or 0 to 9.
[!@#$%^] – Matches any single !, @, #, $, % or ^ character.
You just have to think about what you want to match and put those characters inside the brackets.
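
A minimal sketch of bracket ranges on made-up names. Setting LC_ALL=C for the [A-Z] test is an assumption made here so that the range is interpreted as plain ASCII uppercase letters; in some locales a range like [A-Z] can behave differently.

```shell
# Made-up file names, one per line.
names='a0xsdf
asda1xsdfas
abx
a12x
Report
notes'

# a, then exactly one digit, then x. "a12x" fails: two digits sit between a and x.
digit=$(printf '%s\n' "$names" | grep 'a[0-9]x')
printf '%s\n' "$digit"

# Names that start with an uppercase ASCII letter.
upper=$(printf '%s\n' "$names" | LC_ALL=C grep '^[A-Z]')
printf '%s\n' "$upper"
```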

[^char] Regular Expression

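Example 6: Match all the file names which contain at least one character other than a, b or c
ls | grep '[^abc]'
This outputs every file name that has at least one character which is not a, b or c. Note that a name such as abcd still matches because of its d. To list only the names that contain none of a, b or c, invert a normal match instead: ls | grep -v '[abc]'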
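
It is easy to misread what a negated set does, so here is a small sketch on made-up names. Keep in mind that [^abc] matches a line as soon as it finds one character outside the set; excluding lines that contain a, b or c is done here with grep -v, which is not covered above.

```shell
# Made-up file names, one per line.
names='abc
cab
abcd
xyz'

# Lines with at least one character that is not a, b or c.
# "abcd" qualifies because of the d.
some_other=$(printf '%s\n' "$names" | grep '[^abc]')
printf '%s\n' "$some_other"

# Lines that contain none of a, b, c: invert a match on [abc].
none=$(printf '%s\n' "$names" | grep -v '[abc]')
printf '%s\n' "$none"
```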

\< \> Regular Expression

Example 7: Search for the word abc; for example, I should not get abcxyz or readabc in my output.
grep '\<abc\>' filename
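
Whole-word matching can be sketched as follows, assuming GNU grep, where \< and \> mark the start and end of a word. The -w option, which both GNU and BSD grep provide, is shown as an alternative way to get the same result on this made-up input.

```shell
# Made-up sample lines.
text='abc
abcxyz
readabc
say abc now'

# \< and \> anchor the match to word boundaries (GNU grep assumed).
whole=$(printf '%s\n' "$text" | grep '\<abc\>')
printf '%s\n' "$whole"

# -w asks grep to match whole words only.
same=$(printf '%s\n' "$text" | grep -w 'abc')
printf '%s\n' "$same"
```

Both commands print "abc" and "say abc now", and skip abcxyz and readabc.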

\ Escape Regular Expression

Example 8: Find files which contain [ in their name. As [ is a special character, we have to escape it
grep "\[" filename
or
grep '[[]' filename
Note: As you can see, [] is used here to cancel the special meaning of [
in regular expressions. So if you want to find a special character, put
it inside [] so that it is not treated as a special character.
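
The two escaping styles can be compared on made-up names, as in the sketch below. The last command uses grep -F (fixed strings), a further option for cases where the whole pattern should be taken literally.

```shell
# Made-up file names, one per line.
names='data[1].txt
data1.txt
a.b
axb'

# Backslash removes the special meaning of [.
esc=$(printf '%s\n' "$names" | grep '\[')
printf '%s\n' "$esc"

# Inside a bracket expression, [ is an ordinary character.
brk=$(printf '%s\n' "$names" | grep '[[]')
printf '%s\n' "$brk"

# -F treats the pattern as a plain string, so '.' matches only a dot.
fixed=$(printf '%s\n' "$names" | grep -F 'a.b')
printf '%s\n' "$fixed"
```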
Note: No need to use -E to use these regular expressions with grep. We also have egrep, which is equal to "grep -E", and fgrep, which is equal to "grep -F".
I suggest you concentrate on grep to complete your work; don't reach
for other commands if grep can resolve your issue. Stay tuned
for our next post on regular expressions.