I tried the answer by Frantique; it gave http://url.com, not url.com.
– Tarun, Feb 17 '14 at 15:04

1

@Tarun Yes, I just wanted to say that there's no need to double reverse the text.
– Eric Carvalho, Feb 17 '14 at 15:12

1

When you want to match something with / in sed, you should usually use a different delimiter, e.g. sed s@http://@@g.
– Kevin, Feb 17 '14 at 21:18

2

This is very inefficient, though: solution 1 calls 5 processes over 4 pipes, and solution 2 calls 3 processes over 2 pipes, including 2 regexes. This can all be done in the Bash shell without any pipes, extra processes or dependencies.
– AsymLabs, Feb 21 '14 at 0:10

Here file.in contains the 'dirty' URL list and file.out will contain the 'clean' URL list. There are no external dependencies and there is no need to spawn any new processes or subshells. The original explanation and a more flexible script follow. There is a good summary of the method here, see example 10-10. This is pattern-based parameter substitution in Bash.

Expanding on the idea:

src="define('URL', 'http://url.com');"
src="${src##*/}" # remove the longest string before and including /
echo "${src%%\'*}" # remove the longest string after and including '

Result:

url.com
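To make the operators concrete, here is a minimal sketch of the four pattern-based expansions: # and ## strip the shortest and longest matching prefix, while % and %% strip the shortest and longest matching suffix. The variable names are only for illustration and are not part of the original answer.

```shell
#!/usr/bin/env bash
src="define('URL', 'http://url.com');"

prefix_short="${src#*/}"    # strip up to the first /  -> /url.com');
prefix_long="${src##*/}"    # strip up to the last /   -> url.com');
suffix_short="${src%\'*}"   # strip from the last '    -> define('URL', 'http://url.com
suffix_long="${src%%\'*}"   # strip from the first '   -> define(

printf '%s\n' "$prefix_short" "$prefix_long" "$suffix_short" "$suffix_long"
```

Chaining the longest-prefix form with the longest-suffix form, as in the snippet above, is what leaves only url.com.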

No need to call any external programs. Furthermore, the following bash script, get_urls.sh, permits you to read a file directly or from stdin:

#!/usr/bin/env bash
# usage:
#   ./get_urls.sh 'file.in'
#   grep 'URL' 'file.in' | ./get_urls.sh
# assumptions:
#   there is not more than one url per line of text.
#   the url of interest is a simple one.
# begin get_urls.sh
# get_url 'string'
get_url() {
    local src="$1"
    src="${src##*/}"      # remove the longest string before and including /
    echo "${src%%\'*}"    # remove the longest string after and including '
}
# read each line from the file given as $1, or from stdin if no argument is given.
# IFS= and -r keep leading whitespace and backslashes in the line intact.
while IFS= read -r line
do
    get_url "$line"       # called directly, so no command substitution is needed
done < "${1:-/dev/stdin}"
# end get_urls.sh

Nice, +1. Strictly speaking though, there is a subshell, the while loop happens in a subshell. On the bright side, this works with just about any shell except [t]csh, so it's good for sh, bash, dash, ksh, zsh...
– terdon, Feb 25 '14 at 9:39

-P, --perl-regexp
    Interpret PATTERN as a Perl regular expression (PCRE, see
    below). This is highly experimental and grep -P may warn of
    unimplemented features.
-o, --only-matching
    Print only the matched (non-empty) parts of a matching line,
    with each such part on a separate output line.

The trick is to use \K which, in Perl regex, means discard everything matched to the left of the \K. So, the regular expression looks for strings starting with http:// (which is then discarded because of the \K) followed by as many non-' characters as possible. Combined with -o, this means that only the URL will be printed.
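As a minimal sketch of that description, assuming GNU grep built with PCRE support (needed for -P) and the sample define() line used earlier in this thread. The exact command from the answer is not shown here, so the pattern below is a reconstruction from the explanation, and the variable names are illustrative:

```shell
#!/usr/bin/env bash
src="define('URL', 'http://url.com');"

# -o prints only the matched text; \K drops the already-matched "http://"
# from that match; [^']+ then takes as many non-' characters as possible.
url=$(printf '%s\n' "$src" | grep -oP "http://\K[^']+")

printf '%s\n' "$url"
```

Unlike the parameter-expansion approach, this spawns an external grep process, but it copes with several URLs in one input because -o prints each match on its own line.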