Now this code works perfectly. The problem is that the keyword_hash has
more than 300 elements, and running this code can take between 50 and
120 seconds. Since I am processing more than 1000 pages with this code,
it takes forever.

I solved this problem by replacing the regular expression match to

r1 = Regexp.new("#{key1}.#{key2}")
r2 = Regexp.new("#{key2}.#{key1}")

return id if long_text =~ r1 or long_text =~ r2

I simply moved the or statement outside the regular expression, and the
time dropped from 50~120 seconds to 0.40 seconds per page.

Is this expected behavior when using regular expressions?

One obvious optimization is to create all the regexps in
load_keyword_array_from_database() rather than during each iteration of
the hash. That way you only build them once and can reuse them across
all the pages you check.
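A minimal sketch of that idea, assuming keyword_hash maps an id to a pair of keywords (the method names and hash shape here are my guesses, not from your code). I've also added Regexp.escape, which guards against keywords that happen to contain regexp metacharacters:

```ruby
# Compile both regexps once per keyword pair, up front.
def build_keyword_regexps(keyword_hash)
  keyword_hash.map do |id, (key1, key2)|
    r1 = Regexp.new("#{Regexp.escape(key1)}.#{Regexp.escape(key2)}")
    r2 = Regexp.new("#{Regexp.escape(key2)}.#{Regexp.escape(key1)}")
    [id, r1, r2]
  end
end

# Reuse the precompiled regexps for every page; no Regexp.new per page.
def match_page(long_text, compiled)
  compiled.each do |id, r1, r2|
    return id if long_text =~ r1 or long_text =~ r2
  end
  nil
end
```

With ~300 keyword pairs this saves 600 Regexp.new calls per page.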

Another possible optimization is to take your approach of splitting the
regexps a bit further and create two regexps - one for each keyword -
and return the id if both match. This only works correctly if (i) the
keywords don't overlap or (ii) you can use \b to ensure matching on word
boundaries.
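A rough sketch of that second variant, with \b guarding the word boundaries (the method name is made up for illustration):

```ruby
# Match each keyword independently; accept the page only if both hit.
# \b prevents "alpha" from matching inside "alphabet".
def both_keywords?(long_text, key1, key2)
  r1 = /\b#{Regexp.escape(key1)}\b/
  r2 = /\b#{Regexp.escape(key2)}\b/
  !!(long_text =~ r1 && long_text =~ r2)
end
```

Note this no longer constrains the distance between the two keywords, whereas your "#{key1}.#{key2}" form required them to be adjacent - so it may match pages your original regexps would not.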