<b>example: </b>, example:
<div align="left">this is a test</div>, this is a test

PREG_OFFSET_CAPTURE

If this flag is passed, the string offset (in bytes) of every match
is returned along with the match. Note that this changes the value of
matches into an array where every element is an
array consisting of the matched string at index 0
and its string offset into subject at index
1.

If no order flag is given, PREG_PATTERN_ORDER is
assumed.
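
As a rough illustration of the PREG_OFFSET_CAPTURE layout described above, here is a minimal sketch. The subject string and pattern are made up for the example and are not taken from the manual.

```php
<?php
// Illustrative subject and pattern (assumptions for this sketch).
$subject = "foobarbaz";
preg_match_all('/ba./', $subject, $matches, PREG_OFFSET_CAPTURE);

// With the default PREG_PATTERN_ORDER, $matches[0] lists the full matches;
// each entry is an array of [matched string, byte offset in $subject].
foreach ($matches[0] as [$text, $offset]) {
    echo "$text at byte $offset\n";
}
```

Each element is a two-item array, so code written for the plain string layout has to be adjusted when this flag is added.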

offset

Normally, the search starts from the beginning of the subject string.
The optional parameter offset can be used to
specify an alternate place from which to start the search (in bytes).

Note:

Using offset is not equivalent to passing
substr($subject, $offset) to
preg_match_all() in place of the subject string,
because pattern can contain assertions such as
^, $ or
(?<=x). See preg_match()
for examples.
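
The difference described in the note can be sketched as follows. The subject, pattern and offset are illustrative values chosen for this example.

```php
<?php
// Illustrative values (assumptions for this sketch).
$subject = "abcdef";
$offset  = 3;

// "^" still refers to the start of the whole subject, so nothing matches
// when the search merely starts at byte 3.
$withOffset = preg_match_all('/^def/', $subject, $m1, PREG_PATTERN_ORDER, $offset);

// substr() builds a new string that begins with "d", so "^" matches there.
$withSubstr = preg_match_all('/^def/', substr($subject, $offset), $m2);

echo "with offset: $withOffset, with substr: $withSubstr\n";
```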

Return Values

Returns the number of full pattern matches (which might be zero),
or FALSE if an error occurred.
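
Because zero is a valid return value, a strict comparison is the usual way to tell "no matches" apart from an error. A minimal sketch, with made-up patterns and subjects:

```php
<?php
// Illustrative inputs; the pattern and subjects are made up for this sketch.
$none = preg_match_all('/\d+/', 'no digits here', $m0);
$some = preg_match_all('/\d+/', 'a1 b22 c333', $m1);

foreach ([$none, $some] as $count) {
    if ($count === false) {
        // An error occurred; preg_last_error() returns the PCRE error code.
        echo "error code: " . preg_last_error() . "\n";
    } elseif ($count === 0) {
        echo "no matches\n";
    } else {
        echo "$count matches\n";
    }
}
```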

Changelog

Version   Description

5.4.0     The matches parameter became optional.

5.3.6     Returns FALSE if offset is higher than subject length.

5.2.2     Named subpatterns now accept the syntax (?<name>) and (?'name') as well as (?P<name>). Previous versions accepted only (?P<name>).
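
The three named-subpattern syntaxes can be compared side by side. This is a minimal sketch; the subject string and the group names year and month are assumptions chosen for the example.

```php
<?php
// Illustrative subject; group names "year" and "month" are made up.
$subject = "2024-01-15 and 2023-12-31";

// The same pattern written with each of the three syntaxes.
preg_match_all('/(?<year>\d{4})-(?<month>\d{2})/', $subject, $a);
preg_match_all("/(?'year'\d{4})-(?'month'\d{2})/", $subject, $b);
preg_match_all('/(?P<year>\d{4})-(?P<month>\d{2})/', $subject, $c);

// Named groups appear under their name and under their numeric index.
print_r($a['year']);
print_r($a[1]);
```

On PHP 5.2.2 or later all three calls should fill their result arrays identically.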

<?php
// The \\2 is an example of backreferencing. This tells pcre that
// it must match the second set of parentheses in the regular expression
// itself, which would be the ([\w]+) in this case. The extra backslash is
// required because the string is in double quotes.
$html = "<b>bold text</b><a href=howdy.html>click me</a>";

Recently I had to write a search engine in Hebrew and ran into a huge number of problems. My data was stored in a MySQL table with utf8_bin collation.

So, to be able to write Hebrew to a utf8 table you need to do: <?php $prepared_text = addslashes(utf8_encode($text)); ?>

But then I had to find out whether some word exists in the stored text. This is where I got stuck. A simple preg_match would not find the text, since Hebrew doesn't work that easily. I tried /u and who knows what else.

Here is a way to match everything on the page, performing an action for each match as you go. I had used this idiom in other languages, where its use is customary, but in PHP it seems to be not quite as common.

        // Update offset to the end of the match
        $offset = $match_start + $match_length;
    }

    return $match_count;
}
?>
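
A minimal self-contained sketch of this match-as-you-go idiom, using preg_match() with PREG_OFFSET_CAPTURE and an advancing offset. The function name each_match and its callback signature are hypothetical and are not the note author's code.

```php
<?php
// Hypothetical helper: calls $callback for every match of $pattern in
// $subject and returns the number of matches found.
function each_match(string $pattern, string $subject, callable $callback): int
{
    $offset = 0;
    $match_count = 0;

    while ($offset <= strlen($subject)
        && preg_match($pattern, $subject, $m, PREG_OFFSET_CAPTURE, $offset) === 1) {
        [$text, $match_start] = $m[0];
        $match_length = strlen($text);

        $callback($text, $match_start);
        $match_count++;

        // Update offset to the end of the match. An empty match advances by
        // one byte so the loop cannot stall (assumes a pattern without /u).
        $offset = $match_start + max($match_length, 1);
    }

    return $match_count;
}

// Usage: collect every run of digits together with its byte offset.
$found = [];
$n = each_match('/\d+/', 'a1 b22 c333', function ($text, $start) use (&$found) {
    $found[] = [$text, $start];
});
```

The callback runs as soon as each match is found, so processing can stop early or depend on earlier matches without building the full result array first.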

Note that the offsets returned are byte values (not necessarily numbers of characters), so they only line up with character positions if the data is single-byte encoded. (Or have a look at paolo mosna's strByte function on the strlen manual page.) I'd be interested to know how this method performs speed-wise against using preg_match_all and then recursing through the results.
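
To make the byte-versus-character point concrete, here is a small sketch that converts a byte offset into a character offset for UTF-8 data. The subject string is made up, and counting characters with a /u pattern is just one possible approach, not the one the note refers to.

```php
<?php
// Illustrative UTF-8 subject: "héllo wörld" ("é" and "ö" take two bytes each).
$subject = "h\u{e9}llo w\u{f6}rld";

preg_match_all('/w\S+/u', $subject, $m, PREG_OFFSET_CAPTURE);
[$text, $byteOffset] = $m[0][0];

// Count the UTF-8 characters that precede the match: with the /u modifier,
// "." consumes one whole character, so the match count is the character count.
$charOffset = preg_match_all('/./su', substr($subject, 0, $byteOffset));

echo "byte offset: $byteOffset, character offset: $charOffset\n";
```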

Here's some fleecy code to 1. validate RFC 2822 conformity of address lists and 2. extract the address specification (the part commonly known as 'email'). I wouldn't suggest using it for input form email checking, but it might be just what you want for other email applications. I know it can be optimized further, but that part I'll leave up to you nutcrackers. The total length of the resulting regex is about 30000 bytes. That is because it accepts comments. You can remove that by setting $cfws to $fws, and it shrinks to about 6000 bytes. Conformity checking refers strictly to RFC 2822. Have fun and email me if you have any enhancements!

I found SimpleXML to be useful only in cases where the XML was extremely small; otherwise the server would run out of memory (I suspect there is a memory leak or something?). So while searching for alternative parsers, I decided to try a simpler approach. I don't know how this compares in CPU usage, but I know it works with large XML structures. This is more of a manual method, but it works for me since I always know what structure of data I will be receiving.

Essentially I just preg_match() unique nodes to find the values I am looking for, or I preg_match_all to find multiple nodes. This puts the results in an array and I can then process this data as I please.

I was unhappy though, that preg_match_all() stores the data twice (requiring twice the memory), one array for all the full pattern matches, and one array for all the sub pattern matches. You could probably write your own function that overcame this. But for now this works for me, and I hope it saves someone else some time as well.
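
The approach described in this note can be sketched roughly as follows. The XML snippet and node names are invented for the example, and the patterns assume a known, flat structure without attributes, nesting of the same tag, or CDATA sections.

```php
<?php
// Made-up XML with a known, flat structure (an assumption for illustration).
$xml = '<order><id>42</id><item>apple</item><item>pear</item></order>';

// A unique node: preg_match() returns the first (and only) occurrence.
preg_match('~<id>(.*?)</id>~s', $xml, $id);

// Repeated nodes: preg_match_all() collects every occurrence.
preg_match_all('~<item>(.*?)</item>~s', $xml, $items);

// Index 0 holds the full matches (tags included); index 1 holds the values.
$orderId   = $id[1];
$itemNames = $items[1];

echo "order $orderId: " . implode(', ', $itemNames) . "\n";
```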

;Please note that if you set this value to a high number you may consume all
;the available process stack and eventually crash PHP (due to reaching the
;stack size limit imposed by the Operating System).

I have written this example mainly to demonstrate the power of the PCRE LANGUAGE, not the power of its implementation :)

I wanted to create, for my own use, a clean PHP class that acts on XML files by combining DOM and SimpleXML functions. I ran into a small but very annoying problem: the indexes in a path are not numbered the same way in both.

That is to say, for example, if I get a DOM XPath expression, it looks like:
/ANODE/ANOTHERNODE/SOMENODE[9]/NODE[2]
and the equivalent SimpleXML access would be:
ANODE->ANOTHERNODE->SOMENODE[8]->NODE[1]

So you see what I mean? I used preg_match_all to solve that problem, and after some hours of head-scratching I finally got this (as I'm French, the variable names are in French, sorry), hoping it could be useful to some of you:

<?php
function decrease_string($string)
{
/* retrieve all occurrences AND offsets of numbers in the original string: */