Is there a way to group part of a pattern but not have that group appear in the resulting match groups? For example, suppose I have a string with two lines:

<td>text 1</td>
<td><a href=whatever>this is</a> text 2</td>

I want to parse out "text 1" and "this is text 2". What I'm doing now is using this pattern:

<td>(<a href=.+?>)?(.+?(</a>)?.+?)</td>

Basically, I'm grouping the anchor tags so the pattern can match them zero or one time. I don't want those groups to appear in the match results (though I can easily ignore them). Is there a proper way to do this?
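For reference, here is a minimal sketch of what the pattern above returns for the two sample lines. It assumes Python's `re` module, since the question does not name a regex engine; `ORIGINAL`, `rows`, and `results` are illustrative names only.

```python
import re

# The pattern from the question: every (...) is a capturing group.
ORIGINAL = re.compile(r"<td>(<a href=.+?>)?(.+?(</a>)?.+?)</td>")

rows = [
    "<td>text 1</td>",
    "<td><a href=whatever>this is</a> text 2</td>",
]

# groups() returns one entry per capturing group;
# a group that did not take part in the match is reported as None.
results = [ORIGINAL.search(row).groups() for row in rows]

for groups in results:
    print(groups)
# (None, 'text 1', None)
# ('<a href=whatever>', 'this is</a> text 2', None)
```

Each match carries three entries even though only the middle one is wanted. Note also that the lazy `.+?` lets the optional `(</a>)?` match nothing, so the closing tag stays inside the second group.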

Thanks, that's what I need. But it looks like it doesn't do what I want if I nest a non-capturing group inside a capturing group... is that not possible?
– toasteroven, Nov 20 '09 at 23:28
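On the nesting question: a capturing group returns the whole contiguous substring it matched, so text matched by a non-capturing group nested inside it is still part of that substring. `(?:...)` only avoids creating an extra entry in the results. A small sketch, assuming Python's `re` module and made-up input strings:

```python
import re

# Nested capturing group: two entries in the result.
with_capture = re.match(r"(a(b)?c)", "abc").groups()

# Nested non-capturing group: one entry, but "b" is still inside it.
without_capture = re.match(r"(a(?:b)?c)", "abc").groups()

print(with_capture)     # ('abc', 'b')
print(without_capture)  # ('abc',)
```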

Specifically for the second example, if I match with <td>(?:<a href=.+?>)(.+?(?:</a>).+?)</td>, it doesn't properly match the </a>.
– toasteroven, Nov 20 '09 at 23:30

In the regex in your comment, the a href group is not optional. Try <td>(?:<a href=.+?>)?(.+?(?:</a>)?.+?)</td> instead. BTW, if you're parsing HTML, a regex is a pretty bad approach. Try this instead: codeplex.com/htmlagilitypack
– Andomar, Nov 20 '09 at 23:45
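To make the difference between the two patterns in these comments concrete, here is a hedged sketch in Python (an assumption; the thread does not say which engine is in use). The names `REQUIRED`, `OPTIONAL`, `cells`, and `cleaned` are illustrative, and the final clean-up step is an addition for this example, not something proposed in the thread.

```python
import re

rows = [
    "<td>text 1</td>",
    "<td><a href=whatever>this is</a> text 2</td>",
]

# Pattern from the earlier comment: the anchor tags are required.
REQUIRED = re.compile(r"<td>(?:<a href=.+?>)(.+?(?:</a>).+?)</td>")
# Suggested fix: both anchor groups are optional and non-capturing.
OPTIONAL = re.compile(r"<td>(?:<a href=.+?>)?(.+?(?:</a>)?.+?)</td>")

# The required form only matches the row that contains an anchor.
required_hits = [REQUIRED.search(row) is not None for row in rows]
print(required_hits)  # [False, True]

# With the optional form, one capturing group is left per match.
cells = [OPTIONAL.search(row).group(1) for row in rows]
print(cells)  # ['text 1', 'this is</a> text 2']

# The closing tag is still inside the captured text because a group
# captures one contiguous span; drop it in a separate step.
cleaned = [cell.replace("</a>", "") for cell in cells]
print(cleaned)  # ['text 1', 'this is text 2']
```

The non-capturing groups remove the unwanted entries from the result, but they cannot cut `</a>` out of the middle of the remaining group, so a second pass (or an HTML parser, as suggested above) is still needed.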