Hi, using only the data you posted, the following should do it for you:

Code

$str=~s{\s+?\((.+?)\)(\s+)?}{ '$1' }g;

You can

Code

use re 'debug';

at the top of your script to see what the regex engine is doing. Hope this helps. *Update* In case you want an explanation of the regex, have a look at this module: YAPE::Regex::Explain. Below is how it explains the above regex:

Quote

this 'is' a test of <a te(st)ing> of a 'type'

The regular expression:

(?-imsx:\s+?\((.+?)\)(\s+)?)

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  \s+?                     whitespace (\n, \r, \t, \f, and " ") (1 or
                           more times (matching the least amount
                           possible))
----------------------------------------------------------------------
  \(                       '('
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    .+?                      any character except \n (1 or more times
                             (matching the least amount possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  \)                       ')'
----------------------------------------------------------------------
  (                        group and capture to \2 (optional
                           (matching the most amount possible)):
----------------------------------------------------------------------
    \s+                      whitespace (\n, \r, \t, \f, and " ") (1
                             or more times (matching the most amount
                             possible))
----------------------------------------------------------------------
  )?                       end of \2 (NOTE: because you are using a
                           quantifier on this capture, only the LAST
                           repetition of the captured pattern will be
                           stored in \2)
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
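To see the explanation in action, here is a small sketch that runs the same substitution on a sample string. The input is an assumption, reconstructed from the quoted output above; the original data may have looked different.

```perl
use strict;
use warnings;

# Assumed input, guessed from the quoted output; not the poster's real data.
my $str = "this (is) a test of <a te(st)ing> of a (type)";

# Same substitution as in the post: whitespace + "(x)" + optional
# whitespace is replaced by " 'x' ".
(my $out = $str) =~ s{\s+?\((.+?)\)(\s+)?}{ '$1' }g;

# Brackets make the trailing space visible.
print "[$out]\n";
```

Two things worth noticing: "(st)" is left alone only because no whitespace comes directly before it, and the replacement always appends a space, so a match at the end of the string leaves a trailing space behind.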

Hmm, regexes might not be the right tool for this kind of thing, with nested symbols such as <> [] {} () "" (at least not pure regexes). You can still handle very simple cases with regexes, but it quickly becomes unmanageable. You'll need to use a real parser, or you can build yourself a simple finite-state machine (automaton) that reads the input progressively and records the current state at each point.
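A minimal sketch of such a state machine, assuming the goal is to turn (text) into 'text' everywhere except inside < >. The function name quote_parens and the three state names are made up for illustration, and nesting is deliberately not handled.

```perl
use strict;
use warnings;

# Hypothetical helper: rewrite (text) as 'text' outside <...>, copying
# everything inside angle brackets unchanged. No nesting support.
sub quote_parens {
    my ($input) = @_;
    my $state = 'TEXT';    # TEXT, ANGLE (inside <...>) or PAREN (inside (...))
    my $out   = '';
    my $buf   = '';        # collects the content of the current (...)

    for my $ch (split //, $input) {
        if ($state eq 'TEXT') {
            if    ($ch eq '<') { $state = 'ANGLE'; $out .= $ch }
            elsif ($ch eq '(') { $state = 'PAREN'; $buf = '' }
            else               { $out .= $ch }
        }
        elsif ($state eq 'ANGLE') {
            $out .= $ch;                      # copy verbatim, parens included
            $state = 'TEXT' if $ch eq '>';
        }
        else {                                # PAREN
            if ($ch eq ')') { $out .= "'$buf'"; $state = 'TEXT' }
            else            { $buf .= $ch }
        }
    }

    # An unclosed "(" is written back as it was found rather than dropped.
    $out .= "($buf" if $state eq 'PAREN';
    return $out;
}

print quote_parens("this (is) a test of <a te(st)ing> of a (type)"), "\n";
```

Each character is looked at exactly once, and the only memory is the current state plus a small buffer, so adding another bracket type means adding a state rather than growing a regex.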

You could possibly perform this task in multiple steps (three or four separate regexps).
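One way the multi-step idea could look, again assuming the goal is to rewrite (text) as 'text' outside < > only. This is a sketch, not necessarily the steps meant above; the NUL-delimited placeholder is an assumption and only works if the input never contains a NUL byte.

```perl
use strict;
use warnings;

# Assumed sample input, not the poster's real data.
my $str = "this (is) a test of <a te(st)ing> of a (type)";

# Step 1: move every <...> span out of the way, leaving a numbered
# placeholder (assumes the input contains no NUL bytes).
my @saved;
$str =~ s{(<[^<>]*>)}{ push @saved, $1; "\x00" . $#saved . "\x00" }ge;

# Step 2: with the protected spans gone, rewrite (text) as 'text'.
$str =~ s{\(([^()]*)\)}{'$1'}g;

# Step 3: put the protected spans back.
$str =~ s{\x00(\d+)\x00}{$saved[$1]}g;

print "$str\n";
```

Each step stays simple enough to read on its own, at the cost of three passes over the string.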

Of course, with real-world data you may run into issues that your example data doesn't show and that I therefore haven't handled: e.g. this only supports one set of parentheses per < > wrapper, there is no support for unmatched parentheses, etc.

My preference would be to do this in Perl ;). The example below is rough and probably inefficient, but it is provided to show the comparative ease of this approach. It is also likely to cope better with real-world data: