Now open /tmp/blah in vi and search for ". If you find one, there is an odd number of "s on that line, so note the line number and look at the same line in /tmp/broken.html to see what the problem might be.

So, just breaking down the two sed statements:

sed -e 's/[^"]//g'

This matches every character that is not a " and replaces it with nothing (i.e. removes it). The g at the end means apply the substitution to every match on the line, not just the first. So after this step, we have just the "s from each line. An example of the output at this point would be:

""
""""""
"""

Now we want to get rid of the "s that come in pairs, so we pair them up and turn each pair into x's, leaving any unpaired " all by itself.

sed -e 's/""/xx/g'

This means whenever you find two quotes in a row ("") replace them with xx, and do this for every pair on the line (the g at the end).
So the result would then look like:

xx
xxxxxx
xx"

And we can see on the last line there is an unbalanced ".
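
If you'd rather not eyeball the output in vi, the two sed steps can be chained and the offending line numbers printed directly. This is only a sketch: the function name unbalanced_quotes is made up for illustration, and it assumes your sed accepts multiple -e expressions and that grep -n and cut are available, which should be true on any POSIX-ish system.

```shell
# Hypothetical helper: print the line numbers of lines in a file
# that contain an odd number of double quotes.
unbalanced_quotes() {
    # 1st expression: strip everything that is not a "
    # 2nd expression: turn each pair of "" into xx
    # grep -n keeps only lines with a leftover " and prefixes the line number
    # cut keeps just that line number
    sed -e 's/[^"]//g' -e 's/""/xx/g' "$1" | grep -n '"' | cut -d: -f1
}
```

Running unbalanced_quotes /tmp/broken.html should print one line number per suspect line, and nothing at all if every line is balanced.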

This would work equally well for single quotes, or anything else, as long as you escape it correctly and the matching pairs appear on the same line.
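
For single quotes, one way to deal with the escaping (a suggestion, not the only option) is to wrap the sed scripts in double quotes so the ' can appear literally. The function name below is hypothetical, and the same caveat applies: it only counts quotes per line.

```shell
# Hypothetical helper: same trick, but for single quotes.
# The sed scripts are wrapped in double quotes so ' needs no escaping.
unbalanced_single_quotes() {
    sed -e "s/[^']//g" -e "s/''/xx/g" "$1" | grep -n "'" | cut -d: -f1
}
```

Keep in mind that apostrophes in ordinary text (like "it's") count as single quotes too, so expect some false alarms on prose-heavy lines.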

This entry was posted by cameron on February 23, 2010 at 12:13 pm under web.