Thomas Jollans wrote:
> On Wednesday 18 August 2010, it occurred to Brandon Harris to exclaim:
>> Having trouble using %s with re.sub
>>
>> test = '/my/word/whats/wrong'
>> re.sub('(/)word(/)', r'\1\%s\2'%'1000', test)
>>
>> return is /my/@0/whats/wrong
>
> This has nothing to do with %, of course:
>
> >>> re.sub('(/)word(/)', r'\1\%d\2'%1000, test)
> '/my/@0/whats/wrong'
> >>> re.sub('(/)word(/)', r'\1\1000\2', test)
> '/my/@0/whats/wrong'
>
> let's see if we can get rid of that zero:
>
> >>> re.sub('(/)word(/)', r'\1\100\2', test)
> '/my/@/whats/wrong'
>
> so '\100' appears to be getting replaced with '@'. Why?
>
> >>> '\100'
> '@'
>
> This is Python's way of escaping characters using octal numbers.
>
> >>> chr(int('100', 8))
> '@'
>
> How to avoid this? Well, if you wanted the literal backslash, you'll need to
> escape it properly:
>
> >>> print(re.sub('(/)word(/)', r'\1\\1000\2', test))
> /my/\1000/whats/wrong
>
> If you didn't want the backslash, then why on earth did you put it there? You
> have to be careful with backslashes, they bite ;-)
>
> Anyway, you can simply do the formatting after the match.
>
> >>> re.sub('(/)word(/)', r'\1%d\2', test) % 1000
> '/my/1000/whats/wrong'
>
> Or work with match objects to construct the resulting string by hand.
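
For the match-object route, a minimal sketch might look like this. The helper name replace_word is made up for illustration; it is not from the thread. re.sub accepts a function as the replacement and uses its return value literally, so no backslash escapes are interpreted at all:

```python
import re

test = '/my/word/whats/wrong'

def replace_word(match):
    # Hypothetical helper: rebuild the replacement from the two captured
    # slashes and the number, without going through a template string.
    return '%s%d%s' % (match.group(1), 1000, match.group(2))

result = re.sub('(/)word(/)', replace_word, test)
print(result)  # /my/1000/whats/wrong
```

This also sidesteps the problem with formatting after the match, which would break if the subject string happened to contain a % of its own.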

You can stop group references which are followed by digits from turning
into octal escapes in the replacement template by using \g<n> instead:
>>> print(r'\1%s' % '00')
\100
>>> print(r'\g<1>%s' % '00')
\g<1>00
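
Applied to the re.sub call from the original post, that could look like the sketch below. The variable names value, fixed and broken are only for illustration. Note that with a raw string it is the re module's template parser, not the Python string literal, that reads \100 as an octal escape:

```python
import re

test = '/my/word/whats/wrong'
value = '1000'  # same value as in the original post

# \g<1> is an unambiguous reference to group 1, so the digits that
# follow it stay literal text instead of becoming part of an escape.
fixed = re.sub('(/)word(/)', r'\g<1>%s\g<2>' % value, test)
print(fixed)   # /my/1000/whats/wrong

# For comparison, the template from the original post, where the
# stray backslash plus '100' is read as the octal escape for '@'.
broken = re.sub('(/)word(/)', r'\1\%s\2' % value, test)
print(broken)  # /my/@0/whats/wrong
```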