This kind of code is too clever for its own good.
– dkamins May 24 '11 at 21:53

I think it's neat. Although I'd wrap it in a function.
– pillmuncher May 24 '11 at 22:39

Question: does this code do the replacement in a single pass, or is sub() called once per dictionary key-value pair?
– zzz Jan 28 '12 at 15:10

The replacement happens in a single pass.
– Andrew Clark Jan 28 '12 at 18:29

dkamins: it’s not too clever; it’s not even as clever as it should be (we should regex-escape the keys before joining them with "|"). Why isn’t that overengineered? Because this way we do it in one pass (= fast), and we do all the replacements at the same time, avoiding clashes like "spamham sha".replace("spam", "eggs").replace("sha","md5") being "eggmd5m md5" instead of "eggsham md5".
– flying sheep Sep 4 '12 at 22:19
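
A minimal sketch of the escaped, single-pass approach described in this comment. The helper name `replace_all` is hypothetical and not taken from the answer under discussion, and the sketch assumes a non-empty mapping.

```python
import re

def replace_all(text, mapping):
    """Replace every key of `mapping` found in `text` in a single pass."""
    # re.escape keeps keys such as "1.5" or "c++" from being read as regex syntax.
    pattern = re.compile("|".join(re.escape(key) for key in mapping))
    # Each match is looked up in the mapping; inserted text is never rescanned.
    return pattern.sub(lambda match: mapping[match.group(0)], text)

mapping = {"spam": "eggs", "sha": "md5"}

# Chained str.replace rescans its own output, so "sha" is found inside "eggsham".
chained = "spamham sha".replace("spam", "eggs").replace("sha", "md5")
print(chained)                              # eggmd5m md5
print(replace_all("spamham sha", mapping))  # eggsham md5

# Unescaped, "1.5" would also match "125"; escaped, it only matches a literal dot.
print(replace_all("1.5 and 125", {"1.5": "one and a half"}))
```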

The order in which you apply the different replacements will matter - so instead of using a standard dict, consider using an OrderedDict - or a list of 2-tuples.
– slothrop May 24 '11 at 21:43

This iterates over the string twice... not good for performance.
– Valentin Lorentz Aug 3 '12 at 7:26

Performance-wise it's worse than what Valentin says - it'll traverse the text as many times as there are items in dic! Fine if 'text' is small, but terrible for large text.
– JDonner Nov 17 '12 at 1:40

This is a good solution for some cases. For example, I just want to sub 2 characters and I don't care about the order they go in because the substitution keys don't match any values. But I do want it to be clear what's happening.
– Nathan Garabedian Jun 2 '13 at 14:21

Note that this may give unexpected results because the newly inserted text in the first iteration can be matched in the second iteration. For example, if we naively try to replace all 'A' with 'B' and all 'B' with 'C', the string 'AB' would be transformed into 'CC', and not 'BC'.
– Ambroz Bizjak Oct 21 '13 at 10:47
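
This is easy to reproduce. The sketch below uses a hypothetical helper, `chained_replace`, as a stand-in for a loop over the replacement pairs; that is an assumption about the shape of the answer, not its actual code. It also shows why the order of the pairs matters.

```python
def chained_replace(text, pairs):
    """Apply each (old, new) pair with str.replace, one after the other."""
    for old, new in pairs:
        # Each call scans the whole text again, including earlier insertions.
        text = text.replace(old, new)
    return text

# The "B" inserted by the first pair is picked up by the second pair.
print(chained_replace("AB", [("A", "B"), ("B", "C")]))  # CC
# With the pairs reversed, the result is different.
print(chained_replace("AB", [("B", "C"), ("A", "B")]))  # BC
```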

It would be simpler to make repls a sequence of tuples and do away with the iteritems() call, i.e. repls = ('hello', 'goodbye'), ('world', 'earth') and reduce(lambda a, kv: a.replace(*kv), repls, s). That would also work in Python 3, provided reduce is imported from functools.
– martineau Dec 5 '13 at 3:28
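
Spelled out as a runnable snippet, the suggestion looks roughly like this. The input string `s` is an assumption made for illustration; the comment does not say what it contains.

```python
from functools import reduce  # a builtin in Python 2; Python 3 needs this import

repls = ("hello", "goodbye"), ("world", "earth")
s = "hello, world"  # hypothetical input

# reduce passes the running string `a` and each (old, new) pair `kv` to str.replace.
result = reduce(lambda a, kv: a.replace(*kv), repls, s)
print(result)  # goodbye, earth
```

This is still chained replacement, so the text is scanned once per pair and the order of the pairs matters.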

Note that since replacement is done in just one pass, "café" changes to "tea", but it does not change back to "café".

If you need to do the same replacement many times, you can create a replacement function easily:

>>> my_escaper = multiple_replacer(('"', '\\"'), ('\t', '\\t'))
>>> many_many_strings = (u'This text will be escaped by "my_escaper"',
...                      u'Does this work?\tYes it does',
...                      u'And can we span\nmultiple lines?\t"Yes\twe\tcan!"')
>>> for line in many_many_strings:
...     print(my_escaper(line))
...
This text will be escaped by \"my_escaper\"
Does this work?\tYes it does
And can we span
multiple lines?\t\"Yes\twe\tcan!\"
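
The definition of `multiple_replacer` is not shown above. One possible definition, consistent with the session but an assumption rather than the answer's own code, would be:

```python
import re

def multiple_replacer(*key_values):
    """Build a function that applies all (old, new) pairs in one pass."""
    replace_dict = dict(key_values)
    # Compile once, so the returned function can be reused on many strings.
    pattern = re.compile("|".join(re.escape(old) for old, _ in key_values))
    return lambda string: pattern.sub(lambda m: replace_dict[m.group(0)], string)

my_escaper = multiple_replacer(('"', '\\"'), ('\t', '\\t'))
print(my_escaper('Does this work?\tYes it does'))  # Does this work?\tYes it does
```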

While this is a good solution, concurrent string replacements won't give precisely the same results as performing them sequentially (chaining them) would, although that may not matter.
– martineau Dec 5 '13 at 4:05

Can you explain it better and show an example?
– mmj Dec 5 '13 at 13:52

Sure, with rep_dict = {"but": "mut", "mutton": "lamb"} the string "button" results in "mutton" with your code, but would give "lamb" if the replacements were chained, one after the other.
– martineau Dec 5 '13 at 14:35

That is the main feature of this code, not a defect. With chained replacements it could not achieve the desired behaviour of substituting two words simultaneously and reciprocally like in my example.
– mmj Dec 5 '13 at 22:55
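
To illustrate the reciprocal substitution mentioned here, a small sketch follows. The function name `swap_words` and the sample sentence are made up for illustration.

```python
import re

def swap_words(text, first, second):
    """Exchange every occurrence of `first` and `second` in one pass."""
    mapping = {first: second, second: first}
    pattern = re.compile("|".join(re.escape(word) for word in mapping))
    return pattern.sub(lambda match: mapping[match.group(0)], text)

print(swap_words("Do you like café? No, I prefer tea.", "café", "tea"))
# Do you like tea? No, I prefer café.

# Chained replacement cannot swap: the second call undoes the first.
print("café or tea".replace("café", "tea").replace("tea", "café"))
# café or café
```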

It doesn't seem like a great feature; I agree with martineau.
– KP25 Dec 22 '13 at 6:56

The point is to avoid many concatenations of long strings. We chop the source string into fragments, replace some of the fragments as we form the list, and then join the whole thing back into a string.
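
A rough sketch of that chop, replace and join idea, using `re.split` to produce the fragments. The function name `replace_by_fragments` is hypothetical, the splitting technique is my assumption about how the fragments are obtained, and the mapping is assumed to be non-empty.

```python
import re

def replace_by_fragments(text, mapping):
    """Split `text` around the keys, swap the key fragments, and join once."""
    # The capturing group makes re.split keep the matched keys in the result,
    # so the list alternates: plain text, key, plain text, key, ...
    pattern = re.compile("(" + "|".join(re.escape(key) for key in mapping) + ")")
    fragments = pattern.split(text)
    # Odd positions hold matched keys; even positions pass through unchanged.
    replaced = [mapping[frag] if i % 2 else frag for i, frag in enumerate(fragments)]
    # A single join avoids building many intermediate long strings.
    return "".join(replaced)

print(replace_by_fragments("spamham sha", {"spam": "eggs", "sha": "md5"}))
# eggsham md5
```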

I needed a solution where the strings to be replaced can be regular expressions, for example to help normalize a long text by replacing runs of whitespace characters with a single one. Building on a chain of answers from others, including MiniQuark and mmj, this is what I came up with:
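
The code of this answer is not reproduced here. As a stand-in, the sketch below shows one way to do simultaneous replacement with regular-expression keys. The name `multiple_regex_replace` and the generated group names are assumptions, and the patterns are assumed not to define named groups of their own.

```python
import re

def multiple_regex_replace(text, pairs, flags=0):
    """Apply (pattern, replacement) pairs simultaneously in one pass."""
    pairs = list(pairs.items()) if isinstance(pairs, dict) else list(pairs)
    # Wrap each pattern in a named group so the match tells us which one fired.
    combined = re.compile(
        "|".join("(?P<_%d>%s)" % (i, pat) for i, (pat, _) in enumerate(pairs)),
        flags,
    )
    # lastgroup is the name of the group that matched, e.g. "_1" -> pairs[1].
    return combined.sub(lambda m: pairs[int(m.lastgroup[1:])][1], text)

pairs = [(r"\s+", " "), (r"colou?r", "color")]
print(multiple_regex_replace("a   b\t\tc  colour", pairs))
# a b c color
```

The replacement values are inserted literally here; backreferences such as `\1` in a replacement are not expanded.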

Note that it does not chain the replacements, but instead performs them simultaneously. This makes it more efficient without constraining what it can do. To mimic the effect of chaining, you may just need to add more string-replacement pairs and ensure the expected ordering of the pairs:
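
A sketch of that idea with literal strings rather than regular expressions, reusing the "but"/"mutton" example from the comments. The helper `simultaneous_replace` is hypothetical.

```python
import re

def simultaneous_replace(text, pairs):
    """Replace all (old, new) pairs in one pass; earlier pairs are tried first."""
    mapping = dict(pairs)
    # Alternatives are tried left to right, so the order of `pairs` matters.
    pattern = re.compile("|".join(re.escape(old) for old, _ in pairs))
    return pattern.sub(lambda match: mapping[match.group(0)], text)

# Chaining "but" -> "mut" and then "mutton" -> "lamb" would turn "button" into "lamb".
plain = [("but", "mut"), ("mutton", "lamb")]
print(simultaneous_replace("button", plain))  # mutton

# Adding the composed pair, listed before its prefix "but", mimics the chain.
extended = [("button", "lamb"), ("but", "mut"), ("mutton", "lamb")]
print(simultaneous_replace("button", extended))  # lamb
```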

Starting from Andrew's valuable answer, I developed a script that loads the dictionary from a file and processes all the files in the opened folder to do the replacements. The script loads the mappings from an external file, in which you can set the separator. I'm a beginner, but I found this script very useful when doing multiple substitutions in multiple files. It loaded a dictionary with more than 1000 entries in seconds. It is not elegant, but it worked for me.
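
The script itself is not shown here. A minimal sketch of the same workflow might look like the following; the function names, the `=` default separator, the `*.txt` filter and the UTF-8 encoding are all assumptions, and the mapping file is assumed not to match the file filter.

```python
from pathlib import Path
import re

def load_mapping(path, separator="="):
    """Read one 'old<separator>new' pair per line into a dict."""
    mapping = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        if separator in line:
            old, new = line.split(separator, 1)
            if old:  # an empty key would match everywhere, so skip it
                mapping[old] = new
    return mapping

def replace_in_folder(folder, mapping, glob="*.txt"):
    """Rewrite every matching file in `folder`, applying all pairs in one pass."""
    if not mapping:
        return
    # Longer keys first, so a key is not shadowed by one of its own prefixes.
    keys = sorted(mapping, key=len, reverse=True)
    pattern = re.compile("|".join(re.escape(key) for key in keys))
    for file_path in Path(folder).glob(glob):
        text = file_path.read_text(encoding="utf-8")
        new_text = pattern.sub(lambda match: mapping[match.group(0)], text)
        file_path.write_text(new_text, encoding="utf-8")
```

Because the files are overwritten in place, it is worth trying this on a copy of the folder first.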