Thank you, Terry, for taking the time to give a full explanation!
-----Original Message-----
From: Terry Reedy [mailto:tjreedy at udel.edu]
Sent: Tuesday, February 04, 2003 3:09 PM
To: python-list at python.org
Subject: Re: newbie raw text question
[post and cc]
"Ian Sparks" <Ian.Sparks at etrials.com> wrote in message
news:mailman.1044369911.27254.python-list at python.org...
>Thanks for the reply Dennis. Your breakdown of the meaning of the RTF
>codes is pretty-much spot on. However, I'm still not "getting it".
What you are not quite getting (and others have had the same
problem) is the difference between a string literal in your code and a
string object in your execution space. String literals are used to
give a sequence-of-bytes value to string objects, but they are not the
objects themselves. 'Rawness' is a property only of code literals,
not of strings themselves, nor of non-code input, nor of output
text. And it consists only in the presence of the 'r' prefix.
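The literal/object distinction can be checked directly. The following is a minimal sketch in current Python 3 syntax (the original discussion predates it); the variable names are only illustrative.

```python
# Two differently written literals that produce the same string object value.
cooked = '\\n'   # escapes processed: the doubled backslash becomes one backslash
raw = r'\n'      # left raw: backslash followed by the letter n

assert cooked == raw                      # same two characters
assert type(cooked) is type(raw) is str   # no separate "raw string" type exists
assert len(raw) == 2

newline = '\n'   # cooked literal: a single newline character
assert len(newline) == 1
```

Once the interpreter has built the object, nothing records whether the literal carried an 'r' prefix.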
In a sense, even 'raw literal' is a slight misnomer. All string
literals are 'raw' as written. The 'r' prefix is a "leave raw, do not
cook" directive to the interpreter. So a 'raw' literal is a literal
that is left raw when fed to the interpreter and the expressions it
appears in.
The convention could be the opposite -- that string literals be left
as they are (raw) unless tagged with a 'cook' directive. It is not, I
presume, because of the fairly frequent use, in some situations, of
'\n' and possibly '\t'. The current situation is analogous to the
following: if I say "Feed X an egg" and X is human, I probably mean
"Feed X an (initially raw) egg that is cooked in the 'standard'
manner". I would have to say "Feed X a raw egg" to disable the usual
processing.
As for output: let s be a string. Then
'file.write(s)' writes the bytes of s to the device that 'file'
represents exactly as they are, with the possible exception of
'text-mode' expansion of '\n' (but that is a separate issue).
'str(s)', which 'print s' uses, produces a 'friendly' graphical
representation.
'repr(s)' (== `s`), which the interpreter uses to echo expressions in
interactive mode, produces an exact graphical representation that
'eval()'s back to s, i.e.,
eval(repr(s)) == s. repr() does not use 'r' prefixes (which are a
somewhat recent addition to the language). However, it could, and, if
it were not for the problem of breaking code that depends on the exact
current behavior, I might even think it should.
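The three output paths described above can be compared on one string. This is a small sketch in Python 3 syntax, using io.StringIO as an in-memory stand-in for a real file; the RTF-like sample value is made up for illustration.

```python
import io

s = r'{\rtf1\ansi}'     # the object holds single backslashes

buf = io.StringIO()     # in-memory stand-in for a file object
buf.write(s)            # write() stores the characters unchanged
assert buf.getvalue() == s

print(str(s))           # {\rtf1\ansi}
print(repr(s))          # '{\\rtf1\\ansi}'

# repr() doubles the backslashes so that the text evaluates back to s.
assert eval(repr(s)) == s
```

The doubled backslashes appear only in the repr() text, not in the string itself or in what write() stores.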
Pending a change, you could write your own function. Here is a start
(untested):

    def r_rep(s):
        # modify repr() to use 'r' prefix when possible and useful
        rep = repr(s)
        return r_able(rep) and 'r'+rep.replace(r'\\', '\\') or rep

where r_able(rep), left as an exercise for you, checks that rep
a) contains backslashes (so that 'r' prefixing is useful)
b) only has even-length backslash sequences (so all can be
undoubled)
c) ends with a multiple-of-four (possibly 0) backslash sequence (so
there will still be an even number after halving)
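One possible way to fill in the exercise is sketched below in Python 3 syntax. It is an illustration, not a tested part of the original post: it assumes s is a str (not bytes), so that repr(s) begins and ends with a single quote character, and it uses a conditional expression in place of the and/or idiom.

```python
import re

def r_able(rep):
    """Return True if the repr() text `rep` can be rewritten with an 'r' prefix."""
    body = rep[1:-1]                         # drop the quotes added by repr()
    runs = re.findall(r'\\+', body)          # every maximal run of backslashes
    if not runs:                             # (a) no backslashes: prefix is useless
        return False
    if any(len(run) % 2 for run in runs):    # (b) an odd run is a real escape
        return False
    trailing = len(body) - len(body.rstrip('\\'))
    return trailing % 4 == 0                 # (c) still even at the end once halved

def r_rep(s):
    """Like repr(s), but use an r-prefixed literal when possible and useful."""
    rep = repr(s)
    return 'r' + rep.replace('\\\\', '\\') if r_able(rep) else rep

print(r_rep('C:\\temp\\new'))   # r'C:\temp\new'
print(r_rep('a\nb'))            # 'a\nb'  (newline escape: falls back to repr)
```

Condition (b) also rejects escapes such as \n, \x00 or \' in the repr() text, since each shows up as an odd-length run, so eval(r_rep(s)) == s should still hold for those strings.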
Terry J. Reedy