Activity

Thanks for the patch, Dave. It is Rich Hickey's policy to include code in Clojure only if it was written by someone who has signed a Contributor Agreement (CA). See here for more details: http://clojure.org/contributing
Have you signed one, or are you considering it?

Andy Fingerhut
added a comment - 19/Jul/12 11:46 AM

Can someone find some documentation or spec somewhere that defines this \x.. format?

It is definitely different from the \x{...} syntax that exists in Perl, which permits one to insert an arbitrary Unicode code point into a string (including supplementary ones that don't fit into a single UTF-16 code unit, which is all that Java's and Clojure's \u.... can express). http://perldoc.perl.org/perlunicode.html#Effects-of-Character-Semantics
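
To make the code-unit distinction concrete, here is a small JavaScript sketch (JavaScript being where the \x.. form comes from). It assumes a modern engine: the braced \u{...} form was added in ES2015 and is shown only as the closest JavaScript analogue to Perl's \x{...}, not as something the readers discussed here accept.

```javascript
// \xHH can only express U+0000..U+00FF; \uHHHH expresses one UTF-16 code unit.
const viaX = "\xE9";   // é written with the two-digit form
const viaU = "\u00E9"; // the same character written with the four-digit form
console.log(viaX === viaU); // true

// U+1F600 is a supplementary code point, so \uHHHH needs a surrogate pair.
const pair = "\uD83D\uDE00";
const braced = "\u{1F600}"; // ES2015 code point escape (assumed available)
console.log(pair === braced); // true
console.log(pair.length); // 2 (UTF-16 code units)
console.log(pair.codePointAt(0).toString(16)); // 1f600
```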

Andy Fingerhut
added a comment - 19/Jul/12 3:57 PM

I'm hitting this now as well. But adding support for JavaScript's flavour of \x.. escapes to the Clojure reader makes no sense to me. If escapes are to be used, then the \u.... format seems preferable (it is a superset of \x..).

However, all of the readers in play (Clojure reader, ClojureScript reader, edn) play nice with Unicode, so there's no reason to be escaping anything except for \t, \n, and so on.

It looks like tweaking cljs' string implementations of IPrintWithWriter and IPrintable so that only those characters are escaped would be fairly easy. Right now, they're using goog.string.escape, which "encloses a string in double quotes and escapes characters so that the string is a valid JS string"; whatever escaping is appropriate for a "valid JavaScript string" seems irrelevant to what e.g. pr-str should produce.
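
As a rough illustration of the "escape only those characters" idea, here is a minimal JavaScript sketch. The name quoteMinimal and the exact escape set are assumptions made for the example; this is not the actual ClojureScript printer and not a Closure Library function.

```javascript
// Hypothetical helper: escape only the quote, the backslash and a few
// control characters, leaving all other characters (including non-ASCII) alone.
const MINIMAL_ESCAPES = {
  '"': '\\"',
  "\\": "\\\\",
  "\b": "\\b",
  "\f": "\\f",
  "\n": "\\n",
  "\r": "\\r",
  "\t": "\\t",
};

function quoteMinimal(s) {
  // Inside a character class, \b matches the backspace character.
  const body = s.replace(/["\\\b\f\n\r\t]/g, (ch) => MINIMAL_ESCAPES[ch]);
  return '"' + body + '"';
}

// The tab is escaped; the accented character passes through unchanged.
console.log(quoteMinimal("caf\u00E9\tau lait"));
```

With this approach no \x.. sequence is ever produced, so the output stays within the escapes the readers mentioned above already share.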

Chas Emerick
added a comment - 19/Oct/12 8:10 AM
I propose closing this ticket and moving the party to CLJS.

Following Chas's lead and closing this one. \x doesn't appear in the JSON spec, and a quick search of StackOverflow shows people stumbling over it from a bunch of other language platforms. I think we should root it out of ClojureScript.

Stuart Halloway
added a comment - 19/Oct/12 1:55 PM

Re: "no reason to be escaping anything except for \t, \n": sometimes it is difficult or impossible to transmit all of Unicode (e.g. sending non-Character codepoints through XDomainRequest, or sending U+0000/U+FFFE/U+FFFF through many XHR implementations), so it might be nice to have an ASCII-only printing mode. Probably for another ticket, though.
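
A possible shape for such an ASCII-only mode, sketched in JavaScript under stated assumptions: the name quoteAscii is hypothetical, and the choice to emit every character outside printable ASCII as a \u.... escape, one per UTF-16 code unit, is just one option.

```javascript
// Hypothetical helper: produce a quoted string containing only printable ASCII.
function quoteAscii(s) {
  let out = '"';
  for (let i = 0; i < s.length; i++) {
    const ch = s[i];
    const code = s.charCodeAt(i); // one UTF-16 code unit
    if (ch === '"' || ch === "\\") {
      out += "\\" + ch; // keep the delimiter and the escape character readable
    } else if (code >= 0x20 && code <= 0x7e) {
      out += ch; // printable ASCII passes through
    } else {
      // Everything else, including U+0000 and U+FFFE/U+FFFF, becomes \uXXXX.
      out += "\\u" + code.toString(16).toUpperCase().padStart(4, "0");
    }
  }
  return out + '"';
}

console.log(quoteAscii("nul:\u0000 smile:\uD83D\uDE00"));
```

Because the loop works per code unit, a supplementary character comes out as two \u.... escapes (its surrogate pair), which stays within the four-digit form discussed earlier in the thread.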

Ivan Kozik
added a comment - 19/Oct/12 2:39 PM

@Ivan: I agree that options in this area would be good. There are a lot of edge cases where the defaults aren't right (e.g. I think escaping all nonprintables is a no-brainer for readably-printed strings).