Dave Engberg
added a comment - 05/Oct/08 18:48 That sounds fine to me ... although the double-quotes would have to be escaped in the original Thrift IDL, so I assumed (incorrectly?) that this escaping would carry through to the generated code.

You have to use 'string with " double quote' and "string with ' quote" in your thrift file. You can't escape single or double quote with back slash or something like it. The literal token is described in thriftl.ll file:

Alexander Shigin
added a comment - 05/Oct/08 19:32 You have to use 'string with " double quote' and "string with ' quote" in your thrift file. You can't escape single or double quote with back slash or something like it. The literal token is described in thriftl.ll file:
dliteral ( "\" "[^" ]* "\" ")
sliteral ( "'" [^']* "'" )
I think it should be fixed.

Dave Engberg
added a comment - 05/Oct/08 19:38 Ah, ok. I've only used double-quotes for string constants in Thrift. In that case, it sounds like triple-quotes is the right fix for now, thanks.
So it sounds like there's no way to write a string constant in a Thrift IDL which contains both a single quote and a double quote character?

Alexander Shigin
added a comment - 05/Oct/08 19:55 It seems so.
And my patch should be reworked. It's uncommon to use triple single quote, but if anyone use it, he gets an error. I think it's better to replace ' to \' and use single quote as guard.
Your string should be escaped like: 'Can\'t touch this'.

The escaping in the Thrift file should be separate from the escaping of generated code. We should NOT be putting slashes into a .thrift file to account for how they'll be escaped in Python, for example.

The correct fix is to change the generators on a per-language basis to properly escape the strings that they are generating.

Mark Slee
added a comment - 06/Oct/08 19:22 The escaping in the Thrift file should be separate from the escaping of generated code. We should NOT be putting slashes into a .thrift file to account for how they'll be escaped in Python, for example.
The correct fix is to change the generators on a per-language basis to properly escape the strings that they are generating.
out << "'" << value->get_string() << "'";
+ out << "'''" << value->get_string() << "'''";
This should look like:
+ out << "'''" << this->escape_py_literal(value->get_string()) << "'''";
And the method escape_*_literal should take care of proper escaping on a per-language basis.

Mark, I think that Dave points out a flaw that should be addressed, though. We do need some kind of escaping in Thrift IDL or we can't represent strings with both single quotes and double quotes in them. However, as you correctly point out, this escaping should be separate from the escaping needs of the target languages.

My recommendation:
1. Beef up the IDL parser to allow some form of slash escaping in the constant strings. This would not be fully backwards compatible since existing '\' in strings would need to be escaped. The parser would unescape these characters as they are read in.
2. Have the compiler perform the appropriate language-specific escaping as you suggest above.

Chad Walters
added a comment - 06/Oct/08 19:38 Mark, I think that Dave points out a flaw that should be addressed, though. We do need some kind of escaping in Thrift IDL or we can't represent strings with both single quotes and double quotes in them. However, as you correctly point out, this escaping should be separate from the escaping needs of the target languages.
My recommendation:
1. Beef up the IDL parser to allow some form of slash escaping in the constant strings. This would not be fully backwards compatible since existing '\' in strings would need to be escaped. The parser would unescape these characters as they are read in.
2. Have the compiler perform the appropriate language-specific escaping as you suggest above.

Alexander Shigin
added a comment - 25/Mar/09 16:53 The patch adds escaping method to generator and fixes lexer to allow strings like 'I don\'t know'. Can anybody check if I escape strings right?
I don't know if t_const_value::get_escaped_string is a right decision.

Awesome patch. I made a bunch of small changes like adding escapes, fixing a few generators, making the call sites more uniform, etc. I also changed the way get_escaped_string is called. I like this way better, and you seemed unsure about the old way too. I also eliminated your vim comment. My guess is that you had the wrong filetype set because Vim defaults to "LifeLines" for ".ll", even though it concedes that it is also used for Lex for C++. Let me know if the modeline patch doesn't fix it.

David Reiss
added a comment - 25/Mar/09 21:23 Awesome patch. I made a bunch of small changes like adding escapes, fixing a few generators, making the call sites more uniform, etc. I also changed the way get_escaped_string is called. I like this way better, and you seemed unsure about the old way too. I also eliminated your vim comment. My guess is that you had the wrong filetype set because Vim defaults to "LifeLines" for ".ll", even though it concedes that it is also used for Lex for C++. Let me know if the modeline patch doesn't fix it.

Oh, there is something wrong with me in past two days. I've missed your comments with new patch at all David, sorry.

I'll check the modeline, because I do hate the "comment fix". Perl '\n' was a big surprise for me. The absent of test/const-check is even worse. I haven't made one . It's hard for me to create test suite because I have to read a manual about automake, autoconf, etc...

Your decision to make get_escaped_string is better than send this to const_value.

Alexander Shigin
added a comment - 26/Mar/09 20:42 Oh, there is something wrong with me in past two days. I've missed your comments with new patch at all David, sorry.
I'll check the modeline, because I do hate the "comment fix". Perl '\n' was a big surprise for me. The absent of test/const-check is even worse. I haven't made one . It's hard for me to create test suite because I have to read a manual about automake, autoconf, etc...
Your decision to make get_escaped_string is better than send this to const_value.
Here a couple strange things in your patch:
we don't need t_php_generator::escape_string anymore;
why do you escape ' in html generator?
So, if escaping is right it's lgtm.