Hi Tomas,
Thanks for the report: this is an important area indeed.
However, the comment says it: this is intentional and, as far as I know, correct. You see, Zinc will correctly decode/encode any HTTP payload, using the supplied mime-type and/or charset. Seaside is written such that it does not want this: it insists on doing the conversion on its own. That is why ZnZincServerAdaptor is _not_ using the normal request reading code of Zn, but uses a special option to read everything as binary. The stupid thing is that even though Seaside needs bytes, it wants them as a String. That is the reason for the otherwise brain-dead #asString (and the implicit copy is inefficient as well).
Of course, you will see your special characters there if you do UTF8 decoding, but that is because you already know what is inside.
What normally happens is that later on in the processing, Seaside accesses the WARequest payload with proper decoding, using its own framework (much like what Zn would do). AFAIK this whole process works. You can actually test this using some of the functional tests.
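To illustrate the point, here is a minimal workspace sketch of the round trip. It is not the Seaside code path itself: the sample text is made up, and ZnUTF8Encoder stands in for whatever codec Seaside would apply later. It only shows that #asString keeps the bytes intact, one character per byte, so decoding afterwards still works.

```smalltalk
| original bytes raw decoded |
"Sample Czech text; an assumption for illustration only."
original := 'Příliš žluťoučký kůň'.

"What arrives over the wire: the UTF-8 encoded bytes."
bytes := ZnUTF8Encoder new encodeString: original.

"What the adaptor hands to Seaside: one character per byte, not yet decoded."
raw := bytes asString.

"What a later decoding step can still recover from that String."
decoded := ZnUTF8Encoder new decodeBytes: raw asByteArray.

self assert: raw size = bytes size.
self assert: decoded = original.
```

If you inspect raw, it looks corrupted, but no information has been lost at that point.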
I am not sure that Seaside-REST is doing the right thing (there were some issues with SmalltalkHub as well), but I would think so.
Are you sure you have set the correct encoding on the adaptor?
Are you sure you are posting as application/json;charset=utf-8 and, if you do not set the charset, are you sure utf-8 is the default?
Are you sure your REST handler and/or JSON parser does the right thing?
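A rough sketch of how the first two points could be checked from a workspace. The assumptions are mine: that the first adaptor registered with the default server manager is your ZnZincServerAdaptor, and that the JSON body is a made-up sample. Adapt it to your own setup.

```smalltalk
| adaptor entity |
"Assumption: the first registered adaptor is the Zinc one."
adaptor := WAServerManager default adaptors first.

"Give the adaptor an explicit UTF-8 codec, so Seaside decodes with it."
adaptor codec: (GRCodec forEncoding: 'utf-8').

"Build a request entity with an explicit charset; the JSON is a hypothetical sample."
entity := ZnEntity
	with: '{"name":"Příliš"}'
	type: (ZnMimeType fromString: 'application/json;charset=utf-8').

"Inspect what would be sent as Content-Type."
entity contentType
```

An entity built this way can be passed to ZnClient with #entity: before #post, which takes the client side out of the equation.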
It is too late right now, but if we want to get further with this, I will need a failing unit test. I do not know whether Seaside-REST has unit tests, but I would assume so. I have no experience running Seaside-REST (I am using Zinc-REST myself), but I would like to learn.
Regards,
Sven
On 25 Jun 2013, at 23:24, Tomas Kukol <tomas.kukol at gmail.com> wrote:
> Hi Sven.
>
> I've had a problem when POSTing non-ASCII UTF-8 characters in JSON to a Seaside REST service. I've located the problem in the method ZnZincServerAdaptor>>requestBodyFor: where the body of the ZnRequest is translated to the body of the WARequest. I use Pharo 1.4 with Seaside 3.0.8 and Zinc-Seaside-SvenVanCaekenberghe.40.
>
> When the POSTed JSON contains non-ASCII UTF-8 characters (Czech characters), they are corrupted. The problem is on the "MARKED" line, where the array of bytes is converted to a string by asString.
>
> "Problematic" code:
>
> ZnZincServerAdaptor>>requestBodyFor: aZincRequest
>     ^ (aZincRequest method ~= #TRACE
>         and: [ aZincRequest hasEntity
>             and: [ aZincRequest entity isEmpty not
>                 and: [ (aZincRequest entity contentType matches: ZnMimeType applicationFormUrlEncoded) not
>                     and: [ (aZincRequest entity contentType matches: ZnMimeType multiPartFormData) not ] ] ] ])
>         ifTrue: [
>             "Seaside wants to do its own text conversions"
>             aZincRequest entity bytes asString "MARKED" ]
>         ifFalse: [
>             String new ]
>
> I did a quick correction, which is not nice, but works for me:
>
> ZnZincServerAdaptor>>requestBodyFor: aZincRequest
>     ^ (aZincRequest method ~= #TRACE
>         and: [ aZincRequest hasEntity
>             and: [ aZincRequest entity isEmpty not
>                 and: [ (aZincRequest entity contentType matches: ZnMimeType applicationFormUrlEncoded) not
>                     and: [ (aZincRequest entity contentType matches: ZnMimeType multiPartFormData) not ] ] ] ])
>         ifTrue: [
>             "Seaside wants to do its own text conversions"
>             ZnUTF8Encoder new decodeBytes: aZincRequest entity bytes "CORRECTED" ]
>         ifFalse: [
>             String new ]
>
> My correction tries to decode the byte array with ZnUTF8Encoder and the result is OK.
>
> Maybe I would recommend using GRPharoUtf8Codec (although I like ZnUTF8Encoder more) or, even better, self codec (self = ZnZincServerAdaptor) to decode the bytes.
>
> Regards,
> Tomas Kukol
> _______________________________________________
> seaside mailing list
> seaside at lists.squeakfoundation.org
> http://lists.squeakfoundation.org/cgi-bin/mailman/listinfo/seaside
--
Sven Van Caekenberghe
http://stfx.eu
Smalltalk is the Red Pill