Some comments:
- As James Robinson said in bug 74034, json response type should not work with sync requests (unless in a worker).
- As Ojan Vafai said in the same bug, we should throw away the source string when we're done to save memory.
- Should the JSON object really be a ScriptValue member in XMLHttpRequest object? Keeping it in the wrapper seems quite weird. Our only instance of such was introduced in bug 34311 for MessageEvent, and that had no rationale or discussion.
- We need to estimate JSON object size to report it via reportExtraMemoryCost() in XMLHttpRequest::dropProtection(). In general, effects on garbage collection will need to be reviewed by a JavaScriptCore expert.
- You changed responseText to work with "json" response type. That's not what current XHR draft calls for:
If responseType is not the empty string or "text", throw an "InvalidStateError"
exception and terminate these steps."
- We need much more test coverage, covering issues like the one above. Basically, any normative text in the spec that mentions responseType or json needs to be covered.
- You copied code in JSXMLHttpRequest::response() that raises an exception. Looking at the spec, accessing this attribute never raises an exception. Am I missing something, or does WebKit not match the spec here? If so, why?

Thanks Alexey - let me address each of your comments inline below. But to preface, I had planned to have this conversation about the spec w/ Annevk and others outside of this bug before anything is reviewed/landed into WebKit.
(In reply to comment #8)
> Some comments:
>
> - As James Robinson said in bug 74034, json response type should not work with sync requests (unless in a worker).
There is a separate bug for this: https://bugs.webkit.org/show_bug.cgi?id=72154 They should all occur in one fell swoop under the umbrella of that bug, and not cross-pollinate into this bug IMHO.
>
> - As Ojan Vafai said in the same bug, we should throw away the source string when we're done to save memory.
> - You changed responseText to work with "json" response type. That's not what current XHR draft calls for:
>
> If responseType is not the empty string or "text", throw an "InvalidStateError"
> exception and terminate these steps."
>
> - You copied code in JSXMLHttpRequest::response() that raises an exception. Looking at the spec, accessing this attribute never raises an exception. Am I missing something, or does WebKit not match the spec here? If so, why?
The attached patch diverges from the spec only because I plan to argue that the spec is not "useful" as it's currently written. I guess I will make that argument here, now:
I believe the plain text (responseText) ought to be still accessible when the type is "json". For legacy reasons, it is accessible for "document" type as well even though that's against the spec. I didn't see mention of that spec inconsistency here; but I agree with its behavior as it is in WebKit.
I write very complex web applications regularly, and I can tell you with conviction that only receiving a "null" response when the JSON is invalid is beyond useless. Often times the reaction, particularly during development/debugging an app, is to show/report the bad data payload. Without having the plain text available, this is impossible. There would be no reason for anyone to actually use the responseType of "json" when they can get the text and JSON.parse it themselves.
With that in mind, re-use of responseText was practical when retrieving the JSON to be parsed. If responseText was off limits, then a new accessor "responseJSONText" or what-have-you would be needed, which does the exact same thing as responseText but checks different responseTypes.
To be consistent with our own implementation, we throw the invalid state error when accessing an attribute incongruent with the set responseType. In fact, we throw exceptions in *a lot* of areas in XHR that are inconsistent with the spec. I brought this up nearly a year ago (https://bugs.webkit.org/show_bug.cgi?id=54162) but it's fallen off your radar it seems. If you do care about being consistent with the spec that much, let's revisit that bug too ;)
>
> - Should the JSON object really be a ScriptValue member in XMLHttpRequest object? Keeping it in the wrapper seems quite weird. Our only instance of such was introduced in bug 34311 for MessageEvent, and that had no rationale or discussion.
Oliver and I discussed this a bit, as did Adam Roben and I. Apparently holding onto a ScriptValue could be problematic for the GC. I'll defer to them on this; but caching in the wrapper does give JSC the opportunity to have it marked in visitChildren which I added in this patch.
On that note of caching, V8 bindings, being purely static, have no real easy way of doing caching in themselves. A cached ScriptValue in XHR would be ideal and work for both JSC and V8; that would lead to a responseJSON accessor, etc. etc. But if there are GC ramifications...
>
> - We need to estimate JSON object size to report it via reportExtraMemoryCost() in XMLHttpRequest::dropProtection(). In general, effects on garbage collection will need to be reviewed by a JavaScriptCore expert.
See above, visitChildren, etc.
>
> - We need much more test coverage, covering issues like the one above. Basically, any normative text in the spec that mentions responseType or json needs to be covered.
Given the above explanation, the new test case covers the issues fairly well. It tests good JSON payload, bad JSON payload, attempts to change the responseType at the wrong time, validates the responseType is set to "json", validates the xhr.responseText + JSON.parse === xhr.response, etc. It closely resembles existing XHR tests.
So I hope I addressed your comments thoroughly enough. Just because XYZ person said so or just because the spec says so doesn't make it the right/best approach. This needs further discussion, clearly.

One more note regarding caching a ScriptValue: the only logical place to perform the VM specific parsing is in the bindings. So XHR would need a "setResponseJSON" method to cache the value. Unlike the other response accessors, which have the data already in XHR, this scenario is not quite the same. So it could be argued that since the bindings parse the value, they ought to cache it too rather than doing a back-and-forth with XHR. That still leaves the issue of V8 bindings + caching though.
To avoid the back-and-forth, I had asked on webkit-dev about creating a parser interface that would allow XHR to perform the parse action, analogous to ScriptValue abstracting values. This still begs the question of GC issues, etc. It would be nice to have, but only if we see it being used in several areas of WebKit...otherwise it would just be over-engineering I think.

(In reply to comment #8)
> Some comments:
>
> - As James Robinson said in bug 74034, json response type should not work with sync requests (unless in a worker).
>
> - As Ojan Vafai said in the same bug, we should throw away the source string when we're done to save memory.
That means xhr.jsonResponse != xhr.jsonResponse -- we don't do this for any other response types and if we consider this to be okay then there no reason for any caching to occur as we're guaranteeing that you can only get the jsontext once -- throwing away the string also prevents different language bindings from being able to use the same xhr.
>
> - Should the JSON object really be a ScriptValue member in XMLHttpRequest object? Keeping it in the wrapper seems quite weird. Our only instance of such was introduced in bug 34311 for MessageEvent, and that had no rationale or discussion.
Keeping it as a ScriptValue would break GC as it leads to a hard cycle, it also doesn't work with any other bindings, so accessing an xhr from the objc or object bindings wouldn't produce a useful value.
> - We need to estimate JSON object size to report it via reportExtraMemoryCost() in XMLHttpRequest::dropProtection(). In general, effects on garbage collection will need to be reviewed by a JavaScriptCore expert.
The JSON Object is a raw JS object, we already know how much it costs -- reportExtraMemoryCost should only be used if you have made a JS object that is keeping a large amount of non-JS data live.

> But to preface, I had planned to have this conversation about the spec w/ Annevk and others outside of this bug before anything is reviewed/landed into WebKit.
That makes good sense. The patch was marked for review yesterday, so this wasn't clear.
> There is a separate bug for this: https://bugs.webkit.org/show_bug.cgi?id=72154 They should all occur in one fell swoop under the umbrella of that bug, and not cross-pollinate into this bug IMHO.
I'm not sure if I agree. Vendors should preferably not ship this new feature enabled for sync requests, and one cannot be sure that no port will pick up ToT and ship in intermediate state. Perhaps this bug should be blocked by bug 72154, so that we could figure out the details of disabling response types separately indeed.
Much less desirably, this could be landed under an ENABLE flag.
> There would be no reason for anyone to actually use the responseType of "json" when they can get the text and JSON.parse it themselves.
Is there any reason for this responseType existence, other than some syntactic sugar?
Using XHR to both retrieve and parse responses is a point of contention. I don't know what the generally accepted thinking about this is now (and if there is such), but it's probably targeted at most simple use cases, where you don't care about errors. Even for XML, which is the most established of formats parsed by XHR, error reporting via responseXML is fragile at best.
> With that in mind, re-use of responseText was practical when retrieving the JSON to be parsed. If responseText was off limits, then a new accessor "responseJSONText" or what-have-you would be needed, which does the exact same thing as responseText but checks different responseTypes.
Another way to resolve this would be to keep (and provide) text response when parsing fails only. Public-webapps is of course the place to discuss this.
> So I hope I addressed your comments thoroughly enough.
Yes, thank you.
> - As Ojan Vafai said in the same bug, we should throw away the source string when we're done to save memory.> throwing away the string also prevents different language bindings from being able to use the same xhr.
What scenario do you foresee for using XMLHttpRequest object properties from different bindings (or even from several JS worlds)? Is that even possible today?

> That makes good sense. The patch was marked for review yesterday, so this wasn't clear.
Yeah, that's only because I'm lazy and was using `webkit-patch upload` which automatically marks the patch for review ;)
> > There is a separate bug for this: https://bugs.webkit.org/show_bug.cgi?id=72154 They should all occur in one fell swoop under the umbrella of that bug, and not cross-pollinate into this bug IMHO.
>
> I'm not sure if I agree. Vendors should preferably not ship this new feature enabled for sync requests, and one cannot be sure that no port will pick up ToT and ship in intermediate state. Perhaps this bug should be blocked by bug 72154, so that we could figure out the details of disabling response types separately indeed.
This sounds fine to me. I wholeheartedly agree with limiting the new response types to async requests.
> Much less desirably, this could be landed under an ENABLE flag.
We have some things to talk about before this is landed, so there's no hurry. Let's avoid the guard.
> > There would be no reason for anyone to actually use the responseType of "json" when they can get the text and JSON.parse it themselves.
>
> Is there any reason for this responseType existence, other than some syntactic sugar?
Well yeah, it allows XHR to be polymorphic in how it buffers, decodes, and delivers the data through one property (.response). I think of it as a lazy way to reuse one object to perform more than one type of data request.
I'll add the block from 72154 and we'll continue the discussion re: responseText on public-webapps. Relevant public-webapps thread: http://lists.w3.org/Archives/Public/public-webapps/2011OctDec/1440.html In the meantime I will get chromium working properly and passing the tests. All other JSC-based ports are rocking & rolling though with the current attached implementation.

Given that Firefox and soon Opera will be strict on responseType behavior WRT access to responseText, the consensus seems to be that the spec will stay put and we should follow suit.
There are a lot of spec inconsistencies in XHR that I plan to to correct, one small piece at a time, with a deliverable of an XHR implementation that's completely to-spec. This bug will be under that umbrella to fix it all up.

(In reply to comment #11)
> (In reply to comment #8)
> > - As Ojan Vafai said in the same bug, we should throw away the source string when we're done to save memory.
>
> That means xhr.jsonResponse != xhr.jsonResponse -- we don't do this for any other response types and if we consider this to be okay then there no reason for any caching to occur as we're guaranteeing that you can only get the jsontext once -- throwing away the string also prevents different language bindings from being able to use the same xhr.
I'm not sure I understand what you're saying. As it's currently specc'ed there is no way to get the raw json-text. You can only get the parsed json. For response types other than "text" and "", responseText and responseXML both throw, so there's no need to keep the source string if it's not needed by "response".
I don't see why we'd want other language bindings to behave differently from the JS bindings.
I like the idea of exposing responseText in the case where json parsing fails though. As Alexey said, dealing with the cases of invalid json should probably be brought up on public-webapps.

(In reply to comment #15)
> (In reply to comment #11)
> > (In reply to comment #8)
> > > - As Ojan Vafai said in the same bug, we should throw away the source string when we're done to save memory.
> >
> > That means xhr.jsonResponse != xhr.jsonResponse -- we don't do this for any other response types and if we consider this to be okay then there no reason for any caching to occur as we're guaranteeing that you can only get the jsontext once -- throwing away the string also prevents different language bindings from being able to use the same xhr.
>
> I'm not sure I understand what you're saying. As it's currently specc'ed there is no way to get the raw json-text. You can only get the parsed json. For response types other than "text" and "", responseText and responseXML both throw, so there's no need to keep the source string if it's not needed by "response".
>
> I don't see why we'd want other language bindings to behave differently from the JS bindings.
We wouldn't have bindings behave differently -- other language bindings would be expected to cache in their own binding, just as with JS. If the string is thrown away only one binding can ever use the json response (remembering the what json deserialises to is binding dependent). There is a possibility that the first call to jsonResponse parse the json to an object graph, convert that to SerializedScriptValue, and the bindings report the type of jsonResponse as SerializedScriptValue. That would have appropriate caching in both V8 and JSC bindings (unsure about others), and would allow the impl class to throw away the source string. Though of course at that point you're rather hoping that the source rep is smaller than the serialized representation.
>
> I like the idea of exposing responseText in the case where json parsing fails though. As Alexey said, dealing with the cases of invalid json should probably be brought up on public-webapps.
Agreed.

I didn't have a chance to look at the patch yet, but removing "use namespace JSC" is the wrong solution for build issues. If we didn't control JavaScriptCore, we could just put a prefix in this specific place, not everywhere.
But we control JavaScriptCore, so we can fix it to avoid namespace collisions.
It's also possible that one should never include both JSValueRef.h and JSType.h in one file for some reason (that I can't think of).
Please talks to folks who added these enums to choose the best solution.

(In reply to comment #22)
> I didn't have a chance to look at the patch yet, but removing "use namespace JSC" is the wrong solution for build issues.
That doesn't make much sense, considering it's 1) a minor change, 2) in one isolated custom binding file, and 3) our generated binding files don't put "use namespace JSC;" If it's so wrong, why aren't we generating our code as such?
> If we didn't control JavaScriptCore, we could just put a prefix in this specific place, not everywhere.
>
> But we control JavaScriptCore, so we can fix it to avoid namespace collisions.
Well we can't change the public API for binary compatibility, and it's completely rash to change JSC::JSType. I don't think either is an option, let alone a good option.
>
> It's also possible that one should never include both JSValueRef.h and JSType.h in one file for some reason (that I can't think of).
JavaScriptCore/runtime/Error.h eventually includes JSType.h. I can't exactly say why Cygwin complains but the other compilers (including my local clang) do not. I'd of expected them all to TBH.
With that said, this is obviously a mix of JSC Public C API with its C++ API. I can investigate if there's an equivalent C++ method to parse the JSON text (last I checked, there wasn't a strict JSON parser exported, just the regular ol' JS interpreter). If anyone has that answer off the bat, please do share :)
But considering we are consuming both the C and C++ APIs, I would argue that the "rule of thumb" doesn't apply here and it's completely legit to require fully qualifying the JSC namespace the way I am.

(In reply to comment #26)
> We use "using" in manually written .cpp files. Doing a wrong thing in one isolated file doesn't make it right.
We as in Apple? Or JSC (binding) maintainers? V8 manual files don't use using statements.

Comment on attachment 121109[details]
Patch
View in context: https://bugs.webkit.org/attachment.cgi?id=121109&action=review> Source/JavaScriptCore/runtime/JSONObject.cpp:870
> +EncodedJSValue JSONParse(ExecState* exec, UString json)
This should be "const UString&".
> Source/WebCore/bindings/js/JSXMLHttpRequestCustom.cpp:168
> + if (JSValue cachedValue = m_response.get())
> + return cachedValue;
A more conventional way to write this would be
if (m_response)
return m_response.get();
Perhaps you wrote it differently to hint at the fact that m_response is not actually m_response, but m_cachedJSONResponse? It's unfortunate that code generator makes you use a misleading name.
> Source/WebCore/bindings/js/JSXMLHttpRequestCustom.cpp:227
> + const_cast<JSXMLHttpRequest*>(this)->m_response.set(exec->globalData(), this, value);
You could make m_response mutable instead of casting.
> Source/WebCore/bindings/js/JSXMLHttpRequestCustom.cpp:231
> + // Once we have parsed the response text and have a cached result, we can clear the
> + // response text buffer to conserve memory.
> + impl()->clearResponseTextBuffer();
It's a shame that we have to add so much logic to bindings, and even add multiple accessors just to make that possible. This is against the general direction of abstracting out bindings differences and having logic in DOM code.
> Source/WebCore/xml/XMLHttpRequest.cpp:293
> +String XMLHttpRequest::responseJSONText(ExceptionCode& ec)
As mentioned above, it's unfortunate to do things so much differently in this single case. We don't have responseHTMLDocumentText or responseArrayBufferText.
> LayoutTests/ChangeLog:14
> + * fast/xmlhttprequest/resources/xmlhttprequest-responsetype-json.json: Added.
> + * fast/xmlhttprequest/xmlhttprequest-responsetype-json-invalid-expected.txt: Added.
> + * fast/xmlhttprequest/xmlhttprequest-responsetype-json-invalid.html: Added.
> + * fast/xmlhttprequest/xmlhttprequest-responsetype-json-valid-expected.txt: Added.
> + * fast/xmlhttprequest/xmlhttprequest-responsetype-json-valid.html: Added.
You rightfully disobey the spec requirement that UTF-8 should be used regardless of what the server specifies. I'd have added a regression test for that, too.
Payload with a null byte somewhere should ideally be tested, too.
> LayoutTests/fast/xmlhttprequest/xmlhttprequest-responsetype-json-valid.html:18
> + shouldBe('xhrJson.response', 'xhrJson.response');
This doesn't seem to be what the spec says now, even though it's what it should say. A more dramatic way to test for cachedness would be:
xhrJson.response["foo"] = "bar";
shouldBe("xhrJson.response['foo']", "'bar'");
This is how it works for responseXML in Safari and Firefox at least.

(In reply to comment #33)
> (From update of attachment 121109[details])
> View in context: https://bugs.webkit.org/attachment.cgi?id=121109&action=review
>
> > Source/JavaScriptCore/runtime/JSONObject.cpp:870
> > +EncodedJSValue JSONParse(ExecState* exec, UString json)
>
> This should be "const UString&".
Truuuue
>
> > Source/WebCore/bindings/js/JSXMLHttpRequestCustom.cpp:168
> > + if (JSValue cachedValue = m_response.get())
> > + return cachedValue;
>
> A more conventional way to write this would be
>
> if (m_response)
> return m_response.get();
>
> Perhaps you wrote it differently to hint at the fact that m_response is not actually m_response, but m_cachedJSONResponse? It's unfortunate that code generator makes you use a misleading name.
Yeah this is sort of "self documentation" particularly because of the generated member var name. It's also in line with what the CachedAttribute will generate for non custom accessors.
>
> > Source/WebCore/bindings/js/JSXMLHttpRequestCustom.cpp:227
> > + const_cast<JSXMLHttpRequest*>(this)->m_response.set(exec->globalData(), this, value);
>
> You could make m_response mutable instead of casting.
Since m_response is generated, that would require a change to the generator script which sounds fine to me; a cache member var would imply it would be mutated in the accessor even though the accessor is generated as const.
>
> > Source/WebCore/bindings/js/JSXMLHttpRequestCustom.cpp:231
> > + // Once we have parsed the response text and have a cached result, we can clear the
> > + // response text buffer to conserve memory.
> > + impl()->clearResponseTextBuffer();
>
> It's a shame that we have to add so much logic to bindings, and even add multiple accessors just to make that possible. This is against the general direction of abstracting out bindings differences and having logic in DOM code.
I know, I feel like this is such an abomination to the spirit of pure abstraction. I would say that this is an unusual scenario though given that separate VMs are responsible for performing the parsing in the correct context - doing that in the wrapper and not the native object makes sense. But then on top of that, we want the native object to manage itself efficiently only when the said parsing has taken place - which involves a call back.
These are pretty rare situations, but I'd love to explore a better paradigm for handling them.
>
> > Source/WebCore/xml/XMLHttpRequest.cpp:293
> > +String XMLHttpRequest::responseJSONText(ExceptionCode& ec)
>
> As mentioned above, it's unfortunate to do things so much differently in this single case. We don't have responseHTMLDocumentText or responseArrayBufferText.
No we don't, but that's because they are dealing with strong typed DOM objects. If we were caching serialized script values in XHR then we'd just have a resposneJSON instead of responseJSONText...but that alone wouldn't eliminate the boundary issues (parsing takes place in the binding).
If we try to look objectively at the what these accessors (responseArrayBuffer, responseJSONText, responseXML, responseText, responseBlob) are doing, they are 1) returning and/or processing the payload for XYZ responseType, and 2) are validating the state of the XHR instance and setting or throwing exceptions when appropriate. For JSON, the payload happens to be the JSON text, pure and simple. If it were only named responseJSON, I'd fear that would cause confusion and lead one to think it's returning a literal object.
>
> > LayoutTests/ChangeLog:14
> > + * fast/xmlhttprequest/resources/xmlhttprequest-responsetype-json.json: Added.
> > + * fast/xmlhttprequest/xmlhttprequest-responsetype-json-invalid-expected.txt: Added.
> > + * fast/xmlhttprequest/xmlhttprequest-responsetype-json-invalid.html: Added.
> > + * fast/xmlhttprequest/xmlhttprequest-responsetype-json-valid-expected.txt: Added.
> > + * fast/xmlhttprequest/xmlhttprequest-responsetype-json-valid.html: Added.
>
> You rightfully disobey the spec requirement that UTF-8 should be used regardless of what the server specifies. I'd have added a regression test for that, too.
>
> Payload with a null byte somewhere should ideally be tested, too.
That's a good idea.
>
> > LayoutTests/fast/xmlhttprequest/xmlhttprequest-responsetype-json-valid.html:18
> > + shouldBe('xhrJson.response', 'xhrJson.response');
>
> This doesn't seem to be what the spec says now, even though it's what it should say. A more dramatic way to test for cachedness would be:
>
> xhrJson.response["foo"] = "bar";
> shouldBe("xhrJson.response['foo']", "'bar'");
>
> This is how it works for responseXML in Safari and Firefox at least.
Sure that can be an additional concrete test. True the spec doesn't call for caching, but motivation for Mozilla to push for a "json" responseType was to be able to throw away the response text/payload for memory efficiency...so by not caching the obj, that primary objective is not at all achievable. Perhaps you're right in that the spec should add a quick box of text recommending this to UAs.

(In reply to comment #37)
> If you're going to disobey spec requirements, I sure hope you've discussed that on public-webapps.
The spec asks that the text be decoded using utf8, which we are by default (even if the server specifies a different charset in the MIME). But if the payload is a different Unicode encoding, i.e., utf16 or utf32, we'll properly detect the BOM and decode it gracefully. But any other payload, particularly one without a BOM, the text is decoded as utf8.
The exception to that is if the user specifies a charset when calling overrideMIMEType(), which will override the utf8 default. We still will detect a BOM and seamlessly change the codec, but if there is no BOM the user's charset will be used.
The spec further mentions to remove/discard a utf16 BOM 0xfeff (implying network order), which we will do, but we also will switch to the utf16 codec at that time. I think there was a conversation on public-webapps about encoding and I'll it up and/or follow up regarding this, but if we can handle utf16 and utf32 we might as well keep that feature around in the meantime.

(In reply to comment #38)
> The exception to that is if the user specifies a charset when calling overrideMIMEType()
I left out accidentally (thought it would be implied), but of course the server's content-type w/ a charset applies in the same way here.

Created attachment 210305[details]
Archive of layout-test-results from webkit-ews-11 for mac-mountainlion-wk2
The attached test failures were seen while running run-webkit-tests on the mac-wk2-ews.
Bot: webkit-ews-11 Port: mac-mountainlion-wk2 Platform: Mac OS X 10.8.4

Created attachment 210306[details]
Archive of layout-test-results from webkit-ews-02 for mac-mountainlion
The attached test failures were seen while running run-webkit-tests on the mac-ews.
Bot: webkit-ews-02 Port: mac-mountainlion Platform: Mac OS X 10.8.4