A simplification. The current model of who owns JSScripts makes it a pain for the debugger to hold references to them.
#ifdef DEBUG_jorendorff
It also seems error-prone to me. I don't know why this code is safe:
script = Compiler::compileScript(...);
if (!script)
return NULL;
JSObject *scriptObj = js_NewScriptObject(cx, script);
js_NewScriptObject can perform last-ditch GC. Atoms are never collected
during a last-ditch GC, but what about proto-functions and scope objects
in the script?
#endif
Currently, JS_Evaluate*Script* and eval directly call js_DestroyScript. What if we dropped that? All the atoms, scope objects, functions, function scripts, and other things allocated by compilation are garbage anyway. If necessary, we can still blow away any jit code held by the script as we exit the frame.

(In reply to comment #0)
> A simplification. The current model of who owns JSScripts makes it a pain
> for the debugger to hold references to them.
So what about always allocating a JSScript object and moving most fields into it for faster access, with the variable-size bytecode and other arrays allocated on the heap?

(In reply to comment #2)
> If you make it finalizable in the background there shouldn't be any
> additional GC pause time regression.
Finalizing scripts in the background would be nice, as it would also allow finalizing functions in the background, but there is a lot of code in DestroyScript that is single-threaded, like those jit-related release methods. Some of it, I guess, can be replaced with a sweeping loop over all existing scripts that releases things referring to soon-to-be-finalized scripts, but that requires extra complexity in the code. So let's turn JSScript into a GC thing first and then see how it affects the GC pause.

(In reply to comment #3)
> (In reply to comment #2)
> > If you make it finalizable in the background there shouldn't be any
> > additional GC pause time regression.
>
> Finalizing scripts in the background would be nice, as it would also allow
> finalizing functions in the background, but there is a lot of code in
> DestroyScript that is single-threaded, like those jit-related release
> methods. Some of it, I guess, can be replaced with a sweeping loop over all
> existing scripts that releases things referring to soon-to-be-finalized
> scripts, but that requires extra complexity in the code. So let's turn
> JSScript into a GC thing first and then see how it affects the GC pause.
With type inference enabled, all method JIT code is now released on every GC, so DestroyScript should not have to consider issues with releasing memory from executable allocators. This does not hold with TI disabled, though (to do this we need to be able to throw all active frames into the interpreter during GC, which we can't do with the base method JIT since it does not fully sync the stack).
This bug would help TI a lot: the fact that DestroyScript can currently be called on scripts outside of GC means we can't analyze types in such scripts, and can't compile them (type constraints generated on a script will refer to that script's contents, and cannot be freed until the next GC).

Just a note about the DEBUG_jorendorff aside:
js_NewScriptObject creates an AutoScriptRooter on the script you pass in. So it looks like this is actually non-buggy. However, we clearly need a less hacky way to deal with this because it's quite bug-prone.
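For illustration, the RAII rooting pattern being described could be sketched like this; JSScript, rootedScripts, and markRoots here are hypothetical stand-ins, not the real SpiderMonkey definitions:

```cpp
#include <cassert>
#include <vector>

// Illustrative stand-in for the engine's script type.
struct JSScript { bool marked = false; };

// Stands in for the context's root list that the GC consults.
static std::vector<JSScript *> rootedScripts;

class AutoScriptRooter {
    JSScript *script;
  public:
    explicit AutoScriptRooter(JSScript *s) : script(s) {
        rootedScripts.push_back(script);  // register as a GC root
    }
    ~AutoScriptRooter() {
        rootedScripts.pop_back();         // unregister on scope exit
    }
};

// A last-ditch GC marks everything reachable from the root list, so a
// script kept alive by an AutoScriptRooter survives the collection.
void markRoots() {
    for (JSScript *s : rootedScripts)
        s->marked = true;
}
```

The "hacky" part is that the caller must remember to construct the rooter at every call site; a GC-managed JSScript makes the conservative stack scanner do this automatically.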

Created attachment 551541
v1
The patch implements the idea of splitting JSScript into a fixed-size JSScript part allocated on the GC heap and a variable-length part allocated using malloc. When we are able to allocate variable-length GC things, this split should be reconsidered, but for now it solves the rooting issues, as the embedding can use the normal rooting API and rely on the conservative stack scanner. It also allows eliminating the JSObject wrappers for scripts that are currently created to root them, but that is for another bug as it would require changing the JS API.
With the patch I preserved calling the script destroy hooks when it is known that the script is no longer used. Is this necessary in view of the debugger changes?

(In reply to Jason Orendorff [:jorendorff] from comment #10)
> Igor, I have a bunch of related work in bug 665167 that's partly reviewed.
> Could you take a look?
Well, with this patch we could store eval scripts in the weak hash directly and there would be no need to create any script objects.

Bill, don't wait. This "conflicts" with jsdbg2, but in a good way: I think the merge will involve simplifying the Debugger.Script stuff in jsdbg2 considerably.
I'll finish reviewing this tomorrow. I'm only about 30% done now.

Comment on attachment 551757
v2
Review of attachment 551757:
-----------------------------------------------------------------
This looks great to me! It looks like sometimes you need to call the DestroyScriptHook eagerly, and other times you wait until finalization. Hopefully Jason understands that better than me; I didn't check it too carefully.
::: js/src/jsfun.cpp
@@ +1692,3 @@
> */
> + if (fun != obj && fun->isFlatClosure())
> + Foreground::free_(obj->getFlatClosureUpvarsOrNull());
getFlatClosureUpvarsOrNull seems kinda hacky to me. Are there any bits left in JSFunction::flags that we could use to store whether there are upvars?
Either way, I would prefer to have a hasFlatClosureUpvars test rather than the OrNull thing. I think it might make the isDouble test (or the flag check) a little easier to understand.
::: js/src/jsgcmark.cpp
@@ +839,5 @@
> + JSConstArray *constarray = script->consts();
> + MarkValueRange(trc, constarray->length, constarray->vector, "consts");
> + }
> +
> + if (script->u.object) {
Maybe put a comment here about the eval cache being cleared first, so that you never get the evalCacheLink member out.
::: js/src/jsparse.cpp
@@ +180,5 @@
> pn_arity = PN_NULLARY;
> pn_parens = false;
> }
>
> +Parser::Parser(JSContext *cx, StackFrame *cfp, bool foldConstants)
I don't understand the principals changes. Are they related to this patch, or just a cleanup?
::: js/src/jsscript.cpp
@@ +1333,1 @@
> gsnCache->purge();
We already purge all GSN caches earlier in the GC (through ThreadData::purge), so we should be able to remove this.

Created attachment 552334
v3
The new version addresses the comments above (including removal of the unrelated principals changes) and fixes one corner case that I missed when enabling background finalization of closures and JSFunction instances.
Currently fun_finalize uses getFunctionPrivate() to access the JSFunction * referenced by the closure. But that implies that the closure must always be finalized before JSFunction instances. This is ensured via the finalization ordering in the GC.
The issue with the background finalization of functions is that currently we add arena lists to be finalized using the fallible Vector::append and do immediate finalization if that fails. If that happens when we try to add the JSFunction arenas, we will finalize them before the closures.
The patch fixes that by reserving space in the background finalization vector so the code can use Vector::infallibleAppend, ensuring the finalization order. JSObject::finalizeUpvarsIfFlatClosure(), which replaced getFlatClosureUpvarsOrNull() from the previous patch, contains detailed comments about this.
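The reserve-then-infallibleAppend ordering could be sketched roughly as follows, using std::vector in place of js::Vector; the function and type names are illustrative, not the patch's actual code:

```cpp
#include <cassert>
#include <vector>

// Illustrative stand-in for an arena list queued for background finalization.
struct ArenaList { int kind; };

bool queueForBackgroundFinalization(std::vector<ArenaList> &queue,
                                    const ArenaList &objects,
                                    const ArenaList &functions) {
    // Reserve space for both appends up front: either both succeed or
    // neither happens, so JSFunction arenas can never end up finalized
    // before the closures that reference them.
    try {
        queue.reserve(queue.size() + 2);
    } catch (...) {
        return false;  // caller falls back to immediate, ordered finalization
    }
    queue.push_back(objects);    // now effectively infallible
    queue.push_back(functions);  // queue order preserves finalization order
    return true;
}
```

The point is that fallibility is hoisted before the first append, so a mid-sequence failure can never leave the queue with only the later, order-sensitive entries.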

Thanks for the numbers! Looks great!
Did you disable the Arena::finalize(JSContext *cx) memset instrumentation when you collected the numbers?
if (!newFreeSpanStart)
    newFreeSpanStart = thing;
t->finalize(cx);
memset(t, JS_FREE_PATTERN, sizeof(T));
The sweeping cost is heavily influenced by this memset (at least on Mac).
I remember that the GC triggers are very tight so that we don't perform any GC during sunspider. I don't think this patch will change it but we should check it at least.

(In reply to Gregor Wagner from comment #22)
> Did you disable the Arena::finalize(JSContext *cx) memset instrumentation
> when you collected the numbers?
>
> if (!newFreeSpanStart)
>     newFreeSpanStart = thing;
> t->finalize(cx);
> memset(t, JS_FREE_PATTERN, sizeof(T));
No, I kept that as is.
> I remember that the GC triggers are very tight so that we don't perform any
> GC during sunspider. I don't think this patch will change it but we should
> check it at least.
I see no differences in SS.

Created attachment 553008
v4
The new version is a rebase on top of the jsdbg2 changes. The patch does not touch the code that creates JSScript object holders. That is for bug 678830.
The ugliest part of the merge was the need to add gcpad and gcpad2 to JSScript to ensure that sizeof(JSScript) is a multiple of 8 bytes on 32-bit platforms. Initially I tried to hide that with some helper classes, but those add a few lines of code just to hide a memory layout inefficiency.

Regarding the memset to 0xdb, I think it makes sense to poison script->data eventually (using the new JS_POISON). For now, I'd rather we just poison the script itself, since it relates to some stuff I'm trying to figure out in bug 670072.

(In reply to Jason Orendorff [:jorendorff] from comment #27)
> This patch continues calling the destroyScript hook in the exact same
> places it was being called before, except in some of the error paths.
> That's a good decision, I think, but do you know of any reason why we
> couldn't later change it, so we just fire the event when the script
> actually gets destroyed?
I wrote it this way to keep the eval cache transparent to the debugger, as it is now. I do not know the reasons why we did it that way in the first place.
> The GC will blit JS_FREE_PATTERN over the JSScript itself, right? If
> not, that's a showstopper for me.
Yes, the finalization code uses JS_POISON after calling the finalizer.

Graph confirms this patch was the culprit. I'm surprised it affects only Mac though.
If you meet up and decide the regression is ignorable, please say so when pushing next time, so we know it should be ignored.

Given that the regression is Mac-specific, where we do not use jemalloc, a possible explanation would be the increased number of small allocations for script bytecode. Previously the bytecode was allocated together with the bigger script, but with the patch the allocations are done separately. So the patch replaced a single allocation of 220-or-so bytes of script plus bytecode memory with just the bytecode memory, which may be allocated in a different malloc zone and interfere with TDHTML.
To check, I will try a patch with a small inline buffer for the bytecode.

I was doing some measurements yesterday and saw that the most common script-data size was 2 bytes; IIRC roughly 25% of all scripts had that. jemalloc is able to satisfy a 2-byte request without rounding up (likewise for 4-byte requests), but the Mac allocator rounds up to 8 or 16 (not sure which).
In your patch it looks like you are adding 40 or 44 bytes of inline storage; I think that's much more than necessary. Some more detailed measurement would be informative.
An alternative to inline storage would be to work out why 2 bytes is such a common case, and see if it can be reduced to zero.

(In reply to Nicholas Nethercote [:njn] from comment #35)
> I was doing some measurements yesterday and saw that the most common
> script-data size was 2 bytes; IIRC roughly 25% of all scripts had that.
We create an empty script per compartment for each Function.prototype instance, that is, once per global. As the script bytecode must be mutable to allow debugging it, these empty scripts cannot be shared. I am surprised that this is that common.
> jemalloc is able to satisfy a 2-byte request without rounding up (likewise
> for 4-byte requests), but the Mac allocator rounds up to 8 or 16 (not sure
> which).
The inline data storage has not helped the benchmarks. Another theory of mine is that on Mac small allocations may not be 8-byte aligned unless the allocation size itself is a multiple of 8. That could lead to misaligned access of jsvals stored in the script data.
> In your patch it looks like you are adding 40 or 44 bytes of inline storage,
> I think that's much more than necessary. Some more detailed measurement
> would be informative.
Well, the JSScript size is 224 bytes on 64-bit without the inline buffer, and on 32-bit it is around 140. So 40 bytes is not that big by comparison. On the other hand, it allows fitting many small scripts like return Math.sin(this.x). So I chose that size to investigate the performance implications.

> Well, the JSScript size is 224 bytes on 64 bit without the inline buffer and
> on 32 bit it is around 140. So 40 bytes is not that big given. On another
> hand it allows to fit may small scripts like return Math.sin(this.x). So I
> have chosen it to investigate the performance implication.
Well, measurements would be good. We get a *lot* of JSScripts in practice.

To test the number of scripts and their data sizes, I patched NewScript in jsscript.cpp to print the script size and then ran http://gregor-wagner.com/tmp/mem, which opens 150+ new tabs. From browser startup until that test finished loading, about 400,000 scripts were created and only about 9,300 were empty. Moreover, just 3.6% of scripts had a size not exceeding 8 bytes, and the ratio of scripts exceeding 64 bytes was 70%.
As such, an extra inline buffer that goes unused when the allocation cannot fit in it just wastes memory, unless it significantly offsets fragmentation. As far as I can see, memory allocation is dominated by other GC things, so I assume that effect is negligible.
This is sort of confirmed by the try server runs, where different inline sizes and data alignments have no influence on the benchmarks. Which still leaves open the question of what exactly causes the regression on Mac.
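The size bucketing behind those percentages could look roughly like the following sketch; the function and bucket names are illustrative, not the actual instrumentation (which simply printed raw sizes from NewScript):

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Counts how many scripts fall into each data-size bucket; the bucket
// boundaries match the ones discussed in the measurements above.
std::map<std::string, std::size_t> bucketCounts;

void recordScriptDataSize(std::size_t size) {
    const char *bucket = size == 0 ? "empty"
                       : size <= 8 ? "<=8 bytes"
                       : size <= 64 ? "<=64 bytes"
                       : ">64 bytes";
    ++bucketCounts[bucket];
}
```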

Created attachment 554635
inline buffer v2
This patch uses the extra 4-byte pad that is necessary on 32-bit to make sizeof(JSScript) a multiple of gc::Cell::CellSize to store empty and similarly small scripts inline. It also reorganizes the JSScript fields into a more efficient memory layout on 64-bit CPUs. That shrinks sizeof(JSScript) on those platforms from 200 down to 184 bytes.
Another change is that the const jsval array is now placed at the start of the memory allocation for the data array. That eliminates the need for the extra padding that is necessary without the patch.
As I wrote in the previous comment, this still shows the same 5% TDHTML regression on OS X. Does the simplification of the debugger and script management that this bug provides justify that platform-specific regression? Or should more effort be spent to avoid it?

From the try server run I noticed that the regression became slightly smaller on OS X 64-bit with the patch from comment 39. As that patch reduces the size of JSScript on 64-bit, it means that for some reason TDHTML on OS X is very sensitive to the JS heap size increase from JSScript allocations moving from the malloc heap.
The tests are structured like:
function f() {
    do_something();
    setTimeout("f()", 0);
}
This implies that a new function is compiled on each iteration, so the test creates a number of JSScript instances corresponding to those functions. So a possible explanation for the regression is an extra GC that this triggers. The reason this has not been observed on other OSes could be that the JS heap size before the tests and the GC heuristics depend on the OS and CPU speed.

(In reply to Igor Bukanov from comment #41)
> So a possible explanation for the regression is an extra GC that this triggers.
> The reason why this has not been observed on other OSes could be that the
> JS heap size before the tests and the GC heuristics depend on the OS and CPU
> speed.
That sounds plausible. How sure are you that the inline buffer will help? Do you still want review for that? Or do you plan to retune the GC heuristics (or try some other tack) instead?

(In reply to Jason Orendorff [:jorendorff] from comment #44)
> That sounds plausible. How sure are you that the inline buffer will help?
According to the try server the inline buffer has not helped.
> Do
> you still want review for that?
Yes - the changes save 4 bytes on average from the malloc data allocated for the script, minimize the JSScript size on 64-bit CPUs, and avoid the need to allocate the inline buffer for empty scripts on 32-bit by using the pad word that is necessary to make sizeof(JSScript) divisible by 8. This makes the regression slightly smaller.
> Or do you plan to retune the GC heuristics
> (or try some other tack) instead?
I plan to retune the GC after bug 665007; for now my proposal is to accept the regression.

(In reply to Igor Bukanov from comment #45)
> I plan to retune the GC after the bug 665007, for now my proposal is to
> accept the regression.
If you can confirm that the cause of regression is a greater number of GC cycles on Mac, that sounds fine to me.

(In reply to Jason Orendorff [:jorendorff] from comment #47)
> If you can confirm that the cause of regression is a greater number of GC
> cycles on Mac, that sounds fine to me.
I can see that the benchmark is clearly influenced by the GC params. When I made the last-ditch GC run less often, the regression *increased* by another 4-5%. But I could not remove the regression. When I tried to make the GC run more frequently, the benchmark time dropped by 0.5%, but that could be due to noise.
Does it sound like a confirmation?

(In reply to Igor Bukanov from comment #50)
> (In reply to Jason Orendorff [:jorendorff] from comment #47)
> > If you can confirm that the cause of regression is a greater number of GC
> > cycles on Mac, that sounds fine to me.
>
> I can see that the benchmark is clearly influenced by the GC params. When I
> made the last-ditch GC run less often, the regression *increased* by
> another 4-5%. But I could not remove the regression. When I tried to make
> the GC run more frequently, the benchmark time dropped by 0.5%, but that
> could be due to noise.
>
> Does it sound like a confirmation?
Well, no. There's no data there about the number of GC cycles. And the results are somewhat unexpected! I don't know what to think about that, except that it doesn't seem consistent with the proposition that we can make up the regression by tuning GC heuristics. It certainly doesn't support the hypothesis in comment 41:
> So a possible explanation for the regression is an extra GC that this triggers.
> The reason why this has not been observed on other OSes could be that the JS
> heap size before the tests and the GC heuristics depend on the OS and CPU speed.
The more I think about that, the more skeptical I am, but I could still be convinced. If that hypothesis is correct, then the number of GC cycles that occur during this benchmark should look something like this:
         m-c tip   m-c tip plus the patch
Linux       4                4               No extra GCs, no regression
Mac         4                6               Extra GCs caused the regression
If you run with MOZ_GCTIMER, you could also confirm that the absolute amount of the regression is very close to the duration of the extra GC cycles. That would be strong confirmation. (You could omit the Linux runs, and I might still be convinced if the numbers came out right.)
An alternative would be to run the benchmark, with and without the patch, in a Shark-enabled build and look at the profiles.
I want this patch in, but a 5% regression is a big deal. We can't blow it off.

Comment on attachment 554635
inline buffer v2
r=me as far as the code is concerned, but of course the regression still needs to be fixed.
In jsscript.cpp, JSScript::NewScript:
>+ * To esnure jsval alignment for the const array we calculate the size
Oh, the typo in this patch. Sorry, I thought I was looking at the first patch.
>+ /* Constants must go before the data pointer to ensure jsval alignment. */
>+ if (nconsts != 0) {
>+ JS_ASSERT(reinterpret_cast<jsuword>(data) % sizeof(jsval) == 0);
>+ script->consts()->length = nconsts;
>+ script->consts()->vector = reinterpret_cast<Value *>(data - constSize);
>+ }
I don't understand. This seems like just as big a hack as constPadding, maybe
even more egregious. What was wrong with constPadding? Could we just put the
constant array first?
The changes to Snarf in js.cpp seem unrelated.

(In reply to Jason Orendorff [:jorendorff] from comment #53)
> >+ /* Constants must go before the data pointer to ensure jsval alignment. */
> >+ if (nconsts != 0) {
> >+ JS_ASSERT(reinterpret_cast<jsuword>(data) % sizeof(jsval) == 0);
> >+ script->consts()->length = nconsts;
> >+ script->consts()->vector = reinterpret_cast<Value *>(data - constSize);
> >+ }
>
> I don't understand. This seems like just as big a hack as constPadding, maybe
> even more egregious. What was wrong with constPadding?
constPadding leads to losing 4 bytes on average per JSScript instance with const values. That was a consequence of deviating from the initial layout of the script, where the arrays and structures with bigger alignment requirements came first, eliminating any need for padding. The patch restores that.
>Could we just put the constant array first?
Our offsets to the various JSObjectArray-like structures must stay within 254 bytes of the data pointer. Putting the const array in front can break that. To avoid that, the patch makes the data pointer point into the middle of the allocation, where those structures start.
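The layout being described can be sketched as follows, with double standing in for jsval and illustrative names throughout; the const array sits at the start of the malloc'd block and the data pointer points just past it, so small offsets to the header structures stay in range while the consts stay aligned:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>

// Illustrative view of the split allocation.
struct ScriptData {
    std::uint8_t *allocation;  // start of the malloc'd block
    std::uint8_t *data;        // JSObjectArray-like headers start here
    double *consts;            // recovered as data - constSize
};

ScriptData allocScriptData(std::size_t nconsts, std::size_t restSize) {
    std::size_t constSize = nconsts * sizeof(double);
    std::uint8_t *allocation =
        static_cast<std::uint8_t *>(std::malloc(constSize + restSize));
    ScriptData sd;
    sd.allocation = allocation;
    sd.data = allocation + constSize;  // data points past the const array
    sd.consts = reinterpret_cast<double *>(sd.data - constSize);
    return sd;
}
```

Since malloc returns suitably aligned memory and constSize is a multiple of 8, the recovered consts pointer is 8-byte aligned without any padding.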

(In reply to Jason Orendorff [:jorendorff] from comment #51)
> Well, no. There's no data there about the number of GC cycles. And the
> results are somewhat unexpected! I don't know what to think about that,
> except that it doesn't seem consistent with the proposition that we can make
> up the regression by tuning GC heuristics. It certainly doesn't support the
> hypothesis in comment 41:
>
> > So a possible explanation for the regression is an extra GC that this triggers.
> > The reason why this has not been observed on other OSes could be that the JS
> > heap size before the tests and the GC heuristics depend on the OS and CPU speed.
The tests are structured like:
function f() {
    do_something_and_return_when_done();
    setTimeout("f()", 0);
}
window.onload = f;
where do_something_and_return_when_done() does light DOM manipulation.
As setTimeout is called with a string argument, we evaluate the string on each timer firing. Without the patch the corresponding script is destroyed immediately, but with the patch we must wait for the GC. Now, as the minimal timeout is 4 ms and we run the test for 300 seconds, we effectively create about 300 / 0.004, or 75,000, scripts. With each script taking about 190 bytes (on 64-bit CPUs), that translates into about 15 MB of data to allocate from the GC heap. The regression is on the scale of 10 seconds, so we have about 1.5 MB per second of regression. As GC allocation and mark-and-sweep are at least 100 times faster than that, the regression cannot come from extra GC overhead, and the Windows/Linux data confirms that.
So indeed my initial hypothesis regarding an increased number of GC cycles is wrong; we do not have enough allocations to influence that. At worst we would get an extra GC cycle, and that would take at most 1 second in the absolutely worst case given the rest of the test structure. I wish I had calculated this before starting to speculate about the origin of the regression.
I will try to experiment with disabling the DestroyScript call, effectively leaking the scripts, and see how that compares with the patch numbers.
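The back-of-envelope arithmetic above can be spot-checked with a small sketch; all inputs are the estimates from this comment, not measurements:

```cpp
#include <cassert>

// 300 s of test run divided by the minimal 4 ms timeout gives the
// approximate number of scripts compiled during the benchmark.
double scriptsCreated(double runSeconds, double timeoutSeconds) {
    return runSeconds / timeoutSeconds;
}

// Scripts times the per-script size gives the total GC-heap allocation,
// converted to megabytes (roughly the "about 15 MB" estimated above).
double gcHeapMB(double scripts, double bytesPerScript) {
    return scripts * bytesPerScript / (1024.0 * 1024.0);
}
```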

(In reply to Igor Bukanov from comment #54)
> constPadding leads to loosing 4 bytes on average per each JSScript instance
> with const values.
2 bytes on average, right? And only on 32-bit platforms?
> >Could we just put the constant array first?
>
> Our offsets to get various JSObjectArray-like structures must stay within
> 254 bytes from the data pointer. Putting the const array in front can break
> that.
I didn't mean at data[0]. I just mean before the other array data. It can go after the JSObjectArray-like things. Those all have sizes that are multiples of 8 bytes, right?

(In reply to Jason Orendorff [:jorendorff] from comment #56)
> (In reply to Igor Bukanov from comment #54)
> > constPadding leads to loosing 4 bytes on average per each JSScript instance
> > with const values.
>
> 2 bytes on average, right? And only on 32-bit platforms?
sizeof(jsval) == 8 so the waste is on average 4 bytes on 32 bit platforms.
> I didn't mean at data[0]. I just mean before the other array data. It can go
> after the JSObjectArray-like things. Those all have sizes that are multiples
> of 8 bytes, right?
You are right; we can put the const vector right after the data array and enforce their sizes with static asserts. I will update the patch with that.

An update on the TDHTML regression. When I commented out js_DestroyScript in EvaluateUCScriptForPrincipalsCommon, effectively leaking all evaluated scripts, I got the same type of regression that is observed with the patch. That is, with the patch the regression is a time increase from 310 to 320, but with the leaked scripts the timing went from 310 to 322. So the issue is that for some reason the tests on OS X are very sensitive to the number of allocated script instances.
I also tried to schedule the GC after every 1000 created scripts to minimize the memory held by scripts, but that did not help. Another option that I considered is to recognize, in the setTimeout implementation in dom/base/nsGlobalWindow.cpp, a pattern like setTimeout("name()") and replace EvaluateScript there with CallFunctionByName. However, for some reason that regressed the benchmarks even more on Mac OS X while slightly helping on Windows/Linux.

For TI, all script destruction had to move into the GC --- analysis information can freely refer to scripts and is only purged on GC. So TI made a similar change to the pain point you've identified for this test, with a similar perf change.
I think it's pretty clear from comment 58 that this regression is due to an artificial feature of this test; the patch should just go in without any TDHTML-specific tweaking.

The "artificial feature of the test" is actually a very common way for people to write code on the web, if you mean passing a string argument to setTimeout. And at one point I seem to recall we just had algorithms in the JS engine that linearly walked all live JSScripts or something. Have we gotten rid of those?
And we still have no idea why this is a problem on Mac and not on Linux/Windows, right? (Or rather less of a problem on Linux/Windows; there's a regression from TI on those too.)
One thing I should note: one particular test (test_scrolling) accounts for most of the regression from the TI landing on Tdhtml. Does that tell us anything?

Another thought.... would a cache similar to the eval cache allow us to not allocate as many JSScripts here and sidestep the problem? I seem to recall there being talk about maybe getting to it sometime, but if JSScript allocations are getting more expensive then maybe we should look into it more urgently.
In any case, we should still check for algorithms that walk live scripts.

(In reply to Boris Zbarsky (:bz) from comment #63)
> Another thought.... would a cache similar to the eval cache allow us to not
> allocate as many JSScripts here and sidestep the problem? I seem to recall
> there being talk about maybe getting to it sometime, but if JSScript
> allocations are getting more expensive then maybe we should look into it
> more urgently.
We do not need to cache JSScripts to help with the benchmark; it is sufficient to recognize a common pattern like setTimeout("f()", t) and turn that into a call to f. However, when I tried to hack that up, it in fact regressed the benchmarks even more. I have no idea why.

This bug is currently blocking bug 676732 -- both bugs modify some of the memory reporters in xpcjsruntime.cpp so there are conflicts. I fixed the conflicts when this patch originally landed, and I didn't want to re-fix them in reverse when the patch went out, so I was planning to wait until this landed again. But it sounds like that might not happen soon, so maybe I should re-fix and land bug 676732?

Created attachment 557107
v6
The new patch is a merge with the infer changes, plus the 3 following changes:
1. JSScript is reorganized to get denser packing and to ensure 8-byte alignment in view of the extra INFER-related fields.
2. I added a JSGCTraceKind enum and used it in the internal API that takes a GC tracing kind. This way GCC warns about switches with missing cases, which allowed fixing a couple of places where I had missed adding JSTRACE_SCRIPT.
3. The patch removes JSScript::links and instead uses the GC cell enumeration code to loop over all scripts in a compartment. To make those iterations more convenient, I added iterator classes like CellIter and CellIterUnderGC in addition to ForEachArenaAndCell. These classes allow keeping the bodies of the current script iteration loops intact and add extra checks to ensure that no GC or allocation is possible when iterating over GC things outside the GC.
These changes are in jsapi.*, jscompartment.cpp, jsgc*, jsscript.*. The rest of the files are either unchanged from v5 or contain trivial extra changes like turning script->compartment into script->compartment().
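The CellIter usage pattern from point 3 could be sketched in a self-contained form roughly as follows; the real iterator walks GC arenas in a compartment, while this stand-in just walks a vector, and JSScript here is illustrative:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Illustrative stand-in for the GC-allocated script type.
struct JSScript { int id; };

// Minimal done()/next()/get() iterator matching the described interface.
class CellIter {
    const std::vector<JSScript *> &cells;
    std::size_t index;
  public:
    explicit CellIter(const std::vector<JSScript *> &c) : cells(c), index(0) {}
    bool done() const { return index == cells.size(); }
    void next() { ++index; }
    JSScript *get() const { return cells[index]; }
};
```

The done()/next()/get() shape is what lets the existing script iteration loop bodies stay intact while the linked-list traversal is swapped for cell enumeration.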

Created attachment 557114
v7
The new version fixes the static assert failure about sizeof(JSScript) % CellSize == 0 on 32-bit CPUs in optimized builds. I had missed that JSScript::id is debug-only. Now it should compile again.

Comment on attachment 557114
v7
Review of attachment 557114:
-----------------------------------------------------------------
::: js/src/jsgcinlines.h
@@ +365,5 @@
> +#endif
> + if (lists)
> + lists->clearInArenas(thingKind);
> + }
> +};
CellIter looks great. I think ForEachArenaAndCell can be removed entirely (maybe with an ArenaIter), though no need to do that now.
::: js/src/jsgcmark.cpp
@@ +864,5 @@
> + script->jitNormal->trace(trc);
> + if (script->jitCtor)
> + script->jitCtor->trace(trc);
> +#endif
> +}
It might be worth adding a ScanScript variant of this function; doing this was helpful for type objects, and there are more scripts created than types. This would need to unpack Bindings::trace, which is a MarkShape wrapper.
::: js/src/jsscript.h
@@ +560,5 @@
> * - Temporary scripts created by obj_eval, JS_EvaluateScript, and
> * similar functions never have these objects; such scripts are
> * explicitly destroyed by the code that created them.
> */
> JSObject *object;
Can't script->u.object and ScriptClass be removed entirely now that the script's lifetime is managed by the GC? Or is this intended as a followup (not sure what dependencies other parts of the code base e.g. debugger have on script objects existing).

(In reply to Andrew McCreight [:mccr8] from comment #71)
> Igor: are you planning to land this soon? If so, I'll just hold off on
> landing Bug 683256, as you are fixing that here anyways, and that will save
> a little bit of rebasing.
I plan to land it in a couple of hours if the try server is green.

Some of the JS memory reporters were screwed up by this change -- we ended up with two "shapes" reporters, one under "gc-heap" and one not. And "type-objects" were labelled as "shapes". I will fix these when I land bug 676732 tomorrow.

(In reply to Nicholas Nethercote [:njn] from comment #75)
> Some of the JS memory reporters were screwed up by this change -- we ended
> up with two "shapes" reporters, one under "gc-heap" and one not. And
> "type-objects" were labelled as "shapes".
Sorry for not checking the about:memory output *carefully*.

(In reply to clever from comment #79)
> looks like this patch has broken the js and xpcshells
> it seems to be finalizing a null JSScript*
I cannot reproduce the bug locally. Could you run this under valgrind?