Issues in Layout that do not fit into any other Layout component or which span multiple Layout components.

Bugs related to the top level presentation objects (pres shell, pres context, and document viewer), the frame constructor, and the base frame classes, as well as general issues with alignment and sizing, all belong here.

The first assert I see here is:
###!!! ASSERTION: overflow containers out of order or bad parent: '!(aOverflowCont->GetStateBits() & NS_FRAME_IS_OVERFLOW_CONTAINER)', file ../../../mozilla/layout/generic/nsContainerFrame.cpp, line 1379
This happens a number of times, then I get:
###!!! ASSERTION: loop in frame list. This will probably hang soon.: 'Error', file ../../../mozilla/layout/generic/nsFrameList.cpp, line 587
Then I in fact seem to hang.
fantasai, want to check this out? Is this a regression from something it should be blocking?

I see the same asserts & crash that Jesse reported in Comment 0 -- not the behavior that bz described in comment 1.
Using fresh trunk checkout from today:
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9b3pre) Gecko/2007121412 Minefield/3.0b3pre

Created attachment 296439[details]
testcase 2 (crash)
Posting a set of three related testcases for this bug that I made a little while back.
This one (testcase 2) triggers the same assertions & crash as in comment 0 (like the original testcase.)

Created attachment 296442[details]
testcase 3 (assert)
This one is effectively the same as testcase 2 without the "position: absolute". (I also added a -moz-column-gap and background colors, for easier visualization)
This doesn't trigger the crash. Instead, we hit this assertion:
###!!! ASSERTION: frame was not removed from primary frame map before destruction or was readded to map after being removed: 'Not Reached', file /mozilla/layout/base/nsFrameManager.cpp, line 711

Created attachment 296445[details]
testcase 4 (hang)
This testcase is like testcase 2, but...
- with no "position: absolute"
- with different height values
- we don't call the "boom" function (because we don't have to)
It triggers a hang, during which a ridiculously large frame tree is constructed. I'll post a partial frame dump of this at some point.
I think this hang is caused by the same underlying issue, because it has the same regression range.

Also -- AFAICT, testcases 1 and 2 *don't* crash using latest-nightly
Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9b3pre) Gecko/2008011104 Minefield/3.0b3pre
but they *do* crash using a debug build from a checkout today.
I imagine this is probably because of some pointer that gets cleared in debug builds but not in optimized builds, so we hit a null deref in the debug build but not in the optimized build.

Created attachment 297065[details]
testcase 5 (rendering issue)
Testcase 5 shows part of what's going wrong and causing the hang in testcase 4: we're letting the content spill beyond the supposedly-limited number of columns.
Desired Testcase 5 rendering: 3 columns
Actual Testcase 5 rendering: 4 columns
If you make div c taller, you can get arbitrarily many columns in buggy builds. (but it sticks to 3 columns if you back out bug 399407)

Created attachment 297641[details][diff][review]
patch
Here you go. This reverts the patch for bug 399407 and replaces it with a switch from GetSize() to GetOverflowRect() in the balancing code. AFAICT it passes all testcases here and in 399407 -- no crashes, hangs, or asserts. Awesome job with the investigation, dholbert -- writing this was easy given all your analysis!

(In reply to comment #17)
> Awesome job with the investigation, dholbert -- writing this was easy given all
> your analysis!
Great, I'm glad it helped and that you could patch it! I was getting a bit overwhelmed trying to wrap my mind around the inner workings of -moz-col. :)
> AFAICT it
> passes all testcases here and in 399407 -- no crashes, hangs, or asserts.
It'd be good to check if it fixes bug 408737, as well -- I think the testcases there are similar to testcase 4 (hang) on this bug.

If it's an onmousedown dynamic switch between top: 100px; and something else, then I can see why we wouldn't want to rebalance the columns. But in most cases I think we do want to take overflow into account, and I don't think it makes sense to make an exception for this case.

If you have content where the last line always causes overflow (due to relative positioning or something as simple as an unusually large descender glyph), then our balancing algorithm is going to always see overflow, conclude that the content never fits, and therefore try to reduce the column height to approximately zero. That doesn't sound good.

That's no good because then we're not detecting when the in-flow content doesn't fit in the available height.
What we really need here is some new reflow metric, the max of the YMosts of all blocks whose block formatting context is the column, or something like that.

Ok, so, how about
PR_MAX(mRect.height, PR_MIN(availableHeight, overflow.height));
? I *think* that should solve this bug without breaking the overflowing content detection insofar as it already exists.
It won't handle detecting in-flow content that doesn't fit if that content is in the overflow: we'll need to add a ymost metric to the frames for that.

Because it will have to recurse into descendants to check floats and overflow containers, that's not going to be a simple computation. You could maybe run it after a tentative balancing height has been computed.

It's not complex I think, but it might be slow-ish. We could run it only when the column's overflow-YMost is greater than its border-box height, because we know the "max of the YMosts of all blocks whose block formatting context is the column" must be less than or equal to the overflow-YMost.

Created attachment 302926[details]
IRC log of discussion
roc and I talked about some of the more complicated issues we need to work around here (like needing to undo relative positioning) and came to the conclusion that we're best off storing ymost data on the frames. The idea is to make it part of the overflowRect struct and only store it when it doesn't match mRect.height.

roc, storing the overflowRect and contentBottom in one datastruct is a mess. I'm going to keep them separate and just share the flag. Since contentBottom usually exists only when the overflowRect != mRect, GetOverflowRect will get very few false positives on whether it needs to retrieve data from the proptable; GetContentBottom will have more, but it's only checked about once per frame during reflow.
The patch for this seems to fix this bug, the crash in bug 408602, bug 408737, and bug 414255. It'll need a slew of reftests, though.

Still work in progress. The patch is a mess; I'm cleaning up the mess. :)
mContentBottom was intentional at that point: I used it so I had reliable storage while I fixed the calculations. The final patch will optimize the storage by using the proptable.

+ nsHTMLReflowMetrics* aMetrics = nsnull);
This needs documentation --- and a better name --- to make it clear this is the *container's* metrics into which child overflow should be accumulated.
+ if (line->IsBlock())
+ aMetrics.MergeContentBottom(line->mFirstChild->GetContentBottom());
+ else
+ aMetrics.MergeContentBottom(line->mBounds.YMost());
+ }
Slightly better if you set "nscoord bottom = line->IsBlock() ? ... : ...; aMetrics.MergeContentBottom(...);"
+ mFrame->SetContentBottom(mFrame->GetContentBottom() - y);
What is going on here? We should avoid getting block's y-offset into its bottom-value in the first place. Seems like we should be adding the frame-y in a lot of MergeContentBottom calls. I'm confused.
I'm not sure how you're handling relatively positioned blocks, but it's tied up with the previous issue.
+ * Returns a y coordinate relative to this frame's origin that represents
+ * the logical bottom of the frame or its visible content, whichever is lower.
+ * Relative positioning is ignored and margins and glyph bounds are not
+ * considered.
I think you need to be more precise here about what "visible content" means.
Do we really have to include abs-pos frames in the bottom calculation? That seems to lead to problems where the abs-pos frame is positioned using 'bottom' to be overflowing the bottom of the container, no matter what the container's height is.
+ void SetContentBottom(nscoord aYMost)
+ {
+ if (&height == mContentBottom)
+ mContentBottom = new nscoord;
+ *mContentBottom = aYMost;
+ }
Avoid the dynamic allocation, either use a spare nscoord allocated in the nsHTMLReflowMetrics itself, or better still just use a flag to tell whether the value is the height or not. Although actually I don't understand the need for this, why not just set mContentBottom to zero initially and have GetContentBottom return PR_MAX(height, mContentBottom)?
BIG QUESTION:
We store the before-relative-positioning offset of blocks in nsGkAtoms::computedOffsetProperty for the frame. If we did the same for relatively positioned inlines (pretty easy to do), would the original "compute bottom value by scanning the frame tree" work reasonably simply? Because this approach is adding quite a lot of code.

> What is going on here?
"Undoing" relative positioning. You're right, a lot of MergeContentBottom calls are missing the y value.
> I think you need to be more precise here about what "visible content" means.
Not clipped. I'll clarify that.
> Do we really have to include abs-pos frames in the bottom calculation?
Yes, because abspos content generates overflow columns if it doesn't fit.
> problems where the abs-pos frame is positioned using 'bottom' to be
> overflowing the bottom of the container, no matter what the container's
> height is.
We shouldn't be including abspos frames for which the colset is the containing block. Looks like that's not something I'm paying enough attention to...
> why not just set mContentBottom to zero initially and have
> GetContentBottom return PR_MAX(height, mContentBottom)?
Undoing relative positioning can cause it to be less than the height. I could have GetContentBottom() check nsGkAtoms::computedOffsetProperty if you want to avoid the height/mContentBottom gymnastics.
> If ... would the original "compute bottom value by scanning the frame tree"
> work reasonably simply?
Dunno. I'll try it and we'll see. BTW, computedOffsetProperty stores the relpos offsets, not the original position, right?

It stores the offsets, right.
You'd want to create a helper method that gives you the frame origin ignoring relative positioning, that you can call from your code and from nsBlockReflowState::FlowAndPlaceFloat.

Created attachment 314819[details][diff][review]
patch
> Why not just unconditionally look for the property? Or at least, if you want
> nsBlockReflowState to not suffer, take nsStyleDisplay as a parameter.
Done.
> Use GetAsBlock, there are areaFrames and table captions that fail this check
Done.
> I don't see any code that is setting computedOffsetProperty for rel-pos
> inline frames?
They wouldn't be used for this patch, but I shifted the rel-pos storing code
to nsHTMLReflowState anyway.
> Also you forgot to include your tests in the patch.
Added.
> You need to exclude the overflowOutOfFlows list here too
Done.
> This can't be right, the first line includes aOffset but the overflow rect
> doesn't. Why are we taking offset as a parameter here and always adding it,
> anyway?
Because I'm confused. :/ Removed.
> Avoid calling CalculateContentBottom(child) twice?
Added caching within ReflowChildren and between ReflowChildren and Reflow.
I need to do some more testing (reftests pass), I'll flag you when I'm done. Do we have good reftests for column balancing?