On 16/12/2010 9:11 AM, Aharon (Vladimir) Lanin wrote:
Adding my 2 cents' worth. I am slowly understanding the concept of
bidirectionality. I have trouble since I can only read and write English.
> Currently, the CSS Writing Modes Module Level 3 spec on text
> direction<http://dev.w3.org/csswg/css3-writing-modes/#text-direction>
> states:
>
> "User agents that support bidirectional text must apply the Unicode
> bidirectional algorithm to every sequence of inline boxes uninterrupted by a
> forced (bidi class B) line break or block boundary.
I think this is referring to a class B line break (whatever that is).
<br/> seems to come under 3.4 (Reordering Resolved Levels) [1], in what
are called paragraph separators.
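For what it is worth, the bidi class of a character can be looked up
directly. A minimal sketch, assuming Python's standard unicodedata module
(my choice for illustration; the names CANDIDATES, bidi_classes and CLASSES
are made up here and appear in no spec):

```python
import unicodedata

# A few characters that might plausibly act as line or paragraph breaks.
# The selection is an assumption made for this illustration.
CANDIDATES = {
    "LINE FEED": "\n",
    "CARRIAGE RETURN": "\r",
    "PARAGRAPH SEPARATOR": "\u2029",
    "LINE SEPARATOR": "\u2028",
    "SPACE": " ",
}


def bidi_classes(chars):
    """Map each name to the bidi class abbreviation of its character."""
    return {name: unicodedata.bidirectional(ch) for name, ch in chars.items()}


CLASSES = bidi_classes(CANDIDATES)

if __name__ == "__main__":
    for name, cls in CLASSES.items():
        print(f"{name}: {cls}")
```

Characters reported as B are the paragraph separators. How <br> maps onto
them is a matter for the HTML and CSS specs, not for this lookup.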
> This sequence forms the
> "paragraph" unit in the bidirectional algorithm. The paragraph embedding
> level is set according to the value of the ‘direction’ property of the
> containing block rather than by the heuristic given in steps P2 and P3 of
> the Unicode algorithm."
>
> Further down in the same major section, the definition of
> unicode-bidi:plaintext<http://dev.w3.org/csswg/css3-writing-modes/#unicode-bidi>
> states:
>
> "For the purposes of the Unicode bidirectional algorithm, the base
> directionality of each "paragraph" for which the element is the containing
> block element is determined not by the element's computed ‘direction’ as
> usual, but by following rules P1, P2, and P3 of the Unicode bidirectional
> algorithm."
Above I see "which the element." I have no idea what element is being
referred to here. This paragraph also seems to suggest an added meaning
of a containing block. What is a containing block element?
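As far as I can tell, the heuristic in P2 and P3 amounts to "look for the
first strong character." A rough sketch, assuming Python's unicodedata
module and a hypothetical function name paragraph_level. It ignores the
refinements in later versions of the Unicode algorithm (such as skipping
over isolated runs), so treat it as an illustration only:

```python
import unicodedata


def paragraph_level(text):
    """Return 0 (LTR) or 1 (RTL) from the first strong character.

    Simplified reading of rules P2 and P3: scan for the first character
    whose bidi class is L, R or AL. If none is found, default to 0.
    """
    for ch in text:
        cls = unicodedata.bidirectional(ch)
        if cls == "L":
            return 0
        if cls in ("R", "AL"):
            return 1
    return 0


if __name__ == "__main__":
    samples = {
        "Hebrew first": "\u05e9\u05dc\u05d5\u05dd hello",
        "Latin first": "hello \u05e9\u05dc\u05d5\u05dd",
        "no strong character": "123 ?",
    }
    for label, sample in samples.items():
        print(label, "->", paragraph_level(sample))
```

With unicode-bidi:plaintext this result would decide the paragraph's base
direction. Otherwise the 'direction' property of the containing block
decides it, and the scan above is not used.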
> I think that these parts of the spec need to be tweaked in several
> respects:
>
> 1. There is no reason to mention rule P1 when describing how
> unicode-bidi:plaintext affects the base directionality of each paragraph. P1
> deals with how the text is split up into paragraphs, not with the direction
> of each paragraph, and applies to all content, regardless
> of unicode-bidi:plaintext.
>
> 2. I think it would improve clarity to mention the unicode-bidi:plaintext
> exception when first describing how the paragraph embedding level is set
> (first quote above). Thus, the last sentence of the first quote should read:
>
> "The paragraph embedding level is set according to the value of the
> ‘direction’ property of the containing block, unless the containing block
> element has unicode-bidi:plaintext, in which case it is set according to the
> heuristic given in steps P2 and P3 of the Unicode algorithm."
>
> 3. We must probably explicitly define the effect of a paragraph break (i.e.
> a block boundary or bidi class B line break, which in HTML5 includes <br>)
> when the path from the containing block element to the paragraph break
> includes elements with a unicode-bidi value other than "normal". For
> example, what happens when we have (as usual, uppercase English is used
> instead of RTL characters) :
>
> <div dir=ltr>
> <span dir=rtl>
> TO BE<br>
> OR NOT TO BE?
> </span>
> -- hamlet, in rtl translation.
> </div>
>
> Should the "OR NOT TO BE?" be displayed in rtl ("?EB OT TON RO") or in ltr
> ("EB OT TON RO?")?
I believe this depends on the value of unicode-bidi. I am somewhat
confused myself, since the default behavior in an offline test,
<!DOCTYPE html>
<div dir=ltr>
<span dir=rtl>
TO BE<br>
OR NOT TO BE?
</span>
<div>-- hamlet, in rtl translation.</div>
</div>
in FF 3.6.13 renders as if unicode-bidi were embed, even though the
initial value of unicode-bidi is normal.
unicode-bidi: embed, isolate and plaintext produce this.
?OR NOT TO BE
unicode-bidi: normal produces this.
OR NOT TO BE?
unicode-bidi: bidi-override produces this.
?EB OT TON RO
I have not tested in other browsers, since I do not know whether FF even
does it correctly.
> While it seems obvious that it should be displayed in RTL because it is part
> of a <span dir=rtl>, that is not the result if we simply translate the above
> into Unicode bidi formatting characters, i.e.
>
> [RLE]TO BE
> OR NOT TO BE?[PDF] -- hamlet, in rtl translation.
The direction does not affect the embedding algorithm of a particular
script. The direction changes where the start and end are for a sequence
of inline boxes. The placement of punctuation marks (.,;?!`) and of list
markers (with a value of outside) changes with the direction.
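To make the formatting-character version quoted above concrete, here is a
rough sketch, assuming Python's unicodedata module and a hypothetical
helper split_paragraphs. It splits the string at class B characters, as
rule P1 of the Unicode algorithm does, and shows which paragraph the RLE
and the PDF land in. It does not resolve embedding levels or reorder
anything:

```python
import unicodedata

RLE = "\u202b"  # RIGHT-TO-LEFT EMBEDDING
PDF = "\u202c"  # POP DIRECTIONAL FORMATTING


def split_paragraphs(text):
    """Split text after every bidi class B character (rule P1).

    The separator is kept with the paragraph that precedes it.
    """
    paragraphs, current = [], []
    for ch in text:
        current.append(ch)
        if unicodedata.bidirectional(ch) == "B":
            paragraphs.append("".join(current))
            current = []
    if current:
        paragraphs.append("".join(current))
    return paragraphs


TEXT = RLE + "TO BE\nOR NOT TO BE?" + PDF + " -- hamlet, in rtl translation."
PARAGRAPHS = split_paragraphs(TEXT)

if __name__ == "__main__":
    for number, para in enumerate(PARAGRAPHS, start=1):
        print(number, "has RLE:", RLE in para, "has PDF:", PDF in para)
```

The RLE ends up in the first paragraph and the PDF in the second, so the
second paragraph contains a PDF with no embedding of its own to close.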
> The overall direction of both paragraphs is ltr (P2 and P3 are overridden),
> and since the paragraph break resets all embedding levels, the [PDF] is
> orphaned, and the question mark winds up to the right of "EB OT TON RO".
>
> I believe that the correct approach to take is to treat the second bidi
> paragraph (i.e. "TO BE ... translation.") the same as:
>
> <div dir=ltr>
> <span dir=rtl>
> OR NOT TO BE?
> </span>
> -- hamlet, in rtl translation.
> </div>
>
> In other words, while the paragraph's overall level should be set according
> to the value of the ‘direction’ property of the containing block (ltr), it
> should be opened by repeating the embeddings or overrides introduced by the
> elements between the paragraph break and the containing block - in our
> example, the equivalent of an RLE (which is then matched by the </span>'s
> PDF equivalent).
>
> This is similar to the CSS specs for anonymous block
> boxes<http://www.w3.org/TR/2009/CR-CSS2-20090908/visuren.html#anonymous-block-level>,
> i.e:
>
> "When an inline box contains a block box, the inline box (and its inline
> ancestors within the same line box) are broken around the block. The line
> boxes before the break and after the break are enclosed in anonymous boxes,
> and the block box becomes a sibling of those anonymous boxes. When such an
> inline box is affected by relative positioning, the relative positioning
> also affects the block box."
>
> "The properties of anonymous boxes are inherited from the enclosing
> non-anonymous box".
>
> Does a line break result in anonymous boxes? If not, we certainly need
> something in the Writing Modes spec. Actually, it would be good to have it
> either way, just to clarify things.
>
> 4. When the path from the containing block element to the paragraph break
> includes an element with unicode-bidi:isolate, there is no reason to go back
> all the way to the containing block element to get the new paragraph's base
> direction and the embeddings to be reconstituted at its start. Instead of
> referring to the containing block element, the spec should be referring to
> the closest unicode-bidi:isolate ancestor or containing block element,
> whichever is closer.
>
> Aharon
I believe the spec needs quite a few illustrations. If an author is
given a job where there are runs of LTR and RTL text and they only
understand one language, the spec as it stands is not going to help.
I also believe that the spec should give specific examples in foreign
scripts, using words that can easily be recognized. My use of ᠨᠶᠪᠧᠺᠴᡗ here
[2] does not help me. Only with research did I figure out that it ran LTR.
1. <http://www.unicode.org/reports/tr9/#Reordering_Resolved_Levels>
2. <http://css-class.com/test/css/bidi/mongolian-test1-extra.htm>
--
Alan http://css-class.com/
Armies Cannot Stop An Idea Whose Time Has Come. - Victor Hugo